Node Specification Language#
SAMSON has a powerful Node Specification Language (NSL) that may be used to select various structures (ligands, receptors, etc.) and nodes (atoms, residues, etc.) based on their properties. NSL can be used in the Find command or to filter nodes in the Document view.
Interactive tutorial in SAMSON (Help > Tutorials): "Selecting using the Node Specification Language".
Find command#
The Find command (Select > Find; Windows/Linux: Ctrl+F, macOS: Cmd+F) lets you enter an NSL string to select nodes (atoms, residues, etc.) from the active document.
When using the Find command, you may press the Tab key in the search box for context-sensitive completion. This is particularly useful when searching for nodes by name. For example, entering "ALA (with the opening quote and without the closing quote, to indicate the start of a name) and pressing the Tab key lists all nodes whose names begin with "ALA", e.g.:
"ALA 22 Backbone"
"ALA 22 Side chain"
"ALA 22"
"ALA 28 Backbone"
"ALA 28 Side chain"
"ALA 28"
...
Find using Document View#
An NSL expression may also be used to filter nodes in the Document view (Interface > Document view; Windows/Linux: Ctrl+1, macOS: Cmd+1); pressing Enter then selects the filtered nodes. For example, the expression n.t sc (full version: node.type sidechain) filters side chains.
Examples#
Here are some examples of NSL expressions:
- node.category ligand, receptor (short version: n.c lig, rec): matches ligands and receptors
- Hydrogen (short version: H): matches hydrogens
- residue.id 20:40, 50:60 (short version: r.id 20:40, 50:60): matches residues with IDs between 20 and 40 and between 50 and 60
- atom.chainID > 2 (short version: a.ci > 2): matches atoms with a chain ID strictly larger than 2
- Carbon in node.selected (short version: C in n.s): matches carbons in the current selection
- bond.order > 1.5 (short version: b.o > 1.5): matches bonds with order strictly larger than 1.5
- node.type backbone (short version: n.t bb): matches backbone nodes
Logical operators examples:
- C or H: matches atoms that are carbons or hydrogens
- node.type residue and not residue.type ALA (short version: n.t r and not r.t ALA): matches all non-alanine residues
Topology operators examples:
- O in node.type sidechain (short version: O in n.t sc): matches oxygens in side chain nodes
- n.t a out of n.t r: matches atoms that do not belong to residues
- node.type sidechain having S (short version: n.t sc h S): matches side chain nodes that have at least one sulfur atom
- H linking O (short version: H l O): matches hydrogens bonded to oxygen atoms
Proximity operators examples:
- "CA" within 5A of S (short version: "CA" w 5A of S): matches nodes named "CA" that are within 5 angstrom of any sulfur atom (use quotes, since names may contain spaces)
- node.type residue beyond 5A of node.selected (short version: n.t r b 5A of n.s): matches residue nodes beyond 5 angstrom of the current selection
- residue.secondaryStructure helix (short version: r.ss h): matches residue nodes in alpha helices
Specification by attributes#
Nodes and groups of nodes may be specified by their attributes.
SAMSON's NSL supports several different attribute spaces based on node types. Since nodes may have different types in SAMSON (e.g. atom, bond, etc.), each attribute is defined in an attribute space:
- node (short name: n)
- atom (short name: a)
- bond (short name: b)
- chain (short name: c)
- residue (short name: r)
- structuralGroup (short name: sg)
The node attribute space (short name: n) corresponds to attributes that are defined for every node. For example, the selection flag is a node attribute, since each node has a selection flag. Hence, the NSL expression node.selectionFlag true matches any node whose selection flag is true, regardless of its node type (atom, bond, etc.).
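To make the relation between short and full attribute-space names concrete, here is a minimal Python sketch that expands the space prefix of an attribute. It is only an illustration: the function name expand_space is hypothetical, and the sketch covers the attribute-space names listed above, not the short names of the attributes themselves.

```python
# Short names of the attribute spaces listed above, mapped to their full names.
ATTRIBUTE_SPACES = {
    "n": "node",
    "a": "atom",
    "b": "bond",
    "c": "chain",
    "r": "residue",
    "sg": "structuralGroup",
}

def expand_space(attribute):
    """Expand the space prefix of 'space.attribute', e.g. 'r.id' -> 'residue.id'.

    Prefixes that are not short names (including full names) are left unchanged.
    """
    space, dot, name = attribute.partition(".")
    return ATTRIBUTE_SPACES.get(space, space) + dot + name

print(expand_space("r.id"))
print(expand_space("sg.id"))
print(expand_space("node.type"))
```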
Specification by name#
Nodes may be specified by name, i.e. by a string enclosed in quotes, which may contain the wildcard character *.
Note
Bond names are formed from the names of the atoms they connect, and bonds cannot be searched by name.
For example, entering "ALA*" may yield ALA residues, backbones, and side chains, e.g.:
"ALA 22 Backbone"
"ALA 22 Side chain"
"ALA 22"
"ALA 28 Backbone"
"ALA 28 Side chain"
"ALA 28"
...
Examples:
- "CA*": matches nodes with names that start with CA
- "*AL*": matches nodes that have AL in their names, e.g. ALA and VAL residues, backbones, and side chains
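The wildcard behavior described above resembles shell-style globbing. As a rough illustration only (this is not SAMSON's implementation), the sketch below reproduces the two patterns with Python's standard fnmatch module. The node names are hypothetical, and case-sensitive matching is an assumption.

```python
from fnmatch import fnmatchcase

# Hypothetical node names, used only for illustration.
names = ["CA", "CA 12", "CB", "ALA 22", "ALA 22 Backbone", "VAL 7 Side chain", "GLY 3"]

def match_names(pattern, candidates):
    """Return the candidates matching a glob-style pattern (* matches any run of characters)."""
    return [name for name in candidates if fnmatchcase(name, pattern)]

starts_with_ca = match_names("CA*", names)   # names that start with CA
contains_al = match_names("*AL*", names)     # names that contain AL anywhere

print(starts_with_ca)
print(contains_al)
```

Note that fnmatch also gives special meaning to ? and [ ], which the text above does not mention for NSL, so the analogy only holds for *.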
Specification by symbols and element names#
In order to efficiently match atoms of a given element type, symbols and element names are valid NSL expressions.
Possible values: atomic symbols and element names (with the first character capitalized).
Examples:
- Carbon: matches carbon atoms
- C: matches carbon atoms
- Ca: matches calcium atoms
- H: matches hydrogen atoms
Lists and ranges#
NSL supports lists (items separated by commas ,) and ranges (bounds separated by a colon :) wherever possible. Examples:
- atom.chain A, B, C (short version: a.c A,B,C): matches atoms in chains A, B, and C
- residue.id 20:40, 50:60 (short version: r.id 20:40, 50:60): matches residues with IDs between 20 and 40, and between 50 and 60
- atom.x -1nm:1nm (short version: a.x -1nm:1nm): matches atoms with x-coordinate between -1 nm and 1 nm
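The way a list of ranges selects values can be sketched in a few lines of Python. This is a toy model rather than SAMSON's parser: the helper names parse_ranges and matches are hypothetical, it handles integers only (no units), and it assumes that both bounds of a range are inclusive, which the text above does not state explicitly.

```python
def parse_ranges(spec):
    """Parse a string such as '20:40, 50:60' into a list of (low, high) integer pairs.

    A bare value such as '7' is treated as the single-value range (7, 7).
    """
    ranges = []
    for item in spec.split(","):
        item = item.strip()
        if ":" in item:
            low, high = item.split(":")
            ranges.append((int(low), int(high)))
        else:
            ranges.append((int(item), int(item)))
    return ranges

def matches(value, ranges):
    """Return True if value lies in any range (bounds assumed inclusive)."""
    return any(low <= value <= high for low, high in ranges)

residue_ids = parse_ranges("20:40, 50:60")
selected = [i for i in (19, 20, 40, 45, 50, 61) if matches(i, residue_ids)]
print(selected)
```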
Operations on sets#
Logical operators#
Logical operators may be used to perform operations on sets.
Available operators:
- and
- not
- or
- xor (exclusive or)
Examples:
- sg.id 1000:1040 and sg.nat < 4: matches structural groups with IDs between 1000 and 1040 that have fewer than 4 atoms
- a.sn <= 20 or a.sn >= 40: matches atoms with a serial number of at most 20 or at least 40
- n.t r and not r.t CYS: matches residues that are not cysteines
- a.sn >= 20 xor a.oc >= 0.5: matches atoms that have either a serial number of at least 20 or an occupancy of at least 0.5, but not those that satisfy both conditions
Note
Without proper care, the not condition might produce surprising results. For example, not r.t CYS does not return only the residues that are not cysteines: a folder, for instance, is also a node that is not a cysteine residue, so it matches as well. If only residues should be returned, a proper query would be n.t r and not r.t CYS.
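One way to reason about these operators is as ordinary set operations over all the nodes of a document. The Python sketch below uses a small hypothetical document to show why not on its own also returns non-residue nodes. It models the semantics described above and is not SAMSON's implementation.

```python
# Hypothetical document: node name -> (node type, residue type or None).
nodes = {
    "Folder": ("folder", None),
    "CYS 5": ("residue", "CYS"),
    "ALA 22": ("residue", "ALA"),
    "GLY 3": ("residue", "GLY"),
    "CA": ("atom", None),
}

all_nodes = set(nodes)
residues = {name for name, (node_type, _) in nodes.items() if node_type == "residue"}
cysteines = {name for name, (_, residue_type) in nodes.items() if residue_type == "CYS"}

# Like "not r.t CYS": the complement is taken over every node in the document.
not_cys = all_nodes - cysteines

# Like "n.t r and not r.t CYS": the intersection restricts the result to residues.
residues_not_cys = residues & not_cys

# "or" corresponds to set union, "xor" to symmetric difference.
residues_or_cys = residues | cysteines
residues_xor_cys = residues ^ cysteines

print(sorted(not_cys))
print(sorted(residues_not_cys))
```

In this toy document, not_cys still contains the folder and the atom, whereas residues_not_cys contains only the two non-cysteine residues.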
Topology operators#
Topology operators may be used to specify that nodes are inside or outside a set, that nodes contain nodes from another set, or that they have atoms linked to atoms in another set.
Available operators:
- in
- out of
- having (short name: h): whether a node has other node(s)
- linking (short name: l): whether an atom or a structural node is linked to other atom(s) or to structural node(s) containing atoms
Examples:
- n.t a in "2AZ8": matches atoms in "2AZ8"
- n.t a in r.t CYS: matches atoms that belong to cysteines
- H in r.t ARG: matches hydrogens that belong to arginines
- H in n.h: matches hidden hydrogens
- n.t a out of r.t PRO: matches atoms that do not belong to prolines
- node.type sidechain having S (short version: n.t sc h S): matches side chain nodes that have at least one sulfur atom
- H linking O (short version: H l O): matches hydrogens bonded to oxygen atoms
- node.type residue linking (residue.type PRO, CYS) (short version: n.t r l (r.t PRO, CYS)): matches residues that have atoms linked to atoms in PRO or CYS residues; this includes the PRO and CYS residues themselves
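The four operators can be pictured on a toy structure in which each atom has an element, an optional parent residue, and a list of bonds. The Python sketch below is a simplified model of the behavior described above, not SAMSON's data model: all names and data are hypothetical, and the NSL strings in the comments only indicate which kind of query each line loosely mirrors.

```python
# Hypothetical toy structure: atom id -> (element, parent residue name or None).
atoms = {
    1: ("S", "CYS 5"),
    2: ("H", "CYS 5"),
    3: ("O", "SER 9"),
    4: ("H", "SER 9"),
    5: ("C", "SER 9"),
    6: ("Na", None),  # an ion that does not belong to any residue
}
# Bonds as unordered pairs of atom ids.
bonds = [(1, 2), (3, 4), (3, 5)]

def atoms_of(element):
    """Atoms of a given element."""
    return {i for i, (e, _) in atoms.items() if e == element}

def atoms_in(residue):
    """'in': atoms that belong to the given residue."""
    return {i for i, (_, parent) in atoms.items() if parent == residue}

def atoms_out_of_residues():
    """'out of': atoms that do not belong to any residue."""
    return {i for i, (_, parent) in atoms.items() if parent is None}

def residues_having(element):
    """'having': residues that contain at least one atom of the element."""
    return {parent for e, parent in atoms.values() if e == element and parent is not None}

def linking(sources, targets):
    """'linking': atoms in sources that are bonded to at least one atom in targets."""
    linked = set()
    for a, b in bonds:
        if a in sources and b in targets:
            linked.add(a)
        if b in sources and a in targets:
            linked.add(b)
    return linked

h_in_ser = atoms_of("H") & atoms_in("SER 9")          # loosely: H in "SER 9"
free_atoms = atoms_out_of_residues()                   # loosely: n.t a out of n.t r
with_sulfur = residues_having("S")                     # loosely: n.t r having S
h_on_oxygen = linking(atoms_of("H"), atoms_of("O"))    # loosely: H linking O

print(h_in_ser, free_atoms, with_sulfur, h_on_oxygen)
```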
Proximity operators#
Proximity operators may be used to select nodes based on distances.
Available operators:
- within {distance} of (short name: w {distance} of)
- beyond {distance} of (short name: b {distance} of)
Examples:
- C within 5A of "GLN 2" (short version: C w 5A of "GLN 2"): matches carbons within 5 angstrom of "GLN 2"
- node.type atom beyond 5A of "2AZ8-IA" (short version: n.t a b 5A of "2AZ8-IA"): matches atoms beyond 5 angstrom of "2AZ8-IA"
Note
Without proper care, the within and beyond conditions might produce surprising results. For example, * within 5A of "GLN 2" will select the document node, since some atoms in the document are within 5 angstrom of "GLN 2": the atoms that belong to "GLN 2". If only atom nodes should be returned, a proper query would be n.t a within 5A of "GLN 2". Note that this latter query would also return atoms that belong to "GLN 2". If only atoms outside "GLN 2" should be returned, then a proper query would be n.t a within 5A of "GLN 2" out of "GLN 2".
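The pitfall described in this note can be reproduced with a few points in space. The following Python sketch is a simplified model, not SAMSON's implementation: the atom names and coordinates are hypothetical, the distance test is assumed to be inclusive, and beyond is assumed to be the exact complement of within among atoms.

```python
import math

# Hypothetical atoms: name -> (group, (x, y, z)), with coordinates in angstroms.
atoms = {
    "N": ("GLN 2", (0.0, 0.0, 0.0)),
    "CA": ("GLN 2", (1.5, 0.0, 0.0)),
    "C1": ("ALA 3", (4.0, 0.0, 0.0)),
    "C2": ("LEU 9", (12.0, 0.0, 0.0)),
}

def within(cutoff, target_group):
    """Atoms whose distance to at least one atom of target_group is at most cutoff."""
    targets = [pos for group, pos in atoms.values() if group == target_group]
    return {
        name
        for name, (_, pos) in atoms.items()
        if any(math.dist(pos, target) <= cutoff for target in targets)
    }

def beyond(cutoff, target_group):
    """Atoms that are not within cutoff of target_group."""
    return set(atoms) - within(cutoff, target_group)

gln2 = {name for name, (group, _) in atoms.items() if group == "GLN 2"}

near = within(5.0, "GLN 2")     # loosely: n.t a within 5A of "GLN 2"
near_outside = near - gln2      # loosely: n.t a within 5A of "GLN 2" out of "GLN 2"
far = beyond(5.0, "GLN 2")      # loosely: n.t a beyond 5A of "GLN 2"

print(sorted(near), sorted(near_outside), sorted(far))
```

Here near includes the atoms of "GLN 2" themselves, since each is at distance zero from itself, while near_outside keeps only the neighboring atom from another group.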
Length units#
- femtometer (short name: fm)
- picometer (short name: pm)
- angstrom (short name: A)
- nanometer (short name: nm)
- micrometer (short name: um)
- millimeter (short name: mm)
- centimeter (short name: cm)
- meter (short name: m)
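For reference, the relation between these units can be written as conversion factors to meters. The Python sketch below parses values such as 5A or -1nm, as used in the examples on this page. The function name parse_length and the accepted number format are assumptions made for illustration and do not describe how SAMSON parses units.

```python
import re

# Conversion factors to meters for the short unit names listed above.
LENGTH_UNITS_IN_METERS = {
    "fm": 1e-15,
    "pm": 1e-12,
    "A": 1e-10,
    "nm": 1e-9,
    "um": 1e-6,
    "mm": 1e-3,
    "cm": 1e-2,
    "m": 1.0,
}

def parse_length(text):
    """Convert a string such as '5A' or '-1nm' to a length in meters."""
    match = re.fullmatch(r"\s*([-+]?\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"not a length: {text!r}")
    value, unit = match.groups()
    if unit not in LENGTH_UNITS_IN_METERS:
        raise ValueError(f"unknown length unit: {unit!r}")
    return float(value) * LENGTH_UNITS_IN_METERS[unit]

five_angstrom = parse_length("5A")
minus_one_nanometer = parse_length("-1nm")
print(five_angstrom, minus_one_nanometer)
```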
Mass units#
- dalton (short name: Da)
- kilodalton (short name: kDa)
- megadalton (short name: MDa)
- gigadalton (short name: GDa)
- electronMass (short name: auMass)
- yoctogram (short name: yg)
- zeptogram (short name: zg)
- attogram (short name: ag)
- femtogram (short name: fg)
- picogram (short name: pg)
- nanogram (short name: ng)
- microgram (short name: ug)
- gram (short name: g)
- kilogram (short name: kg)