Node Specification Language#

SAMSON has a powerful Node Specification Language (NSL) that may be used to select various structures (ligands, receptors, etc.) and nodes (atoms, residues, etc.) based on their properties. NSL can be used in the Find command or to filter nodes in the Document view.

You can try the following interactive tutorial in SAMSON (Help > Tutorials): "Selecting using the Node Specification Language".

For example, the Find command in SAMSON lets users enter an NSL string to select nodes from the document:

Select nodes with the Find command

An NSL expression may also be used to filter nodes in the Document view. In the example below, the user has entered n.t sc (node.type sidechain) and pressed Enter to select all side chains:

Filter and select nodes from the document view

Here are some examples of NSL expressions:

  • node.category ligand or node.category receptor : select all ligands and receptors (short version: n.c lig or n.c rec)
  • Hydrogen : select all hydrogens (short version: H)
  • atom.chainID > 2 : select all atoms with a chain ID strictly larger than 2 (short version: a.ci > 2)
  • Carbon in node.selected : select all carbons in the current selection (short version: C in n.s)
  • bond.order > 1.5 : select all bonds with order strictly larger than 1.5 (short version: b.o > 1.5)
  • node.type backbone : select all backbone nodes (short version: n.t bb)
  • O in node.type sidechain : select all oxygens in side chain nodes (short version: O in n.t sc)
  • "CA" within 5A of S : select all nodes named "CA" that are within 5 angstrom of any sulfur atom (short version: "CA" w 5A of S) (use quotes, since names may contain spaces)
  • node.type residue beyond 5A of node.selected : select all residue nodes beyond 5 angstrom of the current selection (short version: n.t r b 5A of n.s)
  • residue.secondaryStructure helix : select residue nodes in alpha helices (short version: r.ss h)
  • node.type sidechain having S : select side chain nodes that have at least one sulfur atom (short version: n.t sc h S)
  • H linking O : select all hydrogens bonded to oxygen atoms (short version: H l O)
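
Each attribute short form in the examples above stands for a long attribute name. As an illustration only (this is not SAMSON's parser), the following Python sketch expands the attribute short names that appear in those examples. The `ATTRIBUTE_SHORT_NAMES` table and the `expand_attributes` helper are hypothetical names introduced here. Only attribute tokens are expanded, because value and operator abbreviations such as h depend on context (helix versus having).

```python
# Hypothetical lookup table, built only from the examples listed on this page.
ATTRIBUTE_SHORT_NAMES = {
    "n.c": "node.category",
    "n.t": "node.type",
    "n.s": "node.selected",
    "a.ci": "atom.chainID",
    "b.o": "bond.order",
    "r.ss": "residue.secondaryStructure",
}

def expand_attributes(expression: str) -> str:
    """Replace known attribute short names with their long forms, token by token.

    Tokens are split on whitespace, so runs of spaces are collapsed to one.
    """
    tokens = expression.split()
    return " ".join(ATTRIBUTE_SHORT_NAMES.get(token, token) for token in tokens)

print(expand_attributes("n.c lig or n.c rec"))
print(expand_attributes("a.ci > 2"))
```

Unknown tokens are passed through unchanged, so element symbols, numbers and operators are left as typed.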

Logical operators may be used:

  • C or H : select atoms that are carbons or hydrogens

When using the Find command, you may also press the Tab key in the search box for context-sensitive completion. This is particularly useful when searching for nodes by name. For example, entering "ALA" (with the quote sign, to indicate a name) and pressing the Tab key lists all nodes whose name begins with "ALA", e.g.:

  • "ALA 22 Backbone"
  • "ALA 22 Side-chain"
  • "ALA 22"
  • "ALA 28 Backbone"
  • "ALA 28 Side-chain"
  • "ALA 28"
  • ...
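
The completion behavior described above amounts to prefix matching on node names. A minimal Python sketch of that idea follows, assuming a plain list of names; the `complete` function and the sample names are hypothetical, and the ordering SAMSON actually uses for its suggestions is not specified here.

```python
def complete(prefix: str, names: list[str]) -> list[str]:
    """Return the names that start with the typed prefix, sorted alphabetically."""
    return sorted(name for name in names if name.startswith(prefix))

# Sample node names, made up for this illustration.
names = [
    "ALA 22", "ALA 22 Backbone", "ALA 22 Side-chain",
    "ALA 28", "GLN 2", "CA",
]

for candidate in complete("ALA", names):
    print(f'"{candidate}"')
```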

Specification by attributes#

Nodes and groups of nodes may be specified by their attributes.

SAMSON's NSL supports several different attribute spaces based on node types. Since nodes may have different types in SAMSON (e.g. atom, bond, etc.), each attribute is defined in an attribute space:

The node attribute space (short name n) corresponds to attributes that are defined for every node. For example, the selection flag is a node attribute, since each node has a selection flag. Hence, the NSL expression node.selectionFlag true matches any node whose selection flag is true, regardless of its node type (atom, bond, etc.).
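
To make the idea of a node-level attribute concrete, here is a small Python sketch, assuming a toy `Node` record with a type and a selection flag (both the class and its field names are hypothetical, not SAMSON's data model). The filter only reads the selection flag, so nodes of any type can match.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    node_type: str               # e.g. "atom", "bond", "residue"
    selection_flag: bool = False

nodes = [
    Node("CA", "atom", selection_flag=True),
    Node("N-CA", "bond", selection_flag=True),
    Node("ALA 22", "residue"),
]

# Toy equivalent of "node.selectionFlag true": the node type is never inspected.
selected = [node.name for node in nodes if node.selection_flag]
print(selected)
```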

Specification by name#

Nodes may be specified by names, i.e. strings enclosed with quotes.

Note that bond names are formed from the names of the atoms they connect, and that bonds cannot be searched by name.

As noted above, pressing the Tab key in the search box while a string is being entered (with the quote sign, to indicate a name) lists all nodes whose name begins with the entered string. For example, entering "ALA" and pressing the Tab key may yield:

  • "ALA 22 Backbone"
  • "ALA 22 Side-chain"
  • "ALA 22"
  • "ALA 28 Backbone"
  • "ALA 28 Side-chain"
  • "ALA 28"
  • ...

Possible values: a string enclosed with quotes.

Examples:

  • "Folder 1"
  • "CA"
  • "Group 1"
  • "1EQZ"
  • "ALA 28"
  • "Nanotube"
  • "Simulator 1"

Specification by symbols and element names#

In order to efficiently match atoms of a given element type, symbols and element names are valid NSL expressions.

Possible values: atomic symbols and element names (capitalized)

Examples:

  • Carbon: matches all carbon atoms
  • C: matches all carbon atoms
  • Ca: matches all calcium atoms
  • H: matches all hydrogen atoms
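
The difference between a symbol, an element name and a quoted node name can be sketched as follows. This is an illustration under stated assumptions: the `ELEMENT_NAMES` table is partial, the helper names are hypothetical, and matching is assumed to be case-sensitive. It shows why C and Ca select different atoms, and why an alpha carbon named "CA" is not a calcium atom.

```python
# Assumed, partial table of element names and their symbols.
ELEMENT_NAMES = {
    "Hydrogen": "H", "Carbon": "C", "Oxygen": "O",
    "Sulfur": "S", "Calcium": "Ca",
}

def to_symbol(token: str) -> str:
    """Map an element name or symbol to its symbol; raise if unknown."""
    if token in ELEMENT_NAMES:
        return ELEMENT_NAMES[token]
    if token in ELEMENT_NAMES.values():
        return token
    raise ValueError(f"unknown element: {token!r}")

# Toy atoms as (node name, element symbol) pairs.
atoms = [("CA", "C"), ("CB", "C"), ("CA", "Ca"), ("HA", "H")]

def match(token: str) -> list[tuple[str, str]]:
    """Return the atoms whose element matches the given symbol or element name."""
    symbol = to_symbol(token)
    return [atom for atom in atoms if atom[1] == symbol]

print(match("Carbon"))
print(match("Ca"))
```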

Operations on sets#

Logical operators#

Logical operators may be used to perform operations on sets.

Available operators:

  • and
  • not
  • or
  • xor (exclusive or).

Examples:

  • a.sn >= 20 and a.sn <= 897: matches atoms with a serial number between 20 and 897 (inclusive)
  • a.sn <= 20 or a.sn >= 40: matches atoms with a serial number of at most 20 or at least 40
  • n.t r and not r.t CYS: matches residue nodes that are not cysteines
  • a.sn >= 20 xor a.oc >= 0.5: matches atoms that have either a serial number of at least 20 or an occupancy of at least 0.5, but not both

Note that, without proper care, the not operator might produce surprising results. For example, not r.t CYS does not return only the residues that are not cysteines: it matches every node that is not a cysteine residue, so a folder node, for instance, also matches. If only residue nodes should be returned, a proper query would be n.t r and not r.t CYS.
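
The logical operators behave like set operations over all nodes of the document. The Python sketch below mimics them with built-in sets on a made-up document (the node names are hypothetical) and reproduces the not pitfall: the complement is taken with respect to every node, not only residues.

```python
# Toy document: every node, whatever its type.
all_nodes = {"Folder 1", "CYS 5", "ALA 22", "GLN 2", "CA"}

residues = {"CYS 5", "ALA 22", "GLN 2"}   # stands for: n.t r
cysteines = {"CYS 5"}                     # stands for: r.t CYS

not_cys = all_nodes - cysteines           # not r.t CYS
residues_not_cys = residues & not_cys     # n.t r and not r.t CYS

print(sorted(not_cys))            # also contains the folder and the atom
print(sorted(residues_not_cys))   # residues only

# The remaining operators map to union and symmetric difference.
either = residues | {"CA"}                    # or
exactly_one = residues ^ {"CYS 5", "CA"}      # xor
```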

Topology operators#

Topology (containment) operators may be used to specify whether nodes lie inside or outside a set.

Available operators:

  • in
  • out of

Examples:

  • n.t a in "2AZ8": matches atoms in "2AZ8"
  • n.t a in r.t CYS: matches atoms that belong to cysteines
  • H in r.t ARG: matches hydrogens that belong to arginines
  • H in n.h: matches hidden hydrogens
  • n.t a out of r.t PRO: matches atoms that do not belong to prolines
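
One way to picture in and out of is as a test on the document hierarchy: a node is "in" a set if one of its ancestors belongs to that set. The Python sketch below makes that assumption explicit on a toy tree; the `parent` table, the node identifiers and the helper functions are hypothetical, and SAMSON's exact semantics may differ.

```python
# Toy hierarchy: child -> parent (None marks the root).
parent = {
    "2AZ8": None,
    "CYS 5": "2AZ8",
    "PRO 9": "2AZ8",
    "cys5_N": "CYS 5",
    "cys5_SG": "CYS 5",
    "pro9_CA": "PRO 9",
}
atoms = {"cys5_N", "cys5_SG", "pro9_CA"}

def ancestors(node: str) -> set[str]:
    """Collect every ancestor of a node by walking up the parent links."""
    found = set()
    current = parent[node]
    while current is not None:
        found.add(current)
        current = parent[current]
    return found

def nodes_in(candidates: set[str], containers: set[str]) -> set[str]:
    """Candidates with at least one ancestor among the containers."""
    return {node for node in candidates if ancestors(node) & containers}

def nodes_out_of(candidates: set[str], containers: set[str]) -> set[str]:
    """Candidates with no ancestor among the containers."""
    return candidates - nodes_in(candidates, containers)

print(sorted(nodes_in(atoms, {"CYS 5"})))       # like: n.t a in r.t CYS
print(sorted(nodes_out_of(atoms, {"PRO 9"})))   # like: n.t a out of r.t PRO
```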

Proximity operators#

Proximity operators may be used to select nodes based on distances.

Available operators:

  • within {distance} of
  • beyond {distance} of

Examples:

  • C within 5A of "GLN 2": matches carbons within 5 angstrom of "GLN 2"
  • n.t a beyond 5A of "2AZ8-IA": matches atoms beyond 5 angstrom of "2AZ8-IA"

Possible units for distances are:

  • micrometer (short name um)
  • nanometer (short name nm)
  • angstrom (short name A)
  • picometer (short name pm)
  • femtometer (short name fm)
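
The units above differ only by powers of ten, so any distance can be normalized to angstroms before comparison. The following Python sketch shows the conversion. The `parse_distance` helper and its accepted syntax (a number immediately followed by a short unit name, as in 5A) are assumptions made for illustration, not a description of SAMSON's parser.

```python
import re

# Number of angstroms in one unit of each supported short name.
ANGSTROMS_PER_UNIT = {
    "um": 1e4,    # 1 micrometer = 10,000 angstroms
    "nm": 10.0,   # 1 nanometer  = 10 angstroms
    "A": 1.0,
    "pm": 0.01,   # 1 picometer  = 0.01 angstrom
    "fm": 1e-5,   # 1 femtometer = 0.00001 angstrom
}

_DISTANCE = re.compile(r"\s*([0-9]*\.?[0-9]+)\s*(um|nm|A|pm|fm)\s*")

def parse_distance(text: str) -> float:
    """Convert a string such as '5A' or '0.5nm' to a distance in angstroms."""
    match = _DISTANCE.fullmatch(text)
    if match is None:
        raise ValueError(f"cannot parse distance: {text!r}")
    value, unit = match.groups()
    return float(value) * ANGSTROMS_PER_UNIT[unit]

print(parse_distance("5A"))
print(parse_distance("0.5nm"))
```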

Note that, without proper care, the within condition might produce surprising results. For example, * within 5A of "GLN 2" will select the document node, since some atoms in the document are within 5 angstrom of "GLN 2": the atoms that belong to "GLN 2". If only atom nodes should be returned, a proper query would be n.t a within 5A of "GLN 2". Note that this latter query would also return atoms that belong to "GLN 2". If only atoms outside "GLN 2" should be returned, then a proper query would be n.t a within 5A of "GLN 2" out of "GLN 2".
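
The last two queries can be reproduced on toy coordinates. In the Python sketch below, the atom identifiers, residue names and positions are made up, and the cutoff is assumed to be inclusive. It shows that a plain distance filter also returns the atoms of "GLN 2" itself, and that subtracting them corresponds to appending out of "GLN 2".

```python
import math

# Toy atoms: identifier -> (residue name, position in angstroms).
atoms = {
    "gln2_N": ("GLN 2", (0.0, 0.0, 0.0)),
    "gln2_CA": ("GLN 2", (1.5, 0.0, 0.0)),
    "ala3_N": ("ALA 3", (3.0, 0.0, 0.0)),
    "lys40_NZ": ("LYS 40", (20.0, 0.0, 0.0)),
}

def atoms_of(residue: str) -> set[str]:
    """Identifiers of the atoms that belong to the given residue."""
    return {name for name, (res, _) in atoms.items() if res == residue}

def within(candidates: set[str], targets: set[str], cutoff: float) -> set[str]:
    """Candidates lying at most `cutoff` angstroms from any target atom."""
    return {
        a for a in candidates
        if any(math.dist(atoms[a][1], atoms[t][1]) <= cutoff for t in targets)
    }

gln2 = atoms_of("GLN 2")
near = within(set(atoms), gln2, 5.0)   # like: n.t a within 5A of "GLN 2"
near_outside = near - gln2             # like: ... out of "GLN 2"

print(sorted(near))
print(sorted(near_outside))
```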