
Node Specification Language#

SAMSON has a powerful Node Specification Language (NSL) that may be used to select various structures (ligands, receptors, etc.) and nodes (atoms, residues, etc.) based on their properties. NSL can be used in the Find command or to filter nodes in the Document view.

Interactive tutorial in SAMSON (Help > Tutorials): "Selecting using the Node Specification Language".

Find command#

The Find command (1) in SAMSON lets you enter an NSL string to select nodes (atoms, residues, etc.) from the active document:

  1. Select > Find (Windows, Linux: Ctrl+F; macOS: Cmd+F)

Select nodes with the Find command

When using the Find command, you may press the Tab key in the search box for context-sensitive completion. This is particularly useful when searching for nodes by name. For example, entering "ALA (with the opening quotation mark but without the closing one, to indicate the start of a name) and pressing the Tab key lists all nodes whose names begin with "ALA", e.g.:

  • "ALA 22 Backbone"
  • "ALA 22 Side chain"
  • "ALA 22"
  • "ALA 28 Backbone"
  • "ALA 28 Side chain"
  • "ALA 28"
  • ...

Find using Document View#

An NSL expression may also be used to filter nodes in the Document view (1) and to select the matching nodes by pressing Enter. The image below shows side chains being selected with the NSL expression n.t sc (full version: node.type sidechain):

  1. Interface > Document view (Windows, Linux: Ctrl+1; macOS: Cmd+1)

Filter and select nodes from the document view

Examples#

Here are some examples of NSL expressions:

  • node.category ligand, receptor (short version: n.c lig, rec): matches ligands and receptors
  • Hydrogen (short version: H): matches hydrogens
  • residue.id 20:40, 50:60 (short version: r.id 20:40, 50:60): matches residues with IDs between 20 and 40 and between 50 and 60
  • atom.chainID > 2 (short version: a.ci > 2): matches atoms with a chain ID strictly larger than 2
  • Carbon in node.selected (short version: C in n.s): matches carbons in the current selection
  • bond.order > 1.5 (short version: b.o > 1.5): matches bonds with order strictly larger than 1.5
  • node.type backbone (short version: n.t bb): matches backbone nodes

Logical operators examples:

  • C or H: matches atoms that are carbons or hydrogens
  • node.type residue and not residue.type ALA (short version: n.t r and not r.t ALA): matches all non-alanine residues

Topology operators examples:

  • O in node.type sidechain (short version: O in n.t sc): matches oxygens in side chain nodes
  • n.t a out of n.t r: matches atoms that do not belong to residues
  • node.type sidechain having S (short version: n.t sc h S): matches side chain nodes that have at least one sulfur atom
  • H linking O (short version: H l O): matches hydrogens bonded to oxygen atoms

Proximity operators examples:

  • "CA" within 5A of S (short version: "CA" w 5A of S): matches nodes named "CA" that are within 5 angstrom of any sulfur atom (use quotes, since names may contain spaces)
  • node.type residue beyond 5A of node.selected (short version: n.t r b 5A of n.s): matches residue nodes beyond 5 angstrom of the current selection
  • residue.secondaryStructure helix (short version: r.ss h): matches residue nodes in alpha helices

Specification by attributes#

Nodes and groups of nodes may be specified by their attributes.

SAMSON's NSL supports several different attribute spaces based on node types. Since nodes may have different types in SAMSON (e.g. atom, bond, etc.), each attribute is defined in an attribute space:

The node attribute space (short name: n) corresponds to attributes that are defined for every node. For example, the selection flag is a node attribute, since each node has a selection flag. Hence, the NSL expression node.selectionFlag true matches any node whose selection flag is true, regardless of its node type (atom, bond, etc.).

Specification by name#

Nodes may be specified by name, i.e. by a string enclosed in quotes, which may contain the wildcard character *.

Note

Bond names are formed from the names of the atoms they bond, and cannot be searched by name.

For example, entering "ALA*" may yield ALA residues, backbones, and side chains, e.g.:

  • "ALA 22 Backbone"
  • "ALA 22 Side chain"
  • "ALA 22"
  • "ALA 28 Backbone"
  • "ALA 28 Side chain"
  • "ALA 28"
  • ...

Examples:

  • "CA*": matches nodes with names that start with CA
  • "*AL*": matches nodes that have AL in their names, e.g. ALA and VAL residues, backbones, and side chains
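The wildcard behavior described above can be imitated outside SAMSON with a minimal Python sketch. It is only an illustration of how * patterns behave, not SAMSON's implementation: the node names are hypothetical, the helper match_names is invented for this example, and case-sensitive matching is an assumption.

```python
from fnmatch import fnmatchcase

# Hypothetical node names, modeled on the examples above.
NODE_NAMES = [
    "ALA 22",
    "ALA 22 Backbone",
    "ALA 22 Side chain",
    "VAL 30",
    "CA",
    "CB",
]


def match_names(pattern, names=NODE_NAMES):
    """Return the names matched by a pattern in which * stands for any run of characters.

    Assumption: matching is case-sensitive (fnmatchcase); NSL may differ.
    """
    return [name for name in names if fnmatchcase(name, pattern)]


print(match_names("ALA*"))   # names starting with ALA
print(match_names("*AL*"))   # names containing AL (ALA and VAL entries)
print(match_names("CA*"))    # names starting with CA
```

Note that "*AL*" picks up "VAL 30" as well as the ALA entries, which mirrors the second example in the list above.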

Specification by symbols and element names#

In order to efficiently match atoms of a given element type, symbols and element names are valid NSL expressions.

Possible values: atomic symbols and element names (with the first character capitalized).

Examples:

  • Carbon: matches carbon atoms
  • C: matches carbon atoms
  • Ca: matches calcium atoms
  • H: matches hydrogen atoms

Lists and ranges#

NSL supports providing lists (divided by comma ,) and ranges (via colon :) wherever possible. Examples:

  • atom.chain A, B, C (short version: a.c A,B,C): matches atoms in chains A, B, and C
  • residue.id 20:40, 50:60 (short version: r.id 20:40, 50:60): matches residues with IDs between 20 and 40, and between 50 and 60
  • atom.x -1nm:1nm (short version: a.x -1nm:1nm): matches atoms with x-coordinate between -1 nm and 1 nm
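To make the list and range notation concrete, here is a small Python sketch that interprets a specification such as 20:40, 50:60 for integer IDs. The helpers parse_id_spec and matches are hypothetical, and treating both range bounds as inclusive is an assumption based on the wording "between 20 and 40"; SAMSON's own parser may behave differently.

```python
def parse_id_spec(spec):
    """Parse a spec such as "20:40, 50:60" into a list of (low, high) pairs.

    A single value such as "7" becomes the degenerate range (7, 7).
    """
    intervals = []
    for item in spec.split(","):
        item = item.strip()
        if ":" in item:
            low, high = item.split(":")
            intervals.append((int(low), int(high)))
        else:
            value = int(item)
            intervals.append((value, value))
    return intervals


def matches(value, intervals):
    """True if value lies in any interval (bounds assumed inclusive)."""
    return any(low <= value <= high for low, high in intervals)


intervals = parse_id_spec("20:40, 50:60")
print(intervals)                 # [(20, 40), (50, 60)]
print(matches(35, intervals))    # True
print(matches(45, intervals))    # False
```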

Operations on sets#

Logical operators#

Logical operators may be used to perform operations on sets.

Available operators:

  • and
  • not
  • or
  • xor (exclusive or).

Examples:

  • sg.id 1000:1040 and sg.nat < 4: matches structural groups with IDs between 1000 and 1040 that have fewer than 4 atoms
  • a.sn <= 20 or a.sn >= 40: matches atoms with a serial number of at most 20 or at least 40
  • n.t r and not r.t CYS: matches residues that are not cysteines
  • a.sn >= 20 xor a.oc >= 0.5: matches atoms that have either a serial number of at least 20 or an occupancy of at least 0.5, but not both

Note

Without proper care, the not operator might produce surprising results. For example, not r.t CYS does not return only the residues that are not cysteines: a folder, for instance, is also a node that is not a cysteine residue, so it matches as well. If only residues should be returned, a proper query would be n.t r and not r.t CYS.
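The logical operators behave like set operations over all nodes of the document, which is why not can match more than expected. The following Python sketch illustrates this with a hypothetical toy document; the node names and the dictionary layout are invented for the example and do not reflect SAMSON's data model.

```python
# Hypothetical toy document: node name -> (node type, residue type or None).
NODES = {
    "Folder": ("folder", None),
    "CYS 5": ("residue", "CYS"),
    "ALA 22": ("residue", "ALA"),
    "GLY 7": ("residue", "GLY"),
}

ALL = set(NODES)
residues = {n for n, (t, _) in NODES.items() if t == "residue"}  # n.t r
cysteines = {n for n, (_, r) in NODES.items() if r == "CYS"}     # r.t CYS
alanines = {n for n, (_, r) in NODES.items() if r == "ALA"}      # r.t ALA

not_cys = ALL - cysteines               # not r.t CYS (complement over ALL nodes)
non_cys_residues = residues & not_cys   # n.t r and not r.t CYS
cys_or_ala = cysteines | alanines       # r.t CYS or r.t ALA
either_not_both = residues ^ cysteines  # n.t r xor r.t CYS

print(sorted(not_cys))           # includes "Folder", which is not a residue at all
print(sorted(non_cys_residues))  # only the non-cysteine residues
```

Restricting the complement with n.t r first is what removes the folder from the result.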

Topology operators#

Topology operators may be used to specify that nodes are inside or outside a set, that nodes of one set contain nodes from another set, or that they have atoms linked to atoms in another set.

Available operators:

  • in
  • out of
  • having (short name: h) - whether a node has another node(s)
  • linking (short name: l) - whether an atom or a structural node is linked to another atom(s) or structural node(s) containing atoms

Examples:

  • n.t a in "2AZ8": matches atoms in "2AZ8"
  • n.t a in r.t CYS: matches atoms that belong to cysteines
  • H in r.t ARG: matches hydrogens that belong to arginines
  • H in n.h: matches hidden hydrogens
  • n.t a out of r.t PRO: matches atoms that do not belong to prolines
  • node.type sidechain having S (short version: n.t sc h S): matches side chain nodes that have at least one sulfur atom
  • H linking O (short version: H l O): matches hydrogens bonded to oxygen atoms
  • node.type residue linking (residue.type PRO, CYS) (short version: n.t r l (r.t PRO, CYS)): matches residues that have atoms linked to atoms in PRO or CYS residues; this includes the PRO and CYS residues themselves.
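As a rough mental model of the four topology operators, the Python sketch below evaluates them on a hypothetical set of atoms and bonds. The data layout and the helpers atoms_of, atoms_in, and linking are assumptions made for this illustration; they are not part of SAMSON.

```python
# Hypothetical atoms: id -> (element symbol, parent residue name).
ATOMS = {
    1: ("O", "SER 3"),
    2: ("H", "SER 3"),
    3: ("C", "SER 3"),
    4: ("S", "CYS 5"),
    5: ("H", "CYS 5"),
}
BONDS = [(1, 2), (1, 3), (4, 5)]  # pairs of bonded atom ids


def atoms_of(element):
    """Ids of all atoms with the given element symbol."""
    return {i for i, (e, _) in ATOMS.items() if e == element}


def atoms_in(residue):
    """Ids of all atoms that belong to the given residue."""
    return {i for i, (_, r) in ATOMS.items() if r == residue}


def linking(left, right):
    """Atoms of `left` that are bonded to at least one atom of `right`."""
    result = set()
    for a, b in BONDS:
        if a in left and b in right:
            result.add(a)
        if b in left and a in right:
            result.add(b)
    return result


h_in_ser = atoms_of("H") & atoms_in("SER 3")      # H in "SER 3"
h_out_of_ser = atoms_of("H") - atoms_in("SER 3")  # H out of "SER 3"
residues_having_s = {                             # n.t r having S
    r for _, r in ATOMS.values() if atoms_in(r) & atoms_of("S")
}
h_linking_o = linking(atoms_of("H"), atoms_of("O"))  # H linking O

print(h_in_ser, h_out_of_ser, residues_having_s, h_linking_o)
```

In this model, in and out of act on the left-hand set (intersection and difference), whereas having and linking test a property of each left-hand node against the right-hand set.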

Proximity operators#

Proximity operators may be used to select nodes based on distances.

Available operators:

  • within {distance} of (short name: w {distance} of)
  • beyond {distance} of (short name: b {distance} of)

Examples:

  • C within 5A of "GLN 2" (short version: C w 5A of "GLN 2"): matches carbons within 5 angstrom of "GLN 2"
  • node.type atom beyond 5A of "2AZ8-IA" (short version: n.t a b 5A of "2AZ8-IA"): matches atoms beyond 5 angstrom of "2AZ8-IA"

Note

Without proper care, the within and beyond conditions might produce surprising results. For example, * within 5A of "GLN 2" will select the document node, since some atoms in the document are within 5 angstrom of "GLN 2": the atoms that belong to "GLN 2". If only atom nodes should be returned, a proper query would be n.t a within 5A of "GLN 2". Note that this latter query would also return atoms that belong to "GLN 2". If only atoms outside "GLN 2" should be returned, then a proper query would be n.t a within 5A of "GLN 2" out of "GLN 2".
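The pitfall described in this note can be reproduced with a short Python sketch on hypothetical coordinates. The atom names, the positions, and the helper within are invented for the example; treating the cutoff as inclusive and beyond as the complement of within are both assumptions, not documented SAMSON behavior.

```python
import math

# Hypothetical atoms: name -> (parent group, (x, y, z) in angstroms).
ATOMS = {
    "GLN2:CA": ("GLN 2", (0.0, 0.0, 0.0)),
    "GLN2:CB": ("GLN 2", (1.5, 0.0, 0.0)),
    "ALA3:CA": ("ALA 3", (3.8, 0.0, 0.0)),
    "LEU9:CA": ("LEU 9", (12.0, 0.0, 0.0)),
}


def within(cutoff, target_group):
    """Atoms whose distance to any atom of target_group is at most cutoff."""
    targets = [pos for group, pos in ATOMS.values() if group == target_group]
    return {
        name
        for name, (_, pos) in ATOMS.items()
        if any(math.dist(pos, t) <= cutoff for t in targets)
    }


gln2_atoms = {name for name, (group, _) in ATOMS.items() if group == "GLN 2"}

near = within(5.0, "GLN 2")      # n.t a within 5A of "GLN 2"
near_outside = near - gln2_atoms  # n.t a within 5A of "GLN 2" out of "GLN 2"
far = set(ATOMS) - near           # n.t a beyond 5A of "GLN 2" (assumed complement)

print(sorted(near))          # contains the GLN 2 atoms themselves
print(sorted(near_outside))  # ['ALA3:CA']
print(sorted(far))           # ['LEU9:CA']
```

The atoms of "GLN 2" are at distance zero from themselves, so they always satisfy the within condition unless they are excluded explicitly.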

Length units#

  • femtometer (short name: fm)
  • picometer (short name: pm)
  • angstrom (short name: A)
  • nanometer (short name: nm)
  • micrometer (short name: um)
  • millimeter (short name: mm)
  • centimeter (short name: cm)
  • meter (short name: m)
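The short unit names above relate to one another by powers of ten, so 5A and 0.5nm denote the same length. The following Python sketch shows the conversions using exact fractions; the table METERS_PER_UNIT and the helper convert are hypothetical names introduced for this illustration only.

```python
from fractions import Fraction

# Size of each unit in meters, as exact powers of ten.
METERS_PER_UNIT = {
    "fm": Fraction(1, 10**15),
    "pm": Fraction(1, 10**12),
    "A": Fraction(1, 10**10),
    "nm": Fraction(1, 10**9),
    "um": Fraction(1, 10**6),
    "mm": Fraction(1, 10**3),
    "cm": Fraction(1, 10**2),
    "m": Fraction(1),
}


def convert(value, from_unit, to_unit):
    """Convert a length between two of the units listed above, exactly."""
    return Fraction(value) * METERS_PER_UNIT[from_unit] / METERS_PER_UNIT[to_unit]


print(convert(5, "A", "nm"))    # 1/2
print(convert(1, "nm", "A"))    # 10
print(convert(1, "nm", "pm"))   # 1000
```

Fractions are used instead of floating-point numbers so that comparisons such as 5 A == 0.5 nm hold exactly in the sketch.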

Mass units#

  • dalton (short name: Da)
  • kilodalton (short name: kDa)
  • megadalton (short name: MDa)
  • gigadalton (short name: GDa)
  • electronMass (short name: auMass)
  • yoctogram (short name: yg)
  • zeptogram (short name: zg)
  • attogram (short name: ag)
  • femtogram (short name: fg)
  • picogram (short name: pg)
  • nanogram (short name: ng)
  • microgram (short name: ug)
  • gram (short name: g)
  • kilogram (short name: kg)