Using the SMILES Manager

In this tutorial, you will learn how to use the SMILES Manager extension of SAMSON. This extension is based on RDKit, a widely used open-source toolkit for cheminformatics. One of its features is the conversion of SMILES strings into 2D and 3D structures.

The extension interface presents three tabs: Manage SMILES, Replace fragments, and Positional Analogue Scanning. In this tutorial, we will present the first two tabs one by one, as they are completely independent. For Positional Analogue Scanning, please check the Positional Analogue Scanning using the SMILES Manager Extension tutorial. In each of the first two tabs, you can find a menu bar that contains all the actions with their corresponding shortcuts. SMILES strings are presented in a table with their corresponding names and 2D depictions.




You can import SMILES files (.smi) or text files (.txt) containing SMILES strings by clicking on Open in the File drop-down menu located in the menu bar of the extension. The SMILES files must have the RDKit .smi format (example below), with a SMILES string in the first column and a molecule name in the second column.

COc1cc(CNS(=O)(=O)CCCCC=CC(C)C)ccc1O Mol1
COc1cc(CNC(CCCCC=CC(C)C)C(F)(F)F)ccc1O Mol2
COc1cc(CNC2CC2CCCCC=CC(C)C)ccc1O Mol3
COc1cc(CC=C(F)CCCCC=CC(C)C)ccc1O Mol4
COc1cc(CNC2OCC2CCCCC=CC(C)C)ccc1O Mol5
COc1cc(COC2OCC2CCCCC=CC(C)C)ccc1O Mol6

RDKit SMILES files containing additional attributes (as in the example below) are also supported, but these attributes will be ignored.

SMILES Attribute Name
CC 2 Mol1
C=C 5 Mol2
[CH+2]C[CH+2] 100 Mol3
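
The two layouts above can be read with a few lines of plain Python. This is only a minimal sketch of the parsing rule described in this tutorial, not the extension's actual importer: the function name parse_smi and the has_header flag are hypothetical, and the sketch assumes that the name is always the last column and that any columns in between are attributes to ignore.

```python
from typing import List, Tuple


def parse_smi(text: str, has_header: bool = False) -> List[Tuple[str, str]]:
    """Return (smiles, name) pairs from whitespace-separated .smi text."""
    lines = [line for line in text.splitlines() if line.strip()]
    if has_header:
        lines = lines[1:]  # skip a title line such as "SMILES Attribute Name"
    records = []
    for line in lines:
        fields = line.split()
        smiles = fields[0]
        # Assumption: the name is the last column; middle columns are ignored.
        # With no name column, fall back to the SMILES code itself.
        name = fields[-1] if len(fields) > 1 else smiles
        records.append((smiles, name))
    return records


with_attributes = """SMILES Attribute Name
CC 2 Mol1
C=C 5 Mol2
[CH+2]C[CH+2] 100 Mol3
"""
records = parse_smi(with_attributes, has_header=True)
```

Falling back to the SMILES code when no name is given mirrors the naming behavior of the table described later in this tutorial.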

Text files (.txt) containing SMILES strings can also be imported.


Adding, modifying, and removing SMILES

It is possible to manually add SMILES strings using the Add line action from the Edit drop-down menu or using the Add button just below the SMILES table. In the same way, each line can be removed and the table can be cleared. A SMILES string can be edited directly in the corresponding Code cell of the table.

Also, a name can be assigned to each SMILES string. If no name has been assigned, the name will be the same as its SMILES code.

Generating, opening, updating, and saving 2D depictions

Using RDKit functions, 2D depictions of the SMILES strings are generated on the fly. When a SMILES string is invalid, the corresponding line is highlighted and an error image is shown instead of the 2D depiction. You can open a large view of the 2D depiction either by double-clicking on it or by right-clicking on the image and choosing the Open action in the context menu.
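
For readers who want to reproduce this outside SAMSON, the validity check and the depiction can be sketched directly with RDKit. This is an illustration under assumptions, not the extension's code: the helper name depict_svg and the image size are made up for the example, and returning None stands in for the error image.

```python
from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D


def depict_svg(smiles, width=300, height=200):
    """Return an SVG depiction of a SMILES string, or None if it is invalid."""
    mol = Chem.MolFromSmiles(smiles)  # RDKit returns None for invalid SMILES
    if mol is None:
        return None
    drawer = rdMolDraw2D.MolDraw2DSVG(width, height)
    # Computes 2D coordinates if needed, then draws the molecule.
    rdMolDraw2D.PrepareAndDrawMolecule(drawer, mol)
    drawer.FinishDrawing()
    return drawer.GetDrawingText()


valid_svg = depict_svg("COc1cc(CNC2CC2CCCCC=CC(C)C)ccc1O")
invalid_svg = depict_svg("c1ccccc")  # unclosed ring, so parsing fails
```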

At any time, the 2D depictions presented in the table can be hidden by clicking on the Hide 2D image action of the View drop-down menu or by clicking on the 2D header cell of the table. Note that even when the 2D depictions are hidden, it is still possible to open them as described above.

You can open several windows to compare several molecules, or navigate through all of them in a single window using the navigation buttons. For large molecules, it is possible to zoom in and out of the images. Note that when the SMILES code of a molecule is modified, the corresponding image is automatically updated (the same applies to the molecule's name).

Finally, you can save the 2D depiction as a single PNG or SVG image either from the right-click context menu or from the large view window.

Grid images of 2D depictions

It is also possible to save the 2D depictions in a grid in an SVG file by clicking on the Save as grid image action of the File drop-down menu. You can also save only the selected 2D depictions by choosing the Save selection as grid image action. A popup window will appear where you will be able to set the parameters of the grid (number of columns, size of a single panel, and whether the molecule’s name should be displayed or not).
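
The same kind of grid can be produced with RDKit's MolsToGridImage function. The sketch below maps the three parameters mentioned above onto that function. The wrapper make_grid_svg and its argument names are hypothetical, and the sketch assumes it is run as a plain script (inside a Jupyter notebook, RDKit may return a display object instead of a string).

```python
from rdkit import Chem
from rdkit.Chem import Draw


def make_grid_svg(entries, columns=3, panel_size=(250, 200), show_names=True):
    """Build an SVG grid from (smiles, name) pairs; invalid SMILES are skipped."""
    mols, names = [], []
    for smiles, name in entries:
        mol = Chem.MolFromSmiles(smiles)
        if mol is not None:
            mols.append(mol)
            names.append(name)
    return Draw.MolsToGridImage(
        mols,
        molsPerRow=columns,                       # number of columns
        subImgSize=panel_size,                    # size of a single panel
        legends=names if show_names else None,    # display names or not
        useSVG=True,
    )


grid_svg = make_grid_svg([
    ("COc1cc(CNS(=O)(=O)CCCCC=CC(C)C)ccc1O", "Mol1"),
    ("COc1cc(CNC(CCCCC=CC(C)C)C(F)(F)F)ccc1O", "Mol2"),
    ("COc1cc(CNC2CC2CCCCC=CC(C)C)ccc1O", "Mol3"),
])
```

The resulting string can then be written to a file with the .svg extension.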

Generating 3D structures

The main feature of this SAMSON module is to generate 3D structures from SMILES codes. After selecting molecules in the table, you can click on the Selected SMILES string to Document action in the Export drop-down menu. The 3D structures of the selected molecules with their corresponding names will be added to the active document.

The generation of the 3D structure can also be done for a single SMILES string either by clicking on the Generate 3D structure action in the right-click context menu or by clicking on the Generate 3D structure button in the large view window.
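
For reference, a common way to go from a SMILES code to 3D coordinates with RDKit is to add hydrogens, embed the molecule with the ETKDG method, and relax it with a force field. The sketch below shows that route. It is an assumption that the extension follows similar steps, and the helper name smiles_to_3d, the fixed random seed, and the MMFF optimization are choices made for this example.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def smiles_to_3d(smiles, name, seed=42):
    """Return an RDKit molecule with explicit hydrogens and one 3D conformer."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Invalid SMILES: {smiles}")
    mol = Chem.AddHs(mol)  # hydrogens are needed for a sensible 3D geometry
    params = AllChem.ETKDGv3()
    params.randomSeed = seed  # fixed seed for reproducible coordinates
    if AllChem.EmbedMolecule(mol, params) == -1:
        raise RuntimeError(f"3D embedding failed for {smiles}")
    AllChem.MMFFOptimizeMolecule(mol)  # quick force-field relaxation
    mol.SetProp("_Name", name)  # the name becomes the title of the mol block
    return mol


mol3d = smiles_to_3d("COc1cc(CNS(=O)(=O)CCCCC=CC(C)C)ccc1O", "Mol1")
mol_block = Chem.MolToMolBlock(mol3d)  # text that can be saved as a .mol file
```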

One interesting feature that I have improved in this new version is the substructure search. It is now possible to do a more advanced search using a given combination of patterns. For example, let us generate only the molecules that have both a fluorine atom and a nitrogen atom, separated in space. To do so, enter F for fluorine in the Search bar, then click on Select visible. This selects the SMILES strings that have a fluorine atom (two SMILES strings in this example). Then clear the search bar. Note that the selection remains unchanged (the two SMILES strings are still selected). Now enter N for nitrogen: the search result shows only one of the selected SMILES strings (the other one does not contain nitrogen). Finally, click on the Unselect hidden button to unselect the molecule that is not displayed and keep only the ones containing the nitrogen atom.
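
The select-then-narrow workflow above amounts to intersecting two substructure searches. Here is a minimal RDKit sketch of that logic, applied to the six example molecules from the import section. Treating the search text as a SMARTS pattern is an assumption of this sketch, and the helper name has_pattern is hypothetical.

```python
from rdkit import Chem

library = {
    "Mol1": "COc1cc(CNS(=O)(=O)CCCCC=CC(C)C)ccc1O",
    "Mol2": "COc1cc(CNC(CCCCC=CC(C)C)C(F)(F)F)ccc1O",
    "Mol3": "COc1cc(CNC2CC2CCCCC=CC(C)C)ccc1O",
    "Mol4": "COc1cc(CC=C(F)CCCCC=CC(C)C)ccc1O",
    "Mol5": "COc1cc(CNC2OCC2CCCCC=CC(C)C)ccc1O",
    "Mol6": "COc1cc(COC2OCC2CCCCC=CC(C)C)ccc1O",
}


def has_pattern(smiles, pattern):
    """True if the molecule contains the pattern (interpreted as SMARTS)."""
    mol = Chem.MolFromSmiles(smiles)
    query = Chem.MolFromSmarts(pattern)
    return mol is not None and query is not None and mol.HasSubstructMatch(query)


# Step 1: search "F" and select everything that is visible.
with_fluorine = {name for name, smi in library.items() if has_pattern(smi, "F")}
# Step 2: search "N" and unselect what is now hidden.
with_both = {name for name in with_fluorine if has_pattern(library[name], "N")}
```

Note that in SMARTS an uppercase N only matches aliphatic nitrogen, which is sufficient for these molecules.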

Note that, by default, RDKit does not use stereochemistry information in substructure searches, but this can be changed by enabling the chirality option.
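
In RDKit itself, this behavior is controlled by the useChirality argument of HasSubstructMatch. The short sketch below, adapted from the kind of example used in the RDKit documentation, shows the difference on a small chiral molecule. How the extension exposes this switch is not shown here.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CC[C@H](F)Cl")
same_center = Chem.MolFromSmiles("C[C@H](F)Cl")
mirror_center = Chem.MolFromSmiles("C[C@@H](F)Cl")

# Default search: stereochemistry is ignored, so both queries match.
default_same = mol.HasSubstructMatch(same_center)
default_mirror = mol.HasSubstructMatch(mirror_center)

# Chirality-aware search: only the query with the same configuration matches.
chiral_same = mol.HasSubstructMatch(same_center, useChirality=True)
chiral_mirror = mol.HasSubstructMatch(mirror_center, useChirality=True)
```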

Replace fragments

The Replace fragments tab allows for quick and easy generation of thousands of different molecules by replacing a specified fragment of the initial molecule with various fragments. This feature provides isosteric replacement and can be very useful to quickly generate a library of slightly different molecules to test their properties.

Note that each table is presented in the same way as the SMILES table in the Manage SMILES tab, with the same buttons. One difference is that saving a table as a grid image or as a file is done with the buttons at the bottom right of each table.

Let’s try to generate a library of fragments for a molecule:

  1. First, we need to specify one or several initial molecules in which we would like to replace a fragment. You can import initial molecules from a file by clicking Open initial molecules in the File drop-down menu, add them manually, or import them from a SAMSON Document by clicking SAMSON Selection as initial molecule in the Import drop-down menu. In this tutorial, we will be using the following initial molecule (load it from a file or add it manually):
  2. Specify the fragment that you want to replace in the initial molecules, either by entering it manually or by importing it from the active SAMSON Document using the SAMSON Selection as replaced fragment action in the Import drop-down menu. In this tutorial, the target fragment to replace is:
  3. Provide one or more “Replaced with fragments” (fragments with which you want to make a replacement) by importing from a file (click Open fragments to replace with… in the File drop-down menu) or by adding them manually. In this tutorial, we will be using the following fragments:
  4. In the Run drop-down menu, click on Replace fragments.
  5. The results of the replacements are stored in the Results table. You can export them to the active document: check out the Export drop-down menu.

Note that in RDKit the fragments must have the following syntax: [*:1]C(=O)NC[*:2] with [*:1] and [*:2] at the two extremities of the pattern.
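
One way to reproduce this kind of replacement directly in RDKit is to join the target fragment and a replacement fragment into a reaction SMARTS, so that the atoms labeled [*:1] and [*:2] are kept and everything between them is swapped. The sketch below illustrates that approach and is not necessarily how the extension implements it. The helper name replace_fragment is hypothetical, and both the capsaicin-like initial molecule and the two replacement fragments are assumptions chosen for the example.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def replace_fragment(smiles, target, replacement):
    """Return sorted canonical SMILES obtained by swapping target for replacement."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Invalid SMILES: {smiles}")
    # [*:1] and [*:2] map the attachment atoms from the target to the replacement.
    rxn = AllChem.ReactionFromSmarts(f"{target}>>{replacement}")
    results = set()
    for products in rxn.RunReactants((mol,)):
        product = products[0]
        try:
            Chem.SanitizeMol(product)
        except ValueError:  # skip chemically invalid products
            continue
        results.add(Chem.MolToSmiles(product))  # canonical SMILES removes duplicates
    return sorted(results)


initial = "COc1cc(CNC(=O)CCCCC=CC(C)C)ccc1O"  # assumed initial molecule
target = "[*:1]C(=O)NC[*:2]"
replacements = ["[*:1]S(=O)(=O)NC[*:2]", "[*:1]C(C(F)(F)F)NC[*:2]"]  # assumed fragments

library = []
for fragment in replacements:
    library.extend(replace_fragment(initial, target, fragment))
```

Each entry of library is a SMILES string that can be added to the table, depicted, or converted to 3D like any other.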

The resulting molecules can then be converted into 3D structures, modified, or saved in a file or a grid image.

That’s all for this tutorial about the RDKit-SMILES Manager module of SAMSON. You can check out the Positional Analogue Scanning using the SMILES Manager Extension tutorial.

Get the SMILES Manager extension here.

If you have any questions or feedback, please use the SAMSON forum.

If you need help with installation, please visit: installing SAMSON, adding SAMSON Elements to your SAMSON.
