Predicting protein-ligand complexes using NMR2

NMR2

Introduction

Nuclear Magnetic Resonance (NMR) spectroscopy is a powerful and versatile analytical tool used extensively in chemistry and biology to determine the structure of molecules, including protein-ligand complexes. Unlike X-ray crystallography, which requires crystalline samples, NMR spectroscopy can analyze molecules in solution, closely mimicking physiological conditions. This makes NMR particularly well suited to studying interactions between proteins and ligands.

This tutorial demonstrates how to use NMR2, a method developed by Prof. Dr. Julien Orts at the University of Vienna, through its integration into SAMSON. By running NMR2 within SAMSON, researchers can more easily interpret NMR data to elucidate the structure of protein-ligand complexes, advance the understanding of molecular interactions, and facilitate the development of novel therapeutics.

Adding the NMR2 extension

To follow this tutorial, add the NMR2 extension to your account, then restart SAMSON so that the extension is installed automatically.

Important: the NMR2 extension requires an active license of Cyana, which is available from L.A. Systems.

Obtaining the tutorial document

You can then import the NMR2 Tutorial document into SAMSON. To do this, click on Home > Download, paste the following identifier: 3433D68C-50F9-4F45-848D-42E8142572CE, and press Enter. This will directly import the shared document into SAMSON.

Download the tutorial document

Here is what the tutorial document looks like:

The tutorial document

The document contains:

PIN1: the protein structure
Ligand: the ligand structure
Binding site: a group containing 4 residues (LEU 61, LEU 122, MET 130 and LEU 141)

Setting up NMR2

To find the NMR2 extension, type NMR2 into the Find everything box in the top part of SAMSON (shortcut: Shift+E):

Find NMR2

and press Enter. The NMR2 interface will appear:

NMR2 Interface

When you start NMR2 for the first time, you need to indicate the path to your Cyana executable. For this, click on the Browse... button in part 1 - Set Cyana executable path and select the corresponding file on your computer.

Then, select the folder in which all NMR2 jobs will be created by clicking the Browse... button in part 2 - Set results folder. For each job, NMR2 will create a new subfolder which will indicate the time at which the job was started, as well as the names of the protein and the ligand. For example, a job folder could be named 2024.03.04-10h17m38s-PIN1-Ligand.
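The naming pattern of the job folders can be reproduced with a short Python sketch, which may be handy if you want to locate or sort job folders from a script. The function name job_folder_name is hypothetical, and the pattern is inferred only from the single example name above; this is not NMR2's own code.

```python
from datetime import datetime


def job_folder_name(protein: str, ligand: str, started: datetime) -> str:
    """Build a folder name such as 2024.03.04-10h17m38s-PIN1-Ligand.

    Hypothetical helper: the pattern is inferred from the example in the
    tutorial, not taken from the NMR2 source code.
    """
    # Date as YYYY.MM.DD, then time as HHhMMmSSs
    stamp = started.strftime("%Y.%m.%d-%Hh%Mm%Ss")
    return f"{stamp}-{protein}-{ligand}"


example = job_folder_name("PIN1", "Ligand", datetime(2024, 3, 4, 10, 17, 38))
print(example)
```

Because the timestamp starts with the year and uses zero-padded fields, sorting such names alphabetically also sorts them chronologically.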

Note: the NMR2 extension saves these settings, so you will not need to re-enter them the next time you start SAMSON.

Setting up calculation parameters

Structures

To predict the structure of a protein-ligand complex, the NMR2 algorithm needs to know:

The protein structure
The ligand structure
The list of residues that may contain (or may be close to) methyls observed by NMR

This can be easily defined using the Document view:

Select system in the Document view

and the NMR2 interface:

Set system

Precisely:

Click on the PIN1 structure (with the blue icon) in the Document view to select it, then click on Set for the Receptor part in the NMR2 interface.
Click on the Ligand structure in the Document view to select it, then click on Set for the Ligand part in the NMR2 interface. Setting the ligand automatically renames its atoms and may add pseudo-atoms. The resulting atom and pseudo-atom names are the ones to use in the distance restraints (see below).
Click on the Binding site group in the Document view, then click on Set for the Binding site part in the NMR2 interface. This group contains four residues of the protein among which (or close to which) we expect the unassigned methyls to be found.

Note: the group node is a saved selection which has been created to make it easy to reproduce this tutorial. In practice, you would select residues directly from the Document view or using the Find command.

Distances

Distance restraints (i.e., lower and upper bounds on distances between atoms or groups) are the most important set of parameters in NMR2.

After deducing them from the NOESY peaks, you must enter them in the following format, one restraint per line:

SITE_1 SITE_2 = LOWER_BOUND UPPER_BOUND

where a site can be a proton (e.g., H1, H2, etc.), a pseudo-atom (e.g., Q, Q4, etc.), or a methyl group (e.g., M1, M2, etc.), and the lower and upper bounds are expressed in angstroms.

For this tutorial, copy the list of distance restraints below and paste it in the Distances box of the NMR2 interface:

Q4 M5 = 2.18 3.88
H8 M5 = 3.80 5.70
Q4 M1 = 3.73 6.64
H7 M3 = 4.42 6.62
H5 M2 = 2.36 3.54
Q M3 = 2.88 5.12
H7 M4 = 2.91 4.36
H7 M1 = 4.63 6.94
H5 M4 = 2.17 3.25
Q4 M4 = 2.81 4.99
H8 M4 = 3.92 5.88
H7 M5 = 3.56 5.34
Q4 M2 = 3.52 6.25
H8 M1 = 3.58 5.36

Notice how the restraints involve five different methyls in the protein.
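Before pasting a longer list of your own, it can help to check it offline. The following is a minimal Python sketch that parses the format described above, checks that each lower bound does not exceed its upper bound, and lists the methyl labels involved. The function name, the regular expression, and the rule that a methyl label is "M" followed by digits are assumptions made for illustration; NMR2 performs its own parsing.

```python
import re

RESTRAINTS = """\
Q4 M5 = 2.18 3.88
H8 M5 = 3.80 5.70
Q4 M1 = 3.73 6.64
H7 M3 = 4.42 6.62
H5 M2 = 2.36 3.54
Q M3 = 2.88 5.12
H7 M4 = 2.91 4.36
H7 M1 = 4.63 6.94
H5 M4 = 2.17 3.25
Q4 M4 = 2.81 4.99
H8 M4 = 3.92 5.88
H7 M5 = 3.56 5.34
Q4 M2 = 3.52 6.25
H8 M1 = 3.58 5.36
"""

# SITE_1 SITE_2 = LOWER_BOUND UPPER_BOUND
LINE = re.compile(r"^(\S+)\s+(\S+)\s*=\s*(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)$")


def parse_restraints(text):
    """Return a list of (site_1, site_2, lower, upper) tuples."""
    restraints = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"line {number}: cannot parse {line!r}")
        lower, upper = float(match.group(3)), float(match.group(4))
        if lower > upper:
            raise ValueError(f"line {number}: lower bound exceeds upper bound")
        restraints.append((match.group(1), match.group(2), lower, upper))
    return restraints


restraints = parse_restraints(RESTRAINTS)
# Assumption: a methyl label is the letter M followed by digits
methyls = sorted({site for r in restraints for site in r[:2]
                  if re.fullmatch(r"M\d+", site)})
print(len(restraints), "restraints involving methyls", methyls)
```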

Assigned methyls

The NMR2 algorithm will automatically determine which combination of methyl assignments best satisfies the restraints. When numerous methyls are involved, though, this can lead to a combinatorial explosion of possibilities, which may take a long time to solve.

When you have some information about specific methyls (for example, from other NMR experiments), you can enter it to speed up the search. For this tutorial, enter the following assignment in the Assigned methyls box to force methyl M5 to be pseudo-atom QE on residue 130:

M5 = 130 QE
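To get a feel for the size of the search space, here is a rough count in Python. It assumes that the only candidates are the methyl groups of the four binding-site residues (two per leucine, one per methionine) and that each observed methyl maps to a distinct candidate. Both are simplifying assumptions: the algorithm may also consider methyls close to these residues, so the real numbers can differ.

```python
from math import perm

# Assumption: candidate methyls come only from the binding-site residues.
# A leucine side chain carries two methyl groups, a methionine carries one.
METHYLS_PER_RESIDUE = {"LEU": 2, "MET": 1}
BINDING_SITE = [("LEU", 61), ("LEU", 122), ("MET", 130), ("LEU", 141)]

candidates = sum(METHYLS_PER_RESIDUE[name] for name, _ in BINDING_SITE)
observed = 5  # M1 to M5 in the distance restraints

# Ordered choices of distinct candidates for the observed methyls
unconstrained = perm(candidates, observed)

# Fixing M5 removes one observed methyl and one candidate from the search
with_m5_fixed = perm(candidates - 1, observed - 1)

print(candidates, unconstrained, with_m5_fixed)
```

Under these assumptions, a single fixed assignment divides the number of combinations by the number of candidates, which is why even one known methyl can shorten the search noticeably.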

Partial assignment

In some cases, you may not have enough information to assign a methyl exactly, but you can still make partial assignments. For example, you can specify in the Partial assignment box that two methyls are on the same residue using the same_res keyword:

same_res = M1 M2
same_res = M3 M4

or that a methyl belongs to certain residue types:

M1 = MET ILE

For this tutorial, just leave the Partial assignment box empty.
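To illustrate how a same_res constraint prunes the search, the Python sketch below enumerates the possible placements of two methyls among the candidate methyls of the binding-site residues, with and without the constraint. The candidate list is an assumption: QE on residue 130 appears in this tutorial, while QD1 and QD2 are the usual pseudo-atom names for the two leucine methyls and are assumed here. This is not how NMR2 is implemented, only a way to see the effect.

```python
from itertools import permutations

# Assumed candidates: (residue number, methyl pseudo-atom name)
CANDIDATES = [
    (61, "QD1"), (61, "QD2"),
    (122, "QD1"), (122, "QD2"),
    (130, "QE"),
    (141, "QD1"), (141, "QD2"),
]


def count_pairs(same_res: bool) -> int:
    """Count ordered placements of two methyls on distinct candidates."""
    count = 0
    for first, second in permutations(CANDIDATES, 2):
        if same_res and first[0] != second[0]:
            continue  # the two methyls must share a residue
        count += 1
    return count


all_pairs = count_pairs(same_res=False)
same_res_pairs = count_pairs(same_res=True)
print(all_pairs, same_res_pairs)
```

With these seven candidates, requiring two methyls to share a residue leaves only the three leucines as possible hosts, since the methionine has a single methyl.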

Other options

The NMR2 extension allows you to add several other options to control the search.

For example, the protein is considered rigid by default. If you want to allow a 20-degree tolerance in the side chain of residue 130 and a 10-degree tolerance in the side chain of residue 61, add the following line in the Options box:

SC = 130 20; 61 10

If you want to give a 20-degree tolerance to all side chains in the binding site, simply enter:

SC = site 20

If you want to allow for a tolerance in the backbone as well, enter similar lines with BB instead of SC.

For this tutorial, just leave the Options box empty.
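If you generate option lines from a script, a small Python sketch such as the one below can read them back for checking. The grammar it accepts (a key SC or BB, an equals sign, then "target tolerance" entries separated by semicolons) is inferred only from the two examples above; the function name parse_flexibility is hypothetical and NMR2 may accept other forms.

```python
def parse_flexibility(line: str):
    """Parse a line such as 'SC = 130 20; 61 10' or 'SC = site 20'.

    Returns (key, {target: tolerance in degrees}), where a target is a
    residue number given as a string, or the word 'site'.
    """
    key, separator, value = line.partition("=")
    key = key.strip()
    if not separator or key not in ("SC", "BB"):
        raise ValueError(f"expected 'SC = ...' or 'BB = ...', got {line!r}")
    tolerances = {}
    for entry in value.split(";"):
        fields = entry.split()
        if len(fields) != 2:
            raise ValueError(f"expected 'target tolerance', got {entry!r}")
        target, degrees = fields
        tolerances[target] = float(degrees)
    return key, tolerances


print(parse_flexibility("SC = 130 20; 61 10"))
print(parse_flexibility("SC = site 20"))
```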

Starting calculations

You can now click on the Predict structures button. NMR2 first regularizes the protein structure, then computes a series of structures with different methyl assignments to determine the combinations that best satisfy the distance restraints. For the structures and parameters in this tutorial, the whole process should take a few minutes. Since the method is highly parallelizable (methyl combinations can be tested in parallel), NMR2 detects how many cores are available and uses all but one, to keep your computer responsive.
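The "all but one core" rule is a common pattern and can be sketched in Python as follows. The function name worker_count is hypothetical; this shows the idea, not NMR2's actual implementation.

```python
import os


def worker_count(available=None) -> int:
    """Return the number of workers to use: all cores but one, at least one."""
    if available is None:
        # os.cpu_count() may return None when the count cannot be determined
        available = os.cpu_count() or 1
    return max(1, available - 1)


print(worker_count())   # depends on the machine
print(worker_count(8))  # one core is left free on an 8-core machine
```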

When the method completes, a summary of the obtained structures is displayed. For each structure, it lists the best methyl assignments, the value of the Target function (TF), which expresses how well the distance restraints are satisfied (the lower, the better), and the value of the van der Waals (VDW) function, which indicates potential steric clashes (the lower, the better):

Results output

The predicted structures are saved in the output subfolder of the job in the PDB format:

Resulting files

You can then import these structures using Home > File > Open... or simply by dropping the files into SAMSON.

For example, you can import the 10 predicted structures 1_pdb_7-5-2-6-1.pdb, 2_pdb_7-6-2-5-1.pdb, ... at the same time:

Results in the Viewport
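If you want to order or select the result files from a script before importing them, their names can be split with a short Python sketch. The interpretation used here is an assumption: the leading number is treated as the rank of the structure and the dash-separated numbers as the indices of the assigned methyls. The tutorial does not state this explicitly, and the function name parse_result_name is hypothetical.

```python
import re

# Pattern inferred from names such as 1_pdb_7-5-2-6-1.pdb
NAME = re.compile(r"^(\d+)_pdb_(\d+(?:-\d+)*)\.pdb$")


def parse_result_name(filename: str):
    """Return (leading number, list of dash-separated numbers)."""
    match = NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    rank = int(match.group(1))
    indices = [int(part) for part in match.group(2).split("-")]
    return rank, indices


names = ["2_pdb_7-6-2-5-1.pdb", "1_pdb_7-5-2-6-1.pdb"]
# Sort numerically on the leading number, so that 10 comes after 9
ordered = sorted(names, key=lambda name: parse_result_name(name)[0])
print(ordered)
```

Sorting on the parsed number rather than on the raw name avoids the usual pitfall where a file starting with 10 is listed before one starting with 2.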

Then use the Document view to remove the pseudo-atoms that were added by Cyana for the calculations. Precisely, type un in the Filter to find these pseudo-atoms (they have unknown type), press Enter to select them, then press Delete to erase them:

Unknown atoms in the Document

Then, select all structures and, in the context menu, align them using the alpha carbons:

Align alpha-carbons

This will align all structures and show the different predictions for the ligand positions:

Result ligands

You can then hide and show these structures using the Document view, and use other extensions for analysis, simulation, etc.

Congrats on completing this tutorial! Let us know if you have any questions.