Protein Path Finder

In this post, we will show how to use the Protein Path Finder app for finding possible paths between two conformations of a protein.

Requirements

Protein Path Finder app
GROMACS force field interaction model
FIRE state updater

Before starting the tutorial, please download the ProteinPathSearch sample archive, which contains the files needed for this tutorial. After extracting it, you will have a structural model (4akeA_1akeA_minimized.sam) and the topology files used for energy evaluation (4akeA.itp, 4akeA.top).

Tutorial

In this tutorial, we will apply the ART-RRT method to find conformational transition paths of a protein. ART-RRT combines the T-RRT (Transition-based Rapidly-exploring Random Tree) method with the ARAP (As-Rigid-As-Possible) method: T-RRT searches for possible pathways, and ARAP probes possible protein motions. In addition, a constrained minimization is applied to each newly generated state.

Loading the input model

Launch SAMSON. Load 4akeA_1akeA_minimized.sam. It will open a structural model of Adenylate Kinase. In the Document View (if you do not see it, enable it in the Interface menu, or press Ctrl/Cmd + 1), you can see the protein under the name 4AKEA_H.pdb and two conformations (start and goal). Our aim is to find a path between the conformations start and goal.

Visualizing the secondary structure

Let's visualize the secondary structure of the protein. In the Document View, right-click 4AKEA_H.pdb and choose Add > Visual Model. In the Add visual model window, select Secondary structure as shown in the following figure.

For better visibility, let's now hide the default ball-and-stick representation of the protein by unchecking the box next to 4AKEA_H.pdb in the Document View.

Launching the Protein Path Finder app

Open the Protein Path Finder app by clicking on its icon. You can also find it in the App > Biology menu.

In the app's window, you will see two tabs: the Settings tab is for setting up the search parameters and the Results tab is for collecting the results.

Setting up energy evaluation

First, in the Settings tab, we need to specify the interaction model (the force field) and the state updater (the simulation method) that will be used in the computations. For the Interaction model, select GROMACS force field from the drop-down list, and for the State updater, select FIRE from the drop-down list. If you do not see either GROMACS force field or FIRE, please check that you have all the SAMSON Elements listed in the Requirements section. Then, click the Apply button (see the figure below) to apply the interaction model and the state updater to all the atoms of the active document.

A window will pop up asking whether you want to apply a new model; click Yes. The GROMACS force field setup will then ask for the topology file of the protein model. Click Browse and choose the file 4akeA.top provided with the structural model file. Leave the other settings in the window unchanged (see the figure below), then click OK.

Two windows should appear: the GROMACS force field properties window showing the potential energy of the system and the FIRE Properties window showing the state updater parameters. Set the parameters for FIRE as in the picture below (the step size to 1 fs, the number of steps to 1).
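FIRE (Fast Inertial Relaxation Engine) is a damped-dynamics minimizer: it integrates the equations of motion, steers the velocity toward the force direction, and resets the velocity whenever the system starts moving uphill. The sketch below illustrates the general published algorithm on a toy two-dimensional energy function. It is not SAMSON's implementation and it ignores the constraints used by the app; the names `fire_minimize` and `toy_gradient` and the time-step values are assumptions chosen for illustration, while the remaining default parameters are the ones commonly quoted for FIRE.

```python
import math

def fire_minimize(grad, x0, dt=0.05, dt_max=0.3, max_steps=5000, f_tol=1e-6,
                  n_min=5, f_inc=1.1, f_dec=0.5, alpha_start=0.1, f_alpha=0.99):
    """Minimize a function given its gradient, using a basic FIRE scheme (unit mass)."""
    x = list(x0)
    v = [0.0] * len(x)
    alpha = alpha_start
    steps_since_reset = 0
    for _ in range(max_steps):
        f = [-g for g in grad(x)]                      # force = -gradient
        f_norm = math.sqrt(sum(fi * fi for fi in f))
        if f_norm < f_tol:                             # converged
            break
        power = sum(fi * vi for fi, vi in zip(f, v))   # P = F . v
        if power > 0.0:
            # Moving downhill: mix the velocity toward the force direction.
            v_norm = math.sqrt(sum(vi * vi for vi in v))
            v = [(1.0 - alpha) * vi + alpha * v_norm * fi / f_norm
                 for vi, fi in zip(v, f)]
            steps_since_reset += 1
            if steps_since_reset > n_min:
                dt = min(dt * f_inc, dt_max)           # speed up
                alpha *= f_alpha
        else:
            # Moving uphill (or at rest): stop, shrink the time step, reset mixing.
            v = [0.0] * len(x)
            dt *= f_dec
            alpha = alpha_start
            steps_since_reset = 0
        # Semi-implicit Euler step.
        v = [vi + dt * fi for vi, fi in zip(v, f)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

def toy_gradient(p):
    """Gradient of the toy energy E(x, y) = 0.5 * (x**2 + 10 * y**2)."""
    return [p[0], 10.0 * p[1]]

minimum = fire_minimize(toy_gradient, [1.0, 1.0])
print(minimum)  # should be close to the origin
```

In the app, the number of FIRE steps applied to each new state is controlled by the Minimization iterations parameter described later in this tutorial.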

Setting up the system

Let's now set up the system. Expand the Set up the system box.

Setting the start and goal conformations

Let's choose the start and goal conformations. Click Get conformations from the active document to obtain all the conformations present in the active document. Then, select the start conformation in the Start conformation list and the goal conformation in the Goal conformation list as shown below. These two conformations will be used to seed the search process.

Defining the active ARAP atoms

Now we need to specify which protein atoms the ARAP method will treat as active atoms, i.e. atoms that control the protein motion. The other atoms will be passive: their motion will follow the active atoms according to the ARAP method. Let's choose two alpha-carbon (CA) atoms from the backbone: the CA atom of residue GLY 12 and the CA atom of residue ARG 123. For simplicity, the document contains a group named CA GLY_12 and ARG_123 that refers to these atoms. In the Document View, double-click on this CA GLY_12 and ARG_123 group to select the nodes in the group.

Then, in the App, click the Add button to set the active ARAP atoms.

In the Advanced information box, you should see the number of added active ARAP atoms.

You can see which atoms were chosen as active ARAP atoms by clicking the Select and Unselect buttons. If you are not satisfied with the assignment of active ARAP atoms, you can reset your choices by clicking the Reset button.

During the setup of the system, a new visual model should appear in the document to show the sampling box and atom types (green for active ARAP atoms).

Defining the sampling box

Let's now define the sampling box for the active ARAP atoms. Expand the Set the sampling box for the active ARAP atoms box.

The sampling box defines the region in which positions of the chosen active ARAP atoms are sampled. The size and position of the box bias their motion and therefore the resulting pathways. The app starts with a sampling box that encloses all protein atoms. Let's set the sampling box to be a cube of 200 angstroms along each dimension (see picture below).

A green box visualizes the sampling box.
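To illustrate what sampling inside the box means, here is a minimal sketch that draws one random target position per active ARAP atom, uniformly inside an axis-aligned box. The function name `sample_active_atom_targets`, the box center at the origin, and the uniform distribution are assumptions made for illustration; the app's actual sampling scheme may differ.

```python
import random

def sample_active_atom_targets(n_active_atoms, center, size, rng):
    """Return one random (x, y, z) target per active atom, uniform in the box.

    center and size are (x, y, z) tuples in angstroms (hypothetical inputs).
    """
    half = [s / 2.0 for s in size]
    targets = []
    for _ in range(n_active_atoms):
        point = tuple(c + rng.uniform(-h, h) for c, h in zip(center, half))
        targets.append(point)
    return targets

# Two active atoms (the CA atoms of GLY 12 and ARG 123), 200 angstrom cube.
rng = random.Random(42)
targets = sample_active_atom_targets(2, (0.0, 0.0, 0.0), (200.0, 200.0, 200.0), rng)
for t in targets:
    print(t)
```

A larger box lets the active atoms be pulled toward more distant targets, while a smaller or shifted box restricts the directions the search explores.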

Defining the search parameters

Let's now define the search parameters. Expand the Set parameters box and set them as in the following figure:

Use seed: if this box is checked, the planner uses the specified seed, and the seed value is incremented by 1 after each run. If the box is unchecked, a random seed is used.
Runs = 2: the method is run 2 times, yielding at most 2 paths.
ARAP-modeling iterations = 20: the number of iterations of the ARAP method.
Minimization iterations = 20: each time a new state is generated, 20 steps of constrained minimization with FIRE are applied to it.
Initial temperature (T) = 0.001 K, Temperature factor = 2, Failures before increase of T = 1: the parameters of the T-RRT sampling algorithm.
RRT extension step size = 1 Å: the extension step size of the sampling algorithm.
Use alignment strategy: if this box is checked, each state accepted during the search is aligned with the start conformation. This strategy tends to make the search faster, but in rare cases it produces artifacts.
Max. elapsed time per run: each run is stopped as soon as its elapsed time reaches this value.
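The three temperature parameters can be understood through the transition test used in the basic T-RRT algorithm: downhill moves are always accepted, uphill moves are accepted with a Boltzmann-like probability, and the temperature is adapted according to successes and failures. The sketch below follows that general scheme; it is an assumption that the app behaves this way in detail. The class name `TransitionTest`, the use of the Boltzmann constant in kcal/(mol K), and the interpretation of "Failures before increase of T" as a simple failure counter are all hypothetical.

```python
import math
import random

BOLTZMANN_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol K)

class TransitionTest:
    """Basic T-RRT style transition test with adaptive temperature (sketch)."""

    def __init__(self, initial_temperature=0.001, temperature_factor=2.0,
                 failures_before_increase=1, rng=None):
        self.temperature = initial_temperature
        self.temperature_factor = temperature_factor
        self.failures_before_increase = failures_before_increase
        self.failures = 0
        self.rng = rng if rng is not None else random.Random()

    def accept(self, energy_from, energy_to):
        """Decide whether to accept a move from energy_from to energy_to."""
        if energy_to <= energy_from:
            return True  # downhill moves are always accepted
        delta = energy_to - energy_from
        probability = math.exp(-delta / (BOLTZMANN_KCAL_PER_MOL_K * self.temperature))
        if self.rng.random() < probability:
            # Uphill move accepted: cool down (with a floor to avoid underflow).
            self.temperature = max(self.temperature / self.temperature_factor, 1e-12)
            self.failures = 0
            return True
        # Uphill move rejected: heat up after enough consecutive failures.
        self.failures += 1
        if self.failures >= self.failures_before_increase:
            self.temperature *= self.temperature_factor
            self.failures = 0
        return False

test = TransitionTest(rng=random.Random(0))
print(test.accept(10.0, 5.0))     # downhill move
print(test.accept(0.0, 100.0))    # steep uphill move at a very low temperature
print(test.temperature)
```

With a very low initial temperature, uphill moves are almost always rejected at first; each rejection then raises the temperature until the search can cross the barriers it encounters.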

Running, pausing, or stopping the planner

Once you set up the system, the sampling box, and the search parameters, you can launch the search for transition pathways.

Click the Run button to start computing paths. The search process can be paused by clicking the Pause button and resumed with the Resume button (these buttons are located at the same place as the Run button while the planner is running). To stop the process, click the Stop button.

During the search, in the Advanced information box under the Planning information, you can observe the elapsed time for the current run (Current running time), the elapsed time for all of the runs (Total running time), the number of tree nodes in the current run (Nodes), the run number (Run), and the number of paths found (Paths found).

Results

As soon as a path is found, it is added to a list in the Results tab. For example, the following figure shows two paths found.

Each path contains the following fields:

id: path id.
# states: Number of conformations (or states) in the path.
MinE (kcal/mol): Minimum energy of the path conformations.
MaxE (kcal/mol): Maximum energy of the path conformations.
Saddle (kcal/mol): Difference between MaxE and MinE.
Barrier (kcal/mol): Difference between MaxE and First.
Time (s): Time elapsed for searching this path.
First (kcal/mol): Energy of the first state in the path.
Last (kcal/mol): Energy of the last state in the path.
Remarks: Comments on the path (editable).
Color: Color of the energy curve for this path.
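The energy fields above follow directly from the list of energies along a path. As a minimal sketch, the function below computes them from a plain list of energies in kcal/mol, using the definitions given in the list; the function name `path_summary` and the sample energies are made up for illustration.

```python
def path_summary(energies):
    """Compute the energy fields of the path table from a list of energies."""
    if not energies:
        raise ValueError("a path needs at least one state")
    first = energies[0]
    last = energies[-1]
    min_e = min(energies)
    max_e = max(energies)
    return {
        "# states": len(energies),
        "MinE": min_e,
        "MaxE": max_e,
        "Saddle": max_e - min_e,   # difference between MaxE and MinE
        "Barrier": max_e - first,  # difference between MaxE and First
        "First": first,
        "Last": last,
    }

# Hypothetical energies (kcal/mol) of four states along a path.
summary = path_summary([-120.0, -100.0, -90.0, -130.0])
for name, value in summary.items():
    print(name, value)
```

For these sample values, MaxE is -90.0 and MinE is -130.0, so Saddle is 40.0, while Barrier is 30.0 because the first state has an energy of -120.0.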

Viewing conformation energies along the path

To view the energy curve of the path, select a path by clicking on it in the path table. To plot several energy curves, select several paths (Ctrl/Cmd + left-click for multi-selection). After selecting a path, you can move the slider to see a particular conformation in the path as shown in the picture below. The corresponding conformation is shown in the viewport and its corresponding energy is shown in the GROMACS force field window.

Exporting the results: paths, conformations, path table content

You can copy the content of the path table to the clipboard by selecting the paths for which you want to export data and pressing Ctrl/Cmd + C or right-clicking and then selecting Copy table content. You can also copy path energy values along the selected paths via right-clicking and selecting Copy path energy.

To export paths from the path table into the document, select the paths you are interested in and press the Export paths button.

You can delete paths by selecting them in the path table and clicking the Delete paths button.

To export conformations along the path into the document, select the paths from the path table, choose the export interval and click the Export button.
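The tutorial does not specify exactly how the export interval is applied. One plausible reading, sketched below, is that every k-th state is exported starting from the first one. Including the final state even when it does not fall on the interval is an additional assumption, and the function name `states_to_export` is hypothetical.

```python
def states_to_export(n_states, interval):
    """Return the indices of the states exported for a given interval (sketch)."""
    if n_states < 1 or interval < 1:
        raise ValueError("n_states and interval must be positive")
    indices = list(range(0, n_states, interval))
    # Assumption: always keep the last state so the goal end of the path is exported.
    if indices[-1] != n_states - 1:
        indices.append(n_states - 1)
    return indices

print(states_to_export(10, 4))
```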

Next steps

Creating pathlines

Check the Pathlines tutorial to learn how to create pathlines, for example to visualize the movement of the center of mass of a molecule.

Improving paths with P-NEB

The resulting paths can be significantly improved with the help of the parallel Nudged Elastic Bands (NEB) method implemented in the P-NEB app. Please check out the P-NEB tutorial on exploring transition paths for more information.

Exporting atom trajectories along paths

The resulting paths can be saved in .sam or .samx format. If you want to export the trajectories of only some atoms along the path, you can use the Export Along Paths app. Please refer to the tutorial on how to use the Export Along Paths app for more details.

If you have any questions or feedback, please use the SAMSON forum.