Protein Path Finder
In this tutorial, we show how to use the Protein Path Finder app to find possible paths between two conformations of a protein. The app applies the ART-RRT method for finding conformational transition paths. ART-RRT combines the T-RRT method with the ARAP modeling method [1]: T-RRT is used to search for possible pathways, while ARAP modeling is used to probe possible protein motions. In addition, each newly generated state is relaxed by a constrained minimization.
Requirements#
- Protein Path Finder app
- FIRE state updater
Load the input model#
In SAMSON, go to Home > Download and insert https://www.samson-connect.net/documents/5eb7990e-fa44-4595-ab2c-99b6c68eada9 - this will load a document with this tutorial's sample from SAMSON Connect.
In the Document view (1), you should see a structural model of Adenylate Kinase, chain A, with two conformations corresponding to 4AKE and 1AKE structures: start and goal. Our aim is to find a path between the start and goal conformations.
- Interface menu > Document view, or Ctrl+1 (Cmd+1 on macOS)
Note
For your own model, you might need to first prepare it (remove alternate locations, add hydrogens, remove solvent) - for that you can use Home > Prepare.
Launch the Protein Path Finder app#
Open the Protein Path Finder app via Home > Apps > Biology or find it via Find everything.
In the app's window, you will see two tabs: the Settings tab is for setting up the search parameters and the Results tab is for collecting the results.
Set up energy evaluation#
Note
The system must already be minimized. You can minimize it using Edit > Minimize, which uses the Universal Force Field (UFF).
First, in the Settings tab, we need to specify an interaction model and a state updater which will be used in the computations. For the Interaction model select Universal Force Field (UFF) in the drop-down list, and for the State updater select FIRE. If you do not see the FIRE state updater, please check the requirements section and make sure you installed it from SAMSON Connect.
A window will pop up asking whether you want to apply a new model; click Yes.
The Universal Force Field (UFF) setup will then ask whether to use existing bonds - choose to use existing bonds and click OK.
Two windows should appear: the Universal Force Field properties window showing its parameters and UFF energies of the current state of the system and the FIRE Properties window showing the state updater parameters. Set the parameters for FIRE as in the picture below (the step size to 1 fs, the number of steps to 1).
Set up the system#
Let's now set up the system. Expand the Set up the system box.
Set the start and goal conformations#
Let's choose the start and goal conformations. Click Get conformations from the active document to obtain all the conformations present in the active document. Then, select the start conformation in the Start conformation list and the goal conformation in the Goal conformation list as shown below. These two conformations will be used to seed the search process.
Define the active ARAP atoms#
Now we need to specify which protein atoms will be considered by the ARAP method as active atoms, i.e. atoms that control the protein motion. The other atoms will be passive: their motion will follow the active atoms according to the ARAP method. Let's choose two alpha-carbon (CA) atoms from the backbone: one in residue GLY 12 and one in residue ARG 123. For simplicity, the document has a group named CA in GLY 12 and CA in ARG 123 that refers to these atoms. In the Document view, double-click on this group to select the nodes it contains.
Tip
We selected these atoms using the following Node Specification Language expression: ("CA" in "GLY 12") or ("CA" in "ARG 123").
See also User guide - Selecting.
Then, in the App, click the Add button to set the active ARAP atoms.
In the Advanced information box, you should see the number of added active ARAP atoms.
You can see which atoms were chosen as active ARAP atoms by clicking the Select buttons. If you are not satisfied with the active ARAP atoms assignment, you can reset your choices by clicking the Reset button.
During the setup of the system, a new visual model should appear in the document to show the sampling box and atom types (green for active ARAP atoms).
Define the sampling box#
Let's now define the sampling box for the active ARAP atoms. Expand the Set the sampling box for the active ARAP atoms box.
The sampling box defines the sampling region for the chosen active ARAP atoms. The size and position of the box bias their motion and therefore the resulting pathways. The app starts with a sampling box that encloses all protein atoms in both the start and goal conformations. Let's set the sampling box to be a cube of 200 angstroms along each dimension (see picture below).
A green box visualizes the sampling box.
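To make the idea of a sampling box concrete, here is a minimal Python sketch (not the app's implementation). It computes an axis-aligned box enclosing two toy conformations, builds a 200 Å cube, and draws a random target position inside it. The coordinates, the function names, and the choice to center the cube on the enclosing box are all assumptions made for illustration.

```python
import random

def enclosing_box(*conformations):
    """Return (lower, upper) corners of the axis-aligned box enclosing all atoms."""
    points = [p for conformation in conformations for p in conformation]
    lower = tuple(min(p[i] for p in points) for i in range(3))
    upper = tuple(max(p[i] for p in points) for i in range(3))
    return lower, upper

def cube_around(center, side):
    """Return (lower, upper) corners of a cube with the given side length."""
    half = side / 2.0
    return (tuple(c - half for c in center), tuple(c + half for c in center))

def sample_in_box(lower, upper, rng):
    """Draw a position uniformly at random inside the box."""
    return tuple(rng.uniform(lo, hi) for lo, hi in zip(lower, upper))

# Hypothetical toy coordinates in angstroms (not taken from 4AKE/1AKE).
start = [(0.0, 0.0, 0.0), (10.0, 5.0, 2.0)]
goal = [(-3.0, 1.0, 4.0), (12.0, 8.0, -1.0)]

lower, upper = enclosing_box(start, goal)
center = tuple((lo + hi) / 2.0 for lo, hi in zip(lower, upper))

# Assumption: the 200 angstrom cube is centered on the enclosing box.
cube_lower, cube_upper = cube_around(center, 200.0)

rng = random.Random(42)
target = sample_in_box(cube_lower, cube_upper, rng)
print(lower, upper)
print(target)
```

A larger box lets the active atoms be pulled toward more distant targets, which is one way the box size can bias the paths that are found.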
Define the search parameters#
Let's now define the search parameters. Expand the Set parameters box and set them as in the following figure:
- Use seed: if this box is checked, the planner uses the specified seed number, which is incremented by 1 after each run. If the box is unchecked, a random seed is used. Specify a seed if you want to be able to reproduce the results later.
- Runs = 2: we run the method 2 times to extract a maximum of 2 paths.
- ARAP-modeling iterations = 20: the number of iterations for the ARAP method.
- Minimization iterations = 20: we apply 20 steps of constrained minimization with FIRE each time a new state is generated in order to minimize it.
- Initial temperature (T) = 0.001 K, Temperature factor = 2, Failures before increase of T = 1: the parameters for the T-RRT sampling algorithm.
- RRT extension step size = 1 Å: the extension step size for the sampling algorithm.
- Use alignment strategy: if this box is checked, each accepted state of the protein during the search will be aligned with the start conformation. This strategy tends to make the search faster, but in rare cases it produces some artifacts.
- Max. elapsed time per run: each run is stopped as soon as the elapsed time reaches this value.
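The three T-RRT parameters above can be read as the inputs of a temperature-controlled acceptance test. The following Python sketch shows a generic, textbook-style version of such a test; it is an illustration under stated assumptions, not the code used by the app. The class name, the exact update rules, and the use of the Boltzmann constant in kcal/(mol·K) are assumptions.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

class TransitionTest:
    """Generic T-RRT-style acceptance test (hypothetical helper, for illustration)."""

    def __init__(self, temperature=0.001, factor=2.0, max_failures=1, rng=None):
        self.temperature = temperature    # "Initial temperature (T)"
        self.factor = factor              # "Temperature factor"
        self.max_failures = max_failures  # "Failures before increase of T"
        self.failures = 0
        self.rng = rng or random.Random()

    def accept(self, e_from, e_to):
        """Decide whether to accept a move from energy e_from to energy e_to."""
        if e_to <= e_from:
            return True  # downhill moves are always accepted
        # Uphill moves are accepted with a Boltzmann-like probability.
        p = math.exp(-(e_to - e_from) / (K_B * self.temperature))
        if self.rng.random() < p:
            self.temperature /= self.factor  # cool down after an uphill success
            self.failures = 0
            return True
        self.failures += 1
        if self.failures >= self.max_failures:
            self.temperature *= self.factor  # heat up after too many rejections
            self.failures = 0
        return False

transition = TransitionTest(rng=random.Random(0))
print(transition.accept(-100.0, -101.0))  # downhill move
print(transition.accept(-100.0, -99.0))   # uphill move at a very low temperature
print(transition.temperature)
```

In this sketch, a very low initial temperature makes the search reject almost every uphill move at first, and each rejection raises the temperature by the given factor until uphill moves become possible.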
Run the planner#
Once you set up the system, the sampling box, and the search parameters, you can launch the search for transition pathways.
Click the Run button to start computing paths. The search process can be paused by clicking the Pause button and resumed with the Resume button (these buttons are located at the same place as the Run button while the planner is running). To stop the process, click the Stop button.
During the search, in the Advanced information box under the Planning information, you can observe the elapsed time for the current run (Current running time), the elapsed time for all of the runs (Total running time), the number of tree nodes in the current run (Nodes), the run number (Run), and the number of paths found (Paths found).
Results#
As soon as a path is found, it is added to a list in the Results tab. For example, the following figure shows two paths found.
Each path contains the following fields:
- id: the path id.
- # states: the number of conformations in the path.
- MinE (kcal/mol): the minimum energy of the path conformations.
- MaxE (kcal/mol): the maximum energy of the path conformations.
- Saddle (kcal/mol): the difference between MaxE and MinE.
- Barrier (kcal/mol): the difference between MaxE and First.
- Time (s): the time spent searching for this path.
- First (kcal/mol): the energy of the first conformation in the path.
- Last (kcal/mol): the energy of the last conformation in the path.
- Remarks: comments on the path (editable).
- Color: the color of the energy curve for this path.
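The energy fields above follow directly from the list of conformation energies along a path. As a minimal sketch, assuming you have those energies as a plain Python list (for example pasted from Copy path energy), the same quantities can be recomputed as follows. The function name and the sample values are hypothetical.

```python
def summarize_path(energies):
    """Compute the energy fields of the path table from energies in kcal/mol."""
    if not energies:
        raise ValueError("a path needs at least one conformation")
    min_e = min(energies)
    max_e = max(energies)
    first = energies[0]
    last = energies[-1]
    return {
        "# states": len(energies),
        "MinE": min_e,
        "MaxE": max_e,
        "Saddle": max_e - min_e,   # difference between MaxE and MinE
        "Barrier": max_e - first,  # difference between MaxE and First
        "First": first,
        "Last": last,
    }

# Hypothetical energies along a path, in kcal/mol.
energies = [-120.0, -95.5, -80.0, -110.0, -130.0]
summary = summarize_path(energies)
for name, value in summary.items():
    print(name, value)
```

Note that Saddle and Barrier coincide only when the first conformation is also the lowest-energy one.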
View conformation energies along the path#
To view the energy curve of the path, select a path by clicking on it in the path table. To plot several energy curves, select several paths (Ctrl/Cmd + left-click for multi-selection). After selecting a path, you can move the slider to see a particular conformation in the path as shown in the picture below. The corresponding conformation is shown in the viewport and its corresponding energy is shown in the Universal Force Field window.
Export the results: paths, conformations, path table content#
You can copy the content of the path table to the clipboard by selecting the paths for which you want to export data and pressing Ctrl/Cmd + C or right-clicking and then selecting Copy table content. You can also copy path energy values along the selected paths via right-clicking and selecting Copy path energy.
To export paths from the path table into the document as trajectories, select the paths you are interested in and press the Export paths button.
To export conformations along the path into the document, select the paths from the path table, choose the export interval and click the Export button.
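As a rough illustration of what an export interval means, the sketch below lists which state indices would be kept when taking every n-th conformation of a path. This is an assumption about the semantics: the app may, for instance, always include the last conformation, which this sketch does not do. The function name is hypothetical.

```python
def states_to_export(n_states, interval):
    """Return the indices of the states kept when exporting every `interval`-th one."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return list(range(0, n_states, interval))

# A path with 10 states exported with an interval of 3.
kept = states_to_export(10, 3)
print(kept)
```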
Tip
You can double-click on a path in the document to start/stop it.
To access the path controllers, select the path and open the Inspector (1).
If you right-click on a path, you can also access some of its options via Path > ....
- Interface > Inspector, or Ctrl+2 (Cmd+2 on macOS)
Next steps#
Create pathlines#
Check the Pathlines tutorial to learn how to create pathlines that visualize the movement of the center of mass of a molecule.
Improve paths with P-NEB#
The resulting paths can be significantly improved with the help of the parallel Nudged Elastic Band (NEB) method implemented in the P-NEB app. Please check out the P-NEB tutorial on exploring transition paths for more information.
Export atom trajectories along paths#
The resulting paths can be saved in .sam or .samx format. If you want to export only trajectories of some atoms along the path, you can use the Export Along Paths app. Please refer to the tutorial on how to use the Export Along Paths app for more details.
If you have any questions or feedback, please use the SAMSON Connect Forum.