This page lists several commands and mini-tutorials useful for bioinformatics purposes in protein simulations analyses using VMD (v. 1.9.3 but also older versions). It assumes Ubuntu is installed. Nonetheless, they can easily be changed for Windows (e.g. putting NAMD in PATH). Prior to running these commands, CHARMM-GUI for system preparation and NAMD 3.0 alpha was used for running the simulations, therefore the codes should be adapted for other production outputs. NAMD 2.x is also required for energy interaction calculations.
Needed files
IMPORTANT: Two tcl files are needed, you should copy the codes and save them as rmsf.tcl, and align_button.tcl, respectively.
rmsf.tcl – a script for calculating RMSF over an atom selection. I got this over the internet, but could not track its original source
proc rmsf_selection {selection} {
set sel_molecule $selection
set file_name "Calfa-rmsf.dat"
set outfile [open $file_name w];
set sel [atomselect top $sel_molecule]
set mol [$sel molindex]
for {set i 0} {$i < [$sel num]} {incr i} {
set rmsf [measure rmsf $sel]
$sel set beta $rmsf
puts $outfile "[expr {$i+1}] [lindex $rmsf $i]"
}
close $outfile
}
align_button.tcl – a script replicating the function of the ALIGN button from the RMSD Visualizer Tool (with regards to Bassam Haddad)
proc align {rmolid smol} {
set ref_molid $rmolid
set sel_mol $smol
set numframes [molinfo $ref_molid get numframes]
set ref_frame [atomselect $ref_molid "$sel_mol" frame 0]
set n 1
set sys [atomselect $ref_molid all]
for {set i 0} {$i < $numframes} {incr i} {
animate goto $i
set align_frame [atomselect $ref_molid "$sel_mol"]
set trans_matrix [measure fit $align_frame $ref_frame]
$sys move $trans_matrix
if {($n % $numframes) == 0 } {
puts "alignment $n of $numframes"
}
incr n
}
puts "Alignments complete, ready for RMSD calculations"
}
Aligning trajectories based on an atom selection and calculate RMSF of alpha carbons
Suppose you have a transmembrane protein, and you simulate it using VMD. It has an ectodomain, which is water-soluble, and one or more transmembrane helices anchoring it to the membrane. You want to test whether some changes in the protein (e.g. a mutation) will alter the protein behavior, by inspecting changes in the root mean square fluctuation (RMSF) for each of the atoms.
The first step would be to align the protein, which you would normally do before calculating any RMSD or RMSF. However, in this case, you only want to align a specific set of atoms, namely, the ones in the transmembrane domain, provided this domain is, theoretically, fixed. The ALIGN button in the RMSD Visualizer Tool and a proper atom selection would do this, but, to use it in VMD console, I provide (see above) an align_button.tcl file, which does the same procedures.
Next, we need to perform the actual RMSF calculation, using the rmsf.tcl (see above) file. This file also allows it to be used on an atom selection, and we choose to use only alpha carbons. We can run the script over the entire protein, expecting almost no fluctuation in the previously fixed atoms (through alignment).
Overall, we can start VMD from where the production files are located and input in the tcl console:
#open trajectory
mol new {step5_charmm2namd.psf} type {psf} first 0 last -1 step 1 waitfor 1
mol addfile {step7_production.dcd} type {dcd} first 0 last -1 step 1 waitfor -1 0
#use align button command to align the transmembrane domain
source {/path/to/align_button.tcl}
set alignment "protein and name CA and resid 560 to 590" #supposes TM 560-590
set id 0
align $id $alignment
#use rmsf.tcl to calculate RMSF
source {/path/to/rmsf.tcl}
set alpha_carbons "protein and name CA" #selects alpha carbons, can be changed
rmsf_selection $alpha_carbons
#quick selection to reveal the TM protein in VMD display
mol modselect 0 0 protein
mol modstyle 0 0 NewCartoon 0.300000 10.000000 4.100000 0
mol modcolor 0 0 Beta
menu graphics off
#change the name of the output as desired
set oldName "Calfa-rmsf.dat"
set newName "Calfa-rmsf_run1.dat"
file rename $oldName $newName
And that is all to be done to obtain a file with RMSF values per residue. Since you usually have to run more than one replica, you may obtain more RMSF files, each one named Calfa-rmsf_runx.dat, with x = 1,2… I do this manually, but the above script may be changed for running some loops. Suppose you have six replica simulations, with their corresponding six RMSF files, and you want to combine them. You can use Python for that:
import sys
import os
import pandas as pd
import numpy as np
valori1=pd.read_csv("Calfa-rmsf_run1.dat", delimiter=' ', index_col=0, header=None)
valori1=valori1.rename(columns={ valori1.columns[0]: "rmsf1" })
valori2=pd.read_csv("Calfa-rmsf_run2.dat", delimiter=' ', index_col=0, header=None)
valori2=valori2.rename(columns={ valori2.columns[0]: "rmsf2" })
valori3=pd.read_csv("Calfa-rmsf_run3.dat", delimiter=' ', index_col=0, header=None)
valori3=valori3.rename(columns={ valori3.columns[0]: "rmsf3" })
valori4=pd.read_csv("Calfa-rmsf_run4.dat", delimiter=' ', index_col=0, header=None)
valori4=valori4.rename(columns={ valori4.columns[0]: "rmsf4" })
valori5=pd.read_csv("Calfa-rmsf_run5.dat", delimiter=' ', index_col=0, header=None)
valori5=valori5.rename(columns={ valori5.columns[0]: "rmsf5" })
valori6=pd.read_csv("Calfa-rmsf_run6.dat", delimiter=' ', index_col=0, header=None)
valori6=valori6.rename(columns={ valori6.columns[0]: "rmsf6" })
valori = pd.concat([valori1, valori2, valori3, valori4, valori5, valori6], join = 'outer', axis = 1)
valori.to_csv('val.csv', index=False)
The resulting file can be used for further inspection or to build a line and shaded chart; the file is ready for input in a python code, as presented here, bearing in mind to change the xticks and ylim accordingly.
Interaction energy profiler
VMD has a plugin named NAMDEnergy, useful when the interaction energies between sets of atoms must be computed. The plugin has an interface, which may be tedious to use when multiple interaction energy profiles have to be computed, thus the plugin should be called from the tcl console. A no-CUDA NAMD 2.x version has to be called from the scripts, for the calculation. Please download them from NAMD official website, after registration. Here, I use the NAMD_2.12_Linux-x86_64-multicore package, which, after untar, was added to PATH and renamed (so that it does not interfere with other NAMD installations), all done in Ubuntu bash:
#I put the tar in the /opt folder
sudo tar xf NAMD_2.12_Linux-x86_64-multicore.tar.gz
cd NAMD_2.12_Linux-x86_64-multicore
sudo mv namd2 namd2_multicore
sudo echo "export PATH=$(pwd):\${PATH}" >> ~/.bashrc
#now NAMD2.12 can be called from bash and tcl using the command "namd2_multicore"
The following tcl code is run from the tcl console of VMD, which is run under the directory where the production dcd file(s) are located. It calculates the vdW, electrostatic, and total interaction energies between a ligand (UNK residue name) and three amino acids (92, 93 and 118). The script is designed for use with CHARMM-GUI outputs:
mol new step5_charmm2namd.psf
mol addfile step7_production.dcd waitfor -1
set ligand [atomselect top "resname UNK"]
package require namdenergy
set thr92 [atomselect top "protein and resid 92"]
namdenergy -vdw -elec -sel $thr92 $ligand -plot -ofile "athr92.csv" -extsys step7_production.restart.xsc -exe "namd2_multicore" -par toppar/par_all36_carb.prm -par toppar/par_all36_cgenff.prm -par toppar/par_all36_lipid.prm -par toppar/par_all36_na.prm -par toppar/par_all36m_prot.prm -par toppar/toppar_water_ions.str -par ../unk/unk.prm
set thr93 [atomselect top "protein and resid 93"]
namdenergy -vdw -elec -sel $thr93 $ligand -plot -ofile "bthr93.csv" -extsys step7_production.restart.xsc -exe "namd2_multicore" -par toppar/par_all36_carb.prm -par toppar/par_all36_cgenff.prm -par toppar/par_all36_lipid.prm -par toppar/par_all36_na.prm -par toppar/par_all36m_prot.prm -par toppar/toppar_water_ions.str -par ../unk/unk.prm
set ile118 [atomselect top "protein and resid 118"]
namdenergy -vdw -elec -sel $ile118 $ligand -plot -ofile "cile118.csv" -extsys step7_production.restart.xsc -exe "namd2_multicore" -par toppar/par_all36_carb.prm -par toppar/par_all36_cgenff.prm -par toppar/par_all36_lipid.prm -par toppar/par_all36_na.prm -par toppar/par_all36m_prot.prm -par toppar/toppar_water_ions.str -par ../unk/unk.prm
The resulting independent csv files are unformatted and can be then processed and merged with a python code similar to the following
import sys
import os
import glob
import pandas as pd
import csv
valori=pd.read_csv("athr92.csv", skipinitialspace=True, delimiter=' ')
valori['Residue']='aThr92'
valori.to_csv('athr92_redone.csv', sep=',')
valori=pd.read_csv("bthr93.csv", skipinitialspace=True, delimiter=' ')
valori['Residue']='bThr93'
valori.to_csv('bthr93_redone.csv', sep=',')
valori=pd.read_csv("cile118.csv", skipinitialspace=True, delimiter=' ')
valori['Residue']='cile118'
valori.to_csv('cile118_redone.csv', sep=',')
interesting_files = glob.glob("*_redone.csv")
df = pd.concat((pd.read_csv(f, header = 0) for f in interesting_files))
df.to_csv("bigfile.csv")
for f in glob.glob("*_redone.csv"):
os.remove(f)
The bigfile.csv should contain all the data, in a concatenated fashion, and may now be used to build an interaction energy profile chart.