
Accelrys Materials Studio (MS) is a commercial program, so I think I should be extra-careful writing about it. Let's start with the usual disclaimer: I am not connected in any way with Accelrys and what I write is what I think about Materials Studio (MS) as a sporadic user, and do not reflect in any way the opinion of my laboratory and of the people working there. Being a sporadic user, I honestly cannot tell too much about the use MS, so I will not. Let me just say that I consider MS is a really nice piece of software for certain things, for example the 3D editor is amazing, and a pain for others. Plus, is Windows only. Personally I see MS a product tuned more for the industry than for academic research, where flexibility and portability should be taken in much higher consideration (again, my personal opinion).
There is one aspect in the last versions of MS that is worthy to explore and that can improve tremendously the flexibility of MS: perl scripting. Now it is possible to program in perl within MS, which make possible to use and combine almost all the functions you can access via the graphical interface. Furthermore, perl scripting make finally easy to make MS interact with the "external" world, meaning you can export your results as you prefer and also add new capabilities to MS. I have recently give a small introductory seminar to perl scripting in MS to my colleagues, and I have thought that it could be useful to collect in Chembytes those scripts and those I will make for the next meetings. All the script you will find here are made by me or published with the consent of the author and are free to be used and modified. If for any reason this page is illegal (I cannot see why, but I may be wrong), please contact me and I will procede to remove this page and/or all the illegal contents. Also, please note that the script are made to illustrate some procedures, and are far to be complete, efficient or bug free. Take them as a starting point and backup your data before execute the scripts. I am not responsible for any loss of data due to the use of these scripts. Use them at your own risk!
Let's start: what do you need?
Well, the only thing you need to have installed is MS. That assuming you are not interested in Accelrys pipeline pilot.
I strongly recommend you to get a perl installation also outside MS: perl is free and having a perl installation will let you easily create small script to get familiar with the syntax of perl. In Windows you can install ActivePerl from ActiveState or strawberry perl, as suggested in the Perl's official website.
There is another option though: install the standard distribution of perl (and its modules, if required) via cygwin. This will allows to work in a bash terminal under Windows, that is so much better than the standard Windows consolle.
What you should do writing your code
Comment the scripts, use appropriate and meaningful names for your variables and indent the code. These three requirements will make your code much more easy to understand and eventually modify for anyone (yes, also you).
Materials Studio perl environment
To write a new script, just select script from the File->New menu. You can also open an existing script, with File->Open.
By default, when you create a new perl script, the following lines are present:
#!perl
use strict;
use MaterialsScript qw(:all);
The use of strict is optional, but I would advise you to keep using it, while the MaterialsScript module is required to access to the MS functionalities.
The scripts
Numbering the atoms
One of the major problems I have encounter in using MS, is find a way to display a label showing the number in the structure for the atoms. Which is the first atom in the structure? and the last? and if I want to calculate a distance between the atoms 10 and 23, which ones I should select? Sure, you can display the cartesian coordinates and then edit the xtd file and try to figure out which atom is which. This script will modify the atom names as ElementSymbol-N, where N is the position of the atom in the structure. The modified document is not saved by the script, so you can save or save as, if you wish to maintain the original file.
Download
Download the script here
Export a MS xtd trajectory as XYZ trajectory.
This script export a xtd trajectory from MS Forcite module as XYZ trajectory. The script creates two text files in the current directory: xtd2mol.txt, which contains a short log and trj.txt, containing the XYZ trajectory. The extension txt is due to the fact that when you create a document, the format of the document must be recognized by MS, and xyz/xmol are not known formats. Therefore, you will have to deal with .txt extension for your files.
Backup copies of the files, are managed by MS in the usual way, appending (N) to the name of the flle, where N is a progressive number.
Download
Download the script here
Disclaimer
use the software and the scripts in this wiki at your own risk. The software is freeware and it come without any (as in nothing, nada, niente, rien) WARRANTY. Therefore I cannot be responsible for loss of data, time, bad results or anything else deriving by the use of my software or this wiki.
hi
Hey very cool blog!! Guy.. Beautiful.. Wonderful.. I will bookmark your website and take the feeds additionallyKI am satisfied to search out numerous useful information right here in the publish, we want develop more techniques on this regard, thank you for sharing aeaccffebkekckgf
Sir:
I have a problem when use the Script, it will prompt "Cannot operate upon infinite SymmetrySystem (function/property "Atoms" at -e line 42". I don't find the answer.
Substitute line 42 with: my $atoms = $doc->DisplayRange->Atoms;
I have a file in perl scirpt .I want to know how to define box size in the script.
need your help please,…
Use of uninitialized value $box_x in multiplication (*) at inflategro.pl line 259.
Use of uninitialized value $box_y in multiplication (*) at inflategro.pl line 260.
#!/usr/bin/perl -w
sub round {
my( $num, $prec )= @_;
return int( $num/$prec + 0.5 - ($num<0) ) * $prec;
}
sub abstand {
my( $a1,$a2,$a3,$b1,$b2,$b3)= @_;
return ( (($a1-$b1)2 + ($a2-$b2)2 + ($a3-$b3)2)0.5);
}
#Check input
if (@ARGV<7) {die "\n\n
###########################################################################
INFLATEGRO
Written by Christian Kandt, (c) 2005-2007
Kandt C, Ash WL, Tieleman DP (2007): Setting up and running molecular
dyanmics simulations of membrane proteins. Methods 41:475-488
###########################################################################
INFLATEGRO reads the coordinates of a bilayer and inflates them
in XY directions using a common scaling factor. To identify
the lipids their actual residue name must be given.
Water coordinates (SOL) will be ingored, everything else will
be centered in the XY plane of the new simulation box.
A distance cutoff in A can be defined: Only lipids with a
P - CA distance exceeding that cutoff will be written. It is currently
assumed that you're actually dealing with phospholipids. However, this can
be easily extended. Just have a look at the code.
Area per lipid is estimated by caculating the area per protein first.
This is done using a grid-based approach. A grid size of 5 A was found to
give good results. Output is written as a 3-collumned ASCII file holding
3 area per lipid values: total, upper leaflet & lower leaflet.
If the additional flag >protein< is set only the protein
coordinates will be written.
USAGE:
INFLATEGRO bilayer.gro scaling_factor lipid_residue_name cutoff inflated_bilayer.gro gridsize areaperlipid.dat (protein)
…good luck!
;) Christian Kandt 05/2007
\n\n";}
if (!(open (INPUT, $ARGV[0]))) {print "Eeeeek! No $ARGV[0] at all!\n\n";
die;}
#$input=$ARGV[0];
$scale=$ARGV[1];
$name=$ARGV[2];
$cutoff=$ARGV[3]*0.1;
$output=$ARGV[4];
$gridsize=$ARGV[5];
$area=$ARGV[6];
$switch=0;
if (@ARGV==8 and $ARGV[7] eq "protein") {
print "Well, just the protein then….\n";
$switch=1;
}
$zaehler=1;
$counter=1;
$now=0;
$protein_xmax=-1000;
$protein_ymax=-1000;
$protein_xmin=1000;
$protein_ymin=1000;
print STDOUT "Reading….. \n";
while (<INPUT>) {
if (/^ +([0-9.-]+) +([0-9.-]+) +([0-9.-]+)/) {
$box_x=$1;
$box_y=$2;
$box_z=$3;
}
unless ($now==0) {
if (/^ +(\d+)(\S+) +(\S+) +([0-9.-]+) +([0-9.-]+) +([0-9.-]+)/ ) {
if ($2 eq $name) {
$resnum_l[$zaehler]=$1;
$resname_l[$zaehler]=$2;
$atmnamenum=$3;
$x_l[$zaehler]=$4;
$y_l[$zaehler]=$5;
$z_l[$zaehler]=$6;
$howlong=length($atmnamenum) - 1;
$atmname_l[$zaehler]=substr($atmnamenum,0,$howlong-4);
$atmnum_l[$zaehler]=substr($atmnamenum,$howlong-4);
#print "$resnum_l[$zaehler]: $atmnamenum $atmname_l[$zaehler] $atmnum_l[$zaehler]\n";
$zaehler++;
}
if (!($2 eq $name) && !($2 eq "SOL")) {
$resnum_p[$counter]=$1;
$resname_p[$counter]=$2;
$atmnamenum=$3;
$x_p[$counter]=$4;
$y_p[$counter]=$5;
$z_p[$counter]=$6;
$howlong=length($atmnamenum) - 1;
$atmname_p[$counter]=substr($atmnamenum,0,$howlong-4);
$atmnum_p[$counter]=substr($atmnamenum,$howlong-4);
if ($x_p[$counter]>$protein_xmax) {$protein_xmax=$x_p[$counter];}
if ($x_p[$counter]<$protein_xmin) {$protein_xmin=$x_p[$counter];}
if ($y_p[$counter]>$protein_ymax) {$protein_ymax=$y_p[$counter];}
if ($y_p[$counter]<$protein_ymin) {$protein_ymin=$y_p[$counter];}
#print "$resnum_p[$counter]: $atmnamenum $atmname_p[$counter] $atmnum_p[$counter]\n";
$counter++;
}
}
}
if (/^ +(\d+)(\S+) +(\S+) +(\d+) +([0-9.-]+) +([0-9.-]+) +([0-9.-]+)/ ) {
if ($2 eq $name) {
$resnum_l[$zaehler]=$1;
$resname_l[$zaehler]=$2;
$atmname_l[$zaehler]=$3;
$atmnum_l[$zaehler]=$4;
$x_l[$zaehler]=$5;
$y_l[$zaehler]=$6;
$z_l[$zaehler]=$7;
if ($atmnum_l[$zaehler]==9999) {$now=1;}
#print "$resnum_l[$zaehler] $now\n";
$zaehler++;
}
if (!($2 eq $name) && !($2 eq "SOL")) {
$resnum_p[$counter]=$1;
$resname_p[$counter]=$2;
$atmname_p[$counter]=$3;
$atmnum_p[$counter]=$4;
$x_p[$counter]=$5;
$y_p[$counter]=$6;
$z_p[$counter]=$7;
if ($x_p[$counter]>$protein_xmax) {$protein_xmax=$x_p[$counter];}
if ($x_p[$counter]<$protein_xmin) {$protein_xmin=$x_p[$counter];}
if ($y_p[$counter]>$protein_ymax) {$protein_ymax=$y_p[$counter];}
if ($y_p[$counter]<$protein_ymin) {$protein_ymin=$y_p[$counter];}
if ($atmnum_p[$counter]==9999) {$now=1;}
#print "$resnum_p[$counter] \n";
$counter++;
}
}
}
close (INPUT);
$zaehler;
$counter;
$totalatmn=$zaehler+$counter;
$protein_xmin=$protein_xmin*10;
$protein_xmax=$protein_xmax*10;
$protein_ymin=$protein_ymin*10;
$protein_ymax=$protein_ymax*10;
$box_x=$box_x*$scale;
$box_y=$box_y*$scale;
print STDOUT "Scaling lipids….\n";
$pcount=1;
for ($k=1; $k<=$zaehler; $k++) {
if (substr($atmname_l[$k],0,1) eq "P") {
$pxneu=$x_l[$k]*$scale;
$pyneu=$y_l[$k]*$scale;
$res=$resnum_l[$k];
$translatex_l[$res]=$pxneu-$x_l[$k];
$translatey_l[$res]=$pyneu-$y_l[$k];
$phosz[$pcount]=$z_l[$k];
$pcount++;
}
}
$pcount—;
print "There are $pcount lipids…\n";
$atomperlipid= $zaehler/$pcount;
print "with $atomperlipid atoms per lipid..\n";
print "\nDetermining upper and lower leaflet…\n";
$middle=0;
for ($p=1;$p<=$pcount; $p++) {
$middle=$middle + $phosz[$p];
}
$middle=$middle / $pcount;
$uppercount=0;
$lowercount=0;
$upper=0;
$lower=0;
for ($p=1;$p<=$pcount; $p++) {
if ($phosz[$p]>$middle) {$upper=$upper+$phosz[$p];
$uppercount++;}
if ($phosz[$p]<$middle) {$lower=$lower+$phosz[$p];
$lowercount++;}
}
$upper=$upper/$uppercount;
$lower=$lower/$lowercount;
print "$uppercount lipids in the upper…\n";
print "$lowercount lipids in the lower leaflet \n\n";
#Determining protein XY-COM & calculating translation vector
if ($counter==0) {print "No protein coordinates found…\n";}
if ($counter>0) {
print STDOUT "Centering protein….\n";
$xpsum=0;
$ypsum=0;
for ($k=1; $k<=$counter; $k++) {
$xpsum=$xpsum+$x_p[$k];
$ypsum=$ypsum+$y_p[$k];
}
$xcom=$xpsum/$counter;
$ycom=$ypsum/$counter;
$xcenter=0.5*$box_x;
$ycenter=0.5*$box_y;
$translatex_p=$xcenter - $xcom;
$translatey_p=$ycenter - $ycom;
#print "COM: $xcom $ycom\n";
#print "New center: $xcenter $ycenter\n";
#print "Translation vector: $translatex_p $translatey_p\n\n";
}
$upper_rm=0;
$lower_rm=0;
if ($cutoff>0) {
if ($switch==0) {
print "Checking for overlap….\n";
print "…this might actually take a while….\n";
$overlapcount=0;
for ($k=1; $k<=$zaehler; $k++) {
$uppercheck=0;
$lowercheck=0;
$progress=($k/$zaehler)*100;
$progress=round ($progress,2);
print STDOUT "$progress % done…\r";
if (substr($atmname_l[$k],0,1) eq "P") {
$res=$resnum_l[$k];
$overlap[$res]=0;
for ($i=1; $i<=$counter; $i++) {
if ($atmname_p[$i] eq "CA") {
$distance=abstand ($x_l[$k]+$translatex_l[$res],$y_l[$k]+$translatey_l[$res],$z_l[$k],$x_p[$i]+$translatex_p,$y_p[$i]+$translatey_p,$z_p[$i]);
if ($distance<=$cutoff) {
$overlap[$res]=1;
if ($z_l[$k] > $middle) {$uppercheck=1;}
if ($z_l[$k] < $middle) {$lowercheck=1;}
}
}
}
$overlapcount=$overlapcount+$overlap[$res];
if ($uppercheck==1) {$upper_rm++;}
if ($lowercheck==1) {$lower_rm++;}
}
}
}
print "\nThere are $overlapcount lipids within cut-off range…\n";
print "$upper_rm will be removed from the upper leaflet…\n";
print "$lower_rm will be removed from the lower leaflet…\n\n";
}
$newlipids=$pcount - $upper_rm - $lower_rm;
$newupper=$uppercount - $upper_rm;
$newlower=$lowercount - $lower_rm;
$totalatmn_new=$totalatmn - ($upper_rm + $lower_rm)*$atomperlipid;
print STDOUT "Writing scaled bilayer & centered protein…\n";
open(OUTPUT, ">$output");
print OUTPUT "What you read here has nothing to do with anything. So you don't have to read it. Thank you.\n";
print OUTPUT "$totalatmn_new\n";
for ($k=1; $k<=$counter; $k++) {
$newx=$x_p[$k]+$translatex_p;
$newy=$y_p[$k]+$translatey_p;
# print "$newx $newy\n";
printf OUTPUT "%5d%-5s%5s%5d%8.3f%8.3f%8.3f\n",$resnum_p[$k],$resname_p[$k],$atmname_p[$k],$atmnum_p[$k],$newx,$newy,$z_p[$k];
}
if ($switch==0) {
for ($k=1; $k<=$zaehler; $k++) {
$res=$resnum_l[$k];
$newx=$x_l[$k]+$translatex_l[$res];
$newy=$y_l[$k]+$translatey_l[$res];
if ($cutoff>0) {
if ($overlap[$res]==0) {
printf OUTPUT "%5d%-5s%5s%5d%8.3f%8.3f%8.3f\n",$resnum_l[$k],$resname_l[$k],$atmname_l[$k],$atmnum_l[$k],$newx,$newy,$z_l[$k];
}
}
if ($cutoff==0) {
printf OUTPUT "%5d%-5s%5s%5d%8.3f%8.3f%8.3f\n",$resnum_l[$k],$resname_l[$k],$atmname_l[$k],$atmnum_l[$k],$newx,$newy,$z_l[$k];
}
}
}
printf OUTPUT "%10.5f%10.5f%10.5f\n",$box_x,$box_y,$box_z;
close OUTPUT;
print "\n\nCalculating Area per lipid…\n";
$protein_xmax=int ($protein_xmax+1);
$protein_xmin=int ($protein_xmin);
$protein_ymax=int ($protein_ymax+1);
$protein_ymin=int ($protein_ymin);
$xrange=$protein_xmax-$protein_xmin;
$yrange=$protein_ymax-$protein_ymin;
print "Protein X-min/max: $protein_xmin $protein_xmax\n";
print "Protein Y-min/max: $protein_ymin $protein_ymax\n";
print "X-range: $xrange A Y-range: $yrange A\n";
if ($protein_xmin != 0 or $protein_xmax != 0){
for ($k=1; $k<=$counter; $k++) {
$x_p[$k]=10*$x_p[$k]-$protein_xmin;
$y_p[$k]=10*$y_p[$k]-$protein_ymin;
}
}
print "Building $xrange X $yrange 2D grid on protein coordinates…\n";
#for ($x=0; $x<=$xrange; $x=$x+$gridsize) {
for ($x=0; $x<=$xrange/$gridsize; $x=$x+1) {
for ($y=0; $y<=$yrange/$gridsize; $y=$y+1) {
$grid[$x][$y]=0;
}
}
print "Calculating area occupied by protein..\n";
print "full TMD..\n";
for ($k=1; $k<=$counter; $k++) {
if ($z_p[$k]>=$lower and $z_p[$k]<=$upper) {
$x = int( $x_p[$k] / $gridsize);
$y = int( $y_p[$k] / $gridsize);
$grid[$x][$y]=1;
}
$progress=$k/$counter *100;
$progress=round ($progress,1);
print "$progress % done…\r";
}
$howmany=0;
for ($x=0; $x<=$xrange/$gridsize; $x=$x+1) {
for ($y=0; $y<=$yrange/$gridsize; $y=$y+1) {
$howmany=$howmany+$grid[$x][$y];
}
}
$areaprotein_total=($gridsize)**2 *$howmany *0.01;
$arealipid_total=($box_x * $box_y - $areaprotein_total)/($newlipids*0.5);
print "upper TMD..\n";
for ($x=0; $x<=$xrange/$gridsize; $x=$x+1) {
for ($y=0; $y<=$yrange/$gridsize; $y=$y+1) {
$grid[$x][$y]=0;
}
}
for ($k=1; $k<=$counter; $k++) {
if ($z_p[$k]>=$middle and $z_p[$k]<=$upper) {
$x = int( $x_p[$k] / $gridsize);
$y = int( $y_p[$k] / $gridsize);
$grid[$x][$y]=1;
}
$progress=$k/$counter *100;
$progress=round ($progress,1);
print "$progress % done…\r";
}
$howmany=0;
for ($x=0; $x<=$xrange/$gridsize; $x=$x+1) {
for ($y=0; $y<=$yrange/$gridsize; $y=$y+1) {
$howmany=$howmany+$grid[$x][$y];
}
}
$areaprotein_upper=($gridsize)**2 *$howmany *0.01;
$arealipid_upper=($box_x * $box_y - $areaprotein_upper)/($newupper);
print "lower TMD..\n";
for ($x=0; $x<=$xrange/$gridsize; $x=$x+1) {
for ($y=0; $y<=$yrange/$gridsize; $y=$y+1) {
$grid[$x][$y]=0;
}
}
for ($k=1; $k<=$counter; $k++) {
if ($z_p[$k]>=$lower and $z_p[$k]<=$middle) {
$x = int( $x_p[$k] / $gridsize);
$y = int( $y_p[$k] / $gridsize);
$grid[$x][$y]=1;
}
$progress=$k/$counter *100;
$progress=round ($progress,1);
print "$progress % done…\r";
}
$howmany=0;
for ($x=0; $x<=$xrange/$gridsize; $x=$x+1) {
for ($y=0; $y<=$yrange/$gridsize; $y=$y+1) {
$howmany=$howmany+$grid[$x][$y];
}
}
$areaprotein_lower=($gridsize)**2 *$howmany *0.01;
$arealipid_lower=($box_x * $box_y - $areaprotein_lower)/($newlower);
print "Area per protein: $areaprotein_total nm^2\n";
print "Area per lipid: $arealipid_total nm^2\n\n";
print "Area per protein, upper half: $areaprotein_upper nm^2\n";
print "Area per lipid, upper leaflet : $arealipid_upper nm^2\n\n";
print "Area per protein, lower half: $areaprotein_lower nm^2\n";
print "Area per lipid, lower leaflet : $arealipid_lower nm^2\n\n";
print STDOUT "Writing Area per lipid…\n";
open(OUTPUT, ">$area");
#if ($grid[$x][$y]==1) { print OUTPUT "$x $y\n";}
print OUTPUT "$arealipid_total $arealipid_upper $arealipid_lower \n";
close OUTPUT;
print "Done!\n\n\n";
hi
i need to numbering beads in the cubics smaller than my xsd box.
expaination:
i want to calculate local densities in my box and subtract from global density of box.
i need a script to that.
my box is 237 *237*237 A3.
imaginery cubics is 1*1*1 A3.
it means i need 237*237*237 local density.(for this, the number of special beads(because i have 3 kind of beads) in these cubics is requierd)
Post preview:
Close preview