Author Archives: frank

The artistry of molecules

Molecules are beautiful things, intricate and infinitely variable. As part of research publications it can be useful to catch them from their best angles. This short post gives some tips on how to present molecules in publications.

Our molecule for today is Prostaglandin-F2α
PGE2a.gif

There are a number of chemical databases and ways of expressing molecular identities.
A Wikipedia search is not a bad place to start.
On the right of the page you’ll see a list of “Identifiers” for various chemical databases.
PubChem, a public database run by the NIH, lists Prostaglandin-F2α and shows an image.
It also gives an isomeric SMILES code, a one-line text encoding of the structure including its stereochemistry:
CCCCC[C@@H](/C=C/[C@H]1[C@@H](C[C@@H]([C@@H]1C/C=C\CCCC(=O)O)O)O)O
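
Before pasting a long SMILES code anywhere, it can be worth a quick sanity check that nothing was lost in the copy. The following is only a rough Python sketch, not a real SMILES parser: it assumes the string uses bracket atoms and the common organic-subset symbols, and the function name `summarize` is made up for this example.

```python
import re
from collections import Counter

# Isomeric SMILES for Prostaglandin-F2α, copied from the post.
# A raw string keeps the backslash bond marker intact.
SMILES = r"CCCCC[C@@H](/C=C/[C@H]1[C@@H](C[C@@H]([C@@H]1C/C=C\CCCC(=O)O)O)O)O"

# Bracket atoms such as [C@@H], or bare atoms from the SMILES organic subset.
ATOM = re.compile(r"\[([^\]]+)\]|(Cl|Br|[BCNOPSFI])|([bcnops])")


def summarize(smiles):
    """Count explicitly written atoms per element and chiral tags (@ or @@)."""
    elements = Counter()
    stereocentres = 0
    for bracket, plain, aromatic in ATOM.findall(smiles):
        if bracket:
            # Skip an optional isotope number, then read the element symbol.
            symbol = re.match(r"\d*([A-Z][a-z]?|[a-z])", bracket).group(1)
            if "@" in bracket:
                stereocentres += 1
        else:
            symbol = plain or aromatic
        elements[symbol.capitalize()] += 1
    return elements, stereocentres


elements, stereocentres = summarize(SMILES)
print(dict(elements), stereocentres)
```

For the code above this reports 20 carbons, 5 oxygens and 5 tagged stereocentres, which agrees with the formula C20H34O5 (hydrogens are implicit in SMILES and are not counted here).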

Now we can take this code and do something really interesting with it.

Download and run a program called Jmol.

It’s an open source, cross-platform Java program. Once it is open, go to the File menu and click on “Get MOL”, then paste the SMILES code in and…

JMOL.png

You’ll now see a great 3D rendering of your molecule. You can rotate it at will, and change the colours, the size of the atoms and other settings.
Jmol also loads much larger molecules, including whole protein structures. This is where the Protein Data Bank comes in useful. Look up your protein, for example BDNF, and you’ll notice a four-character identifier (in this case 1B8M) as well as an option to save a .pdb file. Either can be used to visualize the protein in Jmol: File >> Get PDB for the identifier, or File >> Open for the saved file.
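
If you have a list of identifiers to fetch, a small helper can catch typos before you go looking for files. This is a minimal Python sketch under two assumptions that are not from the steps above: that classic PDB IDs are a digit 1-9 followed by three letters or digits, and that RCSB serves files at `https://files.rcsb.org/download/<ID>.pdb`. The function name `pdb_download_url` is hypothetical.

```python
import re

# Assumed format of a classic PDB ID: a digit 1-9, then three alphanumerics.
PDB_ID = re.compile(r"[1-9][A-Za-z0-9]{3}")


def pdb_download_url(pdb_id):
    """Return the (assumed) RCSB download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if not PDB_ID.fullmatch(pdb_id):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


print(pdb_download_url("1b8m"))
```

Passing a protein name such as "BDNF" instead of an identifier raises a `ValueError`, which is the sort of mix-up this is meant to catch.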

BDNF.png

This image was viewed with the following options (right click to bring them up):

  1. color >> atoms >> by scheme >> chains
  2. style >> structures >> ribbons

Jmol is a powerful piece of software with its own scripting language. Scripting makes it easy to standardize your images, because the same settings can be replayed for every figure.

Scripts can also be used to make animations.
For example, the animation of the Prostaglandin-F2α molecule above was created using the following script:

# goal: a 360 degree spin 

~degstep = 5 # or 2, degrees per step
~degrees = 0 # total degrees rotated

while (~degrees < 360)
{
    # SPIN
    rotate y @~degstep
    ~degrees += ~degstep
    
    # DISPLAY/WRITE FRAME
    refresh
    
    ~whereami = "degrees = " + ~degrees 
    print ~whereami
    
    ~fileprefix = "anim/"

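    # Offset the angle by 1000 so every file name has four digits
    # and the frames sort in order (m1005.jpg ... m1360.jpg).
    ~degreesplus = 1000 + ~degrees
    ~towrite = ~fileprefix + "m" + ~degreesplus + ".jpg"

    # Alternative: uncomment to use the raw angle in the file name instead.
    #~towrite = ~fileprefix + "m" + ~degrees + ".jpg"
    
    # WRITE THE IMAGE
    write image jpg @~towrite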
} 
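
The reason for adding 1000 to the angle in the file name is easier to see outside Jmol. Here is a small Python sketch that mirrors the loop above and lists the file names it would produce. The function `frame_names` and its parameters are made up for illustration and are not part of Jmol.

```python
def frame_names(step=5, prefix="anim/", offset=1000):
    """Mirror the Jmol loop: rotate by `step`, then name the frame."""
    names = []
    degrees = 0
    while degrees < 360:
        degrees += step
        names.append(f"{prefix}m{offset + degrees}.jpg")
    return names


padded = frame_names()            # m1005.jpg ... m1360.jpg
unpadded = frame_names(offset=0)  # m5.jpg ... m360.jpg

print(len(padded), padded[0], padded[-1])
print("padded names sort in frame order:", sorted(padded) == padded)
print("unpadded names sort in frame order:", sorted(unpadded) == unpadded)
```

With the offset, a plain alphabetical sort keeps the 72 frames in rotation order. Without it, m10.jpg sorts before m5.jpg, and a tool that imports frames alphabetically would shuffle the animation.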

These images can then be combined using another open source Java program called ImageJ. Import the frames as an image sequence, then save the result as an animated GIF.

Another useful bit of Jmol scripting is saving images with a transparent background, which makes it easier to place them in composite figures and documents.
Try this from the console:

wireframe 0.15
spacefill 25%
color bonds [40,40,40]
write pngT /YourfilePath/molecule.png
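
To apply the same look to several molecules, one option is to generate the console commands rather than retype them. The Python sketch below is only an illustration: it reuses the four commands above, adds Jmol's `load` command to open each file, and the file and folder names are hypothetical. Paste the printed text into the Jmol console or save it as a script file.

```python
from pathlib import PurePosixPath

# Style commands taken from the console example above.
STYLE = [
    "wireframe 0.15",
    "spacefill 25%",
    "color bonds [40,40,40]",
]


def render_script(structure_files, out_dir):
    """Build Jmol commands that render each file with the same style."""
    lines = []
    for path in structure_files:
        stem = PurePosixPath(path).stem  # file name without its extension
        lines.append(f'load "{path}"')
        lines.extend(STYLE)
        lines.append(f'write pngT "{out_dir}/{stem}.png"')
    return "\n".join(lines) + "\n"


# Hypothetical input files and output folder.
print(render_script(["mols/pgf2a.mol", "mols/bdnf.pdb"], "figures"))
```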

Links

This post has mostly focused on 3D molecules, for which you can use Jmol.
If you’d like to make 2D structures instead, there are a number of programs available:

  • MolView – Web based, open source 2D and 3D viewer
  • JChemPaint is an excellent open source desktop Java program which also uses SMILES codes
  • PubChem Sketcher is a free web-based 2D sketcher
  • BKChem is a Python program
  • JSME is an open source web-based editor
  • BALL project includes advanced 3D visualisation (open source)

Non-free

Reproducible Research

We’re currently preparing a series of guides for students which will cover the use of essential software tools for research. The modules will cover what we believe are the most important pieces of software for students to become proficient data managers and efficient paper and report writers.
All the tools listed are open source and allow for collaborative document management. The core principles are well described in Christopher Gandrud’s excellent book:
Reproducible Research – with R & RStudio

In preparing the lessons, it’s been interesting to survey the tools that we and other researchers use regularly and look at why we use these tools. Here’s an outline of the lessons:

  • Document Management
    • LaTeX
    • LaTeX extensions (templates, knitr, texcount)
    • Collaborative Editing (ShareLaTeX, Etherpad)
    • Bibliographic Management (BibTeX, Zotero)
    • Version Management with Git
    • Cloud Storage and Security (ownCloud, EncFS)
  • Statistics
    • R and RStudio
    • Shiny and knitr
  • Programming for Researchers
    • Why you’ll need to program and why it’s no harder than cooking
    • OS basics and Bash
    • MATLAB and why (not) to use it
    • Python
  • Graphics
    • Image formats: vector graphics vs bitmaps
    • SVG and the fine art of Inkscape
    • Dynamic graphics with Processing
  • Web Publishing
    • HTML is your friend
    • Intellectual Property and Privacy
    • Running your own blog
    • Integrating with LaTeX
  • Hardware
    • Basic Electronics
    • Arduino
    • Principles of Neuroimaging
    • Putting it all together

The resources for these lessons will be published as open documents in the future. If there are other subjects you’d like to see covered, we’d be glad to hear about them in the comments.