A method framework for high-performance data mining of Monte Carlo Protein Simulations

Biomolecular simulations on supercomputers using Monte Carlo or Molecular Dynamics methods generate sets of trajectories containing a huge number of molecular conformations (10^8 to 10^14), each represented as a high-dimensional vector of the molecules' degrees of freedom. Currently, both the dimensionality and the sheer number of conformations pose serious problems for the analysis of such trajectories.
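As a rough illustration of the representation described above, the following minimal sketch treats a conformation as a plain list of torsion angles in radians and compares two conformations with a periodicity-aware root-mean-square difference. The choice of torsion angles as the degrees of freedom, the distance measure, and the function names angular_distance and nearest_conformation are assumptions made for this example only; they are not taken from the project itself.

```python
import math


def angular_distance(conf_a, conf_b):
    """Root-mean-square angular difference between two conformations.

    Each conformation is assumed (for this sketch) to be a sequence of
    torsion angles in radians, one entry per degree of freedom.
    """
    if len(conf_a) != len(conf_b) or len(conf_a) == 0:
        raise ValueError("conformations must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(conf_a, conf_b):
        # Wrap the difference into [-pi, pi] so that angles on either
        # side of the +/-180 degree boundary count as close.
        d = math.atan2(math.sin(a - b), math.cos(a - b))
        total += d * d
    return math.sqrt(total / len(conf_a))


def nearest_conformation(query, trajectory):
    """Index of the conformation in `trajectory` closest to `query`.

    Uses a plain linear scan over all conformations.
    """
    return min(
        range(len(trajectory)),
        key=lambda i: angular_distance(query, trajectory[i]),
    )
```

A linear scan like this costs one distance evaluation per stored conformation for every query, which is impractical at the trajectory sizes quoted above and gives a sense of why dedicated indexing methods are needed.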

To gain further insight into the details of the biophysical processes underlying these simulations, this project will develop high-performance methods for the comparison, ordering, indexing, and mining of molecular structure ensembles and their dynamic substructures. Large-scale Monte Carlo simulations of protein folding (1) and peptide aggregation from the SL-Bio will provide the first use cases for the method framework developed within the project.

(1) Mohanty, S.; Meinke, J.; Zimmermann, O.
Folding of Top7 in unbiased all-atom Monte Carlo simulations
Proteins 81(8), 1446-1456 (2013) [10.1002/prot.24295]

Project duration:
January 2015 - December 2016

Prof. Dr. T. Seidl, Data Management and Data Exploration Group, Computer Science 9, RWTH Aachen University
Dr. Olav Zimmermann, Simulation Laboratory Biology, Jülich Supercomputing Centre, Forschungszentrum Jülich