
2015: Mapping the spaces of chemical compounds and materials


Scientists from the Faculty of Physics of M.V. Lomonosov Moscow State University, in collaboration with colleagues from the University of Strasbourg, have developed an efficient approach to analyzing and visualizing huge amounts of data on the structures and properties of chemical compounds and materials, which opens up new ways to design them.

Analyzing, visualizing, mapping and navigating the multidimensional chemical space that represents the structures and properties of chemical compounds and materials is a new and very promising approach to their design. The resulting graphical maps make it possible to generalize the accumulated results of numerous experiments, which can number in the millions – in this case one speaks of Big Data.

Scientists from the Faculty of Physics of M.V. Lomonosov Moscow State University (research group of Dr. Igor I. Baskin, Dept. of Polymer and Crystal Physics), in collaboration with colleagues from the University of Strasbourg (Laboratory of Chemoinformatics headed by Prof. A. Varnek), have developed an efficient approach to analyzing and visualizing large amounts of data on the structures and properties of chemical compounds and materials using incremental generative topographic mapping. Each chemical object (a chemical compound or a material) is treated as a point in a multidimensional space of chemical or material data. The set of points corresponding to the data under study is approximated by a smooth two-dimensional manifold, which can be pictured as a flexible “rubber sheet” hovering in the data space. Maps of the space are built by projecting the chemical objects onto this manifold.

This procedure can be applied efficiently to any amount of data on the structures and properties of chemical objects. The resulting maps are highly informative: they can be used both to analyze the available data and to conduct a directed search for new chemical compounds or materials with desirable properties. Moreover, the approach makes it possible to compare large data sets, as the article demonstrates by comparing large libraries of chemical compounds.
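The “rubber sheet” idea can be illustrated with a minimal sketch of plain generative topographic mapping in NumPy. This is not the authors' code and does not reproduce the incremental variant used in the paper; it is ordinary batch GTM, and the function names (`fit_gtm`, `project`, `responsibilities`), the grid sizes, the basis-function width and the regularization constant are all assumptions chosen for illustration. A grid of nodes in a 2D latent square is mapped into the data space through radial basis functions, the sheet is fitted to the data by expectation-maximization, and each object is then placed on the map at its responsibility-weighted mean grid position.

```python
import numpy as np

def sq_dists(a, b):
    """Squared Euclidean distances between rows of a and rows of b.
    Builds a (len(a), len(b), D) array, so it only suits small demos."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)

def responsibilities(X, Y, beta):
    """Posterior probability of each sheet node (rows) for each object (columns)."""
    logp = -0.5 * beta * sq_dists(Y, X)
    logp -= logp.max(axis=0, keepdims=True)      # numerical stability
    R = np.exp(logp)
    return R / R.sum(axis=0, keepdims=True)

def fit_gtm(X, grid=5, n_rbf=3, rbf_width=1.0, reg=0.1, n_iter=30):
    """Fit a 2D sheet to X (N objects x D descriptors, D >= 2) by batch EM."""
    N, D = X.shape
    g = np.linspace(-1.0, 1.0, grid)
    Z = np.array([(a, b) for a in g for b in g])         # latent grid nodes (K, 2)
    c = np.linspace(-1.0, 1.0, n_rbf)
    C = np.array([(a, b) for a in c for b in c])         # RBF centres (M, 2)
    sigma = rbf_width * (c[1] - c[0])
    Phi = np.exp(-sq_dists(Z, C) / (2.0 * sigma ** 2))
    Phi = np.hstack([Phi, np.ones((len(Z), 1))])         # bias column

    # Start with the sheet lying flat in the plane of the first two
    # principal components of the data.
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    target = mean + Z @ (Vt[:2] * (s[:2, None] / np.sqrt(N)))
    W = np.linalg.lstsq(Phi, target, rcond=None)[0]
    Y = Phi @ W                                          # node images in data space
    beta = D / sq_dists(Y, X).mean()                     # initial noise precision

    for _ in range(n_iter):
        R = responsibilities(X, Y, beta)                 # E-step
        G = np.diag(R.sum(axis=1))
        A = Phi.T @ G @ Phi + (reg / beta) * np.eye(Phi.shape[1])
        W = np.linalg.solve(A, Phi.T @ R @ X)            # M-step: reshape the sheet
        Y = Phi @ W
        beta = N * D / max((R * sq_dists(Y, X)).sum(), 1e-12)
    return Z, Y, beta

def project(X, Z, Y, beta):
    """2D map coordinates: responsibility-weighted mean of the grid nodes."""
    return responsibilities(X, Y, beta).T @ Z
```

With a descriptor matrix `X` of shape `(n_objects, n_descriptors)`, calling `Z, Y, beta = fit_gtm(X)` and then `project(X, Z, Y, beta)` returns one pair of map coordinates per object, each lying inside the square [-1, 1] x [-1, 1]. Colouring those points by a measured property gives a simple property map of the kind described above.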

The results of this work have been published in the paper: H. A. Gaspar, I. I. Baskin, G. Marcou, D. Horvath, and A. Varnek, Chemical Data Visualization and Analysis with Incremental Generative Topographic Mapping: Big Data Challenge, J. Chem. Inf. Model. 55, 84–94 (2015).