Machine learning for predicting damaging mutations

Next-generation sequencing technology has ushered in a new era in medicine, making it easier to determine the sequence of nucleotides in the DNA or the sequence of amino acids in the proteins of a specific individual and to use this information for both diagnosis and treatment. Minute alterations in these sequences, known as mutations, can indicate a minor disorder or, sometimes, a grave disease.

Scientists from Skoltech, the Technical University of Munich, St. Petersburg Polytechnic University and the Indian Institute of Technology Madras (Chennai, India) have developed a machine-learning-based method that analyzes the atomic structures of proteins and predicts the pathogenicity of mutations. The method is adapted for transmembrane proteins, which account for 25-30% of all the proteins in a cell and often serve as targets for drugs.

"In this study, we used a combination of 1D information on the amino acid sequences of proteins and 3D information on the protein's atomic structures to create an effective machine-learning-based model that helps identify disease-associated amino acid substitutions in membrane proteins," says the first author of the study and Assistant Professor at Skoltech, Petr Popov.

Credit: Skolkovo Institute of Science and Technology (Skoltech)