A research team led by Associate Professor Jonathan Sebat, Ph.D., of Cold Spring Harbor Laboratory (CSHL) has developed a sensitive and accurate way of identifying gene copy number variations (CNVs). The method, which is described in a paper published in Genome Research, uses new DNA sequencing technologies to look for regions of the genome that vary in copy number between individuals in the population. Capable of detecting a wide range of different classes of CNVs, large and small, this method allows researchers to extract more genetic information from the complete genome sequence of an individual.
CNVs are regions of the genome that vary in the number of copies between individuals. These variants were once considered to be anomalies that occurred rarely among healthy individuals. As a result of discoveries made in 2004, CNVs are now recognized as a major source of human genetic variation, and methods for detecting CNVs have proven to be an effective approach for identifying genetic risk factors for disease.
Genome sequencing technologies are improving at a rapid pace. The current challenge is to find ways to extract all of the genetic information from the data. One of the biggest challenges has been the detection of CNVs. Sebat, in collaboration with Seungtai Yoon of CSHL and Kenny Ye, Ph.D., at the Albert Einstein College of Medicine, developed a statistical method to estimate DNA copy number of a genomic region based on the number of sequences that map to that location (or "read depth"). When the genomes of multiple individuals are compared, regions that differ in copy number between individuals can be identified.
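The read-depth idea described above can be illustrated with a minimal Python sketch. It is not the statistical method developed by Sebat, Yoon, and Ye, whose details are not given here. It is a simplified stand-in that assumes fixed-size windows, a median-based baseline, and a diploid genome. All function names, the window size, and the `min_diff` threshold are hypothetical choices made for illustration.

```python
from statistics import median


def window_read_counts(read_starts, genome_length, window_size):
    """Count how many reads start in each fixed-size window (the "read depth")."""
    n_windows = (genome_length + window_size - 1) // window_size
    counts = [0] * n_windows
    for pos in read_starts:
        if 0 <= pos < genome_length:
            counts[pos // window_size] += 1
    return counts


def estimate_copy_number(counts, ploidy=2):
    """Scale each window's count by the median count, assumed to represent `ploidy` copies."""
    baseline = median(counts)
    if baseline == 0:
        raise ValueError("median read depth is zero; cannot normalize")
    return [ploidy * c / baseline for c in counts]


def differing_windows(cn_a, cn_b, min_diff=0.5):
    """Return indices of windows whose estimated copy number differs between two samples."""
    return [i for i, (a, b) in enumerate(zip(cn_a, cn_b)) if abs(a - b) >= min_diff]


# Toy data (assumed): 5 windows of 100 bases, 10 reads per window in sample A.
reads_a = [w * 100 + k * 10 for w in range(5) for k in range(10)]
# Sample B loses half of its reads in window 2, mimicking a one-copy deletion.
reads_b = [p for p in reads_a if not (200 <= p < 250)]

cn_a = estimate_copy_number(window_read_counts(reads_a, 500, 100))
cn_b = estimate_copy_number(window_read_counts(reads_b, 500, 100))
variable = differing_windows(cn_a, cn_b)
```

In this toy example, window 2 is the only one flagged as differing between the two samples. Real data would call for a proper statistical model of read-depth noise and for correction of mapping and sequence-composition biases, none of which this sketch attempts.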
The new method allows the detection of small structural variants that could not be detected using earlier microarray-based methods. This is significant because most of the CNVs in the genome are less than 5,000 nucleotides in length. The new method is also able to detect certain classes of CNVs that other sequencing-based approaches struggle with, particularly those located in complex genomic regions where rearrangements occur frequently.
The development of this novel method is timely. The 1000 Genomes Project was launched in 2008 as an international effort to sequence the genomes of 2000 individuals across geographic and ethnic regions to catalog human genetic variation. Sebat's team, along with many other groups, has contributed to the production and analysis of these data.
This innovation improves the detection of structural variants from whole-genome sequence data, which will make the detection of disease-causing CNVs more sensitive.
Source: Cold Spring Harbor Laboratory