PacBio's SMRT sequencing provides scientists with a superior gorilla genome reference

MENLO PARK, Calif., March 31, 2016 -- Pacific Biosciences of California, Inc. (Nasdaq: PACB), the pioneer and leader in long-read sequencing using its Single Molecule, Real-Time (SMRT®) technology, today announced that scientists from the University of Washington, the McDonnell Genome Institute at Washington University in St. Louis, and other institutions have published the best genome assembly of the gorilla to date (a more than 150-fold improvement over previous assemblies) using long-read SMRT Sequencing from Pacific Biosciences. The peer-reviewed paper appears in the April 1 issue of the journal Science and is available online today.

Analysis of the gorilla genome promises to shed light on biological mechanisms behind speech, disease, neurological behavior, and other traits that separate us from our closest primate relatives. The previous gorilla genome assembly, built with short-read and Sanger sequencing data, was highly fragmented, containing more than 400,000 gaps with missing sequence, and was constructed using the human genome as a guiding reference. The new assembly, which was carried out de novo (i.e., without any reference information), represents a remarkable decrease in assembly fragmentation (433,861 pieces previously vs. 15,997 with PacBio data, or a >96% reduction in the degree of fragmentation). The PacBio assembly recovers 93% of the gaps and nearly all exons missing from the previous assembly, and provides at least 148 Mbp of additional euchromatic sequence. The authors used information from six additional western lowland gorilla genomes to create a pan-reference genome for use by the scientific community.
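The fragmentation figures above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the two piece counts quoted in this release; the function and variable names are illustrative and do not come from the paper or from any PacBio software.

```python
# Piece counts as quoted in the release (previous assembly vs. PacBio assembly).
previous_pieces = 433_861
pacbio_pieces = 15_997


def percent_reduction(before: int, after: int) -> float:
    """Return the percentage decrease from `before` to `after`."""
    return (before - after) / before * 100


def fold_reduction(before: int, after: int) -> float:
    """Return how many times smaller `after` is than `before`."""
    return before / after


reduction_pct = percent_reduction(previous_pieces, pacbio_pieces)
fold = fold_reduction(previous_pieces, pacbio_pieces)

print(f"{reduction_pct:.1f}% fewer pieces, about {fold:.0f}-fold")
```

With these inputs the reduction comes to roughly 96.3%, consistent with the ">96%" stated above, and the piece count falls about 27-fold. That ratio is a different quantity from the 150-fold improvement cited earlier, for which this release does not name the underlying metric.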

"Our results demonstrate the utility of long-read sequence technology to generate high-quality working draft genomes of complex vertebrate genomes without guidance from preexisting reference genomes," the authors report in the paper. "The genome assembly that results from using the long-read data provides a more complete picture of gene content, structural variation and repeat biology, as well as allows us to refine population genetic and evolutionary inferences."

The team used PacBio's SMRT Sequencing, followed by assembly and polishing with PacBio's new FALCON assembly and Quiver consensus algorithms, to create a more complete picture of the gorilla genome. FALCON enabled the researchers to traverse most repetitive structures, validating the utility of the experimental tool for assembling complex genomes. The paper also represents the first published use of this genome assembler. More information about the assembly is provided in the PacBio blog.

Jonas Korlach, Chief Scientific Officer of Pacific Biosciences, commented: "We are delighted by the authors' suggestion that our approach provides a routine way to assemble complex genomes without relying on reference genomes, bringing high-quality de novo mammalian assemblies within the reach of individual labs. This paper also demonstrates the importance of true long reads over scaffolding approaches for generating highly contiguous genomes without gaps, which is necessary for understanding gene content, population genetic diversity, ancestral evolution, and species biology."

Source: Bioscribe