Genomics and Darwinism

The May 2009 issue of Genome Research (www.genome.org) is a special issue celebrating the 200th anniversary of Charles Darwin's birth, and the 150th anniversary of the publication of On the Origin of Species. Published online today, the issue features a collection of perspective reviews and primary research in comparative genomics, genome evolution, and population genomics. Authored by leaders in the field, the perspectives provide novel insight into topics at the heart of evolutionary genomics and evaluate the contributions of genomic research to studies of natural selection, human evolution, ancestry, quantitative trait variation, and the origin of prokaryotic organisms. The following sections highlight several of the primary research papers published in the issue, presenting novel insight on population genetics and molecular evolution.

1. Human population diversity and signals of recent positive selection

Recent advances in genotyping technology have allowed researchers to scan hundreds of thousands of single nucleotide polymorphisms (SNPs) and to analyze genetic variation within and between populations across the globe with unprecedented power, revealing more about our ancestry and evolutionary history than ever before. A series of primary research papers in this special issue of Genome Research investigates human adaptation and evolution on a genome-wide scale, describing novel fine-scale genetic structure within and between populations around the world and presenting evidence for targets of recent positive selection in the genome.

Using the extensive genotype data of 53 populations in the Human Genome Diversity Panel, a research team led by Joseph Pickrell, Graham Coop, and Jonathan Pritchard of the University of Chicago searched for regions of the genome that have been positively selected during recent evolutionary history. These are DNA sequences that over time became more common in members of a population as they adapted to factors such as climate and disease. The team found that positively selected regions of the genome are shared between closely related populations, whereas signals of selection are shared significantly less between more distantly related groups. The authors also identified previously unknown signals of selection, including genes in the NRG-ERBB4 signaling pathway that play a role in the development of the brain, heart, breast, and other tissues.

Finding regions of the genome that have been positively selected in populations can be complicated by many factors, including population growth, genetic bottlenecks, and variability in mutation and recombination. Applying a complex model of demographics, Rasmus Nielsen of UC Berkeley and colleagues analyzed a set of genes from samples of European Americans and African Americans and identified a significant number of genes under selection, including genes related to disease and muscle development. Importantly, this work is unique in that the data were obtained by direct sequencing rather than from SNP data sets, which are limited to previously known genetic variants.

Researchers led by Adam Auton and Carlos Bustamante of Cornell University analyzed patterns of genetic diversity in several global populations, including populations from Southern Europe, Latin America, and South Asia. The group found genetic evidence for historical admixture of Southern Europeans with Mexican populations, identified a genetic "gradient" in diversity from north to south across Europe, and discussed language and geography as factors in population structure within India. The authors emphasized that their study highlights the contribution of demographic and ancestral factors to the patterns of variation in modern populations.

A study led by Brian McEvoy and Peter Visscher of the Queensland Institute of Medical Research focused the hunt for genetic structure and signals of positive selection on Northern European populations. McEvoy and colleagues scanned genotypes from several populations, detecting genetic substructure so fine that they were able to genetically differentiate populations as closely related as those from Ireland and the UK. The authors also detected signals of positive selection in these populations, particularly at genes related to immunity, a result that hints at possible adaptations to events such as a disease epidemic.

Recognizing critical gaps in the genotyping data sets previously analyzed for human genetic structure, researchers led by Lynn Jorde of the University of Utah included genotype data from caste and tribal groups from India, Eastern Europe, and Malaysia in their analyses of worldwide populations. Strikingly, the group was able to discern fine-grained genetic substructure between upper- and lower-caste individuals of the same region, shedding light on the effects of social factors on genetic structure.

References:

Pickrell J.K. et al. Signals of recent positive selection in a worldwide sample of human populations. Genome Res doi:10.1101/gr.087577.108

Nielsen R. et al. Darwinian and demographic forces affecting human protein coding genes. Genome Res doi:10.1101/gr.088336.108

Auton A. et al. Global distribution of genomic diversity underscores rich complex history of continental human populations. Genome Res doi:10.1101/gr.088898.108

McEvoy B.P. et al. Geographical structure and differential natural selection among North European populations. Genome Res doi:10.1101/gr.083394.108

Xing J. et al. Fine-scaled human genetic structure revealed by SNP microarrays. Genome Res doi:10.1101/gr.085589.108

2. Rapidly evolving gene families

Researchers are searching for signals of selection in humans on a genome-wide scale, but they are also focusing on specific gene families to learn more about evolutionary mechanisms. In this special issue of Genome Research, three primary research papers have investigated how similar genes involved in a biological process have been shaped by evolution to produce functional variation.

Genes related to immune function are among the most genetically diverse, an evolutionary adaptation to combat the myriad of infectious agents that an organism encounters. In a paper from an international team led by Paul Norman and Peter Parham of Stanford University, the authors have described an elegant evolutionary mechanism by which novel genetic variation is introduced at two genes, KIR3DL1 and KIR3DS1, which encode components of the human innate immune system. The authors reconstruct the meiotic recombination events that gave rise to KIR3DL1/S1 variation in African and Eurasian subjects, describing how this mechanism, combined with ancient diversity preserved within the lineage, created novel defenses that are subject to natural selection.

The apolipoprotein L (APOL) gene family encodes factors critical for defense against Trypanosoma brucei, the parasitic organism that causes sleeping sickness, and also has known functions in the programmed cell death of damaged host cells. Eric Smith and Harmit Malik of the University of Washington have investigated the evolution of this gene family in primates, presenting new insights into how these genes have evolved to counter the pathogen components they interact with. Smith and Malik present evidence that the entire APOL family has evolved rapidly in primates, particularly at sites in the genes critical for interaction with pathogen proteins, which are themselves under selection to evade detection by the host. This study highlights the evolutionary conflict between host and pathogen, and the dynamic nature of the gene families that must adapt to it.

Geoffrey Findlay and colleagues from the University of Washington have taken a different approach to finding novel genes: shotgun proteomics. Using mass spectrometry, the group characterized 19 novel proteins in the seminal fluid of the fly Drosophila melanogaster. Seminal fluid proteins are known to evolve rapidly and play a key role in male reproductive success. The group then searched for the genes that encode the novel proteins, and found genes that had previously escaped detection. There is also evidence that, like other seminal fluid components, these genes are positively selected. By going after proteins rather than the genes that encode them, Findlay and colleagues demonstrated that proteomics-based methods of gene discovery can locate genes missed by computational methods of annotation and reveal more information about function.

References:

Norman P.J. et al. Meiotic recombination generates rich diversity in NK cell receptor genes, alleles, and haplotypes. Genome Res doi:10.1101/gr.085738.108

Smith E.E. et al. The apolipoprotein L family of programmed cell death and immunity genes rapidly evolved in primates at discrete sites of host–pathogen interactions. Genome Res doi:10.1101/gr.085647.108

Findlay G.D. et al. Proteomic discovery of previously unannotated, rapidly evolving seminal fluid genes in Drosophila. Genome Res doi:10.1101/gr.089391.108

3. Ancient exonization of RNA in plants

In comparison to humans, relatively few plant RNAs are subject to alternative splicing, the mechanism by which RNA transcripts are edited to generate functionally diverse protein products. However, this does not mean that alternative splicing has not played a crucial role during plant evolution. Researchers led by Brad Barbazuk of the Donald Danforth Plant Science Center searched plant genomes for alternative splicing events that have been conserved across the evolution of higher plants, and made a surprising discovery. They found that in higher plants, the gene encoding TFIIIA, a transcription factor required by the RNA-synthesizing enzyme RNA polymerase III, has acquired a new exon with striking resemblance to a noncoding RNA gene that TFIIIA regulates: 5S rRNA. Alternative splicing events that include the exon target the transcript for degradation.

"This is intriguing because 5S rRNA has thus been co-opted to play a regulatory role during the expression of the transcription factor, TFIIIA, which is the very gene responsible for activating 5S rRNA transcription," explained Barbazuk. The authors performed functional studies on Arabidopsis thaliana, observing that TFIIIA regulation is involved in the response to osmotic stress, a critical challenge for land plants. Barbazuk noted that the significance of their work is underscored by the fact that the exonized 5S rRNA is found in the genomes of all land plants investigated, suggesting that this event may have played a role in the adaptation of plants to land environments.

Reference:

Fu Y. et al. Alternative splicing of anciently exonized 5S rRNA regulates plant transcription factor TFIIIA. Genome Res doi:10.1101/gr.086876.108

Source: Cold Spring Harbor Laboratory