Gene duplication sheds light on evolution of human complexity

HOUSTON -- (Nov. 3, 2009) -- A painstaking analysis of thousands of genes and the proteins they encode shows that human beings are biologically complex, at least in part, because of the way humans evolved to cope with redundancies arising from duplicate genes.

"We have found a specific evolutionary mechanism to account for a portion of the intricate biological complexity of our species," said Ariel Fernandez, professor of bioengineering at Rice University. "It is a coping mechanism, a process that enables us to deal with the fitness consequences of inefficient selection. It enables some of our proteins to become more specialized over time, and in turn makes us more complex."

Fernandez is the lead author of a paper slated to appear in the December issue of the journal Genome Research. The research is available online now.

Fernandez said the study drew from previous findings by his own research group and from the seminal work of Michael Lynch, Distinguished Professor of Biology at Indiana University and a recently elected fellow of the National Academy of Sciences. Lynch's work has shown that natural selection is less efficient in humans than in simpler creatures like bacteria. This "selection inefficiency" arises from the smaller population size of humans as compared with unicellular organisms.

"In all organisms, genes get duplicated every so often, for reasons we don't fully understand," Fernandez said. "When working efficiently, natural selection eliminates many of these duplicates, which are called 'paralogs.' In our earlier work, we saw that an unusual number of gene duplicates had survived in the human genome, which makes sense given selection inefficiency in humans."

In prior research on protein structure, Fernandez's team found that some proteins are packed more poorly than others. Moreover, they found that the least efficiently packed proteins are structurally stable only when they bind with partner proteins to form complexes.

"These poorly packed proteins are potential troublemakers when gene duplication occurs," Fernandez said. "The paralog encodes more copies of the protein than the body needs. This is called a 'dosage imbalance,' and it can make us sick. For instance, dosage imbalance has been implicated in Alzheimer's and other diseases."

Given selection inefficiency, Fernandez knew that paralogs encoding poorly packed proteins could remain in the human genome for quite a while. So he and graduate student Jianpeng Chen decided to examine whether gene duplicates had remained in the genome long enough for random genetic mutations to affect the paralogs dissimilarly. Fernandez and Chen, who is now a senior researcher in Beijing, China, cross-analyzed databases on genomics, protein structure, microRNA regulation and protein expression in such troublesome paralogs.

"The longer these duplicate genes stick around due to inefficient selection, the more likely they are to suffer a random mutation," Fernandez said. "Portions of every gene act to regulate protein expression -- by binding with microRNA, for example. We found numerous instances where random mutations had caused paralogs to be expressed dissimilarly, in ways that removed detrimental dosage imbalances."

Lynch said one aspect of Fernandez's research that is potentially groundbreaking is the observed tendency of proteins to evolve a more open structure in complex organisms.

"This observation fits with the general theory that large organisms with relatively small population sizes -- compared to microbes -- are subject to the vagaries of random genetic drift and hence the accumulation of very mildly deleterious mutations," Lynch said.

In principle, he said, the accumulation of such mutations may encourage a slight breakdown in protein stability. This, in turn, opens the door to interactions with other proteins that can return a measure of that lost stability.

"These are the potential roots for the emergence of novel protein-protein interactions, which are the hallmark of evolution in complex, multicellular species," Lynch said. "In other words, the origins of some key aspects of the evolution of complexity may have their origins in completely nonadaptive processes."

Fernandez said the research reveals how increasingly specialized proteins can evolve. He drew an analogy to a business that hires two delivery drivers who initially cover the same parts of town but eventually specialize to deliver only to specific neighborhoods.

"Eventually, even if times become tough, you cannot lay off either of them because they each became so specialized that your company needs them both," he said.

The simpler a creature is, the fewer specialized proteins it possesses. Humans and other higher-order mammals need many specialized proteins to build the specialized tissues in their skin, skeleton and organs. Even more specialized proteins are needed to maintain and regulate them. This complexity requires that the duplicates of the original jack-of-all-trades gene be retained, but this does not happen unless selection is inefficient. How such complexity arose is frequently a point of contention between proponents of evolution and intelligent design.

Fernandez and Chen looked at duplicate genes across the human genome and found that the more poorly packed a protein was, the more likely it was to be distinguished through paralog specialization.

"This supports the case for evolution because it shows that you can drive complexity with random mutations in duplicate genes," Fernandez said. "But this also implies that random drift must prevail over Darwinian selection. In other words, if Darwinian selection were ruthlessly efficient in humans -- as it is in bacteria and unicellular eukaryotes -- then our level of complexity would not be possible."

Source: Rice University