Structural genomics accelerates protein structure determination

Proteins are molecular machines that transport substances, catalyze chemical reactions, pump ions, and recognize signaling molecules. They are chains of amino acids, and the amino acid sequence is known for many of them. However, the functions a protein can carry out inside the cell are determined by its three-dimensional structure. Establishing this so-called tertiary structure is a great challenge for scientists, and structure analysis therefore has a lot of catching up to do. To push progress, the National Institute of General Medical Sciences (NIGMS) of the U.S. National Institutes of Health (NIH) has invested over 500 million dollars in this field over the last ten years as part of the Protein Structure Initiative, in the hope of making significant progress in medicine and biological research.

Informatics professor Burkhard Rost and Marco Punta, Carl von Linde Junior Fellow at the Institute for Advanced Study (IAS) of the TU München, are involved in this large-scale project. They are affiliated with the New York Consortium on Membrane Protein Structure (NYCOMPS), one of nine funded membrane research centers. The NYCOMPS scientists place special emphasis on membrane proteins because these play a key role in pharmacological research. When a pharmaceutical agent enters the cell, it normally interacts first with membrane proteins. Knowing the protein structure is essential to understanding this interaction at the molecular level.

However, for these very important membrane proteins, deciphering the tertiary structure experimentally is particularly difficult. For example, the recombinant production of many membrane proteins is a major challenge, and purification and crystallization are also difficult steps. The result: although around 25 percent of all proteins are membrane proteins, they account for less than one percent of the proteins with known structures. Membrane protein structures are thus underrepresented by a factor of at least 25. Given their medical relevance, they should be much better known.
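The factor of 25 follows directly from the two percentages quoted above. As a rough check, taking "around 25 percent" at face value and treating "less than one percent" as an upper bound:

```latex
\[
\frac{\text{share of membrane proteins among all proteins}}
     {\text{share of membrane proteins among known structures}}
> \frac{25\,\%}{1\,\%} = 25
\]
```

Because the denominator is below one percent, 25 is a lower bound on the underrepresentation rather than an exact value.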

Since the experimental analysis of a membrane protein can take several years, the NYCOMPS scientists applied a bioinformatics strategy known as homology modeling. The basic assumption of this strategy is that proteins with common evolutionary ancestors resemble each other in their amino acid sequences as well as in their three-dimensional structures. If the structure of one of the related proteins can be determined experimentally, the structures of the remaining ones can be predicted from it.
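The core idea of homology modeling can be illustrated with a deliberately simplified sketch: given an alignment between a target sequence and a template of known structure, each aligned target residue inherits the position of its template counterpart. This is not the NYCOMPS pipeline; the function name `transfer_coordinates`, the one-point-per-residue representation, and the toy data are assumptions made purely for illustration. Real modeling tools additionally build missing loops, place side chains, and refine the result.

```python
from typing import List, Optional, Tuple

Coord = Tuple[float, float, float]


def transfer_coordinates(
    aligned_target: str,
    aligned_template: str,
    template_coords: List[Coord],
) -> List[Optional[Coord]]:
    """Copy one coordinate per residue from template to target.

    Both sequences must come from the same pairwise alignment, with "-"
    marking gaps. The result has one entry per target residue; entries are
    None where the target has no template counterpart (an insertion that a
    real modeling tool would have to build separately).
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    n_template = sum(1 for c in aligned_template if c != "-")
    if n_template != len(template_coords):
        raise ValueError("need exactly one coordinate per template residue")

    model: List[Optional[Coord]] = []
    template_index = 0
    for target_char, template_char in zip(aligned_target, aligned_template):
        coord: Optional[Coord] = None
        if template_char != "-":
            coord = template_coords[template_index]
            template_index += 1
        if target_char != "-":
            # Target residue exists in this column: inherit the template
            # position, or None if the template has a gap here.
            model.append(coord)
    return model


if __name__ == "__main__":
    # Toy alignment and coordinates (made up, not real protein data).
    target = "AC-GT"
    template = "A-DGT"
    coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    print(transfer_coordinates(target, template, coords))
```

In the toy example, the second target residue has no partner in the template and is left unmodeled, which is where the quality of the alignment and the degree of relatedness start to matter.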

In the case of the bacterial membrane protein TehA, they were able to bring all the pieces of the puzzle together. "In a screening procedure, we searched for TehA-related membrane proteins by comparing tens of thousands of amino acid sequences. Using a multistage selection process, we chose 43 proteins from 38 different organisms," says TUM computational biologist Marco Punta.

Scientists at Columbia University have now succeeded in experimentally determining the tertiary structure of the membrane protein TehA from the bacterium Haemophilus influenzae using X-ray crystallography. With a resolution of 0.12 nanometers (1.2 ångströms), this structure is among the best crystal structures ever obtained for a membrane protein. Furthermore, the experiment held a surprise: the TehA membrane protein has a hitherto entirely unknown fold.

After getting to know the "TehA family," the scientists at Columbia University succeeded in deriving the structures of the individual proteins. In particular, they modeled the structure of the plant membrane protein SLAC1: by comparing it with the experimentally determined structure of TehA, they could build a structural model for SLAC1 – entirely without experimentation on SLAC1 itself, using nothing but bioinformatics methods.

"Using this procedure we aim to have a high structure determination throughput rate. determining more protein structures in a shorter time – that was our goal, in particular for the membrane proteins. The results at hand show that this strategy can work for membrane proteins, too," says Burkhard Rost.

Ultimately, the three-dimensional structures are determined in order to identify the function of the proteins with the help of mutagenesis tests. Although the membrane proteins TehA and SLAC1 are only distantly related – the overlap of their amino acid sequences is only 19 percent – the predicted tertiary structure of SLAC1 was so good that a new hypothesis on the function of the SLAC1 membrane protein could be put forward.
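A figure such as the 19 percent overlap is typically obtained by aligning two sequences and counting matching positions. The sketch below assumes that "overlap" means sequence identity and uses one common convention: identical residues divided by the number of alignment columns in which neither sequence has a gap. The article does not say which convention was used, and other definitions (for example, dividing by the length of the shorter sequence) give somewhat different values. The function name and the example sequences are illustrative only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the gap-free columns of a pairwise alignment.

    Both strings must come from the same alignment (equal length, "-" for
    gaps). Columns where either sequence has a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    aligned_columns = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gap columns
        aligned_columns += 1
        if res_a == res_b:
            identical += 1

    if aligned_columns == 0:
        return 0.0
    return 100.0 * identical / aligned_columns


if __name__ == "__main__":
    # Made-up six-column alignment: 3 identical residues in 5 gap-free columns.
    print(percent_identity("MKT-LV", "MRTALI"))
```

At identities as low as the one quoted for TehA and SLAC1, the alignment itself becomes uncertain, which is why a usable model at this level of relatedness is notable.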

SLAC1 is found in the stomata of the plant Arabidopsis thaliana. Stomata control the exchange of water vapor and carbon dioxide between the plant and its environment, which is very important for photosynthesis. As part of the anion channel, the membrane protein SLAC1 plays a role in this process as well: it influences the turgor pressure – the pressure of cell fluid on the cell wall – and thus the gas exchange of the plant cell in response to environmental influences such as aridity and high carbon dioxide concentration.

SLAC1 anion channels are entirely novel in structure and, apparently, in their mechanism of ion conductance. The SLAC1 pore has a relatively uniform diameter, but in the middle a phenylalanine residue blocks the way. The results suggest that this amino acid is turned away when the ion channel is activated through binding of a triggering protein.

Source: Technische Universitaet Muenchen