Researchers home in on the best software for detecting microRNAs in plants

Almost twenty years ago, the process of RNA silencing was discovered in plants, whereby small fragments of RNA inactivate a portion of a gene during protein synthesis. These fragments--called microRNAs (abbreviated as miRNAs)--have since been shown to be essential at nearly every stage of growth and development in plants, from the production of flowers, stems, and roots to the ways plants interact with their environment and ward off infection.

The detection and characterization of miRNAs is an active field of research. In the decade following their discovery in plants, over 1,000 bioinformatic tools were used to identify miRNAs and map out potential places to look for them.

In a new study published in the journal Applications in Plant Sciences, researchers set out to simplify the process of miRNA discovery and characterization by testing eight of the most commonly used miRNA applications, evaluating each based on accuracy, sensitivity, speed, and the amount of computer memory used by the software.

The sheer number of applications available can make studying miRNAs a daunting task. The majority of these tools (77%) were also developed and tested with animal systems in mind, making it unclear whether their utility can be extended for the detection and analysis of miRNAs in plants.

"The biosynthetic pathways of miRNA in plants and animals are different," said Dr. Qi You, a lecturer at the Agricultural College of Yangzhou University and senior author of the study. "At the same time, the stem-loop structure of plant miRNA precursors is larger than that of animals. Plant miRNAs will be modified by methylation, but animal miRNAs will not."

Things get even trickier when considering that individuals of the same plant species can have varying genome sizes, making it difficult to develop a single standardized approach. Additionally, some programs only look for miRNAs that are already known to exist, while others comb through genomes to find new additions to the list.

Finally, while most tools use search criteria to directly identify miRNA sequences in a genome, others are designed to search for miRNAs in their early developmental stages (precursor miRNA), when more base pairs are attached at either end before a molecular cleaving process trims them down to size.

You and her colleagues took eight miRNA applications (one that was developed for use in animals, and two that are exclusively used to locate precursor miRNA) and rigorously tested them on four plant species, each with a different genome size: thale cress (Arabidopsis thaliana), rice (Oryza sativa), maize (Zea mays), and wheat (Triticum aestivum).

While all eight programs had similar performance in terms of accuracy, there were some obvious winners and losers for all other metrics scored, analyzed both together and separately.

First, the software developed for miRNA detection in animals scored low on sensitivity and, on average, took longer to run than most of the other programs. It was particularly bad at identifying known miRNAs in maize and had a high rate of false positives relative to true positives in wheat, indicating it likely isn't reliable for use in plants.
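The article does not spell out how the metrics were computed. As a rough illustration only, the sketch below uses the standard confusion-matrix definitions of sensitivity and accuracy, plus the share of false positives among all reported hits. The function names and the example counts are hypothetical and are not taken from the study.

```python
# Illustrative sketch only: standard textbook definitions, not the study's code.
# tp = known miRNAs the tool found, fn = known miRNAs it missed,
# fp = sequences wrongly reported as miRNAs, tn = non-miRNAs correctly rejected.

def sensitivity(tp: int, fn: int) -> float:
    """Fraction of known miRNAs that the tool recovered."""
    found_or_missed = tp + fn
    return tp / found_or_missed if found_or_missed else 0.0

def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all calls (positive and negative) that were correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0

def false_positive_share(tp: int, fp: int) -> float:
    """Fraction of reported miRNAs that were false positives."""
    reported = tp + fp
    return fp / reported if reported else 0.0

if __name__ == "__main__":
    # Hypothetical counts, invented for this example.
    tp, fn, fp, tn = 80, 20, 10, 890
    print(f"sensitivity: {sensitivity(tp, fn):.3f}")
    print(f"accuracy: {accuracy(tp, tn, fp, fn):.3f}")
    print(f"false-positive share: {false_positive_share(tp, fp):.3f}")
```

With counts like these, a tool can look highly accurate overall while still missing a fifth of the known miRNAs, which may be one reason the programs were hard to tell apart on accuracy alone.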

Two programs stood out from the rest for their high sensitivity and low run times. The first, miRExpress--a program developed in 2009 for the detection of putative novel miRNAs--had the highest sensitivity for three of the four species tested and used the least amount of memory. The runner-up was the program sRNAbench, which had the highest sensitivity for wheat and similar run times.

Of the two, sRNAbench has many of the same functions and utility as the program developed for animal systems tested in this study, making it a solid stand-in for use in plants.

Ultimately, however, there's no one-size-fits-all option when it comes to choosing the best program, and researchers should choose based on their needs and available resources, You said.

"Researchers should consider the size of the genome of the tested species, the size of the sample data, and the configuration of their own computer equipment."

For labs that might not have access to high-performance computing or ample computer memory, the authors recommend miRExpress, which maintains high accuracy while being less resource-intensive.

For those with more RAM, sRNAbench is the next step up, with high accuracy and applicability to several species with varying genome sizes. Detecting precursor miRNA likewise requires large amounts of RAM; for that task, the authors recommend the program miRkwood.

Credit: Botanical Society of America