Department of Plant Sciences


Analysis of Differential Expression

A novel method for the analysis of differential expression, baySeq, has been developed by Dr Hardcastle and released as an R package in Bioconductor. A manuscript has been submitted for publication in BMC Bioinformatics.

Discovery of small RNA loci

Novel methods for the discovery of small RNA loci have also been developed by Dr Hardcastle and released as the R package 'segmentSeq'. These tools have been used successfully to analyse several small RNA datasets recently published in high-profile journals (Molnar et al, 2010; Havecker et al, 2010).

Identifying Phased Loci

In plants, small RNAs may trigger the production of secondary small RNAs from the target messenger RNA (mRNA) rather than cleaving it. This may happen repeatedly, leading to cascades of small RNAs. These secondary small RNAs are phased: their start positions are at regular intervals along the mRNA sequence. Bruno Santos is developing a new method for identifying such phased small RNAs which overcomes the shortcomings of currently published methods. He aims to identify phased RNAs in several plant species, reconstruct the interaction networks between these RNAs and the messenger RNAs, and characterise their impact on the transcriptome network. Ho Ming Chen from the Institute of Plant and Microbial Biology, Taipei, Taiwan, visited the group for three months in 2009. She carried out work to test the hypothesis that it is 22-nucleotide small RNAs that trigger secondary RNA production, as well as characterising phased loci in tomatoes and viruses.
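The idea of phasing can be illustrated with a minimal sketch. This is not the method under development in the group; it is a naive register count, assuming a phase interval of 21 nucleotides (the interval length is not stated above) and a hypothetical list of small RNA start positions on a single mRNA. Function names are illustrative only.

```python
from collections import Counter


def phase_register_counts(start_positions, phase_length=21):
    """Count how many small RNA start positions fall in each phase register.

    The register of a position is its remainder modulo phase_length, so
    positions spaced at exact multiples of phase_length share a register.
    """
    return Counter(pos % phase_length for pos in start_positions)


def best_register(start_positions, phase_length=21):
    """Return (register, fraction of starts in that register).

    Ties are broken in favour of the lowest register number.
    """
    if not start_positions:
        raise ValueError("no start positions supplied")
    counts = phase_register_counts(start_positions, phase_length)
    register, hits = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    return register, hits / len(start_positions)


# Hypothetical start positions: five reads spaced 21 nt apart, plus two others.
starts = [100, 121, 142, 163, 184, 110, 155]
register, fraction = best_register(starts)
print(f"register {register} holds {fraction:.0%} of start positions")
```

A high fraction in one register suggests phasing, but a real method would also need to account for read abundance, strand and background small RNA density, none of which this sketch attempts.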

High Throughput Sequencing Analysis Pipeline

A short read analysis pipeline has been developed for small RNA, transcriptome, ChIP and genome data obtained by high throughput sequencing. A web interface allows users to record their samples in a database, provide information about the samples and request a sequencing run. Users can track the progress of their samples at the sequencing centre. Once a run is complete, the sequencing results are retrieved automatically from the sequencing centre and processed by de-multiplexing bar-coded samples, stripping adapters or trimming (as appropriate) and aligning against one or more genomes. Each step of the process creates output files which can be used for further analyses. The web interface allows users to view the results of the pipeline processing, including summary statistics and graphs which can be used for quality assessment. The pipeline has been made available to members of the SIROCCO consortium.
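The de-multiplexing and adapter-stripping steps can be sketched as follows. This is an illustration only, not the group's pipeline: it assumes the bar code sits at the 5' end of each read and must match exactly, the bar codes, adapter sequence and sample names are hypothetical, and the alignment step is omitted.

```python
def demultiplex(reads, barcodes):
    """Assign each read to a sample by its 5' bar code and strip the bar code.

    barcodes maps bar code sequence -> sample name. Reads that match no
    bar code are returned separately, unchanged.
    """
    by_sample = {sample: [] for sample in barcodes.values()}
    unassigned = []
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
        else:
            unassigned.append(read)
    return by_sample, unassigned


def strip_adapter(read, adapter, min_match=6):
    """Cut the read at the first occurrence of the adapter's first min_match bases.

    Reads with no such occurrence are returned unchanged.
    """
    index = read.find(adapter[:min_match])
    return read if index == -1 else read[:index]


# Hypothetical bar codes, adapter and reads, for illustration only.
BARCODES = {"ACGT": "sample_A", "TGCA": "sample_B"}
ADAPTER = "TCGTATGCCG"
reads = [
    "ACGT" + "TTGACAGAAGATAGAGAGCAC" + "TCGTATGCCG",
    "TGCA" + "GGAATGTTGTCTGGCTCGAGG" + "TCGTAT",
    "NNNNAAAA",
]

by_sample, unassigned = demultiplex(reads, BARCODES)
trimmed = {
    sample: [strip_adapter(read, ADAPTER) for read in sample_reads]
    for sample, sample_reads in by_sample.items()
}
for sample, sample_reads in trimmed.items():
    print(sample, sample_reads)
print("unassigned:", unassigned)
```

Keeping each step as a separate function mirrors the description above, in which every stage produces output that can be reused for further analyses.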

Other Projects

The group has also contributed to the development of the UEA small RNA toolkit, supported by a BBSRC grant to Professor Baulcombe, Dr Vince Moulton and Dr Tamas Dalmay.


Members of the Bioinformatics group have provided resources to the SIROCCO Community. Dr Dunn has liaised with the Consortium members and also worked with the SIROCCO Project Manager to organise two Bioinformatics Training Sessions. Rishi Nag, funded by SIROCCO, has developed and maintained a website for the Project. He has also developed and released a software tool for aggregating the results of multiple microRNA prediction tools and a tool for converting between data formats.

Cambridge Next Generation Sequencing Bioinformatics Days

In 2009 and 2010, Dr Kelly organised two Cambridge Next Generation Sequencing Bioinformatics Days. The purpose of these days was to bring together bioinformaticians and others working with high throughput sequencing data to discuss common problems and solutions in a series of interactive workshops. Both days were attended by over 180 people from the Cambridge area, Norwich and London, and were very well received. The next one will be held on 28th March 2011.