As the volume of sequence data grows exponentially and the number of completed genomes rises, processing and analyzing these data becomes increasingly critical. To bridge the gap between the speed of data generation and that of interpretation, we develop efficient tools for large-scale analysis of biomolecular data. Some of these tools use efficient suffix-array-based index structures; others rely on distributed, parallel processing of the data.
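
To illustrate what a suffix-array-based index offers, here is a minimal sketch in Python. It is not code from any of the tools listed on this page: the function names `build_suffix_array` and `find_occurrences` are hypothetical, and the naive sort-based construction is chosen for readability, whereas production tools would use far more efficient construction algorithms.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in lexicographic order.

    Naive construction for illustration only: sorting by the suffix strings
    themselves is simple but too slow and memory-hungry for genome-scale input.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return all start positions of `pattern` in `text`, using binary search on `sa`."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with the pattern.
    return sorted(sa[start:lo])
```

Once the index is built, each query needs only a logarithmic number of string comparisons in the sequence length, which is why such indexes suit repeated searches over large sequence collections.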

current projects

EST Clustering 
Suffix-array-based clustering of EST and cDNA data

A system for interactive high throughput sequence analysis and (differential) comparative genomics.

past projects

Large Scale Repeat Analysis requires extensive algorithmic support. The program GenAlyzer was designed as a fundamental tool for studying repetitive DNA on both the genomic and the inter-genomic scale.

Hypa - A system for the declarative description and efficient search of hybrid patterns in large genomic data sets.

