EST Clustering

Clustering EST (expressed sequence tag) sequences is a widely used method for analyzing the transcriptome of an organism. EST data are an especially valuable source of information for organisms whose genome has not (yet) been sequenced. We develop gene prediction methods based on EST data, focusing on model organisms with no genome sequence available.
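To make the idea of EST clustering concrete, here is a minimal sketch in Python. It is not the method used in our projects. It only illustrates one common simplification: two ESTs are placed in the same cluster when they share enough exact k-mers, and clusters are merged transitively (single linkage) with a union-find structure. The function names, the EST ids, and the parameters `k` and `min_shared` are hypothetical, and the sketch ignores reverse-complement matches and sequencing errors, which a real clustering tool has to handle.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_ests(ests, k=8, min_shared=3):
    """Group ESTs that share at least min_shared exact k-mers.

    ests: dict mapping an (illustrative) EST id to its sequence.
    Returns a sorted list of clusters, each a sorted list of EST ids.
    Assumption: same-strand matches only; no reverse complements.
    """
    parent = {name: name for name in ests}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Inverted index: k-mer -> ids of all ESTs that contain it.
    index = defaultdict(set)
    for name, seq in ests.items():
        for kmer in kmers(seq.upper(), k):
            index[kmer].add(name)

    # Count the number of distinct k-mers shared by each pair of ESTs.
    shared = defaultdict(int)
    for names in index.values():
        ordered = sorted(names)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                shared[(a, b)] += 1

    # Merge every pair that passes the threshold (single linkage).
    for (a, b), count in shared.items():
        if count >= min_shared:
            union(a, b)

    clusters = defaultdict(list)
    for name in ests:
        clusters[find(name)].append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy input: est1 and est2 overlap by 18 bases, est3 is unrelated.
example = {
    "est1": "ACGTACGGTCAGTCAAGGCTTAGC",
    "est2": "GGTCAGTCAAGGCTTAGCTTGACA",
    "est3": "TTTTTTTTGGGGGGGGCCCCCCCC",
}
result = cluster_ests(example)  # [['est1', 'est2'], ['est3']]
```

Counting shared k-mers through an inverted index avoids aligning every pair of sequences, which matters because EST collections are large. The pairwise counting step can still grow quickly for k-mers that occur in many ESTs, so a practical tool would typically mask or skip such repetitive k-mers.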
We have developed a database system that incorporates a variety of precomputed analyses (see Genlight project) and provides a queryable sequence analysis system. By bringing together tools that are normally available only through separate public domain sequence analysis systems, it lets the experimental biologist build complex queries to identify and characterize genes of interest, either individually or in batch mode.
To compare EST data with available genomic sequences, we have developed the e2g tool, which allows the rapid mapping of ESTs to a given genomic sequence.



