All talks will be held in lecture hall H6 at the western end of the main hall.
Each talk is allotted 20 minutes plus 5 minutes for discussion.
Wednesday, 2007-09-05 - from 10:30 to 12:15
Characterization of Genetic Signal Sequences with Batch-Learning SOM
|Presenting Author: Takashi Abe|
|Authors: Takashi Abe, Shun Ikeda, Shigehiko Kanaya, Kennosuke Wada, and Toshimichi Ikemura|
Kohonen's SOM, an unsupervised clustering algorithm, is an effective tool for clustering and visualizing high-dimensional complex data on a single map. We previously modified the conventional SOM for genome informatics on the basis of Batch Learning SOM (BL-SOM), making the learning process and the resulting map independent of the order of data input. We generated BL-SOMs for tetra- and pentanucleotide frequencies in 300,000 10-kb sequences from 13 eukaryotes for which almost complete genomic sequences are available. BL-SOM recognized species-specific characteristics of oligonucleotide frequencies in most 10-kb sequences, permitting species-specific classification of sequences without any information regarding the species. We next constructed BL-SOMs with tetra- and pentanucleotide frequencies in 37,086 full-length mouse cDNA sequences. With BL-SOM we also analyzed occurrence patterns of oligonucleotides that are thought to be involved in transcriptional regulation in the human genome.
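The input to such a map is one oligonucleotide frequency vector per sequence fragment. The following is a minimal sketch of how a tetranucleotide frequency vector could be computed; the function name `oligo_frequencies`, the overlapping-window counting, and the decision to skip windows containing ambiguous bases are assumptions made for illustration, not details taken from the abstract.

```python
from itertools import product


def oligo_frequencies(seq, k=4):
    """Return relative frequencies of all 4**k k-mers over A, C, G, T.

    Hypothetical helper: counts overlapping windows and ignores any
    window that contains a symbol other than A, C, G or T.
    """
    seq = seq.upper()
    # Fixed lexicographic k-mer order, so every sequence maps to the same axes.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skips windows with N or other symbols
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


# Toy fragment; the abstract uses 10-kb fragments instead.
vec = oligo_frequencies("ACGTACGTAC", k=4)
```

With `k=4` the vector has 256 components, and with `k=5` it has 1024; each fragment then becomes one training vector for the map.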
Advanced metric adaptation in Generalized LVQ for classification of mass spectrometry data
|Presenting Author: Petra Schneider|
|Authors: Petra Schneider, Michael Biehl, Frank-Michael Schleif, Barbara Hammer|
Metric adaptation constitutes a powerful approach to improve the performance of prototype-based classification schemes. We apply extensions of Generalized LVQ based on different adaptive distance measures in the domain of clinical proteomics. The Euclidean distance in GLVQ is extended by adaptive relevance vectors and matrices of global or local influence, where training follows a stochastic gradient descent on an appropriate error function. We compare the performance of the resulting learning algorithms for the classification of high-dimensional mass spectrometry data from cancer research. High prediction accuracies can be obtained by adapting full matrices of relevance factors in the distance measure in order to adjust the metric to the underlying data structure. The easy interpretability of the resulting models after training of relevance vectors makes it possible to identify discriminative features in the original spectra.
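The two kinds of adaptive distance mentioned here can be sketched as follows. This assumes the forms commonly used in the relevance-LVQ literature: a relevance vector that weights each squared coordinate difference, and a matrix `omega` that projects the difference before taking its squared norm. The abstract does not spell these out, so the function names and exact parameterization are illustrative assumptions; the update rules for the stochastic gradient descent are not shown.

```python
def relevance_distance(x, w, lam):
    """Squared Euclidean distance with one relevance factor per feature."""
    return sum(l * (xi - wi) ** 2 for l, xi, wi in zip(lam, x, w))


def matrix_distance(x, w, omega):
    """Squared norm of omega @ (x - w), i.e. (x-w)^T omega^T omega (x-w)."""
    diff = [xi - wi for xi, wi in zip(x, w)]
    projected = [sum(o * d for o, d in zip(row, diff)) for row in omega]
    return sum(p * p for p in projected)


def glvq_mu(d_correct, d_wrong):
    """Relative distance difference used in the GLVQ cost (assumed form).

    Negative when the closest correct prototype is nearer than the
    closest wrong one, i.e. when the sample is classified correctly.
    """
    return (d_correct - d_wrong) / (d_correct + d_wrong)


x = [1.0, 2.0]          # toy sample
w = [0.0, 0.0]          # toy prototype
d_vec = relevance_distance(x, w, lam=[0.5, 0.5])
d_mat = matrix_distance(x, w, omega=[[1.0, 0.0], [0.0, 1.0]])
mu = glvq_mu(1.0, 3.0)
```

With an identity `omega` the matrix distance reduces to the plain squared Euclidean distance, which makes it a convenient starting point for training.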
Genome feature exploration using hyperbolic Self-Organising Maps
|Presenting Author: Christian Martin|
|Authors: Christian Martin, Naryttza N. Diaz, Jörg Ontrup and Tim W. Nattkemper|
The advent of sequencing technologies makes it possible to reassess the relationship between species in the hierarchically organized tree of life. Self-Organizing Maps (SOM) in Euclidean and hyperbolic space are applied to genomic signatures of 350 different organisms of the two superkingdoms Bacteria and Archaea to link the sequence signature space to pre-defined taxonomic levels, i.e. the tree of life. In the hyperbolic space the SOMs are trained either by the standard algorithm (HSOM) or in a hierarchical manner (H²SOM). To evaluate the SOM performances, distances between organisms in the feature space, on the SOM grid and in the taxonomy tree are compared pair-wise. We show that the structure recovered using the different SOMs reflects the gold standard of current taxonomy. The distances between species are better preserved when using the HSOM or H²SOM, which makes the hyperbolic space better suited for embedding the high-dimensional genomic signatures.
SOM-based Peptide Prototyping for Mass Spectrometry Peak Intensity Prediction
|Presenting Author: Alexandra Scherbart|
|Authors: Alexandra Scherbart, Wiebke Timm, Tim W. Nattkemper, Sebastian Böcker|
In today's bioinformatics, mass spectrometry (MS) is the key technique for the identification of proteins. A prediction of spectrum peak intensities from precomputed molecular features would pave the way to a better understanding of spectrometry data and improved spectrum evaluation. We propose a neural network architecture of Local Linear Map (LLM) type based on Self-Organizing Maps (SOMs) for peptide prototyping and learning locally tuned regression functions for peak intensity prediction in MALDI-TOF mass spectra. We obtain results comparable to those obtained by nu-Support Vector Regression and show how the SOM learning architecture provides a basis for peptide feature profiling and visualisation.
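The prediction step of a Local Linear Map can be sketched as below, assuming the usual formulation in which each prototype carries its own linear model and the model of the best-matching prototype is evaluated. The names `llm_predict`, `offsets` and `slopes`, and the toy numbers, are hypothetical; the training procedure and the peptide features used by the authors are not shown.

```python
def llm_predict(x, prototypes, offsets, slopes):
    """Predict a scalar with the linear model of the nearest prototype.

    Assumed form: y = offsets[k] + slopes[k] . (x - prototypes[k]),
    where k indexes the prototype closest to x.
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Best-matching prototype for the input feature vector.
    k = min(range(len(prototypes)), key=lambda i: sq_dist(x, prototypes[i]))
    diff = [xi - wi for xi, wi in zip(x, prototypes[k])]
    return offsets[k] + sum(a * d for a, d in zip(slopes[k], diff))


# Two toy prototypes in a 2-D feature space.
prototypes = [[0.0, 0.0], [10.0, 10.0]]
offsets = [1.0, 5.0]
slopes = [[0.5, 0.5], [1.0, 0.0]]

y_near_first = llm_predict([1.0, 1.0], prototypes, offsets, slopes)
y_near_second = llm_predict([11.0, 10.0], prototypes, offsets, slopes)
```

Because each prototype owns a separate regression function, the set of prototypes doubles as a compact summary of the input space, which is what makes prototype-based profiling and visualisation possible.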