[1] B. Hammer, D. Hofmann, F.-M. Schleif, and X. Zhu. Learning vector quantization for (dis-)similarities. Neurocomputing, 131:43-51, 2014. [ bib | .pdf ]
[2] D. Hofmann, F.-M. Schleif, and B. Hammer. Learning interpretable kernelized prototype-based models. Neurocomputing, 131:43-51, 2014. [ bib | .pdf ]
[3] M. Strickert, K. Bunte, F.-M. Schleif, and E. Hüllermeier. Correlation-based neighbor embedding. Neurocomputing, online, 2014. [ bib | .pdf ]
[4] F.-M. Schleif, X. Zhu, and B. Hammer. Sparse conformal prediction for dissimilarity data. Annals of Mathematics and Artificial Intelligence, to appear, 2014. [ bib | .pdf ]
[5] F.-M. Schleif, P. Tino, and T. Villmann. Recent trends in learning of structured and non-standard data. In Proceedings of ESANN 2014, pages 243-252, 2014. [ bib | .pdf ]
[6] F.-M. Schleif. Proximity learning for non-standard big data. In Proceedings of ESANN 2014, pages 359-364, 2014. [ bib | .pdf ]
[7] M. J. Embrechts, F. Rossi, F.-M. Schleif, and J. A. Lee. Advances in artificial neural networks, machine learning, and computational intelligence (ESANN 2013). Neurocomputing, to appear, 2014. [ bib | .pdf ]
[8] F.-M. Schleif. Discriminative fast soft competitive learning. In Proceedings of ICANN 2014, to appear, 2014. [ bib | .pdf ]
[9] Xibin Zhu, Frank-Michael Schleif, and Barbara Hammer. Adaptive conformal semi-supervised vector quantization for dissimilarity data. Pattern Recognition Letters, 49:138-145, 2014. [ bib | DOI | .pdf ]
[10] X. Zhu, F.-M. Schleif, and B. Hammer. Semi-supervised vector quantization for proximity data. In Proceedings of ESANN 2013, pages 89-94, 2013. [ bib | .pdf ]
Semi-supervised learning (SSL) is focused on learning from labeled and unlabeled data by incorporating structural and statistical information of the available unlabeled data. The amount of data is dramatically increasing, but only a small fraction of it is fully labeled, due to cost and time constraints. This is even more challenging for non-vectorial proximity data, given by pairwise proximity values. Only few methods provide SSL for such data, and they are limited to positive semi-definite (psd) data. They also lack interpretable models, which is a relevant aspect in the life sciences, where most of these data are found. This paper provides a prototype-based SSL approach for proximity data.

[11] F.-M. Schleif and A. Gisbrecht. Data analysis of (non-)metric proximities at linear costs. In Proceedings of SIMBAD 2013, pages 59-74, 2013. [ bib | .pdf ]
Domain-specific (dis-)similarity or proximity measures, employed e.g. in alignment algorithms in bioinformatics, are often used to compare complex data objects and to cover domain-specific data properties. Lacking an underlying vector space, data are given as pairwise (dis-)similarities. The few available methods for such data do not scale well to very large data sets. Kernel methods easily deal with metric similarity matrices, also at large scale, but costly transformations are needed when starting from non-metric (dis-)similarities. We propose an integrative combination of Nyström approximation, potential double centering, and eigenvalue correction to obtain valid kernel matrices at linear costs. Accordingly, effective kernel approaches become accessible for these data. Evaluation on several larger (dis-)similarity data sets shows that the proposed method achieves much better runtime performance than the standard strategy while keeping competitive model accuracy. Our main contribution is an efficient linear technique to convert (potentially non-metric) large scale dissimilarity matrices into approximated psd kernel matrices.
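As a rough illustration of the idea sketched in this abstract, here is a minimal Nyström construction with eigenvalue clipping on the landmark block; the paper's full pipeline (double centering of dissimilarities, the particular eigenvalue correction) is more involved, and all names and parameters below are illustrative assumptions:

```python
import numpy as np

def nystroem_psd_kernel(S, m, rng=None):
    """Approximate a symmetric (possibly non-psd) similarity matrix S
    by a psd kernel at linear cost: sample m landmark columns,
    eigen-correct the small landmark block, and return a low-rank
    factor L with K ~= L @ L.T."""
    rng = np.random.default_rng(rng)
    n = S.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    C = S[:, idx]                        # n x m column sample
    W = C[idx, :]                        # m x m landmark block
    vals, vecs = np.linalg.eigh((W + W.T) / 2.0)
    keep = vals > 1e-10                  # clip negative/zero eigenvalues -> psd
    L = C @ vecs[:, keep] / np.sqrt(vals[keep])
    return L                             # K ~= C W^+ C^T = L @ L.T, psd by construction
```

Only the m sampled columns of S are ever touched, which is where the linear cost in n comes from.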

[12] X. Zhu, F.-M. Schleif, and B. Hammer. Secure semi-supervised vector quantization for dissimilarity data. In Proceedings of IWANN 2013, pages 347-356, 2013. [ bib | .pdf ]
The amount and complexity of data increase rapidly; however, due to time and cost, only few of them are fully labeled. In this context, non-vectorial relational data given by pairwise (dis-)similarities without explicit vectorial representation, like score values in sequence alignments, are particularly challenging. Existing semi-supervised learning (SSL) algorithms focus on vectorial data given in Euclidean space. In this paper we extend a prototype-based classifier for dissimilarity data to non-i.i.d. semi-supervised tasks. Using conformal prediction, the 'secure region' of unlabeled data can be used to improve the trained model based on labeled data, while adapting the model complexity to cover the 'insecure region' of labeled data. The proposed method is evaluated on some benchmarks from the SSL domain.

[13] A. Micheli, F.-M. Schleif, and P. Tino. Novel approaches in machine learning and computational intelligence. Neurocomputing, 112:1-3, 2013. [ bib ]
[14] F.-M. Schleif, X. Zhu, and B. Hammer. Sparse prototype representation by core sets. In Proceedings of IDEAL 2013, pages 302-309, 2013. [ bib | .pdf ]
Due to the increasing amount of large data sets, efficient learning algorithms are necessary. Interpretability of the final model is also desirable, in order to draw efficient conclusions from the model results. Prototype based learning algorithms have recently been extended to proximity learners to analyze data given in non-standard data formats. The supervised methods of this type are of special interest but suffer from a large number of optimization parameters to model the prototypes. In this contribution we derive an efficient core-set based preprocessing to restrict the number of model parameters to O(n/ε²), with n the number of prototypes. Accordingly, the number of model parameters becomes independent of the size of the data set but scales with the requested precision of the core sets. Experimental results show that our approach does not significantly degrade the performance while significantly reducing the memory complexity.
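The paper's core-set construction for prototype models is specific to its cost function; as a generic illustration of the core-set principle with the same O(1/ε²) flavor, here is the classic Badoiu-Clarkson sketch for the minimal enclosing ball (all names illustrative, not the paper's algorithm):

```python
import numpy as np

def meb_coreset(X, eps=0.05):
    """Approximate minimal-enclosing-ball center plus core-set indices:
    run ceil(1/eps^2) iterations, each time adding the point furthest
    from the current center and stepping towards it with rate 1/(t+1)."""
    c = X[0].astype(float).copy()
    core = {0}
    for t in range(1, int(np.ceil(1.0 / eps**2)) + 1):
        far = int(np.argmax(((X - c) ** 2).sum(axis=1)))  # furthest point
        core.add(far)
        c += (X[far] - c) / (t + 1.0)                     # shrinking step
    return c, sorted(core)
```

The core set has at most 1/ε² + 1 members regardless of the data set size, which mirrors the parameter bound quoted in the abstract.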

[15] F.-M. Schleif. Large scale Nyström approximation for non-metric similarity and dissimilarity data. Technical Report MLR-03-2013, Machine Learning Reports, ISSN:1865-3960 http://www.uni-leipzig.de/ compint/mlr/mlr_03_2013.pdf, 2013. [ bib ]
[16] F.-M. Schleif and T. Villmann. Analysis of temporal kinect motion capturing data. Technical Report MLR-05-2013, Machine Learning Reports, ISSN:1865-3960 http://www.uni-leipzig.de/ compint/mlr/mlr_05_2013.pdf, 2013. [ bib ]
[18] K. Bunte, P. Schneider, B. Hammer, F.-M. Schleif, T. Villmann, and M. Biehl. Limited rank matrix learning, discriminative dimension reduction and visualization. Neural Networks, 26:159-173, 2012. [ bib | .pdf ]
We present an extension of the recently introduced Generalized Matrix Learning Vector Quantization algorithm. In the original scheme, adaptive square matrices of relevance factors parameterize a discriminative distance measure. We extend the scheme to matrices of limited rank corresponding to low-dimensional representations of the data. This allows us to incorporate prior knowledge of the intrinsic dimension and to reduce the number of adaptive parameters efficiently. In particular, for very high dimensional data, the limitation of the rank can reduce computation time and memory requirements significantly. Furthermore, two- or three-dimensional representations constitute an efficient visualization method for labeled data sets. The identification of a suitable projection is not treated as a pre-processing step but as an integral part of the supervised training. Several real world data sets serve as an illustration and demonstrate the usefulness of the suggested method.
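The limited-rank relevance distance at the heart of this scheme can be written down in a few lines (a sketch with illustrative names, not the full GMLVQ training loop):

```python
import numpy as np

def relevance_distance(x, w, omega):
    """Squared adaptive distance d(x, w) = (x-w)^T Omega^T Omega (x-w);
    a rectangular Omega (e.g. 2 x D) limits the rank and thereby the
    number of adaptive parameters."""
    diff = omega @ (x - w)          # project the difference vector
    return float(diff @ diff)       # squared norm in the projected space

def project(X, omega):
    """The same Omega doubles as a discriminative low-dim projection."""
    return X @ omega.T
```

With a 2 x D Omega, `project` yields exactly the two-dimensional discriminative visualization mentioned in the abstract.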

[19] B. Hammer, B. Mokbel, F.-M. Schleif, and X. Zhu. White box classification of dissimilarity data. In Emilio Corchado, Václav Snášel, Ajith Abraham, Michal Wozniak, Manuel Graña, and Sung-Bae Cho, editors, Hybrid Artificial Intelligent Systems, volume 7208 of Lecture Notes in Computer Science, pages 309-321. Springer Berlin / Heidelberg, 2012. [ bib | .pdf ]
[20] X. Zhu, A. Gisbrecht, F.-M. Schleif, and B. Hammer. Approximation techniques for clustering dissimilarity data. Neurocomputing, 90:72-84, 2012. [ bib | .pdf ]
Recently, diverse high quality prototype-based clustering techniques have been developed which can directly deal with data sets given by general pairwise dissimilarities rather than standard Euclidean vectors. Examples include affinity propagation, relational neural gas, or relational generative topographic mapping. Corresponding to the size of the dissimilarity matrix, these techniques scale quadratically with the size of the training set, such that training becomes prohibitive for large data volumes. In this contribution, we investigate two different linear time approximation techniques, patch processing and the Nyström approximation. We apply these approximations to several representative clustering techniques for dissimilarities, where possible, and compare the results for diverse data sets.

[21] K. Bunte, F.-M. Schleif, and M. Biehl. Adaptive learning for complex-valued data. In Proceedings of ESANN 2012, pages 387-392, 2012. [ bib | .pdf ]
In this paper we propose a variant of the Generalized Matrix Learning Vector Quantization (GMLVQ) for dissimilarity learning on complex-valued data. Complex features can be encountered in various data domains, e.g. Fourier transformed mass spectrometry or image analysis data. Current approaches deal with complex inputs by ignoring the imaginary parts or concatenating real and imaginary parts in one real valued vector. In this contribution we propose a prototype based classification method which can deal with complex-valued data directly. The algorithm is tested on a benchmark data set and for leaf recognition using Zernike moments. We observe that the complex version converges much faster than the original GMLVQ evaluated on the real parts only. The complex version has fewer free parameters than using a concatenated vector and is thus computationally more efficient than original GMLVQ.

[22] F.-M. Schleif, A. Gisbrecht, and B. Hammer. Relevance learning for short high-dimensional time series in the life sciences. In Proceedings of IJCNN 2012, pages 2069-2076, 2012. [ bib | .pdf ]
Digital data characterizing physiological processes over time, such as spectrometric data or gene expression profiles, are becoming increasingly important. Typical characteristics of such data are high dimensionality, due to a fine grained measurement, but usually only few time points of the series. Due to the short length, classical time series models cannot be used. At the same time, due to the high dimensionality, data cannot be treated by means of time windows using simple vectorial techniques. Here, we consider the generative topographic mapping through time (GTM-TT) as a highly regularized model for time series inspection in the unsupervised setting, based on hidden Markov models enhanced with topographic mapping facilities. We extend the model such that supervised classification can be built on top of GTM-TT, resulting in supervised GTM-TT, and we extend the technique by supervised relevance learning. The latter adapts the metric according to given auxiliary information, resulting in an interpretable form which can deal with high dimensional inputs. We demonstrate the technique on simulated data as well as an example from the biomedical domain, reaching state of the art classification accuracy in both cases.

[23] M. Biehl, K. Bunte, F.-M. Schleif, P. Schneider, and T. Villmann. Large margin linear discriminative visualization by matrix relevance learning. In Proceedings of IJCNN 2012, pages 1873-1880, 2012. [ bib | .pdf ]
We suggest and investigate the use of Generalized Matrix Relevance Learning (GMLVQ) in the context of discriminative visualization. This prototype-based, supervised learning scheme parameterizes an adaptive distance measure in terms of a matrix of relevance factors. By means of a few benchmark problems, we demonstrate that the training process yields low rank matrices which can be used efficiently for the discriminative visualization of labeled data. Comparison with well known standard methods illustrates the flexibility and discriminative power of the novel approach. The mathematical analysis of GMLVQ shows that the corresponding stationarity condition can be formulated as an eigenvalue problem with one or several strongly dominating eigenvectors. We also study the inclusion of a penalty term which enforces non-singularity of the relevance matrix and can be used to control the role of higher order eigenvalues efficiently.

[24] X. Zhu, F.-M. Schleif, and B. Hammer. Patch processing for relational learning vector quantization. In Jun Wang, Gary G. Yen, and Marios M. Polycarpou, editors, Advances in Neural Networks - ISNN 2012, volume 7367 of Lecture Notes in Computer Science, pages 55-63. Springer, 2012. [ bib | .pdf ]
Recently, an extension of popular learning vector quantization (LVQ) to general dissimilarity data has been proposed, relational generalized LVQ (RGLVQ) [10, 9]. An intuitive prototype based classification scheme results which can divide data characterized by pairwise dissimilarities into priorly given categories. However, the technique relies on the full dissimilarity matrix and, thus, has squared time complexity and linear space complexity. In this contribution, we propose an intuitive linear time and constant space approximation of RGLVQ by means of patch processing. The result is an efficient heuristic which maintains the good classification accuracy and interpretability of RGLVQ, as demonstrated in three examples from the biomedical domain.

[25] F.-M. Schleif, X. Zhu, and B. Hammer. A conformal classifier for dissimilarity data. In Proceedings of AIAI 2012, pages 234-243, 2012. [ bib | .pdf ]
Current classification algorithms focus on vectorial data, given in Euclidean or kernel spaces. Many real world data, like biological sequences, are not vectorial and often non-Euclidean, given by (dis-)similarities only, calling for efficient and interpretable models. Current classifiers for such data require complex transformations and provide only crisp classification without any measure of confidence, which is a standard requirement in the life sciences. In this paper we propose a prototype-based conformal classifier which deals effectively with dissimilarity data. The model complexity is automatically adjusted and confidence measures are provided. In experiments on dissimilarity data we investigate the effectiveness with respect to accuracy and model complexity in comparison to different state of the art classifiers.
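The conformal-prediction step that supplies the confidence measure can be sketched generically; the prototype-based nonconformity score itself is paper-specific and omitted here, so treat this as a minimal illustration only:

```python
import numpy as np

def conformal_p_value(calibration_scores, new_score):
    """p = (#{alpha_i >= alpha_new} + 1) / (n + 1): the fraction of
    calibration examples at least as nonconforming as the new one.
    A small p means the tentative label fits the data badly."""
    a = np.asarray(calibration_scores, dtype=float)
    return (np.sum(a >= new_score) + 1.0) / (a.size + 1.0)
```

Computing one such p-value per candidate label turns any crisp classifier with a nonconformity score into one that reports confidence.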

[26] F.-M. Schleif, X. Zhu, A. Gisbrecht, and B. Hammer. Fast approximated relational and kernel clustering. In Proceedings of ICPR 2012, pages 1229 - 1232. IEEE, 2012. [ bib | .pdf ]
The large amount of digital data calls for scalable tools like efficient clustering algorithms. Many algorithms for large data sets require linear separability in a Euclidean space. Kernel approaches can capture the non-linear structure but do not scale well for large data sets. Alternatively, data are often represented implicitly by dissimilarities, as for protein sequences, and the corresponding methods often do not scale to large problems either. We propose a single algorithm for both types of data, based on a batch approximation of relational soft competitive learning, termed fast generic soft-competitive learning. The algorithm has linear computational and memory requirements and performs favorably compared to traditional techniques.

[27] F.-M. Schleif, X. Zhu, and B. Hammer. Soft competitive learning for large data sets. In Proceedings of MCSD 2012, pages 141-151, 2012. [ bib | .pdf ]
Soft competitive learning is an advanced k-means-like clustering approach overcoming some severe drawbacks of k-means, like initialization dependence and sticking to local minima. It achieves lower distortion error than k-means and has shown very good performance in the clustering of complex data sets, using various metrics or kernels. While very effective, it does not scale for large data sets, which is even more severe in case of kernels, due to a dense prototype model. In this paper, we propose a novel soft-competitive learning algorithm using core sets, significantly accelerating the original method in practice with natural sparsity. It effectively deals with very large data sets up to multiple million points. Our method also provides an alternative fast kernelization of soft-competitive learning. In contrast to many other clustering methods the obtained model is based on only few prototypes and shows natural sparsity. It is the first natural sparse kernelized soft competitive learning approach. Numerical experiments on synthetic and benchmark data sets show the efficiency of the proposed method.

[28] F.-M. Schleif, B. Mokbel, A. Gisbrecht, L. Theunissen, V. Dürr, and B. Hammer. Learning relevant time points for time-series data in the life sciences. In Proceedings of ICANN 2012, pages 531-539, 2012. [ bib | .pdf ]
In the life sciences, short time series with high dimensional entries are becoming more and more popular such as spectrometric data or gene expression profiles taken over time. Data characteristics rule out classical time series analysis due to the few time points, and they prevent a simple vectorial treatment due to the high dimensionality. In this contribution, we successfully use the generative topographic mapping through time (GTM-TT) which is based on hidden Markov models enhanced with a topographic mapping to model such data. We propose an extension of GTM-TT by relevance learning which automatically adapts the model such that the most relevant input variables and time points are emphasized by means of an automatic relevance weighting scheme. We demonstrate the technique in two applications from the life sciences.

[29] A. Gisbrecht, B. Mokbel, F.-M. Schleif, X. Zhu, and B. Hammer. Linear time relational prototype based learning. International Journal of Neural Systems, 22(5):online, 2012. [ bib | .pdf ]
Prototype based learning offers an intuitive interface to inspect large quantities of electronic data in supervised or unsupervised settings. Recently, many techniques have been extended to data described by general dissimilarities rather than Euclidean vectors, so-called relational data settings. Unlike the Euclidean counterparts, the techniques have quadratic time complexity due to the underlying quadratic dissimilarity matrix. Thus, they are infeasible already for medium sized data sets. The contribution of this article is twofold: on the one hand we propose a novel supervised prototype based classification technique for dissimilarity data based on popular learning vector quantization, on the other hand we transfer a linear time approximation technique, the Nyström approximation, to this algorithm and an unsupervised counterpart, the relational generative topographic mapping. This way, linear time and space methods result. We evaluate the techniques on three examples from the biomedical domain.

[30] F.-M. Schleif and A. Gisbrecht. Data analysis of (non-)metric (dis-)similarities at linear costs. Technical Report MLR-04-2012, Machine Learning Reports, ISSN:1865-3960 http://www.uni-leipzig.de/ compint/mlr/mlr_04_2012.pdf, 2012. [ bib ]
[31] F.-M. Schleif, S. Simmuteit, and T. Villmann. Hierarchical deconvolution of linear mixtures of high-dimensional mass spectra in micro-biology. In Proceedings of AIA 2011, CD publication, 2011. [ bib | DOI | .pdf ]
This paper introduces a hierarchical model for the description and deconvolution of composite patterns. The patterns are described in a basis system of spectral basis functions. The mixture coefficients for the composite patterns are determined by solving a linear mixture model with nonnegative coefficients. In life science research, wet-lab mixed samples of possibly known basis substances occur regularly and cause a challenge for identification tasks. Also in case of known basis functions the problem is still complex, if the used basis is very sparse and the number of basis functions is very large. Simple approaches either try combining different basis spectra or incorporate blind source separation. Our proposed method is to use nonnegative least squares combined with a hierarchical prototype based learning model. We evaluate our method on mixtures of real and simulated composite patterns of mass spectrometry data from bacteria. Results show remarkable success and can be taken as a promising step in the new field of automatic unmixing of mixed cultures.
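The nonnegative least squares step at the core of the unmixing can be sketched with a tiny projected-gradient solver; this is an illustration only (the paper combines NNLS with a hierarchical prototype model, and the matrix names here are assumptions):

```python
import numpy as np

def nnls_pg(A, b, iters=5000):
    """Tiny projected-gradient solver for min ||A c - b||^2 s.t. c >= 0.
    Columns of A play the role of known basis spectra, b the measured
    mixture, c the nonnegative mixture coefficients."""
    lr = 1.0 / np.linalg.norm(A.T @ A, 2)     # step size from Lipschitz bound
    c = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ c - b)
        c = np.maximum(c - lr * grad, 0.0)    # gradient step, then project onto c >= 0
    return c
```

In practice a dedicated active-set NNLS routine (e.g. the Lawson-Hanson algorithm) would replace this loop; the sketch only shows the optimization problem being solved.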

[32] A. Gisbrecht, B. Hammer, F.-M. Schleif, and X. Zhu. Accelerating kernel clustering for biomedical data analysis. In Proceedings of CIBCB 2011, pages 154-161, 2011. [ bib | .pdf ]
The increasing size and complexity of modern data sets turns modern data mining techniques into indispensable tools when inspecting biomedical data sets. Thereby, dedicated data formats and detailed information often cause the need for problem specific similarities or dissimilarities instead of the standard Euclidean norm. Therefore, a number of clustering techniques which rely on similarities or dissimilarities only have recently been proposed. In this contribution, we review some of the most popular dissimilarity based clustering techniques, and we discuss how to get around the usually squared complexity of the models due to their dependency on the full dissimilarity matrix. We evaluate the techniques on two benchmarks from the biomedical domain.

[33] F.-M. Schleif, A. Gisbrecht, and B. Hammer. Accelerating kernel neural gas. In Timo Honkela, Wlodzislaw Duch, Mark Girolami, and Samuel Kaski, editors, Artificial Neural Networks and Machine Learning - ICANN 2011, volume 6791 of Lecture Notes in Computer Science, pages 150-158. Springer, 2011. [ bib | .pdf ]
Clustering approaches constitute important methods for unsupervised data analysis. Traditionally, many clustering models focus on spherical or ellipsoidal clusters in Euclidean space. Kernel methods extend these approaches to more complex cluster forms, and they have been recently integrated into several clustering techniques. While leading to very flexible representations, kernel clustering has the drawback of high memory and time complexity due to its dependency on the full Gram matrix and its implicit representation of clusters in terms of feature vectors. In this contribution, we accelerate the kernelized Neural Gas algorithm by incorporating a Nyström approximation scheme and active learning, and we arrive at sparse solutions by integration of a sparsity constraint. We provide experimental results which show that these accelerations do not lead to a deterioration in accuracy while improving time and memory complexity.

[34] K. Bunte, F.-M. Schleif, and T. Villmann. Mathematical foundations of the self organized neighbor embedding (SONE) for dimension reduction and visualization. In Proceedings of ESANN 2011, pages 29-34, 2011. [ bib | .pdf ]
In this paper we propose the generalization of the recently introduced Neighbor Embedding Exploratory Observation Machine (NEXOM) for dimension reduction and visualization. We provide a general mathematical framework called Self Organized Neighbor Embedding (SONE). It treats the components, like data similarity measures and neighborhood functions, independently, so they are easily changeable, and it enables the utilization of different divergences, based on the theory of Fréchet derivatives. In this way we propose a new dimension reduction and visualization algorithm which can easily be adapted to user-specific requirements and the actual problem.

[35] U. Seiffert, F.-M. Schleif, and D. Zühlke. Recent trends in computational intelligence in life science. In Proceedings of ESANN 2011, pages 77-86, 2011. [ bib | .pdf ]
Computational intelligence generally comprises a rather large set of, in a wider sense, adaptive and human-like data analysis and modelling methods. Due to some superior features, such as generalisation, trainability, and coping with incomplete and inconsistent data, computational intelligence has found its way into numerous applications in almost all scientific disciplines. A very prominent field among them is the life sciences, which are characterised by some unique requirements in terms of data structure and analysis.

[36] P. Schneider, T. Geweniger, F.-M. Schleif, M. Biehl, and T. Villmann. Multivariate class labeling in robust soft LVQ. In Proceedings of ESANN 2011, pages 17-22, 2011. [ bib | .pdf ]
We introduce a generalization of Robust Soft Learning Vector Quantization (RSLVQ). This algorithm for nearest prototype classification is derived from an explicit cost function and follows the dynamics of a stochastic gradient ascent. We generalize the RSLVQ cost function with respect to vectorial class labelings. This approach makes it possible to realize multivariate class memberships for prototypes and training samples, and the prototype labels can be learned from the data during training. We present experiments to demonstrate the new algorithm in practice.

[37] F.-M. Schleif. Sparse kernel vector quantization with local dependencies. In Proceedings of IJCNN 2011, pages 1538-1545, 2011. [ bib | .pdf ]
Clustering approaches are very important methods to analyze data sets in an initial unsupervised setting. Traditionally, many clustering approaches assume data points to be independent. Here we present a method that makes use of local dependencies to improve clustering under guaranteed distortions. Such local dependencies are very common for data generated by imaging technologies with an underlying topographic support of the measured data. We provide experimental results on artificial and real world clustering tasks.

[38] B. Hammer, A. Gisbrecht, A. Hasenfuss, B. Mokbel, F. M. Schleif, and X. Zhu. Topographic mapping of dissimilarity data. In Jorma Laaksonen and Timo Honkela, editors, Advances in Self-Organizing Maps, WSOM 2011, Lecture Notes in Computer Science 6731, pages 1-15. Springer, 2011. [ bib | .pdf ]
Topographic mapping offers a very flexible tool to inspect large quantities of high-dimensional data in an intuitive way. Often, electronic data are inherently non-Euclidean, and modern data formats are connected to dedicated non-Euclidean dissimilarity measures for which classical topographic mapping cannot be used. We give an overview of extensions of topographic mapping to general dissimilarities by means of median or relational extensions. Further, we discuss efficient approximations to avoid the usually squared time complexity.

[39] A. Gisbrecht, F.-M. Schleif, X. Zhu, and B. Hammer. Linear time heuristics for topographic mapping of dissimilarity data. In Yin et al. [40], pages 25-33. [ bib | .pdf ]
Topographic mapping offers an intuitive interface to inspect large quantities of electronic data. Recently, it has been extended to data described by general dissimilarities rather than Euclidean vectors. Unlike its Euclidean counterpart, the technique has quadratic time complexity due to the underlying quadratic dissimilarity matrix. Thus, it is infeasible already for medium sized data sets. We introduce two approximation techniques which reduce the complexity to linear time algorithms: the Nyström approximation and patch processing. We evaluate the techniques on three examples from the biomedical domain.

[40] Hujun Yin, Wenjia Wang, and Victor J. Rayward-Smith, editors. Intelligent Data Engineering and Automated Learning - IDEAL 2011 - 12th International Conference, Norwich, UK, September 7-9, 2011. Proceedings, volume 6936 of Lecture Notes in Computer Science. Springer, 2011. [ bib ]
[41] B. Hammer, F.-M. Schleif, and X. Zhu. Relational extensions of learning vector quantization. In Bao-Liang Lu, Liqing Zhang, and James Kwok, editors, Neural Information Processing, volume 7063 of Lecture Notes in Computer Science, pages 481-489. Springer, 2011. [ bib | .pdf ]
Prototype based models offer an intuitive interface to given data sets by means of an inspection of the model prototypes. Supervised classification can be achieved by popular techniques such as learning vector quantization (LVQ) and extensions derived from cost functions such as generalized LVQ (GLVQ) and robust soft LVQ (RSLVQ). These methods, however, are restricted to Euclidean vectors and they cannot be used if data are characterized by a general dissimilarity matrix. In this contribution, we propose relational extensions of GLVQ and RSLVQ which can directly be applied to general, possibly non-Euclidean data sets characterized by a symmetric dissimilarity matrix.
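For squared-Euclidean dissimilarities, the key identity behind such relational extensions, that prototype distances can be evaluated from the dissimilarity matrix alone, can be sketched as follows (illustrative helper name; a prototype is an implicit convex combination w = sum_j alpha_j x_j):

```python
import numpy as np

def relational_dist2(D, alpha):
    """Squared distances of all points to the implicit prototype
    w = sum_j alpha_j x_j (alpha convex weights), computed from the
    squared-Euclidean dissimilarity matrix D alone:
        d^2(x_i, w) = (D @ alpha)_i - 0.5 * alpha^T D alpha."""
    return D @ alpha - 0.5 * (alpha @ D @ alpha)
```

No vectorial embedding of the data is ever needed; only the coefficient vectors alpha are adapted during training.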

[42] Barbara Hammer, Bassam Mokbel, F.-M. Schleif, and Xibin Zhu. Prototype-based classification of dissimilarity data. In Gama et al. [43], pages 185-197. [ bib | .pdf ]
Unlike many black-box algorithms in machine learning, prototype based models offer an intuitive interface to given data sets since prototypes can directly be inspected by experts in the field. Most techniques rely on Euclidean vectors such that their suitability for complex scenarios is limited. Recently, several unsupervised approaches have successfully been extended to general possibly non-Euclidean data characterized by pairwise dissimilarities. In this paper, we shortly review a general approach to extend unsupervised prototype-based techniques to dissimilarities, and we transfer this approach to supervised prototype-based classification for general dissimilarity data.

[43] João Gama, Elizabeth Bradley, and Jaakko Hollmén, editors. Advances in Intelligent Data Analysis X - 10th International Symposium, IDA 2011, Porto, Portugal, October 29-31, 2011. Proceedings, volume 7014 of Lecture Notes in Computer Science. Springer, 2011. [ bib ]
[44] F.-M. Schleif, T. Riemer, U. Börner, L. Schnapka-Hille, and M. Cross. Genetic algorithm for shift-uncertainty correction in 1-D NMR based metabolite identifications and quantifications. Bioinformatics, 27(4):524-533, 2011. [ bib | .pdf ]
Motivation: The analysis of metabolic processes is becoming increasingly important to our understanding of complex biological systems and disease states. Nuclear magnetic resonance spectroscopy (NMR) is a particularly relevant technology in this respect, since the NMR signals provide a quantitative measure of metabolite concentrations. However, due to the complexity of the spectra typical of biological samples, the demands of clinical and high throughput analysis will only be fully met by a system capable of reliable, automatic processing of the spectra. An initial step in this direction has been taken by Targeted Profiling (TP), employing a set of known and predicted metabolite signatures fitted against the signal. However, an accurate fitting procedure for 1H NMR data is complicated by shift uncertainties in the peak systems caused by measurement imperfections. These uncertainties have a large impact on the accuracy of identification and quantification and currently require compensation by very time-consuming manual interactions. Here, we present an approach, termed Extended Targeted Profiling (ETP), that estimates shift uncertainties based on a genetic algorithm (GA) combined with a least squares optimization (LSQO). The estimated shifts are used to correct the known metabolite signatures, leading to significantly improved identification and quantification. In this way, use of the automated system significantly reduces the effort normally associated with manual processing and paves the way for reliable, high throughput analysis of complex NMR spectra. Results: The results indicate that using simultaneous shift uncertainty correction and least squares fitting significantly improves the identification and quantification results for 1H NMR data in comparison to the standard targeted profiling approach and compares favorably with the results obtained by manual expert analysis.
Preservation of the functional structure of the NMR spectra makes this approach more realistic than simple binning strategies.
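The core idea of combining a genetic search over shift candidates with a least-squares amplitude fit can be sketched in a few lines (a minimal illustration, not the authors' implementation; the function names, GA parameters and the synthetic Gaussian peak are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def lsq_amplitude(signal, template):
    """Least-squares optimal amplitude for fitting a template to the signal."""
    denom = np.dot(template, template)
    return np.dot(signal, template) / denom if denom > 0 else 0.0

def fit_error(signal, template, shift):
    """Squared residual after shifting the template and fitting its amplitude."""
    shifted = np.roll(template, shift)
    a = lsq_amplitude(signal, shifted)
    return np.sum((signal - a * shifted) ** 2)

def ga_shift_search(signal, template, max_shift=10, pop=20, gens=30):
    """Tiny GA over integer shifts: tournament selection plus +/-1 mutation."""
    population = rng.integers(-max_shift, max_shift + 1, size=pop)
    for _ in range(gens):
        errs = np.array([fit_error(signal, template, s) for s in population])
        # tournament selection: keep the better of two random individuals
        idx = rng.integers(0, pop, size=(pop, 2))
        winners = np.where(errs[idx[:, 0]] <= errs[idx[:, 1]],
                           population[idx[:, 0]], population[idx[:, 1]])
        # mutation: small random perturbation of the shift
        population = np.clip(winners + rng.integers(-1, 2, size=pop),
                             -max_shift, max_shift)
    errs = np.array([fit_error(signal, template, s) for s in population])
    return int(population[np.argmin(errs)])

# synthetic example: a Gaussian 'metabolite signature' shifted by 4 bins
x = np.arange(100)
template = np.exp(-0.5 * ((x - 50) / 3.0) ** 2)
signal = 2.0 * np.roll(template, 4)
best = ga_shift_search(signal, template)
```

Because the least-squares amplitude is solved in closed form inside the fitness function, the GA only has to search the (discrete) shift space, which mirrors the division of labor between GA and LSQO described in the abstract.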

[45] F.-M. Schleif, T. Villmann, B. Hammer, and P. Schneider. Efficient kernelized prototype-based classification. Journal of Neural Systems, 21(6):443-457, 2011. [ bib | .pdf ]
Prototype based classifiers are effective algorithms for modeling classification problems and have been applied in multiple domains. While many supervised learning algorithms have been successfully extended by means of the kernel concept to improve their discrimination power, prototype based classifiers are typically still used with Euclidean distance measures. Kernelized variants of prototype based classifiers are currently too complex to be applied to larger data sets. Here we propose an extension of Kernelized Generalized Learning Vector Quantization (KGLVQ) employing a sparsity and approximation technique to reduce the learning complexity. We provide generalization error bounds and experimental results on real world data, showing that the extended approach is comparable to the SVM on different public data sets.
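In kernelized LVQ variants of this kind, prototypes live in the kernel feature space as linear combinations of training points, and distances are computed via the kernel trick. A minimal sketch of that distance computation (the coefficient matrix, kernel choice and toy data are hypothetical, not taken from the paper):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian RBF kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_distances(X, X_train, Alpha, K_train):
    """Squared feature-space distances between points X and prototypes
    w_j = sum_i Alpha[j, i] * phi(x_i), via the kernel trick:
    ||phi(x) - w_j||^2 = k(x, x) - 2 sum_i a_ji k(x, x_i)
                         + sum_{i,l} a_ji a_jl k(x_i, x_l)."""
    Kxx = np.ones(len(X))              # for the RBF kernel, k(x, x) = 1
    Kxt = rbf_kernel(X, X_train)       # k(x, x_i)
    cross = Kxt @ Alpha.T              # sum_i a_ji k(x, x_i)
    self_term = np.einsum('ji,il,jl->j', Alpha, K_train, Alpha)
    return Kxx[:, None] - 2 * cross + self_term[None, :]

# toy data: two clusters, one prototype per class as a feature-space class mean
X_train = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9]])
Alpha = np.array([[0.5, 0.5, 0.0, 0.0],    # prototype of class 0
                  [0.0, 0.0, 0.5, 0.5]])   # prototype of class 1
K_train = rbf_kernel(X_train, X_train)
D = kernel_distances(np.array([[0.1, 0.0], [2.9, 3.1]]), X_train, Alpha, K_train)
pred = D.argmin(axis=1)                    # nearest-prototype classification
```

The quadratic cost of the self-term in the number of coefficients is exactly what sparsity and approximation techniques, as proposed in the paper, aim to reduce.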

[46] E. Mwebaze, P. Schneider, F.-M. Schleif, S. Haase, T. Villmann, and M. Biehl. Divergence based learning vector quantization. In Proceedings of ESANN 2010, pages 247-252, 2010. [ bib ]
[47] D. Zühle, F.-M. Schleif, T. Geweniger, and T. Villmann. Learning vector quantization for heterogeneous structured data. In Proceedings of ESANN 2010, pages 271-276, 2010. [ bib ]
[48] T. Villmann, S. Haase, F.-M. Schleif, B. Hammer, and M. Biehl. The mathematics of divergence based online learning in vector quantization. In Proceedings of ANNPR 2010, pages 108-119, 2010. [ bib ]
[49] T. Villmann, S. Haase, F.-M. Schleif, and B. Hammer. Divergence based online learning in vector quantization. In Proceedings of ICAISC 2010, pages 479-486, 2010. [ bib ]
[50] C. Angulo, J. A. Lee, and F.-M. Schleif. Advances in computational intelligence and learning. NeuroComputing, 73(7-9):1049-1050, 2010. [ bib ]
[51] S. Simmuteit, F.-M. Schleif, T. Villmann, and B. Hammer. Evolving trees for the retrieval of mass spectrometry based bacteria fingerprints. Knowledge and Information Systems, 25(2):327-343, 2010. [ bib | .pdf ]
In this paper we investigate the application of Evolving Trees for the analysis of mass spectrometric data of bacteria. Evolving Trees are extensions of Self-Organizing Maps developed for hierarchical classification systems. Therefore they are well suited for taxonomic problems like the identification of bacteria. Here we focus on three topics, an appropriate pre-processing and encoding of the spectra, an adequate data model by means of a hierarchical Evolving Tree and an interpretable visualization. First the high-dimensionality of the data is reduced by a compact representation. Here we employ sparse coding, specifically tailored for the processing of mass spectra. In the second step the topographic information which is expected in the fingerprints is used for advanced tree evaluation and analysis. We adapted the original topographic product for Self-Organizing-Maps for Evolving Trees to achieve a judgment of topography. Additionally we transferred the concept of U-matrix for evaluation of the separability of Self-Organizing-Maps to their analog in Evolving Trees. We demonstrate these extensions for two mass spectrometric data sets of bacteria fingerprints and show their classification and evaluation capabilities in comparison to state of the art techniques.

[52] F.-M. Schleif, T. Villmann, B. Hammer, P. Schneider, and M. Biehl. Generalized derivative based kernelized learning vector quantization. In Proceedings of IDEAL 2010, pages 21-28, 2010. [ bib | .pdf ]
We derive a novel derivative based version of kernelized Generalized Learning Vector Quantization (KGLVQ) as an effective, easy to interpret, prototype based and kernelized classifier. It is called D-KGLVQ and we provide generalization error bounds, experimental results on real world data, showing that D-KGLVQ is competitive with KGLVQ and the SVM on UCI data and additionally show that automatic parameter adaptation for the used kernels simplifies the learning.

[53] E. Mwebaze, P. Schneider, F.-M. Schleif, J.R. Aduwo, J.A. Quinn, S. Haase, T. Villmann, and M. Biehl. Divergence based classification in learning vector quantization. NeuroComputing, 74:1429-1435, 2010. [ bib | www: ]
We discuss the use of divergences in dissimilarity-based classification. Divergences can be employed whenever vectorial data consists of non-negative, potentially normalized features. This is, for instance, the case in spectral data or histograms. In particular, we introduce and study divergence based learning vector quantization (DLVQ). We derive cost function based DLVQ schemes for the family of γ-divergences which includes the well-known Kullback–Leibler divergence and the so-called Cauchy–Schwarz divergence as special cases. The corresponding training schemes are applied to two different real world data sets. The first one, a benchmark data set (Wisconsin Breast Cancer), is available in the public domain. In the second problem, color histograms of leaf images are used to detect the presence of cassava mosaic disease in cassava plants. We compare the use of standard Euclidean distances with DLVQ for different parameter settings. We show that DLVQ can yield superior classification accuracies and Receiver Operating Characteristics.
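The two special cases named in the abstract are easy to state concretely. A minimal sketch of divergence-based nearest-prototype classification (the toy histograms and labels are hypothetical; the DLVQ training schemes themselves are not reproduced here):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence between (normalized) histograms."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def cauchy_schwarz_divergence(p, q):
    """Cauchy-Schwarz divergence: -log of the normalized inner product."""
    num = np.dot(p, q)
    return float(-np.log(num / np.sqrt(np.dot(p, p) * np.dot(q, q))))

def dlvq_predict(x, prototypes, labels, divergence=kl_divergence):
    """Nearest-prototype classification under a divergence instead of
    the Euclidean distance (non-negative feature vectors assumed)."""
    d = [divergence(x, w) for w in prototypes]
    return labels[int(np.argmin(d))]

# toy histograms: two prototype 'spectra' and a query
prototypes = np.array([[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]])
labels = ['healthy', 'diseased']
query = np.array([0.6, 0.3, 0.1])
prediction = dlvq_predict(query, prototypes, labels)
```

Swapping `divergence=cauchy_schwarz_divergence` into `dlvq_predict` changes the dissimilarity without touching the classification rule, which is the modularity the γ-divergence family provides.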

[54] T. Villmann, F.-M. Schleif, and B. Hammer. Sparse representation of data. In Proceedings of ESANN 2010, pages 225-234, 2010. [ bib | .pdf ]
The amount of data available for investigation and analysis is rapidly growing in various areas of research, like biology, medicine, (bio-)chemistry or physics. Many of these data sets are very complex but also have a simple inherent structure, which allows an appropriate sparse representation and modeling of such data with little or no information loss. Advanced methods are needed to extract this inherent but hidden information. The task of sparse data representation and modeling can be approached using very different models. Some focus on the encoding and reconstruction of the data by means of sparse basis function sets, like wavelets; others identify more complex underlying structures by means of deconvolution approaches such as non-negative matrix factorization. But also feature reduction, feature extraction and sparse clustering techniques, often employing data-specific knowledge, can be employed to obtain sparse models of high-dimensional data. All these fields have a long tradition, but due to the increasing amount of data, sparse representation techniques have received tremendous attention in the last decade, with strong progress in theory and outstanding applications. We provide an overview on recent achievements and current trends of ongoing research.
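As one concrete instance of the sparse coding idea mentioned above, a greedy matching-pursuit sketch (the toy dictionary and signal are hypothetical illustrations, not a method from this tutorial paper):

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms=2):
    """Greedy sparse coding: repeatedly pick the dictionary atom most
    correlated with the residual and subtract its contribution.
    `dictionary` holds unit-norm atoms as rows."""
    residual = signal.astype(float).copy()
    coeffs = np.zeros(len(dictionary))
    for _ in range(n_atoms):
        corr = dictionary @ residual          # correlation with each atom
        k = int(np.argmax(np.abs(corr)))     # best-matching atom
        coeffs[k] += corr[k]
        residual -= corr[k] * dictionary[k]  # remove its contribution
    return coeffs, residual

# overcomplete toy dictionary of unit-norm atoms
D = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [np.sqrt(0.5), np.sqrt(0.5), 0.0]])
signal = np.array([2.0, 2.0, 0.0])           # exactly a multiple of atom 3
coeffs, residual = matching_pursuit(signal, D, n_atoms=1)
```

A single greedy step suffices here because the signal lies on one atom; for real spectra, the number of atoms trades reconstruction error against sparsity.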

[55] S. Simmuteit, F.-M. Schleif, and T. Villmann. Hierarchical evolving trees together with global and local learning for large data sets in maldi imaging. In Proceedings of WCSB 2010, pages 103-106, 2010. [ bib | .pdf ]
The analysis of very large sets of data with many thousands of measurements is an increasing problem. High-throughput approaches in the life sciences lead to large amounts of data which need to be analyzed by data mining approaches. For clustering and visualization approaches, a common problem is the handling of very large similarity matrices. Standard techniques suffer from memory and runtime limitations for such complex settings or are not applicable at all. Here we present a hierarchical composite clustering employing data specific properties to deal with this problem for data with an inherent hierarchical order. As an additional advantage our algorithm allows easy control of the clustering depth. The method is a prototype based approach leading to sparse, compact and interpretable models. We derive the algorithm and present it on data taken from tissue slices of high resolution MALDI Imaging. Results show an effective clustering as well as significant improvements in computational complexity for this type of data.

[56] F.-M. Schleif, T. Riemer, U. Boerner, L. Schnapka-Hille, and M. Cross. Efficient identification and quantification of metabolites in 1-h nmr measurements by a novel data encoding approach. In Proceedings of WCSB 2010, pages 91-94, 2010. [ bib | .pdf ]
The analysis of metabolic processes is becoming increasingly important to our understanding of complex biological systems and disease states. Nuclear magnetic resonance (NMR) spectroscopy is a particularly relevant technology in this respect, since the NMR signals provide a quantitative measure of metabolite concentrations. However, due to the complexity of the spectra typical of biological samples, the demands of clinical and high throughput analysis will only be fully met by a system capable of reliable, automatic processing of the spectra. We present here a novel data representation strategy for the measured spectra which simplifies the pre-processing of the data and supports the automatic identification and quantification of metabolites. The approach is combined with an extended targeted profiling strategy to allow the highly automated processing of 1H NMR spectra, generating readouts suitable for the derivation of system biological models. The parallel application of both manual expert analysis and the automated approach to 1H NMR spectra obtained from stem cell extracts shows that the results obtained are highly comparable. Use of the automated system therefore significantly reduces the effort normally associated with manual processing and paves the way for reliable, high throughput analysis of complex NMR spectra.

[57] F.-M. Schleif, M. Lindemann, P. Maass, M. Diaz, J. Decker, T. Elssner, M. Kuhn, and H. Thiele. Support vector classification of proteomic profile spectra based on feature extraction with the bi-orthogonal discrete wavelet transform. Computing and Visualization in Science, 12:189-199, 2009. [ bib | .pdf ]
Automatic classification of high-resolution mass spectrometry data has increasing potential to support physicians in the diagnosis of diseases like cancer. The proteomic data exhibit variations among different disease states. A precise and reliable classification of mass spectra is essential for a successful diagnosis and treatment. The underlying process to obtain such reliable classification results is a crucial point. In this paper such a method is explained and a corresponding semi-automatic parametrization procedure is derived. Thereby a simple, straightforward classification procedure to assign mass spectra to a particular disease state is obtained. The method is based on an initial preprocessing stage of the whole set of spectra, followed by the bi-orthogonal discrete wavelet transform (DWT) for feature extraction. The approximation coefficients calculated from the scaling function exhibit a high peak pattern matching property and provide a denoising of the spectrum. The discriminating coefficients, selected by the Kolmogorov-Smirnov test, are finally used as features for training and testing a support vector machine with both a linear and a radial basis kernel. For comparison, the peak areas obtained with the ClinProt-System [33] were analyzed using the same support vector machines. The introduced approach was evaluated on clinical MALDI-MS data sets with two classes each, originating from cancer studies. The cross-validated error rates using the wavelet coefficients were better than those obtained from the peak areas.
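The pipeline of wavelet-based feature extraction followed by Kolmogorov-Smirnov ranking can be illustrated with a plain Haar transform (a hedged sketch: the paper uses a bi-orthogonal wavelet, and the synthetic 'spectra' below are hypothetical):

```python
import numpy as np

def haar_dwt(signal, levels=2):
    """Multi-level Haar DWT: returns approximation coefficients (a smoothed,
    downsampled view of the spectrum) plus the per-level detail coefficients."""
    a = np.asarray(signal, float)
    details = []
    for _ in range(levels):
        even, odd = a[0::2], a[1::2]
        details.append((even - odd) / np.sqrt(2))   # detail (noise-like) part
        a = (even + odd) / np.sqrt(2)               # approximation part
    return a, details

def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: maximum gap between the
    empirical CDFs, usable to rank discriminative coefficients."""
    xs, ys = np.sort(x), np.sort(y)
    grid = np.concatenate([xs, ys])
    cdf_x = np.searchsorted(xs, grid, side='right') / len(xs)
    cdf_y = np.searchsorted(ys, grid, side='right') / len(ys)
    return float(np.max(np.abs(cdf_x - cdf_y)))

# two groups of toy 'spectra' that differ in their mean level
rng = np.random.default_rng(1)
group_a = rng.normal(0.0, 1.0, size=(10, 64))
group_b = rng.normal(0.5, 1.0, size=(10, 64))
coeffs_a = np.array([haar_dwt(s)[0] for s in group_a])
coeffs_b = np.array([haar_dwt(s)[0] for s in group_b])
scores = [ks_statistic(coeffs_a[:, j], coeffs_b[:, j])
          for j in range(coeffs_a.shape[1])]
```

The highest-scoring coefficients would then be passed on as features to a classifier such as an SVM, as described in the abstract.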

[58] F.-M. Schleif and T. Villmann. Neural maps and learning vector quantization - theory and applications. In Proceedings of ESANN 2009, pages 509-516, 2009. [ bib | .pdf ]
Neural maps and Learning Vector Quantizers are fundamental paradigms in neural vector quantization based on Hebbian learning. The beginnings of this field date back over twenty years, with strong progress in theory and outstanding applications. Their success lies in their robustness and simplicity of application, whereas the underlying mathematics is rather difficult. We provide an overview on recent achievements and current trends of ongoing research.

[59] S. Simmuteit, F.-M. Schleif, T. Villmann, and M. Kostrzewa. Hierarchical pca using tree-som for the identification of bacteria. In Proceedings of the 7th International Workshop on Self-Organizing Maps (WSOM) 2009, pages 272-280, 2009. [ bib | .pdf ]
In this paper we present an extended version of Evolving Trees using Oja's rule. Evolving Trees are extensions of Self-Organizing Maps developed for hierarchical classification systems. Therefore they are well suited for taxonomic problems like the identification of bacteria. The paper focuses on clustering and visualization of bacteria measurements. A modified variant of the Evolving Tree is developed and applied to obtain a hierarchical clustering. The method provides an inherent PCA analysis which is analyzed in combination with the tree based visualization. The obtained loadings support insights into the classification decision and can be used to identify features which are relevant for the cluster separation.

[60] M. Strickert, J. Keilwagen, F.-M. Schleif, T. Villmann, and M. Biehl. Matrix metric adaptation for improved linear discriminant analysis of biomedical data. In Bio-Inspired Systems: Computational and Ambient Intelligence, 10th International Work-Conference on Artificial Neural Networks, IWANN 2009, Proceedings, Part I. LNCS 5517, pages 933-940. Springer, 2009. [ bib ]
[61] T. Villmann and F.-M. Schleif. Functional vector quantization by neural maps. In Proceedings of Whispers 2009, page CD, 2009. [ bib ]
[62] S. Simmuteit, F.-M. Schleif, T. Villmann, and T. Elssner. Tanimoto metric in tree-som for improved representation of mass spectrometry data with an underlying taxonomic structure. In Proceedings of ICMLA 2009, pages 563-567. IEEE Press, 2009. [ bib | .pdf ]
In this paper we develop a Tanimoto metric variant of the Evolving Tree for the analysis of mass spectrometric data of animal fur. The Evolving Tree is an extension of Self-Organizing Maps developed to analyze hierarchical clustering problems. Together with the Tanimoto similarity measure, which is intended to work with taxonomic structured data, the Evolving Tree is well suited for the identification of animal hair based on mass spectrometry fingerprints. Results show a suitable hierarchical clustering of the test data and also a good retrieval capability with a logarithmic number of comparisons.

[63] M. Strickert, F.-M. Schleif, T. Villmann, and U. Seiffert. Unleashing pearson correlation for faithful analysis of biomedical data. In Similarity-based Clustering, pages 70-91. Springer, LNAI 5400, 2009. [ bib ]
[64] F.-M. Schleif, F.-M. Ongyerth, and T. Villmann. Supervised data analysis and reliability estimation for spectral data. NeuroComputing, 72(16-18):3590-3601, 2009. [ bib | .pdf ]
The analysis and classification of data is a common task in multiple fields of experimental research, such as bioinformatics, medicine, satellite remote sensing or chemometrics, leading to new challenges for an appropriate analysis. For this purpose different machine learning methods have been proposed. These methods usually do not provide information about the reliability of the classification. This, however, is a common requirement in, e.g., medicine and biology. Along this line, the present contribution offers an approach to enhance classifiers with reliability estimates in the context of prototype vector quantization. This extension can also be used to optimize precision or recall of the classifier system and to determine items which are not classifiable. This can lead to significantly improved classification results. The method is exemplarily presented on satellite remote spectral data but is applicable to a wider range of data sets.
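The general idea of attaching a reliability estimate to a prototype classifier, including the option of declaring an item not classifiable, can be sketched as follows (a minimal illustration with a relative-similarity style score; the threshold, prototypes and labels are hypothetical, not the paper's scheme):

```python
import numpy as np

def classify_with_reliability(x, prototypes, labels, theta=0.2):
    """Nearest-prototype classification with a reliability score in [0, 1]
    derived from the two closest prototypes; items whose score falls
    below the threshold `theta` are flagged as not classifiable (None)."""
    d = np.linalg.norm(prototypes - x, axis=1)
    order = np.argsort(d)
    d_best, d_second = d[order[0]], d[order[1]]
    reliability = (d_second - d_best) / (d_second + d_best)
    label = labels[order[0]] if reliability >= theta else None
    return label, float(reliability)

prototypes = np.array([[0.0, 0.0], [4.0, 4.0]])
labels = ['class A', 'class B']
lab1, r1 = classify_with_reliability(np.array([0.2, 0.1]), prototypes, labels)
lab2, r2 = classify_with_reliability(np.array([2.0, 2.0]), prototypes, labels)
```

Raising `theta` trades recall for precision: more borderline items are rejected, which matches the precision/recall optimization mentioned in the abstract.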

[65] F.-M. Schleif, A. Vellido, and M. Biehl. Advances in machine learning and computational intelligence. NeuroComputing, 72:7-9, 2009. [ bib ]
[66] F.-M. Schleif, T. Villmann, M. Kostrzewa, B. Hammer, and A. Gammerman. Cancer informatics by prototype-networks in mass spectrometry. Artificial Intelligence in Medicine, 45:215-228, 2009. [ bib | .pdf ]
Mass spectrometry has become a standard technique to analyse clinical samples in cancer research. The obtained spectrometric measurements reveal a wealth of information about the clinical sample at the peptide and protein level. The spectra are high dimensional and, due to the small number of samples, a sparse coverage of the population is very common. In clinical research the calculation and evaluation of classification models is important. For classical statistics this is achieved by hypothesis testing with respect to a chosen level of confidence. In clinical proteomics the application of statistical tests is limited due to the small number of samples and the high dimensionality of the data. Typically, soft methods from the field of machine learning like prototype based vector quantizers [17], Support Vector Machines (SVM) [32], Self-Organizing Maps (SOMs) [17] and respective variants are used to generate such models. However, for these methods the classification decision is generally crisp, and little or no additional information about the safety of the decision is available.

[67] F.-M. Schleif, T. Riemer, U. Boerner, and M. Cross. Extended targeted profiling to identify and quantify metabolites in 1-h nmr measurements. Technical report, 2009. [ bib ]
[68] S. Simmuteit, J. Simmuteit, F.-M. Schleif, and T. Villmann. Deconvolution and identification of mass spectra from mixed and pure colonies of bacteria. Technical report, 2009. [ bib ]
[69] M. Biehl, B. Hammer, F.-M. Schleif, P. Schneider, and T. Villmann. Stationarity of matrix relevance learning vector quantization. Technical Report MLR-01-2009, Machine Learning Reports, ISSN:1865-3960 http://www.uni-leipzig.de/ compint/mlr/mlr_01_2009.pdf, 2009. [ bib ]
[70] T. Villmann, B. Hammer, F.-M. Schleif, W. Herrmann, and M. Cottrell. Fuzzy classification using information theoretic learning vector quantization. NeuroComputing, 71(16-18):3070-3076, 2008. [ bib ]
[71] T. Villmann, F.-M. Schleif, B. Hammer, and M. Kostrzewa. Exploration of mass-spectrometric data in clinical proteomics using learning vector quantization methods. Briefings in Bioinformatics, 9(2):129-143, 2008. [ bib | .pdf ]
In the present contribution we present two recently developed classification algorithms for the analysis of mass-spectrometric data - the supervised neural gas and the fuzzy labeled self-organizing map. The algorithms are inherently regularizing, which is recommended for these spectral data because of their high dimensionality and sparseness for specific problems. The algorithms are both prototype based, such that the principle of characteristic representatives is realized. This leads to an easy interpretation of the generated classification model. Further, the fuzzy labeled self-organizing map is able to process uncertainty in data, and classification results can be obtained as fuzzy decisions. Moreover, this fuzzy classification together with the property of topographic mapping offers the possibility of class similarity detection, which can be used for class visualization. We demonstrate the power of both methods for two examples: the classification of bacteria (Listeria types) and of neoplastic and non-neoplastic cell populations in breast cancer tissue sections.

[72] M. Strickert, F.-M. Schleif, and T. Villmann. Metric adaptation for supervised attribute rating. In Michel Verleysen, editor, Proceedings of the 16th European Symposium on Artificial Neural Networks (ESANN) 2008, pages 31-36, Evere, Belgium, 2008. d-side publications. [ bib ]
[73] P. Schneider, F.-M. Schleif, T. Villmann, and M. Biehl. Generalized matrix learning vector quantizer for the analysis of spectral data. In Michel Verleysen, editor, Proceedings of the 16th European Symposium on Artificial Neural Networks (ESANN) 2008, pages 451-456, Evere, Belgium, 2008. d-side publications. [ bib | .pdf ]
The analysis of spectral data constitutes new challenges for machine learning algorithms due to the functional nature of the data. Special attention is given to the metric used in such analysis. Recently a prototype based algorithm has been proposed which allows the integration of a fully adaptive matrix in the metric. In this contribution we analyse this approach with respect to band matrices and its usage for the analysis of functional spectral data. The approach is tested on satellite data and data taken from food chemistry.

[74] F.-M. Schleif, T. Riemer, M. Cross, and T. Villmann. Automatic identification and quantification of metabolites in h-nmr measurements. In Proceedings of the Workshop on Computational Systems Biology (WCSB) 2008, pages 165-168, 2008. [ bib ]
[75] F.-M. Schleif, M. Ongyerth, and T. Villmann. Sparse coding neural gas for analysis of nuclear magnetic resonance spectroscopy. In Proceedings of CBMS 2008, pages 620-625, 2008. [ bib | .pdf ]
Nuclear Magnetic Resonance Spectroscopy is a technique for the analysis of complex biochemical materials. Thereby the identification of known sub-patterns is important. These measurements require an accurate preprocessing and analysis to meet clinical standards. Here we present a method for an appropriate sparse encoding of NMR spectral data combined with a fuzzy classification system allowing the identification of sub-patterns including mixtures thereof. The method is evaluated in contrast to an alternative approach using simulated metabolic spectra.

[76] T. Geweniger, F.-M. Schleif, A. Hasenfuss, B. Hammer, and T. Villmann. Comparison of cluster algorithms for the analysis of text data using kolmogorov complexity. In Proceedings of ICONIP 2008, CD publication, 2008. [ bib | .pdf ]
In this paper we present a comparison of multiple cluster algorithms and their suitability for clustering text data. The clustering is based on similarities only, employing the Kolmogorov complexity as a similarity measure. This motivates the set of considered clustering algorithms, which take into account the similarity between objects exclusively. The compared cluster algorithms are Median kMeans, Median Neural Gas, Relational Neural Gas, Spectral Clustering and Affinity Propagation. Keywords: cluster algorithm, similarity data, neural gas, spectral clustering, message passing, kMeans, Kolmogorov complexity

[77] F.-M. Schleif, T. Villmann, and B. Hammer. Pattern recognition by supervised relevance neural gas and its application to spectral data in bioinformatics. In Encyclopedia of Artificial Intelligence. 2008. [ bib ]
[78] F.-M. Schleif, B. Hammer, T. Villmann, M. v.d. Werff, A. Deelder, and R. Tollenaar. Analysis of spectral data in clinical proteomics by use of learning vector quantizers. In Computational Intelligence in Biomedicine and Bioinformatics: Current Trends and Applications, pages 141-167, chap. 6. 2008. [ bib ]
[79] F.-M. Schleif, T. Villmann, and B. Hammer. Prototype based fuzzy classification in clinical proteomics. International Journal of Approximate Reasoning, 47(1):4-16, 2008. [ bib | .pdf ]
Proteomic profiling based on mass spectrometry is an important tool for studies at the protein and peptide level in medicine and health care. Thereby, the identification of relevant masses that are characteristic for specific sample states, e.g. a disease state, is complicated. Further, classification accuracy and safety are especially important in medicine. The determination of classification models for such high dimensional clinical data is a complex task. Specific methods, which are robust with respect to the large number of dimensions and fit clinical needs, are required. In this contribution two such methods for the construction of nearest prototype classifiers are compared in the context of clinical proteomic studies; both are specifically suited to deal with such high-dimensional functional data. Both methods support the adaptation of the underlying metric, which is useful in proteomic research to get a problem-adequate representation of the clinical data. In addition, both allow fuzzy classification, and one of them also supports fuzzy classified training data. Both algorithms are investigated in detail with respect to their specific properties. A comparative performance analysis is carried out on real clinical proteomic cancer data.

[80] M. Strickert, F.-M. Schleif, and U. Seiffert. Derivatives of pearson correlation for gradient-based analysis of biomedical data. Ibero-American Journal of Artificial Intelligence, 37(12):37-44, 2008. [ bib ]
[81] K. Bunte, P. Schneider, B. Hammer, F.-M. Schleif, T. Villmann, and M. Biehl. Discriminative visualization by limited rank matrix learning. Technical Report MLR-03-2008, Machine Learning Reports, ISSN:1865-3960 http://www.uni-leipzig.de/ compint/mlr/mlr_03_2008.pdf, 2008. [ bib ]
[82] F.-M. Schleif, B. Hammer, and Th. Villmann. Margin based Active Learning for LVQ Networks. Neurocomputing, 70(7-9):1215-1224, 2007. [ bib ]
[83] F.-M. Schleif, B. Hammer, and Th. Villmann. Supervised neural gas for functional data and its application to the analysis of clinical proteom spectra. In Proceedings of the 9th International Work-Conference on Artificial Neural Networks (IWANN) 2007, pages 1036-1044, Berlin, Heidelberg, Germany, 2007. Springer. [ bib ]
[84] T. Villmann, M. Strickert, C. Brüß, F.-M. Schleif, and U. Seiffert. Visualization of fuzzy information in fuzzy-classification for image segmentation using MDS. In Michel Verleysen, editor, Proceedings of the 15th European Symposium on Artificial Neural Networks (ESANN) 2007, pages 103-108, Evere, Belgium, 2007. d-side publications. [ bib ]
[85] A. Hasenfuss, B. Hammer, F.-M. Schleif, and T. Villmann. Neural gas clustering for sparse proximity data. In Proceedings of the 9th International Work-Conference on Artificial Neural Networks (IWANN) 2007, pages 539-546, Berlin, Heidelberg, Germany, 2007. Springer. [ bib ]
[86] T. Villmann, F.-M. Schleif, E. Merenyi, and B. Hammer. Fuzzy labeled self organizing map for classification of spectra. In Proceedings of the 9th International Work-Conference on Artificial Neural Networks (IWANN) 2007, pages 556-563, Berlin, Heidelberg, Germany, 2007. Springer. [ bib ]
[87] B. Hammer, A. Hasenfuss, F.-M. Schleif, T. Villmann, M. Strickert, and U. Seiffert. Intuitive clustering of biological data. In Proc. of IJCNN 2007, pages 1877-1882. IEEE, 2007. [ bib ]
[88] S.-O. Deininger, M. Gerhard, and F.-M. Schleif. Statistical classification and visualization of maldi-imaging data. In Proc. of CBMS 2007, pages 403-405, 2007. [ bib ]
[89] F.-M. Schleif, T. Villmann, and B. Hammer. Analysis of proteomic spectral data by multi resolution analysis and self-organizing-maps. In Proceedings of the 7th International Workshop on Fuzzy Logic and Applications (WILF) 2007, pages 563-570, Berlin, Heidelberg, Germany, 2007. Springer. [ bib ]
[90] F.-M. Schleif. Prototypen basiertes maschinelles lernen in der klinische proteomik. In Ausgezeichnete Informatikdissertationen 2006, pages 179-188. GI-Edition Lecture Notes in Informatics (LNI), 2007. [ bib ]
[91] T. Villmann, F.-M. Schleif, B. Hammer, M. Strickert, and E. Merenyi. Class imaging of hyperspectral satellite remote sensing data using fuzzy labeled self organizing maps. In Proc. of WSOM 2007, http://biecoll.ub.uni-bielefeld.de//frontdoor.php?source_opus=128&la=en. Bielefeld University Press, 2007. [ bib ]
[92] P. Schneider, M. Biehl, F.-M. Schleif, and B. Hammer. Advanced metric adaptation in general lvq for classification of mass spectrometry data. In Proc. of WSOM 2007, http://biecoll.ub.uni-bielefeld.de//frontdoor.php?source_opus=125&la=en. Bielefeld University Press, 2007. [ bib ]
[93] M. Strickert, F.-M. Schleif, and U. Seiffert. Gradients of pearson correlation for analysis of biomedical data. In Proc. of ASAI 2007, pages 139-150, 2007. [ bib ]
[94] M. Strickert and F.-M. Schleif. Supervised attribute relevance determination for protein identification in stress experiments. In Proc. of MLSB 2007, pages 81-86, 2007. [ bib ]
[95] T. Villmann, F.-M. Schleif, M. v.d.Werff, A. Deelder, and R. Tollenaar. Associative learning in soms for fuzzy-classification. In Proc. of ICMLA 2007, pages 581-586, 2007. [ bib ]
[96] F.-M. Schleif. Maschinelles Lernen mit Prototypmethoden in der klinischen Proteomik. Künstliche Intelligenz (KI), 4/07:65-67, 2007. [ bib | .pdf ]
Clinical proteomics investigates protein-based disease processes in clinical samples. The samples are typically measured with a mass spectrometer, producing high-dimensional spectra that indicate the expressivity of specific protein fragments. A further challenge is the rather small number of samples. In addition, the quality and interpretability of the classification decision are of particular importance, as is the adaptability of the generic classification models to follow-up measurements. Accordingly, the spectra are suitably reduced for further processing. After appropriate evaluation, these models can become candidates for the analysis and diagnostics of disease processes. We briefly consider the preparation of the spectra; subsequently, concepts of prototype-based classification methods are described and their extensions for clinical proteomics are sketched. In the results part, the developed algorithms are applied to build classification models for several clinical data sets and are evaluated.

[97] M. Strickert, F.-M. Schleif, T. Villmann, and U. Seiffert. Derivatives of pearson correlation for gradient based analysis of biomedical data. In Similarity based Clustering, LNCS, 2007. [ bib ]
[98] F.-M. Schleif. In Dagstuhl online proceedings - Seminar Similarity based Clustering. Schloss Dagstuhl - Leibniz Center for Informatics, 2007. [ bib ]
[99] F.-M. Schleif, A. Hasenfuss, and B. Hammer. Aggregation of multiple peak lists by use of an improved neural gas network. Technical Report MLR-02-2007, Machine Learning Reports, ISSN:1865-3960 http://www.uni-leipzig.de/ compint/mlr/mlr_02_2007.pdf, 2007. [ bib ]
[100] F.-M. Schleif. Preprocessing of nuclear magnetic resonance spectrometry data. Technical Report MLR-01-2007, Machine Learning Reports, ISSN:1865-3960 http://www.uni-leipzig.de/ compint/mlr/mlr_01_2007.pdf, 2007. [ bib ]
[101] Th. Villmann, F.-M. Schleif, and B. Hammer. Comparison of relevance learning vector quantization with other metric adaptive classification methods. Neural Networks, 19(5):610-622, 2006. [ bib ]
[102] C. Brüß, F. Bollenbeck, F.-M. Schleif, W. Weschke, T. Villmann, and U. Seiffert. Fuzzy image segmentation with fuzzy labelled neural gas. In Proc. of ESANN 2006, pages 563-569, 2006. [ bib ]
[103] B. Hammer, T. Villmann, F.-M. Schleif, C. Albani, and W. Hermann. Learning vector quantization classification with local relevance determination for medical data. In Proc. of ICAISC 2006, LNAI 4029, pages 603-612. Springer, 2006. [ bib ]
[104] F.-M. Schleif, B. Hammer, and Th. Villmann. Margin based Active Learning for LVQ Networks. In Proc. of ESANN 2006, pages 539-545, 2006. [ bib ]
[105] F.-M. Schleif, T. Elssner, M. Kostrzewa, T. Villmann, and B. Hammer. Analysis and visualization of proteomic data by fuzzy labeled self organizing maps. In Proc. of CBMS 2006, pages 919-924, 2006. [ bib ]
[106] T. Villmann, U. Seiffert, F.-M. Schleif, C. Brüß, T. Geweniger, and B. Hammer. Fuzzy labeled self-organizing map with label-adjusted prototypes. In Proc. of ANNPR 2006, pages 46-56, 2006. [ bib ]
[107] T. Villmann, B. Hammer, F.-M. Schleif, T. Geweniger, and M. Cottrell. Prototype based classification using information theoretic learning. In Proc. of ICONIP 2006, pages 40-49, 2006. [ bib ]
[108] B. Hammer, A. Hasenfuss, F.-M. Schleif, and T. Villmann. Supervised batch neural gas. In Proc. of ANNPR 2006, pages 33-45, 2006. [ bib ]
[109] B. Hammer, A. Hasenfuss, F.-M. Schleif, and T. Villmann. Supervised median clustering. In Proc. of ANNIE 2006, pages 623-632, 2006. [ bib ]
[110] F.-M. Schleif, T. Elssner, M. Kostrzewa, T. Villmann, and B. Hammer. Machine learning and soft-computing in bioinformatics - a short journey. In Proc. of FLINS 2006, pages 541-548. World Scientific Press, 2006. [ bib ]
[111] T. Villmann, F.-M. Schleif, and B. Hammer. Prototype-based fuzzy classification with local relevance for proteomics. Neurocomputing, 69:2425-2428, 2006. [ bib ]
[112] T. Villmann, B. Hammer, F.-M. Schleif, T. Geweniger, and W. Herrmann. Fuzzy classification by fuzzy labeled neural gas. Neural Networks, 19(6-7):772-779, 2006. [ bib ]
[113] F.-M. Schleif. Prototype based Machine Learning for Clinical Proteomics. PhD thesis, Technical University Clausthal, Clausthal-Zellerfeld, Germany, 2006. [ bib ]
[114] B. Hammer, A. Hasenfuss, F.-M. Schleif, and Th. Villmann. Supervised median clustering. Technical Report IfI-09-06, University of Clausthal, 2006. [ bib ]
[115] F.-M. Schleif, Th. Villmann, and B. Hammer. Local metric adaptation for soft nearest prototype classification to classify proteomic data. In Proceedings of the 6th Workshop on Fuzzy Logic and Applications (WILF) 2005, pages 290-296, Berlin Heidelberg, Germany, 2005. Springer. [ bib ]
[116] Th. Villmann, B. Hammer, F.-M. Schleif, and T. Geweniger. Fuzzy labeled neural gas for fuzzy classification. In Marie Cottrell, editor, Proceedings of the 5th Workshop on Self-Organizing Maps (WSOM) 2005, pages 283-290, Paris, France, 2005. University Paris-1-Pantheon-Sorbonne on CD-ROM (C) 2005 WSOM'05 Organizing Committee. [ bib ]
[117] F.-M. Schleif, Th. Villmann, and B. Hammer. Fuzzy labeled soft nearest neighbor classification with relevance learning. In M. Arif Wani, Krzysztof J. Cios, and Khalid Hafeez, editors, Proceedings of the 4th International Conference on Machine Learning and Applications (ICMLA) 2005, pages 11-15, Los Alamitos, CA, USA, 2005. IEEE Press. [ bib ]
[118] B. Hammer, F.-M. Schleif, and Th. Villmann. On the generalization ability of prototype-based classifiers with local relevance determination. Technical Report IfI-05-14, University of Clausthal, 2005. [ bib ]
[119] F.-M. Schleif. Plugins mit wxWidgets. Offene Systeme, 01/05:5-10, 2005. [ bib ]
[120] F.-M. Schleif, U. Clauss, Th. Villmann, and B. Hammer. Supervised relevance neural gas and unified maximum separability analysis for classification of mass spectrometric data. In M. Arif Wani, Krzysztof J. Cios, and Khalid Hafeez, editors, Proceedings of the 3rd International Conference on Machine Learning and Applications (ICMLA) 2004, pages 374-379, Los Alamitos, CA, USA, December 2004. IEEE Press. [ bib ]
[121] T. Villmann, B. Hammer, and F.-M. Schleif. Metric adaptation for optimal feature classification in learning vector quantization applied to environment detection. In Proceedings of Selbstorganisation von Adaptivem Verfahren (SOAVE'2004), pages 592-597. Fortschritts-Berichte VDI Reihe 10, Nr. 742, VDI Verlag, Germany, 2004. [ bib ]
[122] Th. Villmann, F.-M. Schleif, and B. Hammer. Supervised neural gas and relevance learning in learning vector quantisation. In Takeshi Yamakawa, editor, Proceedings of the 4th Workshop on Self Organizing Maps (WSOM) 2003, pages 47-52, Hibikino, Kitakyushu, Japan, 2003. Kyushu Institute of Technology on CD-ROM (C) 2003 WSOM'03 Organizing Committee. [ bib ]
[123] V. Gruhn, M. Hülder, R. Ijoui, and F.-M. Schleif. A distributed logistic support communication system. In H. Linger, J. Fisher, W.G. Wojtkowski, J. Zupancic, K. Vigo, and J. Arnold, editors, Proceedings of ISD 2003 - Constructing the Infrastructure for the Knowledge Economy - Methods and Tools, Theory and Practice, pages 705-713. Kluwer Academic Publishers, London, 2003. [ bib ]
[124] T. Dörfler, A. Simmel, F.-M. Schleif, and E. Sommerfeld. Working memory load and EEG coherence. Brain Topography, 15(4):269, 2003. [ bib ]
[125] M. Köhler, K. Buchta, F.-M. Schleif, and E. Sommerfeld. A mission for the EEG coherence analysis: Is the task complex or difficult? Brain Topography, 15(4):271, 2003. [ bib ]
[126] M. Köhler, K. Buchta, F.-M. Schleif, and E. Sommerfeld. Complexity and difficulty in memory based comparison. In Proceedings of the 18th Meeting of the International Society for Psychophysics, pages 433-439. Pabst Publishing, 2002. [ bib ]
[127] F.-M. Schleif. Momentbasierte Methoden zur Schriftzeichenerkennung. Master's thesis, University of Leipzig, Dokumentenserver Universität Leipzig, http://dol.uni-leipzig.de/pub/2002-33, 2002. [ bib ]
[128] F.-M. Schleif. OCR mit statistischen Momenten. Gaotenblatt, pages 15-17, 2002. [ bib ]
[129] F.-M. Schleif and H. Stamer. LaTeX im studentischen Alltag. Gaotenblatt, pages 3-10, 2002. [ bib ]
[130] A. Simmel, T. Dörfler, F.-M. Schleif, and E. Sommerfeld. An analysis of connections between internal and external learning process indicators using EEG coherence analysis. In Proceedings of the 17th Meeting of the International Society for Psychophysics, pages 602-607. Pabst Publishing, 2001. [ bib ]
[131] T. Dörfler, A. Simmel, F.-M. Schleif, and E. Sommerfeld. Complexity-dependent synchronization of brain subsystems during memorization. In Proceedings of the 17th Meeting of the International Society for Psychophysics, pages 343-348. Pabst Publishing, 2001. [ bib ]