Author name cluster

Marcus Held

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers

1 author row

NeurIPS Conference 1999 Conference Paper

Model Selection in Clustering by Uniform Convergence Bounds

Joachim Buhmann
Marcus Held

Unsupervised learning algorithms are designed to extract structure from data samples. Reliable and robust inference requires a guarantee that extracted structures are typical for the data source, i.e., similar structures have to be inferred from a second sample set of the same data source. The overfitting phenomenon in maximum entropy based annealing algorithms is exemplarily studied for a class of histogram clustering models. Bernstein's inequality for large deviations is used to determine the maximally achievable approximation quality parameterized by a minimal temperature. Monte Carlo simulations support the proposed model selection criterion by finite temperature annealing.


NeurIPS Conference 1998 Conference Paper

Visualizing Group Structure

Marcus Held
Jan Puzicha
Joachim Buhmann

Cluster analysis is a fundamental principle in exploratory data analysis, providing the user with a description of the group structure of given data. A key problem in this context is the interpretation and visualization of clustering solutions in high-dimensional or abstract data spaces. In particular, probabilistic descriptions of the group structure, essential to capture inter-cluster relationships, are hardly assessable by simple inspection of the probabilistic assignment variables. We present a novel approach to the visualization of group structure. It is based on a statistical model of the object assignments which have been observed or estimated by a probabilistic clustering procedure. The objects or data points are embedded in a low dimensional Euclidean space by approximating the observed data statistics with a Gaussian mixture model. The algorithm provides a new approach to the visualization of the inherent structure for a broad variety of data types, e.g., histogram data, proximity data and co-occurrence data. To demonstrate the power of the approach, histograms of textured images are visualized as an example of a large-scale data mining application.


NeurIPS Conference 1997 Conference Paper

Unsupervised On-line Learning of Decision Trees for Hierarchical Data Analysis

Marcus Held
Joachim Buhmann

An adaptive on-line algorithm is proposed to estimate hierarchical data structures for non-stationary data sources. The approach is based on the principle of minimum cross entropy to derive a decision tree for data clustering and it employs a metalearning idea (learning to learn) to adapt to changes in data characteristics. Its efficiency is demonstrated by grouping non-stationary artificial data and by hierarchical segmentation of LANDSAT images.
