A sequential Bayesian alternative to the classical parallel fuzzy clustering model

Link to Article: 
Abstract: 

Unsupervised separation of a group of datums of a particular type into clusters that are homogeneous within a problem-class-specific context is a classical research problem that is still actively studied. Since the 1960s, the research community has converged on a class of clustering algorithms that utilize concepts such as fuzzy/probabilistic membership as well as possibilistic and credibilistic degrees. Despite differences in formalization and in the approach to loss assessment across algorithms, a significant majority of the works in the literature utilize the sum of datum-to-cluster distances over all datums and all clusters. In essence, this double summation is the basis on which additional features such as outlier rejection and robustification are built. In this work, we revisit this classical concept and suggest an alternative clustering model in which clusters act on datums sequentially. We show that the notion of being an outlier emerges within the mathematical model developed in this paper. We then provide a generic loss model in the new framework; this model is independent of any particular datum or cluster model and utilizes a robust loss function. An important aspect of this work is that the modeling is based entirely on a Bayesian inference framework and avoids engineering terms based on heuristics or intuition. Finally, we develop a solution strategy that operates within an Alternating Optimization pipeline.
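For readers unfamiliar with the classical "parallel" formulation that the abstract refers to, the double summation can be illustrated with the standard fuzzy c-means objective, in which every datum contributes a membership-weighted squared distance to every cluster. The sketch below is only an illustration of that classical loss, not the sequential Bayesian model proposed in the paper; the function names, the squared Euclidean distance, and the fuzzifier value `m = 2` are assumptions chosen for the example.

```python
from typing import Sequence


def squared_distance(x: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between a datum and a cluster prototype (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(x, v))


def fuzzy_c_means_loss(data, centers, memberships, m: float = 2.0) -> float:
    """Classical double summation: sum over all datums i and all clusters c
    of memberships[i][c] ** m * squared_distance(data[i], centers[c])."""
    total = 0.0
    for i, x in enumerate(data):            # outer sum: all datums
        for c, v in enumerate(centers):     # inner sum: all clusters
            total += (memberships[i][c] ** m) * squared_distance(x, v)
    return total


# Hypothetical toy data: three 2-D datums, two cluster prototypes,
# and crisp (0/1) memberships for simplicity.
data = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
centers = [(0.5, 0.0), (10.0, 0.0)]
memberships = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

loss = fuzzy_c_means_loss(data, centers, memberships)
```

In this formulation every cluster evaluates every datum simultaneously, which is the structure the paper contrasts with its sequential alternative.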

Keywords: 
Data clustering
Fuzzy clustering
Possibilistic clustering
Robust clustering
Bayesian modeling
Sequential clustering
Year: 2015
Volume: 
Issue: 
Page Start: 28
Page End: 47