This book presents the bi-partial approach to data analysis, which is uniquely general and supports the development of techniques, together with related models and algorithms, for many data analysis problems. It is based on an adequate representation of the essential clustering problem: to group together the similar, and to separate the dissimilar. This leads to a general objective function and subsequently to a broad class of concrete implementations. On this basis, a suboptimising procedure can be developed, together with a variety of implementations.
This procedure has a striking affinity with the classical hierarchical merger algorithms, while also incorporating a stopping rule based on the objective function. The approach resolves the cluster number issue, as the solutions obtained include both the content and the number of clusters. Further, it is demonstrated how the bi-partial principle can be effectively applied to a wide variety of problems in data analysis.
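As a rough illustration of the idea described above, the following minimal sketch merges clusters greedily and stops as soon as no merger improves a two-part objective. The specific objective is an assumption made for this example and is not taken from the book: it rewards dissimilarity between clusters with weight r and similarity within clusters with weight 1 - r, with similarity taken as d_max minus distance. The function name bipartial_merge, the parameter r and the one-dimensional data are likewise hypothetical.

```python
from itertools import combinations


def bipartial_merge(points, r=0.5):
    """Greedy merging with an objective-based stopping rule (illustrative only).

    Assumed objective, not the book's own formula:
        Q(P) = r * sum of distances over between-cluster pairs
             + (1 - r) * sum of (d_max - distance) over within-cluster pairs.
    Merging clusters A and B changes Q by
        sum over i in A, j in B of ((1 - r) * d_max - d_ij),
    so a merger is accepted only while this gain is positive.
    """
    n = len(points)
    if n == 0:
        return []
    dist = [[abs(a - b) for b in points] for a in points]
    d_max = max(max(row) for row in dist)
    threshold = (1.0 - r) * d_max

    # Start from singletons, as in classical hierarchical merging.
    clusters = [[i] for i in range(n)]

    def gain(a, b):
        # Change in the assumed objective if clusters a and b are merged.
        return sum(threshold - dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        best_gain, (p, q) = max(
            (gain(clusters[p], clusters[q]), (p, q))
            for p, q in combinations(range(len(clusters)), 2)
        )
        if best_gain <= 0:
            # Stopping rule: no merger increases the objective any further.
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is unaffected

    return sorted(sorted(c) for c in clusters)


data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
print(bipartial_merge(data))
```

In this sketch the number of clusters is not supplied in advance; it falls out of the point at which merging stops, which is the property the description above attributes to the approach. Changing r shifts the balance between the two parts of the objective and therefore the partition obtained.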
The book offers a valuable resource for all data scientists who wish to broaden their perspective on basic approaches and essential problems, and thus to find answers to questions that are often overlooked or have yet to be solved convincingly. It is also intended for graduate students in the computer and data sciences, and will complement their knowledge and skills with fresh insights on problems that are otherwise treated in the standard "academic" manner.
Preface.- Chapter 1. Notation and main assumptions.- Chapter 2. The problem of cluster analysis.- Chapter 3. The general formulation of the objective function.- Chapter 4. Formulations and rationales for other problems in data analysis, etc.