Name: Data Clustering: Theory, Algorithms, and Applications, Second Edition
Brand: Guojun Gan
Price: 28538.00 JPY
Availability: InStock
Rating: 4.333333333333333 (3 reviews)

Reviews

4.0

All from verified purchases

D**T

Useful Overview of Algorithms, But Needs a More Systemic Framework

This is a useful compendium of a variety of clustering methods, for a variety of data types, with numerous measures of similarity and many examples of algorithms. The ultimate emphasis is on the algorithms, even their implementation in MATLAB or C++.

However, this book is short on useful theoretical frameworks, reflecting more the efforts of practitioners from various fields than those of applied mathematicians, despite the SIAM imprint. There are times when it seems that the authors were just writing quick overviews of different papers, sometimes changing notation, even with typographical errors or other lapses, without trying to unify the different methods by developing a common framework.

I developed a new clustering algorithm for a particular application to proportional representation in voting and was surprised that none of the algorithms described in this book encompassed my method. This is despite the fact that most of my concepts have been previously used by others in other contexts. For example, I use fuzzy sets, learned directly from Bezdek many years ago, but not as described by the k-means or c-means formulas presented in chapter 8, even though the k-means centroiding idea is easily generalized to encompass my method.

A good general framework for clustering algorithms would be that of optimization theory, with both discrete and continuous aspects. Yet the authors do not appear to be well versed in this theory. For example, instead of using the phrase “objective function” they use “validity indices” and survey only a few specific formulas, rather than describing a framework to give practitioners better guidance for constructing an objective function well suited to the problem at hand. In my case, I wanted clusters (= voting blocks) that are not too small, reasonably compact, with some but not too much overlap permitted, and which include the bulk of the population, permitting some voters to be outside any cluster or at least without full membership in the clusters. I also wanted an objective function with a fairly smooth dependence on the data and on parameters of the algorithm, without sudden jumps. I was able to do this, but this book would not have helped.

Another important practical aspect of clustering is not covered at all, namely how to handle very large sets of data. In scientific computing a common paradigm is to start with “discretization”: you take continuous or nearly continuous data and first organize it into bundles based on similarity. This is itself a problem of partition-type clustering. But at this stage the emphasis is on speed of processing, not optimality, seeking a radical reduction in problem size with minimal loss of information. How best to do it is very problem dependent, but an obvious place to start is k-means partitioning for a value of k that is fairly large yet still much smaller than the original number of data points. Then apply a more optimal procedure to these k data points. With categorical data, sorting procedures on the categories may be used to get even better discretizations. In my case the data points are ballots where voters rank candidates, so that I can sort ballots by the highest ranked candidates, using k-means to represent the lower rankings by a centroid. Note that the resulting data points have different weights (they represent different numbers of voters), meaning that the follow-on algorithm will be for a graph with weighted vertices as well as weighted edges (= similarity, such as correlation). However, this framework is not described in the book.

Another puzzling omission is the lack of discussion of good methods for initializing iterative algorithms, despite the acknowledgement on p. 164 that k-means is very dependent on the initialization of the center vectors. In my case I simply use a variety of crude techniques to initialize, then build into my iterative solver a procedure that merges strongly overlapping clusters and deletes very small clusters. Then my objective function allows me to rank the results. Thus “quick and dirty” clustering schemes do have an important role, even in the context of more optimal algorithms.

In conclusion, this book provides fairly readable snapshots of clustering techniques in action, but would greatly benefit from a more systemic approach.
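
The “discretization” step described in this review (a coarse k-means pass with a fairly large k, followed by a more careful procedure on the k weighted representatives) could look roughly like the sketch below. It is only an illustration under stated assumptions, not the reviewer's code and not anything taken from the book: it assumes numeric points with squared Euclidean distance, and the names `kmeans_reduce` and `weighted_mean` are hypothetical. Ranked ballots would first need a numeric encoding, which is not shown.

```python
import random
from collections import Counter


def kmeans_reduce(points, k, iters=20, seed=0):
    """Coarse k-means pass (hypothetical helper).

    Returns (centroids, weights), where weights[j] is the number of
    original points represented by centroids[j].
    """
    rng = random.Random(seed)
    # Crude initialization: k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [points[i] for i in range(len(points)) if assign[i] == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    counts = Counter(assign)
    # Drop empty clusters so every representative carries a positive weight.
    kept = [j for j in range(k) if counts[j] > 0]
    return [centroids[j] for j in kept], [counts[j] for j in kept]


def weighted_mean(centroids, weights):
    """Weighted mean of the representatives (hypothetical helper).

    A follow-on algorithm must use the weights like this, because each
    representative stands for a different number of original points.
    """
    total = sum(weights)
    dims = len(centroids[0])
    return [
        sum(w * c[d] for c, w in zip(centroids, weights)) / total
        for d in range(dims)
    ]
```

Calling `kmeans_reduce(points, k)` on a list of coordinate tuples gives the reduced, weighted data set. The follow-on algorithm then works on at most k weighted representatives instead of the full data, which is the reduction in problem size the review has in mind.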

14.11.2015

A**R

Five Stars

Like this one. Great for students!

13.12.2016

A**R

Five Stars

Great books for students!

13.12.2016
