This is the
talk page for discussing improvements to the
Cluster analysis article. This is not a forum for general discussion of the article's subject. |
Archives: 1 | Auto-archiving period: 365 days |
This article is rated C-class on Wikipedia's content assessment scale. It is of interest to the following WikiProjects: |
Text has been copied to or from this article; see the list below. The source pages now serve to
provide attribution for the content in the destination pages and must not be deleted as long as the copies exist. For attribution and to access older versions of the copied text, please see the history links below.
|
This article is substantially duplicated by a piece in an external publication. Please do not flag this article as a copyright violation of the following source:
|
The content of this article has been derived in whole or part from
https://github.com/eXascaleInfolab/clubmark/tree/master/docs. Permission has been received from the copyright holder to release this material under both the
Creative Commons Attribution-ShareAlike 3.0 Unported license and the
GNU Free Documentation License. You may use either or both licenses. Evidence of this has been confirmed and stored by
VRT volunteers, under ticket number
2019021110001288. Also available under
Creative Commons Attribution 4.0 and
Apache 2.0. This template is used by approved volunteers dealing with the Wikimedia volunteer response team system (VRTS) after receipt of a clear statement of permission at permissions-en@wikimedia.org. Do not use this template to claim permission. |
Can someone please make infinity-norm a link: infinity-norm
(The article is currently locked.)
This page appears to have been deliberately vandalised.
Please unlock this page.
A Google search for "V-means clustering" only returns this Wikipedia article. Can someone provide a citation for this?
For future reference, this is the V-means paragraph that was removed:
This article possibly contains original research. (October 2007) |
This article contains weasel words: vague phrasing that often accompanies biased or unverifiable information. (March 2009) |
V-means clustering utilizes cluster analysis and nonparametric statistical tests to key researchers into segments of data that may contain distinct homogeneous sub-sets. The methodology embraced by V-means clustering circumvents many of the problems that traditionally beleaguer standard techniques for categorizing data. First, instead of relying on analyst predictions for the number of distinct sub-sets (k-means clustering), V-means clustering generates a Pareto-optimal number of sub-sets. V-means clustering is calibrated to a user-defined confidence level p, whereby the algorithm divides the data and then recombines the resulting groups until the probability that any given group belongs to the same distribution as either of its neighbors is less than p.
Second, V-means clustering makes use of repeated iterations of the nonparametric Kolmogorov-Smirnov test. Standard methods of dividing data into its constituent parts are often entangled in definitions of distances (distance measure clustering) or in assumptions about the normality of the data (expectation maximization clustering), but nonparametric analysis draws inference from the distribution functions of sets.
Third, the method is conceptually simple. Some methods combine multiple techniques in sequence in order to produce more robust results. From a practical standpoint this muddles the meaning of the results and frequently leads to conclusions typical of “data dredging.”
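For anyone trying to evaluate the removed paragraph: the only concrete building block it names is the two-sample Kolmogorov-Smirnov test. A minimal sketch of that test's statistic, in plain Python, is below. This is the standard statistic only, not an implementation of V-means (no source for that algorithm has been found), and the function name `ks_statistic` is made up for illustration.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical
    distribution functions of samples a and b (a value in [0, 1]).
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

A statistic near 0 means the two groups look like draws from the same distribution; a value near 1 means they barely overlap. Turning the statistic into a probability to compare against p requires the Kolmogorov distribution, which is omitted here.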
I believe there is a typo at "typological analysis"; it should be "topological".
The explanation of the fuzzy c-means algorithm is quite difficult to follow. The order of the bullet points is correct, but it is unclear which steps are to be repeated, and when.
"The fuzzy c-means algorithm is greatly similar to the k-means algorithm:
Also, aren't c-means and k-means just different names for the same thing? If so, can the article be changed to use one name consistently throughout?
The name c-means relates only to the fuzzy logic clustering algorithm. You could say that k-means is what c-means clustering converges to under ordinary logic, rather than fuzzy logic.
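To make both points concrete (which steps repeat, and how the two algorithms relate), here is a small sketch in plain Python. It assumes 1-D data, the usual membership formula with a fuzzifier m > 1, and hypothetical function names (`memberships`, `update_centers`, `fuzzy_c_means`). It is an illustration for this discussion, not the article's wording or a reference implementation.

```python
def memberships(points, centers, m):
    """Degree of membership of each point in each cluster; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        nearest = min(dists)
        if nearest == 0.0:
            # The point sits exactly on a center: full membership there.
            hit = dists.index(0.0)
            rows.append([1.0 if j == hit else 0.0 for j in range(len(centers))])
            continue
        # Equivalent to d_j**(-power) / sum_k d_k**(-power), scaled by the
        # nearest distance so that no term can overflow.
        weights = [(nearest / d) ** power for d in dists]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def update_centers(points, u, m):
    """Each center becomes the mean of all points, weighted by membership**m."""
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centers


def fuzzy_c_means(points, centers, m=2.0, tol=1e-6, max_iter=100):
    """Repeat the two steps below until the centers stop moving."""
    for _ in range(max_iter):
        u = memberships(points, centers, m)         # repeated step 1
        new_centers = update_centers(points, u, m)  # repeated step 2
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)


points = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
centers, u = fuzzy_c_means(points, [0.0, 5.0], m=2.0)

# Same point, same centers, two fuzzifier values.
soft = memberships([4.0], [2.0, 9.0], m=2.0)[0]
nearly_hard = memberships([4.0], [2.0, 9.0], m=1.1)[0]
```

Only the two steps inside the loop are repeated; choosing the number of clusters and the starting centers happens once. As m approaches 1, the memberships approach 0 or 1, so each point effectively belongs to its nearest center alone, which is the k-means assignment rule. That is the sense in which k-means is the "ordinary logic" case.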
The grid-based clustering section has no real references and is poorly described in comparison to the rest of the article.