This article is rated Start-class on Wikipedia's content assessment scale. It is of interest to the following WikiProjects:
The comment related to RapidMiner states that the point closest to the mean is *not* the medoid. This statement is wrong! A proof that the point closest to the mean is a medoid (a sample may have several!) is given in https://dx.doi.org/10.13140/2.1.4453.2009
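For readers following this thread, here is a minimal, self-contained sketch of the two quantities being debated: the medoid (the data point with the smallest total distance to all other points) versus the data point closest to the mean. The data and distance choice are invented for illustration; the example shows the two coinciding on one toy sample, which neither proves nor refutes the general claim above.

```python
# Toy data: three 2-D points; (10, 0) is a mild outlier.
points = [(1.0, 1.0), (2.0, 2.0), (10.0, 0.0)]

def euclid(p, q):
    """Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

# Medoid: the data point with the smallest total distance to all others.
medoid = min(points, key=lambda p: sum(euclid(p, q) for q in points))

# Data point closest to the mean (the mean itself need not be a data point).
mean = tuple(sum(c) / len(points) for c in zip(*points))
closest_to_mean = min(points, key=lambda p: euclid(p, mean))

print(medoid, closest_to_mean)  # here both are (2.0, 2.0)
```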
This article is very confusing and not a reliable explanation of Partitioning Around Medoids/K-Medoids. It needs to be edited and the explanation made clearer. —Preceding unsigned comment added by Kdambiec ( talk • contribs) 08:52, 1 May 2008 (UTC)
Added an explanation of how the cost is calculated. The Minkowski distance metric is used to calculate the distance, with an r value of 1. This is sometimes referred to as the city-block distance or the Manhattan distance. -- Xinunus ( talk) 20:49, 22 June 2008 (UTC)
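The cost calculation described in the comment above can be sketched as follows. This is an illustrative reading of the comment, not code from the article or any cited source: Minkowski distance with r = 1 (Manhattan distance), and the clustering cost as the sum of each point's distance to its nearest medoid.

```python
def minkowski(p, q, r=1):
    """Minkowski distance between two points; r = 1 gives the
    city-block (Manhattan) distance mentioned above."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1.0 / r)

def total_cost(medoids, points):
    """Clustering cost: each point contributes its distance to the
    nearest medoid."""
    return sum(min(minkowski(p, m) for m in medoids) for p in points)

# Example: one medoid at the origin, two points at (1,1) and (2,2).
# Manhattan distances are 2 and 4, so the total cost is 6.
print(total_cost([(0, 0)], [(1, 1), (2, 2)]))  # 6.0
```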
I didn't find it very confusing, although the article should be tidied up into proper paragraphs. This page desperately needs to reference the original academic article or at least a book. Also, I'm not sure that there is a single "k-medoids" algorithm, as a number of clustering methods use a k-medoids approach (e.g. PAM). Finally, if the Minkowski distance with r = 1 is the same as the Manhattan distance, isn't mentioning both redundant? Satyr9 ( talk) 05:22, 28 July 2008 (UTC)
"Cost" is never defined in this article, despite being used extensively in the algorithm. 64.189.203.198 ( talk) 03:13, 9 October 2019 (UTC)
== Wrong algorithm? ==
Isn't this CLARANS (Clustering Large Applications based on RANdomized Search) as proposed by Ng and Han? If I understand correctly, PAM should start with one (central) medoid and then add k−1 more medoids as long as the total cost can be reduced. Then all possible "switches" of medoids with non-medoids are evaluated and carried out. This is of course incredibly slow for large amounts of data, which is why CLARANS randomly selects one non-medoid per medoid and checks for a difference in cost, yielding a non-optimal result if it terminates without having found the minimum cost/distance.
See page 56f in "Knowledge Discovery in Databases" by Martin Ester & Jörg Sander, Springer Verlag 2000 (German, sorry :p) —Preceding unsigned comment added by 95.90.189.48 ( talk) 23:43, 16 April 2009 (UTC)
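To make the distinction discussed above concrete, here is a minimal runnable sketch of PAM's swap phase: every (medoid, non-medoid) pair is tried, and the best cost-reducing swap is applied until no swap improves the cost. This is my own illustrative reconstruction (using Manhattan distance), not code from the cited book; CLARANS would instead sample swaps randomly rather than enumerating them all.

```python
from itertools import product

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def cost(medoids, points):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(manhattan(p, m) for m in medoids) for p in points)

def pam_swap(points, medoids):
    """PAM swap phase: exhaustively try every (medoid, non-medoid)
    swap; apply the best improving swap; stop when none improves."""
    medoids = list(medoids)
    while True:
        best, best_swap = cost(medoids, points), None
        non_medoids = [p for p in points if p not in medoids]
        for i, h in product(range(len(medoids)), non_medoids):
            trial = medoids[:i] + [h] + medoids[i + 1:]
            c = cost(trial, points)
            if c < best:
                best, best_swap = c, (i, h)
        if best_swap is None:
            return medoids
        i, h = best_swap
        medoids[i] = h

# Two obvious clusters on a line; a poor initial pick of medoids
# gets repaired: (0,0) is swapped out for a point near the right cluster.
result = pam_swap([(0, 0), (1, 0), (5, 0), (6, 0)], [(0, 0), (1, 0)])
print(sorted(result))  # [(1, 0), (5, 0)]
```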
Perhaps I misunderstood the procedure, but at the end of the example, wouldn't it go on selecting a random O' until every non-medoid had been tried? Obviously improvements in the other cluster might be possible (4,7 rather than 3,4, for example?). I'll admit, though, that I don't know more about this algorithm than I've read here - but it would seem strange to me for the algorithm to terminate at the first check that doesn't give an improvement. Regards Sean Heron ( talk) 10:35, 7 May 2009 (UTC).
"It is more robust to noise and outliers as compared to k-means" is a strong claim. Without a good citation, I do not think it belongs in the article. 71.198.184.231 ( talk) 20:43, 8 May 2009 (UTC)
In the example, Figures 1.2 and 1.3 show X5 = (6,2) as an outlier instead of associating it with cluster 2, as in the explanation. 115.72.95.218 ( talk) 11:08, 8 June 2014 (UTC)
This edit request by an editor with a conflict of interest was declined.
Conflict of Interest Disclosure: I am the primary author of BanditPAM (Mo Tiwari).
Request to editor: The COI guidelines suggest that if an edit is verifiable and appropriate, it will usually be accepted. In this circumstance, I would argue that the suggested changes are both verifiable and appropriate. Motiwari ( talk) 04:44, 14 December 2020 (UTC)
Closing this edit request as no consensus. If sources are found to support this request, it may be reopened. Altamel ( talk) 04:13, 4 March 2021 (UTC)