This article is rated C-class on Wikipedia's content assessment scale. It is of interest to several WikiProjects.
This page says "In order to use the Mahalanobis distance to classify a test point as belonging to one of N classes, one first estimates the covariance matrix of each class, usually based on samples known to belong to each class. Then, given a test sample, one computes the Mahalanobis distance to each class, and classifies the test point as belonging to that class for which the Mahalanobis distance is minimal."
However, I have two sources that both use the single pooled within-group covariance matrix for computing all distances, instead of a covariance matrix per class.
How to reconcile these two views? I believe the classification section should be rewritten.
dfrankow ( talk) 17:02, 23 December 2008 (UTC)
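For what it's worth, both conventions exist: a covariance matrix per class corresponds to quadratic discriminant analysis, while a single pooled within-group covariance corresponds to linear discriminant analysis. A rough sketch of the two rules side by side (hypothetical toy data; assumes NumPy — this is an illustration of the two conventions, not code from the article):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical classes with different covariances (toy data).
A = rng.multivariate_normal([0, 0], [[1.0, 0.0], [0.0, 1.0]], 500)
B = rng.multivariate_normal([3, 3], [[2.0, 0.5], [0.5, 1.0]], 500)

def sq_mahalanobis(x, mean, cov):
    d = np.asarray(x, float) - mean
    return float(d @ np.linalg.inv(cov) @ d)

x = np.array([1.4, 1.6])  # a test point near the boundary

# Per-class covariances, as in the article's text:
means = [A.mean(axis=0), B.mean(axis=0)]
covs = [np.cov(A, rowvar=False), np.cov(B, rowvar=False)]
per_class = [sq_mahalanobis(x, m, c) for m, c in zip(means, covs)]

# Pooled within-group covariance, as in the sources mentioned above:
pooled = (np.cov(A, rowvar=False) * (len(A) - 1)
          + np.cov(B, rowvar=False) * (len(B) - 1)) / (len(A) + len(B) - 2)
pooled_dist = [sq_mahalanobis(x, m, pooled) for m in means]

# Each rule assigns x to the class with the smaller distance; the two rules
# can disagree for points near the boundary when the covariances differ.
print(np.argmin(per_class), np.argmin(pooled_dist))
```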
Statistical leverage links here. Is that appropriate? It seems to me that leverage is a broader concept than this article covers. — DIV ( 128.250.204.118 07:46, 9 July 2007 (UTC))
That intuitive explanation was very helpful, great work. Matumio ( talk) 09:34, 21 December 2007 (UTC)
The covariance matrix is usually called Σ rather than S. Calling it S hampers readability. I am changing it unless there is a specific reason for calling it S. Sourangshu ( talk) 13:34, 25 February 2008 (UTC)
I understand the desire to use a symbol for covariance that is not confused with 'sum', but every other article makes do, including the one for covariance. For that reason, I prefer Σ over S. 192.35.35.34 ( talk) 23:27, 13 March 2009 (UTC)
Cov(x) = Σ. Cov(·) is a function of x; Σ is the result of applying that function to x. The usual distinction between Σ and S is that the former denotes a population value and the latter a sample estimate (in statistics). In the present case, that distinction appears moot. Kmarkus ( talk) 13:49, 18 September 2009 (UTC)
The text ends by saying that using it is the same as finding the group of maximum probability. Isn't it maximum likelihood? And AFAIK that holds only if the distributions are the same, with radial symmetry, and if the groups are assumed to occur with equal probability. -- NIC1138 ( talk) 20:21, 29 October 2008 (UTC)
User:Aetheling removed the text. Aetheling, can you please explain why? -- Pot ( talk) 11:58, 27 May 2010 (UTC)
Adding Template:Refimprove banner. I recently added the main reference to Mahalanobis, 1936, but I just noticed it had been previously removed for some obscure reason (the article obviously exists, even if it is hard to find; you need to register with the journal). More generally, this article is insufficiently sourced. Intuitive explanation and Relationship to leverage should have at least one reference each. Each application should be referenced. Calimo ( talk) 09:57, 7 December 2008 (UTC)
Why not? I had added this and it was removed: Hotelling T2 Distribution, The MathWorks. Retrieved on 2008-12-16. -- Pot ( talk) 14:10, 16 December 2008 (UTC)
Suppose I have two uncorrelated 2D points with zero mean but with different variance, x and y. Suppose Cov(x) = diag(1, 100) and Cov(y) = diag(100, 1).
As I understand Mahalanobis distance, if I had x = [0.1, 1] it would have a Mahalanobis distance of 0.1414 from the origin. Likewise if y = [1, 0.1] it would have a Mahalanobis distance of 0.1414 from the origin (would that be a "Mahalanobis norm"?), but y = [0.1, 1] would have a Mahalanobis distance of 1.0 from the origin. What is the Mahalanobis distance between a given x and y? My feeling is that this intuitively measures the unlikelihood of the separation between x and y. As such, it should have something to do with the expected value of x − y. Any thoughts? In particular, I'm guessing that we can let d = x − y and use the rules for sums of normally distributed random variables to find that Cov(d) = Cov(x) + Cov(y) = diag(101, 101), and so the Mahalanobis distance between x and y would be sqrt((x − y)ᵀ (Cov(x) + Cov(y))⁻¹ (x − y)).
Does that sound right? —Ben FrantzDale ( talk) 18:02, 30 March 2009 (UTC)
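A quick numerical check of the figures above (assumes NumPy; the covariances diag(1, 100) and diag(100, 1) are inferred from the quoted distances, and the last step is my reading of the proposal for independent x and y, not an established result):

```python
import numpy as np

def mahalanobis(u, v, cov):
    """Mahalanobis distance between points u and v under covariance cov."""
    d = np.asarray(u, float) - np.asarray(v, float)
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

# Covariances implied by the distances quoted in the question:
cov_x = np.diag([1.0, 100.0])
cov_y = np.diag([100.0, 1.0])

origin = [0.0, 0.0]
print(round(mahalanobis([0.1, 1.0], origin, cov_x), 4))  # ~0.1414
print(round(mahalanobis([1.0, 0.1], origin, cov_y), 4))  # ~0.1414
print(round(mahalanobis([0.1, 1.0], origin, cov_y), 4))  # ~1.0

# Proposed distance between x and y: use the covariance of the difference,
# cov_x + cov_y (valid when x and y are independent).
x, y = [0.1, 1.0], [1.0, 0.1]
d_xy = mahalanobis(x, y, cov_x + cov_y)
print(round(d_xy, 4))
```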
I'm not editing the page, because I don't know if it's _true_ or not. But this page http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_mahalanobis.htm claims that the distribution of the Mahalanobis distance squared is the chi-square distribution with p degrees of freedom. Is this so? (I've experimentally tried it, and the values seem to work, but I know enough to know that that doesn't necessarily prove anything...) If it _is_ true, though, it might be good to add it to the article. 68.174.98.250 ( talk) 18:02, 13 March 2010 (UTC)
I don't think the current statement about the distribution of the Mahalanobis distance is correct. The link above is broken so I cannot check it, but consider the example of two independent d-dimensional random vectors X and Y with identical means and a common diagonal covariance matrix Σ = σ²I. Then

d²(X, Y) = (X − Y)ᵀ Σ⁻¹ (X − Y) = Σᵢ ((Xᵢ − Yᵢ)/σ)².

This would have a chi-squared distribution with d degrees of freedom if the terms in brackets were standard normally distributed. But they are not! Both Xᵢ and Yᵢ have variance σ², so their difference has variance 2σ², which means each term in brackets has variance 2 and not 1. I think the correct statement is that d²(X, μ) is chi-squared-distributed with d degrees of freedom, since there we are subtracting the true mean and not another random vector. Based on the above reasoning, I also think that d²(X, Y)/2 is chi-squared-distributed with d degrees of freedom. Unfortunately I don't have a reference for any of this. Any objections to changing the article accordingly? MWiebusch78 10:41, 12 June 2019 (UTC)
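A simulation supports this reasoning (assumes NumPy; the dimension and variance are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
d, sigma2, n = 3, 4.0, 200_000

# Independent d-dimensional samples, mean 0, covariance sigma2 * I.
X = rng.normal(0.0, np.sqrt(sigma2), size=(n, d))
Y = rng.normal(0.0, np.sqrt(sigma2), size=(n, d))

# Squared Mahalanobis distance to the true mean: chi-squared with d dof,
# so its sample mean should be close to d = 3.
d2_true_mean = np.sum(X ** 2 / sigma2, axis=1)
print(d2_true_mean.mean())

# Squared Mahalanobis distance between two independent samples, still using
# Sigma in the quadratic form: its mean is about 2*d, so it is chi-squared
# with d dof only after dividing by 2, as argued above.
d2_pair = np.sum((X - Y) ** 2 / sigma2, axis=1)
print(d2_pair.mean())
print((d2_pair / 2).mean())
```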
I find this article obtuse. Is it an overgeneralization to say that Mahalanobis distance is just the multidimensional generalization of "how many standard deviations is x from the mean?" —Ben FrantzDale ( talk) 18:33, 29 May 2010 (UTC)
What you suggest sounds like "normalised" or "standardised" Euclidean distance and, unlike Mahalanobis distance, would not take into account covariance between dimensions. (That is my understanding, at least.) —Preceding unsigned comment added by 124.184.162.72 ( talk) 23:32, 3 February 2011 (UTC)
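A toy example of the difference (assumes NumPy; the covariance values are made up for illustration). Standardized Euclidean distance divides each coordinate by its own standard deviation only, so it cannot distinguish a point lying along the correlation from one lying against it, while Mahalanobis distance can:

```python
import numpy as np

# Strongly correlated 2D covariance (hypothetical numbers).
S = np.array([[1.0, 0.9],
              [0.9, 1.0]])
S_inv = np.linalg.inv(S)

def mahalanobis(x):
    x = np.asarray(x, float)
    return float(np.sqrt(x @ S_inv @ x))

def standardized_euclidean(x):
    # Divides by the per-coordinate variances only;
    # ignores the off-diagonal covariance.
    x = np.asarray(x, float)
    return float(np.sqrt(np.sum(x ** 2 / np.diag(S))))

along = [1.0, 1.0]     # lies along the correlation
against = [1.0, -1.0]  # lies against it

# Standardized Euclidean cannot tell these apart...
print(standardized_euclidean(along), standardized_euclidean(against))
# ...but Mahalanobis distance can: the "against" point is far less likely.
print(mahalanobis(along), mahalanobis(against))
```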
The current link to the original reference, Mahalanobis (1936), is not working.
I got the paper (for free) from http://www.insa.ac.in/insa_pdf/20005b8c_49.pdf but this did require registration (and I am not sure the link will work without registration, or how to test that; it does at least work for me in a different browser than the one I used to register). —Preceding unsigned comment added by 124.184.162.72 ( talk) 23:30, 3 February 2011 (UTC)
How do you calculate Mahalanobis distance when the covariance matrix has determinant = 0 (can't be inverted)? ( talk) 0:31, 27 February 2012 (UTC)
The inversion problem seems to be an inherent issue with covariance matrices. Maybe someone with expertise in the field could explain the challenges and introduce the pseudoinverse as a solution? — Preceding unsigned comment added by 89.182.26.147 ( talk) 20:25, 15 January 2014 (UTC)
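A sketch of the pseudoinverse workaround suggested above (assumes NumPy; this is one common remedy, not something the article currently prescribes):

```python
import numpy as np

rng = np.random.default_rng(1)

# Rank-deficient data: the second coordinate is an exact copy of the first,
# so the sample covariance matrix S is singular and has no inverse.
a = rng.normal(size=500)
data = np.column_stack([a, a])
S = np.cov(data, rowvar=False)
print(np.linalg.det(S))  # essentially 0

# Workaround: replace the inverse with the Moore-Penrose pseudoinverse,
# which inverts S on its range and is zero on its nullspace.
S_pinv = np.linalg.pinv(S)
mu = data.mean(axis=0)

def mahalanobis_pinv(x):
    d = np.asarray(x, float) - mu
    # Clamp tiny negative values caused by floating-point round-off.
    return float(np.sqrt(max(d @ S_pinv @ d, 0.0)))

print(mahalanobis_pinv(data[0]))
```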
I'm not sure I agree with the characterization of Euclidean distance. Mahalanobis distance is clearly just the metric induced by the inner product defined by the nonnegative-definite matrix S^-1. Does anyone know why a concept that had been around in mathematics since about the time of Cauchy or before got named for a statistician in the 1930s?
briardew ( talk) 17:29, 16 August 2012 (UTC)
Maybe a numerical example should be added to the article? Carstensen ( talk) 17:48, 10 November 2012 (UTC)
This name "Mahalanobis distance" is completely non-standard. It is a simple concept, showing up frequently in analyses of a dependent variable with correlated errors, and it has been around since long before this person. Why should a very old concept be named after somebody from the 20th century? The article is of poor quality and makes statements referring to this as "Mahalanobis' discovery," which is patently false.
This contains the following: "we will get an equation for a metric that looks a lot like the Mahalanobis distance". What does this mean? "Looks a lot like" is vague and completely unhelpful; either it is the same expression or it isn't. If it isn't, the difference should be explicated. Perhaps an expert could rewrite this section clearly; otherwise I think it would be better deleted, but I thought I'd comment here before doing that.
Also perhaps the section should have a more specific heading than "Discussion" - it's far from being a general discussion. — Preceding unsigned comment added by 213.162.107.11 ( talk) 11:57, 2 June 2014 (UTC)
It is extremely necessary. Currently I am unable to understand it. BurstPower ( talk) 13:08, 18 November 2015 (UTC)
This section concludes with an opaque statement that "if the data has a nontrivial nullspace, Mahalanobis distance can be computed after projecting the data (non-degenerately) down onto any space of the appropriate dimension for the data." It would really help if someone who understands this statement could rephrase it more precisely while linking to pages that define any necessary jargon. An equation would be ideal. Carroll.ian ( talk) 04:10, 20 April 2017 (UTC)
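My reading of that quoted statement, as a sketch (assumes NumPy; the construction via the eigenvectors of S is my interpretation, not sourced from the article): project the data onto the eigenvectors of S with nonzero eigenvalues, where the covariance becomes full-rank and invertible. This agrees with the pseudoinverse definition.

```python
import numpy as np

rng = np.random.default_rng(2)

# 3D data that actually lives on a 2D plane, so its covariance matrix has a
# nontrivial nullspace (one zero eigenvalue).
B = rng.normal(size=(3, 2))            # basis of the plane (hypothetical)
data = rng.normal(size=(1000, 2)) @ B.T
mu = data.mean(axis=0)
S = np.cov(data, rowvar=False)

# "Projecting down onto a space of the appropriate dimension": keep only the
# eigenvectors of S with nonzero eigenvalues and work in those coordinates.
w, V = np.linalg.eigh(S)
keep = w > 1e-10 * w.max()
P = V[:, keep]                         # columns span the data's support
S_proj = np.diag(w[keep])              # projected covariance (diagonal here)

def mahalanobis_projected(x):
    z = P.T @ (np.asarray(x, float) - mu)
    return float(np.sqrt(z @ np.linalg.inv(S_proj) @ z))

# Sanity check: the pseudoinverse route gives the same number.
S_pinv = np.linalg.pinv(S)
x0 = data[0]
d0 = x0 - mu
print(mahalanobis_projected(x0), float(np.sqrt(d0 @ S_pinv @ d0)))
```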
Often you want to know: what is the probability that a sample has a Mahalanobis distance greater than R? I think this should be discussed in the article, including how it maps to sigma values of a univariate distribution, and with reference to confidence regions. Cesiumfrog ( talk) 01:35, 10 November 2017 (UTC)
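That tail probability follows directly from the chi-squared distribution of the squared distance, assuming the distance is measured from the true mean with the true covariance (sketch below assumes SciPy for the distribution functions):

```python
from scipy.stats import chi2, norm

def prob_exceeds(R, d):
    """P(Mahalanobis distance > R) for a d-dimensional Gaussian sample,
    using the fact that the squared distance is chi-squared with d dof."""
    return float(chi2.sf(R ** 2, d))

# In one dimension this reduces to the familiar two-sided "k sigma" tail:
for k in (1, 2, 3):
    print(k, prob_exceeds(k, 1), 2 * norm.sf(k))

# In higher dimensions the same radius is exceeded more often:
print(prob_exceeds(2.0, 2))   # exp(-2), about 0.135 in 2D vs about 0.046 in 1D
```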
A search at commons found the following diagrams:
However, since their descriptions (click on images to see them) are in Polish only, some expert is needed to select the most appropriate one and devise a good caption. - Jochen Burghardt ( talk) 19:05, 28 March 2019 (UTC)