Linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and k-nearest neighbors (KNN), along with logistic regression, are widely used machine learning methods for classification problems.
In this study, I compare these models on the Football data set. Since the response has three classes, we cannot use standard (binary) logistic regression, and I am not interested in using multinomial logistic regression.
LDA uses Bayes' theorem for classification: we assume that the predictors within each class are drawn from a Gaussian distribution with a class-specific mean vector and a covariance matrix that is common to all classes.
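As a rough sketch of the resulting decision rule (assuming the class mean mu_k, the shared covariance Sigma, and the prior pi_k have already been estimated from the training data; the function name below is just for illustration):

```python
import numpy as np

def lda_score(x, mu_k, sigma, prior_k):
    """Linear discriminant score delta_k(x) for class k.

    x       : (p,) new observation
    mu_k    : (p,) estimated mean of class k
    sigma   : (p, p) covariance matrix shared by all classes
    prior_k : estimated prior probability of class k
    """
    sigma_inv = np.linalg.inv(sigma)
    return x @ sigma_inv @ mu_k - 0.5 * mu_k @ sigma_inv @ mu_k + np.log(prior_k)

# The observation is assigned to the class with the largest score.
# Because x enters only through the term x @ sigma_inv @ mu_k, the
# decision boundaries between classes are linear in x.
```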
QDA also assumes that the predictors are drawn from a Gaussian distribution; however, QDA allows each class to have its own covariance matrix. That is, it assumes that an observation from the kth class is of the form X ~ N(mu_k, Sigma_k), where Sigma_k is the covariance matrix for the kth class.
Because LDA uses a shared covariance matrix, its discriminant function is linear in the predictors, while QDA, by estimating a separate covariance matrix for each class, ends up with a quadratic discriminant function.
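A matching sketch for QDA, under the same assumptions but with a per-class covariance estimate, shows where the quadratic term comes from:

```python
import numpy as np

def qda_score(x, mu_k, sigma_k, prior_k):
    """Quadratic discriminant score delta_k(x) for class k.

    sigma_k is estimated from class k alone, so the quadratic form
    (x - mu_k)^T sigma_k^{-1} (x - mu_k) no longer cancels between
    classes, and the decision boundaries become quadratic in x.
    """
    diff = x - mu_k
    _, logdet = np.linalg.slogdet(sigma_k)  # log|sigma_k|, computed stably
    return -0.5 * logdet - 0.5 * diff @ np.linalg.inv(sigma_k) @ diff + np.log(prior_k)
```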
KNN, however, takes a completely different approach: it is non-parametric and uses feature similarity, predicting the class of an observation from the classes of its nearest neighbors in feature space.
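As a toy illustration (the numbers and labels below are made up and are not taken from the football data), scikit-learn's KNeighborsClassifier assigns a new point the majority class among its k closest training points:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Four made-up training points with two features each, plus group labels.
X_train = np.array([[15.0, 57.0], [15.5, 58.0], [16.5, 60.5], [17.0, 61.0]])
y_train = np.array(["Group3", "Group3", "Group1", "Group1"])

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# The 3 nearest neighbors of this point are two Group1 points and one
# Group3 point, so the majority vote predicts Group1.
print(knn.predict([[16.8, 60.0]]))
```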
Football Dataset
This data set has 90 observations on 6 predictors.
GROUPS:
Group1: High school football players
Group2: College football players
Group3: Non-football players
MEASUREMENTS:
WDIM: head width at widest dimension
CIRCUM: head circumference
FBEYE: front-to-back measurement at eye level
EYEHD: eye to top of head measurement
EARHD: ear to top of head measurement
JAW: jaw width
MODEL RESULT:
LDA and QDA both give a 64% prediction accuracy on the test data, while KNN reaches 76% accuracy on the test data.
So the winner is KNN.
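For context, here is a minimal sketch of how such a comparison could be run in scikit-learn. The file name football.csv, the GROUP column name, the 70/30 split, and k = 5 are assumptions for illustration, not the exact setup used above, so the resulting accuracies may differ:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                            QuadraticDiscriminantAnalysis)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score

# Hypothetical file and column names based on the description above.
df = pd.read_csv("football.csv")
X = df[["WDIM", "CIRCUM", "FBEYE", "EYEHD", "EARHD", "JAW"]]
y = df["GROUP"]  # Group1 / Group2 / Group3

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1)

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
    # Standardize before KNN so each head measurement contributes
    # comparably to the distance calculation.
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {acc:.2f}")
```

Standardizing the predictors matters only for KNN, since it relies on distances between observations; LDA and QDA predictions are unaffected by linear rescaling of the features.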