R Data Analysis Assignment

Assignment 2
Machine Learning and Big Data for Economics and Finance
Classification exercise
Consider the three variables in the dataset Assign2.csv. We are interested in predicting the third variable given the first two variables as inputs.
1. Plot the data on a figure with the first variable on the x-axis, the second variable on the y-axis, and the color of each point determined by the third variable.
2. Fit a linear model to the data. Produce a confusion matrix to show how well the model fits the data.
3. Repeat the same exercise for each of the following:
a. Logistic regression.
b. Linear Discriminant Analysis.
c. Quadratic Discriminant Analysis.
4. Fit the model by k-nearest neighbor classification with k = 1, ..., 20. Produce a confusion matrix for each k.
5. Choose between all 24 methods (the linear model, the three methods in item 3, and the 20 k-nearest neighbor fits) by using 10-fold cross-validation. Try to justify the results based on your intuition regarding the data.
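
The tasks above can be sketched in R roughly as follows. This is a minimal illustration under stated assumptions, not a model solution. The contents of Assign2.csv are not shown in the assignment, so the block uses synthetic stand-in data. The column names `x1`, `x2`, `y`, the 0/1 coding of the third variable, the 0.5 cutoff for the linear and logistic fits, and the helper names `predict_class` and `confusion` are all hypothetical. It relies on `lda()` and `qda()` from the MASS package and `knn()` from the class package.

```r
library(MASS)   # lda(), qda()
library(class)  # knn()

set.seed(1)
n <- 200
# Synthetic stand-in data (assumption): the real file would presumably be
# loaded with read.csv("Assign2.csv") and its columns renamed to x1, x2, y.
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- as.integer(dat$x1^2 + dat$x2^2 + rnorm(n, sd = 0.3) > 1.5)

# Fit one method on `train` and return 0/1 predictions for `test`.
predict_class <- function(method, train, test, k = 1) {
  switch(method,
    linear   = as.integer(predict(lm(y ~ x1 + x2, data = train), test) > 0.5),
    logistic = as.integer(predict(glm(y ~ x1 + x2, data = train, family = binomial),
                                  test, type = "response") > 0.5),
    lda      = as.integer(as.character(
                 predict(lda(y ~ x1 + x2, data = train), test)$class)),
    qda      = as.integer(as.character(
                 predict(qda(y ~ x1 + x2, data = train), test)$class)),
    knn      = as.integer(as.character(
                 knn(train[, c("x1", "x2")], test[, c("x1", "x2")],
                     cl = factor(train$y), k = k)))
  )
}

# Confusion matrix: rows are actual classes, columns are predicted classes.
confusion <- function(actual, predicted) {
  table(actual = factor(actual, levels = 0:1),
        predicted = factor(predicted, levels = 0:1))
}

# The 24 candidates: 4 parametric methods plus k-NN with k = 1, ..., 20.
methods <- rbind(
  data.frame(method = c("linear", "logistic", "lda", "qda"), k = NA,
             stringsAsFactors = FALSE),
  data.frame(method = "knn", k = 1:20, stringsAsFactors = FALSE)
)

# Tasks 2-4: in-sample confusion matrices (fit and predict on the same data).
in_sample <- lapply(seq_len(nrow(methods)), function(i) {
  confusion(dat$y, predict_class(methods$method[i], dat, dat, methods$k[i]))
})

# Task 5: 10-fold cross-validated misclassification rate for each candidate.
folds <- sample(rep(1:10, length.out = n))
cv_error <- sapply(seq_len(nrow(methods)), function(i) {
  mean(sapply(1:10, function(f) {
    train <- dat[folds != f, ]
    test  <- dat[folds == f, ]
    mean(predict_class(methods$method[i], train, test, methods$k[i]) != test$y)
  }))
})
methods$cv_error <- cv_error
print(methods[which.min(cv_error), ])
```

For task 1, a call such as `plot(dat$x1, dat$x2, col = dat$y + 2, pch = 19)` would color the points by class. Note that the in-sample confusion matrix for k = 1 is necessarily perfect when no two observations coincide, since every point is its own nearest neighbor. This is one reason the in-sample matrices cannot be used to choose between methods, and held-out error from cross-validation is used instead.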