
小强爱英语 · May 29, 2023

Could you walk through this question as a whole?

NO.PZ2021083101000016

Question:

Achler splits the DTM into training, cross-validation, and test datasets. Achler uses a supervised learning approach to train the logistic regression model in predicting sentiment. Applying the receiver operating characteristics (ROC) technique and area under the curve (AUC) metrics, Achler evaluates model performance on both the training and the cross-validation datasets. The trained model performance for three different logistic regressions’ threshold p-values is presented in Exhibit 3.

Based on Exhibit 3, which threshold p-value indicates the best fitting model?

Options:

A. 0.57
B. 0.79
C. 0.84

Explanation:

B is correct. The higher the AUC, the better the model performance. For the threshold p-value of 0.79, the AUC is 91.3% on the training dataset and 89.7% on the cross-validation dataset, and the ROC curves are similar for model performance on both datasets. These findings suggest that the model performs similarly on both training and CV data and thus indicate a good-fitting model.

A is incorrect because for the threshold p-value of 0.57, the AUC is 56.7% on the training dataset and 57.3% on the cross-validation dataset. An AUC close to 50% signifies random guessing on both the training dataset and the cross-validation dataset. The implication is that for the threshold p-value of 0.57, the model is randomly guessing and is not performing well.

C is incorrect because for the threshold p-value of 0.84, there is a substantial difference between the AUC on the training dataset (98.4%) and the AUC on the cross-validation dataset (87.1%). This suggests that the model performs comparatively poorly (with a higher rate of error or misclassification) on the cross-validation dataset when compared with the training data. Thus, the implication is that the model is overfitted.

Topic: Model Training: Performance Evaluation
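To make the train-versus-CV AUC comparison concrete, here is a minimal Python sketch using scikit-learn on synthetic data. Everything in it (the synthetic dataset standing in for the DTM, the model settings, and the 0.05 gap cutoff) is an illustrative assumption and is not part of the original question or the official answer.

```python
# Minimal sketch (illustrative only): compare training vs. cross-validation AUC
# for a logistic regression classifier, in the spirit of the question above.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the document-term-matrix (DTM) features and labels.
X, y = make_classification(n_samples=2000, n_features=50, random_state=0)
X_train, X_cv, y_train, y_cv = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# AUC is computed from predicted class-1 probabilities, not from hard labels.
auc_train = roc_auc_score(y_train, model.predict_proba(X_train)[:, 1])
auc_cv = roc_auc_score(y_cv, model.predict_proba(X_cv)[:, 1])

print(f"Training AUC:         {auc_train:.3f}")
print(f"Cross-validation AUC: {auc_cv:.3f}")

# A training AUC far above the CV AUC suggests overfitting; the 0.05 cutoff
# here is an arbitrary illustrative choice, not a rule from the reading.
if auc_train - auc_cv > 0.05:
    print("Warning: possible overfitting (training AUC >> CV AUC).")
```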

I just don't know what that p-value is for, and also to what level these two probabilities have to add up before it counts as overfitting. I don't get it.

1 answer

星星_品职助教 · May 29, 2023

Hi,

1) This question does not require you to interpret the p-values. The three p-values could just as well be relabeled as, say, one, two, and three.

2) There is no fixed cutoff for the percentages in the table; you analyze and eliminate by comparing the options against each other.

Below, I will refer to Row 1, Row 2, and Row 3 in place of the p-values:

In Row 1, the percentage for the training set is far too low, which means the model performs poorly even on the training data set it was built on, so it can be eliminated right away.

In Rows 2 and 3, the training-set percentages are both high, so both models perform well on the training set. We therefore need to compare their results on the cross-validation set (CV set).

Compared with Row 2, Row 3 shows a much larger gap between the AUC on the training set and the AUC on the CV set. This means the model performs well only on the training set and (relatively) poorly on the CV set, which is the signature of overfitting: the model fits the existing data well but predicts poorly on new data.

In Row 2, the gap between the training-set and CV-set percentages is small, so the model performs well, and about equally well, on both sets; it is therefore the "best fitting model".

Since the p-value for Row 2 is 0.79, choose option B (0.79).
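As a small illustration of the elimination logic described above, the sketch below applies it to the three AUC pairs quoted in the official explanation. The two cutoffs used (below 0.60 treated as near-random guessing, and a 0.05 train-versus-CV gap treated as overfitting) are illustrative assumptions, not rules from the reading.

```python
# Sketch of the elimination logic, using the AUC values quoted in the
# official explanation. The cutoffs 0.60 and 0.05 are illustrative only.
rows = {
    0.57: (0.567, 0.573),  # threshold p-value: (training AUC, CV AUC)
    0.79: (0.913, 0.897),
    0.84: (0.984, 0.871),
}

best_p, best_gap = None, None
for p, (auc_train, auc_cv) in rows.items():
    if auc_train < 0.60:      # near 50%: essentially random guessing -> discard
        continue
    gap = auc_train - auc_cv
    if gap > 0.05:            # training AUC far above CV AUC -> likely overfit
        continue
    if best_gap is None or gap < best_gap:
        best_p, best_gap = p, gap

print(f"Best fitting model: threshold p-value = {best_p}")  # expected: 0.79
```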


Related questions

NO.PZ2021083101000016: Why isn't the best model the one for which the two AUC values add up to the largest total?

2022-12-13 21:27 · 1 answer

NO.PZ2021083101000016: Isn't a higher AUC, closer to 1, always better? How should B and C be weighed against each other?

2022-08-14 18:13 · 1 answer