FrankSun · October 9, 2021

Could you please explain this one?

NO.PZ2021083101000015

Question:

Achler has the data ready for the model training process. Rivera asks Achler to include start-up failure rates as a feature. Achler notices that the number of start-ups that fail (majority class) is significantly larger than the number of the start-ups that are successful (minority class).

Achler is concerned that because of class imbalance, the model will not be able to discriminate between start-ups that fail and start-ups that are successful.

Achler’s model training concern related to the model’s ability to discriminate could be addressed by randomly:

Options:

A. oversampling the failed start-up data

B. oversampling the successful start-up data

C. undersampling the successful start-up data

Explanation:

B is correct.

Achler is concerned about class imbalance, which can be resolved by balancing the training data. The majority class (the failed start-up data) can be randomly undersampled, or the minority class (the successful start-up data) can be randomly oversampled.

Topic: Model Training: Model Selection

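This question seems to duplicate an earlier one, but I am still quite confused by it. If it has already been asked, please ignore this. If not, could you walk through it? Thanks.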

1 answer
Accepted answer

星星_品职助教 · October 11, 2021

Hello,

This is a sub-question split out of a case. The original answer is copied below:

---------------

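This question is about class imbalance: the two classes of data are unbalanced, with one class containing far more observations than the other. A model trained on data this unbalanced will have trouble telling the two classes apart.

The fix is to oversample the smaller class (the minority class) or to undersample the larger class (the majority class).

Here the failed start-up data is the majority class, so it is the one to undersample; the successful start-up data is the minority class, so it is the one to oversample. Option A oversamples the majority class and option C undersamples the minority class, and both would make the imbalance worse, so by elimination the answer is B.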
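To make the two fixes concrete, here is a minimal sketch in plain Python. It is only an illustration, not the method used in the case: the toy data (90 failed and 10 successful start-ups), the helper names `split_by_label`, `random_oversample`, and `random_undersample`, and the fixed seed are all assumptions made for the example.

```python
import random
from collections import Counter

# Hypothetical toy data: 90 failed start-ups (majority), 10 successful (minority).
data = [({"id": i}, "failed") for i in range(90)] + \
       [({"id": i}, "successful") for i in range(90, 100)]

def split_by_label(rows, label):
    """Return (rows carrying this label, all other rows)."""
    match = [r for r in rows if r[1] == label]
    rest = [r for r in rows if r[1] != label]
    return match, rest

def random_oversample(rows, minority, rng):
    """Add random duplicates of minority rows until both classes are equal in size."""
    minority_rows, majority_rows = split_by_label(rows, minority)
    n_extra = max(0, len(majority_rows) - len(minority_rows))
    extra = rng.choices(minority_rows, k=n_extra)  # sampling with replacement
    return majority_rows + minority_rows + extra

def random_undersample(rows, majority, rng):
    """Keep a random subset of majority rows, equal in size to the minority class."""
    majority_rows, minority_rows = split_by_label(rows, majority)
    n_keep = min(len(majority_rows), len(minority_rows))
    kept = rng.sample(majority_rows, k=n_keep)  # sampling without replacement
    return kept + minority_rows

rng = random.Random(0)  # fixed seed so the sketch is reproducible
oversampled = random_oversample(data, "successful", rng)   # what option B describes
undersampled = random_undersample(data, "failed", rng)     # the other valid fix

print(Counter(label for _, label in data))
print(Counter(label for _, label in oversampled))
print(Counter(label for _, label in undersampled))
```

With this toy data, both resampled sets come out balanced: 90 rows per class after oversampling the successful start-ups, and 10 rows per class after undersampling the failed ones. Swapping the class names in either call would reproduce options A and C, which push the counts further apart or leave nothing to train on.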
