AnnaZ · January 10, 2023

I don't understand the question

NO.PZ2015120204000032

The question is as follows:

Paul suggests the following step which would be repeated every quarter.

Step 2 We utilize ML techniques to divide our investable universe of about 10,000 stocks into 20 different groups, based on a wide variety of the most relevant financial and non-financial characteristics. The idea is to prevent unintended portfolio concentration by selecting stocks from each of these distinct groups.

The hyperparameter in the ML model to be used for accomplishing Step 2 is?

Options:

A. 100, the number of small-cap stocks in Alef’s portfolio.

B. 10,000, the eligible universe of small-cap stocks in which Alef can potentially invest.

C. 20, the number of different groups (i.e. clusters) into which the eligible universe of small-cap stocks will be divided.

Explanation:

C is correct. Here, 20 is a hyperparameter (in the K-Means algorithm), which is a parameter whose value must be set by the researcher before learning begins.

A is incorrect, because it is not a hyperparameter. It is just the size (number of stocks) of Alef’s portfolio.

B is incorrect, because it is not a hyperparameter. It is just the size (number of stocks) of Alef’s eligible universe.
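To make the distinction between a hyperparameter and a learned parameter concrete, here is a minimal pure-Python sketch of k-means. It is an illustration only, not the method used in the question: the function name `k_means`, the toy data, and the two made-up features per stock are all assumptions. The point is that `K = 20` is fixed by the researcher before training, while the centroids are what the algorithm learns.

```python
import random

def k_means(points, k, n_iter=50, seed=0):
    """Minimal k-means sketch: k is fixed by the caller, centroids are learned."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels

# Hypothetical toy universe: 200 "stocks", each with 2 made-up features.
data_rng = random.Random(42)
stocks = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]

K = 20  # hyperparameter: chosen by the researcher before learning begins
centroids, labels = k_means(stocks, K)  # centroids and labels are learned
```

The sizes of the data (200 here, 10,000 in the question) and of the final portfolio never enter the algorithm as settings, which is why options A and B are not hyperparameters.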

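I don't understand the question. How can you tell that it uses the KNN algorithm, and from there work out what value the parameter should take?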

1 answer

星星_品职助教 · January 11, 2023

Hello,

1) This question is about K-means clustering, not KNN;

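2) The options already hint at "clusters", so clustering algorithms should be considered first. Among them, hierarchical clustering produces intermediate levels, whereas this question divides the 10,000 stocks directly into 20 groups with no intermediate levels. The algorithm is therefore K-means clustering;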

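3) Without the "clusters" hint, you can judge from the description in the question stem. It describes grouping the stocks directly from given feature values, and the resulting groups carry no labels. That rules out supervised machine learning algorithms such as KNN. Among unsupervised ML algorithms, the only candidates to consider here are PCA and clustering; PCA is a dimension-reduction technique and is unrelated to this question. Within clustering, the absence of intermediate levels rules out hierarchical clustering. The only algorithm that fits the description is K-means clustering.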
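As a rough illustration of why KNN is ruled out, the sketch below shows that a KNN classifier cannot run without labelled training data, which the question never provides. The function `knn_predict`, the toy points, and the "value"/"growth" labels are hypothetical and chosen only for this example. Note also that `k` in KNN is the number of neighbours that vote, not the number of groups.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Classify `query` by majority vote among its k nearest labelled neighbours."""
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labelled data: KNN needs train_y, i.e. known labels.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
train_y = ["value", "value", "value", "growth", "growth", "growth"]

# k = 3 here means "ask the 3 nearest neighbours", not "form 3 groups".
prediction = knn_predict(train_x, train_y, query=(4.9, 5.0), k=3)
```

In Step 2 of the question there is no counterpart to `train_y`: the 20 groups are discovered from the features alone, which is the mark of an unsupervised method.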