小胖 · October 10, 2020

Question: NO.PZ2015120204000031

The question is as follows:

Paul suggests the following step which would be repeated every quarter.

Step 2 We utilize ML techniques to divide our investable universe of about 10,000 stocks into 20 different groups, based on a wide variety of the most relevant financial and non-financial characteristics. The idea is to prevent unintended portfolio concentration by selecting stocks from each of these distinct groups.

Which of the following machine learning techniques is most appropriate for executing Step 2:

Options:

A. K-Means Clustering

B. Principal Components Analysis (PCA)

C. Classification and Regression Trees (CART)

Explanation:

A is correct. K-Means clustering is an unsupervised machine learning algorithm which repeatedly partitions observations into a fixed number, k, of nonoverlapping clusters (i.e., groups).

B is incorrect. Principal Components Analysis is a long-established statistical method for dimension reduction, not clustering. PCA aims to summarize or reduce highly correlated features of data into a few main, uncorrelated composite variables.

C is incorrect. CART is a supervised machine learning technique that is most commonly applied to binary classification or regression.

Why is option C wrong? Can't C also do classification, using a binary tree?

1 answer
Accepted answer

星星_品职助教 · October 10, 2020

Hello,

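CART can perform classification, but it does not fit the context of this question. The question asks you to divide 10,000 stocks into 20 groups, which is a very typical clustering problem. Grouping the stocks into 20 clusters by similar characteristics is all that is needed; there is no need to split them step by step with CART. That is why clustering is the "most appropriate" choice.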
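To make the clustering idea concrete, here is a minimal sketch of K-Means written in plain Python. The toy data are an assumption for illustration only: nine hypothetical stocks described by two made-up numeric characteristics, grouped into 3 clusters rather than the 10,000 stocks and 20 groups in the question. Starting from the first k points as centroids is also a simplification, not how a production library would initialize. Note that no group labels are supplied anywhere; the groups come only from similarity.

```python
import math


def kmeans(points, k, n_iter=100):
    """Plain K-Means (Lloyd's algorithm).

    Simplifying assumption: the first k points serve as initial centroids.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        converged = new_labels == labels and new_centroids == centroids
        labels, centroids = new_labels, new_centroids
        if converged:
            break
    return labels, centroids


# Hypothetical stocks: (characteristic 1, characteristic 2). No labels given.
points = [
    (0.1, 0.2), (0.0, 0.1), (0.2, 0.0),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
    (10.0, 0.1), (10.1, 0.0), (9.9, 0.2),
]
labels, centroids = kmeans(points, k=3)
print(labels)
```

The only inputs are the features and the number of groups k, which matches Step 2: the number of groups is fixed in advance and nothing says which stock belongs where.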

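In fact, most algorithms can do classification, but each has its own characteristics. CART, for example, mostly performs discrete binary-tree classification, moving down the tree one split at a time. Unless the question mentions that characteristic, you should not pick an algorithm merely because it is able to classify.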
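A further point, taken from the official explanation above, is that CART is a supervised technique, so every split it makes is scored against known target labels. The sketch below shows one CART-style binary split chosen by Gini impurity. The feature values and the "value"/"growth" labels are hypothetical and exist only for illustration; Step 2 provides no such labels, which is why a tree has nothing to learn from there.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(x, y):
    """Find the threshold t (rule: x <= t) with the lowest weighted Gini.

    The target labels y are required: without them no split can be scored.
    """
    best_t, best_score = None, float("inf")
    values = sorted(set(x))
    # Candidate thresholds are midpoints between consecutive feature values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [lab for v, lab in zip(x, y) if v <= t]
        right = [lab for v, lab in zip(x, y) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Hypothetical single feature with hypothetical, pre-assigned labels.
x = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
y = ["value", "value", "value", "growth", "growth", "growth"]
threshold, score = best_split(x, y)
print(threshold, score)
```

A full tree repeats this split inside each resulting branch, which is the step-by-step behaviour described in the answer.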