NO.PZ201512020300000609
Question:
The output created in Steele’s Step 3 can be best described as a:
Options:
A. bag-of-words.
B. set of n-grams.
C. document term matrix.
Explanation:
A is correct. After the cleansed text is normalized, a bag-of-words is created. A bag-of-words (BOW) is a collection of a distinct set of tokens from all the texts in a sample dataset.
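The difference between the three options can be seen in a minimal sketch. The two sample texts below are hypothetical (they are not from the Steele case) and are assumed to be already cleansed and normalized. Tokenizing by whitespace is also an assumption made for illustration.

```python
# Hypothetical, already cleansed and normalized texts (not from the original case).
texts = [
    "market rise stock gain",
    "stock fall market fall",
]

# Split each text into tokens (whitespace tokenization is assumed here).
tokenized = [text.split() for text in texts]

# Bag-of-words: the distinct set of tokens across ALL texts.
# Word order and per-text counts are not kept.
bow = sorted({token for doc in tokenized for token in doc})

# Document term matrix: built FROM the BOW in a later step.
# One row per text, one column per BOW token, each cell a token count.
dtm = [[doc.count(token) for token in bow] for doc in tokenized]

# Bigrams (n-grams with n = 2): adjacent tokens joined, so word order is kept.
bigrams = ["_".join(pair) for pair in zip(tokenized[0], tokenized[0][1:])]

print(bow)      # ['fall', 'gain', 'market', 'rise', 'stock']
print(dtm)      # [[0, 1, 1, 1, 1], [2, 0, 1, 0, 1]]
print(bigrams)  # ['market_rise', 'rise_stock', 'stock_gain']
```

In this sketch, "a distinct set of tokens" corresponds to `bow` alone. The DTM only appears once that set is used as the column headings of a table with one row per text.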
The question says "and create a distinct set of tokens from....". What does that mean? I thought it was related to the DTM.