Question:
Given her objective, the visualization that Steele should create in the exploratory data analysis step is a:
Options:
A. scatter plot.
B. word cloud.
C. document term matrix.
Explanation:
B is correct. Steele wants to create a visualization for Schultz that shows the most informative words in the dataset based on their term frequency (TF, the ratio of the number of times a given token occurs in the dataset to the total number of tokens in the dataset) values. A word cloud is a common visualization when working with text data as it can be made to visualize the most informative words and their TF values. The most commonly occurring words in the dataset can be shown by varying font size, and color is used to add more dimensions, such as frequency and length of words.
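As a rough illustration of the TF definition above, here is a minimal Python sketch. The function name term_frequencies and the toy token list are hypothetical, made up for this example, and are not part of the original question. It computes TF as each token's count divided by the total number of tokens, which is the kind of value a word cloud would typically map to font size.

```python
from collections import Counter


def term_frequencies(tokens):
    """Return {token: count / total number of tokens} for a list of tokens."""
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    return {token: count / total for token, count in counts.items()}


# Hypothetical toy dataset, assumed to be already tokenized and lowercased.
tokens = "profit rose as profit margins rose and costs fell".split()
tf = term_frequencies(tokens)

# Rank tokens by TF (ties broken alphabetically); a word cloud would
# draw the highest-TF tokens in the largest font.
ranked = sorted(tf.items(), key=lambda kv: (-kv[1], kv[0]))
for token, value in ranked[:3]:
    print(f"{token}: {value:.3f}")
```

The ranked list could then be handed to a plotting tool of choice; the sketch stops at the TF values because that is the part the explanation defines.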
Teacher, could you explain why the answer to this question is not the document term matrix?