
Lilly · April 27, 2024


NO.PZ202304050200007702

Question:

Which of Bector’s statements regarding TF, IDF, and TF–IDF is correct?

Options:

A. Statement 1

B. Statement 2

C. Statement 3

Explanation:

C is correct. Statement 3 is correct. TF–IDF values vary by the number of documents in the dataset, and therefore, the model performance can vary when applied to a dataset with just a few documents.

Statement 1 is incorrect because IDF is calculated as the log of the inverse, or reciprocal, of the document frequency measure. Statement 2 is incorrect because TF at the sentence (not collection) level is multiplied by IDF to calculate TF–IDF.

A is incorrect because Statement 1 is incorrect. IDF is calculated as the log of the inverse, or reciprocal, of the document frequency (DF) measure.

B is incorrect because Statement 2 is incorrect. TF at the sentence (not collection) level is multiplied by IDF to calculate TF–IDF.
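The definitions in the explanation above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `tf_idf`, the toy sentences, whitespace tokenization, and the use of the natural log are all assumptions, not part of the original question.

```python
import math

def tf_idf(sentences, token):
    """Sentence-level TF-IDF of `token` for each sentence (hypothetical helper)."""
    # Assumption: naive lowercase whitespace tokenization, non-empty sentences.
    tokenized = [s.lower().split() for s in sentences]
    n_docs = len(tokenized)
    # DF: share of sentences (documents) that contain the token.
    df = sum(token in words for words in tokenized) / n_docs
    if df == 0:
        return [0.0] * n_docs
    # IDF: log of the inverse (reciprocal) of DF; natural log assumed here.
    idf = math.log(1 / df)
    # TF is taken at the sentence level, then multiplied by IDF.
    return [words.count(token) / len(words) * idf for words in tokenized]

# Toy data, invented for illustration only.
sentences = [
    "rates rose and rates fell",
    "the market rallied",
    "bond rates stayed flat",
    "equity funds gained",
]
scores = tf_idf(sentences, "rates")
print(scores)
```

Here "rates" appears in 2 of 4 sentences, so DF is 0.5 and IDF is log(2). Only the sentences that contain the token get a non-zero score.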

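Could you explain Statement 3? Doesn't TF-IDF describe how a word relates to its sentence, and how the sentences containing that word relate to the text as a whole? What does that have to do with how many documents the dataset contains?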

1 answer

品职助教_七七 · April 27, 2024

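Hi 从没放弃的小努力,

Statement 3 says that the number of documents affects the computed TF-IDF values, so when the number of documents becomes very small, model performance changes as well. The link is IDF: it is the log of the inverse of DF, and DF is the share of documents that contain the token, so it depends directly on how many documents there are.

For example, if a document is defined as a paragraph, as a whole article, or as a single sentence, the computed TF-IDF values will differ, and different TF-IDF values lead to different model performance.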
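A small sketch may make the dependence on the number of documents concrete. It computes IDF for the same token over the same text, once treating the whole text as a single document and once treating each sentence as a document. The helper `idf`, the sample text, whitespace tokenization, and the natural log are assumptions made for illustration.

```python
import math

def idf(documents, token):
    """IDF = log(1 / DF), where DF is the share of documents containing `token`."""
    n_docs = len(documents)
    containing = sum(token in d.lower().split() for d in documents)
    # log(1 / DF) is the same as log(n_docs / containing).
    return math.log(n_docs / containing) if containing else float("nan")

# Toy text, invented for illustration only.
text = "rates rose sharply. the market rallied. bond rates stayed flat. equity funds gained."

# Same text, two different definitions of "document".
as_article = [text.replace(".", "")]                           # 1 document
as_sentences = [s.strip() for s in text.split(".") if s.strip()]  # 4 documents

idf_article = idf(as_article, "rates")      # token is in 1 of 1 documents
idf_sentences = idf(as_sentences, "rates")  # token is in 2 of 4 documents
print(idf_article, idf_sentences)
```

With a single document the token appears in every document, so IDF is log(1) = 0 and every TF-IDF value for that token collapses to zero. With four sentence-level documents IDF is log(2). The text is identical in both cases. Only the number of documents changed.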

