Advanced Data Structures and Algorithms Exercises (3)

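This post works through a set of advanced data structures and algorithms exercises. It covers distributed indexing strategies, performance evaluation for information retrieval, and the concepts of precision and recall, and it includes one programming problem, document distance computation, which measures how similar two articles are.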


I. True/False Questions

1. In distributed indexing, document-partitioned strategy is to store on each node all the documents that contain the terms in a certain range.

T                F

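Explanation: F. The statement describes a term-partitioned index. In a document-partitioned index, each node stores the index for only a subset of the documents (for example, split by document ID), not everything related to a range of terms. Partitioning by terms is also less fault-tolerant: if a node fails, every posting for the terms in its range is lost.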
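To make the difference between the two strategies concrete, here is a minimal sketch. The toy corpus `DOCS`, the `doc_id % num_nodes` assignment rule, and the single `split_term` boundary are all assumptions made for illustration; real systems choose their own partitioning rules.

```python
from collections import defaultdict

# Hypothetical toy corpus: document ID -> text.
DOCS = {
    0: "apple banana",
    1: "banana cherry",
    2: "apple zebra",
    3: "cherry zebra",
}

def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term in set(docs[doc_id].split()):
            index[term].append(doc_id)
    return dict(index)

def document_partitioned(docs, num_nodes):
    """Each node indexes only its own subset of documents.

    Assumed assignment rule: document doc_id goes to node doc_id % num_nodes.
    """
    shards = [{} for _ in range(num_nodes)]
    for doc_id, text in docs.items():
        shards[doc_id % num_nodes][doc_id] = text
    return [build_inverted_index(shard) for shard in shards]

def term_partitioned(docs, split_term):
    """Two nodes: node 0 holds the full posting lists of terms < split_term,
    node 1 holds the full posting lists of all remaining terms."""
    full = build_inverted_index(docs)
    low = {t: p for t, p in full.items() if t < split_term}
    high = {t: p for t, p in full.items() if t >= split_term}
    return [low, high]

doc_nodes = document_partitioned(DOCS, 2)
term_nodes = term_partitioned(DOCS, "c")
print(doc_nodes)
print(term_nodes)
```

In the document-partitioned result, the postings for "banana" are spread over both nodes, so losing one node loses only part of them. In the term-partitioned result, all postings for "banana" live on node 0, so losing that node makes the term unsearchable.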

2. When evaluating the performance of data retrieval, it is important to measure the relevancy of the answer set.

T                F

Explanation: F. The statement is about data retrieval, which makes it wrong. Measuring the relevancy of the answer set is needed for information retrieval, not for data retrieval.

3. Precision is more important than recall when evaluating the explosive detection in airport security.

T                F

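Explanation: F. For explosive detection in airport security, recall matters more: missing a real explosive (a false negative) is far more costly than raising a false alarm.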

4. While accessing a term by hashing in an inverted file index, range searches are expensive.

T                F

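Explanation: T. A hash function locates a single term in expected constant time, whereas a search tree takes O(log n). The drawback is that hashing does not preserve the order of the terms, so a range search cannot walk through neighboring keys and has to examine the terms one by one, which makes it expensive.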
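The trade-off can be illustrated with a short sketch. The dictionary `POSTINGS` is a made-up example, and a sorted list searched with `bisect` stands in for the search tree, since both keep terms in order.

```python
import bisect

# Hypothetical term dictionary: term -> posting list of document IDs.
POSTINGS = {
    "zebra": [2, 3],
    "apple": [0, 2],
    "date": [4],
    "banana": [0, 1],
    "cherry": [1, 3],
}

def range_search_hash(index, lo, hi):
    """Hash table: keys carry no order, so every key must be checked, O(n)."""
    return sorted(term for term in index if lo <= term <= hi)

def range_search_sorted(sorted_terms, lo, hi):
    """Ordered structure: two binary searches, then one contiguous slice,
    O(log n + k) for k matching terms."""
    left = bisect.bisect_left(sorted_terms, lo)
    right = bisect.bisect_right(sorted_terms, hi)
    return sorted_terms[left:right]

SORTED_TERMS = sorted(POSTINGS)

hash_result = range_search_hash(POSTINGS, "banana", "date")
sorted_result = range_search_sorted(SORTED_TERMS, "banana", "date")
print(hash_result)
print(sorted_result)
```

Both calls return the same terms, but the hash-based version touches every key in the dictionary, while the ordered version only touches the boundaries and the matches.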

II. Multiple-Choice Questions

1. When measuring the relevancy of the answer set, if the precision is high but the recall is low, it means that:

A. most of the relevant documents are retrieved, but too many irrelevant documents are returned as well

B. most of the retrieved documents are relevant, but still a lot of relevant documents are missed

C. most of the relevant documents are retrieved, but the benchmark set is not large enough

D. most of the retrieved documents are relevant, but the benchmark set is not large enough

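Explanation: B. High precision means that most of the retrieved documents are relevant; low recall means that many relevant documents were still not retrieved.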
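A small worked example may help here. The sketch below uses the standard definitions, precision = (relevant retrieved) / (all retrieved) and recall = (relevant retrieved) / (all relevant). The document IDs are invented for illustration: 100 relevant documents exist, and the query returns 10 documents, 9 of which are relevant.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical data: IDs 0..99 are relevant; ID 1000 is irrelevant.
relevant_docs = range(100)
retrieved_docs = list(range(9)) + [1000]

precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Here precision is high because 9 of the 10 returned documents are relevant, while recall is low because only 9 of the 100 relevant documents were found, which is the situation described in option B.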

2. Which of the following is NOT concerned for measuring a search engine?

A. How fast does it index

B. How fast does it search

C. How friendly is the interface

D. How relevant is the answer set

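Explanation: C. How friendly the interface is does not count as a measure of a search engine, unlike indexing speed, search speed, and the relevancy of the answer set.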

3. There are 28000 documents in the database. The statistic data for one query are shown in the following table. The recall is: __

Relevant Irrelevant
Retrieved