Current state: ["Under Discussion"]

ISSUE: #17599

PRs: 

Keywords: range search, radius

Released: 

Summary(required)

Currently, a Milvus search returns the topk most similar results for each query vector.

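This project aims to implement another kind of search: the user specifies a search radius, "radius", and Milvus returns all results whose distance to the query vector is better than "radius" ("< radius" for "L2"; "> radius" for "IP").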
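The "better than radius" rule can be sketched as a small predicate. This is illustrative only: the helper name `within_radius` is hypothetical and is not part of Milvus or knowhere.

```python
def within_radius(distance: float, radius: float, metric_type: str) -> bool:
    """Return True if `distance` is better than `radius` for the metric.

    Hypothetical helper for illustration; not a Milvus API.
    """
    if metric_type == "L2":
        # Smaller L2 distance means more similar.
        return distance < radius
    if metric_type == "IP":
        # Larger inner product means more similar.
        return distance > radius
    raise ValueError(f"unsupported metric_type: {metric_type}")
```

Note that both comparisons are strict, so a result lying exactly on the radius is excluded under this reading of the proposal.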

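This capability can be regarded as a "superset" of the existing search: sorting the range search results and taking the first topk yields the same results as the existing search, provided the radius admits at least topk results.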
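A minimal brute-force sketch of this relationship, assuming the "L2" metric computed as squared Euclidean distance and a tiny in-memory dataset. The functions `range_search` and `topk_search` are hypothetical stand-ins for illustration, not Milvus or knowhere APIs.

```python
import heapq

def l2_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def range_search(query, base, radius):
    """Return every (distance, id) pair with distance < radius, unsorted."""
    hits = []
    for i, vec in enumerate(base):
        d = l2_sq(query, vec)
        if d < radius:
            hits.append((d, i))
    return hits

def topk_search(query, base, k):
    """Return the k nearest (distance, id) pairs, nearest first."""
    return heapq.nsmallest(k, ((l2_sq(query, v), i) for i, v in enumerate(base)))

base = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
query = [0.0, 0.0]

# Sort the range search hits and keep the first k of them.
topk_from_range = sorted(range_search(query, base, radius=10.0))[:2]
```

Here `topk_from_range` matches `topk_search(query, base, 2)` because the radius is large enough to contain at least two results; with a smaller radius, the range search would simply return fewer than k hits.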

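The output of this project differs from that of the existing search. The existing search returns a fixed-length result of nq * topk entries, whereas the range search result is variable-length: besides ids / distances, it also returns lims, which records the offset of each query vector's results within the result set. A separate MEP, Pagination, will handle the results of Query and QueryByRange uniformly and return them to the client, so result handling is out of scope for this MEP.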
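The flattened result layout can be sketched as follows. This assumes lims follows the Faiss range search convention (nq + 1 offsets, where the results for query q occupy positions lims[q] up to but not including lims[q + 1]); the proposal itself only states that lims records per-query offsets. The helper `split_by_lims` and the sample values are hypothetical.

```python
def split_by_lims(lims, ids, distances):
    """Split flattened range search output into per-query (ids, distances).

    Assumes len(lims) == nq + 1, with lims[0] == 0 and lims[-1] == len(ids).
    """
    per_query = []
    for q in range(len(lims) - 1):
        start, end = lims[q], lims[q + 1]
        per_query.append((ids[start:end], distances[start:end]))
    return per_query

# Three queries: the first has 2 hits, the second has none, the third has 3.
lims = [0, 2, 2, 5]
ids = [7, 3, 1, 4, 9]
distances = [0.1, 0.4, 0.2, 0.3, 0.8]

per_query = split_by_lims(lims, ids, distances)
```

A query with no results within the radius shows up as two equal consecutive offsets, which yields empty slices rather than padded entries.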

Motivation(required)

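Users building recommender systems have asked for Milvus to support searching by distance, that is, returning all results whose similarity is better than a given threshold.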

Public Interfaces(optional)

Briefly list any new interfaces that will be introduced as part of this proposal or any existing interfaces that will be removed or changed.

We reuse the Query interface to implement range search, so neither the Milvus interface nor any SDK interface needs to change; only a "radius" entry needs to be added to params.

As shown below, add "radius": 888 to search_params.params:

  search_params = {"metric_type": "L2", "params": {"ef": 32, "radius": 888}}
  res = col.search(vectors[:nq], "float_vector", search_params, limit, "int64 >= 0")

A new QueryByRange interface is added to knowhere:

  virtual DatasetPtr
  QueryByRange(const DatasetPtr& dataset, const Config& config, const faiss::BitsetView bitset);


Design Details(required)

Describe the new thing you want to do in appropriate detail. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences. Use judgment based on the scope of the change.


Compatibility, Deprecation, and Migration Plan(optional)

  • What impact (if any) will there be on existing users?
  • If we are changing behaviors how will we phase out the older behavior?
  • If we need special migration tools, describe them here.
  • When will we remove the existing behavior?

Test Plan(required)

Describe in a few sentences how the MEP will be tested. We are mostly interested in system tests (since unit tests are specific to implementation details). How will we know that the implementation works as expected? How will we know nothing broke?

Rejected Alternatives(optional)

If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other ways.

References(optional)

Briefly list all references
