...

Released: 

Summary(required)

At present, Query() in Milvus returns the `topk` most similar results for each query vector.

The purpose of this MEP is to add another query function, range search. The user specifies a query radius "radius", and Milvus returns all results whose distance to the query vector is better than "radius" ("< radius" for "L2"; "> radius" for "IP").
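The radius rule can be illustrated with a minimal brute-force sketch. This is not the Milvus implementation: the function name `range_search_one` and its parameters are hypothetical, and the sketch assumes plain Euclidean distance for "L2" (the text does not say whether the radius applies to the squared or unsquared distance).

```python
import math


def range_search_one(query, vectors, radius, metric="L2"):
    """Return (index, score) pairs for every vector that beats `radius`.

    Hypothetical helper for illustration only.
    "L2": keep vectors whose Euclidean distance is < radius.
    "IP": keep vectors whose inner product is > radius.
    """
    hits = []
    for idx, vec in enumerate(vectors):
        if metric == "L2":
            score = math.dist(query, vec)
            keep = score < radius
        elif metric == "IP":
            score = sum(q * v for q, v in zip(query, vec))
            keep = score > radius
        else:
            raise ValueError(f"unsupported metric: {metric}")
        if keep:
            hits.append((idx, score))
    return hits


l2_hits = range_search_one([0.0, 0.0], [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]], 2.0, "L2")
ip_hits = range_search_one([1.0, 1.0], [[1.0, 0.0], [2.0, 2.0], [0.0, 0.0]], 0.5, "IP")
```

Unlike a `topk` query, the number of hits depends on the data: a small radius may return nothing, and a large one may return every vector.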

This function can be considered a "superset" of the existing query function: if you sort the results of a range search and take the first `topk` results, they are identical to the result of the existing query function, provided the radius is large enough to cover at least `topk` results.
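A small sketch of this relationship, assuming the "L2" metric with plain Euclidean distance and brute-force search. Both helper names are hypothetical and not part of any Milvus API.

```python
import math


def brute_force_topk(query, vectors, topk):
    """Existing behavior: the topk nearest vectors, nearest first."""
    scored = sorted((math.dist(query, v), i) for i, v in enumerate(vectors))
    return [(i, d) for d, i in scored[:topk]]


def topk_from_range(query, vectors, radius, topk):
    """Range search (distance < radius), then sort and truncate to topk."""
    in_range = [(math.dist(query, v), i) for i, v in enumerate(vectors)]
    in_range = sorted((d, i) for d, i in in_range if d < radius)
    return [(i, d) for d, i in in_range[:topk]]


vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 4.0]]
query = [0.0, 0.0]

# A radius covering at least topk vectors reproduces the topk result.
same = topk_from_range(query, vectors, 10.0, 2) == brute_force_topk(query, vectors, 2)

# A radius that is too small returns fewer than topk results.
short = topk_from_range(query, vectors, 0.5, 2)
```

The second call shows the caveat: with a radius of 0.5 only one vector qualifies, so the truncated range result is shorter than the `topk` result.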

The output of this MEP differs from the original query result. The original result has a fixed length of `nq * topk`, whereas the result of a range search has variable length. In addition to `IDs` / `distances`, `lims` is returned to record the offset of each query vector's results within the result set. Another MEP, Pagination, will uniformly process the results of `Query` and `QueryByRange` and return them to the client, so processing of the returned results is outside the scope of this MEP.
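One way to picture the variable-length layout is the sketch below. It assumes `lims` has `nq + 1` entries, so that the results of query `i` occupy positions `lims[i]` to `lims[i + 1]` of the flat `ids` and `distances` arrays. That layout is an assumption of this example (the text only says `lims` records offsets), and the helper names are hypothetical.

```python
def pack_results(per_query_hits):
    """Flatten per-query (id, distance) lists into lims / ids / distances."""
    lims = [0]
    ids = []
    distances = []
    for hits in per_query_hits:
        for vec_id, dist in hits:
            ids.append(vec_id)
            distances.append(dist)
        # Each entry marks where the next query's results begin.
        lims.append(len(ids))
    return lims, ids, distances


def results_for(i, lims, ids, distances):
    """Recover the (id, distance) pairs that belong to query i."""
    start, end = lims[i], lims[i + 1]
    return list(zip(ids[start:end], distances[start:end]))


# Three queries with 2, 0 and 1 hits respectively.
per_query = [[(3, 0.1), (7, 0.4)], [], [(2, 0.9)]]
lims, ids, distances = pack_results(per_query)
second = results_for(1, lims, ids, distances)
third = results_for(2, lims, ids, distances)
```

A query with no hits inside the radius appears as two equal consecutive offsets, which is why a fixed `nq * topk` buffer is not needed.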

Motivation(required)

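A user building a recommender system has asked for Milvus to support searching by distance, that is, returning all results whose similarity is better than a given threshold.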

...