...

Keywords: range search, radius, query

Released: 

Summary(required)

...

This MEP adds a range search capability to Milvus. The user specifies a query radius, "radius", and Milvus returns all results whose distance is better than this radius ("< radius" for "L2"; "> radius" for "IP").
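The radius rule above can be illustrated with a minimal brute-force sketch. This is not how Milvus or Knowhere implements it; the function names (`range_search`, `l2_distance`, `inner_product`) are hypothetical, and the sketch assumes plain Euclidean distance for "L2" (an engine may compare squared distances instead).

```python
import math
from typing import Callable, List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def range_search(
    query: Sequence[float],
    base: Sequence[Sequence[float]],
    radius: float,
    metric: str = "L2",
) -> List[Tuple[int, float]]:
    """Return (index, distance) for every base vector better than `radius`."""
    score: Callable[[Sequence[float], Sequence[float]], float]
    if metric == "L2":
        score = l2_distance
        is_hit = lambda d: d < radius  # "< radius" for L2
    elif metric == "IP":
        score = inner_product
        is_hit = lambda d: d > radius  # "> radius" for IP
    else:
        raise ValueError(f"unsupported metric: {metric}")

    hits = []
    for idx, vec in enumerate(base):
        d = score(query, vec)
        if is_hit(d):
            hits.append((idx, d))
    return hits


# Hypothetical toy data, for illustration only.
base = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
l2_hits = range_search([0.0, 0.0], base, radius=2.0, metric="L2")
ip_hits = range_search([1.0, 1.0], base, radius=0.5, metric="IP")
```

Note that the number of hits depends on the data and the radius, which is why the result length is not fixed in advance.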

...

The output of range search differs from that of the original query. The original query result has a fixed length of `nq * topk`, whereas the range search result has variable length. In addition to `IDs` and `distances`, `lims` is returned to record the offset of each query vector's results within the result set. A separate MEP on pagination will uniformly process the results of `Query` and `QueryByRange` and return them to the client, so the processing of returned results is outside the scope of this MEP.
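A small sketch may help show how `lims` can index a variable-length result. It assumes a Faiss-style layout in which `lims` has `nq + 1` entries and the hits for query `i` occupy positions `lims[i]` to `lims[i + 1]`; the MEP text does not spell out this layout, and the helper names `pack_results` and `hits_for_query` are hypothetical.

```python
from typing import List, Sequence, Tuple

Hit = Tuple[int, float]  # (id, distance)


def pack_results(
    per_query_hits: Sequence[Sequence[Hit]],
) -> Tuple[List[int], List[int], List[float]]:
    """Flatten per-query hit lists into (lims, ids, distances)."""
    lims: List[int] = [0]
    ids: List[int] = []
    distances: List[float] = []
    for hits in per_query_hits:
        for idx, dist in hits:
            ids.append(idx)
            distances.append(dist)
        # Record where the next query's results start.
        lims.append(len(ids))
    return lims, ids, distances


def hits_for_query(
    i: int, lims: Sequence[int], ids: Sequence[int], distances: Sequence[float]
) -> List[Hit]:
    """Recover the hits of query `i` from the flattened arrays."""
    start, end = lims[i], lims[i + 1]
    return list(zip(ids[start:end], distances[start:end]))


# Three queries with 2, 0, and 1 hits respectively (made-up values).
per_query = [[(7, 0.1), (3, 0.4)], [], [(9, 0.2)]]
lims, ids, distances = pack_results(per_query)
second = hits_for_query(1, lims, ids, distances)
third = hits_for_query(2, lims, ids, distances)
```

With this layout a query that matches nothing simply has `lims[i] == lims[i + 1]`, so no padding is needed as in the fixed `nq * topk` result.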

Motivation(required)

...

A user developing a recommendation system asked Milvus to support querying by distance, that is, returning all results whose similarity is better than a given "radius".

Public Interfaces(optional)

Briefly list any new interfaces that will be introduced as part of this proposal or any existing interfaces that will be removed or changed.

...

  • No interface change in Milvus and all SDKs

We reuse the `Query()` interface to implement range search, so the interfaces of Milvus and all SDKs do not need to change. We only need to add the "radius" information to params. When "radius" is specified, the "limit" setting is ignored.

As shown in the following example, set "radius": 888 in `search_params["params"]`.

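    # Assumes `collection`, `vectors`, `nq`, and `limit` are already defined.
    # Build an HNSW index with the L2 metric, then load the collection.
    default_index = {"index_type": "HNSW", "params": {"M": 48, "efConstruction": 500}, "metric_type": "L2"}
    collection.create_index("float_vector", default_index)
    collection.load()
    # Range search: "radius" goes inside params; `limit` is ignored when it is set.
    search_params = {"metric_type": "L2", "params": {"ef": 32, "radius": 888}}
    res = collection.search(vectors[:nq], "float_vector", search_params, limit, "int64 >= 0")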

...


  • A new interface `QueryByRange()` needs to be added in Knowhere

    // Range search: returns every result better than the radius given in `config`.
    virtual DatasetPtr
    QueryByRange(const DatasetPtr& dataset, const Config& config, const faiss::BitsetView bitset);

An alternative proposal was to reuse the existing `Query()` interface in Knowhere for range search instead of adding a new `QueryByRange()` interface. Given how Knowhere is implemented, we do not accept this proposal, because not all index types support range search.

Currently, Knowhere supports 13 index types:

BinaryIDMAP, BinaryIVF
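The reasoning for a separate `QueryByRange()` interface can be sketched as follows: a base class supplies a default that reports "not supported", and only index types that can do range search override it. This is an illustrative Python sketch, not Knowhere's C++ code; every class and function name here (`VecIndex`, `FlatL2Index`, `TopKOnlyIndex`, `RangeSearchNotSupported`, `try_range_search`) is hypothetical, and the flat index compares squared L2 distances as an assumption.

```python
from typing import List, Optional, Sequence, Tuple

Hit = Tuple[int, float]  # (id, distance)


class RangeSearchNotSupported(Exception):
    """Raised when an index type has no range search implementation."""


class VecIndex:
    """Hypothetical base class: range search is opt-in per index type."""

    def query_by_range(
        self, queries: Sequence[Sequence[float]], radius: float
    ) -> List[List[Hit]]:
        raise RangeSearchNotSupported(
            f"{type(self).__name__} does not support range search"
        )


class FlatL2Index(VecIndex):
    """Brute-force index that overrides the default with a real implementation."""

    def __init__(self, base: Sequence[Sequence[float]]) -> None:
        self.base = [list(v) for v in base]

    def query_by_range(
        self, queries: Sequence[Sequence[float]], radius: float
    ) -> List[List[Hit]]:
        results: List[List[Hit]] = []
        for q in queries:
            hits: List[Hit] = []
            for idx, vec in enumerate(self.base):
                dist = sum((x - y) ** 2 for x, y in zip(q, vec))  # squared L2
                if dist < radius:
                    hits.append((idx, dist))
            results.append(hits)
        return results


class TopKOnlyIndex(VecIndex):
    """Stands in for an index type that keeps the unsupported default."""


def try_range_search(
    index: VecIndex, queries: Sequence[Sequence[float]], radius: float
) -> Optional[List[List[Hit]]]:
    """Return range search results, or None if the index cannot do it."""
    try:
        return index.query_by_range(queries, radius)
    except RangeSearchNotSupported:
        return None


flat_result = try_range_search(FlatL2Index([[0.0, 0.0], [3.0, 4.0]]), [[0.0, 0.0]], 1.0)
unsupported_result = try_range_search(TopKOnlyIndex(), [[0.0, 0.0]], 1.0)
```

Keeping range search behind its own method means the existing top-k path is untouched for index types that never implement it.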

Design Details(required)

Describe the new thing you want to do in appropriate detail. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences. Use judgment based on the scope of the change.

...