...

Range search only needs the parameters "radius_low_bound" and "radius_high_bound" to be put into the search parameter "param" from the SDK side.


In Knowhere, range search shares the Query() interface with search. When "radius_low_bound" or "radius_high_bound" is set, range search is called; otherwise, search is called.
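The dispatch rule above can be sketched in a few lines. This is only an illustration of the rule as stated, not Knowhere's actual code: the helper names `is_range_search` and `query`, and the "nprobe" entry and bound values in the example dictionaries, are assumptions made for the sketch.

```python
from typing import Any, Dict

# The two parameter names come from the proposal; everything else is hypothetical.
RANGE_KEYS = ("radius_low_bound", "radius_high_bound")


def is_range_search(param: Dict[str, Any]) -> bool:
    """Return True when either radius bound is present in the search parameter."""
    return any(key in param for key in RANGE_KEYS)


def query(param: Dict[str, Any]) -> str:
    """Stand-in for Query(): report which code path would be taken."""
    return "range_search" if is_range_search(param) else "search"


# Example parameter dictionaries (values are illustrative only).
plain_param = {"nprobe": 16}
range_param = {"nprobe": 16, "radius_low_bound": 1.0, "radius_high_bound": 2.0}

print(query(plain_param))
print(query(range_param))
```

Because the choice depends only on the presence of the two keys, existing callers that never set them keep the plain search behavior.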

...

  • seg_offsets_: with length "nq * topk"; -1 is filled in when there is not enough result data
  • distances_: with length "nq * topk"; the data is undefined when there is not enough result data
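A minimal sketch of this fixed-size layout, assuming each query's hits arrive as a list of (offset, distance) pairs. The function names `pad_results` and `valid_hits` are hypothetical, and the sketch treats padded distance slots as undefined: NaN is used there only as an arbitrary placeholder, and readers of the result rely on the -1 offset to detect missing entries.

```python
from typing import List, Tuple

Hit = Tuple[int, float]  # (segment offset, distance)


def pad_results(per_query_hits: List[List[Hit]], topk: int):
    """Flatten per-query hits into two arrays of length nq * topk."""
    seg_offsets: List[int] = []
    distances: List[float] = []
    for hits in per_query_hits:
        kept = hits[:topk]
        missing = topk - len(kept)
        seg_offsets.extend(offset for offset, _ in kept)
        seg_offsets.extend([-1] * missing)  # -1 marks "no result"
        distances.extend(dist for _, dist in kept)
        # Placeholder only: the distance in a padded slot must not be read.
        distances.extend([float("nan")] * missing)
    return seg_offsets, distances


def valid_hits(seg_offsets: List[int], distances: List[float], topk: int):
    """Recover per-query hits, skipping slots whose offset is -1."""
    result = []
    for start in range(0, len(seg_offsets), topk):
        rows = zip(seg_offsets[start:start + topk], distances[start:start + topk])
        result.append([(o, d) for o, d in rows if o != -1])
    return result


hits = [[(7, 0.1), (3, 0.4)], [(5, 0.2)]]
offsets, dists = pad_results(hits, topk=3)
print(offsets)
print(valid_hits(offsets, dists, topk=3))
```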

...

There is no interface change and no compatibility issue.

Test Plan(required)

...

  1. reuse search testcases to test range search

...

  1. use sift1M dataset to do search, get result1
  2. set "radius_high_bound"
  3. do search again, get range search result2
  4. if "radius_high_bound" is small enough to filter out some results, result2 should be a subset of result1; otherwise, result2 should be identical to result1
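The check in the last step can be sketched as follows. This is a hedged illustration on hand-made data, not the actual test code: `fake_range_search` simulates step 3 by filtering result1, the strict `<` comparison is an assumption about how the upper bound is applied, and all names are hypothetical.

```python
from typing import List, Tuple

Hit = Tuple[int, float]  # (vector id, L2 distance)


def fake_range_search(result1: List[Hit], radius_high_bound: float) -> List[Hit]:
    """Simulate range search: keep hits below the bound (assumed exclusive)."""
    return [(vid, dist) for vid, dist in result1 if dist < radius_high_bound]


def check_range_result(result1: List[Hit], result2: List[Hit]) -> str:
    """Classify result2 relative to result1 as described in the test plan."""
    if result2 == result1:
        return "identical"
    if set(result2) <= set(result1):
        return "subset"
    return "mismatch"


result1 = [(1, 0.1), (2, 0.5), (3, 0.9)]          # plain search result
small = fake_range_search(result1, radius_high_bound=0.6)
large = fake_range_search(result1, radius_high_bound=10.0)

print(check_range_result(result1, small))
print(check_range_result(result1, large))
```

A "mismatch" outcome would indicate that range search returned a hit that plain search did not, which the test plan treats as a failure.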

There is no public data set for range search. I have created one range search data set based on sift1M.

You can find them in NAS:

test/milvus/ann_hdf5/sift-128-euclidean-range.hdf5

test/milvus/ann_hdf5/glove-200-angular-range.hdf5

test/milvus/ann_hdf5/binary/sift-4096-hamming-range.hdf5

  1. we expect to get the same result as search

Rejected Alternatives(optional)

...

  1. Knowhere will add a new API QueryByRange() and a parameter legality check API CheckRangeSearch()
  2. Because the output format of range search is not unified with that of search, we need to add a new structure SubRangeSearchResult for range search
  3. Because range search returns all results (not topk), we need to implement a Merge operation for chunk, segment, query node and proxy
  4. Every Merge operation makes the result larger, so system memory usage and the gRPC bandwidth between server and client cannot be estimated
  5. TOPK is a required parameter in the SDK but is not used by range search, which will confuse users
  6. TOPK is also a required parameter in many search-related APIs; in this proposal, we would have to change all of these APIs to work without TOPK

...