...
From the SDK side, range search only needs the parameters "radius_low_bound" and "radius_high_bound" to be put into the search parameter "param".
In Knowhere, range search shares the Query() interface with search. When "radius_low_bound" or "radius_high_bound" is set, range search is called; otherwise, search is called.
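The dispatch rule above can be sketched in a few lines. This is only an illustration: `is_range_search` and `query` are hypothetical helpers rather than Knowhere or SDK functions, and the flat layout of "param" and the "nprobe" key are assumptions made for the example.

```python
RANGE_KEYS = ("radius_low_bound", "radius_high_bound")


def is_range_search(param: dict) -> bool:
    """Return True when either radius bound is present in the search param."""
    return any(key in param for key in RANGE_KEYS)


def query(param: dict) -> str:
    """Hypothetical stand-in for Query(): pick the code path from the param."""
    return "range_search" if is_range_search(param) else "search"


# Plain search: no radius bound in "param" (the "nprobe" key is illustrative).
plain_param = {"nprobe": 10}
# Range search: the same request with a radius bound added.
range_param = {"nprobe": 10, "radius_high_bound": 1.0}

print(query(plain_param))
print(query(range_param))
```

Setting only one of the two bounds is enough to select the range search path, which matches the "or" in the rule above.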
...
- seg_offsets_: with length "nq * topk"; -1 is filled in when there is not enough result data
- distances_: with length "nq * topk"; data is undefined when there is not enough result data
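As a minimal sketch of this layout, the snippet below pads per-query hits to a flat list of length "nq * topk" and then recovers the valid hits by looking only at the -1 markers in the offsets. The helper names `pad_offsets` and `valid_hits` are hypothetical and are not part of Milvus or Knowhere.

```python
def pad_offsets(hits_per_query, topk):
    """Flatten per-query hit lists into an nq * topk list, padding with -1."""
    flat = []
    for hits in hits_per_query:
        kept = hits[:topk]
        flat.extend(kept + [-1] * (topk - len(kept)))
    return flat


def valid_hits(seg_offsets, nq, topk):
    """Recover per-query hits by dropping the -1 padding."""
    return [
        [off for off in seg_offsets[q * topk:(q + 1) * topk] if off != -1]
        for q in range(nq)
    ]


# Two queries with topk = 3: the first finds 2 hits, the second finds 4.
seg_offsets = pad_offsets([[7, 3], [5, 9, 1, 4]], topk=3)
print(seg_offsets)
print(valid_hits(seg_offsets, nq=2, topk=3))
```

Checking the offset rather than the distance value keeps a consumer independent of whatever is stored in the padded distance slots.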
...
There is no interface change and no compatibility issue.
Test Plan(required)
...
- reuse search testcases to test range search
...
- use sift1M dataset to do search, get result1
- set "radius_high_bound"
- do search again, get range search result2
- if "radius_high_bound" is small enough to filter out some results, result2 should be a subset of result1; otherwise, result2 should be identical to result1
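The check described in these steps can be sketched with a brute-force search on random vectors standing in for sift1M. Everything here is an assumption made for illustration: the `search` helper is hypothetical, the metric is L2, and the radius is treated as a strict upper bound on distance.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(base, q_vec, topk, radius_high_bound=None):
    """Brute-force top-k by L2; optionally keep only hits inside the radius."""
    scored = sorted((l2(vec, q_vec), idx) for idx, vec in enumerate(base))
    if radius_high_bound is not None:
        scored = [(d, i) for d, i in scored if d < radius_high_bound]
    return scored[:topk]


random.seed(0)
base = [[random.random() for _ in range(8)] for _ in range(200)]
q_vec = [random.random() for _ in range(8)]

# Step 1: plain search.
result1 = search(base, q_vec, topk=10)

# Steps 2-3: a radius between the 5th and 6th distances filters out part of result1.
tight = (result1[4][0] + result1[5][0]) / 2
result2 = search(base, q_vec, topk=10, radius_high_bound=tight)

# A radius beyond the farthest hit leaves the result unchanged.
loose = result1[-1][0] + 1.0
result3 = search(base, q_vec, topk=10, radius_high_bound=loose)

# Step 4: subset when the radius is small, identical otherwise.
assert set(result2) <= set(result1)
assert result3 == result1
```

A real test would replace the brute-force helper with the actual search call and compare ids per query in the same way.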
There is no public data set for range search. I have created one range search data set based on sift1M.
You can find them in NAS:
test/milvus/ann_hdf5/sift-128-euclidean-range.hdf5
test/milvus/ann_hdf5/glove-200-angular-range.hdf5
test/milvus/ann_hdf5/binary/sift-4096-hamming-range.hdf5
- we expect to get the same result as search
Rejected Alternatives(optional)
...
- Knowhere will add a new API QueryByRange() and a parameter legality check API CheckRangeSearch()
- Because the output format of range search is not unified with search, we need to add a new structure SubRangeSearchResult for range search
- Because range search returns all results (not just topk), we need to implement a Merge operation for chunk, segment, query node and proxy
- Every Merge operation makes the result larger, so neither system memory usage nor the gRPC bandwidth between server and client can be estimated
- TOPK is a required parameter in the SDK but is not used by range search, which will confuse users
- TOPK is also a required parameter in many search-related APIs; under this alternative, all of these APIs would have to change to work without TOPK
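The concern about result size can be illustrated with a toy merge. This is a sketch under stated assumptions, not Milvus code: partial results are modeled as sorted lists of (distance, id) pairs, and `merge_topk` and `merge_all` are hypothetical names.

```python
import heapq
from itertools import islice


def merge_topk(partials, topk):
    """Merge sorted (distance, id) lists, keeping only the topk smallest."""
    return list(islice(heapq.merge(*partials), topk))


def merge_all(partials):
    """Merge without a topk cap, as an all-results range search would need."""
    return list(heapq.merge(*partials))


# Three segments, each returning 4 hits as sorted (distance, id) pairs.
segments = [
    [(0.1 * (s + 3 * k + 1), s * 10 + k) for k in range(4)]
    for s in range(3)
]

bounded = merge_topk(segments, topk=4)
unbounded = merge_all(segments)

print(len(bounded))    # capped at topk at every merge level
print(len(unbounded))  # grows with the number of partial results
```

With a topk cap the output of each merge level has a known maximum size, while the uncapped merge grows with every chunk, segment and node that contributes results.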
...