...

Knowhere currently supports 13 index types, but only 8 of them support range search (the cells filled in BLUE):

BinaryIDMAP | BinaryIVF
IDMAP | IVF_FLAT | IVF_FLAT_NM | IVF_SQ | IVF_PQ
HNSW | ANNOY
RHNSW_FLAT | RHNSW_IVF | RHNSW_SQ | RHNSW_PQ

If we add a new interface QueryByRange(), we can add the following code to "VecIndex.h". If QueryByRange() is called on an index type that does not support range search, a Knowhere exception is thrown.

```cpp
virtual DatasetPtr
QueryByRange(const DatasetPtr& dataset, const Config& config, const faiss::BitsetView bitset) {
    // Default for index types without range search support.
    KNOWHERE_THROW_MSG("QueryByRange not supported yet");
}
```
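The default-throws pattern above can be illustrated with a small self-contained sketch. All names here (`VecIndexSketch`, `FlatSketch`, `UnsupportedSketch`, `RangeHits`) are hypothetical stand-ins rather than Knowhere types, and the use of squared L2 distance with a strict `< radius` test is an assumption made only for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical result type: ids and distances of all hits for one query.
struct RangeHits {
    std::vector<int64_t> ids;
    std::vector<float> distances;
};

// Hypothetical base class standing in for VecIndex.
class VecIndexSketch {
 public:
    virtual ~VecIndexSketch() = default;

    // Index types that do not override this method inherit the throwing default.
    virtual RangeHits
    QueryByRange(const std::vector<float>& /*query*/, float /*radius*/) {
        throw std::runtime_error("QueryByRange not supported yet");
    }
};

// A flat index that supports range search overrides the default.
class FlatSketch : public VecIndexSketch {
 public:
    FlatSketch(std::vector<float> data, size_t dim) : data_(std::move(data)), dim_(dim) {
    }

    RangeHits
    QueryByRange(const std::vector<float>& query, float radius) override {
        RangeHits hits;
        const size_t num_rows = data_.size() / dim_;
        for (size_t i = 0; i < num_rows; ++i) {
            float dist = 0.0f;  // squared L2 distance (assumption)
            for (size_t d = 0; d < dim_; ++d) {
                const float diff = data_[i * dim_ + d] - query[d];
                dist += diff * diff;
            }
            if (dist < radius) {
                hits.ids.push_back(static_cast<int64_t>(i));
                hits.distances.push_back(dist);
            }
        }
        return hits;
    }

 private:
    std::vector<float> data_;  // row-major vectors
    size_t dim_;
};

// An index type without range search support simply does not override.
class UnsupportedSketch : public VecIndexSketch {};
```

With this shape, callers use one interface for every index type, and an unsupported type fails loudly at the call site instead of silently returning an empty result.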

If we instead reuse the Query() interface for range search, the implementation of Query() for every index type would have to change like this:

```cpp
virtual DatasetPtr
Query(const DatasetPtr& dataset, const Config& config, const faiss::BitsetView bitset) {
    if (radius exists in config) {  // pseudocode condition
        // do range search
    } else {
        // do query
    }
}
```

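This is not a good design for Knowhere, since the Query() implementation of every index type would have to branch on whether "radius" is present.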

Design Details(required)

...

The following figure shows the complete call stack of a search request from the SDK to segcore. Range search reuses this call stack without any changes; it only needs the SDK to put the parameter "radius" into the query parameter "param".

[Figure: call stack of a search request from SDK to segcore]


The following figure shows the call stack of a vector query inside segcore; the BLACK parts are the functions that are already implemented.

For a sealed segment, range search requires Knowhere to provide the QueryByRange() interface.

For a growing segment, no index has been created, so Knowhere IDMAP cannot be used for brute-force search; the full brute-force search logic has to be re-implemented separately. To support range search, the functions shown in the RED part need to be implemented.
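A brute-force range search of the kind a growing segment would need can be sketched as follows. This is only an illustration under stated assumptions: the function and struct names are hypothetical, the metric is squared L2 with a strict `< radius` test, and the flat `lims`/`ids`/`distances` layout is loosely modeled on the per-query offset layout used by Faiss range search results. It is not segcore code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result layout: hits for query q are stored at
// positions [lims[q], lims[q + 1]) of ids and distances.
struct RangeSearchResultSketch {
    std::vector<size_t> lims;
    std::vector<int64_t> ids;
    std::vector<float> distances;
};

// Brute-force range search over row-major vectors.
// Assumption: a hit is any base vector whose squared L2 distance
// to the query is strictly below `radius`.
inline RangeSearchResultSketch
BruteForceRangeSearch(const float* base,
                      size_t num_base,
                      const float* queries,
                      size_t num_queries,
                      size_t dim,
                      float radius) {
    RangeSearchResultSketch result;
    result.lims.push_back(0);
    for (size_t q = 0; q < num_queries; ++q) {
        const float* query = queries + q * dim;
        for (size_t i = 0; i < num_base; ++i) {
            const float* row = base + i * dim;
            float dist = 0.0f;
            for (size_t d = 0; d < dim; ++d) {
                const float diff = row[d] - query[d];
                dist += diff * diff;
            }
            if (dist < radius) {
                result.ids.push_back(static_cast<int64_t>(i));
                result.distances.push_back(dist);
            }
        }
        // Close the hit range of this query.
        result.lims.push_back(result.ids.size());
    }
    return result;
}
```

Unlike a top-k search, the number of hits per query is not known in advance, which is why the sketch records per-query offsets instead of a fixed-width result matrix.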

Another solution is for Knowhere to provide a new IDMAP index that does not require vector data to be inserted; it only needs the external memory address of the vector data. A growing segment can create such an index temporarily during a query, call the Query() and QueryByRange() interfaces provided by IDMAP, and destroy the index immediately after use.

This solution, covered by another MEP, has no additional memory overhead, but whether it is feasible needs further investigation.
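The idea of an index that only references external memory can be sketched with a non-owning view class. Everything below is hypothetical: `ExternalFlatIndexSketch` is not a Knowhere class, the method signatures are simplified, and squared L2 distance with a strict `< radius` test is assumed purely for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical non-owning flat index: it keeps only a pointer to vectors
// owned by the caller, so creating and destroying it copies no vector data.
class ExternalFlatIndexSketch {
 public:
    using Hit = std::pair<float, int64_t>;  // (squared L2 distance, row id)

    ExternalFlatIndexSketch(const float* data, size_t num_rows, size_t dim)
        : data_(data), num_rows_(num_rows), dim_(dim) {
    }

    // Top-k search: up to k hits, nearest first.
    std::vector<Hit>
    Query(const float* query, size_t k) const {
        std::vector<Hit> all = AllDistances(query);
        k = std::min(k, all.size());
        std::partial_sort(all.begin(), all.begin() + static_cast<std::ptrdiff_t>(k), all.end());
        all.resize(k);
        return all;
    }

    // Range search: every hit with distance strictly below radius, in row order.
    std::vector<Hit>
    QueryByRange(const float* query, float radius) const {
        std::vector<Hit> all = AllDistances(query);
        all.erase(std::remove_if(all.begin(), all.end(),
                                 [radius](const Hit& hit) { return hit.first >= radius; }),
                  all.end());
        return all;
    }

 private:
    std::vector<Hit>
    AllDistances(const float* query) const {
        std::vector<Hit> all;
        all.reserve(num_rows_);
        for (size_t i = 0; i < num_rows_; ++i) {
            float dist = 0.0f;
            for (size_t d = 0; d < dim_; ++d) {
                const float diff = data_[i * dim_ + d] - query[d];
                dist += diff * diff;
            }
            all.emplace_back(dist, static_cast<int64_t>(i));
        }
        return all;
    }

    const float* data_;  // external memory, must outlive this object
    size_t num_rows_;
    size_t dim_;
};
```

A caller would construct the object on the stack over the segment's vector buffer, run one query, and let it go out of scope. The sketch itself holds only a pointer and two sizes, although each call here still allocates a temporary distance buffer.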

[Figure: call stack of a vector query in segcore]


Processing of query results

Query returns two kinds of results. One is SubSearchResult, which stores the query result of each chunk in a segment.

...