...

The advantage of this solution is that it saves resources: no node other than the query node is involved. The drawback is code duplication. See the following "Search Flow" chart: `FloatSearchBruteForce` and `BinarySearchBruteForce` are copied from knowhere::IDMAP/BinaryIDMAP's Query() with minor modifications, which increases the code maintenance effort. Whenever a new feature such as range search is implemented on IDMAP/BinaryIDMAP in Knowhere, its implementation also has to be copied to Milvus.

[Image: "Search Flow" chart]

One good solution is to enhance IDMAP/BinaryIDMAP so that, instead of adding the real vector data to the index, the index only holds a pointer to external vector data. The user must guarantee that this memory is contiguous and safe to access.
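The idea can be illustrated with a minimal, self-contained sketch. It is not the Faiss or Knowhere implementation: `ExternalFlatIndex`, `data_ex`, and `search_nearest` are hypothetical names chosen for illustration, and the search is a plain squared-L2 scan over the borrowed buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for an "enhanced" flat index: it never copies
// vectors, it only remembers where the caller's buffer lives.
struct ExternalFlatIndex {
    std::size_t dim = 0;             // vector dimension
    std::size_t ntotal = 0;          // number of vectors in the external buffer
    const float* data_ex = nullptr;  // borrowed pointer, not owned by the index

    explicit ExternalFlatIndex(std::size_t d) : dim(d) {}

    // Record the external buffer. The caller must keep `x` alive and
    // contiguous (n * dim floats) for as long as the index is used.
    void add_ex(std::size_t n, const float* x) {
        assert(x != nullptr || n == 0);
        data_ex = x;
        ntotal = n;
    }

    // Brute-force nearest neighbour by squared L2 distance.
    // Returns the row id of the closest vector, or -1 if the index is empty.
    std::int64_t search_nearest(const float* query) const {
        std::int64_t best_id = -1;
        float best_dist = std::numeric_limits<float>::max();
        for (std::size_t i = 0; i < ntotal; ++i) {
            const float* v = data_ex + i * dim;
            float dist = 0.0f;
            for (std::size_t j = 0; j < dim; ++j) {
                const float diff = v[j] - query[j];
                dist += diff * diff;
            }
            if (dist < best_dist) {
                best_dist = dist;
                best_id = static_cast<std::int64_t>(i);
            }
        }
        return best_id;
    }
};
```

Because the index only borrows the buffer, any change the owner makes to the data is immediately visible to later searches, and freeing the buffer while the index is still in use is undefined behavior. This is why the contiguity and lifetime guarantee is placed on the user.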

...

Public Interfaces(optional)

...

Faiss needs to add a new field "xb_ex" and a new interface "add_ex" to the structure IndexBinaryFlat, and a new field "codes_ex" and a new interface "add_ex" to the structure IndexFlat (declared on its base structure IndexFlatCodes in the code below).

...

```cpp
//============================================================================
struct IndexBinaryFlat : IndexBinary {
    /// database vectors, size ntotal * d / 8
    std::vector<uint8_t> xb;

    /// external database vectors, size ntotal * d / 8
    uint8_t* xb_ex = nullptr;                               // <==== newly added
    ... ...
};

// Record a pointer to caller-owned vectors instead of copying them into xb.
void IndexBinaryFlat::add_ex(idx_t n, const uint8_t* x) {
    xb_ex = const_cast<uint8_t*>(x);
    ntotal = n;
}

//============================================================================
struct IndexFlatCodes : Index {
    ... ...
    /// encoded dataset, size ntotal * code_size
    std::vector<uint8_t> codes;

    /// external encoded dataset, size ntotal * code_size
    uint8_t* codes_ex = nullptr;                            // <==== newly added
    ... ...
};

// Record a pointer to caller-owned vectors instead of copying them into codes.
// Only allowed while the index holds no data of its own.
void IndexFlatCodes::add_ex(idx_t n, const float* x) {
    FAISS_THROW_IF_NOT(is_trained);
    FAISS_THROW_IF_NOT(codes.empty());
    codes_ex = reinterpret_cast<uint8_t*>(const_cast<float*>(x));
    ntotal = n;
}
```
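Once an index can hold either owned or external data, every read path has to pick the right buffer. The sketch below shows one possible way to centralize that choice. It is an assumption for illustration only: `FlatCodesSketch`, `uses_external()`, and `code_data()` are hypothetical and not part of Faiss, and `add_ex` returns `false` here instead of throwing so the example stays self-contained.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified mirror of the fields shown above.
struct FlatCodesSketch {
    std::size_t code_size = 0;               // bytes per encoded vector
    std::int64_t ntotal = 0;                 // number of vectors
    std::vector<std::uint8_t> codes;         // owned encoded dataset
    const std::uint8_t* codes_ex = nullptr;  // borrowed encoded dataset

    // True when the index reads from caller-owned memory.
    bool uses_external() const { return codes_ex != nullptr; }

    // Single accessor for readers, so search code does not need to know
    // which of the two storage modes is active.
    const std::uint8_t* code_data() const {
        return uses_external() ? codes_ex : codes.data();
    }

    // Attach an external buffer. Rejected if the index already owns data,
    // mirroring the codes.empty() precondition in the proposal.
    bool add_ex(std::int64_t n, const float* x) {
        if (!codes.empty() || x == nullptr) {
            return false;
        }
        codes_ex = reinterpret_cast<const std::uint8_t*>(x);
        ntotal = n;
        return true;
    }
};
```

Routing all reads through one accessor keeps the two storage modes from leaking into each distance-computation routine.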


Knowhere needs to add a new interface `AddExWithoutIds()` for both IDMAP and BinaryIDMAP.

...

When Knowhere detects that "codes_ex" is in use in the current IDMAP index, or that "xb_ex" is in use in the current BinaryIDMAP index, serialization is not allowed.
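A minimal sketch of such a guard is shown below. It does not reproduce Knowhere's actual classes or error handling: `IdMapSketch`, `UsesExternalData()`, and the use of `std::logic_error` are assumptions made for illustration.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical, simplified index wrapper; not Knowhere's real class.
struct IdMapSketch {
    std::vector<std::uint8_t> codes;         // data owned by the index
    const std::uint8_t* codes_ex = nullptr;  // data borrowed from the caller

    bool UsesExternalData() const { return codes_ex != nullptr; }

    // Returns a copy of the owned bytes. An index backed by external memory
    // refuses to serialize, because it does not own the data it would write.
    std::vector<std::uint8_t> Serialize() const {
        if (UsesExternalData()) {
            throw std::logic_error(
                "serialization is not supported for an index backed by external data");
        }
        return codes;
    }
};
```

Failing loudly at the serialization entry point avoids silently writing an index file that contains no vector data.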


For Milvus, the implementations of the APIs "FloatSearchBruteForce()" and "BinarySearchBruteForce()" will be rewritten, but their interfaces need not change; they can use the enhanced IDMAP/BinaryIDMAP to search instead of calling Faiss interfaces.

Design Details(required)

...

In Milvus, when a growing segment needs to create an enhanced IDMAP index, it can do so in this way:

...