...

Briefly list any new interfaces that will be introduced as part of this proposal or any existing interfaces that will be removed or changed.

Faiss needs a new field "xb_ex" and a new interface "add_ex()" for the structure IndexBinaryFlat, which backs BinaryIDMAP. It also needs a new field "codes_ex" and a new interface "add_ex()" for the structure IndexFlat, which backs IDMAP; both are declared on its base structure IndexFlatCodes.

In IndexBinaryFlat, "xb" and "xb_ex" are mutually exclusive: the user cannot set both at the same time. The same holds in IndexFlat, where "codes" and "codes_ex" are mutually exclusive and cannot both be set.

Code Block
languagecpp
//============================================================================
struct IndexBinaryFlat : IndexBinary {
    /// database vectors, size ntotal * d / 8
    std::vector<uint8_t> xb;

    /// external database vectors, size ntotal * d / 8
    uint8_t* xb_ex = nullptr;  // <==== newly added
    ... ...
};

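void IndexBinaryFlat::add_ex(idx_t n, const uint8_t* x) {
    // "xb" and "xb_ex" are mutually exclusive.
    FAISS_THROW_IF_NOT(xb.empty());
    // Store the caller's pointer only; no data is copied.
    xb_ex = const_cast<uint8_t*>(x);
    ntotal = n;
}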

//============================================================================
struct IndexFlatCodes : Index {
    ... ...
    /// encoded dataset, size ntotal * code_size
    std::vector<uint8_t> codes;

    /// external encoded dataset, size ntotal * code_size
    uint8_t* codes_ex = nullptr;  // <==== newly added
    ... ...
};

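void IndexFlatCodes::add_ex(idx_t n, const float* x) {
    FAISS_THROW_IF_NOT(is_trained);
    // "codes" and "codes_ex" are mutually exclusive.
    FAISS_THROW_IF_NOT(codes.empty());
    // Store the caller's pointer only; no data is copied.
    codes_ex = reinterpret_cast<uint8_t*>(const_cast<float*>(x));
    ntotal = n;
}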
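
The mutual-exclusion rule between the owned buffer and the external pointer can be illustrated in isolation. The following is only a minimal sketch under stated assumptions: `FlatStorageSketch` and its `data()` accessor are hypothetical names invented for this example and are not part of Faiss; the sketch only mirrors the idea that `add()` copies while `add_ex()` borrows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a flat index; NOT the real faiss::IndexFlatCodes.
struct FlatStorageSketch {
    size_t code_size;                   // bytes per encoded vector
    size_t ntotal = 0;                  // number of stored vectors
    std::vector<uint8_t> codes;         // owned copy (regular add path)
    const uint8_t* codes_ex = nullptr;  // borrowed pointer (add_ex path)

    explicit FlatStorageSketch(size_t code_size) : code_size(code_size) {}

    // Regular path: copy the vectors into the owned buffer.
    void add(size_t n, const uint8_t* x) {
        if (codes_ex != nullptr) {
            throw std::logic_error("external data is already set");
        }
        codes.insert(codes.end(), x, x + n * code_size);
        ntotal += n;
    }

    // External path: remember the pointer only; the caller keeps ownership.
    void add_ex(size_t n, const uint8_t* x) {
        if (!codes.empty()) {
            throw std::logic_error("owned data is already present");
        }
        codes_ex = x;
        ntotal = n;
    }

    // Single read accessor (an assumption of this sketch), so that search
    // code does not need to know which storage mode is active.
    const uint8_t* data() const {
        return codes_ex != nullptr ? codes_ex : codes.data();
    }
};
```

With this shape, a caller that has used `add_ex()` gets an exception if it later calls `add()`, and vice versa, which is one way to enforce the "cannot set both" rule described above.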

...

Knowhere needs a new interface `AddExWithoutIds()` for both IDMAP and BinaryIDMAP.

Code Block
languagecpp
    // Set the external data pointer instead of actually adding (copying) the data.
    void
    AddExWithoutIds(const DatasetPtr&, const Config&);

When Knowhere detects that "codes_ex" is in use in the current IDMAP index, or that "xb_ex" is in use in the current BinaryIDMAP index, serialization is not allowed.

For Milvus, the implementations of the APIs "FloatSearchBruteForce()" and "BinarySearchBruteForce()" will be rewritten, but their interfaces do not need to change.
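
One possible shape for such a check is sketched below. It is an illustration only: `IdmapSketch`, `uses_external_data()`, the return type of `Serialize()` and the choice to throw an exception are all assumptions made for this example, not the Knowhere API.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical wrapper; the names below do NOT come from Knowhere.
struct IdmapSketch {
    std::vector<uint8_t> codes;         // owned data, safe to serialize
    const uint8_t* codes_ex = nullptr;  // borrowed data, must not be serialized

    // True when the index only borrows its data from the caller.
    bool uses_external_data() const {
        return codes_ex != nullptr;
    }

    // Refuse to serialize an index that does not own its data.
    std::vector<uint8_t> Serialize() const {
        if (uses_external_data()) {
            throw std::runtime_error(
                "serialization is not allowed for an index with external data");
        }
        return codes;  // a real index would also write its metadata
    }
};
```

Whether the real implementation throws, returns an error status, or logs and returns an empty result is a design decision this sketch does not settle.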

...

Describe the new thing you want to do in appropriate detail. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences. Use judgment based on the scope of the change.

In Milvus, when a growing segment needs to create an enhanced IDMAP index, it can do so as follows:

Code Block
languagecpp
    auto idmap_index = std::make_shared<knowhere::IDMAP>();
    idmap_index->Train(train_dataset, conf);
    idmap_index->AddExWithoutIds(train_dataset, conf);  // <==== call "AddExWithoutIds"
    auto result = idmap_index->Query(query_dataset, conf, bitset);

This enhanced IDMAP index cannot be serialized, and it is destroyed automatically at negligible cost, since it only holds a pointer to the external data.
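
The lifetime implication can be sketched with a toy non-owning index: a search reads the caller's buffer directly, and destroying the index leaves that buffer untouched. `BorrowedBinaryIndex` and `nearest()` are hypothetical names for this illustration, and the bit-by-bit Hamming loop is a simple stand-in, not the optimized routine Faiss uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <memory>
#include <vector>

// Hypothetical non-owning binary index; NOT the real faiss::IndexBinaryFlat.
struct BorrowedBinaryIndex {
    const uint8_t* xb_ex = nullptr;  // borrowed from the caller, never freed here
    size_t ntotal = 0;               // number of vectors behind xb_ex
    size_t code_size = 0;            // bytes per vector (d / 8)

    // Brute-force search: position of the stored vector with the smallest
    // Hamming distance to `query`. Returns 0 when the index is empty.
    size_t nearest(const uint8_t* query) const {
        size_t best = 0;
        unsigned best_dist = std::numeric_limits<unsigned>::max();
        for (size_t i = 0; i < ntotal; ++i) {
            unsigned dist = 0;
            for (size_t j = 0; j < code_size; ++j) {
                uint8_t diff =
                    static_cast<uint8_t>(xb_ex[i * code_size + j] ^ query[j]);
                while (diff != 0) {  // count the differing bits
                    dist += diff & 1u;
                    diff = static_cast<uint8_t>(diff >> 1);
                }
            }
            if (dist < best_dist) {
                best_dist = dist;
                best = i;
            }
        }
        return best;
    }
};
```

Because the struct has no destructor that touches `xb_ex`, releasing the index frees only the index object itself. The flip side is that the caller's buffer must stay alive and unchanged in location for as long as the index is queried.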

Compatibility, Deprecation, and Migration Plan (optional)

...