...

Index types and metric types supporting range search are listed below:


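Index type     IP    L2    HAMMING  JACCARD  TANIMOTO  SUBSTRUCTURE  SUPERSTRUCTURE
BIN_IDMAP      -     -     yes      yes      yes       -             -
BIN_IVF_FLAT   -     -     yes      yes      yes       -             -
IDMAP          yes   yes   -        -        -         -             -
IVF_FLAT       yes   yes   -        -        -         -             -
IVF_PQ         yes   yes   -        -        -         -             -
IVF_SQ8        yes   yes   -        -        -         -             -
HNSW           yes   yes   -        -        -         -             -
ANNOY          -     -     -        -        -         -             -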

If the range search API is called with an unsupported index type or an unsupported metric type, Knowhere throws an exception.

...

The valid scope for "radius" is defined as: -1.0 <= radius <= float_max

metric type   range      similar   not similar
L2            (0, inf)   small     large
IP            [-1, 1]    1         -1
jaccard       [0, 1]     0         1
tanimoto      [0, 0.5]   0         0.5
hamming       [0, n]     0         n
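
The radius rules above can be sketched as a few small checks. This is a minimal illustration, not Knowhere code: the helper names (IsValidRadius, LargerIsMoreSimilar, IsBetterThanRadius) and the use of plain strings for metric names are assumptions made for the example. It encodes the stated scope -1.0 <= radius <= float_max and the "better than radius" comparison used by the range search APIs (for IP: > radius; for others: < radius).

```cpp
#include <cassert>
#include <limits>
#include <string>

// Hypothetical helpers for illustration only; not part of the Knowhere API.

// Valid scope stated in the text: -1.0 <= radius <= float_max.
inline bool IsValidRadius(float radius) {
    return radius >= -1.0f && radius <= std::numeric_limits<float>::max();
}

// Per the table: for IP a larger value means "more similar";
// for L2, jaccard, tanimoto and hamming a smaller value does.
inline bool LargerIsMoreSimilar(const std::string& metric) {
    return metric == "IP";
}

// True if `distance` is strictly "better than radius" for the given metric.
inline bool IsBetterThanRadius(const std::string& metric, float distance, float radius) {
    assert(IsValidRadius(radius));
    return LargerIsMoreSimilar(metric) ? distance > radius : distance < radius;
}
```

For example, IsBetterThanRadius("IP", 0.9f, 0.5f) is true, while IsBetterThanRadius("L2", 0.9f, 0.5f) is false.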

2. QueryByRange()

This API performs a range search on an index. It returns all results, unsorted, whose distance is "better than radius" (for IP: > radius; for others: < radius).

PROTO
virtual DatasetPtr
QueryByRange(const DatasetPtr& dataset, const Config& config, const faiss::BitsetView bitset);

INPUT

Dataset {
    knowhere::meta::TENSOR: -    // query data
    knowhere::meta::ROWS: -      // rows of queries
    knowhere::meta::DIM: -       // dimension
}

Config {
    knowhere::meta::RADIUS: -    // radius for range search
}

OUTPUT

Dataset {
    knowhere::meta::IDS: -       // result IDs with length LIMS[nq]
    knowhere::meta::DISTANCE: -  // result DISTANCES with length LIMS[nq]
    knowhere::meta::LIMS: -      // result offset prefix sum with length nq + 1
}

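LIMS has length "nq + 1" and is the offset prefix sum for the result IDS and DISTANCE arrays: the results for query i occupy positions LIMS[i] through LIMS[i+1] - 1 in both arrays. IDS and DISTANCE always have the same length, LIMS[nq], which varies with the number of results found.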
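
To make the LIMS layout concrete, here is a minimal sketch of how a caller might slice the flat output arrays per query. The RangeSearchResult struct and ResultsForQuery function are hypothetical names for this example; they stand in for the three output fields copied out of the returned Dataset and are not Knowhere types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical container mirroring the three output fields; not a Knowhere type.
struct RangeSearchResult {
    std::vector<int64_t> ids;      // length lims[nq]
    std::vector<float> distances;  // length lims[nq]
    std::vector<size_t> lims;      // length nq + 1, prefix sum starting at 0
};

// Returns the (id, distance) pairs that belong to query `i`,
// i.e. the half-open slice [lims[i], lims[i + 1]) of both arrays.
std::vector<std::pair<int64_t, float>> ResultsForQuery(const RangeSearchResult& r, size_t i) {
    assert(i + 1 < r.lims.size());
    std::vector<std::pair<int64_t, float>> out;
    for (size_t k = r.lims[i]; k < r.lims[i + 1]; ++k) {
        out.emplace_back(r.ids[k], r.distances[k]);
    }
    return out;
}
```

With lims = {0, 2, 2, 3}, query 0 owns the first two entries, query 1 has no results, and query 2 owns the last entry.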

...

This API performs a range search on a dataset that has no index. It returns all results, unsorted, whose distance is "better than radius" (for IP: > radius; for others: < radius).

PROTO
static DatasetPtr
RangeSearch(const DatasetPtr base_dataset,
            const DatasetPtr query_dataset,
            const Config& config,
            const faiss::BitsetView bitset);

INPUT

Dataset {
    knowhere::meta::TENSOR: -    // base data
    knowhere::meta::ROWS: -      // rows of base data
    knowhere::meta::DIM: -       // dimension
}

Dataset {
    knowhere::meta::TENSOR: -    // query data
    knowhere::meta::ROWS: -      // rows of queries
    knowhere::meta::DIM: -       // dimension
}

Config {
    knowhere::meta::RADIUS: -    // radius for range search
}

OUTPUT

Dataset {
    knowhere::meta::IDS: -       // result IDs with length LIMS[nq]
    knowhere::meta::DISTANCE: -  // result DISTANCES with length LIMS[nq]
    knowhere::meta::LIMS: -      // result offset prefix sum with length nq + 1
}

The output has the same format as that of QueryByRange().
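
As a rough illustration of what a range search over un-indexed data computes, the sketch below scans every base vector for every query and fills IDS, DISTANCE and LIMS as described. It is not Knowhere's implementation: the names BruteForceResult and BruteForceRangeSearchL2 are hypothetical, the metric is assumed to be squared L2, vectors are assumed to be stored row-major in flat arrays, and the bitset filter is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result holder for this sketch; not a Knowhere type.
struct BruteForceResult {
    std::vector<int64_t> ids;      // matching base row numbers, grouped by query
    std::vector<float> distances;  // matching distances, same order as ids
    std::vector<size_t> lims;      // prefix-sum offsets, length nq + 1
};

// Exhaustive range search with squared L2 distance (assumption of the sketch).
// `base` holds nb * dim floats and `queries` holds nq * dim floats, row-major.
BruteForceResult BruteForceRangeSearchL2(const std::vector<float>& base, size_t nb,
                                         const std::vector<float>& queries, size_t nq,
                                         size_t dim, float radius) {
    assert(base.size() == nb * dim && queries.size() == nq * dim);
    BruteForceResult res;
    res.lims.push_back(0);
    for (size_t q = 0; q < nq; ++q) {
        for (size_t b = 0; b < nb; ++b) {
            float dist = 0.0f;
            for (size_t d = 0; d < dim; ++d) {
                const float diff = queries[q * dim + d] - base[b * dim + d];
                dist += diff * diff;
            }
            if (dist < radius) {  // non-IP metric: "better" means < radius
                res.ids.push_back(static_cast<int64_t>(b));
                res.distances.push_back(dist);
            }
        }
        res.lims.push_back(res.ids.size());  // close this query's slice
    }
    return res;
}
```

Results within each query's slice come out in base order, not distance order, which matches the "unsorted" wording above.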

...

The 1st call, with (radius_low_bound = 0.0, radius_high_bound = inf), returns result distances like this:

{d(0), d(1), d(2), ..., d(n-1)}

The 2nd call, with (radius_low_bound = d(n-1), radius_high_bound = inf), returns result distances like this: {d(n), d(n+1), d(n+2), ..., d(2n-1)}
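
The two calls above can be mimicked with a small stand-alone sketch. Everything here is an assumption made for illustration: NextBatch is a hypothetical helper, not a Knowhere API; distances are treated as an L2-style metric where smaller is better; both bounds are taken as exclusive; each batch is returned in ascending order; and ties at the batch boundary are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: from all candidate distances, return up to `n` values
// strictly inside (radius_low_bound, radius_high_bound), in ascending order.
std::vector<float> NextBatch(std::vector<float> candidates, float radius_low_bound,
                             float radius_high_bound, size_t n) {
    // Drop everything outside the open interval (low, high).
    candidates.erase(
        std::remove_if(candidates.begin(), candidates.end(),
                       [&](float d) { return !(d > radius_low_bound && d < radius_high_bound); }),
        candidates.end());
    std::sort(candidates.begin(), candidates.end());
    if (candidates.size() > n) {
        candidates.resize(n);  // keep only the n closest remaining results
    }
    return candidates;
}
```

Passing the last distance of one batch as radius_low_bound for the next call then yields the following n results, as in the two calls described above.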

...