Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1.  querynode reads segment's binlog files from Minio, and saves them into structure Blob (internal/querynode/segment_loader.go::loadSegmentFieldsData)

    Code Block
    type Blob struct {
        Key   string    // binlog file path
        Value []byte    // binlog file data
    }

    The data in Blob is deserialized, raw-data in it is saved into structure InsertData

  2.  querynode invokes search engine to get SearchResult (internal/query_node/query_collection.go::search)

    Code Block
    languagecpp
    // internal/core/src/common/Types.h
    struct SearchResult {
     ...
     public:
        int64_t num_queries_;
        int64_t topk_;
        std::vector<float> result_distances_;
    ​
     public:
        void* segment_;
        std::vector<int64_t> internal_seg_offsets_;
        std::vector<int64_t> result_offsets_;
        std::vector<std::vector<char>> row_data_;
    };


    At this time, only "result_distances_" and "internal_seg_offsets_" of "SearchResult" are filled into data.

    querynode reduces all SearchResult returned by segment, fetches all other fields' data, and saves them into "row_data_" with row-based format. (internal/query_node/query_collection.go::reduceSearchResultsAndFillData)

  3. querynode organizes SearchResult again, and save them into structure milvus.Hits

    Code Block
    // internal/proto/milvus.proto
    message Hits {
      repeated int64 IDs = 1;
      repeated bytes row_data = 2;
      repeated float scores = 3;
    }
  4.  Row-based data saved in milvus.Hits is converted to column-based data, and saved into schemapb.SearchResultData (internal/query_node/query_collection.go::translateHits)

    Code Block
    // internal/proto/schema.proto
    message SearchResultData {
      int64 num_queries = 1;
      int64 top_k = 2;
      repeated FieldData fields_data = 3;
      repeated float scores = 4;
      IDs ids = 5;
      repeated int64 topks = 6;
    }
  5.  schemapb.SearchResultData is serialized, encapsulated as internalpb.SearchResults, saved into SearchResultMsg, and send into pulsar channel (internal/query_node/query_collection.go::search)

    Code Block
    // internal/proto/internal.proto
    message SearchResults {
      common.MsgBase base = 1;
      common.Status status = 2;
      string result_channelID = 3;
      string metric_type = 4;
      repeated bytes hits = 5;  // search result data
    ​
      // schema.SearchResultsData inside
      bytes sliced_blob = 9;
      int64 sliced_num_count = 10;
      int64 sliced_offset = 11;
    ​
      repeated int64 sealed_segmentIDs_searched = 6;
      repeated string channelIDs_searched = 7;
      repeated int64 global_sealed_segmentIDs = 8;
    }
  6. proxy 从 pulsar channel 中收集到所有 querynode 发送过来的 msgstream.SearchResultMsg,反序列化得到

     Proxy collects all SearchResultMsg from querynodes, gets schemapb.SearchResultData

    ,再做一次归并,数据放入

    by deserialization, then gets milvuspb.SearchResults

    ,通过 gRPC 传回 SDK

    by reducing, finally send back to SDK visa gRPC. (internal/proxy/task.go::SearchTask::PostExecute)

    Code Block
    // internal/proto/milvus.proto

    
    message SearchResults {

    
     
     common
     common.Status status = 1;

    
     
     schema
     schema.SearchResultData results = 2;

    
    }
  7. SDK 收到
  8.  SDK receives milvuspb.
  9. SearchResults,返回
  10. SearchResult

Public Interfaces(optional)

...