...
querynode reads the segment's binlog files from Minio and saves them into the Blob structure (internal/querynode/segment_loader.go::loadSegmentFieldsData)
type Blob struct {
    Key   string // binlog file path
    Value []byte // binlog file data
}
The data in Blob is deserialized, and the raw data in it is saved into the InsertData structure
querynode invokes the search engine to get SearchResult (internal/query_node/query_collection.go::search)
// internal/core/src/common/Types.h
struct SearchResult {
    ...
 public:
    int64_t num_queries_;
    int64_t topk_;
    std::vector<float> result_distances_;

 public:
    void* segment_;
    std::vector<int64_t> internal_seg_offsets_;
    std::vector<int64_t> result_offsets_;
    std::vector<std::vector<char>> row_data_;
};
At this point, only "result_distances_" and "internal_seg_offsets_" of the "SearchResult" returned by knowhere are filled with data.
querynode reduces all the SearchResult returned by the segments, uses "internal_seg_offsets_" to fetch the data of all other output fields, and saves it into "row_data_" in row-based format (internal/query_node/query_collection.go::reduceSearchResultsAndFillData)
querynode reorganizes the SearchResult data once more and saves it into the milvus.Hits structure
// internal/proto/milvus.proto
message Hits {
repeated int64 IDs = 1;
repeated bytes row_data = 2;
repeated float scores = 3;
}
The row-based data saved in milvus.Hits is converted to column-based data and saved into schemapb.SearchResultData (internal/query_node/query_collection.go::translateHits)
// internal/proto/schema.proto
message SearchResultData {
int64 num_queries = 1;
int64 top_k = 2;
repeated FieldData fields_data = 3;
repeated float scores = 4;
IDs ids = 5;
repeated int64 topks = 6;
}
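The row-to-column conversion can be sketched in Go as follows. This is not the translateHits implementation. The row layout (one int64 followed by one float32, little-endian) and the names `columns`, `encodeRow`, and `rowsToColumns` are assumptions made for illustration; the actual encoding of row_data depends on the collection schema.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// columns is a hypothetical, simplified stand-in for the fields_data part of
// schemapb.SearchResultData: one slice per output field.
type columns struct {
	ages    []int64
	weights []float32
}

// encodeRow packs one row as 8 bytes (int64) + 4 bytes (float32), little-endian.
// This layout is an assumption made for the sketch only.
func encodeRow(age int64, weight float32) []byte {
	row := make([]byte, 12)
	binary.LittleEndian.PutUint64(row[0:8], uint64(age))
	binary.LittleEndian.PutUint32(row[8:12], math.Float32bits(weight))
	return row
}

// rowsToColumns walks the row-based data once and appends each field value
// to the slice of its own column.
func rowsToColumns(rows [][]byte) (columns, error) {
	var cols columns
	for i, row := range rows {
		if len(row) != 12 {
			return columns{}, fmt.Errorf("row %d: expected 12 bytes, got %d", i, len(row))
		}
		cols.ages = append(cols.ages, int64(binary.LittleEndian.Uint64(row[0:8])))
		cols.weights = append(cols.weights, math.Float32frombits(binary.LittleEndian.Uint32(row[8:12])))
	}
	return cols, nil
}

func main() {
	rows := [][]byte{encodeRow(30, 1.5), encodeRow(41, 2.25)}
	cols, err := rowsToColumns(rows)
	if err != nil {
		panic(err)
	}
	fmt.Println(cols.ages, cols.weights)
}
```

In the column-based form, the i-th entry of every column belongs to the i-th hit, which is how fields_data, scores, and ids line up in SearchResultData.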
schemapb.SearchResultData is serialized, wrapped in internalpb.SearchResults, put into msgstream.SearchResultMsg, and sent through the pulsar channel (internal/query_node/query_collection.go::search)
// internal/proto/internal.proto
message SearchResults {
common.MsgBase base = 1;
common.Status status = 2;
string result_channelID = 3;
string metric_type = 4;
repeated bytes hits = 5; // search result data
// schema.SearchResultsData inside
bytes sliced_blob = 9;
int64 sliced_num_count = 10;
int64 sliced_offset = 11;
repeated int64 sealed_segmentIDs_searched = 6;
repeated string channelIDs_searched = 7;
repeated int64 global_sealed_segmentIDs = 8;
}
proxy collects from the pulsar channel all the msgstream.SearchResultMsg sent by the querynodes, deserializes them to get schemapb.SearchResultData, merges the results once more, puts the data into milvuspb.SearchResults, and returns it to the SDK via gRPC (internal/proxy/task.go::SearchTask::PostExecute)
// internal/proto/milvus.proto
message SearchResults {
common.Status status = 1;
schema.SearchResultData results = 2;
}
The SDK receives milvuspb.SearchResults
...
数据再次整理,存入数据结构 ...