...

Pre-defined format for import files

Assume we have a collection with 2 fields (one primary key and one vector field) and 5 rows:

uid    vector
101    [1.1, 1.2, 1.3, 1.4]
102    [2.1, 2.2, 2.3, 2.4]
103    [3.1, 3.2, 3.3, 3.4]
104    [4.1, 4.2, 4.3, 4.4]
105    [5.1, 5.2, 5.3, 5.4]


There are two ways to represent the collection with data files:

(1) Row-based data file: a single JSON file that contains multiple rows, each row holding a value for every field.

file_1.json:

Code Block
[
    {"uid": 101, "vector": [1.1, 1.2, 1.3, 1.4]},
    {"uid": 102, "vector": [2.1, 2.2, 2.3, 2.4]},
    {"uid": 103, "vector": [3.1, 3.2, 3.3, 3.4]},
    {"uid": 104, "vector": [4.1, 4.2, 4.3, 4.4]},
    {"uid": 105, "vector": [5.1, 5.2, 5.3, 5.4]}
]

Call import() to import the file:

Code Block
import(collection_name="test", files=["file_1.json"])
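Before calling import(), a client may want to check that the rows are consistent with each other. The sketch below is only an illustration under stated assumptions: the helper name validate_rows, the default dimension of 4, and the checks themselves are hypothetical and not part of the import API. It works on rows that have already been parsed from JSON.

```python
import json


def validate_rows(rows, vector_field="vector", dim=4):
    """Check that every row has the same fields and a vector of length `dim`.

    Hypothetical helper for illustration; not part of the import API.
    Returns the number of rows when all checks pass.
    """
    if not rows:
        raise ValueError("no rows")
    expected_fields = set(rows[0])
    for i, row in enumerate(rows):
        if set(row) != expected_fields:
            raise ValueError(f"row {i}: fields differ from row 0")
        if len(row[vector_field]) != dim:
            raise ValueError(f"row {i}: expected {dim} dimensions")
    return len(rows)


# Two rows taken from the example table, parsed from JSON text.
rows = json.loads("""[
    {"uid": 101, "vector": [1.1, 1.2, 1.3, 1.4]},
    {"uid": 102, "vector": [2.1, 2.2, 2.3, 2.4]}
]""")
row_count = validate_rows(rows)
```

A row with a missing field or a vector of the wrong length raises ValueError, so the problem surfaces before the file is handed to import().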



(2) Column-based data files: each JSON file represents one column (field). The field data must be stored under the key "values".

In this case, there are two fields, so we create 2 JSON files:
file_1.json for the "uid" field:

Code Block
{
  "values": [101, 102, 103, 104, 105]
}

file_2.json for the "vector" field:

Code Block
{
  "values": [[1.1, 1.2, 1.3, 1.4], [2.1, 2.2, 2.3, 2.4], [3.1, 3.2, 3.3, 3.4], [4.1, 4.2, 4.3, 4.4], [5.1, 5.2, 5.3, 5.4]]
}

Call import() to import the files:

Code Block
import(collection_name="test", files={"uid": "file_1.json", "vector": "file_2.json"})
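The two representations carry the same data, so column-based files can be derived from a list of rows. The following is a minimal sketch, assuming rows are held in memory as Python dicts; the helper name rows_to_column_files and the per-field file naming are hypothetical choices made for the example.

```python
import json
import os
import tempfile


def rows_to_column_files(rows, out_dir):
    """Write one JSON file per field, each shaped as {"values": [...]}.

    Returns a dict mapping field name -> file path, the same shape as the
    `files` argument of the column-based import() call. The file naming is
    a hypothetical choice for this example.
    """
    files = {}
    for field in rows[0]:
        path = os.path.join(out_dir, f"{field}.json")
        with open(path, "w") as f:
            # Collect this field's value from every row, preserving row order.
            json.dump({"values": [row[field] for row in rows]}, f)
        files[field] = path
    return files


rows = [
    {"uid": 101, "vector": [1.1, 1.2, 1.3, 1.4]},
    {"uid": 102, "vector": [2.1, 2.2, 2.3, 2.4]},
]
out_dir = tempfile.mkdtemp()
files = rows_to_column_files(rows, out_dir)
```

Because every column keeps the original row order, the i-th entry of each "values" list belongs to the same row.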

Users can also store vectors in a NumPy file. For example, if the "vector" field is stored in file_2.npy, call import() as follows:

Code Block
import(collection_name="test", files={"uid": "file_1.json", "vector": "file_2.npy"})
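As a rough illustration of how such a NumPy file could be produced, the sketch below writes a 2-D array with numpy.save and reads it back with numpy.load. The float32 element type and the temporary output directory are assumptions; the text above does not specify a dtype.

```python
import os
import tempfile

import numpy as np

# One row per entity, one column per dimension (values from the example table).
vectors = np.array(
    [[1.1, 1.2, 1.3, 1.4], [2.1, 2.2, 2.3, 2.4]],
    dtype=np.float32,  # assumed element type; the source does not specify one
)

npy_path = os.path.join(tempfile.mkdtemp(), "file_2.npy")
np.save(npy_path, vectors)  # the .npy header records shape and dtype

# Reading the file back recovers the same shape, dtype, and values.
loaded = np.load(npy_path)
```

The .npy format stores the array shape in its header, so the number of rows and the vector dimension can be recovered from the file itself.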




Proxy RPC Interfaces

    The declaration of the import API in the proxy RPC:

...