...

This document describes how to support `delete` in Milvus. Milvus provides a new `delete` API to delete entities from a collection.

...

In some scenarios, users want to delete entities from a collection so that they no longer appear in search results. Currently, users can only filter unwanted entities out of search results manually. We hope to implement a new function that allows users to delete entities from a collection.

...

```python
def delete(self, condition=None) -> MutationResult:
    """
    Delete entities by primary keys.
    Example: client.delete("_id in [1,10,100]")

    :param condition: an expression that indicates whether an entity should be deleted
    :type  condition: str
    """
```

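To make the shape of the `condition` argument concrete, the following is a minimal sketch of how an expression such as `_id in [1,10,100]` could be turned into a list of primary keys. The helper name `parse_delete_condition` and the restriction to integer keys in a single `in` clause are assumptions made for illustration; they are not taken from the Milvus code base.

```python
import ast
import re
from typing import List, Tuple

# Hypothetical helper, not part of Milvus: accepts only "<field> in [<int>, ...]".
_CONDITION = re.compile(r"^\s*(\w+)\s+in\s+(\[.*\])\s*$")


def parse_delete_condition(condition: str) -> Tuple[str, List[int]]:
    """Split a delete expression into (primary key field, list of primary keys)."""
    match = _CONDITION.match(condition)
    if match is None:
        raise ValueError(f"unsupported delete condition: {condition!r}")
    field, raw_keys = match.groups()
    try:
        # literal_eval only evaluates Python literals, never arbitrary code.
        keys = ast.literal_eval(raw_keys)
    except (ValueError, SyntaxError) as exc:
        raise ValueError(f"malformed key list: {raw_keys!r}") from exc
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if not all(isinstance(k, int) and not isinstance(k, bool) for k in keys):
        raise ValueError("primary keys must be integers")
    return field, keys


field, keys = parse_delete_condition("_id in [1,10,100]")
```

Rejecting anything other than a primary-key `in` list keeps the sketch in line with the docstring above, which describes deletion by primary keys only.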
Design Details

Proxy maintains two kinds of Pulsar channels for each collection:

  1. insert channel
    1. DDL msg (CreateCollection/DropCollection/CreatePartition/DropPartition)
    2. InsertMsg
    3. DeleteMsg
  2. search channel
    1. SearchMsg
    2. RetrieveMsg

DataNode consumes messages from the insert channel only.

QueryNode consumes messages from both the insert channel and the search channel.
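The channel layout and its consumers can be modeled as two small lookup tables. This is only an in-memory illustration of the routing described above; it does not use the Pulsar client, and the names `CHANNEL_OF`, `SUBSCRIPTIONS`, and `consumers_of` are hypothetical.

```python
from enum import Enum
from typing import List


class Channel(Enum):
    INSERT = "insert"
    SEARCH = "search"


# Message types carried by each channel, as listed above.
CHANNEL_OF = {
    "CreateCollection": Channel.INSERT,
    "DropCollection": Channel.INSERT,
    "CreatePartition": Channel.INSERT,
    "DropPartition": Channel.INSERT,
    "InsertMsg": Channel.INSERT,
    "DeleteMsg": Channel.INSERT,
    "SearchMsg": Channel.SEARCH,
    "RetrieveMsg": Channel.SEARCH,
}

# Which channels each node type consumes.
SUBSCRIPTIONS = {
    "DataNode": {Channel.INSERT},
    "QueryNode": {Channel.INSERT, Channel.SEARCH},
}


def consumers_of(msg_type: str) -> List[str]:
    """Return the node types that receive a message of the given type."""
    channel = CHANNEL_OF[msg_type]
    return sorted(node for node, chans in SUBSCRIPTIONS.items() if channel in chans)


delete_consumers = consumers_of("DeleteMsg")
search_consumers = consumers_of("SearchMsg")
```

Because `DeleteMsg` travels on the insert channel, both DataNode and QueryNode see it, whereas `SearchMsg` reaches QueryNode only.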


Since Milvus's storage is append-only, the `delete` function is implemented as a soft delete: a flag is set on the existing entity to indicate that this entity has been deleted.
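As a rough illustration of soft delete on append-only storage, the sketch below keeps every inserted row and only flips a per-row flag on deletion. The class `AppendOnlySegment` and its methods are hypothetical and greatly simplified; they are not the Milvus segment implementation.

```python
from typing import Iterable, List


class AppendOnlySegment:
    """Toy append-only store: rows are never removed, only flagged as deleted."""

    def __init__(self) -> None:
        self._primary_keys: List[int] = []
        self._deleted: List[bool] = []

    def insert(self, primary_key: int) -> int:
        """Append a row and return its offset in the segment."""
        self._primary_keys.append(primary_key)
        self._deleted.append(False)
        return len(self._primary_keys) - 1

    def delete(self, primary_keys: Iterable[int]) -> int:
        """Flag matching rows as deleted and return how many were newly flagged."""
        targets = set(primary_keys)
        flagged = 0
        for offset, pk in enumerate(self._primary_keys):
            if pk in targets and not self._deleted[offset]:
                self._deleted[offset] = True
                flagged += 1
        return flagged

    def live_primary_keys(self) -> List[int]:
        """Primary keys of rows that have not been flagged as deleted."""
        return [pk for pk, gone in zip(self._primary_keys, self._deleted) if not gone]

    def __len__(self) -> int:
        # Physical row count, including soft-deleted rows.
        return len(self._primary_keys)


segment = AppendOnlySegment()
for pk in (1, 10, 100, 7):
    segment.insert(pk)
flagged = segment.delete([10, 100, 999])  # 999 does not exist and is ignored
```

After the call, the segment still physically holds four rows, but only the entities with primary keys 1 and 7 remain visible.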

This solution needs to:

  • record the deletion offset in Milvus
  • let the algorithm library support searching with a bitset

The algorithm library `Knowhere` already supports searching with a bitset that indicates whether an entity is deleted. So we discuss how to store the deleted primary keys here.
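To show how recorded deletion offsets and a bitset fit together at search time, here is a minimal brute-force sketch in plain Python. It stands in for the algorithm library and does not reflect the actual Knowhere interface; the function names `build_delete_bitset` and `search_with_bitset`, and the convention that a set bit means "deleted", are assumptions for illustration.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def build_delete_bitset(num_entities: int, deleted_offsets: Iterable[int]) -> List[bool]:
    """Return a bitset where position i is True if the entity at offset i is deleted."""
    bitset = [False] * num_entities
    for offset in deleted_offsets:
        bitset[offset] = True
    return bitset


def search_with_bitset(
    vectors: Sequence[Sequence[float]],
    query: Sequence[float],
    bitset: Sequence[bool],
    top_k: int,
) -> List[Tuple[int, float]]:
    """Brute-force nearest-neighbor search that skips entities flagged in the bitset."""
    candidates = [
        (offset, math.dist(vector, query))
        for offset, vector in enumerate(vectors)
        if not bitset[offset]  # soft-deleted entities never become candidates
    ]
    candidates.sort(key=lambda item: item[1])
    return candidates[:top_k]


vectors = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.1, 0.0]]
bitset = build_delete_bitset(len(vectors), deleted_offsets=[0])
hits = search_with_bitset(vectors, query=[0.0, 0.0], bitset=bitset, top_k=2)
hit_offsets = [offset for offset, _ in hits]
```

The entity at offset 0 is an exact match for the query, but because its bit is set it is excluded, so the two nearest live entities (offsets 3 and 1) are returned instead.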

...