weaviate.data

Data module used to create, read, update and delete objects and references.

class weaviate.data.DataObject(connection: Connection)

Bases: object

DataObject class used to manipulate objects in weaviate.

reference

A Reference object used to create cross-references between objects.

Type:

weaviate.data.references.Reference

Initialize a DataObject class instance.

Parameters:

connection (weaviate.connect.Connection) – Connection object to an active and running weaviate instance.

create(data_object: dict | str, class_name: str, uuid: str | UUID | None = None, vector: Sequence | None = None, consistency_level: ConsistencyLevel | None = None) str

Takes a dict describing the object and adds it to weaviate.

Parameters:
  • data_object (dict or str) – Object to be added. If type is str it should be either a URL or a file.

  • class_name (str) – Class name associated with the object given.

  • uuid (str, uuid.UUID or None, optional) – Object will be created under this uuid if it is provided. Otherwise, weaviate will generate a uuid for this object, by default None.

  • vector (Sequence or None, optional) –

    Embedding for the object. Can be used when:

    • A class does not have a vectorization module.

    • The given vector was generated using the _identical_ vectorization module that is configured for the class. In this case this vector takes precedence.

    Supported types are list, numpy.ndarray, torch.Tensor and tf.Tensor, by default None.

  • consistency_level (Optional[ConsistencyLevel], optional) – Can be one of ‘ALL’, ‘ONE’, or ‘QUORUM’. Determines how many replicas must acknowledge a request before it is considered successful.

Examples

The schema contains a class Author with only the primitive properties ‘name’ and ‘age’.

>>> client.data_object.create(
...     data_object = {'name': 'Neil Gaiman', 'age': 60},
...     class_name = 'Author',
... )
'46091506-e3a0-41a4-9597-10e3064d8e2d'
>>> client.data_object.create(
...     data_object = {'name': 'Andrzej Sapkowski', 'age': 72},
...     class_name = 'Author',
...     uuid = 'e067f671-1202-42c6-848b-ff4d1eb804ab'
... )
'e067f671-1202-42c6-848b-ff4d1eb804ab'
Returns:

Returns the UUID of the created object if successful.

Return type:

str

Raises:
  • TypeError – If argument is of wrong type.

  • ValueError – If argument contains an invalid value.

  • weaviate.ObjectAlreadyExistsException – If an object with the given uuid already exists within weaviate.

  • weaviate.UnexpectedStatusCodeException – If creating the object in Weaviate failed for a different reason, more information is given in the exception.

  • requests.ConnectionError – If the network connection to weaviate fails.
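Because create accepts a caller-supplied uuid, object creation can be made repeatable by deriving the ID from the object's content. The following is a minimal sketch of that idea, not part of the client: the helper name deterministic_uuid and the choice of namespace are assumptions made for illustration, and only Python's standard uuid module is used.

```python
import json
import uuid


def deterministic_uuid(class_name: str, data_object: dict) -> str:
    """Derive a stable UUID from the class name and the object's properties.

    Hypothetical helper, not part of the weaviate client.
    """
    # Serialize with sorted keys so that key order does not change the result.
    payload = json.dumps(data_object, sort_keys=True)
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, f"{class_name}:{payload}"))


author = {"name": "Neil Gaiman", "age": 60}
author_id = deterministic_uuid("Author", author)
# The same input always yields the same ID, so repeating
# client.data_object.create(author, "Author", uuid=author_id)
# would run into the "object already exists" case rather than
# silently creating a duplicate.
```

With this approach a second import of the same record surfaces as weaviate.ObjectAlreadyExistsException, which the caller can catch and ignore if duplicates are expected.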

delete(uuid: str | UUID, class_name: str | None = None, consistency_level: ConsistencyLevel | None = None) None

Delete an existing object from weaviate.

Parameters:
  • uuid (str or uuid.UUID) – The ID of the object that should be deleted.

  • class_name (Optional[str], optional) – The class name of the object with UUID uuid. Introduced in Weaviate version v1.14.0. STRONGLY recommended to set it with Weaviate >= 1.14.0. It will be required in future versions of Weaviate Server and Clients. Use None value ONLY for Weaviate < v1.14.0, by default None

  • consistency_level (Optional[ConsistencyLevel], optional) – Can be one of ‘ALL’, ‘ONE’, or ‘QUORUM’. Determines how many replicas must acknowledge a request before it is considered successful.

Examples

>>> client.data_object.get(
...     uuid="d842a0f4-ad8c-40eb-80b4-bfefc7b1b530",
...     class_name='Author', # ONLY with Weaviate >= 1.14.0
... )
{
    "additional": {},
    "class": "Author",
    "creationTimeUnix": 1617112817487,
    "id": "d842a0f4-ad8c-40eb-80b4-bfefc7b1b530",
    "lastUpdateTimeUnix": 1617112817487,
    "properties": {
        "age": 46,
        "name": "H.P. Lovecraft"
    },
    "vectorWeights": null
}
>>> client.data_object.delete(
...     uuid="d842a0f4-ad8c-40eb-80b4-bfefc7b1b530",
...     class_name='Author', # ONLY with Weaviate >= 1.14.0
... )
>>> client.data_object.get(
...     uuid="d842a0f4-ad8c-40eb-80b4-bfefc7b1b530",
...     class_name='Author', # ONLY with Weaviate >= 1.14.0
... )
None
Raises:
  • requests.ConnectionError – If the network connection to weaviate fails.

  • weaviate.UnexpectedStatusCodeException – If weaviate reports a non-OK status.

  • TypeError – If parameter has the wrong type.

  • ValueError – If uuid is not properly formed.
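Since delete raises ValueError for a malformed uuid, it can be convenient to screen IDs that come from user input before calling it. The sketch below is a hypothetical pre-check built on Python's standard uuid module; the client performs its own validation, which may accept or reject forms that this helper does not.

```python
import uuid
from typing import Union


def is_well_formed_uuid(value: Union[str, uuid.UUID, None]) -> bool:
    """Return True if `value` parses as a UUID (hypothetical pre-check)."""
    if isinstance(value, uuid.UUID):
        return True
    try:
        uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        # ValueError: badly formed string; the others: non-string input.
        return False
    return True


candidate = "d842a0f4-ad8c-40eb-80b4-bfefc7b1b530"
if is_well_formed_uuid(candidate):
    # Safe to pass on, e.g. client.data_object.delete(uuid=candidate, class_name="Author")
    pass
```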

exists(uuid: str | UUID, class_name: str | None = None, consistency_level: ConsistencyLevel | None = None) bool

Check if the object exists in weaviate.

Parameters:
  • uuid (str or uuid.UUID) – The UUID of the object that may or may not exist within Weaviate.

  • class_name (Optional[str], optional) – The class name of the object with UUID uuid. Introduced in Weaviate version 1.14.0. STRONGLY recommended to set it with Weaviate >= 1.14.0. It will be required in future versions of Weaviate Server and Clients. Use None value ONLY for Weaviate < 1.14.0, by default None

  • consistency_level (Optional[ConsistencyLevel], optional) – Can be one of ‘ALL’, ‘ONE’, or ‘QUORUM’. Determines how many replicas must acknowledge a request before it is considered successful.

Examples

>>> client.data_object.exists(
...     uuid='e067f671-1202-42c6-848b-ff4d1eb804ab',
...     class_name='Author',  # ONLY with Weaviate >= 1.14.0
... )
False
>>> client.data_object.create(
...     data_object = {'name': 'Andrzej Sapkowski', 'age': 72},
...     class_name = 'Author',
...     uuid = 'e067f671-1202-42c6-848b-ff4d1eb804ab'
... )
>>> client.data_object.exists(
...     uuid='e067f671-1202-42c6-848b-ff4d1eb804ab',
...     class_name='Author', # ONLY with Weaviate >= 1.14.0
... )
True
Returns:

True if object exists, False otherwise.

Return type:

bool

Raises:
  • requests.ConnectionError – If the network connection to weaviate fails.

  • weaviate.UnexpectedStatusCodeException – If weaviate reports a non-OK status.

  • TypeError – If parameter has the wrong type.

  • ValueError – If uuid is not properly formed.
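A common use of exists is to create an object only when it is not already present. The sketch below shows that pattern under stated assumptions: create_if_absent is a hypothetical helper, and InMemoryDataObject is a tiny stand-in for client.data_object that exists only so the example runs without a server. Note that the check and the create are two separate requests, so this is not atomic.

```python
from typing import Any, Dict, Optional


def create_if_absent(data_object_api: Any, properties: dict,
                     class_name: str, uuid: str) -> bool:
    """Create the object only if no object with `uuid` exists yet.

    `data_object_api` is expected to behave like `client.data_object`.
    Returns True if an object was created, False if it already existed.
    """
    if data_object_api.exists(uuid=uuid, class_name=class_name):
        return False
    data_object_api.create(data_object=properties, class_name=class_name, uuid=uuid)
    return True


class InMemoryDataObject:
    """Stand-in for `client.data_object`, only so the sketch runs offline."""

    def __init__(self) -> None:
        self._store: Dict[str, dict] = {}

    def exists(self, uuid: str, class_name: Optional[str] = None) -> bool:
        return uuid in self._store

    def create(self, data_object: dict, class_name: str, uuid: str) -> str:
        self._store[uuid] = {"class": class_name, "properties": data_object}
        return uuid


api = InMemoryDataObject()
sapkowski = {"name": "Andrzej Sapkowski", "age": 72}
author_uuid = "e067f671-1202-42c6-848b-ff4d1eb804ab"
first = create_if_absent(api, sapkowski, "Author", author_uuid)   # creates the object
second = create_if_absent(api, sapkowski, "Author", author_uuid)  # finds it and skips
```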

get(uuid: str | UUID | None = None, additional_properties: List[str] = None, with_vector: bool = False, class_name: str | None = None, node_name: str | None = None, consistency_level: ConsistencyLevel | None = None, limit: int | None = None, after: str | UUID | None = None, offset: int | None = None, sort: Dict[str, str | bool | List[bool] | List[str]] | None = None) Dict[str, Any] | None

Get objects from weaviate. The maximum number of objects returned is 100. If ‘uuid’ is None, all objects are returned. If ‘uuid’ is specified, the result is the same as for the get_by_id method.

Parameters:
  • uuid (str, uuid.UUID or None, optional) – The identifier of the object that should be retrieved.

  • additional_properties (list of str, optional) – list of additional properties that should be included in the request, by default None

  • with_vector (bool) – If True the vector property will be returned too, by default False

  • class_name (Optional[str], optional) – The class name of the object with UUID uuid. Introduced in Weaviate version v1.14.0. STRONGLY recommended to set it with Weaviate >= 1.14.0. It will be required in future versions of Weaviate Server and Clients. Use None value ONLY for Weaviate < v1.14.0, by default None

  • consistency_level (Optional[ConsistencyLevel], optional) – Can be one of ‘ALL’, ‘ONE’, or ‘QUORUM’. Determines how many replicas must acknowledge a request before it is considered successful. Mutually exclusive with node_name param.

  • node_name (Optional[str], optional) – The name of the target node which should fulfill the request. Mutually exclusive with consistency_level param.

  • limit (Optional[int], optional) – The maximum number of data objects to return, by default None, which uses the weaviate default of 100 entries.

  • after (Optional[UUID], optional) – Can be used to extract all elements by giving the last ID from the previous “page”. Requires limit to be set but cannot be combined with any other filters or search. Part of the Cursor API.

  • offset (Optional[int], optional) – The offset of objects returned, i.e. the starting index of the returned objects. Should be used in conjunction with the ‘limit’ parameter.

  • sort (Optional[Dict]) –

    A dictionary for sorting objects.

    sort[‘properties’]: str, List[str]

    The properties by which the returned objects should be sorted. When more than one property is given, the objects are sorted in the order of the list. The sort direction can be set with sort[‘order_asc’].

    sort[‘order_asc’]: bool, List[bool]

    The direction in which each property given in sort[‘properties’] is sorted. When a single boolean is used, all properties are sorted in the same direction. If a list is used, it needs to have the same length as sort[‘properties’], and each property's direction is then decided individually. If sort[‘order_asc’] is True, the properties are sorted in ascending order; if it is False, they are sorted in descending order. If sort[‘order_asc’] is not given, all properties are sorted in ascending order.
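The rules for the sort dictionary can be easier to follow when written out. The sketch below expands a sort dictionary into explicit (property, ascending) pairs according to the description above. The helper expand_sort is hypothetical and is not the client's internal code; in particular, it assumes that a list-valued ‘order_asc’ must match the number of sorted properties.

```python
from typing import Dict, List, Tuple, Union

SortSpec = Dict[str, Union[str, bool, List[str], List[bool]]]


def expand_sort(sort: SortSpec) -> List[Tuple[str, bool]]:
    """Expand a sort dict into (property, ascending) pairs (hypothetical helper)."""
    properties = sort["properties"]
    if isinstance(properties, str):
        properties = [properties]
    # Ascending order is used when 'order_asc' is omitted.
    order_asc = sort.get("order_asc", True)
    if isinstance(order_asc, bool):
        # A single boolean applies to every property.
        order_asc = [order_asc] * len(properties)
    if len(order_asc) != len(properties):
        raise ValueError("'order_asc' must have one entry per sorted property")
    return list(zip(properties, order_asc))


pairs = expand_sort({"properties": ["name", "age"], "order_asc": [True, False]})
# pairs == [("name", True), ("age", False)]
```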

Returns:

A list of all objects. If no objects were found the list is empty.

Return type:

list of dicts

Raises:
  • TypeError – If argument is of wrong type.

  • ValueError – If argument contains an invalid value.

  • requests.ConnectionError – If the network connection to weaviate fails.

  • weaviate.UnexpectedStatusCodeException – If weaviate reports a non-OK status.
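The limit and after parameters can be combined to walk through more objects than a single call returns. The sketch below illustrates the cursor loop under two assumptions that are not stated on this page: that a listing response is a dict containing an "objects" list, and that each object carries its UUID under "id". The names iter_all_objects and FakeListing are hypothetical; FakeListing only simulates paging so the example runs offline.

```python
from typing import Any, Dict, Iterator, List, Optional


def iter_all_objects(data_object_api: Any, class_name: str,
                     page_size: int = 100) -> Iterator[dict]:
    """Yield every object of a class by following the `after` cursor."""
    cursor: Optional[str] = None
    while True:
        response = data_object_api.get(class_name=class_name, limit=page_size, after=cursor)
        # Assumption: a listing response is a dict with an "objects" list.
        page = (response or {}).get("objects", [])
        if not page:
            return
        yield from page
        # The last ID of this page is the cursor for the next one.
        cursor = page[-1]["id"]


class FakeListing:
    """Offline stand-in that pages through a fixed list of objects by ID."""

    def __init__(self, objects: List[dict]) -> None:
        self._objects = objects

    def get(self, class_name: str, limit: int, after: Optional[str] = None) -> Dict[str, Any]:
        ids = [obj["id"] for obj in self._objects]
        start = ids.index(after) + 1 if after is not None else 0
        return {"objects": self._objects[start:start + limit]}


fake = FakeListing([{"id": f"id-{i}"} for i in range(5)])
all_ids = [obj["id"] for obj in iter_all_objects(fake, "Author", page_size=2)]
```

The loop stops on the first empty page, which costs one extra request but avoids guessing whether a short page is the last one.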

get_by_id(uuid: str | UUID, additional_properties: List[str] = None, with_vector: bool = False, class_name: str | None = None, node_name: str | None = None, consistency_level: ConsistencyLevel | None = None) dict | None

Get an object as dict.

Parameters:
  • uuid (str or uuid.UUID) – The identifier of the object that should be retrieved.

  • additional_properties (list of str, optional) – List of additional properties that should be included in the request, by default None

  • with_vector (bool) – If True the vector property will be returned too, by default False.

  • class_name (Optional[str], optional) – The class name of the object with UUID uuid. Introduced in Weaviate version v1.14.0. STRONGLY recommended to set it with Weaviate >= 1.14.0. It will be required in future versions of Weaviate Server and Clients. Use None value ONLY for Weaviate < v1.14.0, by default None

Examples

>>> client.data_object.get_by_id(
...     uuid="d842a0f4-ad8c-40eb-80b4-bfefc7b1b530",
...     class_name='Author', # ONLY with Weaviate >= 1.14.0
... )
{
    "additional": {},
    "class": "Author",
    "creationTimeUnix": 1617112817487,
    "id": "d842a0f4-ad8c-40eb-80b4-bfefc7b1b530",
    "lastUpdateTimeUnix": 1617112817487,
    "properties": {
        "age": 46,
        "name": "H.P. Lovecraft"
    },
    "vectorWeights": null
}
Returns:

dict in case the object exists. None in case the object does not exist.

Return type:

dict or None

Raises:
  • TypeError – If argument is of wrong type.

  • ValueError – If argument contains an invalid value.

  • requests.ConnectionError – If the network connection to weaviate fails.

  • weaviate.UnexpectedStatusCodeException – If weaviate reports a non-OK status.
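Because get_by_id returns None for a missing object, callers usually need to handle that case before reading properties. A minimal sketch, assuming the returned dict has the "properties" key shown in the example above; get_property and FakeLookup are hypothetical names, and FakeLookup merely stands in for client.data_object so the snippet runs offline.

```python
from typing import Any, Dict, Optional


def get_property(data_object_api: Any, uuid: str, class_name: str,
                 prop: str, default: Any = None) -> Any:
    """Read one property of an object, or `default` if the object is missing."""
    obj = data_object_api.get_by_id(uuid=uuid, class_name=class_name)
    if obj is None:
        return default
    return obj.get("properties", {}).get(prop, default)


class FakeLookup:
    """Offline stand-in exposing only `get_by_id`."""

    def __init__(self, objects: Dict[str, dict]) -> None:
        self._objects = objects

    def get_by_id(self, uuid: str, class_name: Optional[str] = None) -> Optional[dict]:
        return self._objects.get(uuid)


lovecraft_id = "d842a0f4-ad8c-40eb-80b4-bfefc7b1b530"
api = FakeLookup({
    lovecraft_id: {"class": "Author", "properties": {"age": 46, "name": "H.P. Lovecraft"}},
})
name = get_property(api, lovecraft_id, "Author", "name")
missing = get_property(api, "00000000-0000-0000-0000-000000000000", "Author", "name",
                       default="unknown")
```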

replace(data_object: dict | str, class_name: str, uuid: str | UUID, vector: Sequence | None = None, consistency_level: ConsistencyLevel | None = None) None

Replace an already existing object with the given data object. This method replaces the whole object.

Parameters:
  • data_object (dict or str) – Describes the new values. It may be a URL, a path to a JSON file, or a Python dict describing the new values.

  • class_name (str) – Name of the class of the object that should be updated.

  • uuid (str or uuid.UUID) – The UUID of the object that should be changed.

  • vector (Sequence or None, optional) –

    Embedding for the object. Can be used when:

    • A class does not have a vectorization module.

    • The given vector was generated using the _identical_ vectorization module that is configured for the class. In this case this vector takes precedence.

    Supported types are list, numpy.ndarray, torch.Tensor and tf.Tensor, by default None.

  • consistency_level (Optional[ConsistencyLevel], optional) – Can be one of ‘ALL’, ‘ONE’, or ‘QUORUM’. Determines how many replicas must acknowledge a request before it is considered successful.

Examples

>>> author_id = client.data_object.create(
...     data_object = {'name': 'H. Lovecraft', 'age': 46},
...     class_name = 'Author'
... )
>>> client.data_object.get(author_id)
{
    "additional": {},
    "class": "Author",
    "creationTimeUnix": 1617112817487,
    "id": "d842a0f4-ad8c-40eb-80b4-bfefc7b1b530",
    "lastUpdateTimeUnix": 1617112817487,
    "properties": {
        "age": 46,
        "name": "H. Lovecraft"
    },
    "vectorWeights": null
}
>>> client.data_object.replace(
...     data_object = {'name': 'H.P. Lovecraft'},
...     class_name = 'Author',
...     uuid = author_id
... )
>>> client.data_object.get(author_id)
{
    "additional": {},
    "class": "Author",
    "id": "d842a0f4-ad8c-40eb-80b4-bfefc7b1b530",
    "lastUpdateTimeUnix": 1617112838668,
    "properties": {
        "name": "H.P. Lovecraft"
    },
    "vectorWeights": null
}
Raises:
  • TypeError – If argument is of wrong type.

  • ValueError – If argument contains an invalid value.

  • requests.ConnectionError – If the network connection to weaviate fails.

  • weaviate.UnexpectedStatusCodeException – If weaviate reports a non-OK status.

update(data_object: dict | str, class_name: str, uuid: str | UUID, vector: Sequence | None = None, consistency_level: ConsistencyLevel | None = None) None

Update an already existing object in weaviate with the given data object. Only the specified fields are overwritten; the unspecified ones remain unchanged.

Parameters:
  • data_object (dict or str) – The object states the fields that should be updated. Fields not specified in the ‘data_object’ remain unchanged. Fields that are None will not be changed. If type is str it should be either a URL or a file.

  • class_name (str) – The class name of the object.

  • uuid (str or uuid.UUID) – The ID of the object that should be changed.

  • vector (Sequence or None, optional) –

    Embedding for the object. Can be used when:

    • A class does not have a vectorization module.

    • The given vector was generated using the _identical_ vectorization module that is configured for the class. In this case this vector takes precedence.

    Supported types are list, numpy.ndarray, torch.Tensor and tf.Tensor, by default None.

  • consistency_level (Optional[ConsistencyLevel], optional) – Can be one of ‘ALL’, ‘ONE’, or ‘QUORUM’. Determines how many replicas must acknowledge a request before it is considered successful.

Examples

>>> author_id = client.data_object.create(
...     data_object = {'name': 'Philip Pullman', 'age': 64},
...     class_name = 'Author'
... )
>>> client.data_object.get(author_id)
{
    "additional": {},
    "class": "Author",
    "creationTimeUnix": 1617111215172,
    "id": "bec2bca7-264f-452a-a5bb-427eb4add068",
    "lastUpdateTimeUnix": 1617111215172,
    "properties": {
        "age": 64,
        "name": "Philip Pullman"
    },
    "vectorWeights": null
}
>>> client.data_object.update(
...     data_object = {'age': 74},
...     class_name = 'Author',
...     uuid = author_id
... )
>>> client.data_object.get(author_id)
{
    "additional": {},
    "class": "Author",
    "creationTimeUnix": 1617111215172,
    "id": "bec2bca7-264f-452a-a5bb-427eb4add068",
    "lastUpdateTimeUnix": 1617111215172,
    "properties": {
        "age": 74,
        "name": "Philip Pullman"
    },
    "vectorWeights": null
}
Raises:
  • TypeError – If argument is of wrong type.

  • ValueError – If argument contains an invalid value.

  • requests.ConnectionError – If the network connection to weaviate fails.

  • weaviate.UnexpectedStatusCodeException – If weaviate reports a non-successful status.
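The difference between replace and update is easy to mix up, so it may help to model both on plain dictionaries. The two functions below are hypothetical illustrations of the semantics described in this section (replace swaps the whole property set, update merges and ignores None fields); they do not call weaviate and are not how the server implements either operation.

```python
def replaced(existing: dict, new: dict) -> dict:
    """Model of `replace`: the new properties supersede the old ones entirely."""
    return dict(new)


def updated(existing: dict, new: dict) -> dict:
    """Model of `update`: merge, leaving unspecified and None fields untouched."""
    merged = dict(existing)
    merged.update({key: value for key, value in new.items() if value is not None})
    return merged


lovecraft = {"name": "H. Lovecraft", "age": 46}
pullman = {"name": "Philip Pullman", "age": 64}

after_replace = replaced(lovecraft, {"name": "H.P. Lovecraft"})
# 'age' is gone because it was not part of the replacement object.

after_update = updated(pullman, {"age": 74})
# 'name' is kept because update only touches the given fields.
```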

validate(data_object: dict | str, class_name: str, uuid: str | UUID | None = None, vector: Sequence = None) dict

Validate an object against weaviate.

Parameters:
  • data_object (dict or str) – Object to be validated. If type is str it should be either a URL or a file.

  • class_name (str) – Name of the class of the object that should be validated.

  • uuid (str, uuid.UUID or None, optional) – The UUID of the object that should be validated against weaviate, by default None.

  • vector (Sequence or None, optional) –

    The embedding of the object that should be validated. Can be used when:

    • A class does not have a vectorization module.

    • The given vector was generated using the _identical_ vectorization module that is configured for the class. In this case this vector takes precedence.

    Supported types are list, numpy.ndarray, torch.Tensor and tf.Tensor, by default None.

Examples

Assume we have an Author class with only a ‘name’ property and no ‘age’ property.

>>> client1.data_object.validate(
...     data_object = {'name': 'H. Lovecraft'},
...     class_name = 'Author'
... )
{'error': None, 'valid': True}
>>> client1.data_object.validate(
...     data_object = {'name': 'H. Lovecraft', 'age': 46},
...     class_name = 'Author'
... )
{
    "error": [
        {
        "message": "invalid object: no such prop with name 'age' found in class 'Author'
            in the schema. Check your schema files for which properties in this class are
            available"
        }
    ],
    "valid": false
}
Returns:

Validation result. E.g. {“valid”: bool, “error”: None or list}

Return type:

dict

Raises:
  • TypeError – If argument is of wrong type.

  • ValueError – If argument contains an invalid value.

  • weaviate.UnexpectedStatusCodeException – If validating the object against Weaviate failed for a different reason.

  • requests.ConnectionError – If the network connection to weaviate fails.
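The dict returned by validate can be turned into a flat list of messages before deciding whether to call create. A minimal sketch, assuming the result has the {"valid": bool, "error": None or list} shape described above and that each error entry has a "message" key as in the example; validation_messages is a hypothetical helper, and the sample error text is shortened from the example output.

```python
from typing import List


def validation_messages(result: dict) -> List[str]:
    """Extract error messages from a validate() result dict (hypothetical helper)."""
    if result.get("valid"):
        return []
    # "error" is None for valid objects and a list of dicts otherwise.
    return [entry.get("message", "") for entry in (result.get("error") or [])]


ok_result = {"error": None, "valid": True}
bad_result = {
    "error": [
        {"message": "invalid object: no such prop with name 'age' found in class 'Author'"}
    ],
    "valid": False,
}

ok_messages = validation_messages(ok_result)
bad_messages = validation_messages(bad_result)
```

An empty list means the object passed validation; otherwise the messages can be logged or shown to the user before any write is attempted.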

weaviate.data.references