weaviate.collections.batch

class weaviate.collections.batch._BatchClient(connection, consistency_level, results, batch_mode, executor, vectorizer_batching, objects=None, references=None)[source]

Bases: _BatchBase

add_object(collection, properties=None, references=None, uuid=None, vector=None, tenant=None)[source]

Add one object to this batch.

NOTE: If an object with the same UUID already exists, the existing object will be replaced by the new object.

Parameters:
  • collection (str) – The name of the collection this object belongs to.

  • properties (Mapping[str, None | str | bool | int | float | datetime | UUID | GeoCoordinate | PhoneNumber | _PhoneNumber | Mapping[str, WeaviateField] | Sequence[str] | Sequence[bool] | Sequence[int] | Sequence[float] | Sequence[datetime] | Sequence[UUID] | Sequence[Mapping[str, WeaviateField]]] | None) – The data properties of the object to be added as a dictionary.

  • references (Mapping[str, str | UUID | Sequence[str | UUID] | ReferenceToMulti] | None) – The references of the object to be added as a dictionary.

  • uuid (str | UUID | None) – The UUID of the object as a uuid.UUID object or str. It can be a Weaviate beacon or Weaviate href. If None, a UUIDv4 will be generated. Defaults to None.

  • vector (Mapping[str, Sequence[int | float] | Sequence[Sequence[int | float]]] | Sequence[int | float] | None) – The embedding of the object. Can be used when a collection does not have a vectorization module, or when the given vector was generated using the identical vectorization module that is configured for the class. In that case this vector takes precedence. Supported types: for single vectors, list, numpy.ndarray, torch.Tensor and tf.Tensor; for named vectors, Dict[str, <any single-vector type above>], where the key is the name of the vector. Defaults to None.

  • tenant (str | Tenant | None) – The tenant name or Tenant object to be used for this request.

Returns:

The UUID of the added object. If one was not provided a UUIDv4 will be auto-generated for you and returned here.

Raises:

WeaviateBatchValidationError – If the provided options are not in the format required by Weaviate.

Return type:

str | UUID

add_reference(from_uuid, from_collection, from_property, to, tenant=None)[source]

Add one reference to this batch.

Parameters:
  • from_uuid (str | UUID) – The UUID of the object that should reference another object, as a uuid.UUID object or str.

  • from_collection (str) – The name of the collection that should reference another object.

  • from_property (str) – The name of the property that contains the reference.

  • to (str | UUID | Sequence[str | UUID] | ReferenceToMulti) – The UUID of the referenced object, as a uuid.UUID object or str. For multi-target references use wvc.Reference.to_multi_target().

  • tenant (str | Tenant | None) – The tenant name or Tenant object to be used for this request.

Raises:

WeaviateBatchValidationError – If the provided options are not in the format required by Weaviate.

Return type:

None
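As a minimal sketch of how add_object and add_reference fit together, the function below queues one author, several articles, and a reference from each article to the author. It assumes `client` is a connected Weaviate client whose `batch` attribute exposes `dynamic()` as a context manager (as described for _BatchClientWrapper). The collection names "Author" and "Article" and the property "writtenBy" are hypothetical.

```python
from typing import Any, List, Mapping, Sequence


def import_articles(
    client: Any,
    author: Mapping[str, Any],
    articles: Sequence[Mapping[str, Any]],
) -> List[Any]:
    """Queue one author plus articles that each reference that author."""
    article_uuids: List[Any] = []
    with client.batch.dynamic() as batch:
        # add_object returns the UUID of the queued object
        # (generated for us because no uuid argument is passed).
        author_uuid = batch.add_object(collection="Author", properties=author)
        for properties in articles:
            article_uuid = batch.add_object(
                collection="Article", properties=properties
            )
            # Queue the cross-reference Article.writtenBy -> Author.
            batch.add_reference(
                from_uuid=article_uuid,
                from_collection="Article",
                from_property="writtenBy",
                to=author_uuid,
            )
            article_uuids.append(article_uuid)
    # Leaving the with-block sends whatever is still queued.
    return article_uuids
```

The returned UUIDs can be kept to build further references later or to look the objects up after the import.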

class weaviate.collections.batch._BatchCollection(executor, connection, consistency_level, results, batch_mode, name, tenant, vectorizer_batching)[source]

Bases: Generic[Properties], _BatchBase

add_object(properties=None, references=None, uuid=None, vector=None)[source]

Add one object to this batch.

NOTE: If an object with the same UUID already exists, the existing object will be replaced by the new object.

Parameters:
  • properties (Properties | None) – The data properties of the object to be added as a dictionary.

  • references (Mapping[str, str | UUID | Sequence[str | UUID] | ReferenceToMulti] | None) – The references of the object to be added as a dictionary.

  • uuid (str | UUID | None) – The UUID of the object as a uuid.UUID object or str. If None, a UUIDv4 will be generated. Defaults to None.

  • vector (Mapping[str, Sequence[int | float] | Sequence[Sequence[int | float]]] | Sequence[int | float] | None) – The embedding of the object. Can be used when a collection does not have a vectorization module, or when the given vector was generated using the identical vectorization module that is configured for the class. In that case this vector takes precedence. Supported types: for single vectors, list, numpy.ndarray, torch.Tensor and tf.Tensor; for named vectors, Dict[str, <any single-vector type above>], where the key is the name of the vector. Defaults to None.

Returns:

The UUID of the added object. If one was not provided a UUIDv4 will be auto-generated for you and returned here.

Raises:

WeaviateBatchValidationError – If the provided options are not in the format required by Weaviate.

Return type:

str | UUID
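Because an object whose UUID already exists is replaced, deterministic UUIDs make a re-import overwrite earlier objects instead of duplicating them. The sketch below illustrates that idea. It assumes `collection` is a collection handle whose `batch.dynamic()` yields a _BatchCollection; `stable_uuid`, the choice of `uuid.NAMESPACE_URL`, and the `key_field` parameter are illustrative names, not part of the library.

```python
import uuid
from typing import Any, Iterable, List, Mapping


def stable_uuid(key: str) -> uuid.UUID:
    """Derive the same UUIDv5 for the same key on every run."""
    return uuid.uuid5(uuid.NAMESPACE_URL, key)


def upsert_rows(
    collection: Any,
    rows: Iterable[Mapping[str, Any]],
    key_field: str,
) -> List[Any]:
    """Queue rows with UUIDs derived from a natural key."""
    returned: List[Any] = []
    with collection.batch.dynamic() as batch:
        for row in rows:
            returned.append(
                batch.add_object(
                    properties=dict(row),
                    # Same key -> same UUID -> the existing object is replaced.
                    uuid=stable_uuid(str(row[key_field])),
                )
            )
    return returned
```

Rows that share a key end up as one object, holding the properties of the row that was added last.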

add_reference(from_uuid, from_property, to)[source]

Add a reference to this batch.

Parameters:
  • from_uuid (str | UUID) – The UUID of the object that should reference another object, as a uuid.UUID object or str.

  • from_property (str) – The name of the property that contains the reference.

  • to (str | UUID | Sequence[str | UUID] | ReferenceToMulti | List[str | UUID]) – The UUID of the referenced object, as a uuid.UUID object or str. For multi-target references use wvc.Reference.to_multi_target().

Raises:

WeaviateBatchValidationError – If the provided options are not in the format required by Weaviate.

Return type:

None

class weaviate.collections.batch._BatchGRPC(weaviate_version, consistency_level)[source]

Bases: _BaseGRPC

This class is used to insert multiple objects into Weaviate using the gRPC API.

It is used within the _Data and _Batch classes, hence the generalities and abstractions needed to avoid coupling too strongly to either use case.

_BatchGRPC__grpc_objects(objects)
Parameters:

objects (List[_BatchObject])

Return type:

List[BatchObject]

_BatchGRPC__multi_vec(vectors)
Parameters:

vectors (Mapping[str, Sequence[int | float] | Sequence[Sequence[int | float]]] | Sequence[int | float] | None)

Return type:

List[Vectors] | None

_BatchGRPC__single_vec(vectors)
Parameters:

vectors (Mapping[str, Sequence[int | float] | Sequence[Sequence[int | float]]] | Sequence[int | float] | None)

Return type:

bytes | None

_BatchGRPC__translate_properties_from_python_to_grpc(data, refs)
Parameters:
  • data (Dict[str, Any])

  • refs (Mapping[str, str | UUID | Sequence[str | UUID] | ReferenceToMulti])

Return type:

Properties

objects(connection, *, objects, timeout, max_retries)[source]

Insert multiple objects into Weaviate through the gRPC API.

Parameters:
  • connection (ConnectionSync | ConnectionAsync) – The connection to the Weaviate instance.

  • objects (List[_BatchObject]) – A list of WeaviateObject containing the data of the objects to be inserted. The class name must be provided for each object, and the UUID is optional. If no UUID is provided, one will be generated for each object. The UUIDs of the inserted objects will be returned in the uuids attribute of the returned _BatchReturn object. The UUIDs of the objects that failed to be inserted will be returned in the errors attribute of the returned _BatchReturn object.

  • timeout (int | float) – The timeout in seconds for the request.

  • max_retries (float) – The maximum number of retries in case of a failure.

Return type:

BatchObjectReturn | Awaitable[BatchObjectReturn]

class weaviate.collections.batch._BatchREST(consistency_level)[source]

Bases: object

Parameters:

consistency_level (ConsistencyLevel | None)

references(connection, *, references)[source]
Parameters:
  • connection (ConnectionSync | ConnectionAsync)

  • references (List[_BatchReference])

Return type:

BatchReferenceReturn | Awaitable[BatchReferenceReturn]

weaviate.collections.batch.base

class weaviate.collections.batch.base.BatchRequest[source]

Bases: ABC, Generic[TBatchInput, TBatchReturn]

BatchRequest abstract class used as an interface for batch requests.

add(item)[source]

Add an item to the BatchRequest.

Parameters:

item (TBatchInput)

Return type:

None

prepend(item)[source]

Add items to the front of the BatchRequest.

This is intended to be used when objects should be retried, e.g. after a temporary error.

Parameters:

item (List[TBatchInput])

Return type:

None

_abc_impl = <_abc._abc_data object>
class weaviate.collections.batch.base.ReferencesBatchRequest[source]

Bases: BatchRequest[_BatchReference, BatchReferenceReturn]

Collect Weaviate-object references to add them in one request to Weaviate.

pop_items(pop_amount, uuid_lookup)[source]

Pop the given number of items from the BatchRequest queue.

Returns:

A list of items from the BatchRequest.

Parameters:
  • pop_amount (int)

  • uuid_lookup (Set[str])

Return type:

List[_BatchReference]

_abc_impl = <_abc._abc_data object>
class weaviate.collections.batch.base.ObjectsBatchRequest[source]

Bases: BatchRequest[_BatchObject, BatchObjectReturn]

Collect objects for one batch request to Weaviate.

pop_items(pop_amount)[source]

Pop the given number of items from the BatchRequest queue.

Returns:

A list of items from the BatchRequest.

Parameters:

pop_amount (int)

Return type:

List[_BatchObject]

_abc_impl = <_abc._abc_data object>
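The add / prepend / pop_items contract described for the batch request classes can be pictured with a small stand-in queue. This is only a sketch of the documented behaviour built on `collections.deque`, not the library's implementation; `SketchQueue` is a hypothetical name, and the `uuid_lookup` argument of ReferencesBatchRequest.pop_items is not modelled.

```python
from collections import deque
from typing import Deque, Generic, List, TypeVar

T = TypeVar("T")


class SketchQueue(Generic[T]):
    """Illustrative FIFO queue with a 'put back at the front' operation."""

    def __init__(self) -> None:
        self._items: Deque[T] = deque()

    def add(self, item: T) -> None:
        # New items go to the back of the queue.
        self._items.append(item)

    def prepend(self, items: List[T]) -> None:
        # extendleft() inserts in reverse, so reverse first to keep the order.
        self._items.extendleft(reversed(items))

    def pop_items(self, pop_amount: int) -> List[T]:
        # Never pop more than the queue currently holds.
        count = min(pop_amount, len(self._items))
        return [self._items.popleft() for _ in range(count)]

    def __len__(self) -> int:
        return len(self._items)
```

Popping a batch, failing to send it, and calling `prepend` with the same list puts the items back ahead of anything added in the meantime, so they are retried first and in their original order.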
class weaviate.collections.batch.base._BatchDataWrapper(results: weaviate.collections.classes.batch.BatchResult = <factory>, failed_objects: List[weaviate.collections.classes.batch.ErrorObject] = <factory>, failed_references: List[weaviate.collections.classes.batch.ErrorReference] = <factory>, imported_shards: Set[weaviate.collections.classes.batch.Shard] = <factory>)[source]

Bases: object

Parameters:
results: BatchResult
failed_objects: List[ErrorObject]
failed_references: List[ErrorReference]
imported_shards: Set[Shard]
class weaviate.collections.batch.base._DynamicBatching[source]

Bases: object

class weaviate.collections.batch.base._FixedSizeBatching(batch_size: int, concurrent_requests: int)[source]

Bases: object

Parameters:
  • batch_size (int)

  • concurrent_requests (int)

batch_size: int
concurrent_requests: int
class weaviate.collections.batch.base._RateLimitedBatching(requests_per_minute: int)[source]

Bases: object

Parameters:

requests_per_minute (int)

requests_per_minute: int
class weaviate.collections.batch.base._BatchBase(connection, consistency_level, results, batch_mode, executor, vectorizer_batching, objects=None, references=None)[source]

Bases: object

property number_errors: int

Return the number of errors in the batch.

_shutdown()[source]

Shutdown the current batch and wait for all requests to be finished.

Return type:

None

flush()[source]

Flush the batch queue and wait for all requests to be finished.

Return type:

None
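One way to use number_errors and flush() together is to stop queuing once too many errors have accumulated. The sketch below assumes `batch` is a client-level batch object (so add_object takes a collection argument); the collection name "Article" and the `max_errors` budget are hypothetical. Note that errors can only show up after a batch has actually been sent, so the running count may lag behind the queue.

```python
from typing import Any, Iterable, Mapping, Tuple


def import_with_error_budget(
    batch: Any,
    rows: Iterable[Mapping[str, Any]],
    max_errors: int = 10,
) -> Tuple[int, int]:
    """Queue rows until the error budget is exceeded.

    Returns (number of rows queued, final error count).
    """
    queued = 0
    for row in rows:
        # Stop queuing once the running error count exceeds the budget.
        if batch.number_errors > max_errors:
            break
        batch.add_object(collection="Article", properties=row)
        queued += 1
    # Wait for everything queued so far, so the error count is final.
    batch.flush()
    return queued, batch.number_errors
```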

_add_object(collection, properties=None, references=None, uuid=None, vector=None, tenant=None)[source]
Parameters:
  • collection (str)

  • properties (Mapping[str, None | str | bool | int | float | datetime | UUID | GeoCoordinate | PhoneNumber | _PhoneNumber | Mapping[str, WeaviateField] | Sequence[str] | Sequence[bool] | Sequence[int] | Sequence[float] | Sequence[datetime] | Sequence[UUID] | Sequence[Mapping[str, WeaviateField]]] | None)

  • references (Mapping[str, str | UUID | Sequence[str | UUID] | ReferenceToMulti] | None)

  • uuid (str | UUID | None)

  • vector (Mapping[str, Sequence[int | float] | Sequence[Sequence[int | float]]] | Sequence[int | float] | None)

  • tenant (str | None)

Return type:

str | UUID

_BatchBase__batch_send()
Return type:

None

_BatchBase__check_bg_thread_alive()
Return type:

None

_BatchBase__dynamic_batch_rate_loop()
Return type:

None

_BatchBase__dynamic_batching()
Return type:

None

_BatchBase__send_batch(objs, refs, readd_rate_limit)
Return type:

None

_BatchBase__start_bg_threads()

Create a background thread that periodically checks how congested the batch queue is.

Return type:

Thread

_add_reference(from_object_uuid, from_object_collection, from_property_name, to, tenant=None)[source]
Parameters:
  • from_object_uuid (str | UUID)

  • from_object_collection (str)

  • from_property_name (str)

  • to (str | UUID | Sequence[str | UUID] | ReferenceToMulti)

  • tenant (str | None)

Return type:

None

class weaviate.collections.batch.base._ClusterBatch(connection)[source]

Bases: object

Parameters:

connection (ConnectionSync)

get_nodes_status()[source]
Return type:

List[Node]

weaviate.collections.batch.batch_wrapper

class weaviate.collections.batch.batch_wrapper._BatchWrapper(connection, consistency_level)[source]

Bases: object

wait_for_vector_indexing(shards=None, how_many_failures=5)[source]

Wait for all the vectors of the batch-imported objects to be indexed.

On a network error, it retries fetching the shards’ status up to how_many_failures times with exponential backoff (2**n seconds, with n = 0, 1, 2, …, how_many_failures).

Parameters:
  • shards (List[Shard] | None) – The shards to check the status of. If None it will check the status of all the shards of the imported objects in the batch.

  • how_many_failures (int) – How many times to try to get the shards’ status before raising an exception. Default 5.

Return type:

None
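The backoff schedule quoted above (2**n seconds for n = 0 … how_many_failures) can be written out as follows. This is a standalone sketch, not the library's code: `backoff_delays` and `retry_with_backoff` are hypothetical helpers, and how many attempts the library actually makes is an assumption here; the helper simply gives up once the supplied delays are used up.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def backoff_delays(how_many_failures: int = 5) -> List[int]:
    """Delays in seconds: 2**n for n = 0 .. how_many_failures."""
    return [2 ** n for n in range(how_many_failures + 1)]


def retry_with_backoff(
    fn: Callable[[], T],
    delays: Iterable[float],
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn; on ConnectionError wait for the next delay and retry.

    Re-raises the last ConnectionError once the delays are exhausted.
    """
    remaining = list(delays)
    while True:
        try:
            return fn()
        except ConnectionError:
            if not remaining:
                raise
            sleep(remaining.pop(0))
```

With the default of 5, the schedule is 1, 2, 4, 8, 16 and 32 seconds, about a minute in total if every delay is used. Passing a custom `sleep` makes the helper easy to test without waiting.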

_get_shards_readiness(shard)[source]
Parameters:

shard (Shard)

Return type:

List[bool]

property failed_objects: List[ErrorObject]

Get all failed objects from the batch manager.

Returns:

A list of all the failed objects from the batch.

property failed_references: List[ErrorReference]

Get all failed references from the batch manager.

Returns:

A list of all the failed references from the batch.

property results: BatchResult

Get the results of the batch operation.

Returns:

The results of the batch operation.
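After an import finishes, the failed_objects and failed_references properties can be summarised in a few lines. The sketch below assumes `batch_wrapper` is any object exposing those two properties (for example the object behind `client.batch`). Reading a `message` attribute from each error entry is an assumption, which is why the code falls back to `repr()`.

```python
from typing import Any, Dict


def summarize_failures(batch_wrapper: Any, sample_size: int = 5) -> Dict[str, Any]:
    """Count failed objects and references and collect a few messages."""
    failed_objects = list(batch_wrapper.failed_objects)
    failed_references = list(batch_wrapper.failed_references)
    errors = failed_objects + failed_references
    return {
        "failed_objects": len(failed_objects),
        "failed_references": len(failed_references),
        # `message` is assumed to exist; fall back to repr() otherwise.
        "sample_messages": [
            getattr(error, "message", repr(error)) for error in errors[:sample_size]
        ],
    }
```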

_BatchWrapper__get_shards_readiness(shard)
Parameters:

shard (Shard)

Return type:

List[bool]

_BatchWrapper__is_ready(max_count, shards, backoff_count=0)
Parameters:
  • max_count (int)

  • shards (List[Shard] | None)

  • backoff_count (int)

Return type:

bool

class weaviate.collections.batch.batch_wrapper._ContextManagerWrapper(current_batch)[source]

Bases: Generic[T]

Parameters:

current_batch (T)

weaviate.collections.batch.client

class weaviate.collections.batch.client._BatchClient(connection, consistency_level, results, batch_mode, executor, vectorizer_batching, objects=None, references=None)[source]

Bases: _BatchBase

add_object(collection, properties=None, references=None, uuid=None, vector=None, tenant=None)[source]

Add one object to this batch.

NOTE: If an object with the same UUID already exists, the existing object will be replaced by the new object.

Parameters:
  • collection (str) – The name of the collection this object belongs to.

  • properties (Mapping[str, None | str | bool | int | float | datetime | UUID | GeoCoordinate | PhoneNumber | _PhoneNumber | Mapping[str, WeaviateField] | Sequence[str] | Sequence[bool] | Sequence[int] | Sequence[float] | Sequence[datetime] | Sequence[UUID] | Sequence[Mapping[str, WeaviateField]]] | None) – The data properties of the object to be added as a dictionary.

  • references (Mapping[str, str | UUID | Sequence[str | UUID] | ReferenceToMulti] | None) – The references of the object to be added as a dictionary.

  • uuid (str | UUID | None) – The UUID of the object as a uuid.UUID object or str. It can be a Weaviate beacon or Weaviate href. If None, a UUIDv4 will be generated. Defaults to None.

  • vector (Mapping[str, Sequence[int | float] | Sequence[Sequence[int | float]]] | Sequence[int | float] | None) – The embedding of the object. Can be used when a collection does not have a vectorization module, or when the given vector was generated using the identical vectorization module that is configured for the class. In that case this vector takes precedence. Supported types: for single vectors, list, numpy.ndarray, torch.Tensor and tf.Tensor; for named vectors, Dict[str, <any single-vector type above>], where the key is the name of the vector. Defaults to None.

  • tenant (str | Tenant | None) – The tenant name or Tenant object to be used for this request.

Returns:

The UUID of the added object. If one was not provided a UUIDv4 will be auto-generated for you and returned here.

Raises:

WeaviateBatchValidationError – If the provided options are not in the format required by Weaviate.

Return type:

str | UUID

add_reference(from_uuid, from_collection, from_property, to, tenant=None)[source]

Add one reference to this batch.

Parameters:
  • from_uuid (str | UUID) – The UUID of the object that should reference another object, as a uuid.UUID object or str.

  • from_collection (str) – The name of the collection that should reference another object.

  • from_property (str) – The name of the property that contains the reference.

  • to (str | UUID | Sequence[str | UUID] | ReferenceToMulti) – The UUID of the referenced object, as a uuid.UUID object or str. For multi-target references use wvc.Reference.to_multi_target().

  • tenant (str | Tenant | None) – The tenant name or Tenant object to be used for this request.

Raises:

WeaviateBatchValidationError – If the provided options are not in the format required by Weaviate.

Return type:

None

weaviate.collections.batch.client.BatchClient

alias of _BatchClient

class weaviate.collections.batch.client._BatchClientWrapper(connection, config, consistency_level=None)[source]

Bases: _BatchWrapper

dynamic(consistency_level=None)[source]

Configure dynamic batching.

When you exit the context manager, the final batch will be sent automatically.

Parameters:

consistency_level (ConsistencyLevel | None) – The consistency level to be used to send batches. If not provided, the default value is None.

Return type:

_ContextManagerWrapper[_BatchClient]

fixed_size(batch_size=100, concurrent_requests=2, consistency_level=None)[source]

Configure fixed-size batches. Note that the default is dynamic batching.

When you exit the context manager, the final batch will be sent automatically.

Parameters:
  • batch_size (int) – The number of objects/references to be sent in one batch. If not provided, the default value is 100.

  • concurrent_requests (int) – The number of concurrent requests when sending batches. This controls the number of concurrent requests made to Weaviate and not the speed of batch creation within Python.

  • consistency_level (ConsistencyLevel | None) – The consistency level to be used to send batches. If not provided, the default value is None.

Return type:

_ContextManagerWrapper[_BatchClient]
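To make the meaning of batch_size concrete, the generator below splits a stream of items into groups of at most batch_size. The client does its own grouping internally when fixed-size batching is configured, so this is purely an illustration of the partitioning and not the library's implementation; `chunked` is a hypothetical helper.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk
```

With 250 items and the default batch_size of 100, this yields groups of 100, 100 and 50; the last, smaller group corresponds to the final batch that is sent when the context manager exits.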

rate_limit(requests_per_minute, consistency_level=None)[source]

Configure batches with a rate-limited vectorizer.

When you exit the context manager, the final batch will be sent automatically.

Parameters:
  • requests_per_minute (int) – The number of requests that the vectorizer can process per minute.

  • consistency_level (ConsistencyLevel | None) – The consistency level to be used to send batches. If not provided, the default value is None.

Return type:

_ContextManagerWrapper[_BatchClient]
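The three configuration methods (dynamic, fixed_size, rate_limit) each return a context manager, so choosing between them can be wrapped in a small helper. This is a sketch under the assumption that `client.batch` is a _BatchClientWrapper; `choose_batch_mode` and its keyword arguments are hypothetical.

```python
from typing import Any, Optional


def choose_batch_mode(
    client: Any,
    *,
    batch_size: Optional[int] = None,
    concurrent_requests: int = 2,
    requests_per_minute: Optional[int] = None,
) -> Any:
    """Return the batching context manager matching the given options."""
    if batch_size is not None and requests_per_minute is not None:
        raise ValueError("pass either batch_size or requests_per_minute, not both")
    if requests_per_minute is not None:
        # The vectorizer is rate limited: let the client pace the requests.
        return client.batch.rate_limit(requests_per_minute=requests_per_minute)
    if batch_size is not None:
        # Explicit batch size and concurrency.
        return client.batch.fixed_size(
            batch_size=batch_size, concurrent_requests=concurrent_requests
        )
    # Default: dynamic batching.
    return client.batch.dynamic()
```

It would be used as `with choose_batch_mode(client, batch_size=200) as batch: ...`, with objects added inside the block and the final batch sent on exit.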

_BatchClientWrapper__create_batch_and_reset()
Return type:

_ContextManagerWrapper[_BatchClient]

weaviate.collections.batch.collection

class weaviate.collections.batch.collection._BatchCollection(executor, connection, consistency_level, results, batch_mode, name, tenant, vectorizer_batching)[source]

Bases: Generic[Properties], _BatchBase

add_object(properties=None, references=None, uuid=None, vector=None)[source]

Add one object to this batch.

NOTE: If an object with the same UUID already exists, the existing object will be replaced by the new object.

Parameters:
  • properties (Properties | None) – The data properties of the object to be added as a dictionary.

  • references (Mapping[str, str | UUID | Sequence[str | UUID] | ReferenceToMulti] | None) – The references of the object to be added as a dictionary.

  • uuid (str | UUID | None) – The UUID of the object as a uuid.UUID object or str. If None, a UUIDv4 will be generated. Defaults to None.

  • vector (Mapping[str, Sequence[int | float] | Sequence[Sequence[int | float]]] | Sequence[int | float] | None) – The embedding of the object. Can be used when a collection does not have a vectorization module, or when the given vector was generated using the identical vectorization module that is configured for the class. In that case this vector takes precedence. Supported types: for single vectors, list, numpy.ndarray, torch.Tensor and tf.Tensor; for named vectors, Dict[str, <any single-vector type above>], where the key is the name of the vector. Defaults to None.

Returns:

The UUID of the added object. If one was not provided a UUIDv4 will be auto-generated for you and returned here.

Raises:

WeaviateBatchValidationError – If the provided options are not in the format required by Weaviate.

Return type:

str | UUID

add_reference(from_uuid, from_property, to)[source]

Add a reference to this batch.

Parameters:
  • from_uuid (str | UUID) – The UUID of the object that should reference another object, as a uuid.UUID object or str.

  • from_property (str) – The name of the property that contains the reference.

  • to (str | UUID | Sequence[str | UUID] | ReferenceToMulti | List[str | UUID]) – The UUID of the referenced object, as a uuid.UUID object or str. For multi-target references use wvc.Reference.to_multi_target().

Raises:

WeaviateBatchValidationError – If the provided options are not in the format required by Weaviate.

Return type:

None

class weaviate.collections.batch.collection._BatchCollectionWrapper(connection, consistency_level, name, tenant, config)[source]

Bases: Generic[Properties], _BatchWrapper

dynamic()[source]

Configure dynamic batching.

When you exit the context manager, the final batch will be sent automatically.

Return type:

_ContextManagerWrapper[_BatchCollection[Properties]]

fixed_size(batch_size=100, concurrent_requests=2)[source]

Configure fixed-size batches. Note that the default is dynamic batching.

When you exit the context manager, the final batch will be sent automatically.

Parameters:
  • batch_size (int) – The number of objects/references to be sent in one batch. If not provided, the default value is 100.

  • concurrent_requests (int) – The number of concurrent requests when sending batches. This controls the number of concurrent requests made to Weaviate and not the speed of batch creation within Python.

Return type:

_ContextManagerWrapper[_BatchCollection[Properties]]

rate_limit(requests_per_minute)[source]

Configure batches with a rate-limited vectorizer.

When you exit the context manager, the final batch will be sent automatically.

Parameters:

requests_per_minute (int) – The number of requests that the vectorizer can process per minute.

Return type:

_ContextManagerWrapper[_BatchCollection[Properties]]

_BatchCollectionWrapper__create_batch_and_reset()
Return type:

_ContextManagerWrapper[_BatchCollection[Properties]]

weaviate.collections.batch.grpc_batch_delete

class weaviate.collections.batch.grpc_batch_delete._BatchDeleteGRPC(weaviate_version, consistency_level)[source]

Bases: _BaseGRPC

This class is used to delete multiple objects from Weaviate using the gRPC API.

batch_delete(connection, *, name, filters, verbose, dry_run, tenant)[source]
Parameters:
  • connection (ConnectionSync | ConnectionAsync)

  • name (str)

  • filters (_Filters)

  • verbose (bool)

  • dry_run (bool)

  • tenant (str | None)

Return type:

DeleteManyReturn[List[DeleteManyObject]] | DeleteManyReturn[None] | Awaitable[DeleteManyReturn[List[DeleteManyObject]] | DeleteManyReturn[None]]

weaviate.collections.batch.grpc_batch_objects

class weaviate.collections.batch.grpc_batch_objects._BatchGRPC(weaviate_version, consistency_level)[source]

Bases: _BaseGRPC

This class is used to insert multiple objects into Weaviate using the gRPC API.

It is used within the _Data and _Batch classes, hence the generalities and abstractions needed to avoid coupling too strongly to either use case.

objects(connection, *, objects, timeout, max_retries)[source]

Insert multiple objects into Weaviate through the gRPC API.

Parameters:
  • connection (ConnectionSync | ConnectionAsync) – The connection to the Weaviate instance.

  • objects (List[_BatchObject]) – A list of WeaviateObject containing the data of the objects to be inserted. The class name must be provided for each object, and the UUID is optional. If no UUID is provided, one will be generated for each object. The UUIDs of the inserted objects will be returned in the uuids attribute of the returned _BatchReturn object. The UUIDs of the objects that failed to be inserted will be returned in the errors attribute of the returned _BatchReturn object.

  • timeout (int | float) – The timeout in seconds for the request.

  • max_retries (float) – The maximum number of retries in case of a failure.

Return type:

BatchObjectReturn | Awaitable[BatchObjectReturn]

_BatchGRPC__grpc_objects(objects)
Parameters:

objects (List[_BatchObject])

Return type:

List[BatchObject]

_BatchGRPC__multi_vec(vectors)
Parameters:

vectors (Mapping[str, Sequence[int | float] | Sequence[Sequence[int | float]]] | Sequence[int | float] | None)

Return type:

List[Vectors] | None

_BatchGRPC__single_vec(vectors)
Parameters:

vectors (Mapping[str, Sequence[int | float] | Sequence[Sequence[int | float]]] | Sequence[int | float] | None)

Return type:

bytes | None

_BatchGRPC__translate_properties_from_python_to_grpc(data, refs)
Parameters:
  • data (Dict[str, Any])

  • refs (Mapping[str, str | UUID | Sequence[str | UUID] | ReferenceToMulti])

Return type:

Properties

weaviate.collections.batch.grpc_batch_objects._validate_props(props)[source]
Parameters:

props (Dict[str, Any])

Return type:

None

weaviate.collections.batch.grpc_batch_objects._serialize_primitive(value)[source]
Parameters:

value (Any)

Return type:

Any

weaviate.collections.batch.rest

class weaviate.collections.batch.rest._BatchREST(consistency_level)[source]

Bases: object

Parameters:

consistency_level (ConsistencyLevel | None)

references(connection, *, references)[source]
Parameters:
  • connection (ConnectionSync | ConnectionAsync)

  • references (List[_BatchReference])

Return type:

BatchReferenceReturn | Awaitable[BatchReferenceReturn]