tritonclient.utils#

Functions

`deserialize_bf16_tensor`(encoded_tensor)	Deserialize an encoded bf16 tensor into a numpy array of dtype of python objects.
`deserialize_bytes_tensor`(encoded_tensor)	Deserialize an encoded bytes tensor into a numpy array of dtype of python objects.
`np_to_triton_dtype`(np_dtype)
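`raise_error`(msg)	Raise an error with the provided message.
`serialize_bf16_tensor`(input_tensor)	Serialize a bfloat16 tensor into a flat numpy array of bytes.
`serialize_byte_tensor`(input_tensor)	Serialize a bytes tensor into a flat numpy array of length-prefixed bytes.
`serialized_byte_size`(tensor_value)	Get the underlying number of bytes for a numpy ndarray.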
`triton_to_np_dtype`(dtype)

Exceptions

`InferenceServerException`(msg[, status, debug_details])	Exception indicating non-success status.

exception tritonclient.utils.InferenceServerException(msg, status=None, debug_details=None)#

Exception indicating non-success status.

Parameters:

debug_details()#

Get detailed information about the exception for debugging purposes.

message()#

Get the exception message.

status()#

Get the status of the exception.
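The three accessors above can be read together when handling a failed request. The following is a minimal sketch, not the library's own code: `describe_failure` and `run_request` are hypothetical helpers, the `"NOT_FOUND"` status string is made up for illustration, and the fallback class exists only so the snippet runs where `tritonclient` is not installed (it mirrors just the constructor signature and accessors listed on this page).

```python
try:
    # Use the real class when the client library is installed.
    from tritonclient.utils import InferenceServerException
except ImportError:
    # Stand-in (assumption): mirrors only the signature and the three
    # accessors documented above, so the sketch stays runnable.
    class InferenceServerException(Exception):
        def __init__(self, msg, status=None, debug_details=None):
            super().__init__(msg)
            self._msg = msg
            self._status = status
            self._debug_details = debug_details

        def message(self):
            return self._msg

        def status(self):
            return self._status

        def debug_details(self):
            return self._debug_details


def describe_failure(exc):
    """Collect the accessor values into a dict for logging (hypothetical helper)."""
    return {
        "message": exc.message(),
        "status": exc.status(),
        "debug_details": exc.debug_details(),
    }


def run_request():
    # Hypothetical stand-in for a client call that fails.
    raise InferenceServerException("model not found", status="NOT_FOUND")


try:
    run_request()
except InferenceServerException as exc:
    report = describe_failure(exc)
    print(report)
```

`status` and `debug_details` default to `None`, so code that logs them should tolerate missing values.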

tritonclient.utils.deserialize_bf16_tensor(encoded_tensor)#

Deserialize an encoded bf16 tensor into a numpy array of dtype of python objects.

tritonclient.utils.deserialize_bytes_tensor(encoded_tensor)#

Deserialize an encoded bytes tensor into a numpy array of dtype of python objects.

tritonclient.utils.serialize_bf16_tensor(input_tensor)#

Serialize a bfloat16 tensor into a flat numpy array of bytes. The numpy array should use a dtype of np.float32.
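To illustrate why the input is expected as float32, here is a standard-library sketch of a bfloat16 round trip. It assumes the encoding keeps the two most significant bytes of each little-endian float32 (truncation, no rounding) and that decoding pads them back with zero low-order bytes. The function names are hypothetical and this is not presented as the library's implementation.

```python
import struct


def float32_to_bf16_bytes(values):
    """Keep the upper 16 bits of each float32 (truncation, no rounding)."""
    out = bytearray()
    for v in values:
        packed = struct.pack("<f", v)  # 4 bytes, little-endian float32
        out += packed[2:4]             # the two most significant bytes
    return bytes(out)


def bf16_bytes_to_float32(encoded):
    """Pad each 2-byte element with two zero low-order bytes and read it as float32."""
    if len(encoded) % 2:
        raise ValueError("bf16 payload must contain a whole number of 2-byte elements")
    return [
        struct.unpack("<f", b"\x00\x00" + encoded[i:i + 2])[0]
        for i in range(0, len(encoded), 2)
    ]


# These values need at most 7 mantissa bits, so they survive the round trip exactly.
encoded = float32_to_bf16_bytes([1.0, -2.5, 3.140625])
decoded = bf16_bytes_to_float32(encoded)
print(len(encoded), decoded)

# A value such as 0.1 loses precision because its low mantissa bits are dropped.
lossy = bf16_bytes_to_float32(float32_to_bf16_bytes([0.1]))[0]
print(lossy)
```

Each element occupies 2 bytes on the wire, half the size of the float32 input.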

tritonclient.utils.serialize_byte_tensor(input_tensor)#

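Serialize a bytes tensor into a flat numpy array of length-prefixed bytes. The numpy array should use a dtype of np.object_. Avoid np.bytes_: numpy strips trailing zeros from the end of each byte sequence, so the data would not round-trip intact.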
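The length-prefixed layout can be sketched with the standard library alone. The snippet below assumes each element is preceded by a 4-byte little-endian unsigned length (the page itself only says "length-prefixed"); `pack_length_prefixed` and `unpack_length_prefixed` are hypothetical names, not functions of `tritonclient.utils`.

```python
import struct


def pack_length_prefixed(items):
    """Concatenate byte strings, each preceded by its 4-byte length (assumed little-endian)."""
    out = bytearray()
    for item in items:
        out += struct.pack("<I", len(item))
        out += item
    return bytes(out)


def unpack_length_prefixed(encoded):
    """Walk the buffer, reading a length and then that many payload bytes."""
    items = []
    offset = 0
    while offset < len(encoded):
        (length,) = struct.unpack_from("<I", encoded, offset)
        offset += 4
        items.append(encoded[offset:offset + length])
        offset += length
    return items


# The last element ends in zero bytes, which an explicit length prefix preserves.
items = [b"hello", b"", b"ab\x00\x00"]
encoded = pack_length_prefixed(items)
roundtrip = unpack_length_prefixed(encoded)
print(len(encoded), roundtrip)
```

Because every element carries its own length, empty strings and trailing zero bytes survive the round trip, which is the property the np.bytes_ warning above is about.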

tritonclient.utils.serialized_byte_size(tensor_value)#

Get the underlying number of bytes for a numpy ndarray.

Modules

`cuda_shared_memory`
`shared_memory`