Updates weights in an engine.

#include <NvInferRuntime.h>

[Inheritance diagram for nvinfer1::IRefitter]

Public Member Functions
virtual	~IRefitter () noexcept=default

bool	setWeights (char const *layerName, WeightsRole role, Weights weights) noexcept
	Specify new weights for a layer of the given name. Returns true on success, or false if the new weights are rejected.

bool	refitCudaEngine () noexcept
	Refit the associated engine.

int32_t	getMissing (int32_t size, char const **layerNames, WeightsRole *roles) noexcept
	Get a description of the missing weights.

int32_t	getAll (int32_t size, char const **layerNames, WeightsRole *roles) noexcept
	Get a description of all weights that could be refit.

TRT_DEPRECATED bool	setDynamicRange (char const *tensorName, float min, float max) noexcept
	Update the dynamic range for a tensor.

TRT_DEPRECATED float	getDynamicRangeMin (char const *tensorName) const noexcept
	Get the minimum of the dynamic range.

TRT_DEPRECATED float	getDynamicRangeMax (char const *tensorName) const noexcept
	Get the maximum of the dynamic range.

TRT_DEPRECATED int32_t	getTensorsWithDynamicRange (int32_t size, char const **tensorNames) const noexcept
	Get the names of all tensors that have refittable dynamic ranges.

void	setErrorRecorder (IErrorRecorder *recorder) noexcept
	Set the ErrorRecorder for this interface.

IErrorRecorder *	getErrorRecorder () const noexcept
	Get the ErrorRecorder assigned to this interface.

bool	setNamedWeights (char const *name, Weights weights) noexcept
	Specify new weights for the given name.

int32_t	getMissingWeights (int32_t size, char const **weightsNames) noexcept
	Get the names of the missing weights.

int32_t	getAllWeights (int32_t size, char const **weightsNames) noexcept
	Get the names of all weights that could be refit.

ILogger *	getLogger () const noexcept
	Get the logger with which the refitter was created.

bool	setMaxThreads (int32_t maxThreads) noexcept
	Set the maximum number of threads.

int32_t	getMaxThreads () const noexcept
	Get the maximum number of threads that the refitter can use.

bool	setNamedWeights (char const *name, Weights weights, TensorLocation location) noexcept
	Specify new weights, residing on the specified device, for the given name.

Weights	getNamedWeights (char const *weightsName) const noexcept
	Get the weights associated with the given name.

TensorLocation	getWeightsLocation (char const *weightsName) const noexcept
	Get the location of the weights associated with the given name.

bool	unsetNamedWeights (char const *weightsName) noexcept
	Unset the weights associated with the given name.

void	setWeightsValidation (bool weightsValidation) noexcept
	Set whether to validate weights during refitting.

bool	getWeightsValidation () const noexcept
	Get whether weights are validated during refitting.

bool	refitCudaEngineAsync (cudaStream_t stream) noexcept
	Enqueue weights refitting of the associated engine on the given stream.

Weights	getWeightsPrototype (char const *weightsName) const noexcept
	Get the Weights prototype associated with the given name.

Protected Attributes
apiv::VRefitter *	mImpl

Additional Inherited Members
Protected Member Functions inherited from nvinfer1::INoCopy
	INoCopy ()=default

virtual	~INoCopy ()=default

	INoCopy (INoCopy const &other)=delete

INoCopy &	operator= (INoCopy const &other)=delete

	INoCopy (INoCopy &&other)=delete

INoCopy &	operator= (INoCopy &&other)=delete

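Detailed Description

Updates weights in an engine.

Warning: Do not inherit from this class, as doing so will break forward compatibility of the API and ABI.

Constructor & Destructor Documentation

◆ ~IRefitter()

virtual nvinfer1::IRefitter::~IRefitter ( )

virtual default noexcept

Member Function Documentation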

◆ getAll()

int32_t nvinfer1::IRefitter::getAll	(	int32_t	size,
		char const **	layerNames,
		WeightsRole *	roles
	)

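inline noexcept

Get a description of all weights that could be refit.

Parameters

size	The number of items that can be safely written to a non-null layerNames or roles.
layerNames	Where to write the layer names.
roles	Where to write the weights roles.

Returns: The number of Weights that could be refit.

If layerNames!=nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.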

◆ getAllWeights()

int32_t nvinfer1::IRefitter::getAllWeights	(	int32_t	size,
		char const **	weightsNames
	)

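inline noexcept

Get the names of all weights that could be refit.

Parameters

size	The number of weights names that can be safely written.
weightsNames	The names of the weights to be updated, or nullptr for unnamed weights.

Returns: The number of Weights that could be refit.

If weightsNames!=nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.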
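
The size and return-value descriptions above suggest a two-pass query: ask for the count first, then fetch that many names. The sketch below illustrates that pattern under two assumptions: that a call with size 0 and a null array only reports the count, and that copying the names is the simplest way to keep them past the engine's lifetime. NameListingStub and listRefittableWeights are hypothetical names introduced here; the stub mirrors only the documented signature so that the code builds without TensorRT.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for nvinfer1::IRefitter. It mirrors only the
// getAllWeights(size, weightsNames) signature documented above.
struct NameListingStub
{
    std::vector<std::string> names; // plays the role of engine-owned strings

    int32_t getAllWeights(int32_t size, char const** weightsNames) noexcept
    {
        int32_t const total = static_cast<int32_t>(names.size());
        if (weightsNames != nullptr)
        {
            for (int32_t i = 0; i < std::min(size, total); ++i)
            {
                weightsNames[i] = names[static_cast<std::size_t>(i)].c_str();
            }
        }
        return total;
    }
};

// Two-pass query: first ask only for the count, then fetch that many names.
// Names are copied into std::string so they do not dangle once the engine
// that owns the original strings is destroyed.
template <typename Refitter>
std::vector<std::string> listRefittableWeights(Refitter& refitter)
{
    int32_t const count = refitter.getAllWeights(0, nullptr);
    if (count <= 0)
    {
        return {};
    }
    std::vector<char const*> raw(static_cast<std::size_t>(count), nullptr);
    int32_t const written = std::min(count, refitter.getAllWeights(count, raw.data()));
    assert(written >= 0);

    std::vector<std::string> result;
    result.reserve(static_cast<std::size_t>(written));
    for (int32_t i = 0; i < written; ++i)
    {
        char const* name = raw[static_cast<std::size_t>(i)];
        if (name != nullptr)
        {
            result.emplace_back(name);
        }
    }
    return result;
}
```

Because the helper is a template, it should accept any object exposing the same getAllWeights signature. The same two-pass shape applies to getMissingWeights().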

◆ getDynamicRangeMax()

TRT_DEPRECATED float nvinfer1::IRefitter::getDynamicRangeMax ( char const * tensorName ) const

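inline noexcept

Get the maximum of the dynamic range.

Returns: The maximum of the dynamic range.

If the dynamic range was never set, returns the maximum computed during calibration.

Warning: The string tensorName must be null-terminated, and be at most 4096 bytes including the terminator.

Deprecated: Deprecated in TensorRT 10.1. Superseded by explicit quantization.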

◆ getDynamicRangeMin()

TRT_DEPRECATED float nvinfer1::IRefitter::getDynamicRangeMin ( char const * tensorName ) const

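inline noexcept

Get the minimum of the dynamic range.

Returns: The minimum of the dynamic range.

If the dynamic range was never set, returns the minimum computed during calibration.

Warning: The string tensorName must be null-terminated, and be at most 4096 bytes including the terminator.

Deprecated: Deprecated in TensorRT 10.1. Superseded by explicit quantization.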

◆ getErrorRecorder()

IErrorRecorder * nvinfer1::IRefitter::getErrorRecorder ( ) const

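inline noexcept

Get the ErrorRecorder assigned to this interface.

Retrieves the error recorder object assigned to this class. Returns nullptr if no error recorder has been set.

Returns: A pointer to the registered IErrorRecorder object.

See also: setErrorRecorder()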

◆ getLogger()

ILogger * nvinfer1::IRefitter::getLogger ( ) const

inline noexcept

Get the logger with which the refitter was created.

Returns: The logger.

◆ getMaxThreads()

int32_t nvinfer1::IRefitter::getMaxThreads ( ) const

inline noexcept

Get the maximum number of threads that the refitter can use.

Returns: The maximum number of threads that the refitter can use.

See also: setMaxThreads()

◆ getMissing()

int32_t nvinfer1::IRefitter::getMissing	(	int32_t	size,
		char const **	layerNames,
		WeightsRole *	roles
	)

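inline noexcept

Get a description of the missing weights.

For example, if some Weights have been set, but the engine was optimized in a way that combines weights, any unsupplied Weights in the combination are considered missing.

Parameters

size	The number of items that can be safely written to a non-null layerNames or roles.
layerNames	Where to write the layer names.
roles	Where to write the weights roles.

Returns: The number of missing Weights.

If layerNames!=nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.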

◆ getMissingWeights()

int32_t nvinfer1::IRefitter::getMissingWeights	(	int32_t	size,
		char const **	weightsNames
	)

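inline noexcept

Get the names of the missing weights.

For example, if some Weights have been set, but the engine was optimized in a way that combines weights, any unsupplied Weights in the combination are considered missing.

Parameters

size	The number of weights names that can be safely written.
weightsNames	The names of the weights to be updated, or nullptr for unnamed weights.

Returns: The number of missing Weights.

If weightsNames!=nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.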

◆ getNamedWeights()

Weights nvinfer1::IRefitter::getNamedWeights ( char const * weightsName ) const

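inline noexcept

Get the weights associated with the given name.

Parameters

weightsName The name of the weights to be refit.

Returns: The Weights associated with the given name.

If the weights were never set, returns null weights and reports an error to the refitter errorRecorder.

Warning: The string weightsName must be null-terminated, and be at most 4096 bytes including the terminator.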

◆ getTensorsWithDynamicRange()

TRT_DEPRECATED int32_t nvinfer1::IRefitter::getTensorsWithDynamicRange	(	int32_t	size,
		char const **	tensorNames
	)		const

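inline noexcept

Get the names of all tensors that have refittable dynamic ranges.

Parameters

size	The number of items that can be safely written to a non-null tensorNames.
tensorNames	Where to write the tensor names.

Returns: The number of Weights that could be refit.

If tensorNames!=nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.

Deprecated: Deprecated in TensorRT 10.1. Superseded by explicit quantization.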

◆ getWeightsLocation()

TensorLocation nvinfer1::IRefitter::getWeightsLocation ( char const * weightsName ) const

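inline noexcept

Get the location of the weights associated with the given name.

Parameters

weightsName The name of the weights to be refit.

Returns: The location of the weights associated with the given name.

If the weights were never set, returns TensorLocation::kHOST and reports an error to the refitter errorRecorder.

Warning: The string weightsName must be null-terminated, and be at most 4096 bytes including the terminator.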

◆ getWeightsPrototype()

Weights nvinfer1::IRefitter::getWeightsPrototype ( char const * weightsName ) const

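inline noexcept

Get the Weights prototype associated with the given name.

Parameters

weightsName The name of the weights to be refit.

Returns: The Weights prototype associated with the given name.

The type and count of the weights prototype are the same as those of the weights used for engine building. The values property of a weights prototype is nullptr. The count of the weights prototype is -1 when the weights name is nullptr or does not correspond to any refittable weights.

Warning: The string weightsName must be null-terminated, and be at most 4096 bytes including the terminator.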
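
Since setNamedWeights() rejects weights whose count or type differs from the prototype, a caller may want to run the same comparison up front to learn which condition failed. The following is a minimal sketch of such a check, not TensorRT code: the sketch namespace, its reduced DataType enum, the Mismatch enum and compareToPrototype are all hypothetical, and the Weights struct is a stand-in assumed to carry the same three fields (type, values, count) as nvinfer1::Weights.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch
{
// Hypothetical, reduced stand-ins so the example builds without TensorRT.
enum class DataType : int32_t
{
    kFLOAT = 0,
    kHALF = 1
};

struct Weights
{
    DataType type;
    void const* values;
    int64_t count;
};

// Hypothetical result type naming which documented rejection reason applies.
enum class Mismatch
{
    kNone,
    kUnknownName,
    kCount,
    kType
};

// Compare candidate weights with a prototype as returned by getWeightsPrototype().
// A prototype count of -1 means the name was nullptr or not refittable.
inline Mismatch compareToPrototype(Weights const& candidate, Weights const& prototype) noexcept
{
    assert(candidate.count >= 0); // a negative candidate count is a caller bug
    if (prototype.count == -1)
    {
        return Mismatch::kUnknownName;
    }
    if (candidate.count != prototype.count)
    {
        return Mismatch::kCount;
    }
    if (candidate.type != prototype.type)
    {
        return Mismatch::kType;
    }
    return Mismatch::kNone;
}
} // namespace sketch
```

The prototype's values pointer is never read, which matches the statement above that it is nullptr.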

◆ getWeightsValidation()

bool nvinfer1::IRefitter::getWeightsValidation ( ) const

inline noexcept

Get whether weights are validated during refitting.

◆ refitCudaEngine()

bool nvinfer1::IRefitter::refitCudaEngine ( )

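inline noexcept

Refit the associated engine.

Returns: True on success, or false if the new weights fail validation or if getMissingWeights() != 0 before the call. If false is returned, a subset of the weights may already have been refit.

The behavior is undefined if the engine has pending enqueued work. Weights provided on the CPU or GPU can be unset and released, or updated, after refitCudaEngine returns.

IExecutionContexts associated with the engine remain valid for use afterwards. There is no need to set the same weights repeatedly for multiple refit calls, because the weights memory can be updated directly instead.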
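
Because the call fails when getMissingWeights() != 0, and a failed call may leave the engine partially refit, it can help to check for missing names before refitting. The sketch below shows one way to arrange that guard. MissingTrackingStub and refitWhenComplete are hypothetical names, and the stub's required/supplied bookkeeping is an assumption made only so that the example builds without TensorRT.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for nvinfer1::IRefitter, reduced to the two calls used here.
struct MissingTrackingStub
{
    std::set<std::string> required; // names the engine expects
    std::set<std::string> supplied; // names already provided (stub bookkeeping)

    int32_t getMissingWeights(int32_t size, char const** weightsNames) noexcept
    {
        int32_t missing = 0;
        for (std::string const& name : required)
        {
            if (supplied.count(name) != 0U)
            {
                continue;
            }
            if (weightsNames != nullptr && missing < size)
            {
                weightsNames[missing] = name.c_str();
            }
            ++missing;
        }
        return missing;
    }

    bool refitCudaEngine() noexcept
    {
        return getMissingWeights(0, nullptr) == 0;
    }
};

// Refit only when nothing is missing; otherwise report the missing names
// through missingOut and leave the engine untouched.
template <typename Refitter>
bool refitWhenComplete(Refitter& refitter, std::vector<std::string>& missingOut)
{
    missingOut.clear();
    int32_t const missing = refitter.getMissingWeights(0, nullptr);
    assert(missing >= 0);
    if (missing == 0)
    {
        return refitter.refitCudaEngine();
    }
    std::vector<char const*> raw(static_cast<std::size_t>(missing), nullptr);
    int32_t const written = std::min(missing, refitter.getMissingWeights(missing, raw.data()));
    for (int32_t i = 0; i < written; ++i)
    {
        char const* name = raw[static_cast<std::size_t>(i)];
        if (name != nullptr)
        {
            missingOut.emplace_back(name);
        }
    }
    return false;
}
```

With a real refitter, the caller would supply the reported names through setNamedWeights() and then call the helper again.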

◆ refitCudaEngineAsync()

bool nvinfer1::IRefitter::refitCudaEngineAsync ( cudaStream_t stream )

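inline noexcept

Enqueue weights refitting of the associated engine on the given stream.

Parameters

stream The stream on which to enqueue the weights updating task.

Returns: True on success, or false if the new weights fail validation or if getMissingWeights() != 0 before the call. If false is returned, a subset of the weights may already have been refit.

The behavior is undefined if the engine has pending enqueued work on a stream different from the provided one. Weights provided on the CPU can be unset and released, or updated, after refitCudaEngineAsync returns. Freeing or updating of weights provided on the GPU can be enqueued on the same stream after refitCudaEngineAsync returns.

IExecutionContexts associated with the engine remain valid for use afterwards. There is no need to set the same weights repeatedly for multiple refit calls, because the weights memory can be updated directly instead. The weights updating task should use the same stream as the refit call.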

◆ setDynamicRange()

TRT_DEPRECATED bool nvinfer1::IRefitter::setDynamicRange	(	char const *	tensorName,
		float	min,
		float	max
	)

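inline noexcept

Update the dynamic range for a tensor.

Parameters

tensorName	The name of an ITensor in the network.
min	The minimum of the dynamic range for the tensor.
max	The maximum of the dynamic range for the tensor.

Returns: True on success; false otherwise.

Returns false if there is no Int8 engine tensor derived from a network tensor of that name. If successful, getMissing may report that some weights need to be supplied.

Warning: The string tensorName must be null-terminated, and be at most 4096 bytes including the terminator.

Deprecated: Deprecated in TensorRT 10.1. Superseded by explicit quantization.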

◆ setErrorRecorder()

void nvinfer1::IRefitter::setErrorRecorder ( IErrorRecorder * recorder )

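inline noexcept

Set the ErrorRecorder for this interface.

Assigns the ErrorRecorder to this interface. The ErrorRecorder will track all errors during execution. This function calls incRefCount of the registered ErrorRecorder at least once. Setting recorder to nullptr unregisters the recorder from the interface, which results in a call to decRefCount if a recorder was registered.

If no error recorder is set, messages are sent to the global log stream.

Parameters

recorder The error recorder to register with this interface.

See also: getErrorRecorder()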

◆ setMaxThreads()

bool nvinfer1::IRefitter::setMaxThreads ( int32_t maxThreads )

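inline noexcept

Set the maximum number of threads.

Parameters

maxThreads The maximum number of threads that the refitter can use.

Returns: True on success, false otherwise.

The default value is 1, which includes the current thread. A value greater than 1 permits TensorRT to use multi-threaded algorithms. A value less than 1 triggers a kINVALID_ARGUMENT error.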
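
Since values below 1 are rejected, a thread count taken from the environment should be clamped before it is passed in. The helper below is a small sketch of that clamping; chooseMaxThreads and its cap parameter are hypothetical and not part of TensorRT. It treats a detected count of 0 as unknown, which is what std::thread::hardware_concurrency() returns when the value cannot be determined.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Pick a thread count in the range [1, cap]. A detected count of 0 is treated
// as "unknown" and falls back to the documented default of 1.
inline int32_t chooseMaxThreads(unsigned hardwareThreads, int32_t cap) noexcept
{
    int32_t const upper = std::max<int32_t>(cap, 1);
    int32_t result = 1; // documented default, includes the current thread
    if (hardwareThreads != 0U)
    {
        result = hardwareThreads >= static_cast<unsigned>(upper)
            ? upper
            : static_cast<int32_t>(hardwareThreads);
    }
    assert(result >= 1); // values below 1 would trigger kINVALID_ARGUMENT
    return result;
}
```

A call site might then read refitter->setMaxThreads(chooseMaxThreads(std::thread::hardware_concurrency(), 8)), where refitter is an existing IRefitter pointer and the cap of 8 is an arbitrary illustration.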

◆ setNamedWeights() [1/2]

bool nvinfer1::IRefitter::setNamedWeights	(	char const *	name,
		Weights	weights
	)

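inline noexcept

Specify new weights for the given name.

Parameters

name	The name of the weights to be refit.
weights	The new weights to associate with the name.

Returns: True on success, or false if the new weights are rejected. Possible reasons for rejection are:

The name of the weights is nullptr or does not correspond to any refittable weights.
The count of the weights is inconsistent with the count returned by calling getWeightsPrototype() with the same name.
The type of the weights is inconsistent with the type returned by calling getWeightsPrototype() with the same name.

Modifying the weights before refitCudaEngine or refitCudaEngineAsync returns results in undefined behavior.

Warning: The string name must be null-terminated, and be at most 4096 bytes including the terminator.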

◆ setNamedWeights() [2/2]

bool nvinfer1::IRefitter::setNamedWeights	(	char const *	name,
		Weights	weights,
		TensorLocation	location
	)

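inline noexcept

Specify new weights, residing on the specified device, for the given name.

Parameters

name	The name of the weights to be refit.
weights	The new weights on the specified device.
location	The location (host vs. device) of the new weights.

Returns: True on success, or false if the new weights are rejected. Possible reasons for rejection are:

The name of the weights is nullptr or does not correspond to any refittable weights.
The count of the weights is inconsistent with the count returned by calling getWeightsPrototype() with the same name.
The type of the weights is inconsistent with the type returned by calling getWeightsPrototype() with the same name.

It is allowed to provide some weights on the CPU and others on the GPU. Modifying the weights before refitCudaEngine() or refitCudaEngineAsync() completes results in undefined behavior.

Warning: The string name must be null-terminated, and be at most 4096 bytes including the terminator.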

◆ setWeights()

bool nvinfer1::IRefitter::setWeights	(	char const *	layerName,
		WeightsRole	role,
		Weights	weights
	)

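inline noexcept

Specify new weights for a layer of the given name. Returns true on success, or false if the new weights are rejected. Possible reasons for rejection are:

There is no layer with that name.
The layer does not have weights with the specified role.
The count of the weights is inconsistent with the layer's original specification.
The type of the weights is inconsistent with the layer's original specification.

Modifying the weights before refitCudaEngine or refitCudaEngineAsync returns results in undefined behavior.

Warning: The string layerName must be null-terminated, and be at most 4096 bytes including the terminator.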

◆ setWeightsValidation()

void nvinfer1::IRefitter::setWeightsValidation ( bool weightsValidation )

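inline noexcept

Set whether to validate weights during refitting.

Parameters

weightsValidation Indicates whether to validate weights during refitting.

When set to true, TensorRT validates weights during FP32 to FP16/BF16 weights conversion or weights sparsification in the refit call. If the provided weights are not suitable for some weights transformation, TensorRT either issues a warning and continues the transformation for minor issues (such as overflow during a narrowing conversion), or issues an error and stops the refitting process for severe issues (such as sparsifying dense weights). By default the flag is true. Setting the flag to false can improve refitting performance.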

◆ unsetNamedWeights()

bool nvinfer1::IRefitter::unsetNamedWeights ( char const * weightsName )

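inline noexcept

Unset the weights associated with the given name.

Parameters

weightsName The name of the weights to be refit.

Returns: False if the weights were never set, true otherwise.

Unset weights before releasing them.

Warning: The string weightsName must be null-terminated, and be at most 4096 bytes including the terminator.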

Member Data Documentation

◆ mImpl

apiv::VRefitter* nvinfer1::IRefitter::mImpl

protected

The documentation for this class was generated from the following file:

NvInferRuntime.h
