ReIdentificationNet - NVIDIA Documentation

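ReIdentificationNet takes cropped images of a person from different viewpoints as network input and outputs an embedding feature vector for that person. The embeddings are used for similarity matching to re-identify the same person. The model supported in the current version is based on ResNet, the most commonly used re-identification baseline owing to its high accuracy.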

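The following table lists approximate training times for ReIdentificationNet:

Backbone Type	GPU Type	Number of Training Images	Image Size	Number of Identities	Batch Size	Total Epochs	Total Training Time
Resnet50	1 x Nvidia A100 - 80GB PCIE	13,000	256x128x3	751	128	120	~1.25 hours
Resnet50	1 x Nvidia Quadro GV100 - 32GB	13,000	256x128x3	751	64	120	~2.5 hours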

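Data Input for ReIdentificationNet

The ReIdentificationNet application in TAO expects data in the Market-1501 format for both training and evaluation.

For more information about the Market-1501 data format, refer to the Data Annotation Format page.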
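
As a rough illustration of what the Market-1501 format encodes, the sketch below parses a file name such as 1121_c3s2_156744_00.jpg (the name that appears in the sample JSON output later on this page). It assumes the conventional Market-1501 naming scheme of person ID, camera and sequence, frame number, and bounding-box index; the function and field names are hypothetical and are not part of TAO.

```python
import re
from typing import NamedTuple

# Conventional Market-1501 layout: <person>_c<camera>s<sequence>_<frame>_<box>.jpg
# (assumed here; check the Data Annotation Format page for the authoritative layout).
_MARKET1501_NAME = re.compile(r"^(-?\d+)_c(\d+)s(\d+)_(\d+)_(\d+)\.jpg$")


class Market1501Name(NamedTuple):
    person_id: int
    camera_id: int
    sequence_id: int
    frame: int
    box_index: int


def parse_market1501_name(file_name: str) -> Market1501Name:
    """Split a Market-1501 style file name into its numeric fields."""
    match = _MARKET1501_NAME.match(file_name)
    if match is None:
        raise ValueError(f"not a Market-1501 style file name: {file_name!r}")
    return Market1501Name(*(int(group) for group in match.groups()))


info = parse_market1501_name("1121_c3s2_156744_00.jpg")
print(info.person_id, info.camera_id)  # prints: 1121 3
```

Counting the distinct person_id values in a training directory is one way to sanity-check the num_classes setting used in the spec file.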

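Creating an Experiment Specification File

The specification file for ReIdentificationNet includes model, dataset, re_ranking, and train parameters. The following is an example spec for training a ResNet model on Market-1501, whose training set contains 751 identities.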

results_dir: "/path/to/experiment_results"
encryption_key: nvidia_tao
model:
  backbone: resnet_50
  last_stride: 1
  pretrain_choice: imagenet
  pretrained_model_path: "/path/to/pretrained_model.pth"
  input_channels: 3
  input_width: 128
  input_height: 256
  neck: bnneck
  feat_dim: 256
  neck_feat: after
  metric_loss_type: triplet
  with_center_loss: False
  with_flip_feature: False
  label_smooth: True
dataset:
  train_dataset_dir: "/path/to/train_dataset_dir"
  test_dataset_dir: "/path/to/test_dataset_dir"
  query_dataset_dir: "/path/to/query_dataset_dir"
  num_classes: 751
  batch_size: 64
  val_batch_size: 128
  num_workers: 1
  pixel_mean: [0.485, 0.456, 0.406]
  pixel_std: [0.226, 0.226, 0.226]
  padding: 10
  prob: 0.5
  re_prob: 0.5
  sampler: softmax_triplet
  num_instances: 4
re_ranking:
  re_ranking: True
  k1: 20
  k2: 6
  lambda_value: 0.3
train:
  results_dir: "${results_dir}/train"
  optim:
    name: Adam
    lr_monitor: val_loss
    lr_steps: [40, 70]
    gamma: 0.1
    bias_lr_factor: 1
    weight_decay: 0.0005
    weight_decay_bias: 0.0005
    warmup_factor: 0.01
    warmup_iters: 10
    warmup_method: linear
    base_lr: 0.00035
    momentum: 0.9
    center_loss_weight: 0.0005
    center_lr: 0.5
    triplet_loss_margin: 0.3
  num_epochs: 10
  checkpoint_interval: 5
  validation_interval: 5
  seed: 1234

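Parameter	Data Type	Default	Description	Supported Values
`model`	dict config	–	The configuration of the model architecture
`dataset`	dict config	–	The configuration of the dataset
`train`	dict config	–	The configuration of the training task
`evaluate`	dict config	–	The configuration of the evaluation task
`inference`	dict config	–	The configuration of the inference task
`encryption_key`	string	None	The encryption key used to encrypt and decrypt model files
`results_dir`	string	/results	The directory where experiment results are saved
`export`	dict config	–	The configuration of the ONNX export task
`re_ranking`	dict config	–	The configuration of the re-ranking module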

model

The model parameter provides options to change the ReIdentificationNet architecture.

model:
  backbone: resnet_50
  last_stride: 1
  pretrain_choice: imagenet
  pretrained_model_path: "/path/to/pretrained_model.pth"
  input_channels: 3
  input_width: 128
  input_height: 256
  neck: bnneck
  feat_dim: 256
  neck_feat: after
  metric_loss_type: triplet
  with_center_loss: False
  with_flip_feature: False
  label_smooth: True

Parameter	Data Type	Default	Description	Supported Values
`backbone`	string	resnet_50	The type of model, which can be resnet_50 or a Swin-based architecture (refer to ReIdentificationNet Transformer for more details)	“resnet_50”, “swin_base_patch4_window7_224”, “swin_small_patch4_window7_224”, “swin_tiny_patch4_window7_224”
`last_stride`	unsigned int	1	The number of strides during convolution	>0
`pretrain_choice`	string	imagenet	The pretrained network	imagenet/self/""
`pretrained_model_path`	string		The path to the pretrained model
`input_channels`	unsigned int	3	The number of input channels	>0
`input_width`	int	128	The width of the input images	>0
`input_height`	int	256	The height of the input images	>0
`neck`	string	bnneck	Specifies whether to train with BNNeck	bnneck/""
`feat_dim`	unsigned int	256	The output size of the feature embeddings	>0
`neck_feat`	string	after	Specifies which BNNeck feature to use for testing	before/after
`metric_loss_type`	string	triplet	The type of metric loss	triplet/center/triplet_center
`with_center_loss`	bool	False	Specifies whether to enable center loss	True/False
`with_flip_feature`	bool	False	Specifies whether to enable image flipping	True/False
`label_smooth`	bool	True	Specifies whether to enable label smoothing	True/False
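
To give a feel for what the label_smooth option changes, here is a minimal sketch of the standard label-smoothing target distribution over the identity classes. It is not TAO's implementation: the smoothing strength epsilon = 0.1 and the function name are assumptions made for illustration only.

```python
def smoothed_targets(true_class: int, num_classes: int, epsilon: float = 0.1) -> list[float]:
    """Return the label-smoothed target distribution for one sample.

    epsilon is a hypothetical smoothing strength, not a documented TAO default.
    """
    if not 0 <= true_class < num_classes:
        raise ValueError("true_class must be in [0, num_classes)")
    # Every class receives an equal share of epsilon ...
    off_value = epsilon / num_classes
    targets = [off_value] * num_classes
    # ... and the true class keeps the remaining probability mass.
    targets[true_class] += 1.0 - epsilon
    return targets


targets = smoothed_targets(true_class=5, num_classes=751)
print(round(sum(targets), 6))  # the distribution still sums to 1.0
```

With 751 identities, the true class keeps roughly 0.9 of the probability mass and the rest is spread evenly, which discourages over-confident predictions on the training identities.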

dataset

The dataset parameter defines the dataset source, training batch size, and augmentation.

dataset:
  train_dataset_dir: "/path/to/train_dataset_dir"
  test_dataset_dir: "/path/to/test_dataset_dir"
  query_dataset_dir: "/path/to/query_dataset_dir"
  num_classes: 751
  batch_size: 64
  val_batch_size: 128
  num_workers: 1
  pixel_mean: [0.485, 0.456, 0.406]
  pixel_std: [0.226, 0.226, 0.226]
  padding: 10
  prob: 0.5
  re_prob: 0.5
  sampler: softmax_triplet
  num_instances: 4

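Parameter	Data Type	Default	Description	Supported Values
`train_dataset_dir`	string		The path to the train images
`test_dataset_dir`	string		The path to the test images
`query_dataset_dir`	string		The path to the query images
`num_classes`	unsigned int	751	The number of unique person IDs	>0
`batch_size`	unsigned int	64	The batch size for training	>0
`val_batch_size`	unsigned int	128	The batch size for validation	>0
`num_workers`	unsigned int	1	The number of worker processes used to load data in parallel	>0
`pixel_mean`	float list	[0.485, 0.456, 0.406]	The pixel mean for image normalization	float list
`pixel_std`	float list	[0.226, 0.226, 0.226]	The pixel standard deviation for image normalization	float list
`padding`	unsigned int	10	The pixel padding size around images for image augmentation	>=1
`prob`	float	0.5	The random horizontal flipping probability for image augmentation	>0
`re_prob`	float	0.5	The random erasing probability for image augmentation	>0
`sampler`	string	softmax_triplet	The type of sampler for data loading	softmax/triplet/softmax_triplet
`num_instances`	unsigned int	4	The number of image instances of the same person in a batch	>0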
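
The relationship between batch_size and num_instances can be sketched as simple arithmetic. The helper below assumes identity-balanced sampling, in which every identity drawn for a batch contributes exactly num_instances images; the divisibility check is a requirement of this sketch, not a documented TAO constraint, and the function name is hypothetical.

```python
def identities_per_batch(batch_size: int, num_instances: int) -> int:
    """Number of distinct identities in a batch when each contributes num_instances images."""
    if batch_size <= 0 or num_instances <= 0:
        raise ValueError("batch_size and num_instances must be positive")
    if batch_size % num_instances != 0:
        raise ValueError("batch_size must be a multiple of num_instances")
    return batch_size // num_instances


print(identities_per_batch(batch_size=64, num_instances=4))  # prints: 16
```

Under that assumption, the example spec (batch_size 64, num_instances 4) yields 16 identities per batch, which is what gives the triplet loss both positive and negative pairs to compare within each batch.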

re_ranking

The re_ranking parameter defines the settings for the re-ranking module.

re_ranking:
  re_ranking: True
  k1: 20
  k2: 6
  lambda_value: 0.3

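Parameter	Data Type	Default	Description	Supported Values
`re_ranking`	bool	True	A flag that enables the re-ranking module	True/False
`k1`	unsigned int	20	The k value used for k-nearest neighbors	>0
`k2`	unsigned int	6	The k value used for local query expansion	>0
`lambda_value`	float	0.3	The weight used to combine the original distance and the Jaccard distance	>0.0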
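
The role of lambda_value can be illustrated with a simplified sketch of how re-ranking blends two distances. This follows the commonly used formulation in which lambda_value weights the original distance and (1 - lambda_value) weights the Jaccard distance; it uses plain neighbor sets rather than the weighted neighbor encoding of a full re-ranking implementation, and both function names are hypothetical.

```python
def jaccard_distance(neighbors_a: set[int], neighbors_b: set[int]) -> float:
    """Jaccard distance between two neighbor sets: 1 - |intersection| / |union|."""
    union = neighbors_a | neighbors_b
    if not union:
        return 0.0
    return 1.0 - len(neighbors_a & neighbors_b) / len(union)


def combined_distance(original: float, jaccard: float, lambda_value: float = 0.3) -> float:
    """Blend the two distances; lambda_value weights the original distance."""
    return (1.0 - lambda_value) * jaccard + lambda_value * original


# Two images whose neighbor lists share two of six distinct gallery indices.
d_jaccard = jaccard_distance({1, 2, 3, 4}, {3, 4, 5, 6})
print(round(d_jaccard, 4))  # prints: 0.6667
print(round(combined_distance(original=0.5, jaccard=d_jaccard), 4))
```

A smaller lambda_value therefore makes the final ranking depend more on how much two images' neighborhoods overlap and less on their raw embedding distance.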

train

The train parameter defines the hyperparameters of the training process.

train:
  optim:
    name: Adam
    lr_monitor: val_loss
    lr_steps: [40, 70]
    gamma: 0.1
    bias_lr_factor: 1
    weight_decay: 0.0005
    weight_decay_bias: 0.0005
    warmup_factor: 0.01
    warmup_iters: 10
    warmup_method: linear
    base_lr: 0.00035
    momentum: 0.9
    center_loss_weight: 0.0005
    center_lr: 0.5
    triplet_loss_margin: 0.3
  num_epochs: 10
  checkpoint_interval: 5
  validation_interval: 5
  seed: 1234

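Parameter	Data Type	Default	Description	Supported Values
`num_gpus`	unsigned int	1	The number of GPUs to use for distributed training	>0
`gpu_ids`	List[int]	[0]	The indices of the GPUs to use for distributed training
`seed`	unsigned int	1234	The random seed for random, NumPy, and torch	>0
`num_epochs`	unsigned int	10	The total number of epochs to run the experiment	>0
`checkpoint_interval`	unsigned int	1	The epoch interval at which checkpoints are saved	>0
`validation_interval`	unsigned int	1	The epoch interval at which validation is run	>0
`resume_training_checkpoint_path`	string		The intermediate PyTorch Lightning checkpoint to resume training from
`results_dir`	string	/results/train	The directory where training results are saved
`optim`	dict config		The configuration of the optimizer, including the learning rate, learning rate scheduler, and weight decay
`clip_grad_norm`	float	0.0	The amount to clip the gradient by the L2 norm. A value of 0.0 specifies no clipping.	>=0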

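optim

The optim parameter defines the configuration of the optimizer used in training, including the learning rate, learning rate scheduler, and weight decay.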

optim:
  name: Adam
  lr_monitor: val_loss
  lr_steps: [40, 70]
  gamma: 0.1
  bias_lr_factor: 1
  weight_decay: 0.0005
  weight_decay_bias: 0.0005
  warmup_factor: 0.01
  warmup_iters: 10
  warmup_method: linear
  base_lr: 0.00035
  momentum: 0.9
  center_loss_weight: 0.0005
  center_lr: 0.5
  triplet_loss_margin: 0.3

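Parameter	Data Type	Default	Description	Supported Values
`name`	string	Adam	The name of the optimizer	Adam/SGD/Adamax/…
`lr_monitor`	string	val_loss	The monitored value for the AutoReduce scheduler	val_loss/train_loss
`lr_steps`	int list	[40, 70]	The steps at which the `MultiStep` scheduler decreases the learning rate	int list
`gamma`	float	0.1	The decay rate for WarmupMultiStepLR	>0.0
`bias_lr_factor`	float	1	The bias learning rate factor for WarmupMultiStepLR	>=1
`weight_decay`	float	0.0005	The weight decay coefficient for the optimizer	>0.0
`weight_decay_bias`	float	0.0005	The weight decay coefficient applied to bias parameters	>0.0
`warmup_factor`	float	0.01	The warmup factor for the WarmupMultiStepLR scheduler	>0.0
`warmup_iters`	unsigned int	10	The number of warmup iterations for the WarmupMultiStepLR scheduler	>0
`warmup_method`	string	linear	The warmup method for the optimizer	linear/cosine
`base_lr`	float	0.00035	The initial learning rate for training	>0.0
`momentum`	float	0.9	The momentum for the optimizer	>0.0
`center_loss_weight`	float	0.0005	The balanced weight of the center loss	>0.0
`center_lr`	float	0.5	The SGD learning rate used to learn the centers of the center loss	>0.0
`triplet_loss_margin`	float	0.3	The margin value for the triplet loss	>0.0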
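
To make the interaction of base_lr, warmup_factor, warmup_iters, lr_steps, and gamma more concrete, the sketch below computes the learning rate per epoch under a linear-warmup, multi-step schedule. It assumes the widely used WarmupMultiStepLR formulation with the scheduler stepped once per epoch; TAO's actual scheduler may differ in detail, and the function name is hypothetical.

```python
from bisect import bisect_right


def warmup_multistep_lr(epoch, base_lr=0.00035, lr_steps=(40, 70), gamma=0.1,
                        warmup_factor=0.01, warmup_iters=10):
    """Learning rate at a given epoch under linear warmup plus step decay."""
    scale = 1.0
    if epoch < warmup_iters:
        # Ramp linearly from warmup_factor up to 1.0 across the warmup period.
        alpha = epoch / warmup_iters
        scale = warmup_factor * (1.0 - alpha) + alpha
    # Multiply by gamma once for every milestone already reached.
    return base_lr * scale * gamma ** bisect_right(list(lr_steps), epoch)


for epoch in (0, 5, 10, 40, 70):
    print(epoch, f"{warmup_multistep_lr(epoch):.3g}")
```

With the example values, the rate starts at 1% of base_lr, reaches base_lr once the warmup period ends, and is then divided by 10 at epoch 40 and again at epoch 70.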

Training the Model

Use the following command to run ReIdentificationNet training:

tao model re_identification train [-h] -e <experiment_spec>
                            [results_dir=<global_results_dir>]
                            [model.<model_option>=<model_option_value>]
                            [dataset.<dataset_option>=<dataset_option_value>]
                            [train.<train_option>=<train_option_value>]
                            [train.gpu_ids=<gpu indices>]
                            [train.num_gpus=<number of gpus>]

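Required Arguments

-e, --experiment_spec_file: The path to the experiment spec file.

Optional Arguments

You can set optional arguments to override the option values in the experiment spec file.

-h, --help: Show this help message and exit.
model.<model_option>: The model options.
dataset.<dataset_option>: The dataset options.
re_ranking.<rerank_option>: The re-ranking options.
train.<train_option>: The train options.
train.optim.<optim_option>: The optimizer options.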
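
When launching training from a script, the key=value overrides described above can be assembled programmatically. The following is a minimal sketch: the helper name and the spec path are hypothetical, and only the command layout and option names shown on this page are used.

```python
import shlex


def build_train_command(spec_path: str, overrides: dict[str, object]) -> list[str]:
    """Assemble the documented train command plus key=value overrides."""
    command = ["tao", "model", "re_identification", "train", "-e", spec_path]
    for key, value in overrides.items():
        # Each override is passed as a single dotted key=value token.
        command.append(f"{key}={value}")
    return command


command = build_train_command(
    "/path/to/experiment_spec.yaml",  # hypothetical spec location
    {"train.num_epochs": 120, "train.optim.base_lr": 0.00035, "dataset.batch_size": 64},
)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run as is; printing it with shlex.join first is a convenient way to review the exact command line before running it.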

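Note

For training, evaluation, and inference, two variables are exposed for each task: num_gpus and gpu_ids, which default to 1 and [0], respectively. If both are passed but are inconsistent, for example num_gpus = 1 and gpu_ids = [0, 1], they are modified to follow the setting that implies more GPUs; in this example, num_gpus is changed from 1 to 2.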
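
The reconciliation rule in this note can be sketched as follows. This is not TAO's code: the function name is hypothetical, and the branch that expands gpu_ids when num_gpus is the larger setting is an assumption, since the note only gives an example for the opposite case.

```python
from typing import List, Optional, Tuple


def reconcile_gpus(num_gpus: int = 1, gpu_ids: Optional[List[int]] = None) -> Tuple[int, List[int]]:
    """Make num_gpus and gpu_ids agree by following whichever implies more GPUs."""
    ids = [0] if gpu_ids is None else list(gpu_ids)
    if len(ids) > num_gpus:
        # The explicit index list names more GPUs, so the count follows it.
        num_gpus = len(ids)
    elif num_gpus > len(ids):
        # Assumption: the count wins and the indices become 0..num_gpus-1.
        ids = list(range(num_gpus))
    return num_gpus, ids


print(reconcile_gpus(num_gpus=1, gpu_ids=[0, 1]))  # prints: (2, [0, 1])
```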

Checkpointing and Resuming Training

A PyTorch Lightning checkpoint is saved at every train.checkpoint_interval. Each checkpoint is named model_epoch_<epoch_num>.pth and is saved in train.results_dir, as shown here:

$ ls /results/train

'model_epoch_000.pth'
'model_epoch_001.pth'
'model_epoch_002.pth'
'model_epoch_003.pth'
'model_epoch_004.pth'

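The latest checkpoint is also saved as reid_model_latest.pth. If reid_model_latest.pth exists in train.results_dir, training automatically resumes from it. If train.resume_training_checkpoint_path is provided, it takes precedence over reid_model_latest.pth.

The main implication of this logic is that, if you want to trigger a fresh training run from scratch, you can either specify a new, empty results directory or remove the latest checkpoint from the existing results directory.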
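
The resume precedence described above can be summarized in a few lines. This is only a sketch of the documented behavior, not TAO's implementation; the function name and the example directory are hypothetical.

```python
from pathlib import Path
from typing import Optional


def pick_resume_checkpoint(results_dir: str, resume_path: Optional[str] = None) -> Optional[str]:
    """Return the checkpoint to resume from, or None for a fresh run."""
    if resume_path:
        # An explicitly provided checkpoint takes precedence.
        return resume_path
    # Otherwise fall back to the latest checkpoint, if one was saved.
    latest = Path(results_dir) / "reid_model_latest.pth"
    return str(latest) if latest.is_file() else None


print(pick_resume_checkpoint("/path/to/new_empty_results_dir"))  # prints: None
```

A return value of None corresponds to training from scratch, which is why pointing train.results_dir at a new, empty directory is enough to avoid an unintended resume.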

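Evaluating the Model

The evaluation metrics for ReIdentificationNet are mean average precision (mAP) and rank accuracy. The sampled-matches plot and the cumulative matching characteristics (CMC) curve can be obtained with the evaluate.output_sampled_matches_plot and evaluate.output_cmc_curve_plot parameters, respectively.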
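
For readers unfamiliar with these metrics, the sketch below shows how mAP and the CMC (rank-k) values can be computed from per-query ranked match flags. It is a simplified illustration with hypothetical function names: it assumes the gallery has already been sorted by ascending distance for each query, and it omits the same-camera filtering that Market-1501 style evaluation usually applies.

```python
def average_precision(matches: list[bool]) -> float:
    """AP for one query, given gallery match flags ordered by ascending distance."""
    hits, precision_sum = 0, 0.0
    for rank, is_match in enumerate(matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def cmc_curve(ranked_matches: list[list[bool]], max_rank: int) -> list[float]:
    """Fraction of queries with a correct match within the top k, for k = 1..max_rank."""
    curve = []
    for k in range(1, max_rank + 1):
        found = sum(any(matches[:k]) for matches in ranked_matches)
        curve.append(found / len(ranked_matches))
    return curve


queries = [[True, False, True], [False, True, False]]
mean_ap = sum(average_precision(m) for m in queries) / len(queries)
print(round(mean_ap, 4))               # prints: 0.6667
print(cmc_curve(queries, max_rank=2))  # prints: [0.5, 1.0]
```

The first CMC value is the rank-1 accuracy, and plotting the whole list against k gives a curve of the kind written to evaluate.output_cmc_curve_plot.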

Use the following command to run ReIdentificationNet evaluation:

tao model re_identification evaluate [-h] -e <experiment_spec_file>
                            evaluate.checkpoint=<model to be evaluated>
                            evaluate.output_sampled_matches_plot=<path to the output sampled matches plot>
                            evaluate.output_cmc_curve_plot=<path to the output CMC curve plot>
                            evaluate.test_dataset=<path to test data>
                            evaluate.query_dataset=<path to query data>
                            [evaluate.<evaluate_option>=<evaluate_option_value>]
                            [evaluate.gpu_ids=<gpu indices>]
                            [evaluate.num_gpus=<number of gpus>]

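Multi-GPU evaluation is not supported for Re-Identification.

Required Arguments

-e, --experiment_spec_file: The experiment spec file used to set up the evaluation experiment.
evaluate.checkpoint: The .pth model to evaluate.
evaluate.output_sampled_matches_plot: The path to the plotted file of sampled matches.
evaluate.output_cmc_curve_plot: The path to the plotted file of the CMC curve.
evaluate.test_dataset: The path to the test data.
evaluate.query_dataset: The path to the query data.

Optional Arguments

evaluate.gpu_ids: The GPU indices used to run evaluation. Defaults to [0].
evaluate.num_gpus: The number of GPUs used to run evaluation. Defaults to 1.
evaluate.results_dir: The directory where evaluation results are saved. Defaults to /results/evaluate.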

Running Inference on the Model

Use the following command to run inference on ReIdentificationNet with a .pth model:

tao model re_identification inference [-h] -e <experiment_spec>
                            inference.checkpoint=<inference model>
                            inference.output_file=<path to output file>
                            inference.test_dataset=<path to gallery data>
                            inference.query_dataset=<path to query data>
                            [inference.<infer_option>=<infer_option_value>]
                            [inference.gpu_ids=<gpu indices>]
                            [inference.num_gpus=<number of gpus>]

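The output is a JSON file that contains the feature embeddings for all test and query data.

Multi-GPU inference is not currently supported for Re-Identification.

Required Arguments

-e, --experiment_spec: The experiment spec file used to set up inference.
inference.checkpoint: The .pth model to perform inference with.
inference.output_file: The path to the output JSON file.
inference.test_dataset: The path to the test data.
inference.query_dataset: The path to the query data.

Optional Arguments

inference.gpu_ids: The GPU indices used to run inference. Defaults to [0].
inference.num_gpus: The number of GPUs used to run inference. Defaults to 1.
inference.results_dir: The directory where inference results are saved. Defaults to /results/inference.

The expected output is as follows: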

[
  {
    "img_path": "/path/to/img1.jpg",
    "embedding": [-0.30, 0.12, 0.13,...]
  },
  {
    "img_path": "/path/to/img2.jpg",
    "embedding": [-0.10, -0.06, -1.85,...]
  },
  ...
  {
    "img_path": "/path/to/imgN.jpg",
    "embedding": [1.41, 0.63, -0.15,...]
  }
]
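
One possible way to use this output for similarity matching is sketched below. The records are tiny made-up examples in the documented img_path/embedding shape, the choice of cosine similarity is an assumption (this page does not state which metric TAO uses), and the function names are hypothetical.

```python
import json
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(query: dict, gallery: list[dict]) -> str:
    """Return the img_path of the gallery record most similar to the query."""
    best = max(gallery, key=lambda rec: cosine_similarity(query["embedding"], rec["embedding"]))
    return best["img_path"]


# Made-up records; real embeddings contain feat_dim values per image.
records = json.loads("""[
  {"img_path": "/path/to/query.jpg", "embedding": [1.0, 0.0, 0.5]},
  {"img_path": "/path/to/gallery_a.jpg", "embedding": [0.9, 0.1, 0.4]},
  {"img_path": "/path/to/gallery_b.jpg", "embedding": [-1.0, 0.2, 0.0]}
]""")
print(best_match(records[0], records[1:]))  # prints: /path/to/gallery_a.jpg
```

In practice the query and gallery records would be separated by their img_path directory rather than by position in the list.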

Exporting the Model

Use the following command to export ReIdentificationNet to the .onnx format for deployment:

tao model re_identification export -e <experiment_spec>
                            export.checkpoint=<.pth checkpoint to be exported>
                            export.onnx_file=<path to exported file>
                            [export.gpu_id=<gpu index>]

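Required Arguments

-e, --experiment_spec: The experiment spec file used to set up export.
export.checkpoint: The .pth model to be exported.
export.onnx_file: The path where the exported model is saved. The default path is in the same directory as the *.pth model.

Optional Arguments

export.gpu_id: The index of the GPU used to run the export. You can specify this value when the machine has multiple GPUs installed. Note that export can only run on a single GPU.

The following is an example of using the ReIdentificationNet export command: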

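Deploying the Model

You can deploy the trained deep learning and computer vision models on edge devices, such as a Jetson Xavier, Jetson Nano, or Tesla, or in the cloud with NVIDIA GPUs. The exported *.onnx model can also be used with TAO Triton Apps.

Running ReIdentificationNet Inference on the Triton Sample

TAO Triton Apps provide an inference sample for ReIdentificationNet. It consumes a TensorRT engine and supports running with a directory of query (probe) images and a directory of test (gallery) images that contain the same identities.

To use this sample, you need to generate the TensorRT engine from the *.onnx model using trtexec.

Generating a TensorRT Engine Using `trtexec`

For instructions on generating a TensorRT engine using the trtexec command, refer to the trtexec guide for ReIdentificationNet.

Running the Triton Inference Sample

You can generate the TensorRT engine when starting the Triton server using the following command: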

bash scripts/start_server.sh

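While the server is running, you can get results from a directory of query images and a directory of test images using the following client command: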

python tao_client.py <path_to_query_directory> \
                    --test_dir <path_to_test_directory> \
                    -m re_identification_tao \
                    -x 1 \
                    -b 16 \
                    --mode Re_identification \
                    -i https \
                    -u localhost:8000 \
                    --async \
                    --output_path <path_to_output_directory>

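Note

The server performs inference on the input image directories. The results are saved as a JSON file. The following is a sample of the JSON output: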

[
  ...,
  {
    "img_path": "/localhome/Data/market1501/query/1121_c3s2_156744_00.jpg",
    "embedding": [-1.1530249118804932, -1.8521332740783691,..., 0.380886435508728]
  },...
  {
    "img_path": "/localhome/Data/market1501/bounding_box_test/1377_c2s3_038007_05.jpg",
    "embedding": [0.09496910870075226, 0.26107653975486755,..., 0.2835155725479126]
  },...
]

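End-to-End Inference Using Triton

TAO Triton Apps provide a sample for end-to-end inference from a directory of query images and a directory of test images. The sample downloads the Market-1501 dataset and randomly samples a subset of 100 identities. The client implicitly converts the image samples into arrays and sends them to the Triton server. The feature embedding for each image is returned and saved to the JSON output. Sampled-match images and a CMC curve plot are also generated for visualization.

You can start the Triton server using the following command (only the ReIdentificationNet model is downloaded and converted into a TensorRT engine):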

bash scripts/re_id_e2e_inference/start_server.sh

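Once the Triton server has started, open another terminal and use the following command to run re-identification on the query and test images with the Triton server instance you previously started: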

bash scripts/re_id_e2e_inference/start_client.sh
