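There are minor interface changes from TAO 3.x (21.08, 21.11, 22.02, 22.05) to TAO 4.0. These may affect you if you use older notebooks, have integrated TAO workflows into your own application, or train directly inside the containers. If you use the newer notebooks from the TAO Getting Started guide, this does not apply, because those notebooks have already been updated.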
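TAO 4.0 splits the combined training-and-deployment container into separate training and deployment containers. Because training and deployment rely on entirely different libraries, this split allows each component to be developed and updated quickly.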
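The training containers include deep learning frameworks such as TensorFlow and PyTorch, but the libraries and entry points that prepare a trained model for deployment and inference have moved to the new deployment container. The deployment container now handles generation of the TensorRT engine and the INT8 calibration cache, as well as TensorRT model evaluation and inference.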
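The figure below highlights the changes related to INT8 calibration generation and TensorRT model evaluation. If you train directly from the containers, you need to pull the tao-deploy container separately to run TensorRT conversion and evaluation. If you use the launcher CLI or the API, they handle this automatically.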
TAO TensorFlow1 training container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
TAO TensorFlow1 training container for MaskRCNN and UNet: nvcr.io/nvidia/tao/tao-toolkit:4.0.1-tf1.15.5
TAO TensorFlow2 training container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf2.9.1
TAO PyTorch training container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-pyt
TAO deployment container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-deploy
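For readers who train directly from the containers, the following is a minimal sketch of how the image tags listed above might be looked up and pulled. The function name `tao_image_for` and its short keys (`tf1`, `unet`, `deploy`, and so on) are hypothetical and exist only for this illustration; the image tags themselves are copied from the list above. The sketch prints the `docker pull` commands rather than running them, so it is safe to paste.

```shell
#!/bin/sh
# Hypothetical lookup: short key -> TAO 4.0 image tag (tags copied from the list above).
tao_image_for() {
  case "$1" in
    tf1)            echo "nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5" ;;
    maskrcnn|unet)  echo "nvcr.io/nvidia/tao/tao-toolkit:4.0.1-tf1.15.5" ;;
    tf2)            echo "nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf2.9.1" ;;
    pyt)            echo "nvcr.io/nvidia/tao/tao-toolkit:4.0.0-pyt" ;;
    deploy)         echo "nvcr.io/nvidia/tao/tao-toolkit:4.0.0-deploy" ;;
    *)              echo "unknown container kind: $1" >&2; return 1 ;;
  esac
}

# A training container plus the separate deployment container.
# Remove the leading "echo" to actually pull the images.
for kind in tf1 deploy; do
  echo docker pull "$(tao_image_for "$kind")"
done
```

Pulling from nvcr.io may require logging in to the registry first; that step is omitted here.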
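This change affects only the models in the table below. For all other models, the deployment artifacts are still included in the training containers and will be migrated out in the future.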
| TensorFlow 1.x | TensorFlow 2.x | PyTorch |
|---|---|---|
| Classification | Classification | Deformable DETR |
| DetectNet_v2 | EfficientDet | Segformer |
| DSSD | | |
| EfficientDet | | |
| Faster RCNN | | |
| LPRNet | | |
| Mask RCNN | | |
| Multitask Classification | | |
| RetinaNet | | |
| SSD | | |
| UNet | | |
| YOLOv3 | | |
| YOLOv4 | | |
| YOLOv4_tiny | | |
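As a quick way to check whether a given model falls under this change, the table above can be encoded as a lookup. This is only a sketch: the function name `uses_deploy_container` and the framework keys `tf1`, `tf2`, and `pyt` are hypothetical, and the model names are the English display names from the table, not CLI task names.

```shell
#!/bin/sh
# Returns 0 if the (framework, model) pair appears in the table above, 1 otherwise.
# Framework keys (tf1/tf2/pyt) are made up for this sketch.
uses_deploy_container() {
  case "$1:$2" in
    tf1:Classification|tf1:DetectNet_v2|tf1:DSSD|tf1:EfficientDet) return 0 ;;
    "tf1:Faster RCNN"|tf1:LPRNet|"tf1:Mask RCNN") return 0 ;;
    "tf1:Multitask Classification"|tf1:RetinaNet|tf1:SSD|tf1:UNet) return 0 ;;
    tf1:YOLOv3|tf1:YOLOv4|tf1:YOLOv4_tiny) return 0 ;;
    tf2:Classification|tf2:EfficientDet) return 0 ;;
    "pyt:Deformable DETR"|pyt:Segformer) return 0 ;;
    *) return 1 ;;
  esac
}

if uses_deploy_container tf1 SSD; then
  echo "SSD (TensorFlow 1.x): run TensorRT steps in the deployment container"
fi
```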
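The table below details the changes for each network. The commands are taken from the TAO Jupyter notebooks. The most representative networks are included; models introduced in 4.0 are not.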
Network |
TAO 3.x (21.08, 21.11, 22.02, 22.05) |
TAO 4.0 |
Classification |
tao classification export \
-m $USER_EXPERIMENT_DIR/output_retrain/weights/resnet_$EPOCH.tlt \
-o $USER_EXPERIMENT_DIR/export/final_model.etlt \
-k $KEY \
--cal_data_file $USER_EXPERIMENT_DIR/export/calibration.tensor \
--data_type int8 \
--batches 10 \
--cal_cache_file $USER_EXPERIMENT_DIR/export/final_model_int8_cache.bin \
--classmap_json $USER_EXPERIMENT_DIR/output_retrain/classmap.json \
--gen_ds_config
tao converter $USER_EXPERIMENT_DIR/export/final_model.etlt \
-k $KEY \
-c $USER_EXPERIMENT_DIR/export/final_model_int8_cache.bin \
-o predictions/Softmax \
-d 3,224,224 \
-i nchw \
-m 64 -t int8 \
-e $USER_EXPERIMENT_DIR/export/final_model.trt \
-b 64
|
tao classification_tf1 export \
-m $USER_EXPERIMENT_DIR/output_retrain/weights/resnet_$EPOCH.tlt \
-o $USER_EXPERIMENT_DIR/export/final_model.etlt \
-k $KEY \
--classmap_json $USER_EXPERIMENT_DIR/output_retrain/classmap.json \
--gen_ds_config
tao-deploy classification_tf1 gen_trt_engine \
-m $USER_EXPERIMENT_DIR/export/final_model.etlt \
-e $SPECS_DIR/classification_retrain_spec.cfg \
-k $KEY \
--batch_size 64 \
--max_batch_size 64 \
--batches 10 \
--data_type int8 \
--cal_data_file $USER_EXPERIMENT_DIR/export/calibration.tensor \
--cal_cache_file $USER_EXPERIMENT_DIR/export/final_model_int8_cache.bin \
--cal_image_dir $DATA_DOWNLOAD_DIR/split/test/ \
--engine_file $USER_EXPERIMENT_DIR/export/final_model.trt
|
DetectNet_v2 |
tao detectnet_v2 export \
-m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
-o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
-k $KEY \
--cal_data_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor \
--data_type int8 \
--batches 10 \
--batch_size 4 \
--max_batch_size 4 \
--engine_file $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt.int8 \
--cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
--verbose
tao converter $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
-k $KEY \
-c $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
-o output_cov/Sigmoid,output_bbox/BiasAdd \
-d 3,384,1248 \
-i nchw \
-m 64 \
-t int8 \
-e $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt \
-b 4
|
tao detectnet_v2 export \
-m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
-e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
-o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
-k $KEY \
--gen_ds_config
tao-deploy detectnet_v2 gen_trt_engine \
-m $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
-k $KEY \
--data_type int8 \
--batches 10 \
--batch_size 4 \
--max_batch_size 64 \
--engine_file $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt.int8 \
--cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
-e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
--verbose
|
EfficientDet |
tao efficientdet export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.tlt \
-o $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.etlt \
-k $KEY \
-e $SPECS_DIR/efficientdet_d0_retrain.txt \
--batch_size 8 \
--data_type int8 \
--cal_image_dir $DATA_DOWNLOAD_DIR/raw-data/val2017 \
--batches 10 \
--max_batch_size 1 \
--cal_cache_file $USER_EXPERIMENT_DIR/export/efficientdet_d0.cal
tao converter -k $KEY \
-c $USER_EXPERIMENT_DIR/export/trt.int8.cal \
-p image_arrays:0,1x512x512x3,8x512x512x3,16x512x512x3 \
-e $USER_EXPERIMENT_DIR/export/trt.int8.engine \
-t int8 \
-b 8 \
$USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.etlt
|
tao efficientdet_tf1 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.tlt \
-o $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.etlt \
-k $KEY \
-e $SPECS_DIR/efficientdet_d0_retrain.txt
tao-deploy efficientdet_tf1 gen_trt_engine -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/model.step-$NUM_STEP.etlt \
-k $KEY \
--batch_size 8 \
--data_type int8 \
--cal_image_dir $DATA_DOWNLOAD_DIR/raw-data/val2017 \
--batches 10 \
--min_batch_size 1 \
--opt_batch_size 8 \
--max_batch_size 16 \
--cal_cache_file $USER_EXPERIMENT_DIR/export/efficientdet_d0.cal \
--engine_file $USER_EXPERIMENT_DIR/export/trt.int8.engine
|
SSD |
tao ssd export --gpu_index=$GPU_INDEX \
-m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/ssd_resnet18_epoch_$EPOCH.tlt \
-o $USER_EXPERIMENT_DIR/export/ssd_resnet18_epoch_$EPOCH.etlt \
-e $SPECS_DIR/ssd_retrain_resnet18_kitti.txt \
-k $KEY \
--cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
--data_type int8 \
--batch_size 16 \
--batches 10 \
--cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
--cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
--gen_ds_config
tao converter -k $KEY \
-d 3,300,300 \
-o NMS \
-c $USER_EXPERIMENT_DIR/export/cal.bin \
-e $USER_EXPERIMENT_DIR/export/trt.engine \
-b 8 \
-m 16 \
-t int8 \
-i nchw \
$USER_EXPERIMENT_DIR/export/ssd_resnet18_epoch_$EPOCH.etlt
|
tao ssd export --gpu_index=$GPU_INDEX \
-m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/ssd_resnet18_epoch_$EPOCH.tlt \
-k $KEY \
-o $USER_EXPERIMENT_DIR/export/ssd_resnet18_epoch_$EPOCH.etlt \
-e $SPECS_DIR/ssd_retrain_resnet18_kitti.txt \
--batch_size 16 \
--gen_ds_config
tao-deploy ssd gen_trt_engine --gpu_index=$GPU_INDEX \
-m $USER_EXPERIMENT_DIR/export/ssd_resnet18_epoch_$EPOCH.etlt \
-k $KEY \
-e $SPECS_DIR/ssd_retrain_resnet18_kitti.txt \
--engine_file $USER_EXPERIMENT_DIR/export/trt.engine \
--cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
--data_type int8 \
--max_batch_size 16 \
--batch_size 16 \
--batches 10 \
--cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
--cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile
|
UNet |
tao unet export --gpu_index=$GPU_INDEX -m $USER_EXPERIMENT_DIR/isbi_experiment_retrain/weights/model_isbi_retrained.tlt \
-k $KEY \
-e $SPECS_DIR/unet_train_resnet_unet_isbi_retrain.txt \
--data_type int8 \
--engine_file $USER_EXPERIMENT_DIR/export/int8.isbi.retrained.engine \
--cal_data_file $USER_EXPERIMENT_DIR/export/isbi_cal_data_file.txt \
--cal_cache_file $USER_EXPERIMENT_DIR/export/isbi_cal.bin \
--cal_image_dir $DATA_DOWNLOAD_DIR/isbi/images/val \
--max_batch_size 3 \
--batch_size 1 \
--gen_ds_config
tao converter -k $KEY \
-c $USER_EXPERIMENT_DIR/export/isbi_cal.bin \
-e $USER_EXPERIMENT_DIR/export/trt.int8.tlt.isbi.engine \
-i nchw \
-t int8 \
-p input_1:0,1x1x320x320,4x1x320x320,16x1x320x320 \
$USER_EXPERIMENT_DIR/isbi_experiment_retrain/weights/model_isbi_retrained.etlt
|
tao unet export --gpu_index=$GPU_INDEX -m $USER_EXPERIMENT_DIR/isbi_experiment_retrain/weights/model_isbi_retrained.tlt \
-k $KEY \
-e $SPECS_DIR/unet_train_resnet_unet_isbi_retrain.txt \
--gen_ds_config
tao-deploy unet gen_trt_engine --gpu_index=$GPU_INDEX -m $USER_EXPERIMENT_DIR/isbi_experiment_retrain/weights/model_isbi_retrained.etlt \
-k $KEY \
-e $SPECS_DIR/unet_train_resnet_unet_isbi_retrain.txt \
--data_type int8 \
--engine_file $USER_EXPERIMENT_DIR/export/int8.isbi.retrained.engine \
--cal_data_file $USER_EXPERIMENT_DIR/export/isbi_cal_data_file.txt \
--cal_cache_file $USER_EXPERIMENT_DIR/export/isbi_cal.bin \
--cal_image_dir $DATA_DOWNLOAD_DIR/isbi/images/val \
--max_batch_size 3 \
--batch_size 1
|
YOLOv3 |
tao yolo_v3 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov3_resnet18_epoch_$EPOCH.tlt \
-o $USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt \
-e $SPECS_DIR/yolo_v3_retrain_resnet18_tfrecord.txt \
-k $KEY \
--cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
--data_type int8 \
--batch_size 16 \
--batches 10 \
--cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
--cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
--gen_ds_config
tao converter -k $KEY \
-p Input,1x3x384x1248,8x3x384x1248,16x3x384x1248 \
-c $USER_EXPERIMENT_DIR/export/cal.bin \
-e $USER_EXPERIMENT_DIR/export/trt.engine \
-b 8 \
-t int8 \
$USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt
|
tao yolo_v3 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov3_resnet18_epoch_$EPOCH.tlt \
-k $KEY \
-o $USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt \
-e $SPECS_DIR/yolo_v3_retrain_resnet18_tfrecord.txt \
--gen_ds_config
tao-deploy yolo_v3 gen_trt_engine -m $USER_EXPERIMENT_DIR/export/yolov3_resnet18_epoch_$EPOCH.etlt \
-k $KEY \
-e $SPECS_DIR/yolo_v3_retrain_resnet18_tfrecord.txt \
--cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
--data_type int8 \
--batch_size 16 \
--min_batch_size 1 \
--opt_batch_size 8 \
--max_batch_size 16 \
--batches 10 \
--cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin \
--cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
--engine_file $USER_EXPERIMENT_DIR/export/trt.engine.int8
|
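The pattern across the rows above is that calibration and engine flags leave `export` and move to `gen_trt_engine`. The sketch below illustrates that split for an existing 3.x argument list. The helper `split_export_args` is hypothetical, and the list of flags it moves is read off the commands in the table rather than taken from an official list. It assumes every moved flag takes exactly one value and that no value contains spaces.

```shell
#!/bin/sh
# Hypothetical helper: partition TAO 3.x `export` arguments into the ones that
# stay with `export` and the ones that move to `gen_trt_engine` in TAO 4.0.
# Results are stored as space-separated strings (values with spaces are not handled).
EXPORT_ARGS=""
DEPLOY_ARGS=""

split_export_args() {
  EXPORT_ARGS=""
  DEPLOY_ARGS=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --cal_data_file|--cal_cache_file|--cal_image_dir|--data_type|--batches|--batch_size|--max_batch_size|--engine_file)
        # Flag plus its single value move to the deploy side.
        if [ "$#" -lt 2 ]; then
          echo "missing value for $1" >&2
          return 1
        fi
        DEPLOY_ARGS="${DEPLOY_ARGS:+$DEPLOY_ARGS }$1 $2"
        shift 2
        ;;
      *)
        # Everything else stays with export.
        EXPORT_ARGS="${EXPORT_ARGS:+$EXPORT_ARGS }$1"
        shift
        ;;
    esac
  done
}

# Example input modeled on the classification row; '$KEY' is kept literal on purpose.
split_export_args -m resnet.tlt -o final_model.etlt -k '$KEY' \
  --data_type int8 --batches 10 --cal_cache_file cal.bin --gen_ds_config
echo "export:         $EXPORT_ARGS"
echo "gen_trt_engine: $DEPLOY_ARGS"
```

Treat the result as a starting point only. The `gen_trt_engine` call still needs its own `-m` (the exported `.etlt`), `-k`, and usually `-e` arguments, and the table shows per-network exceptions, such as the SSD export in 4.0 keeping `--batch_size`.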