Experimental Settings

🚧

All settings listed on this page are experimental and may break the TensorRT conversion process. Non-advanced users should stick with the default settings for maximum compatibility. Please contact us if you require a custom conversion or if you wish to include other experimental settings.

📘

All experimental settings listed on this page are optional. If no settings are overridden, default values that have been tested to be the most compatible for TensorRT conversion and inference are used.

DockerParams

Experimental settings for Docker-related parameters.

Bases

dataclass

| Field | Type | Description |
| --- | --- | --- |
| conversion_docker_image | str | Docker image for conversion. Defaults to "nvcr.io/nvidia/tensorflow:23.04-tf2-py3". |
| inference_docker_image | str | Docker image for inference. Defaults to "nvcr.io/nvidia/tritonserver:23.04-py3". |
| device | int | Device to use for conversion. Defaults to 0. |
| workspace_path | str | Workspace path. Defaults to "/workspace". |
| dir_perms | str | Directory permissions, one of ["r", "w", "rw"]. Defaults to "rw". |
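
As a rough illustration of how these defaults and overrides fit together, the sketch below defines a stand-in dataclass with the same fields. The class here is hypothetical: the real DockerParams ships with the library, its import path is not shown on this page, and the dir_perms check is only an assumption based on the documented choices.

```python
from dataclasses import dataclass


# Hypothetical stand-in that mirrors the documented fields and defaults.
# The real class is provided by the library.
@dataclass
class DockerParams:
    conversion_docker_image: str = "nvcr.io/nvidia/tensorflow:23.04-tf2-py3"
    inference_docker_image: str = "nvcr.io/nvidia/tritonserver:23.04-py3"
    device: int = 0
    workspace_path: str = "/workspace"
    dir_perms: str = "rw"

    def __post_init__(self) -> None:
        # Assumed validation: the page lists "r", "w" and "rw" as the choices.
        if self.dir_perms not in ("r", "w", "rw"):
            raise ValueError(f"dir_perms must be 'r', 'w' or 'rw', got {self.dir_perms!r}")


# No arguments: every field keeps its documented default.
defaults = DockerParams()

# Override only what is needed; the remaining fields keep their defaults.
custom = DockerParams(device=1, dir_perms="r")
```

Overriding only the fields that must change keeps the rest of the configuration on the tested defaults.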

ConversionParams

🚧

INT8 conversion is currently not supported. Please contact us if you require a custom conversion.

Parameters for TensorRT conversion.

Bases

dataclass

| Field | Type | Description |
| --- | --- | --- |
| precision | str | Floating-point precision for TensorRT conversion, either "FP32" or "FP16". Defaults to "FP32". |
| autoinstall_deps | bool | Whether Polygraphy will automatically install required Python packages at runtime. Defaults to True. |
| internal_correctness_checks | bool | Whether internal correctness checks are enabled. Defaults to False. |
| builder_optimization_level | int | Optimization level for the TensorRT builder in the range [1, 5]. A higher optimization level allows the optimizer to spend more time searching for optimization opportunities. The resulting engine may perform better than an engine built with a lower optimization level, but the conversion time will increase significantly. Defaults to 3. |
| precision_constraints | str | If set to "obey", require that layers execute in the specified precisions. If set to "prefer", prefer that layers execute in the specified precisions, but allow TensorRT to fall back to other precisions if no implementation exists for the requested precision. Otherwise, precision constraints are ignored. Defaults to "none". |
| timeout | int | Timeout (in seconds) for conversion. Defaults to 1800. |
| verbose | bool | Verbose output. Defaults to False. |
| log_dir | str | Log directory to store conversion logs. Defaults to ".datature_logs". |
| experimental | ConversionExperimentalParams | Experimental parameters for TensorRT conversion. Defaults to None. |
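
The documented value ranges can be checked before a conversion is started. The following is a minimal sketch using a hypothetical stand-in dataclass with a subset of the fields above; the validation logic is an assumption drawn from the descriptions, not the library's own behaviour.

```python
from dataclasses import dataclass


# Hypothetical stand-in with a subset of the documented fields.
@dataclass
class ConversionParams:
    precision: str = "FP32"
    builder_optimization_level: int = 3
    precision_constraints: str = "none"
    timeout: int = 1800

    def __post_init__(self) -> None:
        # Only "FP32" and "FP16" are documented; INT8 is listed as unsupported.
        if self.precision not in ("FP32", "FP16"):
            raise ValueError(f"unsupported precision: {self.precision!r}")
        # The documented range for the builder optimization level is [1, 5].
        if not 1 <= self.builder_optimization_level <= 5:
            raise ValueError("builder_optimization_level must be between 1 and 5")


# Quick build: lowest documented optimization level.
fast_build = ConversionParams(builder_optimization_level=1)

# Slower build: FP16 at the highest level, with a longer timeout because
# higher levels increase conversion time.
tuned_fp16 = ConversionParams(
    precision="FP16",
    builder_optimization_level=5,
    timeout=7200,
)
```

Raising the timeout alongside the optimization level is a judgment call here; the 7200-second value is illustrative only.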

ConversionExperimentalParams

Experimental settings for TensorRT conversion.

Bases

dataclass

| Field | Type | Description |
| --- | --- | --- |
| sparse_weights | bool | Whether to enable optimizations for sparse weights in TensorRT. Defaults to False. |
| version_compatible | bool | Whether to build an engine that is forward-compatible with later TensorRT versions. Defaults to False. |
| error_on_timing_cache_miss | bool | Whether to emit errors when a tactic being timed is not present in the timing cache. Defaults to False. |
| load_timing_cache | str | Load the specified file containing a tactic timing cache, which is used to speed up the TensorRT engine building process. Defaults to None. |
| save_timing_cache | str | Save the tactic timing cache to the specified file. Defaults to None. |
| disable_compilation_cache | bool | Whether to disable caching of JIT-compiled code. Defaults to False. |
| load_tactics | str | Load the specified tactic replay file to override TensorRT's default tactic selections. Defaults to None. |
| save_tactics | str | Save the tactics selected by TensorRT to the specified JSON file. Defaults to None. |
| check_params | CheckParams | Parameters for polygraphy check lint. Defaults to None. |
| sanitize_params | SanitizeParams | Parameters for polygraphy surgeon sanitize. Defaults to None. |
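
One common way to use the two timing cache fields together is to point them at the same file, so the first conversion writes the cache and later conversions reuse it. The helper below is a hypothetical sketch of that pattern. This page does not say how the library handles a missing cache file, so the helper only sets load_timing_cache when the file already exists.

```python
from pathlib import Path
from typing import Dict, Optional


def timing_cache_settings(cache_file: str) -> Dict[str, Optional[str]]:
    """Return load/save values that reuse one timing cache file across runs.

    Hypothetical helper: the returned keys match the documented field names.
    """
    cache_exists = Path(cache_file).is_file()
    return {
        # Load only when a previous run has already written the cache.
        "load_timing_cache": cache_file if cache_exists else None,
        # Always save, so the next run can pick the cache up.
        "save_timing_cache": cache_file,
    }


settings = timing_cache_settings("timing.cache")
```

The resulting values could then be passed to the corresponding fields of ConversionExperimentalParams.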

CheckParams

Experimental settings for polygraphy check lint, which topologically "lints" an ONNX model to find faulty nodes in the graph.

Bases

dataclass

| Field | Type | Description |
| --- | --- | --- |
| enabled | bool | Enables polygraphy check lint. Defaults to False. |
| output_json_path | str | Output JSON path. Defaults to ".datature_logs/<CURRENT_TIME>.json". |
| provider | str | Execution provider for ONNX model loading. Defaults to "cpu". |
| timeout | int | Timeout (in seconds) for linting. Defaults to 1800. |
| verbose | bool | Verbose output. Defaults to False. |
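
Because the default output path contains the current time, it has to be computed when the parameters are created rather than fixed in advance. The sketch below shows one way this could be modelled with a stand-in dataclass. The class is hypothetical, and the timestamp format is an assumption, since the page only shows the "<CURRENT_TIME>" placeholder.

```python
from dataclasses import dataclass, field
from datetime import datetime


def _default_report_path() -> str:
    # Assumed timestamp format; the page only shows "<CURRENT_TIME>".
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f".datature_logs/{stamp}.json"


# Hypothetical stand-in that mirrors the documented fields.
@dataclass
class CheckParams:
    enabled: bool = False
    # default_factory runs on each instantiation, so every instance gets
    # a path based on its own creation time.
    output_json_path: str = field(default_factory=_default_report_path)
    provider: str = "cpu"
    timeout: int = 1800
    verbose: bool = False


# Enable linting and keep the time-based default report path.
lint_default = CheckParams(enabled=True)

# Or write the report to a fixed, explicit location.
lint_fixed = CheckParams(enabled=True, output_json_path="lint_report.json")
```

Setting output_json_path explicitly is useful when a later step needs to find the report at a known location.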

SanitizeParams

Experimental settings for polygraphy surgeon sanitize, which runs ONNX graph surgeon to clean up and optimize input shapes in an ONNX model.

Bases

dataclass

| Field | Type | Description |
| --- | --- | --- |
| enabled | bool | Enables polygraphy surgeon sanitize. Defaults to False. |
| output_model_path | str | Output path to save sanitized model. Defaults to "". |
| cleanup | bool | Run dead layer removal on the graph. This is generally not required if other options are set. |
| toposort | bool | Topologically sort nodes in the graph. Defaults to False. |
| no_shape_inference | bool | Disable ONNX shape inference when loading the model. Defaults to False. |
| force_fallback_shape_inference | bool | Force Polygraphy to use ONNX-Runtime to determine metadata for tensors in the graph. This can be useful in cases where ONNX shape inference does not generate correct information. Note that this will cause dynamic dimensions to become static. Defaults to False. |
| fold_constants | bool | Fold constants in the graph by computing subgraphs whose values are not dependent on runtime inputs. Defaults to False. |
| num_passes | int | The number of constant folding passes to run. Sometimes, subgraphs that compute tensor shapes may not be foldable in a single pass. If not specified, Polygraphy will automatically determine the number of passes required. Defaults to None. |
| partitioning | str | Controls how to partition the graph during constant folding. "basic": partition the graph so failures in one part do not affect other parts. "recursive": in addition to partitioning the graph, partition partitions where needed. Defaults to None. |
| no_fold_shapes | bool | Disable folding Shape nodes and subgraphs that operate on shapes. Defaults to False. |
| no_per_pass_shape_inference | bool | Disable shape inference between passes of constant folding. Defaults to False. |
| timeout | int | Timeout (in seconds) for model sanitization. Defaults to 1800. |
| verbose | bool | Verbose output. Defaults to False. |
| log_dir | str | Log directory to store sanitization logs. Defaults to ".datature_logs". |
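
The field names above closely follow the command-line options of polygraphy surgeon sanitize. To make the relationship concrete, the sketch below builds an equivalent command line from a hypothetical stand-in dataclass covering a subset of the fields. How the library actually invokes Polygraphy is not described on this page, so the mapping is an assumption, and the model file names are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical stand-in with a subset of the documented fields.
@dataclass
class SanitizeParams:
    output_model_path: str = ""
    toposort: bool = False
    fold_constants: bool = False
    num_passes: Optional[int] = None
    partitioning: Optional[str] = None
    no_fold_shapes: bool = False


def build_sanitize_command(model_path: str, params: SanitizeParams) -> List[str]:
    """Build an argument list for `polygraphy surgeon sanitize` (assumed mapping)."""
    cmd = ["polygraphy", "surgeon", "sanitize", model_path]
    if params.output_model_path:
        cmd += ["-o", params.output_model_path]
    if params.toposort:
        cmd.append("--toposort")
    if params.fold_constants:
        cmd.append("--fold-constants")
        # The options below only affect constant folding, so they are
        # emitted only when folding is enabled.
        if params.num_passes is not None:
            cmd += ["--num-passes", str(params.num_passes)]
        if params.partitioning is not None:
            cmd += ["--partitioning", params.partitioning]
        if params.no_fold_shapes:
            cmd.append("--no-fold-shapes")
    return cmd


params = SanitizeParams(
    output_model_path="model_sanitized.onnx",
    fold_constants=True,
    partitioning="basic",
)
command = build_sanitize_command("model.onnx", params)
print(" ".join(command))
```

The list form can be passed to subprocess.run without shell quoting issues, which is why the helper returns a list rather than a single string.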