Experimental Settings
All settings listed on this page are experimental and may break the TensorRT conversion process. Unless you are an advanced user, stick with the default settings for maximum compatibility. Please contact us if you require a custom conversion or if you wish to include other experimental settings.
All experimental settings listed on this page are optional. If no settings are overridden, default values that have been tested to be the most compatible with TensorRT conversion and inference are used.
DockerParams
Experimental settings for Docker-related parameters.
Bases
dataclass
Field | Type | Description |
---|---|---|
conversion_docker_image | str | Docker image for conversion. Defaults to "nvcr.io/nvidia/tensorflow:23.04-tf2-py3". |
inference_docker_image | str | Docker image for inference. Defaults to "nvcr.io/nvidia/tritonserver:23.04-py3". |
device | int | Device to use for conversion. Defaults to 0. |
workspace_path | str | Workspace path. Defaults to "/workspace". |
dir_perms | str | Directory permissions, one of ["r", "w", "rw"]. Defaults to "rw". |
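
The table above can be read as a plain dataclass with defaults. The following is a minimal sketch only: `DockerParamsSketch` is a hypothetical stand-in (the real class's import path is not given on this page), and the validation in `__post_init__`, including treating `device` as a non-negative index, is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass

# Hypothetical stand-in mirroring the DockerParams table; not the library class.
VALID_DIR_PERMS = ("r", "w", "rw")


@dataclass
class DockerParamsSketch:
    conversion_docker_image: str = "nvcr.io/nvidia/tensorflow:23.04-tf2-py3"
    inference_docker_image: str = "nvcr.io/nvidia/tritonserver:23.04-py3"
    device: int = 0
    workspace_path: str = "/workspace"
    dir_perms: str = "rw"

    def __post_init__(self):
        # Assumed checks: the page lists the allowed values but does not say
        # whether or how the library enforces them.
        if self.dir_perms not in VALID_DIR_PERMS:
            raise ValueError(f"dir_perms must be one of {VALID_DIR_PERMS}")
        if self.device < 0:
            raise ValueError("device is assumed to be a non-negative index")


# Override only the device; every other field keeps its documented default.
params = DockerParamsSketch(device=1)
```

Overriding a single field this way leaves the tested defaults in place for everything else, which matches the page's advice to change as little as possible.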
ConversionParams
INT8 conversion is currently not supported. Please contact us if you require a custom conversion.
Parameters for TensorRT conversion.
Bases
dataclass
Field | Type | Description |
---|---|---|
precision | str | Floating-point precision for TensorRT conversion, either "FP32" or "FP16". Defaults to "FP32". |
autoinstall_deps | bool | Whether Polygraphy will automatically install required Python packages at runtime. Defaults to True. |
internal_correctness_checks | bool | Whether internal correctness checks are enabled. Defaults to False. |
builder_optimization_level | int | Optimization level for TensorRT builder in the range [1, 5]. A higher optimization level allows the optimizer to spend more time searching for optimization opportunities. The resulting engine may have better performance compared to an engine built with a lower optimization level, but the conversion time will increase significantly. Defaults to 3. |
precision_constraints | str | If set to "obey", require that layers execute in specified precisions. If set to "prefer", prefer that layers execute in specified precisions but allow TensorRT to fall back to other precisions if no implementation exists for the requested precision. Otherwise, precision constraints are ignored. Defaults to "none". |
timeout | int | Timeout (in seconds) for conversion. Defaults to 1800. |
verbose | bool | Verbose output. Defaults to False. |
log_dir | str | Log directory to store conversion logs. Defaults to ".datature_logs". |
experimental | ConversionExperimentalParams | Experimental parameters for TensorRT conversion. Defaults to None. |
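
As an illustration of the constraints in this table, here is a hedged sketch of a dataclass with the documented defaults. `ConversionParamsSketch` is a hypothetical name, and the range and value checks are assumptions added for the example; the page does not state how the library validates these fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConversionParamsSketch:
    """Hypothetical stand-in mirroring the ConversionParams table."""

    precision: str = "FP32"
    autoinstall_deps: bool = True
    internal_correctness_checks: bool = False
    builder_optimization_level: int = 3
    precision_constraints: str = "none"
    timeout: int = 1800  # seconds
    verbose: bool = False
    log_dir: str = ".datature_logs"
    experimental: Optional[object] = None

    def __post_init__(self):
        # Assumed validation, based only on the ranges stated in the table.
        if self.precision not in ("FP32", "FP16"):
            raise ValueError('precision must be "FP32" or "FP16"')
        if not 1 <= self.builder_optimization_level <= 5:
            raise ValueError("builder_optimization_level must be in [1, 5]")


# FP16 build with a higher optimization level; the timeout is raised because
# the table notes that conversion time grows with the optimization level.
fp16_params = ConversionParamsSketch(
    precision="FP16", builder_optimization_level=4, timeout=3600
)
```

`precision_constraints` is deliberately left unchecked here, since the table says any value other than "obey" or "prefer" is simply ignored.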
ConversionExperimentalParams
Experimental settings for TensorRT conversion.
Bases
dataclass
Field | Type | Description |
---|---|---|
sparse_weights | bool | Whether to enable optimizations for sparse weights in TensorRT. Defaults to False. |
version_compatible | bool | Whether to build an engine designed to be forward-compatible with later TensorRT versions. Defaults to False. |
error_on_timing_cache_miss | bool | Whether to emit errors when a tactic being timed is not present in the timing cache. Defaults to False. |
load_timing_cache | str | Load specified file containing tactic timing cache used to speed up the TensorRT engine building process. Defaults to None. |
save_timing_cache | str | Save tactic timing cache to specified file. Defaults to None. |
disable_compilation_cache | bool | Whether to disable caching of JIT-compiled code. Defaults to False. |
load_tactics | str | Load the specified tactic replay file to override TensorRT's default tactic selections. Defaults to None. |
save_tactics | str | Save tactics selected by TensorRT to a specified JSON file. Defaults to None. |
check_params | CheckParams | Parameters for polygraphy check lint. Defaults to None. |
sanitize_params | SanitizeParams | Parameters for polygraphy surgeon sanitize. Defaults to None. |
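
The `load_timing_cache` and `save_timing_cache` fields are naturally used together across repeated conversions. The sketch below shows one possible pattern under stated assumptions: `ExperimentalSketch` and `timing_cache_settings` are hypothetical names, and the choice to skip loading when the file does not yet exist is a design assumption, since the page does not say how a missing cache file is handled.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExperimentalSketch:
    """Hypothetical subset of the ConversionExperimentalParams fields."""

    load_timing_cache: Optional[str] = None
    save_timing_cache: Optional[str] = None
    error_on_timing_cache_miss: bool = False


def timing_cache_settings(cache_path: str) -> ExperimentalSketch:
    """Load the cache only if the file already exists; always save to it."""
    existing = cache_path if os.path.isfile(cache_path) else None
    return ExperimentalSketch(
        load_timing_cache=existing,
        save_timing_cache=cache_path,
    )
```

On a first run this yields settings that only save the cache; later runs with the same path both load and save it, which is the situation where the engine build can be sped up.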
CheckParams
Experimental settings for polygraphy check lint, which topologically "lints" an ONNX model to find faulty nodes in the graph.
Bases
dataclass
Field | Type | Description |
---|---|---|
enabled | bool | Enables polygraphy check lint. Defaults to False. |
output_json_path | str | Output JSON path. Defaults to ".datature_logs/<CURRENT_TIME>.json". |
provider | str | Execution provider for ONNX model loading. Defaults to "cpu". |
timeout | int | Timeout for lint. Defaults to 1800 seconds. |
verbose | bool | Verbose output. Defaults to False. |
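
A default such as ".datature_logs/<CURRENT_TIME>.json" has to be computed per instance rather than once at class definition. The sketch below shows how that could be done with `field(default_factory=...)`; `CheckParamsSketch` is a hypothetical stand-in, and the timestamp format is an assumption because the page only shows the `<CURRENT_TIME>` placeholder.

```python
from dataclasses import dataclass, field
from datetime import datetime


def _default_report_path() -> str:
    # Assumed timestamp format; the page does not specify it.
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f".datature_logs/{stamp}.json"


@dataclass
class CheckParamsSketch:
    """Hypothetical stand-in mirroring the CheckParams table."""

    enabled: bool = False
    # default_factory runs at instantiation, so each instance gets its own path.
    output_json_path: str = field(default_factory=_default_report_path)
    provider: str = "cpu"
    timeout: int = 1800  # seconds
    verbose: bool = False


# Turn linting on and keep the remaining defaults.
lint_params = CheckParamsSketch(enabled=True)
```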
SanitizeParams
Experimental settings for polygraphy surgeon sanitize, which runs ONNX graph surgeon to clean up and optimize input shapes in an ONNX model.
Bases
dataclass
Field | Type | Description |
---|---|---|
enabled | bool | Enables polygraphy surgeon sanitize. Defaults to False. |
output_model_path | str | Output path to save sanitized model. Defaults to "". |
cleanup | bool | Run dead layer removal on the graph. This is generally not required if other options are set. |
toposort | bool | Topologically sort nodes in the graph. Defaults to False. |
no_shape_inference | bool | Disable ONNX shape inference when loading the model. Defaults to False. |
force_fallback_shape_inference | bool | Force Polygraphy to use ONNX-Runtime to determine metadata for tensors in the graph. This can be useful in cases where ONNX shape inference does not generate correct information. Note that this will cause dynamic dimensions to become static. Defaults to False. |
fold_constants | bool | Fold constants in the graph by computing subgraphs whose values are not dependent on runtime inputs. Defaults to False. |
num_passes | int | The number of constant folding passes to run. Sometimes, subgraphs that compute tensor shapes may not be foldable in a single pass. If not specified, Polygraphy will automatically determine the number of passes required. Defaults to None. |
partitioning | str | Controls how to partition the graph during constant folding. "basic": partition the graph so failures in one part do not affect other parts. "recursive": in addition to partitioning the graph, partition partitions where needed. Defaults to None. |
no_fold_shapes | bool | Disable folding Shape nodes and subgraphs that operate on shapes. Defaults to False. |
no_per_pass_shape_inference | bool | Disable shape inference between passes of constant folding. Defaults to False. |
timeout | int | Timeout for model sanitization. Defaults to 1800 seconds. |
verbose | bool | Verbose output. Defaults to False. |
log_dir | str | Log directory to store sanitization logs. Defaults to ".datature_logs". |
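
To make the relationship between these fields and the underlying tool more concrete, the sketch below builds a `polygraphy surgeon sanitize` command line from a subset of the fields. It is illustrative only: `SanitizeParamsSketch` and `sanitize_argv` are hypothetical, the one-to-one mapping from field names to CLI flags is an assumption (verify against `polygraphy surgeon sanitize --help` for the installed version), and `cleanup=False` is assumed because the table does not state its default.

```python
from dataclasses import dataclass
from typing import List, Optional

# Boolean fields assumed to map to same-named CLI switches.
_BOOL_FLAGS = (
    "cleanup",
    "toposort",
    "no_shape_inference",
    "force_fallback_shape_inference",
    "fold_constants",
    "no_fold_shapes",
    "no_per_pass_shape_inference",
)


@dataclass
class SanitizeParamsSketch:
    """Hypothetical subset of the SanitizeParams fields."""

    output_model_path: str = ""
    cleanup: bool = False  # default not stated in the table; assumed False
    toposort: bool = False
    no_shape_inference: bool = False
    force_fallback_shape_inference: bool = False
    fold_constants: bool = False
    num_passes: Optional[int] = None
    partitioning: Optional[str] = None
    no_fold_shapes: bool = False
    no_per_pass_shape_inference: bool = False


def sanitize_argv(model_path: str, p: SanitizeParamsSketch) -> List[str]:
    """Build an argument list; nothing is executed here."""
    argv = ["polygraphy", "surgeon", "sanitize", model_path, "-o", p.output_model_path]
    for name in _BOOL_FLAGS:
        if getattr(p, name):
            argv.append("--" + name.replace("_", "-"))
    if p.num_passes is not None:
        argv += ["--num-passes", str(p.num_passes)]
    if p.partitioning is not None:
        if p.partitioning not in ("basic", "recursive"):
            raise ValueError('partitioning must be "basic" or "recursive"')
        argv += ["--partitioning", p.partitioning]
    return argv


example = sanitize_argv(
    "model.onnx",
    SanitizeParamsSketch(
        output_model_path="clean.onnx",
        fold_constants=True,
        num_passes=2,
        partitioning="basic",
    ),
)
```

Leaving `num_passes` as None omits the flag entirely, which corresponds to letting Polygraphy determine the number of passes automatically as described in the table.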