
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/_rendered_examples/dynamo/torch_compile_resnet_example.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_tutorials__rendered_examples_dynamo_torch_compile_resnet_example.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials__rendered_examples_dynamo_torch_compile_resnet_example.py:


.. _torch_compile_resnet:

Compiling ResNet with dynamic shapes using the `torch.compile` backend
========================================================================

This interactive script demonstrates the Torch-TensorRT workflow with `torch.compile` on a ResNet model.

.. GENERATED FROM PYTHON SOURCE LINES 10-12

Imports and Model Definition
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 12-17

.. code-block:: python


    import torch
    import torch_tensorrt
    import torchvision.models as models


.. GENERATED FROM PYTHON SOURCE LINES 18-23

.. code-block:: python


    # Initialize model with half precision and sample inputs
    # (torchvision >= 0.13 uses the `weights` argument; `pretrained=True` is deprecated)
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).half().eval().to("cuda")
    inputs = [torch.randn((1, 3, 224, 224)).to("cuda").half()]
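
As a quick, GPU-free illustration of what ``.half()`` implies: float16 keeps only about three decimal digits of precision. The sketch below uses only the standard library's ``struct`` module (its ``"e"`` format packs a binary16 value) and is an aside, not part of the Torch-TensorRT API:

```python
import struct

# Round-trip a Python float through the binary16 ("half") format to see
# the rounding that float16 model weights and inputs incur.
def to_half(x: float) -> float:
    return struct.unpack("e", struct.pack("e", x))[0]

print(to_half(0.1))  # 0.0999755859375 -- the nearest float16 to 0.1
print(to_half(0.5))  # 0.5 -- exactly representable, no rounding
```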


.. GENERATED FROM PYTHON SOURCE LINES 24-26

Optional Input Arguments to `torch_tensorrt.compile`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 26-43

.. code-block:: python


    # Enabled precision for TensorRT optimization
    enabled_precisions = {torch.half}

    # Whether to print verbose logs
    debug = True

    # Workspace size for TensorRT
    workspace_size = 20 << 30

    # Minimum number of operators per TRT engine segment
    # (Lower value allows more graph segmentation)
    min_block_size = 7

    # Operations to run in Torch, regardless of converter support
    # (an empty set; note that `{}` would create a dict, not a set)
    torch_executed_ops = set()
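
The bit shift in ``workspace_size`` is just compact arithmetic for a byte count; shifting left by 30 multiplies by 2**30. A plain-Python sanity check (no TensorRT assumptions):

```python
# 20 << 30 shifts 20 left by 30 bits, i.e. 20 * 2**30 bytes = 20 GiB
workspace_size = 20 << 30

print(workspace_size)                   # 21474836480
print(workspace_size == 20 * 1024**3)   # True
```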


.. GENERATED FROM PYTHON SOURCE LINES 44-46

Compilation with `torch_tensorrt.compile`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 46-59

.. code-block:: python


    # Build and compile the model with torch.compile, using Torch-TensorRT backend
    optimized_model = torch_tensorrt.compile(
        model,
        ir="torch_compile",
        inputs=inputs,
        enabled_precisions=enabled_precisions,
        debug=debug,
        workspace_size=workspace_size,
        min_block_size=min_block_size,
        torch_executed_ops=torch_executed_ops,
    )


.. GENERATED FROM PYTHON SOURCE LINES 60-62

Equivalently, we could have run the above via the `torch.compile` frontend, as follows:
`optimized_model = torch.compile(model, backend="torch_tensorrt", options={"enabled_precisions": enabled_precisions, ...}); optimized_model(*inputs)`
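
To make the frontend equivalence concrete, the ``options`` dictionary simply mirrors the keyword arguments used above. A minimal sketch of assembling it (the compile call itself is left as a comment, since it requires a CUDA-capable environment; the ``"half"`` string stands in for ``torch.half`` here):

```python
# Keys mirror the keyword arguments passed to torch_tensorrt.compile above.
options = {
    "enabled_precisions": {"half"},  # torch.half in the full example
    "debug": True,
    "workspace_size": 20 << 30,      # 20 GiB
    "min_block_size": 7,
    "torch_executed_ops": set(),
}

# With a CUDA-capable setup, the frontend call would be:
# optimized_model = torch.compile(model, backend="torch_tensorrt", options=options)
```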

.. GENERATED FROM PYTHON SOURCE LINES 64-66

Inference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 66-71

.. code-block:: python


    # Does not cause recompilation (same batch size as input)
    new_inputs = [torch.randn((1, 3, 224, 224)).half().to("cuda")]
    new_outputs = optimized_model(*new_inputs)


.. GENERATED FROM PYTHON SOURCE LINES 72-77

.. code-block:: python


    # Does cause recompilation (new batch size)
    new_batch_size_inputs = [torch.randn((8, 3, 224, 224)).half().to("cuda")]
    new_batch_size_outputs = optimized_model(*new_batch_size_inputs)
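
To build intuition for why the new batch size triggers recompilation, here is a toy, shape-keyed cache in plain Python. It is only an illustration of the caching behavior, not Dynamo's actual guard machinery:

```python
# Toy model of shape-guarded compilation: compiled artifacts are cached per
# input shape, so an unseen shape forces a fresh "compile" (a string stand-in).
compiled_cache = {}

def run(shape):
    if shape not in compiled_cache:
        compiled_cache[shape] = f"engine for {shape}"  # "recompilation" happens here
    return compiled_cache[shape]

run((1, 3, 224, 224))        # first call: compiles
run((1, 3, 224, 224))        # same shape: cache hit, no recompilation
run((8, 3, 224, 224))        # new batch size: recompiles
print(len(compiled_cache))   # 2
```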


.. GENERATED FROM PYTHON SOURCE LINES 78-80

Avoid recompilation by specifying dynamic shapes before Torch-TRT compilation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 80-120

.. code-block:: python


    # The following code illustrates the workflow using ir=torch_compile (which uses torch.compile under the hood)
    inputs_bs8 = torch.randn((8, 3, 224, 224)).half().to("cuda")
    # Mark dimension 0 of inputs_bs8 as dynamic, with a value range of [2, 16]
    torch._dynamo.mark_dynamic(inputs_bs8, 0, min=2, max=16)
    optimized_model = torch_tensorrt.compile(
        model,
        ir="torch_compile",
        inputs=inputs_bs8,
        enabled_precisions=enabled_precisions,
        debug=debug,
        workspace_size=workspace_size,
        min_block_size=min_block_size,
        torch_executed_ops=torch_executed_ops,
    )
    outputs_bs8 = optimized_model(inputs_bs8)

    # No recompilation happens for batch size = 12
    inputs_bs12 = torch.randn((12, 3, 224, 224)).half().to("cuda")
    outputs_bs12 = optimized_model(inputs_bs12)

    # The following code illustrates the workflow using ir=dynamo (which uses torch.export APIs under the hood)
    # Dynamic shapes for the inputs are specified using the torch_tensorrt.Input API
    compile_spec = {
        "inputs": [
            torch_tensorrt.Input(
                min_shape=(1, 3, 224, 224),
                opt_shape=(8, 3, 224, 224),
                max_shape=(16, 3, 224, 224),
                dtype=torch.half,
            )
        ],
        "enabled_precisions": enabled_precisions,
        "ir": "dynamo",
    }
    trt_model = torch_tensorrt.compile(model, **compile_spec)

    # No recompilation happens for batch size = 12
    inputs_bs12 = torch.randn((12, 3, 224, 224)).half().to("cuda")
    outputs_bs12 = trt_model(inputs_bs12)
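
The min/opt/max shapes above define an optimization profile: any runtime shape must fall within the [min, max] range per dimension. A hypothetical helper (``shape_in_profile`` is not part of the Torch-TensorRT API) makes the constraint explicit:

```python
def shape_in_profile(shape, min_shape, max_shape):
    """True if every dimension of `shape` lies within [min, max] (inclusive)."""
    return len(shape) == len(min_shape) and all(
        lo <= s <= hi for s, lo, hi in zip(shape, min_shape, max_shape)
    )

# Batch size 12 falls inside the profile compiled above...
print(shape_in_profile((12, 3, 224, 224), (1, 3, 224, 224), (16, 3, 224, 224)))  # True
# ...but batch size 32 would not
print(shape_in_profile((32, 3, 224, 224), (1, 3, 224, 224), (16, 3, 224, 224)))  # False
```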


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.000 seconds)


.. _sphx_glr_download_tutorials__rendered_examples_dynamo_torch_compile_resnet_example.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example




    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: torch_compile_resnet_example.py <torch_compile_resnet_example.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: torch_compile_resnet_example.ipynb <torch_compile_resnet_example.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
