.. _lowering:

Lowering Phase
===============

The lowering phase is made up of passes, which are operations that map a graph from a high-level representation
to a lower-level one. Each pass does something specific, for instance inlining method calls. The idea is to
significantly reduce what the conversion phase needs to handle when actually mapping to TensorRT.
We aim for closer to 1->1 op conversion rather than searching for applicable subgraphs, which limits the number of
converters and reduces the scope of each converter.

You can see the effects of each pass by setting the log level to ``Level::kGraph``.

Passes Used
-------------

Eliminate Dead Code
**************************

    `torch/csrc/jit/passes/dead_code_elimination.h <https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/passes/dead_code_elimination.h>`_

Dead code elimination removes nodes whose outputs are never used. It checks whether a node has side effects and
does not delete the node if it does.
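
As a rough illustration of the idea, the sketch below runs dead code elimination over a toy graph. The ``Node``
class, the ``has_side_effects`` flag, and ``eliminate_dead_code`` are hypothetical names invented for this example;
they are not the ``torch::jit`` data structures or API.

.. code-block:: python

    from dataclasses import dataclass
    from typing import List

    # Hypothetical toy IR for illustration only (not torch::jit).
    @dataclass
    class Node:
        op: str
        inputs: List[str]
        outputs: List[str]
        has_side_effects: bool = False

    def eliminate_dead_code(nodes, graph_outputs):
        live = set(graph_outputs)
        kept = []
        # Walk backwards so every user of a value is seen before its producer.
        for node in reversed(nodes):
            if node.has_side_effects or any(o in live for o in node.outputs):
                live.update(node.inputs)
                kept.append(node)
        kept.reverse()
        return kept

    nodes = [
        Node("aten::relu", ["%x"], ["%1"]),
        Node("aten::neg", ["%x"], ["%2"]),       # result is never used: dead
        Node("prim::Print", ["%x"], [], True),   # unused, but has a side effect
        Node("aten::add", ["%1", "%1"], ["%3"]),
    ]
    kept_ops = [n.op for n in eliminate_dead_code(nodes, ["%3"])]
    print(kept_ops)

In this toy graph only ``aten::neg`` is dropped; ``prim::Print`` survives because it is marked as having a side effect.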

Eliminate Exception Or Pass Pattern
***************************************

    `trtorch/core/lowering/passes/exception_elimination.cpp <https://github.com/nvidia/trtorch/blob/master/core/lowering/passes/exception_elimination.cpp>`_

A common pattern in scripted modules is a dimension guard, which throws an exception if
the input dimension is not what was expected.

.. code-block:: none

    %1013 : bool = aten::ne(%1012, %24) # ~/.local/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py:248:11
        = prim::If(%1013) # ~/.local/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py:248:8
        block0():
            = prim::RaiseException(%23) # ~/.local/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py:249:12
        -> ()
        block1():
        -> ()

Since we are resolving all of this at compile time and there are no exceptions in the TensorRT graph, we just remove it.
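
The matching logic can be sketched on a toy graph as follows. This is a minimal illustration under assumed data
structures: ``Node``, ``is_exception_or_pass``, and ``eliminate_exception_or_pass`` are hypothetical names, and the
actual pass operates on ``torch::jit`` graphs rather than Python lists.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import List

    # Hypothetical toy IR: a node has a kind and, for prim::If, two blocks.
    @dataclass
    class Node:
        kind: str
        blocks: List[List["Node"]] = field(default_factory=list)

    def only_raises(block):
        return len(block) == 1 and block[0].kind == "prim::RaiseException"

    def is_exception_or_pass(node):
        # Match an If whose one branch only raises and whose other branch is empty.
        if node.kind != "prim::If" or len(node.blocks) != 2:
            return False
        first, second = node.blocks
        return (only_raises(first) and not second) or (only_raises(second) and not first)

    def eliminate_exception_or_pass(nodes):
        return [n for n in nodes if not is_exception_or_pass(n)]

    graph = [
        Node("aten::ne"),
        Node("prim::If", blocks=[[Node("prim::RaiseException")], []]),
        Node("aten::batch_norm"),
    ]
    # The aten::ne that fed the guard is left in place by this sketch.
    remaining = [n.kind for n in eliminate_exception_or_pass(graph)]
    print(remaining)

An ``If`` node with real work in either branch does not match and is left untouched.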

Eliminate Redundant Guards
***************************************

    `torch/csrc/jit/passes/guard_elimination.h <https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/passes/guard_elimination.h>`_

Eliminates redundant guards for ops whose outputs are fully determined by their inputs, i.e. if the inputs to such
ops are guarded, we are allowed to remove the guards on the ops' outputs.

Freeze Module
***************************************

    `torch/csrc/jit/passes/freeze_module.h <https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/passes/freeze_module.h>`_

Freezes attributes and inlines constants and modules. Propagates constants in the graph.

Fuse Linear
***************************************

    `torch/csrc/jit/passes/fuse_linear.h <https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/passes/fuse_linear.h>`_

Matches the decomposed ``aten::linear`` pattern and fuses it into a single ``aten::linear``.
This pass fuses the ``addmm`` or ``matmul`` + ``add`` generated by the JIT back into linear.
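
A minimal sketch of the ``addmm`` case on a toy graph is shown below. It assumes nodes are plain
``(output, op, inputs)`` tuples and that the scaling arguments of ``addmm`` are both 1 and therefore omitted;
``fuse_linear`` here is a hypothetical helper, not the PyTorch pass itself.

.. code-block:: python

    # Hypothetical toy IR: each node is (output, op, inputs).
    def fuse_linear(nodes):
        # Map every transposed value back to the weight it was computed from.
        transposed = {out: ins[0] for out, op, ins in nodes if op == "aten::t"}
        fused = []
        for out, op, ins in nodes:
            if op == "aten::addmm" and ins[2] in transposed:
                bias, x, weight_t = ins[:3]
                fused.append((out, "aten::linear", [x, transposed[weight_t], bias]))
            else:
                fused.append((out, op, ins))
        # Drop transposes that no longer have any users.
        used = {i for _, _, ins in fused for i in ins}
        return [n for n in fused if not (n[1] == "aten::t" and n[0] not in used)]

    graph = [
        ("%wt", "aten::t", ["%w"]),
        ("%out", "aten::addmm", ["%b", "%x", "%wt"]),
    ]
    fused_graph = fuse_linear(graph)
    print(fused_graph)

The two-node ``t`` + ``addmm`` sequence collapses into one ``aten::linear`` node that takes the untransposed weight.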

Fuse Flatten Linear
***************************************

    `trtorch/core/lowering/passes/fuse_flatten_linear.cpp <https://github.com/nvidia/trtorch/blob/master/core/lowering/passes/fuse_flatten_linear.cpp>`_

TensorRT implicitly flattens the inputs to fully connected layers when they are higher than 1D. So when there is an
``aten::flatten`` -> ``aten::linear`` pattern, we remove the ``aten::flatten``.

Lower Graph
***************************************

    `torch/csrc/jit/passes/lower_graph.h <https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/passes/lower_graph.h>`_

Given the graph of a method whose first argument is ``%self``, lower it to a graph where
all attribute accesses are replaced with explicit inputs of the graph
(rather than results of ``prim::GetAttr`` executed on ``%self``). Returns a tuple
(graph, parameters) where the last ``module.parameters.size()`` inputs to the
graph are the trainable parameters used in this method. The remaining inputs
are the true inputs to the function.
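
The transformation can be sketched on a toy graph as follows. Everything here is an assumption made for
illustration: nodes are ``(output, op, inputs)`` tuples, attribute reads carry the attribute name as a second
input, and ``lower_graph`` is a hypothetical Python function rather than the PyTorch pass.

.. code-block:: python

    # Hypothetical toy IR: each node is (output, op, inputs). Attribute reads
    # are written as (output, "prim::GetAttr", ["%self", attribute_name]).
    def lower_graph(inputs, nodes, module_attrs):
        assert inputs[0] == "%self"
        new_inputs = list(inputs[1:])   # the true inputs, with %self removed
        param_input = {}                # attribute name -> new graph input
        parameters = []
        rename = {}                     # GetAttr output -> new graph input
        lowered = []
        for out, op, ins in nodes:
            if op == "prim::GetAttr" and ins[0] == "%self":
                name = ins[1]
                if name not in param_input:
                    param_input[name] = "%" + name
                    parameters.append(module_attrs[name])
                rename[out] = param_input[name]
            else:
                lowered.append((out, op, [rename.get(i, i) for i in ins]))
        # Parameters become the trailing inputs of the lowered graph.
        new_inputs += list(param_input.values())
        return (new_inputs, lowered), parameters

    nodes = [
        ("%1", "prim::GetAttr", ["%self", "weight"]),
        ("%2", "prim::GetAttr", ["%self", "bias"]),
        ("%3", "aten::linear", ["%x", "%1", "%2"]),
    ]
    attrs = {"weight": [[1.0, 2.0]], "bias": [0.5]}
    (graph_inputs, graph_nodes), params = lower_graph(["%self", "%x"], nodes, attrs)
    print(graph_inputs, graph_nodes, params)

After lowering, the graph no longer refers to ``%self``, and the last ``len(params)`` inputs are the parameters.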

Remove Dropout
***************************************

    `trtorch/core/lowering/passes/remove_dropout.cpp <https://github.com/nvidia/trtorch/blob/master/core/lowering/passes/remove_dropout.cpp>`_

Removes dropout operators since we are doing inference.
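
Because dropout returns its input unchanged in inference, removing it amounts to rewiring every use of its output
to its input. The sketch below shows that rewiring on a toy graph; the tuple-based IR and the ``remove_dropout``
helper are hypothetical and only stand in for the real pass.

.. code-block:: python

    # Hypothetical toy IR: each node is (output, op, inputs).
    def remove_dropout(nodes, graph_outputs):
        alias = {}   # dropout output -> the value it passes through
        kept = []
        for out, op, ins in nodes:
            # Resolve inputs first so chained dropouts collapse as well.
            ins = [alias.get(i, i) for i in ins]
            if op == "aten::dropout":
                alias[out] = ins[0]
            else:
                kept.append((out, op, ins))
        return kept, [alias.get(o, o) for o in graph_outputs]

    nodes = [
        ("%1", "aten::relu", ["%x"]),
        ("%2", "aten::dropout", ["%1", "%p", "%train"]),
        ("%3", "aten::linear", ["%2", "%w", "%b"]),
    ]
    kept, outputs = remove_dropout(nodes, ["%3"])
    print(kept, outputs)

The ``aten::linear`` node now reads directly from the ``aten::relu`` output.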

Unpack AddMM
***************************************

    `trtorch/core/lowering/passes/unpack_addmm.cpp <https://github.com/nvidia/trtorch/blob/master/core/lowering/passes/unpack_addmm.cpp>`_

Unpacks ``aten::addmm`` into ``aten::matmul`` and ``aten::add_`` (with an additional ``trt::const``
op to freeze the bias in the TensorRT graph). This lets us reuse the ``aten::matmul`` and ``aten::add_``
converters instead of needing a dedicated converter.
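
The unpacking relies on ``addmm(bias, mat1, mat2)`` being equal to ``matmul(mat1, mat2) + bias`` when both scaling
factors are 1, which is assumed here. The pure-Python functions below are hypothetical reference implementations
written for this check, not the ATen kernels.

.. code-block:: python

    def matmul(a, b):
        # Plain 2-D matrix product on nested lists.
        return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

    def add(mat, bias):
        # Broadcast a 1-D bias across the rows of a 2-D matrix.
        return [[x + y for x, y in zip(row, bias)] for row in mat]

    def addmm(bias, mat1, mat2, beta=1, alpha=1):
        product = matmul(mat1, mat2)
        return [[beta * b + alpha * p for p, b in zip(row, bias)] for row in product]

    x = [[1.0, 2.0], [3.0, 4.0]]
    w = [[0.5, -1.0], [2.0, 0.0]]
    b = [10.0, 20.0]

    fused = addmm(b, x, w)
    unpacked = add(matmul(x, w), b)
    print(fused == unpacked)

If ``beta`` or ``alpha`` were not 1, the unpacked form would also need the corresponding scaling steps.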

Unpack LogSoftmax
***************************************

    `trtorch/core/lowering/passes/unpack_log_softmax.cpp <https://github.com/nvidia/trtorch/blob/master/core/lowering/passes/unpack_log_softmax.cpp>`_

Unpacks ``aten::log_softmax`` into ``aten::softmax`` and ``aten::log``. This lets us reuse the
``aten::softmax`` and ``aten::log`` converters instead of needing a dedicated converter.
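
The identity behind this pass is that log-softmax is the elementwise log of softmax. The sketch below checks this
numerically with hypothetical pure-Python reference functions over a 1-D list; it is not the converter code.

.. code-block:: python

    import math

    def softmax(xs):
        m = max(xs)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in xs]
        total = sum(exps)
        return [e / total for e in exps]

    def log_softmax(xs):
        # Direct form: x - logsumexp(x).
        m = max(xs)
        log_sum = m + math.log(sum(math.exp(x - m) for x in xs))
        return [x - log_sum for x in xs]

    xs = [1.0, 2.0, 3.0]
    unpacked = [math.log(p) for p in softmax(xs)]
    direct = log_softmax(xs)
    max_error = max(abs(a - b) for a, b in zip(unpacked, direct))
    print(max_error)

For inputs with a very wide range, a softmax probability can underflow to zero before the log is taken, so the
unpacked form trades some numerical robustness for converter reuse.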