# `AC Fixed` Sample

This FPGA sample is structured as a tutorial that demonstrates how to use the Algorithmic C (AC) fixed-point data type `ac_fixed` and some best practices.

| Area                 | Description
|:--                   |:--
| What you will learn  | How different methods of `ac_fixed` number construction affect hardware resource utilization. <br> How to access and use the `ac_fixed` math library functions. <br> Use recommended method for constructing `ac_fixed` numbers in your kernel. <br> Understand trade offs between accuracy of results for reduced resource usage on the FPGA.
| Time to complete     | 30 minutes
| Category             | Concepts and Functionality

## Purpose

This FPGA tutorial shows you how to use the `ac_fixed` type to perform fixed-point arithmetic and includes some simple examples.

You can use the fixed-point data type in place of native floating point types to generate area efficient and optimized designs for the FPGA. Operations that do not utilize the full dynamic range of the native types are good candidates for using `ac_fixed` types. For example, multiplying by a number in the range of `(-1.0, 1.0)`.

This tutorial shows the recommended method for constructing an `ac_fixed` number along with some examples of using the fixed point math library functions. Additionally, the tutorial shows the methods and examples can be used to reduce the area of the hardware generated by the compiler by trading off accuracy of the mathematical operations.

## Prerequisites

| Optimized for        | Description
|:---                  |:---
| OS                   | Ubuntu* 20.04 <br> RHEL*/CentOS* 8 <br> SUSE* 15 <br> Windows* 10, 11 <br> Windows Server* 2019
| Hardware             | Intel® Agilex® 7, Agilex® 5, Arria® 10, Stratix® 10, and Cyclone® V FPGAs
| Software             | Intel® oneAPI DPC++/C++ Compiler

> **Note**: Even though the Intel DPC++/C++ oneAPI compiler is enough to compile for emulation, generating reports and generating RTL, there are extra software requirements for the simulation flow and FPGA compiles.
>
> For using the simulator flow, Intel® Quartus® Prime Pro Edition (or Standard Edition when targeting Cyclone® V) and one of the following simulators must be installed and accessible through your PATH:
> - Questa*-Intel® FPGA Edition
> - Questa*-Intel® FPGA Starter Edition
> - ModelSim® SE
>
> When using the hardware compile flow, Intel® Quartus® Prime Pro Edition (or Standard Edition when targeting Cyclone® V) must be installed and accessible through your PATH.

> **Warning** Make sure you add the device files associated with the FPGA that you are targeting to your Intel® Quartus® Prime installation.

This sample is part of the FPGA code samples. It is categorized as a Tier 2 sample that demonstrates a compiler feature.

```mermaid
flowchart LR
   tier1("Tier 1: Get Started")
   tier2("Tier 2: Explore the Fundamentals")
   tier3("Tier 3: Explore the Advanced Techniques")
   tier4("Tier 4: Explore the Reference Designs")

   tier1 --> tier2 --> tier3 --> tier4

   style tier1 fill:#0071c1,stroke:#0071c1,stroke-width:1px,color:#fff
   style tier2 fill:#f96,stroke:#333,stroke-width:1px,color:#fff
   style tier3 fill:#0071c1,stroke:#0071c1,stroke-width:1px,color:#fff
   style tier4 fill:#0071c1,stroke:#0071c1,stroke-width:1px,color:#fff
```

Find more information about how to navigate this part of the code samples in the [FPGA top-level README.md](/DirectProgramming/C++SYCL_FPGA/README.md).
You can also find more information about [troubleshooting build errors](/DirectProgramming/C++SYCL_FPGA/README.md#troubleshooting), [using Visual Studio Code with the code samples](/DirectProgramming/C++SYCL_FPGA/README.md#use-visual-studio-code-vs-code-optional), [links to selected documentation](/DirectProgramming/C++SYCL_FPGA/README.md#documentation), and more.

## Key Implementation Details

The sample illustrates the important concepts.

- Constructing an `ac_fixed` from a `float` or `double` value is much more area intensive than constructing one from another `ac_fixed`.
- The `ac_fixed` math library provides a set of functions for various math operations.
- The functions can be used to trade off accuracy of results for reduced resource usage on the FPGA.
- When using these functions, be mindful of the widths of the input and return types and follow the parameterization laid out in the header file for optimal results.

### Simple Code Example

An `ac_fixed` number can be defined as follows:

```cpp
ac_fixed<W, I, S> a;
```

Here `W` specifies the width in bits and `S` is a bool indicating if the number is signed. Signed numbers use the most significant bit (MSB) of the `W` bits to store the sign bit. The second parameter `I` is an integer that specifies the location of the fixed point relative to the MSB. Here are some examples of the range and quantum of `ac_fixed` numbers.

```cpp
ac_fixed<4,4,true> x; // xxxx. : range = [-8, 7], quantum = 1
ac_fixed<4,0,true> x; // .xxxx : range = [-8/16, 7/16], quantum = 1/16
ac_fixed<4,0,false> x; // .xxxx : range = [0, 15/16], quantum = 1/16
ac_fixed<4,7,false> x; // xxxx000. : range = [0, 120], quantum = 8
ac_fixed<4,-3,true> x; // (.xxxx)2^(-3) : range=[-1/16, 7/128], quantum = 1/128
```

When creating an `ac_fixed` value, the created value may be less precise than the source value or the conversion may trigger an overflow. `ac_fixed` provides enums to select quantization and overflow modes, which determine how the source value will be converted.

- The default quantization mode is `AC_TRN`, which drops bits to the right of LSB when quantization occurs.
- The default overflow mode is `AC_WRAP`, which drops bits to the left of the MSB when overflow occurs.

> **Note**: To select specific quantization and overflow modes, refer to section *2.1 Quantization and Overflow* in [Algorithmic C (AC) Datatypes, Software Version v3.7. June 2016](https://cdrdv2.intel.com/v1/dl/getContent/728986) for more details.

To use an `ac_fixed` type in your code, include the following header:

```cpp
#include <sycl/ext/intel/ac_types/ac_fixed.hpp>
```

> **Important**: You must pass the  `-qactypes` option to the `icpx` command on Linux or the `/Qactypes` option to the `icx-cl` command on Windows when building your SYCL program in order to include `ac_types` header files on the include path and link against the AC type libraries. In this tutorial, the options are passed through the `src/CMakeLists.txt` file.

### Recommended Method for Constructing `ac_fixed` Numbers

The compiler uses significant FPGA resources to convert floating point values to `ac_fixed` values. The kernel `ConstructFromFloat` in the function `TestConstructFromFloat` constructs an `ac_fixed` object from an accessor to a native `float` type.

In contrast, the kernel `ConstructFromACFixed` in the function `TestConstructFromACFixed` constructs an `ac_fixed` object from an accessor to another `ac_fixed` object. This consumes far less area than the previous kernel. See the section on *Examining the Reports* below to understand where to look for this difference within the optimization reports.

### Using the `ac_fixed` Math Functions

To use the `ac_fixed` math functions in your code, include the following header:

```cpp
#include <sycl/ext/intel/ac_types/ac_fixed_math.hpp>
```

The functions `TestCalculateWithFloat` and `TestCalculateWithACFixed` in this tutorial design contain the kernels `CalculateWithFloat` and `CalculateWithACFixed` respectively. Both calculate the simple expression for some input `x`.

```cpp
square_root ( sine(x) * sine(x) + cosine(x) * cosine(x) )
```

The kernel `CalculateWithFloat` uses floating point values and the standard math library while `CalculateWithACFixed` uses `ac_fixed` values and the `ac_fixed` math library.

In the kernel `CalculateWithACFixed`, the `sin_fixed` and `cos_fixed` functions require the integer part's bit width to be 3 and the input value range to be within [-pi, pi], while the input width of the functions can be chosen depending on the accuracy requirement. For this tutorial, `ac_fixed` inputs are instantiated with the following parameters:

```cpp
W = 10, I = 3, S = true
```

In this tutorial, the `ac_fixed` numbers are smaller in size than floating point numbers, which results in a reduction of the FPGA resources at the expense of accuracy. To see the trade offs in accuracy, compare the numeric results of the operations. The area utilization differences will be discussed in the section on *Examining the Reports*.

When you use the `ac_fixed` library, keep the following points in mind:

- **Input Bit Width and Input Value Range Limits**

    The fixed-point math functions have bit width and input value range requirements. All bit width and input value range requirements are documented at the top of the `ac_fixed_math.hpp` file, which locates in `${ONEAPI_ROOT}/compiler/latest/linux/lib/oclfpga/include/sycl/ext/intel/ac_types` on Linux or `%ONEAPI_ROOT%\compiler\latest\windows\lib\oclfpga\include\sycl\ext\intel\ac_types` on Windows.

- **Return Types**

    For fixed-point functions, each function has a default return type. Assigning the result to a non-default return type triggers a type conversion and can cause an increase in logic use or a loss of accuracy in your results.  All return types are documented at the top of the `ac_fixed_math.hpp` file. For example, for `sin_fixed` and `cos_fixed`, the input type is `ac_fixed<W, 3, true>`, and the output type is `ac_fixed<W-1, 2, true>`. You can also use the `auto` type to let the compiler use the default return type.

- **Accuracy**
  
  - **Floating point vs. Fixed point**: The host program for this tutorial shows the accuracy differences between the result provided by floating point math library and the result provided by the `ac_fixed` math library functions, where the `float` version generates a more accurate result than the smaller-sized `ac_fixed` version.

    >**Note**: The program is compiled with `fp-model` set to "precise" so that the accuracies of the floating-point math functions conform to the IEEE standard.


  - **Emulation vs. FPGA Hardware for fixed point math operations**: Due to the differences in the internal math implementations, the results from `ac_fixed` math functions in emulation and FPGA hardware might not always be bit-accurate. This tutorial shows how to build and run the sample for emulation and FPGA hardware so you can observe the difference.

## Build the `AC Fixed` Tutorial

>**Note**: When working with the command-line interface (CLI), you should configure the oneAPI toolkits using environment variables. Set up your CLI environment by sourcing the `setvars` script in the root of your oneAPI installation every time you open a new terminal window. This practice ensures that your compiler, libraries, and tools are ready for development.
>
> Linux*:
> - For system wide installations: `. /opt/intel/oneapi/setvars.sh`
> - For private installations: ` . ~/intel/oneapi/setvars.sh`
> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'`
>
> Windows*:
> - `C:\"Program Files (x86)"\Intel\oneAPI\setvars.bat`
> - Windows PowerShell*, use the following command: `cmd.exe "/K" '"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" && powershell'`
>
> For more information on configuring environment variables, see [Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html) or [Use the setvars Script with Windows*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-windows.html).

### On Linux*

1. Change to the sample directory.
2. Build the program for Intel® Agilex® 7 device family, which is the default.
   ```
   mkdir build
   cd build
   cmake ..
   ```
   > **Note**: You can change the default target by using the command:
   >  ```
   >  cmake .. -DFPGA_DEVICE=<FPGA device family or FPGA part number>
   >  ```
   >
   > Alternatively, you can target an explicit FPGA board variant and BSP by using the following command:
   >  ```
   >  cmake .. -DFPGA_DEVICE=<board-support-package>:<board-variant>
   >  ```
  > **Note**: You can poll your system for available BSPs using the `aoc -list-boards` command. The board list that is printed out will be of the form
  > ```
  > $> aoc -list-boards
  > Board list:
  >   <board-variant>
  >      Board Package: <path/to/board/package>/board-support-package
  >   <board-variant2>
  >      Board Package: <path/to/board/package>/board-support-package
  > ```
   >
   > You will only be able to run an executable on the FPGA if you specified a BSP.

3. Compile the design. (The provided targets match the recommended development flow.)

   1. Compile and run for emulation (fast compile time, targets emulates an FPGA device).
      ```
      make fpga_emu
      ```
   2. Generate the HTML optimization reports. (See [Read the Reports](#read-the-reports) below for information on finding and understanding the reports.)
      ```
      make report
      ```
   3. Compile for simulation (fast compile time, targets simulated FPGA device).
      ```
      make fpga_sim
      ```
   4. Compile and run on FPGA hardware (longer compile time, targets an FPGA device).
      ```
      make fpga
      ```

### On Windows*

1. Change to the sample directory.
2. Build the program for the Intel® Agilex® 7 device family, which is the default.
   ```
   mkdir build
   cd build
   cmake -G "NMake Makefiles" ..
   ```
   > **Note**: You can change the default target by using the command:
   >  ```
   >  cmake -G "NMake Makefiles" .. -DFPGA_DEVICE=<FPGA device family or FPGA part number>
   >  ```
   >
   > Alternatively, you can target an explicit FPGA board variant and BSP by using the following command:
   >  ```
   >  cmake -G "NMake Makefiles" .. -DFPGA_DEVICE=<board-support-package>:<board-variant>
   >  ```
  > **Note**: You can poll your system for available BSPs using the `aoc -list-boards` command. The board list that is printed out will be of the form
  > ```
  > $> aoc -list-boards
  > Board list:
  >   <board-variant>
  >      Board Package: <path/to/board/package>/board-support-package
  >   <board-variant2>
  >      Board Package: <path/to/board/package>/board-support-package
  > ```
   >
   > You will only be able to run an executable on the FPGA if you specified a BSP.

3. Compile the design. (The provided targets match the recommended development flow.)

   1. Compile for emulation (fast compile time, targets emulated FPGA device).
      ```
      nmake fpga_emu
      ```
   2. Generate the optimization report. (See [Read the Reports](#read-the-reports) below for information on finding and understanding the reports.)
      ```
      nmake report
      ```
   3. Compile for simulation (fast compile time, targets simulated FPGA device, reduced problem size).
      ```
      nmake fpga_sim
      ```
   4. Compile for FPGA hardware (longer compile time, targets FPGA device):
      ```
      nmake fpga
      ```
> **Note**: If you encounter any issues with long paths when compiling under Windows*, you may have to create your 'build' directory in a shorter path, for example c:\samples\build.  You can then run cmake from that directory, and provide cmake with the full path to your sample directory, for example:
>
>  ```
  > C:\samples\build> cmake -G "NMake Makefiles" C:\long\path\to\code\sample\CMakeLists.txt
>  ```
### Read the Reports

Locate the pair of `report.html` files in either of the following folders.

- **Report-only compile**:  `ac_fixed.report.prj`
- **FPGA hardware compile**: `ac_fixed.prj`

Scroll down on the Summary page of the report and expand the section titled **Compile Estimated Kernel Resource Utilization Summary**. Notice how the kernel `ConstructFromACFixed` consumes fewer resources than the kernel named `ConstructFromFloat`. Similarly, notice how the kernel named `CalculateWithACFixed` consumes fewer FPGA resources than `CalculateWithFloat`.

## Run the `AC Fixed` Sample

### On Linux
1. Run the sample on the FPGA emulator (the kernel executes on the CPU).
   ```
   ./ac_fixed.fpga_emu
   ```
2. Run the sample of the FPGA simulator device (the kernel executes on the CPU).
   ```
   CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=1 ./ac_fixed.fpga_sim
   ```
3. Run the sample on the FPGA device (only if you ran `cmake` with `-DFPGA_DEVICE=<board-support-package>:<board-variant>`).
   ```
   ./ac_fixed.fpga
   ```

### On Windows
1. Run the sample on the FPGA emulator (the kernel executes on the CPU).
   ```bash
   ac_fixed.fpga_emu.exe
   ```
2. Run the sample of the FPGA simulator device (the kernel executes on the CPU).
   ```
   set CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=1
   ac_fixed.fpga_sim.exe
   set CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=
   ```
> **Note**: Hardware runs are not supported on Windows.

## Example Output

The following is example output for the emulator.

```
1. Testing Constructing ac_fixed from float or ac_fixed:
Constructed from float:         3.6416015625
Constructed from ac_fixed:      3.6416015625

2. Testing calculation with float or ac_fixed math functions:
MAX DIFF (quantum) for ac_fixed<10, 3, true>:   0.0078125
MAX DIFF for float:                             9.53674e-07

Input 0:                        -0.80799192
result(fixed point):            1
difference(fixed point):        0
result(float):                  1
difference(float):              0

Input 1:                        -2.099829
result(fixed point):            0.9921875
difference(fixed point):        0.0078125
result(float):                  0.99999994
difference(float):              5.9604645e-08

Input 2:                        -0.74206626
result(fixed point):            1
difference(fixed point):        0
result(float):                  1
difference(float):              0

Input 3:                        -2.3321707
result(fixed point):            1
difference(fixed point):        0
result(float):                  1
difference(float):              0

Input 4:                        1.1432415
result(fixed point):            0.9921875
difference(fixed point):        0.0078125
result(float):                  0.99999994
difference(float):              5.9604645e-08

PASSED: all kernel results are correct.
```

### Understand the Results

You can obtain a smaller hardware footprint for your kernel by ensuring that the `ac_fixed` numbers are constructed from `float` or `double` numbers outside the kernel. Additionally, by using the `ac_fixed` types and math library functions, you can use an even smaller fixed point format to trade off even more accuracy for a more resource efficient design if your application requirements allow for it.

## License

Code samples are licensed under the MIT license. See [License.txt](/License.txt) for details.

Third-party program Licenses can be found here: [third-party-programs.txt](/third-party-programs.txt).
