# ARM Cortex-M Fault Reporters

This module manages data storage for core dumps provided by the `modm:crashcatcher`
module to investigate HardFault events via offline post-mortem debugging.
The data is stored in the volatile memory designated for the heap.

This works as follows:

1. A HardFault occurs and is intercepted by CrashCatcher.
2. CrashCatcher calls into this module to store the core dump in the heap as
   defined by the linkerscript's `.table.heap` section, thus effectively
   overwriting the heap, then reboots the device.
3. On reboot, only the remaining heap memory is initialized, leaving the core
   dump data intact.
4. The application has no limitations other than a reduced total heap size!
   It may access the report data at any time and use all hardware to send out
   this report.
5. After the application clears the report and reboots, the heap will once
   again be fully available.
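
The steps above can be modeled on a host machine with a short sketch. It is an illustration only and not modm's actual implementation: the marker value `ReportMagic`, the `ReportHeader` layout and the functions `storeReport()`, `usableHeap()` and `clearReport()` are all hypothetical names, and alignment of the remaining heap is ignored for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of the scheme, NOT modm's actual implementation.
// A marker word at the start of the heap region flags a stored report.
constexpr std::uint32_t ReportMagic = 0xC0DEDEADu; // assumed value, illustration only

struct HeapRegion
{
    std::uint8_t* start; // first byte of the memory designated for the heap
    std::size_t size;    // total size of that memory in bytes
};

// Header placed in front of the core dump data.
struct ReportHeader
{
    std::uint32_t magic;  // equals ReportMagic while a report is stored
    std::uint32_t length; // number of core dump bytes following the header
};

// Step 2: store the core dump at the start of the heap region.
inline bool storeReport(HeapRegion heap, const std::uint8_t* dump, std::uint32_t length)
{
    assert(heap.start != nullptr and dump != nullptr);
    if (heap.size < sizeof(ReportHeader)) return false;
    if (length > heap.size - sizeof(ReportHeader)) return false;
    const ReportHeader header{ReportMagic, length};
    std::memcpy(heap.start, &header, sizeof(header));
    std::memcpy(heap.start + sizeof(header), dump, length);
    return true;
}

// Step 3: on boot, return only the part of the heap that is free to use.
inline HeapRegion usableHeap(HeapRegion heap)
{
    ReportHeader header{};
    if (heap.size < sizeof(header)) return heap;
    std::memcpy(&header, heap.start, sizeof(header));
    if (header.magic != ReportMagic) return heap;                // no report: full heap
    if (header.length > heap.size - sizeof(header)) return heap; // implausible length
    const std::size_t reserved = sizeof(header) + header.length;
    return {heap.start + reserved, heap.size - reserved};
}

// Step 5: invalidate the marker so that the next boot reclaims the full heap.
inline void clearReport(HeapRegion heap)
{
    const std::uint32_t zero = 0;
    std::memcpy(heap.start, &zero, sizeof(zero));
}
```

In this model the report data itself is not erased by `clearReport()`, only the marker is, which mirrors the idea that the heap becomes fully available again only after the next reboot.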


## Restrictions on HardFault Entry

A HardFault indicates a serious bug: if one occurs, your application is most
likely compromised in some way. Here are some important points to take note of.

1. The HardFault has a hardcoded priority of -1 and only the NMI and the Reset
   exceptions have a higher priority (-2 and -3). This means ALL device interrupts
   have a LOWER priority!
2. The HardFault is a synchronous exception: it will NOT wait for anything to
   complete, especially not the currently executing interrupt (if any).
3. There are many reasons for the HardFault exception to be raised (e.g. accessing
   invalid memory, executing undefined instructions, dividing by zero) making
   it very difficult to recover in a generic way. It is therefore reasonable
   to abandon execution (=> reboot) rather than resuming execution in an
   increasingly unstable application.

On HardFault entry, this module calls the function `modm_hardfault_entry()`, which
the application can override to put the device's hardware into a safe mode.
This can be as simple as disabling power to external components. However, its
execution should be strictly time-bound and must NOT depend on other interrupts
completing: they won't, so waiting for them causes a deadlock.

```cpp
void modm_hardfault_entry()
{
    Board::MotorDrivers::disable();
    // return from this function as fast as possible
}
```

After this function returns, this module will write the core dump into the
heap and reboot the device.


## Reporting the Fault

To recover from the HardFault, the device is rebooted with a smaller heap.
Once the `main()` function is reached, the application code should check
`FaultReporter::hasReport()` and, if a report exists, initialize only the bare
minimum of hardware needed to send this report to the developer.

To access the report, use the `FaultReporter::begin()` and `FaultReporter::end()`
functions, which return a `const_iterator` over the actual core dump data. This
also allows a `FaultReporter` instance to be used in a range-based for loop.

Remember to call `FaultReporter::clearAndReboot()` to clear the report, reboot
the device and reclaim the full heap.

```cpp
int main()
{
    if (FaultReporter::hasReport()) // Check first after boot
    {
        Application::partialInitialize(); // Initialize only the necessary
        reportBegin();
        for (const uint8_t data : FaultReporter::buildId())
            reportBuildId(data); // send each byte of Build ID
        for (const uint8_t data : FaultReporter())
            reportData(data); // send each byte of data
        reportEnd(); // end the report
        FaultReporter::clearAndReboot(); // clear the report and reboot
        // never reached
    }
    // Normal initialization
    Application::initialize();
}
```

The application is able to use the heap. However, depending on the report size
(controllable via the `report_level` option), the heap may be much smaller than
normal. Make sure your application can deal with that.

For complex applications that communicate asynchronously (CAN, Ethernet,
Wireless), it may not be possible to send the report in one piece or
immediately after boot. The report data remains available until you reboot,
even after you've cleared the report.

```cpp
int main()
{
    const bool faultReport{FaultReporter::hasReport()};
    FaultReporter::clear(); // only clear report but do not reboot
    Application::initialize();

    while (true)
    {
        doOtherStuff();
        if (faultReport and applicationReady)
        {
            // Still valid AFTER clear, but BEFORE reboot
            const auto id = FaultReporter::buildId();
            auto begin = FaultReporter::begin();
            auto end = FaultReporter::end();
            //
            Application::sendReport(id, begin, end);
            // reboot when report has been fully sent
        }
    }
}
```
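
When the report has to go out in several pieces, a small stateful helper can track how far the iterator range has been sent. The following is a minimal sketch, assuming the transport accepts one byte at a time through a callback; `ChunkedReportSender` and `sendChunk()` are hypothetical names and not part of modm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of modm: sends a byte range in bounded chunks.
template<typename Iterator>
class ChunkedReportSender
{
public:
    ChunkedReportSender(Iterator first, Iterator last) : current(first), last(last) {}

    // Passes at most maxBytes bytes to sink(uint8_t).
    // Returns true once the whole range has been sent.
    template<typename Sink>
    bool sendChunk(std::size_t maxBytes, Sink&& sink)
    {
        assert(maxBytes > 0);
        for (std::size_t sent = 0; sent < maxBytes and current != last; ++sent, ++current)
            sink(static_cast<std::uint8_t>(*current));
        return current == last;
    }

private:
    Iterator current;    // next byte to send
    const Iterator last; // one past the final byte
};
```

Constructed from `FaultReporter::begin()` and `FaultReporter::end()`, such a helper could be called once per main loop iteration with a chunk size that fits the transport, and the application would reboot once `sendChunk()` returns `true`.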


## Coredump via GDB

If you encounter a HardFault while debugging and did not include this module,
or if you want to store the current system state for later analysis or to share
it with other developers, you can call the `modm_coredump` function inside GDB
and it will generate a `coredump.txt` file.
Note that this coredump file contains all volatile memories including the heap,
so this method is strongly recommended if you can attach a debugger.

Consult your chosen build system module for additional integrations.


## Using the Fault Report

The fault report contains a core dump generated by CrashCatcher and is meant
to be used with CrashDebug to present the memory view to the GDB debugger.
For this, you must use the ELF file that corresponds to the device's firmware
and copy the coredump data formatted as *hexadecimal* values into a text
file, then call the debugger like this:

```
arm-none-eabi-gdb -tui executable.elf -ex "set target-charset ASCII" \
    -ex "target remote | CrashDebug --elf executable.elf --dump coredump.txt"
```

Note that the `FaultReporter::buildId()` contains the GNU Build ID, which can
help you find the right ELF file:

```
arm-none-eabi-readelf -n executable.elf

Displaying notes found in: .build_id
  Owner                 Data size Description
  GNU                  0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 59f08f7a37a7340799d9dba6b0c092bc3c9515c5
```
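
To compare a reported Build ID with the `readelf` output, the raw bytes need to be turned into the same hexadecimal string. The helper below is a minimal sketch, assuming the bytes arrive in the order in which `readelf` prints them; `toHex()` is a hypothetical name and not part of modm. The same formatting can be applied to the core dump bytes before they are copied into the text file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, not part of modm:
// formats each byte as two lowercase hexadecimal digits without separators.
inline std::string toHex(const std::uint8_t* data, std::size_t length)
{
    assert(data != nullptr or length == 0);
    static constexpr char digits[] = "0123456789abcdef";
    std::string text;
    text.reserve(length * 2);
    for (std::size_t i = 0; i < length; ++i)
    {
        text += digits[data[i] >> 4];   // high nibble first
        text += digits[data[i] & 0x0F]; // then low nibble
    }
    return text;
}
```

Since `std::string` allocates from the heap, this variant is better suited to the receiving side of the report than to a device running with a reduced heap.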


### Post-Mortem Debugging with SCons

The `modm:build:scons` module provides a few helper methods for working with fault
reports. You still need to copy the coredump data manually; however, the firmware
selection is automated.

The SCons build system will automatically cache the ELF file under its Build ID
for every firmware upload (using `scons artifact`).
When a fault is reported, you can tell SCons the firmware Build ID and it will use
the corresponding ELF file automatically.

```sh
# Copy data into coredump.txt
touch coredump.txt
# Start postmortem debugging of executable with this build id
scons debug-coredump firmware=59f08f7a37a7340799d9dba6b0c092bc3c9515c5
```
