# Backward compatibility in Quickwit.

If you are reading this, chances are you want to make a change to one of the resources
in Quickwit's meta/config:

User edited:
- QuickwitConfig

User edited and stored in metastore:
- IndexConfig
- SourceConfig

Stored in metastore only:
- IndexMetadata
- FileBackedIndex
- SplitMetadata

Quickwit currently manages backward compatibility for all of these resources except QuickwitConfig.
This document describes how to handle a change, how to test it,
and how to spot potential regressions.

## How do I update `{IndexMetadata, SplitMetadata, FileBackedIndex, SourceConfig, IndexConfig}`?

There are two types of upgrades.

### Naturally backward compatible change

Serde offers some attributes that let us change our model in a backward compatible way.
For instance, it is possible to add a new field to a struct and annotate
it with `#[serde(default)]` in order to handle older serialized versions of the
struct.

If you want to avoid generating any diff in the non-regression JSON files,
you can also use `#[serde(skip_serializing_if)]`, although by default
it is recommended not to use it.

It is also possible to rename a field in a backward compatible manner
by using the `#[serde(alias)]` attribute.


For this type of change it is not required to update the serialization version.

We have a mechanism (see the backward compatibility test project below) to spot
regressions.

When introducing such a change:
- modify your model with the help of the attributes above.
- modify the example for the model by editing its `TestableForRegression` trait implementation.
- commit the 2 files that were updated by build.rs
- eyeball the diff on the `.expected.json` that failed, and send it with your PR.

### Change requiring a new version

For heavier changes requiring a new version, you will have to increment the configuration
version. Please make sure that all of these resources share the same version number.

- update the resource struct you want to change.
- create a new item in the `VersionedXXXX` enum. It is usually located in a `serialize.rs` file.
- `Serialize` is not needed for the previous serialized version. We just need `Deserialize`. We can remove `Serialize` from the derive statement, and mark it as `skip_serializing` as follows.

e.g.
```
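#[derive(Serialize, Deserialize)]
#[serde(tag = "version")]
pub(crate) enum VersionedXXXX {
    // Previous format: only deserialized, never written.
    #[serde(rename = "0")]
    V0(#[serde(skip_serializing)] XXXXV0),
    // Most recent format: read and written.
    #[serde(rename = "1")]
    V1(XXXXV1),
}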
```
- Complete the conversion from `VersionedXXXX` to `XXXX`.
- make sure the conversion `From<XXXX> for VersionedXXXX` creates the new item.
- run unit tests
- commit the 2 files that were autogenerated by build.rs
- eyeball and commit the autoupdated `expected.json` files
- Possibly update the generation of the default XXXX instance used for regression. It is in the function `sample_for_regression_tests`.
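
The two conversions in the steps above could look roughly like the following sketch. The serde attributes are left out to keep it dependency-free, and `Xxxx`, `XxxxV0`, and their fields are hypothetical stand-ins for a real resource.

```rust
// Hypothetical legacy layout: only ever read, never written.
#[derive(Debug, Clone, PartialEq)]
pub struct XxxxV0 {
    pub id: String,
}

// Hypothetical current layout, with a field that did not exist in V0.
#[derive(Debug, Clone, PartialEq)]
pub struct Xxxx {
    pub id: String,
    pub tags: Vec<String>,
}

pub enum VersionedXxxx {
    V0(XxxxV0),
    V1(Xxxx),
}

// Reading: every past version is upgraded to the current struct.
impl From<VersionedXxxx> for Xxxx {
    fn from(versioned: VersionedXxxx) -> Self {
        match versioned {
            // Fields missing from the old format get an explicit default.
            VersionedXxxx::V0(v0) => Xxxx { id: v0.id, tags: Vec::new() },
            VersionedXxxx::V1(v1) => v1,
        }
    }
}

// Writing: always produce the most recent variant.
impl From<Xxxx> for VersionedXxxx {
    fn from(current: Xxxx) -> Self {
        VersionedXxxx::V1(current)
    }
}
```

Keeping the upgrade logic in a single `match` means that adding a new version is a compile error until every older variant has an explicit conversion.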


## Backward compatibility test project.

This is just a project used to test backward compatibility of Quickwit.
Right now, `SplitMetadata`, `IndexMetadata`, and `FileBackedIndex` are tested.

We want to be able to read all past versions of these files, but only write the most recent format.
The tests consist of pairs of JSON files,
`XXXX.json` and `XXXX.expected.json`.

`XXXX.json` is a JSON file in an old format.
`XXXX.expected.json` contains the expected result of
`serialize_new_version(deserialize(XXXX.json))`.

## Change of format autodetection

Two things need to happen upon a change of format.

### Updating expected.json

Of course to make all of this work, we need to keep `*.expected.json` files up-to-date
with the changes of format.

This is done in a semi-automatic fashion.

On a first pass, the tests check that `deserialize(original_json) == deserialize(expectation_json)`.
On a second pass, the tests check that `expectation_json == serialize(deserialize(expectation_json))`.
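
The two passes can be sketched as a small generic helper. This is an illustration only: `two_pass_check`, `PassFailure`, and the closure parameters are hypothetical, and the actual test code may be organized differently.

```rust
/// Which of the two passes failed, if any.
#[derive(Debug, PartialEq)]
pub enum PassFailure {
    /// The old file and the expectation no longer decode to the same value.
    Deserialization,
    /// Re-serializing the expectation no longer reproduces it.
    Serialization,
}

pub fn two_pass_check<T, D, S>(
    original_json: &str,
    expectation_json: &str,
    deserialize: D,
    serialize: S,
) -> Result<(), PassFailure>
where
    T: PartialEq,
    D: Fn(&str) -> T,
    S: Fn(&T) -> String,
{
    let expected_value = deserialize(expectation_json);
    // First pass: both files must decode to the same in-memory value.
    if deserialize(original_json) != expected_value {
        return Err(PassFailure::Deserialization);
    }
    // Second pass: the expectation must be a fixed point of the round trip.
    if serialize(&expected_value) != expectation_json {
        return Err(PassFailure::Serialization);
    }
    Ok(())
}
```

A failure of the second pass is the signal that the serialization format changed and that the expectation file is stale.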

When changing the JSON format, this test is expected to fail.
The unit test then automatically updates the `expected.json` files. The developer just has to
check the diff of the result (in particular no information should be lost) and commit the updated `expected.json` files.

Adding this update operation within the unit test is a tad unexpected, but it has the merit of
integrating well with CI. If a developer forgets to update the expected.json file,
the CI will catch it.
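
A minimal sketch of this check-then-update behavior, using only the standard library. The helper name `check_or_update_expected` is an assumption, not the name used in the Quickwit test code.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Compares the freshly serialized output with the stored expectation.
/// Returns `Ok(true)` if they match. Otherwise the file is overwritten with
/// the new output and `Ok(false)` is returned so the caller can fail the test.
pub fn check_or_update_expected(expected_path: &Path, actual: &str) -> io::Result<bool> {
    let expected = fs::read_to_string(expected_path)?;
    if expected == actual {
        return Ok(true);
    }
    // Stale expectation: rewrite it so the developer only has to review the diff.
    fs::write(expected_path, actual)?;
    Ok(false)
}
```

Because the first run after a format change both rewrites the file and fails, a forgotten update shows up in CI as a failing test rather than as a silent change.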

### Adding a new test case.

If the serialization format changes, the unit test will also automatically add
a new test case generated from a sample object.
Concretely, it will just write two files, `XXXX.json` and `XXXX.expected.json`.

The two files will be identical. This is expected as this is a unit test for the
most recent version.

The unit test will start making sense in future updates thanks to the update phase
described in the previous section.
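
Writing the new pair could be as simple as the sketch below. `write_new_test_case` and its parameters are hypothetical, and how the real test picks the file name is not covered here.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writes `<name>.json` and `<name>.expected.json` with identical content,
/// both produced by serializing a sample object in the most recent format.
pub fn write_new_test_case(dir: &Path, name: &str, sample_json: &str) -> io::Result<()> {
    fs::write(dir.join(format!("{}.json", name)), sample_json)?;
    fs::write(dir.join(format!("{}.expected.json", name)), sample_json)?;
    Ok(())
}
```

Only the `.expected.json` file is meant to change afterwards: the `.json` file stays frozen as a record of the format at the time it was written.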
