# Design

There isn't much design to speak of. VQLite is simple enough to understand at a glance: everything is organized
around the directory layout.

## Document

The smallest unit of operation in VQLite is a `Document`, so let's first look at what a `Document` is.

A Document consists of the following four parts:

|    Field    |       Description       |
|:-----------:|:-----------------------:|
|    vqid     | document's id, not null |
|  metadata   |        metadata         |
|   vectors   |    vectors, not null    |
| vectors_tag |   the tag for vectors   |

- `vqid` is the document's ID. The user must define it in advance, and it cannot be empty.
- `metadata` can contain any type of data and may be empty. Keep it to the minimum that is actually useful, because
  all of it is stored in memory.
- `vectors` is an array of vectors and cannot be empty. Even a single vector must be written as an array, for
  example `[[1,2,3]]`.
- `vectors_tag` tags the vectors above, one tag per vector. The value is an array such as `[1,2,3]`, and each value
  must not exceed the maximum value of uint32 (4294967295). It may be empty, in which case VQLite generates tags
  according to the order of the vectors.
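The field rules above can be sketched as a small check. This is an illustrative Python sketch, not VQLite's own code: `normalize_document` is a hypothetical helper, and the assumption that generated tags are 0-based positions is ours, since the text only says they follow the order of the vectors.

```python
UINT32_MAX = 4294967295  # maximum tag value stated above


def normalize_document(doc: dict) -> dict:
    """Check a document against the field rules and fill in default tags.

    Hypothetical helper for illustration; not part of VQLite.
    """
    vqid = doc.get("vqid")
    if not vqid:
        raise ValueError("vqid must not be empty")

    vectors = doc.get("vectors")
    if not isinstance(vectors, list) or not vectors:
        raise ValueError("vectors must be a non-empty array")
    if not all(isinstance(v, list) for v in vectors):
        raise ValueError("each vector must itself be an array")

    tags = doc.get("vectors_tag")
    if not tags:
        # Assumption: generated tags are the 0-based positions of the vectors.
        tags = list(range(len(vectors)))
    if len(tags) != len(vectors):
        raise ValueError("vectors_tag needs exactly one tag per vector")
    if any(not isinstance(t, int) or not 0 <= t <= UINT32_MAX for t in tags):
        raise ValueError("every tag must fit in uint32")

    return {
        "vqid": vqid,
        "metadata": doc.get("metadata"),  # may be empty
        "vectors": vectors,
        "vectors_tag": tags,
    }


doc = normalize_document({"vqid": "song-1", "vectors": [[1, 2, 3], [4, 5, 6]]})
print(doc["vectors_tag"])  # [0, 1]
```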

Here is a concrete example of a document.

Suppose we have a song that we want to add to VQLite:

```
{
    "vqid": "f8f78d25-23b0-4941-a7f4-7be66e6d8eea",
    "metadata": {"title":"Yellow", "artist":"coldplay"},
    "vectors": [[1,2,3], [4,5,6], [7,8,9]],
    "vectors_tag": [0, 60, 120]
}
```

- `vqid` is the unique ID for this song. Users generate it themselves.
- `metadata` stores the song's title and artist.
- `vectors` stores the audio embeddings.
- `vectors_tag` stores the tag for each vector. Here, the first vector corresponds to the content at 0 seconds, the
  second to the content at 60 seconds, and so on. If you don't need tags, you can omit the field, and sequence
  numbers will be generated automatically.
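To make the pairing between tags and vectors concrete, here is a short Python sketch that parses the song document and looks up the vector for a given offset. The variable names (`raw`, `song`, `vector_at`) are illustrative only, and the lookup happens on the client side; it says nothing about how VQLite stores the tags internally.

```python
import json

# The song document from the example, as a JSON string.
raw = """
{
    "vqid": "f8f78d25-23b0-4941-a7f4-7be66e6d8eea",
    "metadata": {"title": "Yellow", "artist": "coldplay"},
    "vectors": [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    "vectors_tag": [0, 60, 120]
}
"""

song = json.loads(raw)

# Tags and vectors line up by position, so zip pairs each tag
# (here: an offset in seconds) with its vector.
vector_at = dict(zip(song["vectors_tag"], song["vectors"]))

print(vector_at[60])  # [4, 5, 6]
```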

# Structure

Every operation is passed layer by layer from the outside in, in the order
`API` -> `CollectionList` -> `Collection` -> `Segment`.

## File structure

Once you understand how the data is laid out on disk, you understand most of VQLite's design.

```
vqlite_data/
`-- collection
    `-- segment_0
        |-- datasets.vql
        |-- index/
        |-- metadata.gob
        `-- vids.vql
```

Here we have borrowed ~~copied~~ the approach of Milvus and introduced the concept of a collection, which is
comparable to a table in a database.

- `vqlite_data` is the folder that stores all collections.
- `collection` is a folder that stores all segments of a collection.
- `segment_{number}` is a folder that stores all data of one segment.
- `metadata.gob` is a file containing a serialized Go object. It holds basic information about the current segment,
  including the metadata corresponding to the vectors.
- `datasets.vql` is a file that stores all vectors.
- `vids.vql` is a file that stores all vqids corresponding to the vectors.
- `index/` is a folder containing the ScaNN index files.
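As a rough illustration of the layout, the following Python sketch builds the paths of one segment's files. `segment_paths` is a hypothetical helper, and treating the collection folder name as a parameter is an assumption; only the file and folder names come from the tree above.

```python
from pathlib import Path


def segment_paths(data_dir: str, collection: str, segment_no: int) -> dict:
    """Return where one segment's files live (hypothetical helper, not VQLite code)."""
    seg = Path(data_dir) / collection / f"segment_{segment_no}"
    return {
        "metadata": seg / "metadata.gob",  # segment info and document metadata
        "vectors": seg / "datasets.vql",   # all vectors
        "vqids": seg / "vids.vql",         # vqids corresponding to the vectors
        "index": seg / "index",            # ScaNN index files
    }


paths = segment_paths("vqlite_data", "collection", 0)
print(paths["vectors"].as_posix())  # vqlite_data/collection/segment_0/datasets.vql
```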

## Segment

A Segment is the smallest storage unit in VQLite. It stores a configurable number of documents.
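A minimal Python sketch of a segment with a configurable document limit might look like this. The class, the `max_docs` parameter, and the behavior when the segment is full are all assumptions made for illustration; the real Segment is implemented differently and also owns the files described above.

```python
class Segment:
    """In-memory stand-in for a segment (illustrative, not VQLite's implementation)."""

    def __init__(self, max_docs: int):
        self.max_docs = max_docs  # the configurable document limit
        self.docs = {}            # vqid -> document

    def is_full(self) -> bool:
        return len(self.docs) >= self.max_docs

    def add(self, doc: dict) -> bool:
        """Store the document; return False if the segment has no room left."""
        if self.is_full():
            return False
        self.docs[doc["vqid"]] = doc
        return True


seg = Segment(max_docs=2)
print(seg.add({"vqid": "a", "vectors": [[1, 2, 3]]}))  # True
print(seg.add({"vqid": "b", "vectors": [[4, 5, 6]]}))  # True
print(seg.add({"vqid": "c", "vectors": [[7, 8, 9]]}))  # False
```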

## Collection

Collection manages Segments. A Collection can have multiple Segments. All operations received by a Collection are
forwarded to all the Segments it manages.
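The forwarding can be sketched in Python as a simple fan-out. The class and method names here are hypothetical, and how the per-segment results are combined (`any` for a delete, `sum` for a count) is our assumption rather than something the design prescribes.

```python
class Segment:
    """Minimal stand-in that holds documents by vqid (illustrative only)."""

    def __init__(self, docs: dict):
        self.docs = dict(docs)

    def delete(self, vqid: str) -> bool:
        return self.docs.pop(vqid, None) is not None

    def count(self) -> int:
        return len(self.docs)


class Collection:
    def __init__(self, segments):
        self.segments = list(segments)

    def delete(self, vqid: str) -> bool:
        # The operation is forwarded to every segment; none is skipped.
        results = [seg.delete(vqid) for seg in self.segments]
        return any(results)

    def count(self) -> int:
        return sum(seg.count() for seg in self.segments)


col = Collection([Segment({"a": {}, "b": {}}), Segment({"c": {}})])
print(col.delete("c"))  # True
print(col.count())      # 2
```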

## CollectionList

CollectionList manages Collections. VQLite has only one CollectionList that manages all Collections.

Every time an operation is received, CollectionList first looks up the corresponding Collection and then performs
the operation on it.
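A minimal Python sketch of this lookup-then-dispatch step follows. All names are hypothetical, the Collection is reduced to a plain list of documents, and raising `KeyError` for an unknown collection is an assumption about error handling, not documented behavior.

```python
class Collection:
    """Stand-in collection that just keeps documents in a list (illustrative only)."""

    def __init__(self, name: str):
        self.name = name
        self.docs = []

    def add(self, doc: dict) -> int:
        self.docs.append(doc)
        return len(self.docs)


class CollectionList:
    """The single registry through which every operation passes."""

    def __init__(self):
        self._collections = {}  # collection name -> Collection

    def create(self, name: str) -> Collection:
        return self._collections.setdefault(name, Collection(name))

    def get(self, name: str) -> Collection:
        # Step 1: find the collection the operation refers to.
        if name not in self._collections:
            raise KeyError(f"collection {name!r} does not exist")
        return self._collections[name]

    def add_document(self, name: str, doc: dict) -> int:
        # Step 2: perform the operation on that collection.
        return self.get(name).add(doc)


collections = CollectionList()
collections.create("songs")
print(collections.add_document("songs", {"vqid": "a", "vectors": [[1, 2, 3]]}))  # 1
```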