<p align="center">
  <a href="https://vectoradmin.com"><img src="https://github.com/Mintplex-Labs/vector-admin/blob/master/images/logo-light.png?raw=true" alt="VectorAdmin logo"></a>
</p>

<p align="center">
    <b>The universal UI and tool suite for managing vector databases at scale.</b><br />
</p>

<p align="center">
 <a href="https://twitter.com/mintplexlabs" target="_blank">
      <img src="https://img.shields.io/twitter/url/https/twitter.com/mintplexlabs.svg?style=social&label=Follow%20%40Mintplex%20Labs" alt="Twitter">
  </a> |
  <a href="https://discord.gg/6UyHPeGZAC" target="_blank">
      <img src="https://img.shields.io/badge/chat-mintplex_labs-blue.svg?style=flat&logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAMAAABEpIrGAAAAIGNIUk0AAHomAACAhAAA+gAAAIDoAAB1MAAA6mAAADqYAAAXcJy6UTwAAAH1UExURQAAAP////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////r6+ubn5+7u7/3+/v39/enq6urq6/v7+97f39rb26eoqT1BQ0pOT4+Rkuzs7cnKykZKS0NHSHl8fdzd3ejo6UxPUUBDRdzc3RwgIh8jJSAkJm5xcvHx8aanqB4iJFBTVezt7V5hYlJVVuLj43p9fiImKCMnKZKUlaaoqSElJ21wcfT09O3u7uvr6zE0Nr6/wCUpK5qcnf7+/nh7fEdKTHx+f0tPUOTl5aipqiouMGtubz5CRDQ4OsTGxufn515hY7a3uH1/gXBydIOFhlVYWvX29qaoqCQoKs7Pz/Pz87/AwUtOUNfY2dHR0mhrbOvr7E5RUy8zNXR2d/f39+Xl5UZJSx0hIzQ3Odra2/z8/GlsbaGjpERHSezs7L/BwScrLTQ4Odna2zM3Obm7u3x/gKSmp9jZ2T1AQu/v71pdXkVISr2+vygsLiInKTg7PaOlpisvMcXGxzk8PldaXPLy8u7u7rm6u7S1tsDBwvj4+MPExbe4ueXm5s/Q0Kyf7ewAAAAodFJOUwAABClsrNjx/QM2l9/7lhmI6jTB/kA1GgKJN+nea6vy/MLZQYeVKK3rVA5tAAAAAWJLR0QB/wIt3gAAAAd0SU1FB+cKBAAmMZBHjXIAAAISSURBVDjLY2CAAkYmZhZWNnYODnY2VhZmJkYGVMDIycXNw6sBBbw8fFycyEoYGfkFBDVQgKAAPyMjQl5IWEQDDYgIC8FUMDKKsmlgAWyiEBWMjGJY5YEqxMAqGMWFNXAAYXGgAkYJSQ2cQFKCkYFRShq3AmkpRgYJbghbU0tbB0Tr6ukbgGhDI10gySfBwCwDUWBsYmpmDqQtLK2sbTQ0bO3sHYA8GWYGWWj4WTs6Obu4ami4OTm7exhqeHp5+4DCVJZBDmqdr7ufn3+ArkZgkJ+fU3CIRmgYWFiOARYGvo5OQUHhEUAFTkF+kVHRsLBgkIeyYmLjwoOc4hMSk5JTnINS06DC8gwcEEZ6RqZGlpOfc3ZObl5+gZ+TR2ERWFyBQQFMF5eklmqUpQb5+ReU61ZUOvkFVVXXQBSAraitq29o1GiKcfLzc29u0mjxBzq0tQ0kww5xZHtHUGeXhkZhdxBYgZ4d0LI6c4gjwd7siQQraOp1AivQ6CuAKZCDBBRQQQNQgUb/BGf3cqCCiZOcnCe3QQIKHNRTpk6bDgpZjRkzg3pBQTBrdtCcuZCgluAD0vPmL1gIdvSixUuWgqNs2YJ+DUhkEYxuggkGmOQUcckrioPTJCOXEnZ5JS5YslbGnuyVERlDDFvGEUPOWvwqaH6RVkHKeuDMK6SKnHlVhTgx8jeTmqy6Eij7K6nLqiGyPwChsa1MUrnq1wAAACV0RVh0ZGF0ZTpjcmVhdGUAMjAyMy0xMC0wNFQwMDozODo0OSswMDowMB9V0a8AAAAldEVYdGRhdGU6bW9kaWZ5ADIwMjMtMTAtMDRUMDA6Mzg6NDkrMDA6MDBuCGkTAAAAKHRFWHRkYXRlOnRpbWVzdGFtcAAyMDIzLTEwLTA0VDAwOjM4OjQ5KzAwOjAwOR1IzAAAAABJRU5ErkJggg==" alt="Discord">
  </a>  |
  <a href="https://github.com/Mintplex-Labs/vector-admin/blob/master/LICENSE" target="_blank">
      <img src="https://img.shields.io/static/v1?label=license&message=MIT&color=white" alt="License">
  </a> |
  <a href="https://docs.vectoradmin.com/" target="_blank">
    Docs
  </a> |
  <a href="https://vectoradmin.com" target="_blank">
    Hosted Instance
  </a>
</p>

**Quick!** Can you tell me _exactly_ what information is embedded in your Pinecone or Chroma vector database? I bet you can't. While those teams focus on building the underlying architecture, we made it easier for you to _manage_ vector data without the headaches and API calls.

We call it **VectorAdmin** and we want to be the best universal GUI for vector database management.

![Managing VectorData](/images/screenshots/org_home.png)
[view more screenshots](/images/screenshots/SCREENSHOTS.md)

### Watch the demo!
[![Watch the video](/images/youtube.png)](https://youtu.be/cW8Eohz6pzs)

### Product Overview
VectorAdmin aims to be a full-stack application that gives you total control over otherwise unwieldy vector data, whether you embed it via an API or with tools like LangChain, which don't show you what you just saved into your database.

VectorAdmin is a fully capable multi-user product that you can run locally via Docker or host remotely, and it can manage multiple vector databases at once.

VectorAdmin is more than a single tool. It is a **suite** of tools that makes interacting with and understanding vectorized text easy, without compromising on the controls you would expect from a traditional database management system.

Some cool features of VectorAdmin:
- Multi-user instance support and oversight.
- Atomically view, update, and delete individual text chunks of embeddings.
- Copy entire documents or even whole namespaces and embeddings without paying to re-embed.
- Upload & embed new documents directly into the vector database.
- Migrate an entire existing vector database to another type or instance. _still in progress_
- Manage multiple vector databases at once.
- Set permissions on data and control who can access it.
- 100% cloud deployment ready.
- Automated regression tests that run as namespaces or collections are updated with new documents to ensure response quality. _still in progress_
- Full API, JavaScript, and Python standalone clients and LangChain integration. _still in progress_
- Extremely efficient cost-saving measures for managing very large documents. You'll never pay to embed a massive document or transcript more than once.

### Technical Overview
This monorepo consists of five main sections:
- `document-processor`: Flask app to digest, parse, and embed documents easily.
- `frontend`: A Vite + React frontend that you can run to easily create and manage all your content.
- `backend`: A Node.js + Express server to handle all the interactions and do all the vectorDB management.
- `workers`: An InngestJS instance to handle long-running processes and background tasks for snappy performance.
- `docker`: Run this entire architecture in a single command as a Docker instance _recommended_.

### Requirements
- `yarn` and `node` on your machine
- `python` 3.9+ for running scripts in `document-processor/`.
- access to an OpenAI API key if planning to update embeddings or upload new documents.
- a [Pinecone.io](https://pinecone.io) free account or a running [ChromaDB](https://trychroma.com) instance.
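
As a quick sanity check before setup, the tool requirements above can be verified from a shell. The sketch below is a hypothetical helper and not part of this repository: the function names `have_cmd`, `python_version_ok`, and `preflight` are made up for illustration, and it assumes Python is installed as `python3`.

```shell
# Hypothetical preflight helpers; these names are not part of VectorAdmin.

# Succeeds if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Succeeds if a version string such as "3.9" or "3.11.4" is at least 3.9.
python_version_ok() {
  local major minor
  major="${1%%.*}"      # text before the first dot
  minor="${1#*.}"       # text after the first dot
  minor="${minor%%.*}"  # drop any patch component
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 9 ]; }
}

# Prints each missing tool and fails if any is absent.
# Assumes Python is exposed as `python3`.
preflight() {
  local cmd missing=0
  for cmd in yarn node python3; do
    if ! have_cmd "$cmd"; then
      echo "missing: $cmd"
      missing=1
    fi
  done
  return "$missing"
}
```

One way to use it is `preflight && python_version_ok "$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"`, which only succeeds when all three tools are found and the interpreter is new enough.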


## How to get started (Docker - simple setup)
[Get up and running in minutes with Docker](./docker/DOCKER.md)


## How to get started (Development environment)
The instructions below will **not** work on Windows.

- `yarn dev:setup` from the project root directory.
- `cd document-processor && python3.9 -m virtualenv v-env && source v-env/bin/activate && pip install -r requirements.txt`

In separate terminal windows, from the project root:
  - `yarn prisma:setup` to create the DB migration and client, then `yarn dev:server`
  - `yarn dev:frontend`
  - `yarn dev:workers`
  - `cd document-processor && flask run --host '0.0.0.0' --port 8888`

On first boot, visiting the homepage automatically redirects you to create your primary admin account, organization, and database connection.

## Contributing
- create issue
- create PR with branch name format of `<issue number>-<short name>`
- yee haw let's merge
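
The branch name format above can be checked mechanically before you open a PR. This is only a sketch: `valid_branch_name` is a hypothetical helper, and treating the short name as lowercase alphanumeric words joined by hyphens is an assumption, not a documented project rule.

```shell
# Hypothetical helper: checks the `<issue number>-<short name>` format.
# Assumption: the short name is lowercase alphanumeric words joined by hyphens.
valid_branch_name() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+-[a-z0-9]+(-[a-z0-9]+)*$'
}
```

For example, `valid_branch_name "$(git rev-parse --abbrev-ref HEAD)" || echo "rename your branch"` checks the branch you currently have checked out.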

## Telemetry
VectorAdmin by Mintplex Labs Inc contains a telemetry feature that collects anonymous usage information.

### Why?
We use this information to help us understand how VectorAdmin is used, to help us prioritize work on new features and bug fixes, and to help us improve VectorAdmin's performance and stability.

### Opting out
Set `DISABLE_TELEMETRY` to "true" in your server or Docker `.env` settings to opt out of telemetry.

```
DISABLE_TELEMETRY="true"
```
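
If you prefer to script the change, something like the following could work. It is a minimal sketch, not a tool shipped with VectorAdmin: `disable_telemetry` is a hypothetical function, and the path to the `.env` file is passed in because its location depends on how you run the server.

```shell
# Hypothetical helper: ensure DISABLE_TELEMETRY="true" is set in an env file.
# The path is an argument because the .env location depends on your setup.
disable_telemetry() {
  local env_file="$1" tmp
  touch "$env_file"
  if grep -q '^DISABLE_TELEMETRY=' "$env_file"; then
    # Replace the existing entry via a temp file (avoids non-portable `sed -i`).
    tmp="$(mktemp)"
    sed 's/^DISABLE_TELEMETRY=.*/DISABLE_TELEMETRY="true"/' "$env_file" > "$tmp"
    cat "$tmp" > "$env_file"
    rm -f "$tmp"
  else
    # Make sure the file ends with a newline before appending.
    if [ -s "$env_file" ] && [ -n "$(tail -c 1 "$env_file")" ]; then
      echo >> "$env_file"
    fi
    echo 'DISABLE_TELEMETRY="true"' >> "$env_file"
  fi
}
```

Calling `disable_telemetry path/to/.env` more than once is safe: an existing entry is rewritten rather than duplicated, and other variables in the file are left alone.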

### What do you explicitly track?
We will only track usage details that help us make product and roadmap decisions, specifically:
- Server is started or booted up.
- Version of your installation.
- Type of job when executed.

You can verify these claims by finding every place where `Telemetry.sendTelemetry` is called. Additionally, when telemetry is enabled, these events are written to the output log, so you can see the specific data that was sent. No IP or other identifying information is collected. The telemetry provider is [PostHog](https://posthog.com/), an open-source telemetry collection service.
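
One way to locate those call sites is a recursive search from the project root. The helper below is a hypothetical sketch: it assumes the relevant code lives in `.js` files, that installed dependencies sit in `node_modules` directories, and that your `grep` supports `--include` and `--exclude-dir` (GNU and BSD grep do).

```shell
# Hypothetical helper: list every call site of Telemetry.sendTelemetry.
# Assumptions: the server code is JavaScript and dependencies live in node_modules.
find_telemetry_calls() {
  # -F matches the name literally, so the dot is not treated as a wildcard.
  grep -rnF --include='*.js' --exclude-dir=node_modules \
    'Telemetry.sendTelemetry' "${1:-.}"
}
```

Running `find_telemetry_calls` with no argument searches the current directory and prints each match as `path:line:content`.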
