mirror of
https://github.com/onyx-dot-app/onyx.git
synced 2026-02-16 23:35:46 +00:00
Contribution Guidelines (#7468)
259
CONTRIBUTING.md
@@ -1,262 +1,31 @@
|
|||||||
<!-- ONYX_METADATA={"link": "https://github.com/onyx-dot-app/onyx/blob/main/CONTRIBUTING.md"} -->
|
|
||||||
|
|
||||||
# Contributing to Onyx
|
# Contributing to Onyx
|
||||||
|
|
||||||
Hey there! We are so excited that you're interested in Onyx.
|
Hey there! We are so excited that you're interested in Onyx.
|
||||||
|
|
||||||
As an open source project in a rapidly changing space, we welcome all contributions.
|
|
||||||
|
|
||||||
## 💃 Guidelines
|
## Contribution Opportunities
|
||||||
|
The [GitHub Issues](https://github.com/onyx-dot-app/onyx/issues) page is a great place to look for and share contribution ideas.
|
||||||
|
|
||||||
### Contribution Opportunities
|
If you have a feature of your own that you would like to build, please create an issue so that community members can provide feedback and
thumb it up if they share the need.
|
||||||
|
|
||||||
The [GitHub Issues](https://github.com/onyx-dot-app/onyx/issues) page is a great place to start for contribution ideas.
|
|
||||||
|
|
||||||
To ensure that your contribution is aligned with the project's direction, please reach out to any maintainer on the Onyx team
|
## Contributing Code
|
||||||
via [Discord](https://discord.gg/4NA5SbzrWb) or [email](mailto:hello@onyx.app).
|
Please refer to the documents in the `contributing_guides` folder to help keep the code base at a high standard.
|
||||||
|
1. dev_setup.md (start here): gives you a guide to setting up a local development environment.
|
||||||
|
2. contribution_process.md: how to ensure you are building valuable features that will get reviewed and merged.
|
||||||
|
3. best_practices.md: before asking for reviews, ensure your changes meet the repo code quality standards.
|
||||||
|
|
||||||
Issues that have been explicitly approved by the maintainers (aligned with the direction of the project)
|
To contribute, please follow the
|
||||||
will be marked with the `approved by maintainers` label.
|
|
||||||
Issues marked `good first issue` are an especially great place to start.
|
|
||||||
|
|
||||||
**Connectors** to other tools are another great place to contribute. For details on how, refer to this
|
|
||||||
[README.md](https://github.com/onyx-dot-app/onyx/blob/main/backend/onyx/connectors/README.md).
|
|
||||||
|
|
||||||
If you have a new/different contribution in mind, we'd love to hear about it!
|
|
||||||
Your input is vital to making sure that Onyx moves in the right direction.
|
|
||||||
Before starting on implementation, please raise a GitHub issue.
|
|
||||||
|
|
||||||
Also, always feel free to message the founders (Chris Weaver / Yuhong Sun) on
|
|
||||||
[Discord](https://discord.gg/4NA5SbzrWb) directly about anything at all.
|
|
||||||
|
|
||||||
### Contributing Code
|
|
||||||
|
|
||||||
To contribute to this project, please follow the
|
|
||||||
["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow.
|
["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow.
|
||||||
When opening a pull request, mention related issues and feel free to tag relevant maintainers.
|
|
||||||
|
|
||||||
Before creating a pull request please make sure that the new changes conform to the formatting and linting requirements.
|
|
||||||
See the [Formatting and Linting](#formatting-and-linting) section for how to run these checks locally.
|
|
||||||
|
|
||||||
### Getting Help 🙋
|
## Getting Help 🙋
|
||||||
|
We have support channels and generally interesting discussions on our [Discord](https://discord.gg/4NA5SbzrWb).
|
||||||
|
|
||||||
Our goal is to make contributing as easy as possible. If you run into any issues please don't hesitate to reach out.
|
See you there!
|
||||||
That way, we can help future contributors and users avoid the same issue.
|
|
||||||
|
|
||||||
We also have support channels and generally interesting discussions on our
|
|
||||||
[Discord](https://discord.gg/4NA5SbzrWb).
|
|
||||||
|
|
||||||
We would love to see you there!
|
|
||||||
|
|
||||||
## Get Started 🚀
|
|
||||||
|
|
||||||
Onyx, being a fully functional app, relies on some external software, specifically:
|
|
||||||
|
|
||||||
- [Postgres](https://www.postgresql.org/) (Relational DB)
|
|
||||||
- [Vespa](https://vespa.ai/) (Vector DB/Search Engine)
|
|
||||||
- [Redis](https://redis.io/) (Cache)
|
|
||||||
- [MinIO](https://min.io/) (File Store)
|
|
||||||
- [Nginx](https://nginx.org/) (Not needed for development flows generally)
|
|
||||||
|
|
||||||
> **Note:**
|
|
||||||
> This guide provides instructions to build and run Onyx locally from source with Docker containers providing the above external software. We believe this combination is easier for
|
|
||||||
> development purposes. If you prefer to use pre-built container images, we provide instructions on running the full Onyx stack within Docker below.
|
|
||||||
|
|
||||||
### Local Set Up
|
|
||||||
|
|
||||||
Be sure to use Python version 3.11. For instructions on installing Python 3.11 on macOS, refer to the [CONTRIBUTING_MACOS.md](./CONTRIBUTING_MACOS.md) readme.
|
|
||||||
|
|
||||||
If using a lower version, modifications will have to be made to the code.
|
|
||||||
If using a higher version, some libraries may not be available (e.g., we have had problems with TensorFlow on higher versions of Python in the past).
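The version requirement above can be checked before installing anything. The following is a minimal sketch, not part of the Onyx tooling; the names `REQUIRED_PYTHON_VERSION` and `is_supported_python` are hypothetical and exist only for illustration.

```python
import sys

# Taken from the guidance above: Onyx development expects Python 3.11.
REQUIRED_PYTHON_VERSION = (3, 11)


def is_supported_python(version_info: tuple[int, ...] = tuple(sys.version_info[:3])) -> bool:
    """Return True only when the major and minor versions match exactly."""
    return tuple(version_info[:2]) == REQUIRED_PYTHON_VERSION
```

Calling `is_supported_python()` with no arguments checks the interpreter that is currently running, which is a quick way to confirm the virtual environment was created with the intended version.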
|
|
||||||
|
|
||||||
#### Backend: Python requirements
|
|
||||||
|
|
||||||
Currently, we use [uv](https://docs.astral.sh/uv/) and recommend creating a [virtual environment](https://docs.astral.sh/uv/pip/environments/#using-a-virtual-environment).
|
|
||||||
|
|
||||||
For convenience here's a command for it:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
uv venv .venv --python 3.11
|
|
||||||
source .venv/bin/activate
|
|
||||||
```
|
|
||||||
|
|
||||||
_For Windows, activate the virtual environment using Command Prompt:_
|
|
||||||
|
|
||||||
```bash
|
|
||||||
.venv\Scripts\activate
|
|
||||||
```
|
|
||||||
|
|
||||||
If using PowerShell, the command slightly differs:
|
|
||||||
|
|
||||||
```powershell
|
|
||||||
.venv\Scripts\Activate.ps1
|
|
||||||
```
|
|
||||||
|
|
||||||
Install the required Python dependencies:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
uv sync --all-extras
|
|
||||||
```
|
|
||||||
|
|
||||||
Install Playwright for Python (headless browser required by the Web Connector):
|
|
||||||
|
|
||||||
```bash
|
|
||||||
uv run playwright install
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Frontend: Node dependencies
|
|
||||||
|
|
||||||
Onyx uses Node v22.20.0. We highly recommend you use [Node Version Manager (nvm)](https://github.com/nvm-sh/nvm)
|
|
||||||
to manage your Node installations. Once installed, you can run
|
|
||||||
|
|
||||||
```bash
|
|
||||||
nvm install 22 && nvm use 22
|
|
||||||
node -v # verify your active version
|
|
||||||
```
|
|
||||||
|
|
||||||
Navigate to `onyx/web` and run:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
npm i
|
|
||||||
```
|
|
||||||
|
|
||||||
## Formatting and Linting
|
|
||||||
|
|
||||||
### Backend
|
|
||||||
|
|
||||||
For the backend, you'll need to set up pre-commit hooks (black / reorder-python-imports).
|
|
||||||
|
|
||||||
Then run:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
uv run pre-commit install
|
|
||||||
```
|
|
||||||
|
|
||||||
Additionally, we use `mypy` for static type checking.
|
|
||||||
Onyx is fully type-annotated, and we want to keep it that way!
|
|
||||||
To run the mypy checks manually, run `uv run mypy .` from the `onyx/backend` directory.
|
|
||||||
|
|
||||||
### Web
|
|
||||||
|
|
||||||
We use `prettier` for formatting. The desired version will be installed via `npm i` from the `onyx/web` directory.
|
|
||||||
To run the formatter, use `npx prettier --write .` from the `onyx/web` directory.
|
|
||||||
|
|
||||||
Pre-commit will also run prettier automatically on files you've recently touched. If prettier reformats any of them, your commit will fail.
|
|
||||||
Re-stage your changes and commit again.
|
|
||||||
|
|
||||||
# Running the application for development
|
|
||||||
|
|
||||||
## Developing using VSCode Debugger (recommended)
|
|
||||||
|
|
||||||
**We highly recommend using VSCode debugger for development.**
|
|
||||||
See [CONTRIBUTING_VSCODE.md](./CONTRIBUTING_VSCODE.md) for more details.
|
|
||||||
|
|
||||||
Otherwise, you can follow the instructions below to run the application for development.
|
|
||||||
|
|
||||||
## Manually running the application for development
|
|
||||||
### Docker containers for external software
|
|
||||||
|
|
||||||
You will need Docker installed to run these containers.
|
|
||||||
|
|
||||||
First navigate to `onyx/deployment/docker_compose`, then start up Postgres/Vespa/Redis/MinIO with:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d index relational_db cache minio
|
|
||||||
```
|
|
||||||
|
|
||||||
(`index` refers to Vespa, `relational_db` refers to Postgres, `cache` refers to Redis, and `minio` refers to MinIO)
|
|
||||||
|
|
||||||
### Running Onyx locally
|
|
||||||
|
|
||||||
To start the frontend, navigate to `onyx/web` and run:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
npm run dev
|
|
||||||
```
|
|
||||||
|
|
||||||
Next, start the model server which runs the local NLP models.
|
|
||||||
Navigate to `onyx/backend` and run:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
uvicorn model_server.main:app --reload --port 9000
|
|
||||||
```
|
|
||||||
|
|
||||||
_For Windows (for compatibility with both PowerShell and Command Prompt):_
|
|
||||||
|
|
||||||
```bash
|
|
||||||
powershell -Command "uvicorn model_server.main:app --reload --port 9000"
|
|
||||||
```
|
|
||||||
|
|
||||||
The first time running Onyx, you will need to run the DB migrations for Postgres.
|
|
||||||
After the first time, this is no longer required unless the DB models change.
|
|
||||||
|
|
||||||
Navigate to `onyx/backend` and with the venv active, run:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
alembic upgrade head
|
|
||||||
```
|
|
||||||
|
|
||||||
Next, start the task queue which orchestrates the background jobs.
|
|
||||||
Jobs that take more time are run async from the API server.
|
|
||||||
|
|
||||||
Still in `onyx/backend`, run:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
python ./scripts/dev_run_background_jobs.py
|
|
||||||
```
|
|
||||||
|
|
||||||
To run the backend API server, navigate back to `onyx/backend` and run:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
AUTH_TYPE=disabled uvicorn onyx.main:app --reload --port 8080
|
|
||||||
```
|
|
||||||
|
|
||||||
_For Windows (for compatibility with both PowerShell and Command Prompt):_
|
|
||||||
|
|
||||||
```bash
|
|
||||||
powershell -Command "
|
|
||||||
$env:AUTH_TYPE='disabled'
|
|
||||||
uvicorn onyx.main:app --reload --port 8080
|
|
||||||
"
|
|
||||||
```
|
|
||||||
|
|
||||||
> **Note:**
|
|
||||||
> If you need finer logging, add the additional environment variable `LOG_LEVEL=DEBUG` to the relevant services.
|
|
||||||
|
|
||||||
#### Wrapping up
|
|
||||||
|
|
||||||
You should now have 4 servers running:
|
|
||||||
|
|
||||||
- Web server
|
|
||||||
- Backend API
|
|
||||||
- Model server
|
|
||||||
- Background jobs
|
|
||||||
|
|
||||||
Now, visit `http://localhost:3000` in your browser. You should see the Onyx onboarding wizard where you can connect your external LLM provider to Onyx.
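If the page does not load, it can help to confirm which servers are actually listening. The sketch below is an illustration only and is not shipped with Onyx: it tries the ports used in the commands above (3000 for the web server, 8080 for the backend API, 9000 for the model server). The background jobs do not listen on a port, so they are not covered. The helper names are hypothetical.

```python
import socket

# Ports taken from the commands in this guide; background jobs have no port.
LOCAL_SERVICE_PORTS = {"web server": 3000, "backend API": 8080, "model server": 9000}


def is_port_open(port: int, host: str = "localhost", timeout_seconds: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Connection refused or timed out: nothing usable is listening there.
        return False


def report_local_services() -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(port) for name, port in LOCAL_SERVICE_PORTS.items()}
```

A `False` entry in the result of `report_local_services()` points at the terminal whose process should be checked first.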
|
|
||||||
|
|
||||||
You've successfully set up a local Onyx instance! 🏁
|
|
||||||
|
|
||||||
#### Running the Onyx application in a container
|
|
||||||
|
|
||||||
You can run the full Onyx application stack from pre-built images including all external software dependencies.
|
|
||||||
|
|
||||||
Navigate to `onyx/deployment/docker_compose` and run:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
docker compose up -d
|
|
||||||
```
|
|
||||||
|
|
||||||
After Docker pulls and starts these containers, navigate to `http://localhost:3000` to use Onyx.
|
|
||||||
|
|
||||||
If you want to make changes to Onyx and run those changes in Docker, you can also build a local version of the Onyx container images that incorporates your changes like so:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
docker compose up -d --build
|
|
||||||
```
|
|
||||||
|
|
||||||
|
|
||||||
### Release Process
|
|
||||||
|
|
||||||
|
## Release Process
|
||||||
Onyx loosely follows the SemVer versioning standard.
|
Onyx loosely follows the SemVer versioning standard.
|
||||||
Major changes are released with a "minor" version bump. Currently we use patch release versions to indicate small feature changes.
|
Major changes are released with a "minor" version bump. Currently we use patch release versions to indicate small feature changes.
|
||||||
A set of Docker containers will be pushed automatically to DockerHub with every tag.
|
A set of Docker containers will be pushed automatically to DockerHub with every tag.
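The versioning convention described above can be sketched as a small helper. This is an illustration under stated assumptions, not the project's release tooling: it assumes tags of the plain form `MAJOR.MINOR.PATCH` with no prefix, and the function name `bump_version` is hypothetical.

```python
def bump_version(version: str, is_major_change: bool) -> str:
    """Apply the convention above: major changes bump "minor", small ones bump "patch"."""
    major, minor, patch = (int(part) for part in version.split("."))
    if is_major_change:
        # A "minor" bump resets the patch component.
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```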
|
||||||
|
|||||||
@@ -0,0 +1,96 @@
|
|||||||
|
# Enterprise Edition Contribution IP Assignment Agreement (DanswerAI, Inc.)
|
||||||
|
|
||||||
|
**Effective Date:** ______________________
|
||||||
|
|
||||||
|
This Enterprise Edition Contribution IP Assignment Agreement (the “**Agreement**”) is entered into by and between:
|
||||||
|
|
||||||
|
- **DanswerAI, Inc.** (“**Company**”), the maintainer of the Onyx product, and
|
||||||
|
- **Contributor:** ______________________ (“**Contributor**”)
|
||||||
|
|
||||||
|
Company and Contributor may be referred to individually as a “**Party**” and collectively as the “**Parties**.”
|
||||||
|
|
||||||
|
## 1. Purpose and scope
|
||||||
|
|
||||||
|
Onyx’s repository is primarily licensed under the MIT License, but includes **proprietary-licensed Enterprise Edition components** (as defined below). This Agreement applies **only** to Contributions made to the Enterprise Edition components and is intended to ensure Company owns all rights necessary to license, distribute, and commercialize Enterprise Edition features.
|
||||||
|
|
||||||
|
## 2. Definitions
|
||||||
|
|
||||||
|
2.1 **“Enterprise Edition” or “EE”** means (a) any source code, documentation, configuration, assets, tests, build scripts, or other materials located in or under **any directory named `ee`** anywhere in the repository (including nested paths), and (b) any other files or directories that are explicitly marked as proprietary or Enterprise Edition in repository documentation, file headers, or license notices, and (c) any derivative works, modifications, or additions to the foregoing.
|
||||||
|
|
||||||
|
2.2 **“Contribution(s)”** means any work of authorship (including code, documentation, or other materials) that Contributor submits to Company for inclusion in EE, including via pull request, patch, commit, issue attachment, email, or any other submission method accepted by Company, and any modifications to existing EE materials.
|
||||||
|
|
||||||
|
2.3 **“Intellectual Property Rights”** means all rights worldwide in and to copyrights, moral rights, neighboring rights, trade secrets, mask work rights, design rights, database rights, patent rights, and any other proprietary rights, whether registered or unregistered.
|
||||||
|
|
||||||
|
## 3. Assignment of rights
|
||||||
|
|
||||||
|
3.1 **Assignment.** To the maximum extent permitted by law, Contributor hereby **assigns and transfers to Company**, and agrees to assign and transfer to Company, **all right, title, and interest** in and to all Contributions and all associated Intellectual Property Rights, including all rights to reproduce, prepare derivative works, distribute, publicly perform, publicly display, and otherwise exploit the Contributions in any manner.
|
||||||
|
|
||||||
|
3.2 **Future rights and further assurances.** Contributor agrees to execute and deliver (including electronically) any documents and take any actions reasonably requested by Company to perfect, record, or enforce Company’s rights in the Contributions. If Contributor fails to do so after reasonable request, Contributor appoints Company as Contributor’s attorney-in-fact solely to execute such documents on Contributor’s behalf.
|
||||||
|
|
||||||
|
3.3 **Work made for hire (where applicable).** To the extent any Contribution qualifies as a “work made for hire” under applicable law, it shall be deemed a work made for hire for Company. If not, it is assigned under Section 3.1.
|
||||||
|
|
||||||
|
## 4. Moral rights waiver
|
||||||
|
|
||||||
|
To the extent permitted by law, Contributor **waives and agrees not to assert** any moral rights (including rights of attribution and integrity) or similar rights in the Contributions against Company or Company’s licensees, successors, or assigns.
|
||||||
|
|
||||||
|
## 5. Patent rights (assignment / license)
|
||||||
|
|
||||||
|
5.1 **Patent assignment.** To the maximum extent permitted by law, Contributor hereby assigns to Company all right, title, and interest in any patent rights that are **necessarily infringed** by making, using, selling, offering for sale, importing, or otherwise exploiting the Contributions or EE as incorporated with the Contributions.
|
||||||
|
|
||||||
|
5.2 **Fallback patent license.** If any patent rights cannot be assigned as a matter of law, Contributor grants Company a **perpetual, irrevocable, worldwide, transferable, sublicensable, royalty-free** license under such patent rights to make, have made, use, sell, offer for sale, import, and otherwise exploit the Contributions and EE.
|
||||||
|
|
||||||
|
## 6. Contributor representations
|
||||||
|
|
||||||
|
Contributor represents and warrants that:
|
||||||
|
|
||||||
|
6.1 **Authority.** Contributor has the legal right and authority to enter into this Agreement and to make the assignments and grants herein.
|
||||||
|
|
||||||
|
6.2 **Originality / rights clearance.** Each Contribution is original to Contributor or Contributor has secured all necessary rights and permissions to submit it and to assign the rights described in this Agreement.
|
||||||
|
|
||||||
|
6.3 **No third-party restrictions.** Contributions are not subject to any employment, contractor, academic, or other agreement that would conflict with this Agreement or restrict assignment to Company. Contributor has not included any code or materials that require disclosure of source code or impose “copyleft” or similar reciprocal obligations on EE (including but not limited to GPL, AGPL, LGPL (in a way that would impose reciprocity on EE), or other licenses that would require EE to be distributed under different terms), unless Company has expressly agreed in writing.
|
||||||
|
|
||||||
|
6.4 **No confidential information.** Contributor will not submit any confidential or proprietary information of any third party (including an employer) as part of a Contribution.
|
||||||
|
|
||||||
|
## 7. Relationship to MIT-licensed portions of the repo
|
||||||
|
|
||||||
|
This Agreement applies **only** to Contributions to EE as defined in Section 2.1. Contributions made solely to MIT-licensed portions of the repository remain governed by the repository’s applicable open-source licensing and contribution terms, unless a separate written agreement states otherwise.
|
||||||
|
|
||||||
|
## 8. No obligation; consideration
|
||||||
|
|
||||||
|
8.1 **No obligation to accept.** Company has no obligation to accept, merge, or distribute any Contribution.
|
||||||
|
|
||||||
|
8.2 **Consideration.** Contributor agrees that the opportunity to contribute to EE and Company’s potential acceptance and use of the Contributions are adequate consideration for the assignments and grants in this Agreement.
|
||||||
|
|
||||||
|
## 9. Limitation of liability
|
||||||
|
|
||||||
|
To the maximum extent permitted by law, **neither Party** will be liable to the other for any indirect, incidental, special, consequential, or punitive damages arising out of this Agreement.
|
||||||
|
|
||||||
|
## 10. Governing law; venue
|
||||||
|
|
||||||
|
This Agreement is governed by the laws of the **State of California**, excluding conflict-of-laws rules. The Parties agree to exclusive jurisdiction and venue in the state or federal courts located in **California**, unless prohibited by applicable law.
|
||||||
|
|
||||||
|
## 11. Miscellaneous
|
||||||
|
|
||||||
|
11.1 **Entire agreement.** This Agreement is the entire agreement between the Parties regarding EE Contributions and supersedes all prior or contemporaneous understandings on that subject.
|
||||||
|
|
||||||
|
11.2 **Amendment.** Any amendment must be in writing and signed by both Parties.
|
||||||
|
|
||||||
|
11.3 **Severability.** If any provision is held unenforceable, the remaining provisions remain in full force and effect.
|
||||||
|
|
||||||
|
11.4 **Counterparts; electronic signatures.** This Agreement may be executed in counterparts, including via electronic signature, each of which is deemed an original.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Signatures
|
||||||
|
|
||||||
|
**COMPANY:** DanswerAI, Inc.
|
||||||
|
By: ____________________________________
|
||||||
|
Name: __________________________________
|
||||||
|
Title: ___________________________________
|
||||||
|
Date: ___________________________________
|
||||||
|
|
||||||
|
**CONTRIBUTOR:**
|
||||||
|
Signature: _______________________________
|
||||||
|
Name: ___________________________________
|
||||||
|
Email: ___________________________________
|
||||||
|
Date: ___________________________________
|
||||||
BIN
contributing_guides/EE_Contributor_IP_Assignment_Agreement.pdf
Normal file
Binary file not shown.
157
contributing_guides/best_practices.md
Normal file
@@ -0,0 +1,157 @@
# Engineering Principles, Style, and Correctness Guide

## Principles and collaboration

- **Use 1-way vs 2-way doors.** For 2-way doors (decisions that are easy to reverse), move faster and iterate. For 1-way doors (decisions that are hard to reverse), be more deliberate.
- **Consistency > being “right.”** Prefer consistent patterns across the codebase. If something is truly bad, fix it everywhere.
- **Fix what you touch (selectively).**
  - Don’t feel obligated to fix every best-practice issue you notice.
  - Don’t introduce new bad practices.
  - If your change touches code that violates best practices, fix it as part of the change.
- **Don’t tack features on.** When adding functionality, restructure logically as needed to avoid muddying interfaces and accumulating tech debt.

---

## Style and maintainability

### Comments and readability
Add clear comments:
- At logical boundaries (e.g., interfaces) so the reader doesn’t need to dig 10 layers deeper.
- Wherever assumptions are made or something non-obvious/unexpected is done.
- For complicated flows/functions.
- Wherever it saves time (e.g., nontrivial regex patterns).

### Errors and exceptions
- **Fail loudly** rather than silently skipping work.
  - Example: raise and let exceptions propagate instead of silently dropping a document.
- **Don’t overuse `try/except`.**
  - Put `try/except` at the correct logical level.
  - Do not mask exceptions unless it is clearly appropriate.
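
As a rough illustration of the two rules above, the sketch below raises on a malformed record instead of dropping it, and keeps `try/except` out of the low-level helpers so the caller that owns the batch decides what a failure means. All names here (`DocumentParseError`, `parse_document`, `parse_all_documents`) are hypothetical and do not come from the Onyx codebase.

```python
class DocumentParseError(Exception):
    """Raised when a raw record cannot be turned into a document."""


def parse_document(raw_record: dict[str, str]) -> dict[str, str]:
    # Fail loudly: a record without an id is a bug upstream, not something to skip.
    if "id" not in raw_record:
        raise DocumentParseError(f"Record is missing an 'id' field: {raw_record!r}")
    return {"id": raw_record["id"], "text": raw_record.get("text", "")}


def parse_all_documents(raw_records: list[dict[str, str]]) -> list[dict[str, str]]:
    # No try/except here: the exception propagates to the level that can act on it.
    return [parse_document(raw_record) for raw_record in raw_records]
```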

### Typing
- Everything should be **as strictly typed as possible**.
- Use `cast` for awkward or loosely typed interfaces (e.g., results of `run_functions_tuples_in_parallel`).
  - Only `cast` when the type checker sees `Any` or types are too loose.
- Prefer types that are easy to read.
  - Avoid dense types like `dict[tuple[str, str], list[list[float]]]`.
  - Prefer domain models, e.g.:
    - `EmbeddingModel(provider_name, model_name)` as a Pydantic model
    - `dict[EmbeddingModel, list[EmbeddingVector]]`
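
The `cast` guidance can be sketched with the standard library alone. In the example below, `run_jobs_loosely_typed` is a hypothetical stand-in for any helper whose return type is too loose, and the `EmbeddingVector` alias is assumed to be a list of floats; neither is taken from the Onyx codebase.

```python
from typing import Any, cast

# Assumed alias: a readable name instead of repeating list[float] everywhere.
EmbeddingVector = list[float]


def run_jobs_loosely_typed() -> list[Any]:
    # Hypothetical stand-in for a helper that returns loosely typed results.
    return [[0.1, 0.2], [0.3, 0.4]]


def fetch_embedding_vectors() -> list[EmbeddingVector]:
    raw_results = run_jobs_loosely_typed()
    # cast() performs no runtime check; it only tells the type checker what we know.
    return cast(list[EmbeddingVector], raw_results)
```

Because `cast` is purely a type-checker hint, it belongs only at boundaries like this one, where the real type is known but not expressed.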

### State, objects, and boundaries
- Keep **clear logical boundaries** for state containers and objects.
- A **config** object should never contain things like a `db_session`.
- Avoid state containers that are:
  - overly nested, or
  - huge and flat (use judgment).
- Prefer **composition and functional style** over inheritance/OOP.
- Prefer **no mutation** unless there’s a strong reason.
- State objects should be **intentional and explicit**, ideally nonmutating.
- Use interfaces/objects to create clear separation of responsibility.
- Prefer simplicity when added complexity brings no clear gain.
  - Avoid overcomplicated mechanisms like semaphores.
- Prefer **hash maps (dicts)** over tree structures unless there’s a strong reason.
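
One way to read the "no mutation" and "functional style" points is shown below: the function returns a new mapping rather than changing its input. This is a sketch only; the function name and the tags-by-document-id shape are made up for illustration.

```python
def with_added_tag(
    tags_by_document_id: dict[str, tuple[str, ...]],
    document_id: str,
    tag: str,
) -> dict[str, tuple[str, ...]]:
    """Return a new mapping with the tag appended; the input is left untouched."""
    existing_tags = tags_by_document_id.get(document_id, ())
    return {**tags_by_document_id, document_id: existing_tags + (tag,)}
```

Callers that still hold the original mapping never observe a surprise change, which keeps state transitions explicit.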

### Naming
- Name variables carefully and intentionally.
- Prefer long, explicit names when undecided.
- Avoid single-character variables except for small, self-contained utilities (or not at all).
- Keep the same object/name consistent through the call stack and within functions when reasonable.
  - Good: `for token in tokens:`
  - Bad: `for msg in tokens:` (if iterating tokens)
- Function names should bias toward **long + descriptive** for codebase search.
  - IntelliSense can miss call sites; search works best with unique names.
  - “Fetch versioned implementation” is an example of why this matters.

### Correctness by construction
- Prefer self-contained correctness.
  - Don’t rely on callers to “use it right” if you can make misuse hard.
- Avoid redundancies:
  - If a function takes an arg, it shouldn’t also take a state object that contains that same arg.
- No dead code (unless there’s a very good reason).
- No commented-out code in main or feature branches (unless there’s a very good reason).
- No duplicate logic:
  - Don’t copy/paste into branches when shared logic can live above the conditional.
  - If you’re afraid to touch the original, you don’t understand it well enough.
  - LLMs often create subtle duplicate logic; review carefully and remove it.
  - Avoid “nearly identical” objects that confuse when to use which.
- Avoid extremely long functions with chained logic:
  - Encapsulate steps into helpers for readability, even if not reused.
  - “Pythonic” multi-step expressions are OK in moderation; don’t trade clarity for cleverness.
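
The "shared logic can live above the conditional" rule is easiest to see in a tiny example. The function below is hypothetical; the point is only that the clean-up step appears once, before the branch, instead of being pasted into each branch.

```python
def build_greeting(user_name: str, is_admin: bool) -> str:
    # Shared step lives above the conditional instead of being copied into each branch.
    cleaned_user_name = user_name.strip().title()
    if is_admin:
        return f"Welcome back, administrator {cleaned_user_name}."
    return f"Welcome back, {cleaned_user_name}."
```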

---

## Performance and correctness

- Avoid holding resources for extended periods:
  - DB sessions
  - locks/semaphores
- Validate objects:
  - on creation, and
  - right before use.
- Connector code (data → Onyx documents):
  - Any in-memory structure that can grow without bound based on input must be periodically size-checked.
  - If a connector is OOMing (often shows up as “missing celery tasks”), this is a top thing to check retroactively.
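
A common way to keep an input-driven buffer bounded is to yield fixed-size batches as they fill up. The generator below is a sketch of that idea, not actual connector code; `MAX_DOCUMENTS_PER_BATCH` and `batch_documents` are hypothetical names, and the batch size is deliberately tiny for illustration.

```python
from collections.abc import Iterable, Iterator

MAX_DOCUMENTS_PER_BATCH = 2  # Small on purpose, for illustration only.


def batch_documents(
    documents: Iterable[str],
    max_batch_size: int = MAX_DOCUMENTS_PER_BATCH,
) -> Iterator[list[str]]:
    """Yield documents in batches so the in-memory buffer never exceeds max_batch_size."""
    current_batch: list[str] = []
    for document in documents:
        current_batch.append(document)
        # Size check after every append: flush as soon as the limit is reached.
        if len(current_batch) >= max_batch_size:
            yield current_batch
            current_batch = []
    # Flush whatever is left once the input is exhausted.
    if current_batch:
        yield current_batch
```

Because the input is consumed lazily, memory use depends on the batch size rather than on how much data the source returns.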

---

## Repository conventions: where code lives

- Pydantic + data models: `models.py` files.
- DB interface functions (excluding lazy loading): `db/` directory.
- LLM prompts: `prompts/` directory, roughly mirroring the code layout that uses them.
- API routes: `server/` directory.

---

## Pydantic and modeling rules

- Prefer **Pydantic** over dataclasses.
- If absolutely required, use `arbitrary_types_allowed`.

---

## Data conventions

- Prefer explicit `None` over sentinel empty strings (usually; depends on intent).
- Prefer explicit identifiers:
  - Use string enums instead of integer codes.
- Avoid magic numbers (co-location is good when necessary). **Always avoid magic strings.**
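
A string enum covers both the "explicit identifiers" and the "no magic strings" points: the values stay readable in logs and databases, while code refers to named members. The `IndexingStatus` enum below is a made-up example, not an Onyx model.

```python
from enum import Enum


class IndexingStatus(str, Enum):
    # Readable values instead of integer codes such as 0, 1, 2.
    NOT_STARTED = "not_started"
    IN_PROGRESS = "in_progress"
    SUCCESS = "success"


def is_finished(status: IndexingStatus) -> bool:
    # Compare against the enum member, never against a bare string literal.
    return status is IndexingStatus.SUCCESS
```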

---

## Logging

- Log messages where they are created.
- Don’t propagate log messages around just to log them elsewhere.

---

## Encapsulation

- Don’t use private attributes/methods/properties from other classes/modules.
- “Private” is private; respect that boundary.

---

## SQLAlchemy guidance

- Lazy loading is often bad at scale, especially across multiple list relationships.
- Be careful when accessing SQLAlchemy object attributes:
  - It can help avoid redundant DB queries,
  - but it can also fail if accessed outside an active session,
  - and lazy loading can add hidden DB dependencies to otherwise “simple” functions.
- Reference: https://www.reddit.com/r/SQLAlchemy/comments/138f248/joinedload_vs_selectinload/

---

## Trunk-based development and feature flags

- **PRs should contain no more than 500 lines of real change.**
- **Merge to main frequently.** Avoid long-lived feature branches; they create merge conflicts and integration pain.
- **Use feature flags for incremental rollout.**
  - Large features should be merged in small, shippable increments behind a flag.
  - This allows continuous integration without exposing incomplete functionality.
- **Keep flags short-lived.** Once a feature is fully rolled out, remove the flag and dead code paths promptly.
- **Flag at the right level.** Prefer flagging at API/UI entry points rather than deep in business logic.
- **Test both flag states.** Ensure the codebase works correctly with the flag on and off.
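
The "flag at the right level" rule might look like the sketch below, where the flag is read once at the entry point and the two implementations stay unaware of it. Everything here is hypothetical: the environment variable `ENABLE_NEW_SEARCH`, the function names, and the choice of an environment variable as the flag store are assumptions for illustration, not how Onyx manages flags.

```python
import os

NEW_SEARCH_FLAG_ENV_VAR = "ENABLE_NEW_SEARCH"  # Hypothetical flag name.
FLAG_ENABLED_VALUE = "true"


def is_new_search_enabled() -> bool:
    return os.environ.get(NEW_SEARCH_FLAG_ENV_VAR, "").lower() == FLAG_ENABLED_VALUE


def legacy_search(query: str) -> list[str]:
    return [f"legacy:{query}"]


def new_search(query: str) -> list[str]:
    return [f"new:{query}"]


def handle_search_request(query: str) -> list[str]:
    # The flag is checked only here, at the entry point; the search functions stay flag-free.
    if is_new_search_enabled():
        return new_search(query)
    return legacy_search(query)
```

Keeping the check in one place also makes it straightforward to exercise both flag states in tests and to delete the flag once the rollout is complete.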
38
contributing_guides/contribution_process.md
Normal file
@@ -0,0 +1,38 @@
|
|||||||
|
# Contribution Process
|
||||||
|
|
||||||
|
## 1. Get the feature or enhancement approved
|
||||||
|
Create a GitHub issue and see if there are upvotes. If you feel the feature is sufficiently value additive and you would like
|
||||||
|
approval to contribute it to the repo, tag [Yuhong](https://github.com/yuhongsun96) to review.
|
||||||
|
|
||||||
|
If you do not get a response within a week, feel free to email yuhong@onyx.app and include the issue in the message.
|
||||||
|
|
||||||
|
Not all small features and enhancements will be accepted as there is a balance between feature richness and bloat.
|
||||||
|
We strive to provide the best user experience possible so we have to be intentional about what we include in the app.
|
||||||
|
|
||||||
|
|
||||||
|
## 2. Get the design approved
|
||||||
|
The Onyx team will either provide a design doc and PRD for the feature or request one from you, the contributor.
|
||||||
|
|
||||||
|
The scope and detail of the design will depend on the individual feature.
|
||||||
|
|
||||||
|
|
||||||
|
# 3. IP attribution for EE contributions
|
||||||
|
If you are contributing features to Onyx Enterprise Edition, you are required to sign the IP Assignment Agreement in the
|
||||||
|
contributing_guides directory.
|
||||||
|
|
||||||
|
|
||||||
|
## 4. Review and testing
|
||||||
|
Your features must pass all tests and all comments must be addressed prior to merging.
|
||||||
|
|
||||||
|
|
||||||
|
# Implicit agreements
|
||||||
|
If we approve an issue, we are promising you the following:
|
||||||
|
- Your work will receive timely attention and we will put aside other high priority items to ensure you are not blocked.
|
||||||
|
- You will receive necessary coaching on eng quality, system design, etc. to ensure the feature is completed well.
|
||||||
|
- The Onyx team will pull resources and bandwidth from design, PM, and engineering to ensure that you have all the
|
||||||
|
resources to build the feature to the quality required for merging.
|
||||||
|
|
||||||
|
Because this is a large investment from our team, we ask that you:
|
||||||
|
- Thoroughly read all the requirements of the design docs, engineering best practices, and try to minimize overhead for
|
||||||
|
the Onyx team.
|
||||||
|
- Complete the feature in a timely manner to reduce context switching and an ongoing resource pull from the Onyx team.
205
contributing_guides/dev_setup.md
Normal file
@@ -0,0 +1,205 @@
## Get Started 🚀

Onyx, being a fully functional app, relies on some external software, specifically:

- [Postgres](https://www.postgresql.org/) (Relational DB)
- [Vespa](https://vespa.ai/) (Vector DB/Search Engine)
- [Redis](https://redis.io/) (Cache)
- [MinIO](https://min.io/) (File Store)
- [Nginx](https://nginx.org/) (Not needed for development flows generally)

> **Note:**
> This guide provides instructions to build and run Onyx locally from source with Docker containers providing the above external software. We believe this combination is easier for
> development purposes. If you prefer to use pre-built container images, we provide instructions on running the full Onyx stack within Docker below.

### Local Set Up

Be sure to use Python version 3.11. For instructions on installing Python 3.11 on macOS, refer to the [contributing_macos.md](./contributing_macos.md) readme.

If using a lower version, modifications will have to be made to the code.
If using a higher version, some libraries may not be available (e.g. we had problems with TensorFlow in the past with higher versions of Python).
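
A quick way to confirm which interpreter is active before going further. This is a sketch, not part of the repo: `is_supported_python` is a hypothetical helper, and it assumes `python3` on your `PATH` is the interpreter you intend to use.

```bash
# Hypothetical helper: succeeds only for 3.11.x version strings such as "3.11.9".
is_supported_python() {
  case "$1" in
    3.11|3.11.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Ask the interpreter for its own version; empty if python3 is not installed.
version="$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null || true)"

if is_supported_python "$version"; then
  echo "Python $version: OK"
else
  echo "Python ${version:-not found}: expected 3.11.x"
fi
```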

#### Backend: Python requirements

Currently, we use [uv](https://docs.astral.sh/uv/) and recommend creating a [virtual environment](https://docs.astral.sh/uv/pip/environments/#using-a-virtual-environment).

For convenience, here's a command for it:

```bash
uv venv .venv --python 3.11
source .venv/bin/activate
```

_For Windows, activate the virtual environment using Command Prompt:_

```bash
.venv\Scripts\activate
```

If using PowerShell, the command differs slightly:

```powershell
.venv\Scripts\Activate.ps1
```

Install the required Python dependencies:

```bash
uv sync --all-extras
```

Install Playwright for Python (headless browser required by the Web Connector):

```bash
uv run playwright install
```

#### Frontend: Node dependencies

Onyx uses Node v22.20.0. We highly recommend you use [Node Version Manager (nvm)](https://github.com/nvm-sh/nvm)
to manage your Node installations. Once nvm is installed, you can run:

```bash
nvm install 22 && nvm use 22
node -v # verify your active version
```

Navigate to `onyx/web` and run:

```bash
npm i
```

## Formatting and Linting

### Backend

For the backend, you'll need to set up pre-commit hooks (black / reorder-python-imports).

To install them, run:

```bash
uv run pre-commit install
```

Additionally, we use `mypy` for static type checking.
Onyx is fully type-annotated, and we want to keep it that way!
To run the mypy checks manually, run `uv run mypy .` from the `onyx/backend` directory.

### Web

We use `prettier` for formatting. The desired version will be installed via `npm i` from the `onyx/web` directory.
To run the formatter, use `npx prettier --write .` from the `onyx/web` directory.

Pre-commit will also run prettier automatically on files you've recently touched. If prettier reformats any of them, your commit will fail.
Re-stage your changes and commit again.

# Running the application for development

## Developing using VSCode Debugger (recommended)

**We highly recommend using the VSCode debugger for development.**
See [contributing_vscode.md](./contributing_vscode.md) for more details.

Otherwise, you can follow the instructions below to run the application for development.

## Manually running the application for development
### Docker containers for external software

You will need Docker installed to run these containers.

First navigate to `onyx/deployment/docker_compose`, then start up Postgres/Vespa/Redis/MinIO with:

```bash
docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d index relational_db cache minio
```

(`index` refers to Vespa, `relational_db` refers to Postgres, `cache` refers to Redis, and `minio` refers to MinIO)

### Running Onyx locally

To start the frontend, navigate to `onyx/web` and run:

```bash
npm run dev
```

Next, start the model server, which runs the local NLP models.
Navigate to `onyx/backend` and run:

```bash
uvicorn model_server.main:app --reload --port 9000
```

_For Windows (for compatibility with both PowerShell and Command Prompt):_

```bash
powershell -Command "uvicorn model_server.main:app --reload --port 9000"
```

The first time you run Onyx, you will need to run the DB migrations for Postgres.
After the first time, this is no longer required unless the DB models change.

Navigate to `onyx/backend` and, with the venv active, run:

```bash
alembic upgrade head
```

Next, start the task queue, which orchestrates the background jobs.
Longer-running jobs are run asynchronously from the API server.

Still in `onyx/backend`, run:

```bash
python ./scripts/dev_run_background_jobs.py
```

To run the backend API server, run the following from `onyx/backend`:

```bash
AUTH_TYPE=disabled uvicorn onyx.main:app --reload --port 8080
```

_For Windows (for compatibility with both PowerShell and Command Prompt):_

```bash
powershell -Command "
$env:AUTH_TYPE='disabled'
uvicorn onyx.main:app --reload --port 8080
"
```

> **Note:**
> If you need more detailed logging, add the environment variable `LOG_LEVEL=DEBUG` to the relevant services.
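
As a sketch of how the `LOG_LEVEL` note can be applied when launching a service by hand: the `run_with_debug_logging` helper below is a hypothetical convenience, not part of the Onyx repo, and it assumes each service reads `LOG_LEVEL` from its process environment. Prefixing the command with `LOG_LEVEL=DEBUG` directly has the same effect.

```bash
# Hypothetical helper: runs any command with LOG_LEVEL=DEBUG set for that process only.
run_with_debug_logging() {
  LOG_LEVEL=DEBUG "$@"
}

# Example, from onyx/backend:
#   run_with_debug_logging uvicorn model_server.main:app --reload --port 9000
```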

#### Wrapping up

You should now have 4 servers running:

- Web server
- Backend API
- Model server
- Background jobs

Now, visit `http://localhost:3000` in your browser. You should see the Onyx onboarding wizard where you can connect your external LLM provider to Onyx.

You've successfully set up a local Onyx instance! 🏁
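
If the page does not load, a quick check of which servers are accepting connections can narrow things down. The sketch below makes some assumptions: ports 3000, 8080, and 9000 are the ones used in the commands in this guide, the background jobs process is skipped because this guide gives it no port, and `port_is_open` is a hypothetical helper that relies on bash's `/dev/tcp` redirection.

```bash
# Hypothetical helper: succeeds if something accepts TCP connections on host $1, port $2.
port_is_open() {
  bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# name:port pairs taken from the commands earlier in this guide.
for entry in "web:3000" "api:8080" "model_server:9000"; do
  name="${entry%%:*}"
  port="${entry##*:}"
  if port_is_open localhost "$port"; then
    echo "$name (port $port): listening"
  else
    echo "$name (port $port): not reachable"
  fi
done
```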

#### Running the Onyx application in a container

You can run the full Onyx application stack from pre-built images, including all external software dependencies.

Navigate to `onyx/deployment/docker_compose` and run:

```bash
docker compose up -d
```

After Docker pulls and starts these containers, navigate to `http://localhost:3000` to use Onyx.

If you want to make changes to Onyx and run those changes in Docker, you can also build a local version of the Onyx container images that incorporates your changes:

```bash
docker compose up -d --build
```