Compare commits

..

1 Commits

Author SHA1 Message Date
Bo-Onyx
3d2cc175a8 feat(pruning): Wire Prometheus metrics into the Heavy Celery worker 2026-04-07 15:16:05 -07:00
188 changed files with 3188 additions and 7809 deletions

View File

@@ -1 +0,0 @@
../../../cli/internal/embedded/SKILL.md

View File

@@ -0,0 +1,186 @@
---
name: onyx-cli
description: Query the Onyx knowledge base using the onyx-cli command. Use when the user wants to search company documents, ask questions about internal knowledge, query connected data sources, or look up information stored in Onyx.
---
# Onyx CLI — Agent Tool
Onyx is an enterprise search and Gen-AI platform that connects to company documents, apps, and people. `onyx-cli` provides non-interactive commands to query the Onyx knowledge base and list available agents.
## Prerequisites
### 1. Check if installed
```bash
which onyx-cli
```
### 2. Install (if needed)
**Primary — pip:**
```bash
pip install onyx-cli
```
**From source (Go):**
```bash
cd cli && go build -o onyx-cli . && sudo mv onyx-cli /usr/local/bin/
```
### 3. Check if configured
```bash
onyx-cli validate-config
```
This checks the config file exists, API key is present, and tests the server connection via `/api/me`. Exit code 0 on success, non-zero with a descriptive error on failure.
If unconfigured, you have two options:
**Option A — Interactive setup (requires user input):**
```bash
onyx-cli configure
```
This prompts for the Onyx server URL and API key, tests the connection, and saves config.
**Option B — Environment variables (non-interactive, preferred for agents):**
```bash
export ONYX_SERVER_URL="https://your-onyx-server.com" # default: https://cloud.onyx.app
export ONYX_API_KEY="your-api-key"
```
Environment variables override the config file. If these are set, no config file is needed.
| Variable | Required | Description |
|----------|----------|-------------|
| `ONYX_SERVER_URL` | No | Onyx server base URL (default: `https://cloud.onyx.app`) |
| `ONYX_API_KEY` | Yes | API key for authentication |
| `ONYX_PERSONA_ID` | No | Default agent/persona ID |
If neither the config file nor environment variables are set, tell the user that `onyx-cli` needs to be configured and ask them to either:
- Run `onyx-cli configure` interactively, or
- Set `ONYX_SERVER_URL` and `ONYX_API_KEY` environment variables
## Commands
### Validate configuration
```bash
onyx-cli validate-config
```
Checks config file exists, API key is present, and tests the server connection. Use this before `ask` or `agents` to confirm the CLI is properly set up.
### List available agents
```bash
onyx-cli agents
```
Prints a table of agent IDs, names, and descriptions. Use `--json` for structured output:
```bash
onyx-cli agents --json
```
Use agent IDs with `ask --agent-id` to query a specific agent.
### Basic query (plain text output)
```bash
onyx-cli ask "What is our company's PTO policy?"
```
Streams the answer as plain text to stdout. Exit code 0 on success, non-zero on error.
### JSON output (structured events)
```bash
onyx-cli ask --json "What authentication methods do we support?"
```
Outputs JSON-encoded parsed stream events (one object per line). Key event objects include message deltas, stop, errors, search-start, and citation payloads.
Each line is a JSON object with this envelope:
```json
{"type": "<event_type>", "event": { ... }}
```
| Event Type | Description |
|------------|-------------|
| `message_delta` | Content token — concatenate all `content` fields for the full answer |
| `stop` | Stream complete |
| `error` | Error with `error` message field |
| `search_tool_start` | Onyx started searching documents |
| `citation_info` | Source citation — see shape below |
`citation_info` event shape:
```json
{
"type": "citation_info",
"event": {
"citation_number": 1,
"document_id": "abc123def456",
"placement": {"turn_index": 0, "tab_index": 0, "sub_turn_index": null}
}
}
```
`placement` is metadata about where in the conversation the citation appeared and can be ignored for most use cases.
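When consuming `--json` output programmatically, the envelope above can be folded into a final answer plus a citation list. A minimal Python sketch (the event handling follows the table above; the helper name and error policy are our own choices, not part of the CLI):

```python
import json

def collect_answer(ndjson_lines):
    """Fold onyx-cli `ask --json` NDJSON lines (one JSON object per line)
    into (full_answer, cited_document_ids)."""
    answer_parts = []
    citations = []
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        etype = obj.get("type")
        event = obj.get("event", {})
        if etype == "message_delta":
            # Concatenate all content tokens for the full answer.
            answer_parts.append(event.get("content", ""))
        elif etype == "citation_info":
            citations.append(event.get("document_id"))
        elif etype == "error":
            raise RuntimeError(event.get("error"))
        elif etype == "stop":
            break
    return "".join(answer_parts), citations
```
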
### Specify an agent
```bash
onyx-cli ask --agent-id 5 "Summarize our Q4 roadmap"
```
Uses a specific Onyx agent/persona instead of the default.
### All flags
| Flag | Type | Description |
|------|------|-------------|
| `--agent-id` | int | Agent ID to use (overrides default) |
| `--json` | bool | Output raw NDJSON events instead of plain text |
## Statelessness
Each `onyx-cli ask` call creates an independent chat session. There is no built-in way to chain context across multiple `ask` invocations — every call starts fresh. If you need multi-turn conversation with memory, use the interactive TUI (`onyx-cli` or `onyx-cli chat`) instead.
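Because each call is stateless, one workaround for light multi-turn use is to inline prior exchanges into the next question. A purely illustrative sketch — `onyx-cli` itself has no such feature:

```python
def ask_with_context(history, question):
    """Build one self-contained prompt from prior (question, answer) pairs,
    since every `onyx-cli ask` invocation starts a fresh session."""
    parts = []
    for prev_q, prev_a in history:
        parts.append(f"Previously asked: {prev_q}\nPrevious answer: {prev_a}")
    parts.append(f"New question: {question}")
    return "\n\n".join(parts)
```
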
## When to Use
Use `onyx-cli ask` when:
- The user asks about company-specific information (policies, docs, processes)
- You need to search internal knowledge bases or connected data sources
- The user references Onyx, asks you to "search Onyx", or wants to query their documents
- You need context from company wikis, Confluence, Google Drive, Slack, or other connected sources
Do NOT use when:
- The question is about general programming knowledge (use your own knowledge)
- The user is asking about code in the current repository (use grep/read tools)
- The user hasn't mentioned Onyx and the question doesn't require internal company data
## Examples
```bash
# Simple question
onyx-cli ask "What are the steps to deploy to production?"
# Get structured output for parsing
onyx-cli ask --json "List all active API integrations"
# Use a specialized agent
onyx-cli ask --agent-id 3 "What were the action items from last week's standup?"
# Pipe the answer into another command
onyx-cli ask "What is the database schema for users?" | head -20
```
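From Python, the same invocations can be scripted via `subprocess`. A sketch assuming `onyx-cli` is on `PATH`; only the flags documented above are used, and the helper names are ours:

```python
import subprocess

def build_ask_command(question, agent_id=None, json_output=False):
    """Assemble an `onyx-cli ask` argument list from the documented flags."""
    cmd = ["onyx-cli", "ask"]
    if agent_id is not None:
        cmd += ["--agent-id", str(agent_id)]
    if json_output:
        cmd.append("--json")
    cmd.append(question)
    return cmd

def ask(question, agent_id=None):
    # check=True surfaces the CLI's non-zero exit code as an exception.
    result = subprocess.run(
        build_ask_command(question, agent_id),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```
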

View File

@@ -228,7 +228,7 @@ jobs:
- name: Create GitHub Release
id: create-release
uses: softprops/action-gh-release@153bb8e04406b158c6c84fc1615b65b24149a1fe # ratchet:softprops/action-gh-release@v2
uses: softprops/action-gh-release@da05d552573ad5aba039eaac05058a918a7bf631 # ratchet:softprops/action-gh-release@v2
with:
tag_name: ${{ steps.release-tag.outputs.tag }}
name: ${{ steps.release-tag.outputs.tag }}

View File

@@ -21,7 +21,7 @@ jobs:
persist-credentials: false
- name: Install Helm CLI
uses: azure/setup-helm@dda3372f752e03dde6b3237bc9431cdc2f7a02a2 # ratchet:azure/setup-helm@v5.0.0
uses: azure/setup-helm@1a275c3b69536ee54be43f2070a358922e12c8d4 # ratchet:azure/setup-helm@v4
with:
version: v3.12.1

View File

@@ -13,7 +13,7 @@ jobs:
runs-on: ubuntu-latest
timeout-minutes: 45
steps:
- uses: actions/stale@b5d41d4e1d5dceea10e7104786b73624c18a190f # ratchet:actions/stale@v10
- uses: actions/stale@997185467fa4f803885201cee163a9f38240193d # ratchet:actions/stale@v10
with:
stale-issue-message: 'This issue is stale because it has been open 75 days with no activity. Remove stale label or comment or this will be closed in 15 days.'
stale-pr-message: 'This PR is stale because it has been open 75 days with no activity. Remove stale label or comment or this will be closed in 15 days.'

View File

@@ -36,7 +36,7 @@ jobs:
persist-credentials: false
- name: Set up Helm
uses: azure/setup-helm@dda3372f752e03dde6b3237bc9431cdc2f7a02a2 # ratchet:azure/setup-helm@v5.0.0
uses: azure/setup-helm@1a275c3b69536ee54be43f2070a358922e12c8d4 # ratchet:azure/setup-helm@v4.3.1
with:
version: v3.19.0

.gitignore
View File

@@ -59,6 +59,3 @@ node_modules
# plans
plans/
# Added context for LLMs
onyx-llm-context/

View File

@@ -1,4 +1,4 @@
from typing import Any
from typing import Any, Literal
from onyx.db.engine.iam_auth import get_iam_auth_token
from onyx.configs.app_configs import USE_IAM_AUTH
from onyx.configs.app_configs import POSTGRES_HOST
@@ -19,6 +19,7 @@ from logging.config import fileConfig
from alembic import context
from sqlalchemy.ext.asyncio import create_async_engine
from sqlalchemy.sql.schema import SchemaItem
from onyx.configs.constants import SSL_CERT_FILE
from shared_configs.configs import (
MULTI_TENANT,
@@ -44,6 +45,8 @@ if config.config_file_name is not None and config.attributes.get(
target_metadata = [Base.metadata, ResultModelBase.metadata]
EXCLUDE_TABLES = {"kombu_queue", "kombu_message"}
logger = logging.getLogger(__name__)
ssl_context: ssl.SSLContext | None = None
@@ -53,6 +56,25 @@ if USE_IAM_AUTH:
ssl_context = ssl.create_default_context(cafile=SSL_CERT_FILE)
def include_object(
object: SchemaItem, # noqa: ARG001
name: str | None,
type_: Literal[
"schema",
"table",
"column",
"index",
"unique_constraint",
"foreign_key_constraint",
],
reflected: bool, # noqa: ARG001
compare_to: SchemaItem | None, # noqa: ARG001
) -> bool:
if type_ == "table" and name in EXCLUDE_TABLES:
return False
return True
def filter_tenants_by_range(
tenant_ids: list[str], start_range: int | None = None, end_range: int | None = None
) -> list[str]:
@@ -209,6 +231,7 @@ def do_run_migrations(
context.configure(
connection=connection,
target_metadata=target_metadata, # type: ignore
include_object=include_object,
version_table_schema=schema_name,
include_schemas=True,
compare_type=True,
@@ -382,6 +405,7 @@ def run_migrations_offline() -> None:
url=url,
target_metadata=target_metadata, # type: ignore
literal_binds=True,
include_object=include_object,
version_table_schema=schema,
include_schemas=True,
script_location=config.get_main_option("script_location"),
@@ -423,6 +447,7 @@ def run_migrations_offline() -> None:
url=url,
target_metadata=target_metadata, # type: ignore
literal_binds=True,
include_object=include_object,
version_table_schema=schema,
include_schemas=True,
script_location=config.get_main_option("script_location"),
@@ -465,6 +490,7 @@ def run_migrations_online() -> None:
context.configure(
connection=connection,
target_metadata=target_metadata, # type: ignore
include_object=include_object,
version_table_schema=schema_name,
include_schemas=True,
compare_type=True,

View File

@@ -1,9 +1,11 @@
import asyncio
from logging.config import fileConfig
from typing import Literal
from sqlalchemy import pool
from sqlalchemy.engine import Connection
from sqlalchemy.ext.asyncio import create_async_engine
from sqlalchemy.schema import SchemaItem
from alembic import context
from onyx.db.engine.sql_engine import build_connection_string
@@ -33,6 +35,27 @@ target_metadata = [PublicBase.metadata]
# my_important_option = config.get_main_option("my_important_option")
# ... etc.
EXCLUDE_TABLES = {"kombu_queue", "kombu_message"}
def include_object(
object: SchemaItem, # noqa: ARG001
name: str | None,
type_: Literal[
"schema",
"table",
"column",
"index",
"unique_constraint",
"foreign_key_constraint",
],
reflected: bool, # noqa: ARG001
compare_to: SchemaItem | None, # noqa: ARG001
) -> bool:
if type_ == "table" and name in EXCLUDE_TABLES:
return False
return True
def run_migrations_offline() -> None:
"""Run migrations in 'offline' mode.
@@ -62,6 +85,7 @@ def do_run_migrations(connection: Connection) -> None:
context.configure(
connection=connection,
target_metadata=target_metadata, # type: ignore[arg-type]
include_object=include_object,
)
with context.begin_transaction():

View File

@@ -5,7 +5,6 @@ from celery import Task
from celery.exceptions import SoftTimeLimitExceeded
from redis.lock import Lock as RedisLock
from ee.onyx.server.tenants.product_gating import get_gated_tenants
from onyx.background.celery.apps.app_base import task_logger
from onyx.background.celery.tasks.beat_schedule import BEAT_EXPIRES_DEFAULT
from onyx.configs.constants import CELERY_GENERIC_BEAT_LOCK_TIMEOUT
@@ -31,7 +30,6 @@ def cloud_beat_task_generator(
queue: str = OnyxCeleryTask.DEFAULT,
priority: int = OnyxCeleryPriority.MEDIUM,
expires: int = BEAT_EXPIRES_DEFAULT,
skip_gated: bool = True,
) -> bool | None:
"""a lightweight task used to kick off individual beat tasks per tenant."""
time_start = time.monotonic()
@@ -50,22 +48,20 @@ def cloud_beat_task_generator(
last_lock_time = time.monotonic()
tenant_ids: list[str] = []
num_processed_tenants = 0
num_skipped_gated = 0
try:
tenant_ids = get_all_tenant_ids()
# Per-task control over whether gated tenants are included. Most periodic tasks
# do no useful work on gated tenants and just waste DB connections fanning out
# to ~10k+ inactive tenants. A small number of cleanup tasks (connector deletion,
# checkpoint/index attempt cleanup) need to run on gated tenants and pass
# `skip_gated=False` from the beat schedule.
gated_tenants: set[str] = get_gated_tenants() if skip_gated else set()
# NOTE: for now, we are running tasks for gated tenants, since we want to allow
# connector deletion to run successfully. The new plan is to continuously prune
# the gated tenants set, so we won't have a build-up of old, unused gated tenants.
# Keeping this around in case we want to revert to the previous behavior.
# gated_tenants = get_gated_tenants()
for tenant_id in tenant_ids:
if tenant_id in gated_tenants:
num_skipped_gated += 1
continue
# Same comment here as the above NOTE
# if tenant_id in gated_tenants:
# continue
current_time = time.monotonic()
if current_time - last_lock_time >= (CELERY_GENERIC_BEAT_LOCK_TIMEOUT / 4):
@@ -108,7 +104,6 @@ def cloud_beat_task_generator(
f"cloud_beat_task_generator finished: "
f"task={task_name} "
f"num_processed_tenants={num_processed_tenants} "
f"num_skipped_gated={num_skipped_gated} "
f"num_tenants={len(tenant_ids)} "
f"elapsed={time_elapsed:.2f}"
)
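The fan-out loop this hunk touches — dispatch the task once per tenant, optionally skipping gated (inactive) tenants — can be sketched in isolation. Names here are simplified stand-ins for the real `cloud_beat_task_generator`:

```python
def fan_out(tenant_ids, gated_tenants, skip_gated, dispatch):
    """Per-tenant beat fan-out: call dispatch(tenant_id) for each tenant,
    skipping gated tenants when skip_gated is set."""
    num_processed = 0
    num_skipped = 0
    for tenant_id in tenant_ids:
        if skip_gated and tenant_id in gated_tenants:
            num_skipped += 1
            continue
        dispatch(tenant_id)
        num_processed += 1
    return num_processed, num_skipped
```
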

View File

@@ -27,13 +27,13 @@ from shared_configs.configs import MULTI_TENANT
from shared_configs.configs import TENANT_ID_PREFIX
# Maximum tenants to provision in a single task run.
# Each tenant takes ~80s (alembic migrations), so 15 tenants ≈ 20 minutes.
_MAX_TENANTS_PER_RUN = 15
# Each tenant takes ~80s (alembic migrations), so 5 tenants ≈ 7 minutes.
_MAX_TENANTS_PER_RUN = 5
# Time limits sized for worst-case: provisioning up to _MAX_TENANTS_PER_RUN new tenants
# (~90s each) plus migrating up to TARGET_AVAILABLE_TENANTS pool tenants (~90s each).
_TENANT_PROVISIONING_SOFT_TIME_LIMIT = 60 * 40 # 40 minutes
_TENANT_PROVISIONING_TIME_LIMIT = 60 * 45 # 45 minutes
_TENANT_PROVISIONING_SOFT_TIME_LIMIT = 60 * 20 # 20 minutes
_TENANT_PROVISIONING_TIME_LIMIT = 60 * 25 # 25 minutes
@shared_task(

View File

@@ -1,14 +1,20 @@
from datetime import datetime
from datetime import timezone
from uuid import UUID
from celery import shared_task
from celery import Task
from ee.onyx.background.celery_utils import should_perform_chat_ttl_check
from ee.onyx.background.task_name_builders import name_chat_ttl_task
from onyx.configs.app_configs import JOB_TIMEOUT
from onyx.configs.constants import OnyxCeleryTask
from onyx.db.chat import delete_chat_session
from onyx.db.chat import get_chat_sessions_older_than
from onyx.db.engine.sql_engine import get_session_with_current_tenant
from onyx.db.enums import TaskStatus
from onyx.db.tasks import mark_task_as_finished_with_id
from onyx.db.tasks import register_task
from onyx.server.settings.store import load_settings
from onyx.utils.logger import setup_logger
@@ -23,42 +29,59 @@ logger = setup_logger()
trail=False,
)
def perform_ttl_management_task(
self: Task, retention_limit_days: int, *, tenant_id: str # noqa: ARG001
self: Task, retention_limit_days: int, *, tenant_id: str
) -> None:
task_id = self.request.id
if not task_id:
raise RuntimeError("No task id defined for this task; cannot identify it")
start_time = datetime.now(tz=timezone.utc)
user_id: UUID | None = None
session_id: UUID | None = None
try:
with get_session_with_current_tenant() as db_session:
# we generally want to move off this, but keeping for now
register_task(
db_session=db_session,
task_name=name_chat_ttl_task(retention_limit_days, tenant_id),
task_id=task_id,
status=TaskStatus.STARTED,
start_time=start_time,
)
old_chat_sessions = get_chat_sessions_older_than(
retention_limit_days, db_session
)
for user_id, session_id in old_chat_sessions:
try:
with get_session_with_current_tenant() as db_session:
delete_chat_session(
user_id,
session_id,
db_session,
include_deleted=True,
hard_delete=True,
)
except Exception:
logger.exception(
"Failed to delete chat session "
f"user_id={user_id} session_id={session_id}, "
"continuing with remaining sessions"
# one session per delete so that we don't blow up if a deletion fails.
with get_session_with_current_tenant() as db_session:
delete_chat_session(
user_id,
session_id,
db_session,
include_deleted=True,
hard_delete=True,
)
with get_session_with_current_tenant() as db_session:
mark_task_as_finished_with_id(
db_session=db_session,
task_id=task_id,
success=True,
)
except Exception:
logger.exception(
f"delete_chat_session exceptioned. user_id={user_id} session_id={session_id}"
)
with get_session_with_current_tenant() as db_session:
mark_task_as_finished_with_id(
db_session=db_session,
task_id=task_id,
success=False,
)
raise
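The session-per-delete pattern adopted above can be reduced to a small sketch (the factory and delete helper are stand-ins for the real `get_session_with_current_tenant` and `delete_chat_session`):

```python
def delete_old_sessions(sessions, session_factory, delete_fn):
    """Delete each chat session in its own DB session/transaction, so a
    failure only affects the one in-flight delete, never prior work."""
    for user_id, session_id in sessions:
        with session_factory() as db_session:
            delete_fn(user_id, session_id, db_session)
```
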

View File

@@ -1,7 +1,6 @@
# Overview of Onyx Background Jobs
The background jobs take care of:
1. Pulling/Indexing documents (from connectors)
2. Updating document metadata (from connectors)
3. Cleaning up checkpoints and logic around indexing work (indexing checkpoints and index attempt metadata)
@@ -10,41 +9,37 @@ The background jobs take care of:
## Worker → Queue Mapping
| Worker | File | Queues |
| ------------------------- | ------------------------------ | -------------------------------------------------------------------------------------------------------------------- |
| Primary | `apps/primary.py` | `celery` |
| Light | `apps/light.py` | `vespa_metadata_sync`, `connector_deletion`, `doc_permissions_upsert`, `checkpoint_cleanup`, `index_attempt_cleanup` |
| Heavy | `apps/heavy.py` | `connector_pruning`, `connector_doc_permissions_sync`, `connector_external_group_sync`, `csv_generation`, `sandbox` |
| Docprocessing | `apps/docprocessing.py` | `docprocessing` |
| Docfetching | `apps/docfetching.py` | `connector_doc_fetching` |
| User File Processing | `apps/user_file_processing.py` | `user_file_processing`, `user_file_project_sync`, `user_file_delete` |
| Monitoring | `apps/monitoring.py` | `monitoring` |
| Background (consolidated) | `apps/background.py` | All queues above except `celery` |
| Worker | File | Queues |
|--------|------|--------|
| Primary | `apps/primary.py` | `celery` |
| Light | `apps/light.py` | `vespa_metadata_sync`, `connector_deletion`, `doc_permissions_upsert`, `checkpoint_cleanup`, `index_attempt_cleanup` |
| Heavy | `apps/heavy.py` | `connector_pruning`, `connector_doc_permissions_sync`, `connector_external_group_sync`, `csv_generation`, `sandbox` |
| Docprocessing | `apps/docprocessing.py` | `docprocessing` |
| Docfetching | `apps/docfetching.py` | `connector_doc_fetching` |
| User File Processing | `apps/user_file_processing.py` | `user_file_processing`, `user_file_project_sync`, `user_file_delete` |
| Monitoring | `apps/monitoring.py` | `monitoring` |
| Background (consolidated) | `apps/background.py` | All queues above except `celery` |
## Non-Worker Apps
| App | File | Purpose |
| ---------- | ----------- | ----------------------------------------------------------------------------------------------------- |
| **Beat** | `beat.py` | Celery beat scheduler with `DynamicTenantScheduler` that generates per-tenant periodic task schedules |
| **Client** | `client.py` | Minimal app for task submission from non-worker processes (e.g., API server) |
| App | File | Purpose |
|-----|------|---------|
| **Beat** | `beat.py` | Celery beat scheduler with `DynamicTenantScheduler` that generates per-tenant periodic task schedules |
| **Client** | `client.py` | Minimal app for task submission from non-worker processes (e.g., API server) |
### Shared Module
`app_base.py` provides:
- `TenantAwareTask` - Base task class that sets tenant context
- Signal handlers for logging, cleanup, and lifecycle events
- Readiness probes and health checks
## Worker Details
### Primary (Coordinator and task dispatcher)
It is the single worker that handles tasks from the default `celery` queue. Its singleton status is enforced by the `PRIMARY_WORKER` Redis lock,
which it touches every `CELERY_PRIMARY_WORKER_LOCK_TIMEOUT / 8` seconds (using Celery Bootsteps).
On startup:
- waits for redis, postgres, document index to all be healthy
- acquires the singleton lock
- cleans all the redis states associated with background jobs
@@ -52,34 +47,34 @@ On startup:
Then it cycles through its tasks as scheduled by Celery Beat:
| Task | Frequency | Description |
| --------------------------------- | --------- | ------------------------------------------------------------------------------------------ |
| `check_for_indexing` | 15s | Scans for connectors needing indexing → dispatches to `DOCFETCHING` queue |
| `check_for_vespa_sync_task` | 20s | Finds stale documents/document sets → dispatches sync tasks to `VESPA_METADATA_SYNC` queue |
| `check_for_pruning` | 20s | Finds connectors due for pruning → dispatches to `CONNECTOR_PRUNING` queue |
| `check_for_connector_deletion` | 20s | Processes deletion requests → dispatches to `CONNECTOR_DELETION` queue |
| `check_for_user_file_processing` | 20s | Checks for user uploads → dispatches to `USER_FILE_PROCESSING` queue |
| `check_for_checkpoint_cleanup` | 1h | Cleans up old indexing checkpoints |
| `check_for_index_attempt_cleanup` | 30m | Cleans up old index attempts |
| `celery_beat_heartbeat` | 1m | Heartbeat for Beat watchdog |
| Task | Frequency | Description |
|------|-----------|-------------|
| `check_for_indexing` | 15s | Scans for connectors needing indexing → dispatches to `DOCFETCHING` queue |
| `check_for_vespa_sync_task` | 20s | Finds stale documents/document sets → dispatches sync tasks to `VESPA_METADATA_SYNC` queue |
| `check_for_pruning` | 20s | Finds connectors due for pruning → dispatches to `CONNECTOR_PRUNING` queue |
| `check_for_connector_deletion` | 20s | Processes deletion requests → dispatches to `CONNECTOR_DELETION` queue |
| `check_for_user_file_processing` | 20s | Checks for user uploads → dispatches to `USER_FILE_PROCESSING` queue |
| `check_for_checkpoint_cleanup` | 1h | Cleans up old indexing checkpoints |
| `check_for_index_attempt_cleanup` | 30m | Cleans up old index attempts |
| `kombu_message_cleanup_task` | periodic | Cleans orphaned Kombu messages from DB (Kombu being the messaging framework used by Celery) |
| `celery_beat_heartbeat` | 1m | Heartbeat for Beat watchdog |
Watchdog is a separate Python process, managed by supervisord, that runs alongside the Celery workers. It checks the `ONYX_CELERY_BEAT_HEARTBEAT_KEY` in
Redis to ensure Celery Beat is not dead; Beat schedules `celery_beat_heartbeat` for Primary to touch the key and signal that it is still alive.
See `supervisord.conf` for the watchdog config.
### Light
### Light
Fast, short-lived tasks that are not resource intensive. High concurrency:
up to 24 concurrent workers, each with a prefetch of 8, for a total of 192 tasks in flight at once.
Tasks it handles:
- Syncs access/permissions, document sets, boosts, hidden state
- Deletes documents that are marked for deletion in Postgres
- Cleanup of checkpoints and index attempts
### Heavy
### Heavy
Long-running, resource-intensive tasks; handles pruning and sandbox operations. Low concurrency: max concurrency of 4 with a prefetch of 1.
Does not interact with the Document Index; it handles syncs with external systems, making large-volume API calls for pruning, fetching permissions, etc.
@@ -88,24 +83,16 @@ Generates CSV exports which may take a long time with significant data in Postgr
Sandbox (a new feature) for running Next.js, a Python virtual env, the OpenCode AI Agent, and accessing knowledge files
### Docprocessing, Docfetching, User File Processing
Docprocessing and Docfetching are for indexing documents:
- Docfetching runs connectors to pull documents from external APIs (Google Drive, Confluence, etc.), stores batches to file storage, and dispatches docprocessing tasks
- Docprocessing retrieves batches, runs the indexing pipeline (chunking, embedding), and indexes into the Document Index
- User Files come from uploads directly via the input bar
- Docprocessing retrieves batches, runs the indexing pipeline (chunking, embedding), and indexes into the Document Index
User Files come from uploads directly via the input bar
### Monitoring
Observability and metrics collection:
- Queue lengths, connector success/failure, connector latencies
- Queue lengths, connector success/failure, connector latencies
- Memory of supervisor managed processes (workers, beat, slack)
- Cloud- and multi-tenant-specific monitoring
## Prometheus Metrics
Workers can expose Prometheus metrics via a standalone HTTP server. Currently docfetching and docprocessing have push-based task lifecycle metrics; the monitoring worker runs pull-based collectors for queue depth and connector health.
For the full metric reference, integration guide, and PromQL examples, see [`docs/METRICS.md`](../../../docs/METRICS.md#celery-worker-metrics).

View File

@@ -13,6 +13,12 @@ from celery.signals import worker_shutdown
import onyx.background.celery.apps.app_base as app_base
from onyx.configs.constants import POSTGRES_CELERY_WORKER_HEAVY_APP_NAME
from onyx.db.engine.sql_engine import SqlEngine
from onyx.server.metrics.celery_task_metrics import on_celery_task_postrun
from onyx.server.metrics.celery_task_metrics import on_celery_task_prerun
from onyx.server.metrics.celery_task_metrics import on_celery_task_rejected
from onyx.server.metrics.celery_task_metrics import on_celery_task_retry
from onyx.server.metrics.celery_task_metrics import on_celery_task_revoked
from onyx.server.metrics.metrics_server import start_metrics_server
from onyx.utils.logger import setup_logger
from shared_configs.configs import MULTI_TENANT
@@ -34,6 +40,7 @@ def on_task_prerun(
**kwds: Any,
) -> None:
app_base.on_task_prerun(sender, task_id, task, args, kwargs, **kwds)
on_celery_task_prerun(task_id, task)
@signals.task_postrun.connect
@@ -48,6 +55,31 @@ def on_task_postrun(
**kwds: Any,
) -> None:
app_base.on_task_postrun(sender, task_id, task, args, kwargs, retval, state, **kwds)
on_celery_task_postrun(task_id, task, state)
@signals.task_retry.connect
def on_task_retry(sender: Any | None = None, **kwargs: Any) -> None: # noqa: ARG001
task_id = getattr(getattr(sender, "request", None), "id", None)
on_celery_task_retry(task_id, sender)
@signals.task_revoked.connect
def on_task_revoked(sender: Any | None = None, **kwargs: Any) -> None:
task_name = getattr(sender, "name", None) or str(sender)
on_celery_task_revoked(kwargs.get("task_id"), task_name)
@signals.task_rejected.connect
def on_task_rejected(sender: Any | None = None, **kwargs: Any) -> None: # noqa: ARG001
message = kwargs.get("message")
task_name: str | None = None
if message is not None:
headers = getattr(message, "headers", None) or {}
task_name = headers.get("task")
if task_name is None:
task_name = "unknown"
on_celery_task_rejected(None, task_name)
@celeryd_init.connect
@@ -76,6 +108,7 @@ def on_worker_init(sender: Worker, **kwargs: Any) -> None:
@worker_ready.connect
def on_worker_ready(sender: Any, **kwargs: Any) -> None:
start_metrics_server("heavy")
app_base.on_worker_ready(sender, **kwargs)
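The prerun/postrun pairing these signal handlers feed can be sketched without Celery or Prometheus. This is a simplified stand-in for the real `celery_task_metrics` module: prerun records a start time keyed by task ID, postrun computes the duration and bumps a per-state counter.

```python
import time

class TaskMetrics:
    """Minimal push-based task lifecycle metrics (illustrative stand-in)."""

    def __init__(self):
        self.start_times = {}   # task_id -> monotonic start time
        self.state_counts = {}  # (task_name, state) -> count
        self.durations = []     # (task_name, elapsed seconds)

    def on_prerun(self, task_id, task_name):
        self.start_times[task_id] = time.monotonic()

    def on_postrun(self, task_id, task_name, state):
        started = self.start_times.pop(task_id, None)
        if started is not None:
            self.durations.append((task_name, time.monotonic() - started))
        key = (task_name, state)
        self.state_counts[key] = self.state_counts.get(key, 0) + 1
```

A real worker would export `state_counts` as a Counter and `durations` as a Histogram via the metrics HTTP server started in `on_worker_ready`.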

View File

@@ -317,6 +317,7 @@ celery_app.autodiscover_tasks(
"onyx.background.celery.tasks.docprocessing",
"onyx.background.celery.tasks.evals",
"onyx.background.celery.tasks.hierarchyfetching",
"onyx.background.celery.tasks.periodic",
"onyx.background.celery.tasks.pruning",
"onyx.background.celery.tasks.shared",
"onyx.background.celery.tasks.vespa",

View File

@@ -75,8 +75,6 @@ beat_task_templates: list[dict] = [
"options": {
"priority": OnyxCeleryPriority.LOW,
"expires": BEAT_EXPIRES_DEFAULT,
# Run on gated tenants too — they may still have stale checkpoints to clean.
"skip_gated": False,
},
},
{
@@ -86,8 +84,6 @@ beat_task_templates: list[dict] = [
"options": {
"priority": OnyxCeleryPriority.MEDIUM,
"expires": BEAT_EXPIRES_DEFAULT,
# Run on gated tenants too — they may still have stale index attempts.
"skip_gated": False,
},
},
{
@@ -97,8 +93,6 @@ beat_task_templates: list[dict] = [
"options": {
"priority": OnyxCeleryPriority.MEDIUM,
"expires": BEAT_EXPIRES_DEFAULT,
# Gated tenants may still have connectors awaiting deletion.
"skip_gated": False,
},
},
{
@@ -272,7 +266,7 @@ def make_cloud_generator_task(task: dict[str, Any]) -> dict[str, Any]:
cloud_task["kwargs"] = {}
cloud_task["kwargs"]["task_name"] = task["task"]
optional_fields = ["queue", "priority", "expires", "skip_gated"]
optional_fields = ["queue", "priority", "expires"]
for field in optional_fields:
if field in task["options"]:
cloud_task["kwargs"][field] = task["options"][field]
@@ -308,7 +302,7 @@ beat_cloud_tasks: list[dict] = [
{
"name": f"{ONYX_CLOUD_CELERY_TASK_PREFIX}_check-available-tenants",
"task": OnyxCeleryTask.CLOUD_CHECK_AVAILABLE_TENANTS,
"schedule": timedelta(minutes=2),
"schedule": timedelta(minutes=10),
"options": {
"queue": OnyxCeleryQueues.MONITORING,
"priority": OnyxCeleryPriority.HIGH,
@@ -365,13 +359,7 @@ if not MULTI_TENANT:
]
)
# `skip_gated` is a cloud-only hint consumed by `cloud_beat_task_generator`. Strip
# it before extending the self-hosted schedule so it doesn't leak into apply_async
# as an unrecognised option on every fired task message.
for _template in beat_task_templates:
_self_hosted_template = copy.deepcopy(_template)
_self_hosted_template["options"].pop("skip_gated", None)
tasks_to_schedule.append(_self_hosted_template)
tasks_to_schedule.extend(beat_task_templates)
def generate_cloud_tasks(

View File

@@ -36,7 +36,6 @@ from onyx.configs.constants import OnyxRedisLocks
from onyx.db.engine.sql_engine import get_session_with_current_tenant
from onyx.db.opensearch_migration import build_sanitized_to_original_doc_id_mapping
from onyx.db.opensearch_migration import get_vespa_visit_state
from onyx.db.opensearch_migration import is_migration_completed
from onyx.db.opensearch_migration import (
mark_migration_completed_time_if_not_set_with_commit,
)
@@ -107,19 +106,14 @@ def migrate_chunks_from_vespa_to_opensearch_task(
acquired; effectively a no-op. True if the task completed
successfully. False if the task errored.
"""
# 1. Check if we should run the task.
# 1.a. If OpenSearch indexing is disabled, we don't run the task.
if not ENABLE_OPENSEARCH_INDEXING_FOR_ONYX:
task_logger.warning(
"OpenSearch migration is not enabled, skipping chunk migration task."
)
return None
task_logger.info("Starting chunk-level migration from Vespa to OpenSearch.")
task_start_time = time.monotonic()
# 1.b. Only one instance per tenant of this task may run concurrently at
# once. If we fail to acquire a lock, we assume it is because another task
# has one and we exit.
r = get_redis_client()
lock: RedisLock = r.lock(
name=OnyxRedisLocks.OPENSEARCH_MIGRATION_BEAT_LOCK,
@@ -142,11 +136,10 @@ def migrate_chunks_from_vespa_to_opensearch_task(
f"Token: {lock.local.token}"
)
# 2. Prepare to migrate.
total_chunks_migrated_this_task = 0
total_chunks_errored_this_task = 0
try:
# 2.a. Double-check that tenant info is correct.
# Double check that tenant info is correct.
if tenant_id != get_current_tenant_id():
err_str = (
f"Tenant ID mismatch in the OpenSearch migration task: "
@@ -155,62 +148,16 @@ def migrate_chunks_from_vespa_to_opensearch_task(
task_logger.error(err_str)
return False
# Do as much as we can with a DB session in one spot to not hold a
# session during a migration batch.
with get_session_with_current_tenant() as db_session:
# 2.b. Immediately check to see if this tenant is done, to save
# having to do any other work. This function does not require a
# migration record to necessarily exist.
if is_migration_completed(db_session):
return True
# 2.c. Try to insert the OpenSearchTenantMigrationRecord table if it
# does not exist.
with (
get_session_with_current_tenant() as db_session,
get_vespa_http_client(
timeout=VESPA_MIGRATION_REQUEST_TIMEOUT_S
) as vespa_client,
):
try_insert_opensearch_tenant_migration_record_with_commit(db_session)
# 2.d. Get search settings.
search_settings = get_current_search_settings(db_session)
indexing_setting = IndexingSetting.from_db_model(search_settings)
# 2.e. Build sanitized to original doc ID mapping to check for
# conflicts in the event we sanitize a doc ID to an
# already-existing doc ID.
# We reconstruct this mapping for every task invocation because
# a document may have been added in the time between two tasks.
sanitized_doc_start_time = time.monotonic()
sanitized_to_original_doc_id_mapping = (
build_sanitized_to_original_doc_id_mapping(db_session)
)
task_logger.debug(
f"Built sanitized_to_original_doc_id_mapping with {len(sanitized_to_original_doc_id_mapping)} entries "
f"in {time.monotonic() - sanitized_doc_start_time:.3f} seconds."
)
# 2.f. Get the current migration state.
continuation_token_map, total_chunks_migrated = get_vespa_visit_state(
db_session
)
# 2.f.1. Double-check that the migration state does not imply
# completion. Really we should never have to enter this block as we
# would expect is_migration_completed to return True, but in the
# strange event that the migration is complete but the migration
# completed time was never stamped, we stamp it here.
if is_continuation_token_done_for_all_slices(continuation_token_map):
task_logger.info(
f"OpenSearch migration COMPLETED for tenant {tenant_id}. Total chunks migrated: {total_chunks_migrated}."
)
mark_migration_completed_time_if_not_set_with_commit(db_session)
return True
task_logger.debug(
f"Read the tenant migration record. Total chunks migrated: {total_chunks_migrated}. "
f"Continuation token map: {continuation_token_map}"
)
with get_vespa_http_client(
timeout=VESPA_MIGRATION_REQUEST_TIMEOUT_S
) as vespa_client:
# 2.g. Create the OpenSearch and Vespa document indexes.
tenant_state = TenantState(tenant_id=tenant_id, multitenant=MULTI_TENANT)
indexing_setting = IndexingSetting.from_db_model(search_settings)
opensearch_document_index = OpenSearchDocumentIndex(
tenant_state=tenant_state,
index_name=search_settings.index_name,
@@ -224,14 +171,22 @@ def migrate_chunks_from_vespa_to_opensearch_task(
httpx_client=vespa_client,
)
# 2.h. Get the approximate chunk count in Vespa as of this time to
# update the migration record.
sanitized_doc_start_time = time.monotonic()
# We reconstruct this mapping for every task invocation because a
# document may have been added in the time between two tasks.
sanitized_to_original_doc_id_mapping = (
build_sanitized_to_original_doc_id_mapping(db_session)
)
task_logger.debug(
f"Built sanitized_to_original_doc_id_mapping with {len(sanitized_to_original_doc_id_mapping)} entries "
f"in {time.monotonic() - sanitized_doc_start_time:.3f} seconds."
)
approx_chunk_count_in_vespa: int | None = None
get_chunk_count_start_time = time.monotonic()
try:
approx_chunk_count_in_vespa = vespa_document_index.get_chunk_count()
except Exception:
# This failure should not be blocking.
task_logger.exception(
"Error getting approximate chunk count in Vespa. Moving on..."
)
@@ -240,12 +195,25 @@ def migrate_chunks_from_vespa_to_opensearch_task(
f"approximate chunk count in Vespa. Got {approx_chunk_count_in_vespa}."
)
# 3. Do the actual migration in batches until we run out of time.
while (
time.monotonic() - task_start_time < MIGRATION_TASK_SOFT_TIME_LIMIT_S
and lock.owned()
):
# 3.a. Get the next batch of raw chunks from Vespa.
(
continuation_token_map,
total_chunks_migrated,
) = get_vespa_visit_state(db_session)
if is_continuation_token_done_for_all_slices(continuation_token_map):
task_logger.info(
f"OpenSearch migration COMPLETED for tenant {tenant_id}. Total chunks migrated: {total_chunks_migrated}."
)
mark_migration_completed_time_if_not_set_with_commit(db_session)
break
task_logger.debug(
f"Read the tenant migration record. Total chunks migrated: {total_chunks_migrated}. "
f"Continuation token map: {continuation_token_map}"
)
get_vespa_chunks_start_time = time.monotonic()
raw_vespa_chunks, next_continuation_token_map = (
vespa_document_index.get_all_raw_document_chunks_paginated(
@@ -258,7 +226,6 @@ def migrate_chunks_from_vespa_to_opensearch_task(
f"seconds. Next continuation token map: {next_continuation_token_map}"
)
# 3.b. Transform the raw chunks to OpenSearch chunks in memory.
opensearch_document_chunks, errored_chunks = (
transform_vespa_chunks_to_opensearch_chunks(
raw_vespa_chunks,
@@ -273,7 +240,6 @@ def migrate_chunks_from_vespa_to_opensearch_task(
"errored."
)
# 3.c. Index the OpenSearch chunks into OpenSearch.
index_opensearch_chunks_start_time = time.monotonic()
opensearch_document_index.index_raw_chunks(
chunks=opensearch_document_chunks
@@ -285,38 +251,12 @@ def migrate_chunks_from_vespa_to_opensearch_task(
total_chunks_migrated_this_task += len(opensearch_document_chunks)
total_chunks_errored_this_task += len(errored_chunks)
# Do as much as we can with a DB session in one spot to not hold a
# session during a migration batch.
with get_session_with_current_tenant() as db_session:
# 3.d. Update the migration state.
update_vespa_visit_progress_with_commit(
db_session,
continuation_token_map=next_continuation_token_map,
chunks_processed=len(opensearch_document_chunks),
chunks_errored=len(errored_chunks),
approx_chunk_count_in_vespa=approx_chunk_count_in_vespa,
)
# 3.e. Get the current migration state. Even though we
# technically have it in-memory since we just wrote it, we
# want to reference the DB as the source of truth at all
# times.
continuation_token_map, total_chunks_migrated = (
get_vespa_visit_state(db_session)
)
# 3.e.1. Check if the migration is done.
if is_continuation_token_done_for_all_slices(
continuation_token_map
):
task_logger.info(
f"OpenSearch migration COMPLETED for tenant {tenant_id}. Total chunks migrated: {total_chunks_migrated}."
)
mark_migration_completed_time_if_not_set_with_commit(db_session)
return True
task_logger.debug(
f"Read the tenant migration record. Total chunks migrated: {total_chunks_migrated}. "
f"Continuation token map: {continuation_token_map}"
update_vespa_visit_progress_with_commit(
db_session,
continuation_token_map=next_continuation_token_map,
chunks_processed=len(opensearch_document_chunks),
chunks_errored=len(errored_chunks),
approx_chunk_count_in_vespa=approx_chunk_count_in_vespa,
)
except Exception:
traceback.print_exc()
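The migration loop above runs batches only while two conditions hold: the soft time budget (`MIGRATION_TASK_SOFT_TIME_LIMIT_S`) has not been exhausted and the Redis lock is still owned. That control flow can be sketched in isolation; this is a minimal illustration with hypothetical names (`fetch_batch`, `process_batch`), not the real task:

```python
import time

def run_batches_within_budget(fetch_batch, process_batch, soft_limit_s: float) -> int:
    """Process batches until the time budget is exhausted or batches run out."""
    start = time.monotonic()
    processed = 0
    while time.monotonic() - start < soft_limit_s:
        batch = fetch_batch()
        if not batch:
            # No more work: stop early rather than burning the remaining budget.
            break
        process_batch(batch)
        processed += len(batch)
    return processed

# Simulate three fetches: two real batches, then an empty one signalling "done".
batches = iter([[1, 2, 3], [4, 5], []])
total = run_batches_within_budget(lambda: next(batches), lambda b: None, soft_limit_s=5.0)
print(total)  # 5
```

The real task additionally checks `lock.owned()` each iteration so a lost lock ends the run as promptly as an exhausted budget.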

View File

@@ -0,0 +1,138 @@
#####
# Periodic Tasks
#####
import json
from typing import Any
from celery import shared_task
from celery.contrib.abortable import AbortableTask # type: ignore
from celery.exceptions import TaskRevokedError
from sqlalchemy import inspect
from sqlalchemy import text
from sqlalchemy.orm import Session
from onyx.background.celery.apps.app_base import task_logger
from onyx.configs.app_configs import JOB_TIMEOUT
from onyx.configs.constants import OnyxCeleryTask
from onyx.configs.constants import PostgresAdvisoryLocks
from onyx.db.engine.sql_engine import get_session_with_current_tenant
@shared_task(
name=OnyxCeleryTask.KOMBU_MESSAGE_CLEANUP_TASK,
soft_time_limit=JOB_TIMEOUT,
bind=True,
base=AbortableTask,
)
def kombu_message_cleanup_task(self: Any, tenant_id: str) -> int: # noqa: ARG001
"""Runs periodically to clean up the kombu_message table"""
# we will select messages older than this amount to clean up
KOMBU_MESSAGE_CLEANUP_AGE = 7 # days
KOMBU_MESSAGE_CLEANUP_PAGE_LIMIT = 1000
ctx = {}
ctx["last_processed_id"] = 0
ctx["deleted"] = 0
ctx["cleanup_age"] = KOMBU_MESSAGE_CLEANUP_AGE
ctx["page_limit"] = KOMBU_MESSAGE_CLEANUP_PAGE_LIMIT
with get_session_with_current_tenant() as db_session:
# Exit the task if we can't take the advisory lock
result = db_session.execute(
text("SELECT pg_try_advisory_lock(:id)"),
{"id": PostgresAdvisoryLocks.KOMBU_MESSAGE_CLEANUP_LOCK_ID.value},
).scalar()
if not result:
return 0
while True:
if self.is_aborted():
raise TaskRevokedError("kombu_message_cleanup_task was aborted.")
b = kombu_message_cleanup_task_helper(ctx, db_session)
if not b:
break
db_session.commit()
if ctx["deleted"] > 0:
task_logger.info(
f"Deleted {ctx['deleted']} orphaned messages from kombu_message."
)
return ctx["deleted"]
def kombu_message_cleanup_task_helper(ctx: dict, db_session: Session) -> bool:
"""
Helper function to clean up old messages from the `kombu_message` table that are no longer relevant.
This function retrieves messages from the `kombu_message` table that are no longer visible and
older than a specified interval. It checks if the corresponding task_id exists in the
`celery_taskmeta` table. If the task_id does not exist, the message is deleted.
Args:
ctx (dict): A context dictionary containing configuration parameters such as:
- 'cleanup_age' (int): The age in days after which messages are considered old.
- 'page_limit' (int): The maximum number of messages to process in one batch.
- 'last_processed_id' (int): The ID of the last processed message to handle pagination.
- 'deleted' (int): A counter to track the number of deleted messages.
db_session (Session): The SQLAlchemy database session for executing queries.
Returns:
bool: Returns True if there are more rows to process, False if not.
"""
inspector = inspect(db_session.bind)
if not inspector:
return False
# With the move to redis as celery's broker and backend, kombu tables may not even exist.
# We can fail silently.
if not inspector.has_table("kombu_message"):
return False
query = text(
"""
SELECT id, timestamp, payload
FROM kombu_message WHERE visible = 'false'
AND timestamp < CURRENT_TIMESTAMP - INTERVAL :interval_days
AND id > :last_processed_id
ORDER BY id
LIMIT :page_limit
"""
)
kombu_messages = db_session.execute(
query,
{
"interval_days": f"{ctx['cleanup_age']} days",
"page_limit": ctx["page_limit"],
"last_processed_id": ctx["last_processed_id"],
},
).fetchall()
if len(kombu_messages) == 0:
return False
for msg in kombu_messages:
payload = json.loads(msg[2])
task_id = payload["headers"]["id"]
# Check if task_id exists in celery_taskmeta
task_exists = db_session.execute(
text("SELECT 1 FROM celery_taskmeta WHERE task_id = :task_id"),
{"task_id": task_id},
).fetchone()
# If task_id does not exist, delete the message
if not task_exists:
result = db_session.execute(
text("DELETE FROM kombu_message WHERE id = :message_id"),
{"message_id": msg[0]},
)
if result.rowcount > 0: # type: ignore
ctx["deleted"] += 1
ctx["last_processed_id"] = msg[0]
return True
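The helper above pages through `kombu_message` with keyset pagination (`id > :last_processed_id ... ORDER BY id LIMIT :page_limit`), carrying `last_processed_id` in `ctx` so each batch resumes where the previous one stopped. A self-contained sketch of the same pattern against an in-memory SQLite table (the table and deletion rule here are illustrative, not the real kombu schema or the celery_taskmeta check):

```python
import sqlite3

def cleanup_batch(conn: sqlite3.Connection, ctx: dict) -> bool:
    """Delete one page of rows; return True if more rows may remain."""
    rows = conn.execute(
        "SELECT id FROM messages WHERE id > ? ORDER BY id LIMIT ?",
        (ctx["last_processed_id"], ctx["page_limit"]),
    ).fetchall()
    if not rows:
        return False
    for (row_id,) in rows:
        conn.execute("DELETE FROM messages WHERE id = ?", (row_id,))
        ctx["deleted"] += 1
        # Advance the keyset cursor so the next page starts after this row.
        ctx["last_processed_id"] = row_id
    return True

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO messages (id) VALUES (?)", [(i,) for i in range(1, 26)])

ctx = {"last_processed_id": 0, "deleted": 0, "page_limit": 10}
while cleanup_batch(conn, ctx):
    pass
print(ctx["deleted"])  # 25
```

Keyset pagination avoids the `OFFSET` pitfall where concurrent deletes shift row positions between pages.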

View File

@@ -217,7 +217,7 @@ def check_for_pruning(self: Task, *, tenant_id: str) -> bool | None:
try:
# the entire task needs to run frequently in order to finalize pruning
# but pruning only kicks off once per hour
# but pruning only kicks off once per min
if not r.exists(OnyxRedisSignals.BLOCK_PRUNING):
task_logger.info("Checking for pruning due")

View File

@@ -996,7 +996,6 @@ def _run_models(
def _run_model(model_idx: int) -> None:
"""Run one LLM loop inside a worker thread, writing packets to ``merged_queue``."""
model_emitter = Emitter(
model_idx=model_idx,
merged_queue=merged_queue,
@@ -1103,33 +1102,33 @@ def _run_models(
finally:
merged_queue.put((model_idx, _MODEL_DONE))
def _save_errored_message(model_idx: int, context: str) -> None:
"""Save an error message to a reserved ChatMessage that failed during execution."""
def _delete_orphaned_message(model_idx: int, context: str) -> None:
"""Delete a reserved ChatMessage that was never populated due to a model error."""
try:
msg = db_session.get(ChatMessage, setup.reserved_messages[model_idx].id)
if msg is not None:
error_text = f"Error from {setup.model_display_names[model_idx]}: model encountered an error during generation."
msg.message = error_text
msg.error = error_text
orphaned = db_session.get(
ChatMessage, setup.reserved_messages[model_idx].id
)
if orphaned is not None:
db_session.delete(orphaned)
db_session.commit()
except Exception:
logger.exception(
"%s error save failed for model %d (%s)",
"%s orphan cleanup failed for model %d (%s)",
context,
model_idx,
setup.model_display_names[model_idx],
)
# Each worker thread needs its own Context copy — a single Context object
# cannot be entered concurrently by multiple threads (RuntimeError).
# Copy contextvars before submitting futures — ThreadPoolExecutor does NOT
# auto-propagate contextvars in Python 3.11; threads would inherit a blank context.
worker_context = contextvars.copy_context()
executor = ThreadPoolExecutor(
max_workers=n_models, thread_name_prefix="multi-model"
)
completion_persisted: bool = False
try:
for i in range(n_models):
ctx = contextvars.copy_context()
executor.submit(ctx.run, _run_model, i)
executor.submit(worker_context.run, _run_model, i)
# ── Main thread: merge and yield packets ────────────────────────────
models_remaining = n_models
@@ -1146,7 +1145,7 @@ def _run_models(
# save "stopped by user" for a model that actually threw an exception.
for i in range(n_models):
if model_errored[i]:
_save_errored_message(i, "stop-button")
_delete_orphaned_message(i, "stop-button")
continue
try:
succeeded = model_succeeded[i]
@@ -1212,7 +1211,7 @@ def _run_models(
for i in range(n_models):
if not model_succeeded[i]:
# Model errored — delete its orphaned reserved message.
_save_errored_message(i, "normal")
_delete_orphaned_message(i, "normal")
continue
try:
llm_loop_completion_handle(
@@ -1265,7 +1264,7 @@ def _run_models(
setup.model_display_names[i],
)
elif model_errored[i]:
_save_errored_message(i, "disconnect")
_delete_orphaned_message(i, "disconnect")
# 4. Drain buffered packets from memory — no consumer is running.
while not merged_queue.empty():
try:

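The comment in the diff above notes that `ThreadPoolExecutor` does not propagate contextvars: a worker thread starts with an empty context, so anything set in the submitting thread is invisible unless the caller snapshots its context with `contextvars.copy_context()` and submits through `ctx.run`. A minimal demonstration (variable names are illustrative):

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

request_id: contextvars.ContextVar[str] = contextvars.ContextVar("request_id", default="unset")
request_id.set("req-42")

def read_request_id() -> str:
    return request_id.get()

with ThreadPoolExecutor(max_workers=1, thread_name_prefix="demo") as pool:
    # A new thread starts with an empty context: the var falls back to its default.
    naive = pool.submit(read_request_id).result()
    # copy_context() snapshots the caller's context; ctx.run executes inside it.
    ctx = contextvars.copy_context()
    propagated = pool.submit(ctx.run, read_request_id).result()

print(naive)       # unset
print(propagated)  # req-42
```

Note that a single `Context` object cannot be entered by two threads at once, which is why the diff's two variants differ on whether one shared copy or a per-worker copy is taken.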
View File

@@ -379,14 +379,6 @@ POSTGRES_HOST = os.environ.get("POSTGRES_HOST") or "127.0.0.1"
POSTGRES_PORT = os.environ.get("POSTGRES_PORT") or "5432"
POSTGRES_DB = os.environ.get("POSTGRES_DB") or "postgres"
AWS_REGION_NAME = os.environ.get("AWS_REGION_NAME") or "us-east-2"
# Comma-separated replica / multi-host list. If unset, defaults to POSTGRES_HOST
# only.
_POSTGRES_HOSTS_STR = os.environ.get("POSTGRES_HOSTS", "").strip()
POSTGRES_HOSTS: list[str] = (
[h.strip() for h in _POSTGRES_HOSTS_STR.split(",") if h.strip()]
if _POSTGRES_HOSTS_STR
else [POSTGRES_HOST]
)
POSTGRES_API_SERVER_POOL_SIZE = int(
os.environ.get("POSTGRES_API_SERVER_POOL_SIZE") or 40

View File

@@ -12,11 +12,6 @@ SLACK_USER_TOKEN_PREFIX = "xoxp-"
SLACK_BOT_TOKEN_PREFIX = "xoxb-"
ONYX_EMAILABLE_LOGO_MAX_DIM = 512
# The mask_string() function in encryption.py uses "•" (U+2022 BULLET) to mask secrets.
MASK_CREDENTIAL_CHAR = "\u2022"
# Pattern produced by mask_string for strings >= 14 chars: "abcd...wxyz" (exactly 11 chars)
MASK_CREDENTIAL_LONG_RE = re.compile(r"^.{4}\.{3}.{4}$")
SOURCE_TYPE = "source_type"
# stored in the `metadata` of a chunk. Used to signify that this chunk should
# not be used for QA. For example, Google Drive file types which can't be parsed
@@ -396,6 +391,10 @@ class MilestoneRecordType(str, Enum):
REQUESTED_CONNECTOR = "requested_connector"
class PostgresAdvisoryLocks(Enum):
KOMBU_MESSAGE_CLEANUP_LOCK_ID = auto()
class OnyxCeleryQueues:
# "celery" is the default queue defined by celery and also the queue
# we are running in the primary worker to run system tasks
@@ -578,6 +577,7 @@ class OnyxCeleryTask:
MONITOR_PROCESS_MEMORY = "monitor_process_memory"
CELERY_BEAT_HEARTBEAT = "celery_beat_heartbeat"
KOMBU_MESSAGE_CLEANUP_TASK = "kombu_message_cleanup_task"
CONNECTOR_PERMISSION_SYNC_GENERATOR_TASK = (
"connector_permission_sync_generator_task"
)

View File

@@ -44,7 +44,7 @@ _NOTION_CALL_TIMEOUT = 30 # 30 seconds
_MAX_PAGES = 1000
# TODO: Pages need to have their metadata ingested
# TODO: Tables need to be ingested, Pages need to have their metadata ingested
class NotionPage(BaseModel):
@@ -452,19 +452,6 @@ class NotionConnector(LoadConnector, PollConnector):
sub_inner_dict: dict[str, Any] | list[Any] | str = inner_dict
while isinstance(sub_inner_dict, dict) and "type" in sub_inner_dict:
type_name = sub_inner_dict["type"]
# Notion user objects (people properties, created_by, etc.) have
# "name" at the same level as "type": "person"/"bot". If we drill
# into the person/bot sub-dict we lose the name. Capture it here
# before descending, but skip "title"-type properties where "name"
# is not the display value we want.
if (
"name" in sub_inner_dict
and isinstance(sub_inner_dict["name"], str)
and type_name not in ("title",)
):
return sub_inner_dict["name"]
sub_inner_dict = sub_inner_dict[type_name]
# If the innermost layer is None, the value is not set
@@ -676,19 +663,6 @@ class NotionConnector(LoadConnector, PollConnector):
text = rich_text["text"]["content"]
cur_result_text_arr.append(text)
# table_row blocks store content in "cells" (list of lists
# of rich text objects) rather than "rich_text"
if "cells" in result_obj:
row_cells: list[str] = []
for cell in result_obj["cells"]:
cell_texts = [
rt.get("plain_text", "")
for rt in cell
if isinstance(rt, dict)
]
row_cells.append(" ".join(cell_texts))
cur_result_text_arr.append("\t".join(row_cells))
if result["has_children"]:
if result_type == "child_page":
# Child pages will not be included at this top level, it will be a separate document.

View File

@@ -190,23 +190,16 @@ def delete_messages_and_files_from_chat_session(
chat_session_id: UUID, db_session: Session
) -> None:
# Select messages older than cutoff_time with files
messages_with_files = (
db_session.execute(
select(ChatMessage.id, ChatMessage.files).where(
ChatMessage.chat_session_id == chat_session_id,
)
messages_with_files = db_session.execute(
select(ChatMessage.id, ChatMessage.files).where(
ChatMessage.chat_session_id == chat_session_id,
)
.tuples()
.all()
)
).fetchall()
file_store = get_default_file_store()
for _, files in messages_with_files:
file_store = get_default_file_store()
for file_info in files or []:
if file_info.get("user_file_id"):
# user files are managed by the user file lifecycle
continue
file_store.delete_file(file_id=file_info["id"], error_on_missing=False)
file_store.delete_file(file_id=file_info.get("id"))
# Delete ChatMessage records - CASCADE constraints will automatically handle:
# - ChatMessage__StandardAnswer relationship records

View File

@@ -8,8 +8,6 @@ from sqlalchemy.orm import selectinload
from sqlalchemy.orm import Session
from onyx.configs.constants import FederatedConnectorSource
from onyx.configs.constants import MASK_CREDENTIAL_CHAR
from onyx.configs.constants import MASK_CREDENTIAL_LONG_RE
from onyx.db.engine.sql_engine import get_session_with_current_tenant
from onyx.db.models import DocumentSet
from onyx.db.models import FederatedConnector
@@ -47,23 +45,6 @@ def fetch_all_federated_connectors_parallel() -> list[FederatedConnector]:
return fetch_all_federated_connectors(db_session)
def _reject_masked_credentials(credentials: dict[str, Any]) -> None:
"""Raise if any credential string value contains mask placeholder characters.
mask_string() has two output formats:
- Short strings (< 14 chars): "••••••••••••" (U+2022 BULLET)
- Long strings (>= 14 chars): "abcd...wxyz" (first4 + "..." + last4)
Both must be rejected.
"""
for key, val in credentials.items():
if isinstance(val, str) and (
MASK_CREDENTIAL_CHAR in val or MASK_CREDENTIAL_LONG_RE.match(val)
):
raise ValueError(
f"Credential field '{key}' contains masked placeholder characters. Please provide the actual credential value."
)
def validate_federated_connector_credentials(
source: FederatedConnectorSource,
credentials: dict[str, Any],
@@ -85,8 +66,6 @@ def create_federated_connector(
config: dict[str, Any] | None = None,
) -> FederatedConnector:
"""Create a new federated connector with credential and config validation."""
_reject_masked_credentials(credentials)
# Validate credentials before creating
if not validate_federated_connector_credentials(source, credentials):
raise ValueError(
@@ -298,8 +277,6 @@ def update_federated_connector(
)
if credentials is not None:
_reject_masked_credentials(credentials)
# Validate credentials before updating
if not validate_federated_connector_credentials(
federated_connector.source, credentials

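The two mask formats that `_reject_masked_credentials` guards against can be checked in isolation. This sketch copies the constant and regex from the diff above and assumes `mask_string` behaves as the comments describe (all bullets for short secrets, `first4 + "..." + last4` for long ones):

```python
import re

MASK_CREDENTIAL_CHAR = "\u2022"  # U+2022 BULLET, used for short masked strings
# Long-string mask: first 4 chars + "..." + last 4 chars, exactly 11 chars total.
MASK_CREDENTIAL_LONG_RE = re.compile(r"^.{4}\.{3}.{4}$")

def looks_masked(value: str) -> bool:
    """Return True if a credential value looks like a mask placeholder."""
    return MASK_CREDENTIAL_CHAR in value or bool(MASK_CREDENTIAL_LONG_RE.match(value))

print(looks_masked("\u2022" * 12))       # True  (short-form mask)
print(looks_masked("sk-a...wxyz"))       # True  (long-form mask, 11 chars)
print(looks_masked("xoxb-real-token"))   # False (a plausible real token)
```

Rejecting these up front prevents a round-tripped masked value from silently overwriting a stored credential.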
View File

@@ -236,15 +236,14 @@ def upsert_llm_provider(
db_session.add(existing_llm_provider)
# Filter out empty strings and None values from custom_config to allow
# providers like Bedrock to fall back to IAM roles when credentials are not provided.
# NOTE: An empty dict ({}) is preserved as-is — it signals that the provider was
# created via the custom modal and must be reopened with CustomModal, not a
# provider-specific modal. Only None means "no custom config at all".
# providers like Bedrock to fall back to IAM roles when credentials are not provided
custom_config = llm_provider_upsert_request.custom_config
if custom_config:
custom_config = {
k: v for k, v in custom_config.items() if v is not None and v.strip() != ""
}
# Set to None if the dict is empty after filtering
custom_config = custom_config or None
api_base = llm_provider_upsert_request.api_base or None
existing_llm_provider.provider = llm_provider_upsert_request.provider
@@ -304,7 +303,16 @@ def upsert_llm_provider(
).delete(synchronize_session="fetch")
db_session.flush()
# Import here to avoid circular imports
from onyx.llm.utils import get_max_input_tokens
for model_config in llm_provider_upsert_request.model_configurations:
max_input_tokens = model_config.max_input_tokens
if max_input_tokens is None:
max_input_tokens = get_max_input_tokens(
model_name=model_config.name,
model_provider=llm_provider_upsert_request.provider,
)
supported_flows = [LLMModelFlowType.CHAT]
if model_config.supports_image_input:
@@ -317,7 +325,7 @@ def upsert_llm_provider(
model_configuration_id=existing.id,
supported_flows=supported_flows,
is_visible=model_config.is_visible,
max_input_tokens=model_config.max_input_tokens,
max_input_tokens=max_input_tokens,
display_name=model_config.display_name,
)
else:
@@ -327,7 +335,7 @@ def upsert_llm_provider(
model_name=model_config.name,
supported_flows=supported_flows,
is_visible=model_config.is_visible,
max_input_tokens=model_config.max_input_tokens,
max_input_tokens=max_input_tokens,
display_name=model_config.display_name,
)

View File

@@ -324,15 +324,6 @@ def mark_migration_completed_time_if_not_set_with_commit(
db_session.commit()
def is_migration_completed(db_session: Session) -> bool:
"""Returns True if the migration is completed.
Can be run even if the migration record does not exist.
"""
record = db_session.query(OpenSearchTenantMigrationRecord).first()
return record is not None and record.migration_completed_at is not None
def build_sanitized_to_original_doc_id_mapping(
db_session: Session,
) -> dict[str, str]:

View File

@@ -1,4 +1,3 @@
import hashlib
from datetime import datetime
from datetime import timezone
from typing import Any
@@ -21,13 +20,9 @@ from onyx.document_index.opensearch.constants import DEFAULT_MAX_CHUNK_SIZE
from onyx.document_index.opensearch.constants import EF_CONSTRUCTION
from onyx.document_index.opensearch.constants import EF_SEARCH
from onyx.document_index.opensearch.constants import M
from onyx.document_index.opensearch.string_filtering import DocumentIDTooLongError
from onyx.document_index.opensearch.string_filtering import (
filter_and_validate_document_id,
)
from onyx.document_index.opensearch.string_filtering import (
MAX_DOCUMENT_ID_ENCODED_LENGTH,
)
from onyx.utils.tenant import get_tenant_id_short_string
from shared_configs.configs import MULTI_TENANT
from shared_configs.contextvars import get_current_tenant_id
@@ -80,50 +75,17 @@ def get_opensearch_doc_chunk_id(
This will be the string used to identify the chunk in OpenSearch. Any direct
chunk queries should use this function.
If the document ID is too long, a hash of the ID is used instead.
"""
opensearch_doc_chunk_id_suffix: str = f"__{max_chunk_size}__{chunk_index}"
encoded_suffix_length: int = len(opensearch_doc_chunk_id_suffix.encode("utf-8"))
max_encoded_permissible_doc_id_length: int = (
MAX_DOCUMENT_ID_ENCODED_LENGTH - encoded_suffix_length
sanitized_document_id = filter_and_validate_document_id(document_id)
opensearch_doc_chunk_id = (
f"{sanitized_document_id}__{max_chunk_size}__{chunk_index}"
)
opensearch_doc_chunk_id_tenant_prefix: str = ""
if tenant_state.multitenant:
short_tenant_id: str = get_tenant_id_short_string(tenant_state.tenant_id)
# Use tenant ID because in multitenant mode each tenant has its own
# Documents table, so there is a very small chance that doc IDs are not
# actually unique across all tenants.
opensearch_doc_chunk_id_tenant_prefix = f"{short_tenant_id}__"
encoded_prefix_length: int = len(
opensearch_doc_chunk_id_tenant_prefix.encode("utf-8")
)
max_encoded_permissible_doc_id_length -= encoded_prefix_length
try:
sanitized_document_id: str = filter_and_validate_document_id(
document_id, max_encoded_length=max_encoded_permissible_doc_id_length
)
except DocumentIDTooLongError:
# If the document ID is too long, use a hash instead.
# We use blake2b because it is faster and equally secure as SHA256, and
# accepts digest_size which controls the number of bytes returned in the
# hash.
# digest_size is the size of the returned hash in bytes. Since we're
# decoding the hash bytes as a hex string, the digest_size should be
# half the max target size of the hash string.
# Subtract 1 because filter_and_validate_document_id compares on >= on
# max_encoded_length.
# 64 is the max digest_size blake2b returns.
digest_size: int = min((max_encoded_permissible_doc_id_length - 1) // 2, 64)
sanitized_document_id = hashlib.blake2b(
document_id.encode("utf-8"), digest_size=digest_size
).hexdigest()
opensearch_doc_chunk_id: str = (
f"{opensearch_doc_chunk_id_tenant_prefix}{sanitized_document_id}{opensearch_doc_chunk_id_suffix}"
)
short_tenant_id = get_tenant_id_short_string(tenant_state.tenant_id)
opensearch_doc_chunk_id = f"{short_tenant_id}__{opensearch_doc_chunk_id}"
# Do one more validation to ensure we haven't exceeded the max length.
opensearch_doc_chunk_id = filter_and_validate_document_id(opensearch_doc_chunk_id)
return opensearch_doc_chunk_id
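The removed fallback above hashes over-long document IDs with BLAKE2b, sizing the digest to the remaining byte budget: `hexdigest()` emits two characters per digest byte, so the digest size is half the budget (minus one for the validator's `>=` comparison), capped at BLAKE2b's 64-byte maximum. A simplified sketch of just that sizing logic, ignoring the chunk suffix and tenant prefix the real code also subtracts:

```python
import hashlib

MAX_ENCODED = 512  # max UTF-8 bytes allowed for the sanitized ID (illustrative)

def shorten_doc_id(document_id: str, max_encoded_length: int = MAX_ENCODED) -> str:
    if len(document_id.encode("utf-8")) < max_encoded_length:
        return document_id
    # hexdigest() is 2 chars per byte, so use half the budget; subtract 1
    # because the validator compares with >=; 64 is blake2b's max digest_size.
    digest_size = min((max_encoded_length - 1) // 2, 64)
    return hashlib.blake2b(
        document_id.encode("utf-8"), digest_size=digest_size
    ).hexdigest()

short = shorten_doc_id("abc")
long_ = shorten_doc_id("a" * 2000)
print(short)       # abc (unchanged: already under budget)
print(len(long_))  # 128 (64-byte digest rendered as hex)
```

The hash is deterministic, so repeated indexing of the same long document ID always maps to the same OpenSearch chunk ID.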

View File

@@ -1,15 +1,7 @@
import re
MAX_DOCUMENT_ID_ENCODED_LENGTH: int = 512
class DocumentIDTooLongError(ValueError):
"""Raised when a document ID is too long for OpenSearch after filtering."""
def filter_and_validate_document_id(
document_id: str, max_encoded_length: int = MAX_DOCUMENT_ID_ENCODED_LENGTH
) -> str:
def filter_and_validate_document_id(document_id: str) -> str:
"""
Filters and validates a document ID such that it can be used as an ID in
OpenSearch.
@@ -27,13 +19,9 @@ def filter_and_validate_document_id(
Args:
document_id: The document ID to filter and validate.
max_encoded_length: The maximum length of the document ID after
filtering in bytes. Compared with >= for extra resilience, so
encoded values of this length will fail.
Raises:
DocumentIDTooLongError: If the document ID is too long after filtering.
ValueError: If the document ID is empty after filtering.
ValueError: If the document ID is empty or too long after filtering.
Returns:
str: The filtered document ID.
@@ -41,8 +29,6 @@ def filter_and_validate_document_id(
filtered_document_id = re.sub(r"[^A-Za-z0-9_.\-~]", "", document_id)
if not filtered_document_id:
raise ValueError(f"Document ID {document_id} is empty after filtering.")
if len(filtered_document_id.encode("utf-8")) >= max_encoded_length:
raise DocumentIDTooLongError(
f"Document ID {document_id} is too long after filtering."
)
if len(filtered_document_id.encode("utf-8")) >= 512:
raise ValueError(f"Document ID {document_id} is too long after filtering.")
return filtered_document_id

View File

@@ -52,21 +52,9 @@ KNOWN_OPENPYXL_BUGS = [
def get_markitdown_converter() -> "MarkItDown":
global _MARKITDOWN_CONVERTER
from markitdown import MarkItDown
if _MARKITDOWN_CONVERTER is None:
from markitdown import MarkItDown
# Patch this function to effectively no-op because we were seeing this
# module take an inordinate amount of time to convert charts to markdown,
# making some powerpoint files with many or complicated charts nearly
# unindexable.
from markitdown.converters._pptx_converter import PptxConverter
setattr(
PptxConverter,
"_convert_chart_to_markdown",
lambda self, chart: "\n\n[chart omitted]\n\n", # noqa: ARG005
)
_MARKITDOWN_CONVERTER = MarkItDown(enable_plugins=False)
return _MARKITDOWN_CONVERTER
@@ -217,26 +205,18 @@ def read_pdf_file(
try:
pdf_reader = PdfReader(file)
if pdf_reader.is_encrypted:
# Try the explicit password first, then fall back to an empty
# string. Owner-password-only PDFs (permission restrictions but
# no open password) decrypt successfully with "".
# See https://github.com/onyx-dot-app/onyx/issues/9754
passwords = [p for p in [pdf_pass, ""] if p is not None]
if pdf_reader.is_encrypted and pdf_pass is not None:
decrypt_success = False
for pw in passwords:
try:
if pdf_reader.decrypt(pw) != 0:
decrypt_success = True
break
except Exception:
pass
try:
decrypt_success = pdf_reader.decrypt(pdf_pass) != 0
except Exception:
logger.error("Unable to decrypt pdf")
if not decrypt_success:
logger.error(
"Encrypted PDF could not be decrypted, returning empty text."
)
return "", metadata, []
elif pdf_reader.is_encrypted:
logger.warning("No password for an encrypted PDF, returning empty text.")
return "", metadata, []
# Basic PDF metadata
if pdf_reader.metadata is not None:

View File

@@ -33,20 +33,8 @@ def is_pdf_protected(file: IO[Any]) -> bool:
with preserve_position(file):
reader = PdfReader(file)
if not reader.is_encrypted:
return False
# PDFs with only an owner password (permission restrictions like
# print/copy disabled) use an empty user password — any viewer can open
# them without prompting. decrypt("") returns 0 only when a real user
# password is required. See https://github.com/onyx-dot-app/onyx/issues/9754
try:
return reader.decrypt("") == 0
except Exception:
logger.exception(
"Failed to evaluate PDF encryption; treating as password protected"
)
return True
return bool(reader.is_encrypted)
def is_docx_protected(file: IO[Any]) -> bool:

View File

@@ -136,14 +136,12 @@ class FileStore(ABC):
"""
@abstractmethod
def delete_file(self, file_id: str, error_on_missing: bool = True) -> None:
def delete_file(self, file_id: str) -> None:
"""
Delete a file by its ID.
Parameters:
- file_id: ID of file to delete
- error_on_missing: If False, silently return when the file record
does not exist instead of raising.
- file_id: ID of file to delete
"""
@abstractmethod
@@ -454,23 +452,12 @@ class S3BackedFileStore(FileStore):
logger.warning(f"Error getting file size for {file_id}: {e}")
return None
def delete_file(
self,
file_id: str,
error_on_missing: bool = True,
db_session: Session | None = None,
) -> None:
def delete_file(self, file_id: str, db_session: Session | None = None) -> None:
with get_session_with_current_tenant_if_none(db_session) as db_session:
try:
file_record = get_filerecord_by_file_id_optional(
file_record = get_filerecord_by_file_id(
file_id=file_id, db_session=db_session
)
if file_record is None:
if error_on_missing:
raise RuntimeError(
f"File by id {file_id} does not exist or was deleted"
)
return
if not file_record.bucket_name:
logger.error(
f"File record {file_id} with key {file_record.object_key} "

View File

@@ -222,23 +222,12 @@ class PostgresBackedFileStore(FileStore):
logger.warning(f"Error getting file size for {file_id}: {e}")
return None
def delete_file(
self,
file_id: str,
error_on_missing: bool = True,
db_session: Session | None = None,
) -> None:
def delete_file(self, file_id: str, db_session: Session | None = None) -> None:
with get_session_with_current_tenant_if_none(db_session) as session:
try:
file_content = get_file_content_by_file_id_optional(
file_content = get_file_content_by_file_id(
file_id=file_id, db_session=session
)
if file_content is None:
if error_on_missing:
raise RuntimeError(
f"File content for file_id {file_id} does not exist or was deleted"
)
return
raw_conn = _get_raw_connection(session)
try:

View File

@@ -26,7 +26,6 @@ class LlmProviderNames(str, Enum):
MISTRAL = "mistral"
LITELLM_PROXY = "litellm_proxy"
BIFROST = "bifrost"
OPENAI_COMPATIBLE = "openai_compatible"
def __str__(self) -> str:
"""Needed so things like:
@@ -47,7 +46,6 @@ WELL_KNOWN_PROVIDER_NAMES = [
LlmProviderNames.LM_STUDIO,
LlmProviderNames.LITELLM_PROXY,
LlmProviderNames.BIFROST,
LlmProviderNames.OPENAI_COMPATIBLE,
]
@@ -66,7 +64,6 @@ PROVIDER_DISPLAY_NAMES: dict[str, str] = {
LlmProviderNames.LM_STUDIO: "LM Studio",
LlmProviderNames.LITELLM_PROXY: "LiteLLM Proxy",
LlmProviderNames.BIFROST: "Bifrost",
LlmProviderNames.OPENAI_COMPATIBLE: "OpenAI Compatible",
"groq": "Groq",
"anyscale": "Anyscale",
"deepseek": "DeepSeek",
@@ -119,7 +116,6 @@ AGGREGATOR_PROVIDERS: set[str] = {
LlmProviderNames.AZURE,
LlmProviderNames.LITELLM_PROXY,
LlmProviderNames.BIFROST,
LlmProviderNames.OPENAI_COMPATIBLE,
}
# Model family name mappings for display name generation

View File

@@ -327,19 +327,12 @@ class LitellmLLM(LLM):
):
model_kwargs[VERTEX_LOCATION_KWARG] = "global"
# Bifrost and OpenAI-compatible: OpenAI-compatible proxies that send
# model names directly to the endpoint. We route through LiteLLM's
# openai provider with the server's base URL, and ensure /v1 is appended.
if model_provider in (
LlmProviderNames.BIFROST,
LlmProviderNames.OPENAI_COMPATIBLE,
):
# Bifrost: OpenAI-compatible proxy that expects model names in
# provider/model format (e.g. "anthropic/claude-sonnet-4-6").
# We route through LiteLLM's openai provider with the Bifrost base URL,
# and ensure /v1 is appended.
if model_provider == LlmProviderNames.BIFROST:
self._custom_llm_provider = "openai"
# LiteLLM's OpenAI client requires an api_key to be set.
# Many OpenAI-compatible servers don't need auth, so supply a
# placeholder to prevent LiteLLM from raising AuthenticationError.
if not self._api_key:
model_kwargs.setdefault("api_key", "not-needed")
if self._api_base is not None:
base = self._api_base.rstrip("/")
self._api_base = base if base.endswith("/v1") else f"{base}/v1"
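The base-URL handling in this hunk can be restated as a standalone helper (the function name is illustrative, not from the codebase): strip trailing slashes, then append `/v1` only if it is not already present.

```python
def normalize_openai_base(api_base: str) -> str:
    # Sketch of the normalization above: OpenAI-compatible servers expect
    # requests under /v1, so ensure the base URL ends with exactly one /v1.
    base = api_base.rstrip("/")
    return base if base.endswith("/v1") else f"{base}/v1"
```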
@@ -456,20 +449,17 @@ class LitellmLLM(LLM):
optional_kwargs: dict[str, Any] = {}
# Model name
is_openai_compatible_proxy = self._model_provider in (
LlmProviderNames.BIFROST,
LlmProviderNames.OPENAI_COMPATIBLE,
)
is_bifrost = self._model_provider == LlmProviderNames.BIFROST
model_provider = (
f"{self.config.model_provider}/responses"
if is_openai_model # Uses litellm's completions -> responses bridge
else self.config.model_provider
)
if is_openai_compatible_proxy:
# OpenAI-compatible proxies (Bifrost, generic OpenAI-compatible
# servers) expect model names sent directly to their endpoint.
# We use custom_llm_provider="openai" so LiteLLM doesn't try
# to route based on the provider prefix.
if is_bifrost:
# Bifrost expects model names in provider/model format
# (e.g. "anthropic/claude-sonnet-4-6") sent directly to its
# OpenAI-compatible endpoint. We use custom_llm_provider="openai"
# so LiteLLM doesn't try to route based on the provider prefix.
model = self.config.deployment_name or self.config.model_name
else:
model = f"{model_provider}/{self.config.deployment_name or self.config.model_name}"
@@ -560,10 +550,7 @@ class LitellmLLM(LLM):
if structured_response_format:
optional_kwargs["response_format"] = structured_response_format
if (
not (is_claude_model or is_ollama or is_mistral)
or is_openai_compatible_proxy
):
if not (is_claude_model or is_ollama or is_mistral) or is_bifrost:
# Litellm bug: tool_choice is dropped silently if not specified here for OpenAI
# However, this param breaks Anthropic and Mistral models,
# so it must be conditionally included unless the request is

View File

@@ -15,8 +15,6 @@ LITELLM_PROXY_PROVIDER_NAME = "litellm_proxy"
BIFROST_PROVIDER_NAME = "bifrost"
OPENAI_COMPATIBLE_PROVIDER_NAME = "openai_compatible"
# Providers that use optional Bearer auth from custom_config
PROVIDERS_WITH_SPECIAL_API_KEY_HANDLING: dict[str, str] = {
LlmProviderNames.OLLAMA_CHAT: OLLAMA_API_KEY_CONFIG_KEY,

View File

@@ -19,7 +19,6 @@ from onyx.llm.well_known_providers.constants import BIFROST_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import LITELLM_PROXY_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import LM_STUDIO_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import OLLAMA_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import OPENAI_COMPATIBLE_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import OPENAI_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import OPENROUTER_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import VERTEXAI_PROVIDER_NAME
@@ -52,7 +51,6 @@ def _get_provider_to_models_map() -> dict[str, list[str]]:
OPENROUTER_PROVIDER_NAME: [], # Dynamic - fetched from OpenRouter API
LITELLM_PROXY_PROVIDER_NAME: [], # Dynamic - fetched from LiteLLM proxy API
BIFROST_PROVIDER_NAME: [], # Dynamic - fetched from Bifrost API
OPENAI_COMPATIBLE_PROVIDER_NAME: [], # Dynamic - fetched from OpenAI-compatible API
}
@@ -338,7 +336,6 @@ def get_provider_display_name(provider_name: str) -> str:
VERTEXAI_PROVIDER_NAME: "Google Vertex AI",
OPENROUTER_PROVIDER_NAME: "OpenRouter",
LITELLM_PROXY_PROVIDER_NAME: "LiteLLM Proxy",
OPENAI_COMPATIBLE_PROVIDER_NAME: "OpenAI Compatible",
}
if provider_name in _ONYX_PROVIDER_DISPLAY_NAMES:

View File

@@ -3,8 +3,6 @@
from datetime import datetime
from typing import Any
import httpx
from onyx.configs.constants import DocumentSource
from onyx.mcp_server.api import mcp_server
from onyx.mcp_server.utils import get_http_client
@@ -17,21 +15,6 @@ from onyx.utils.variable_functionality import global_version
logger = setup_logger()
def _extract_error_detail(response: httpx.Response) -> str:
"""Extract a human-readable error message from a failed backend response.
The backend returns OnyxError responses as
``{"error_code": "...", "detail": "..."}``.
"""
try:
body = response.json()
if detail := body.get("detail"):
return str(detail)
except Exception:
pass
return f"Request failed with status {response.status_code}"
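The removed `_extract_error_detail` helper above can be sketched independently of httpx: given a status code and a parsed body, prefer the backend's `{"error_code": ..., "detail": ...}` payload and fall back to a generic message.

```python
def extract_error_detail(status_code: int, body: object) -> str:
    # Decoupled sketch of the helper above (httpx-free for illustration):
    # use the "detail" field of an OnyxError-shaped body when present.
    try:
        if isinstance(body, dict) and (detail := body.get("detail")):
            return str(detail)
    except Exception:
        pass
    return f"Request failed with status {status_code}"
```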
@mcp_server.tool()
async def search_indexed_documents(
query: str,
@@ -175,14 +158,7 @@ async def search_indexed_documents(
json=search_request,
headers=auth_headers,
)
if not response.is_success:
error_detail = _extract_error_detail(response)
return {
"documents": [],
"total_results": 0,
"query": query,
"error": error_detail,
}
response.raise_for_status()
result = response.json()
# Check for error in response
@@ -258,13 +234,7 @@ async def search_web(
json=request_payload,
headers={"Authorization": f"Bearer {access_token.token}"},
)
if not response.is_success:
error_detail = _extract_error_detail(response)
return {
"error": error_detail,
"results": [],
"query": query,
}
response.raise_for_status()
response_payload = response.json()
results = response_payload.get("results", [])
return {
@@ -310,12 +280,7 @@ async def open_urls(
json={"urls": urls},
headers={"Authorization": f"Bearer {access_token.token}"},
)
if not response.is_success:
error_detail = _extract_error_detail(response)
return {
"error": error_detail,
"results": [],
}
response.raise_for_status()
response_payload = response.json()
results = response_payload.get("results", [])
return {

View File

@@ -6,7 +6,6 @@ from onyx.configs.app_configs import MCP_SERVER_ENABLED
from onyx.configs.app_configs import MCP_SERVER_HOST
from onyx.configs.app_configs import MCP_SERVER_PORT
from onyx.utils.logger import setup_logger
from onyx.utils.variable_functionality import set_is_ee_based_on_env_variable
logger = setup_logger()
@@ -17,7 +16,6 @@ def main() -> None:
logger.info("MCP server is disabled (MCP_SERVER_ENABLED=false)")
return
set_is_ee_based_on_env_variable()
logger.info(f"Starting MCP server on {MCP_SERVER_HOST}:{MCP_SERVER_PORT}")
from onyx.mcp_server.api import mcp_app

View File

@@ -1,5 +1,6 @@
from fastapi import APIRouter
from fastapi import Depends
from fastapi import HTTPException
from sqlalchemy.orm import Session
from onyx.auth.users import current_user
@@ -8,8 +9,6 @@ from onyx.db.engine.sql_engine import get_session
from onyx.db.models import User
from onyx.db.web_search import fetch_active_web_content_provider
from onyx.db.web_search import fetch_active_web_search_provider
from onyx.error_handling.error_codes import OnyxErrorCode
from onyx.error_handling.exceptions import OnyxError
from onyx.server.features.web_search.models import OpenUrlsToolRequest
from onyx.server.features.web_search.models import OpenUrlsToolResponse
from onyx.server.features.web_search.models import WebSearchToolRequest
@@ -62,10 +61,9 @@ def _get_active_search_provider(
) -> tuple[WebSearchProviderView, WebSearchProvider]:
provider_model = fetch_active_web_search_provider(db_session)
if provider_model is None:
raise OnyxError(
OnyxErrorCode.INVALID_INPUT,
"No web search provider configured. Please configure one in "
"Admin > Web Search settings.",
raise HTTPException(
status_code=400,
detail="No web search provider configured.",
)
provider_view = WebSearchProviderView(
@@ -78,10 +76,9 @@ def _get_active_search_provider(
)
if provider_model.api_key is None:
raise OnyxError(
OnyxErrorCode.INVALID_INPUT,
"Web search provider requires an API key. Please configure one in "
"Admin > Web Search settings.",
raise HTTPException(
status_code=400,
detail="Web search provider requires an API key.",
)
try:
@@ -91,7 +88,7 @@ def _get_active_search_provider(
config=provider_model.config or {},
)
except ValueError as exc:
raise OnyxError(OnyxErrorCode.INVALID_INPUT, str(exc)) from exc
raise HTTPException(status_code=400, detail=str(exc)) from exc
return provider_view, provider
@@ -113,9 +110,9 @@ def _get_active_content_provider(
if provider_model.api_key is None:
# TODO - this is not a great error, in fact, this key should not be nullable.
raise OnyxError(
OnyxErrorCode.INVALID_INPUT,
"Web content provider requires an API key.",
raise HTTPException(
status_code=400,
detail="Web content provider requires an API key.",
)
try:
@@ -128,12 +125,12 @@ def _get_active_content_provider(
config=config,
)
except ValueError as exc:
raise OnyxError(OnyxErrorCode.INVALID_INPUT, str(exc)) from exc
raise HTTPException(status_code=400, detail=str(exc)) from exc
if provider is None:
raise OnyxError(
OnyxErrorCode.INVALID_INPUT,
"Unable to initialize the configured web content provider.",
raise HTTPException(
status_code=400,
detail="Unable to initialize the configured web content provider.",
)
provider_view = WebContentProviderView(
@@ -157,13 +154,12 @@ def _run_web_search(
for query in request.queries:
try:
search_results = provider.search(query)
except OnyxError:
except HTTPException:
raise
except Exception as exc:
logger.exception("Web search provider failed for query '%s'", query)
raise OnyxError(
OnyxErrorCode.BAD_GATEWAY,
"Web search provider failed to execute query.",
raise HTTPException(
status_code=502, detail="Web search provider failed to execute query."
) from exc
filtered_results = filter_web_search_results_with_no_title_or_snippet(
@@ -196,13 +192,12 @@ def _open_urls(
docs = filter_web_contents_with_no_title_or_content(
list(provider.contents(urls))
)
except OnyxError:
except HTTPException:
raise
except Exception as exc:
logger.exception("Web content provider failed to fetch URLs")
raise OnyxError(
OnyxErrorCode.BAD_GATEWAY,
"Web content provider failed to fetch URLs.",
raise HTTPException(
status_code=502, detail="Web content provider failed to fetch URLs."
) from exc
results: list[LlmOpenUrlResult] = []

View File

@@ -74,8 +74,6 @@ from onyx.server.manage.llm.models import ModelConfigurationUpsertRequest
from onyx.server.manage.llm.models import OllamaFinalModelResponse
from onyx.server.manage.llm.models import OllamaModelDetails
from onyx.server.manage.llm.models import OllamaModelsRequest
from onyx.server.manage.llm.models import OpenAICompatibleFinalModelResponse
from onyx.server.manage.llm.models import OpenAICompatibleModelsRequest
from onyx.server.manage.llm.models import OpenRouterFinalModelResponse
from onyx.server.manage.llm.models import OpenRouterModelDetails
from onyx.server.manage.llm.models import OpenRouterModelsRequest
@@ -1577,95 +1575,3 @@ def _get_bifrost_models_response(api_base: str, api_key: str | None = None) -> d
source_name="Bifrost",
api_key=api_key,
)
@admin_router.post("/openai-compatible/available-models")
def get_openai_compatible_server_available_models(
request: OpenAICompatibleModelsRequest,
_: User = Depends(current_admin_user),
db_session: Session = Depends(get_session),
) -> list[OpenAICompatibleFinalModelResponse]:
"""Fetch available models from a generic OpenAI-compatible /v1/models endpoint."""
response_json = _get_openai_compatible_server_response(
api_base=request.api_base, api_key=request.api_key
)
models = response_json.get("data", [])
if not isinstance(models, list) or len(models) == 0:
raise OnyxError(
OnyxErrorCode.VALIDATION_ERROR,
"No models found from your OpenAI-compatible endpoint",
)
results: list[OpenAICompatibleFinalModelResponse] = []
for model in models:
try:
model_id = model.get("id", "")
model_name = model.get("name", model_id)
if not model_id:
continue
# Skip embedding models
if is_embedding_model(model_id):
continue
results.append(
OpenAICompatibleFinalModelResponse(
name=model_id,
display_name=model_name,
max_input_tokens=model.get("context_length"),
supports_image_input=infer_vision_support(model_id),
supports_reasoning=is_reasoning_model(model_id, model_name),
)
)
except Exception as e:
logger.warning(
"Failed to parse OpenAI-compatible model entry",
extra={"error": str(e), "item": str(model)[:1000]},
)
if not results:
raise OnyxError(
OnyxErrorCode.VALIDATION_ERROR,
"No compatible models found from OpenAI-compatible endpoint",
)
sorted_results = sorted(results, key=lambda m: m.name.lower())
# Sync new models to DB if provider_name is specified
if request.provider_name:
_sync_fetched_models(
db_session=db_session,
provider_name=request.provider_name,
models=[
SyncModelEntry(
name=r.name,
display_name=r.display_name,
max_input_tokens=r.max_input_tokens,
supports_image_input=r.supports_image_input,
)
for r in sorted_results
],
source_label="OpenAI Compatible",
)
return sorted_results
def _get_openai_compatible_server_response(
api_base: str, api_key: str | None = None
) -> dict:
"""Perform GET to an OpenAI-compatible /v1/models and return parsed JSON."""
cleaned_api_base = api_base.strip().rstrip("/")
# Ensure we hit /v1/models
if cleaned_api_base.endswith("/v1"):
url = f"{cleaned_api_base}/models"
else:
url = f"{cleaned_api_base}/v1/models"
return _get_openai_compatible_models_response(
url=url,
source_name="OpenAI Compatible",
api_key=api_key,
)
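The URL rule in the removed `_get_openai_compatible_server_response` helper above boils down to one branch, shown here as a minimal sketch (function name is illustrative): always target `/v1/models`, whether or not the configured base already ends in `/v1`.

```python
def models_url(api_base: str) -> str:
    # Same rule as the helper above: hit the /v1/models listing endpoint,
    # avoiding a doubled /v1 when the base already includes it.
    cleaned = api_base.strip().rstrip("/")
    return f"{cleaned}/models" if cleaned.endswith("/v1") else f"{cleaned}/v1/models"
```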

View File

@@ -79,9 +79,7 @@ class LLMProviderDescriptor(BaseModel):
provider=provider,
provider_display_name=get_provider_display_name(provider),
model_configurations=filter_model_configurations(
llm_provider_model.model_configurations,
provider,
use_stored_display_name=llm_provider_model.custom_config is not None,
llm_provider_model.model_configurations, provider
),
)
@@ -158,9 +156,7 @@ class LLMProviderView(LLMProvider):
personas=personas,
deployment_name=llm_provider_model.deployment_name,
model_configurations=filter_model_configurations(
llm_provider_model.model_configurations,
provider,
use_stored_display_name=llm_provider_model.custom_config is not None,
llm_provider_model.model_configurations, provider
),
)
@@ -202,13 +198,13 @@ class ModelConfigurationView(BaseModel):
cls,
model_configuration_model: "ModelConfigurationModel",
provider_name: str,
use_stored_display_name: bool = False,
) -> "ModelConfigurationView":
# For dynamic providers (OpenRouter, Bedrock, Ollama) and custom-config
# providers, use the display_name stored in DB. Skip LiteLLM parsing.
# For dynamic providers (OpenRouter, Bedrock, Ollama), use the display_name
# stored in DB from the source API. Skip LiteLLM parsing entirely.
if (
provider_name in DYNAMIC_LLM_PROVIDERS or use_stored_display_name
) and model_configuration_model.display_name:
provider_name in DYNAMIC_LLM_PROVIDERS
and model_configuration_model.display_name
):
# Extract vendor from model name for grouping (e.g., "Anthropic", "OpenAI")
vendor = extract_vendor_from_model_name(
model_configuration_model.name, provider_name
@@ -468,18 +464,3 @@ class BifrostFinalModelResponse(BaseModel):
max_input_tokens: int | None
supports_image_input: bool
supports_reasoning: bool
# OpenAI Compatible dynamic models fetch
class OpenAICompatibleModelsRequest(BaseModel):
api_base: str
api_key: str | None = None
provider_name: str | None = None # Optional: to save models to existing provider
class OpenAICompatibleFinalModelResponse(BaseModel):
name: str # Model ID (e.g. "meta-llama/Llama-3-8B-Instruct")
display_name: str # Human-readable name from API
max_input_tokens: int | None
supports_image_input: bool
supports_reasoning: bool

View File

@@ -26,7 +26,6 @@ DYNAMIC_LLM_PROVIDERS = frozenset(
LlmProviderNames.OLLAMA_CHAT,
LlmProviderNames.LM_STUDIO,
LlmProviderNames.BIFROST,
LlmProviderNames.OPENAI_COMPATIBLE,
}
)
@@ -309,15 +308,12 @@ def should_filter_as_dated_duplicate(
def filter_model_configurations(
model_configurations: list,
provider: str,
use_stored_display_name: bool = False,
) -> list:
"""Filter out obsolete and dated duplicate models from configurations.
Args:
model_configurations: List of ModelConfiguration DB models
provider: The provider name (e.g., "openai", "anthropic")
use_stored_display_name: If True, prefer the display_name stored in the
DB over LiteLLM enrichments. Set for custom-config providers.
Returns:
List of ModelConfigurationView objects with obsolete/duplicate models removed
@@ -337,9 +333,7 @@ def filter_model_configurations(
if should_filter_as_dated_duplicate(model_configuration.name, all_model_names):
continue
filtered_configs.append(
ModelConfigurationView.from_model(
model_configuration, provider, use_stored_display_name
)
ModelConfigurationView.from_model(model_configuration, provider)
)
return filtered_configs

View File

@@ -26,6 +26,7 @@ _DEFAULT_PORTS: dict[str, int] = {
"monitoring": 9096,
"docfetching": 9092,
"docprocessing": 9093,
"heavy": 9094,
}
_server_started = False
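The new `"heavy"` entry above reserves port 9094 for the heavy Celery worker's metrics endpoint. A minimal lookup sketch (the helper name is hypothetical; in practice the resolved port would be handed to an HTTP metrics exporter such as prometheus_client's `start_http_server`):

```python
_DEFAULT_PORTS: dict[str, int] = {
    "monitoring": 9096,
    "docfetching": 9092,
    "docprocessing": 9093,
    "heavy": 9094,  # added for the heavy Celery worker's exporter
}

def metrics_port(service: str) -> int:
    # Resolve the metrics port reserved for a given worker service.
    return _DEFAULT_PORTS[service]
```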

View File

@@ -186,7 +186,7 @@ class TestDocumentIndexNew:
)
document_index.index(chunks=[pre_chunk], indexing_metadata=pre_metadata)
time.sleep(2)
time.sleep(1)
# Now index a batch with the existing doc and a new doc.
chunks = [

View File

@@ -9,7 +9,6 @@ This test verifies the full flow: provisioning failure → rollback → schema c
"""
import uuid
from unittest.mock import MagicMock
from unittest.mock import patch
from sqlalchemy import text
@@ -56,28 +55,18 @@ class TestTenantProvisioningRollback:
created_tenant_id = tenant_id
return create_schema_if_not_exists(tenant_id)
# Mock setup_tenant to fail after schema creation.
# Also mock the Redis lock so the test doesn't compete with a live
# monitoring worker that may already hold the provision lock.
mock_lock = MagicMock()
mock_lock.acquire.return_value = True
# Mock setup_tenant to fail after schema creation
with patch(
"ee.onyx.background.celery.tasks.tenant_provisioning.tasks.get_redis_client"
) as mock_redis:
mock_redis.return_value.lock.return_value = mock_lock
"ee.onyx.background.celery.tasks.tenant_provisioning.tasks.setup_tenant"
) as mock_setup:
mock_setup.side_effect = Exception("Simulated provisioning failure")
with patch(
"ee.onyx.background.celery.tasks.tenant_provisioning.tasks.setup_tenant"
) as mock_setup:
mock_setup.side_effect = Exception("Simulated provisioning failure")
with patch(
"ee.onyx.background.celery.tasks.tenant_provisioning.tasks.create_schema_if_not_exists",
side_effect=track_schema_creation,
):
# Run pre-provisioning - it should fail and trigger rollback
pre_provision_tenant()
"ee.onyx.background.celery.tasks.tenant_provisioning.tasks.create_schema_if_not_exists",
side_effect=track_schema_creation,
):
# Run pre-provisioning - it should fail and trigger rollback
pre_provision_tenant()
# Verify that the schema was created and then cleaned up
assert created_tenant_id is not None, "Schema should have been created"
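The Redis-lock stub that appears in this hunk (the `mock_lock` with `acquire.return_value = True`) is a reusable pattern; factored out, it looks like this sketch, so the provisioning body runs without a live Redis instance:

```python
from unittest.mock import MagicMock

def make_provision_lock_mock() -> MagicMock:
    # Lock stub as in the hunk above: acquire() always succeeds and
    # release() is a no-op, so no real Redis lock is ever contended.
    lock = MagicMock()
    lock.acquire.return_value = True
    lock.release.return_value = None
    return lock
```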

View File

@@ -1,58 +0,0 @@
import pytest
from onyx.configs.constants import MASK_CREDENTIAL_CHAR
from onyx.db.federated import _reject_masked_credentials
class TestRejectMaskedCredentials:
"""Verify that masked credential values are never accepted for DB writes.
mask_string() has two output formats:
- Short strings (< 14 chars): "••••••••••••" (U+2022 BULLET)
- Long strings (>= 14 chars): "abcd...wxyz" (first4 + "..." + last4)
_reject_masked_credentials must catch both.
"""
def test_rejects_fully_masked_value(self) -> None:
masked = MASK_CREDENTIAL_CHAR * 12 # "••••••••••••"
with pytest.raises(ValueError, match="masked placeholder"):
_reject_masked_credentials({"client_id": masked})
def test_rejects_long_string_masked_value(self) -> None:
"""mask_string returns 'first4...last4' for long strings — the real
format used for OAuth credentials like client_id and client_secret."""
with pytest.raises(ValueError, match="masked placeholder"):
_reject_masked_credentials({"client_id": "1234...7890"})
def test_rejects_when_any_field_is_masked(self) -> None:
"""Even if client_id is real, a masked client_secret must be caught."""
with pytest.raises(ValueError, match="client_secret"):
_reject_masked_credentials(
{
"client_id": "1234567890.1234567890",
"client_secret": MASK_CREDENTIAL_CHAR * 12,
}
)
def test_accepts_real_credentials(self) -> None:
# Should not raise
_reject_masked_credentials(
{
"client_id": "1234567890.1234567890",
"client_secret": "test_client_secret_value",
}
)
def test_accepts_empty_dict(self) -> None:
# Should not raise — empty credentials are handled elsewhere
_reject_masked_credentials({})
def test_ignores_non_string_values(self) -> None:
# Non-string values (None, bool, int) should pass through
_reject_masked_credentials(
{
"client_id": "real_value",
"redirect_uri": None,
"some_flag": True,
}
)
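The docstring of the deleted test file above describes the two `mask_string` output formats it guards. An illustrative reconstruction of that format (the real implementation lives in the Onyx codebase; this only mirrors what the docstring states):

```python
MASK_CREDENTIAL_CHAR = "\u2022"  # U+2022 BULLET, per the test docstring

def mask_string(value: str) -> str:
    # Short strings (< 14 chars) become twelve bullets; longer strings
    # keep their first and last four characters around "...".
    if len(value) < 14:
        return MASK_CREDENTIAL_CHAR * 12
    return f"{value[:4]}...{value[-4:]}"
```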

View File

@@ -1,318 +0,0 @@
"""Unit tests for Notion connector handling of people properties and table blocks.
Reproduces two bugs:
1. ENG-3970: People-type database properties (user mentions) are not extracted —
the user's "name" field is lost when _recurse_properties drills into the
"person" sub-dict.
2. ENG-3971: Inline table blocks (table/table_row) are not indexed — table_row
blocks store content in "cells" rather than "rich_text", so no text is extracted.
"""
from unittest.mock import patch
from onyx.connectors.notion.connector import NotionConnector
def _make_connector() -> NotionConnector:
connector = NotionConnector()
connector.load_credentials({"notion_integration_token": "fake-token"})
return connector
class TestPeoplePropertyExtraction:
"""ENG-3970: Verifies that 'people' type database properties extract user names."""
def test_single_person_property(self) -> None:
"""A database cell with a single @mention should extract the user name."""
properties = {
"Team Lead": {
"id": "abc",
"type": "people",
"people": [
{
"object": "user",
"id": "user-uuid-1",
"name": "Arturo Martinez",
"type": "person",
"person": {"email": "arturo@example.com"},
}
],
}
}
result = NotionConnector._properties_to_str(properties)
assert (
"Arturo Martinez" in result
), f"Expected 'Arturo Martinez' in extracted text, got: {result!r}"
def test_multiple_people_property(self) -> None:
"""A database cell with multiple @mentions should extract all user names."""
properties = {
"Members": {
"id": "def",
"type": "people",
"people": [
{
"object": "user",
"id": "user-uuid-1",
"name": "Arturo Martinez",
"type": "person",
"person": {"email": "arturo@example.com"},
},
{
"object": "user",
"id": "user-uuid-2",
"name": "Jane Smith",
"type": "person",
"person": {"email": "jane@example.com"},
},
],
}
}
result = NotionConnector._properties_to_str(properties)
assert (
"Arturo Martinez" in result
), f"Expected 'Arturo Martinez' in extracted text, got: {result!r}"
assert (
"Jane Smith" in result
), f"Expected 'Jane Smith' in extracted text, got: {result!r}"
def test_bot_user_property(self) -> None:
"""Bot users (integrations) have 'type': 'bot' — name should still be extracted."""
properties = {
"Created By": {
"id": "ghi",
"type": "people",
"people": [
{
"object": "user",
"id": "bot-uuid-1",
"name": "Onyx Integration",
"type": "bot",
"bot": {},
}
],
}
}
result = NotionConnector._properties_to_str(properties)
assert (
"Onyx Integration" in result
), f"Expected 'Onyx Integration' in extracted text, got: {result!r}"
def test_person_without_person_details(self) -> None:
"""Some user objects may have an empty/null person sub-dict."""
properties = {
"Assignee": {
"id": "jkl",
"type": "people",
"people": [
{
"object": "user",
"id": "user-uuid-3",
"name": "Ghost User",
"type": "person",
"person": {},
}
],
}
}
result = NotionConnector._properties_to_str(properties)
assert (
"Ghost User" in result
), f"Expected 'Ghost User' in extracted text, got: {result!r}"
def test_people_mixed_with_other_properties(self) -> None:
"""People property should work alongside other property types."""
properties = {
"Name": {
"id": "aaa",
"type": "title",
"title": [
{
"plain_text": "Project Alpha",
"type": "text",
"text": {"content": "Project Alpha"},
}
],
},
"Lead": {
"id": "bbb",
"type": "people",
"people": [
{
"object": "user",
"id": "user-uuid-1",
"name": "Arturo Martinez",
"type": "person",
"person": {"email": "arturo@example.com"},
}
],
},
"Status": {
"id": "ccc",
"type": "status",
"status": {"name": "In Progress", "id": "status-1"},
},
}
result = NotionConnector._properties_to_str(properties)
assert "Arturo Martinez" in result
assert "In Progress" in result
class TestTableBlockExtraction:
"""ENG-3971: Verifies that inline table blocks (table/table_row) are indexed."""
def _make_blocks_response(self, results: list) -> dict:
return {"results": results, "next_cursor": None}
def test_table_row_cells_are_extracted(self) -> None:
"""table_row blocks store content in 'cells', not 'rich_text'.
The connector should extract text from cells."""
connector = _make_connector()
connector.workspace_id = "ws-1"
table_block = {
"id": "table-block-1",
"type": "table",
"table": {
"has_column_header": True,
"has_row_header": False,
"table_width": 3,
},
"has_children": True,
}
header_row = {
"id": "row-1",
"type": "table_row",
"table_row": {
"cells": [
[
{
"type": "text",
"text": {"content": "Name"},
"plain_text": "Name",
}
],
[
{
"type": "text",
"text": {"content": "Role"},
"plain_text": "Role",
}
],
[
{
"type": "text",
"text": {"content": "Team"},
"plain_text": "Team",
}
],
]
},
"has_children": False,
}
data_row = {
"id": "row-2",
"type": "table_row",
"table_row": {
"cells": [
[
{
"type": "text",
"text": {"content": "Arturo Martinez"},
"plain_text": "Arturo Martinez",
}
],
[
{
"type": "text",
"text": {"content": "Engineer"},
"plain_text": "Engineer",
}
],
[
{
"type": "text",
"text": {"content": "Platform"},
"plain_text": "Platform",
}
],
]
},
"has_children": False,
}
with patch.object(
connector,
"_fetch_child_blocks",
side_effect=[
self._make_blocks_response([table_block]),
self._make_blocks_response([header_row, data_row]),
],
):
output = connector._read_blocks("page-1")
all_text = " ".join(block.text for block in output.blocks)
assert "Arturo Martinez" in all_text, (
f"Expected 'Arturo Martinez' in table row text, got blocks: "
f"{[(b.id, b.text) for b in output.blocks]}"
)
assert "Engineer" in all_text, (
f"Expected 'Engineer' in table row text, got blocks: "
f"{[(b.id, b.text) for b in output.blocks]}"
)
assert "Platform" in all_text, (
f"Expected 'Platform' in table row text, got blocks: "
f"{[(b.id, b.text) for b in output.blocks]}"
)
def test_table_with_empty_cells(self) -> None:
"""Table rows with some empty cells should still extract non-empty content."""
connector = _make_connector()
connector.workspace_id = "ws-1"
table_block = {
"id": "table-block-2",
"type": "table",
"table": {
"has_column_header": False,
"has_row_header": False,
"table_width": 2,
},
"has_children": True,
}
row_with_empty = {
"id": "row-3",
"type": "table_row",
"table_row": {
"cells": [
[
{
"type": "text",
"text": {"content": "Has Value"},
"plain_text": "Has Value",
}
],
[], # empty cell
]
},
"has_children": False,
}
with patch.object(
connector,
"_fetch_child_blocks",
side_effect=[
self._make_blocks_response([table_block]),
self._make_blocks_response([row_with_empty]),
],
):
output = connector._read_blocks("page-2")
all_text = " ".join(block.text for block in output.blocks)
assert "Has Value" in all_text, (
f"Expected 'Has Value' in table row text, got blocks: "
f"{[(b.id, b.text) for b in output.blocks]}"
)
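The people-property tests above pin down one behavior: read each user object's top-level `"name"` before descending into the `"person"`/`"bot"` sub-dict. A standalone sketch of that rule (helper name is illustrative, not the connector's actual method):

```python
def extract_people_names(prop: dict) -> list[str]:
    # ENG-3970 behavior sketched: collect the "name" of every user in a
    # people-type property, tolerating bots and empty person sub-dicts.
    if prop.get("type") != "people":
        return []
    return [user["name"] for user in prop.get("people", []) if user.get("name")]
```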

View File

@@ -1,100 +0,0 @@
"""Regression tests for delete_messages_and_files_from_chat_session.
Verifies that user-owned files (those with user_file_id) are never deleted
during chat session cleanup — only chat-only files should be removed.
"""
from unittest.mock import call
from unittest.mock import MagicMock
from unittest.mock import patch
from uuid import uuid4
from onyx.db.chat import delete_messages_and_files_from_chat_session
_MODULE = "onyx.db.chat"
def _make_db_session(
rows: list[tuple[int, list[dict[str, str]] | None]],
) -> MagicMock:
db_session = MagicMock()
db_session.execute.return_value.tuples.return_value.all.return_value = rows
return db_session
@patch(f"{_MODULE}.delete_orphaned_search_docs")
@patch(f"{_MODULE}.get_default_file_store")
def test_user_files_are_not_deleted(
mock_get_file_store: MagicMock,
_mock_orphan_cleanup: MagicMock,
) -> None:
"""User files (with user_file_id) must be skipped during cleanup."""
file_store = MagicMock()
mock_get_file_store.return_value = file_store
db_session = _make_db_session(
[
(
1,
[
{"id": "chat-file-1", "type": "image"},
{"id": "user-file-1", "type": "document", "user_file_id": "uf-1"},
{"id": "chat-file-2", "type": "image"},
],
),
]
)
delete_messages_and_files_from_chat_session(uuid4(), db_session)
assert file_store.delete_file.call_count == 2
file_store.delete_file.assert_has_calls(
[
call(file_id="chat-file-1", error_on_missing=False),
call(file_id="chat-file-2", error_on_missing=False),
]
)
@patch(f"{_MODULE}.delete_orphaned_search_docs")
@patch(f"{_MODULE}.get_default_file_store")
def test_only_user_files_means_no_deletions(
mock_get_file_store: MagicMock,
_mock_orphan_cleanup: MagicMock,
) -> None:
"""When every file in the session is a user file, nothing should be deleted."""
file_store = MagicMock()
mock_get_file_store.return_value = file_store
db_session = _make_db_session(
[
(1, [{"id": "uf-a", "type": "document", "user_file_id": "uf-1"}]),
(2, [{"id": "uf-b", "type": "document", "user_file_id": "uf-2"}]),
]
)
delete_messages_and_files_from_chat_session(uuid4(), db_session)
file_store.delete_file.assert_not_called()
@patch(f"{_MODULE}.delete_orphaned_search_docs")
@patch(f"{_MODULE}.get_default_file_store")
def test_messages_with_no_files(
mock_get_file_store: MagicMock,
_mock_orphan_cleanup: MagicMock,
) -> None:
"""Messages with None or empty file lists should not trigger any deletions."""
file_store = MagicMock()
mock_get_file_store.return_value = file_store
db_session = _make_db_session(
[
(1, None),
(2, []),
]
)
delete_messages_and_files_from_chat_session(uuid4(), db_session)
file_store.delete_file.assert_not_called()

View File

@@ -1,203 +0,0 @@
import pytest
from onyx.document_index.interfaces_new import TenantState
from onyx.document_index.opensearch.constants import DEFAULT_MAX_CHUNK_SIZE
from onyx.document_index.opensearch.schema import get_opensearch_doc_chunk_id
from onyx.document_index.opensearch.string_filtering import (
MAX_DOCUMENT_ID_ENCODED_LENGTH,
)
from shared_configs.configs import POSTGRES_DEFAULT_SCHEMA_STANDARD_VALUE
SINGLE_TENANT_STATE = TenantState(
tenant_id=POSTGRES_DEFAULT_SCHEMA_STANDARD_VALUE, multitenant=False
)
MULTI_TENANT_STATE = TenantState(
tenant_id="tenant_abcdef12-3456-7890-abcd-ef1234567890", multitenant=True
)
EXPECTED_SHORT_TENANT = "abcdef12"
class TestGetOpensearchDocChunkIdSingleTenant:
def test_basic(self) -> None:
result = get_opensearch_doc_chunk_id(
SINGLE_TENANT_STATE, "my-doc-id", chunk_index=0
)
assert result == f"my-doc-id__{DEFAULT_MAX_CHUNK_SIZE}__0"
def test_custom_chunk_size(self) -> None:
result = get_opensearch_doc_chunk_id(
SINGLE_TENANT_STATE, "doc1", chunk_index=3, max_chunk_size=1024
)
assert result == "doc1__1024__3"
def test_special_chars_are_stripped(self) -> None:
"""Tests characters not matching [A-Za-z0-9_.-~] are removed."""
result = get_opensearch_doc_chunk_id(
SINGLE_TENANT_STATE, "doc/with?special#chars&more%stuff", chunk_index=0
)
assert "/" not in result
assert "?" not in result
assert "#" not in result
assert result == f"docwithspecialcharsmorestuff__{DEFAULT_MAX_CHUNK_SIZE}__0"
def test_short_doc_id_not_hashed(self) -> None:
"""
Tests that a short doc ID should appear directly in the result, not as a
hash.
"""
doc_id = "short-id"
result = get_opensearch_doc_chunk_id(SINGLE_TENANT_STATE, doc_id, chunk_index=0)
assert "short-id" in result
def test_long_doc_id_is_hashed(self) -> None:
"""
Tests that a doc ID exceeding the max length should be replaced with a
blake2b hash.
"""
# Create a doc ID that will exceed max length after the suffix is
# appended.
doc_id = "a" * MAX_DOCUMENT_ID_ENCODED_LENGTH
result = get_opensearch_doc_chunk_id(SINGLE_TENANT_STATE, doc_id, chunk_index=0)
# The original doc ID should NOT appear in the result.
assert doc_id not in result
# The suffix should still be present.
assert f"__{DEFAULT_MAX_CHUNK_SIZE}__0" in result
def test_long_doc_id_hash_is_deterministic(self) -> None:
doc_id = "x" * MAX_DOCUMENT_ID_ENCODED_LENGTH
result1 = get_opensearch_doc_chunk_id(
SINGLE_TENANT_STATE, doc_id, chunk_index=5
)
result2 = get_opensearch_doc_chunk_id(
SINGLE_TENANT_STATE, doc_id, chunk_index=5
)
assert result1 == result2
def test_long_doc_id_different_inputs_produce_different_hashes(self) -> None:
doc_id_a = "a" * MAX_DOCUMENT_ID_ENCODED_LENGTH
doc_id_b = "b" * MAX_DOCUMENT_ID_ENCODED_LENGTH
result_a = get_opensearch_doc_chunk_id(
SINGLE_TENANT_STATE, doc_id_a, chunk_index=0
)
result_b = get_opensearch_doc_chunk_id(
SINGLE_TENANT_STATE, doc_id_b, chunk_index=0
)
assert result_a != result_b
def test_result_never_exceeds_max_length(self) -> None:
"""
Tests that the final result should always be under
MAX_DOCUMENT_ID_ENCODED_LENGTH bytes.
"""
doc_id = "z" * (MAX_DOCUMENT_ID_ENCODED_LENGTH * 2)
result = get_opensearch_doc_chunk_id(
SINGLE_TENANT_STATE, doc_id, chunk_index=999, max_chunk_size=99999
)
assert len(result.encode("utf-8")) < MAX_DOCUMENT_ID_ENCODED_LENGTH
def test_no_tenant_prefix_in_single_tenant(self) -> None:
result = get_opensearch_doc_chunk_id(
SINGLE_TENANT_STATE, "mydoc", chunk_index=0
)
assert not result.startswith(SINGLE_TENANT_STATE.tenant_id)
class TestGetOpensearchDocChunkIdMultiTenant:
def test_includes_tenant_prefix(self) -> None:
result = get_opensearch_doc_chunk_id(MULTI_TENANT_STATE, "mydoc", chunk_index=0)
assert result.startswith(f"{EXPECTED_SHORT_TENANT}__")
def test_format(self) -> None:
result = get_opensearch_doc_chunk_id(
MULTI_TENANT_STATE, "mydoc", chunk_index=2, max_chunk_size=256
)
assert result == f"{EXPECTED_SHORT_TENANT}__mydoc__256__2"
def test_long_doc_id_is_hashed_multitenant(self) -> None:
doc_id = "d" * MAX_DOCUMENT_ID_ENCODED_LENGTH
result = get_opensearch_doc_chunk_id(MULTI_TENANT_STATE, doc_id, chunk_index=0)
# Should still have tenant prefix.
assert result.startswith(f"{EXPECTED_SHORT_TENANT}__")
# The original doc ID should NOT appear in the result.
assert doc_id not in result
# The suffix should still be present.
assert f"__{DEFAULT_MAX_CHUNK_SIZE}__0" in result
def test_result_never_exceeds_max_length_multitenant(self) -> None:
doc_id = "q" * (MAX_DOCUMENT_ID_ENCODED_LENGTH * 2)
result = get_opensearch_doc_chunk_id(
MULTI_TENANT_STATE, doc_id, chunk_index=999, max_chunk_size=99999
)
assert len(result.encode("utf-8")) < MAX_DOCUMENT_ID_ENCODED_LENGTH
def test_different_tenants_produce_different_ids(self) -> None:
tenant_a = TenantState(
tenant_id="tenant_aaaaaaaa-0000-0000-0000-000000000000", multitenant=True
)
tenant_b = TenantState(
tenant_id="tenant_bbbbbbbb-0000-0000-0000-000000000000", multitenant=True
)
result_a = get_opensearch_doc_chunk_id(tenant_a, "same-doc", chunk_index=0)
result_b = get_opensearch_doc_chunk_id(tenant_b, "same-doc", chunk_index=0)
assert result_a != result_b
class TestGetOpensearchDocChunkIdEdgeCases:
def test_chunk_index_zero(self) -> None:
result = get_opensearch_doc_chunk_id(SINGLE_TENANT_STATE, "doc", chunk_index=0)
assert result.endswith("__0")
def test_large_chunk_index(self) -> None:
result = get_opensearch_doc_chunk_id(
SINGLE_TENANT_STATE, "doc", chunk_index=99999
)
assert result.endswith("__99999")
def test_doc_id_with_only_special_chars_raises(self) -> None:
"""
Tests that a doc ID that becomes empty after filtering should raise
ValueError.
"""
with pytest.raises(ValueError, match="empty after filtering"):
get_opensearch_doc_chunk_id(SINGLE_TENANT_STATE, "###???///", chunk_index=0)
def test_doc_id_at_boundary_length(self) -> None:
"""
Tests that a doc ID right at the boundary should not be hashed.
"""
suffix = f"__{DEFAULT_MAX_CHUNK_SIZE}__0"
suffix_len = len(suffix.encode("utf-8"))
# Max doc ID length that won't trigger hashing (must be <
# max_encoded_length).
max_doc_len = MAX_DOCUMENT_ID_ENCODED_LENGTH - suffix_len - 1
doc_id = "a" * max_doc_len
result = get_opensearch_doc_chunk_id(SINGLE_TENANT_STATE, doc_id, chunk_index=0)
assert doc_id in result
def test_doc_id_at_boundary_length_multitenant(self) -> None:
"""
Tests that a doc ID right at the boundary should not be hashed in
multitenant mode.
"""
suffix = f"__{DEFAULT_MAX_CHUNK_SIZE}__0"
suffix_len = len(suffix.encode("utf-8"))
prefix = f"{EXPECTED_SHORT_TENANT}__"
prefix_len = len(prefix.encode("utf-8"))
# Max doc ID length that won't trigger hashing (must be <
# max_encoded_length).
max_doc_len = MAX_DOCUMENT_ID_ENCODED_LENGTH - suffix_len - prefix_len - 1
doc_id = "a" * max_doc_len
result = get_opensearch_doc_chunk_id(MULTI_TENANT_STATE, doc_id, chunk_index=0)
assert doc_id in result
def test_doc_id_one_over_boundary_is_hashed(self) -> None:
"""
Tests that a doc ID one byte over the boundary should be hashed.
"""
suffix = f"__{DEFAULT_MAX_CHUNK_SIZE}__0"
suffix_len = len(suffix.encode("utf-8"))
# This length will trigger the >= check in filter_and_validate_document_id
doc_id = "a" * (MAX_DOCUMENT_ID_ENCODED_LENGTH - suffix_len)
result = get_opensearch_doc_chunk_id(SINGLE_TENANT_STATE, doc_id, chunk_index=0)
assert doc_id not in result


@@ -1,76 +0,0 @@
%PDF-1.3 (encrypted single-page PDF test fixture: /Filter /Standard, /R 3, 128-bit key, owner and user password entries; binary stream and mis-encoded bytes omitted)
%%EOF


@@ -54,12 +54,6 @@ class TestReadPdfFile:
text, _, _ = read_pdf_file(_load("encrypted.pdf"), pdf_pass="wrong")
assert text == ""
def test_owner_password_only_pdf_extracts_text(self) -> None:
"""A PDF encrypted with only an owner password (no user password)
should still yield its text content. Regression for #9754."""
text, _, _ = read_pdf_file(_load("owner_protected.pdf"))
assert "Hello World" in text
def test_empty_pdf(self) -> None:
text, _, _ = read_pdf_file(_load("empty.pdf"))
assert text.strip() == ""
@@ -123,12 +117,6 @@ class TestIsPdfProtected:
def test_protected_pdf(self) -> None:
assert is_pdf_protected(_load("encrypted.pdf")) is True
def test_owner_password_only_is_not_protected(self) -> None:
"""A PDF with only an owner password (permission restrictions) but no
user password should NOT be considered protected — any viewer can open
it without prompting for a password."""
assert is_pdf_protected(_load("owner_protected.pdf")) is False
def test_preserves_file_position(self) -> None:
pdf = _load("simple.pdf")
pdf.seek(42)


@@ -1,79 +0,0 @@
import io
from pptx import Presentation # type: ignore[import-untyped]
from pptx.chart.data import CategoryChartData # type: ignore[import-untyped]
from pptx.enum.chart import XL_CHART_TYPE # type: ignore[import-untyped]
from pptx.util import Inches # type: ignore[import-untyped]
from onyx.file_processing.extract_file_text import pptx_to_text
def _make_pptx_with_chart() -> io.BytesIO:
"""Create an in-memory pptx with one text slide and one chart slide."""
prs = Presentation()
# Slide 1: text only
slide1 = prs.slides.add_slide(prs.slide_layouts[1])
slide1.shapes.title.text = "Introduction"
slide1.placeholders[1].text = "This is the first slide."
# Slide 2: chart
slide2 = prs.slides.add_slide(prs.slide_layouts[5]) # Blank layout
chart_data = CategoryChartData()
chart_data.categories = ["Q1", "Q2", "Q3"]
chart_data.add_series("Revenue", (100, 200, 300))
slide2.shapes.add_chart(
XL_CHART_TYPE.COLUMN_CLUSTERED,
Inches(1),
Inches(1),
Inches(6),
Inches(4),
chart_data,
)
buf = io.BytesIO()
prs.save(buf)
buf.seek(0)
return buf
def _make_pptx_without_chart() -> io.BytesIO:
"""Create an in-memory pptx with a single text-only slide."""
prs = Presentation()
slide = prs.slides.add_slide(prs.slide_layouts[1])
slide.shapes.title.text = "Hello World"
slide.placeholders[1].text = "Some content here."
buf = io.BytesIO()
prs.save(buf)
buf.seek(0)
return buf
class TestPptxToText:
def test_chart_is_omitted(self) -> None:
# Precondition
pptx_file = _make_pptx_with_chart()
# Under test
result = pptx_to_text(pptx_file)
# Postcondition
assert "Introduction" in result
assert "first slide" in result
assert "[chart omitted]" in result
# The actual chart data should NOT appear in the output.
assert "Revenue" not in result
assert "Q1" not in result
def test_text_only_pptx(self) -> None:
# Precondition
pptx_file = _make_pptx_without_chart()
# Under test
result = pptx_to_text(pptx_file)
# Postcondition
assert "Hello World" in result
assert "Some content" in result
assert "[chart omitted]" not in result


@@ -1,91 +0,0 @@
"""Tests for FileStore.delete_file error_on_missing behavior."""
from unittest.mock import MagicMock
from unittest.mock import patch
import pytest
_S3_MODULE = "onyx.file_store.file_store"
_PG_MODULE = "onyx.file_store.postgres_file_store"
def _mock_db_session() -> MagicMock:
session = MagicMock()
session.__enter__ = MagicMock(return_value=session)
session.__exit__ = MagicMock(return_value=False)
return session
# ── S3BackedFileStore ────────────────────────────────────────────────
@patch(f"{_S3_MODULE}.get_session_with_current_tenant_if_none")
@patch(f"{_S3_MODULE}.get_filerecord_by_file_id_optional", return_value=None)
def test_s3_delete_missing_file_raises_by_default(
_mock_get_record: MagicMock,
mock_ctx: MagicMock,
) -> None:
from onyx.file_store.file_store import S3BackedFileStore
mock_ctx.return_value = _mock_db_session()
store = S3BackedFileStore(bucket_name="b")
with pytest.raises(RuntimeError, match="does not exist"):
store.delete_file("nonexistent")
@patch(f"{_S3_MODULE}.get_session_with_current_tenant_if_none")
@patch(f"{_S3_MODULE}.get_filerecord_by_file_id_optional", return_value=None)
@patch(f"{_S3_MODULE}.delete_filerecord_by_file_id")
def test_s3_delete_missing_file_silent_when_error_on_missing_false(
mock_delete_record: MagicMock,
_mock_get_record: MagicMock,
mock_ctx: MagicMock,
) -> None:
from onyx.file_store.file_store import S3BackedFileStore
mock_ctx.return_value = _mock_db_session()
store = S3BackedFileStore(bucket_name="b")
store.delete_file("nonexistent", error_on_missing=False)
mock_delete_record.assert_not_called()
# ── PostgresBackedFileStore ──────────────────────────────────────────
@patch(f"{_PG_MODULE}.get_session_with_current_tenant_if_none")
@patch(f"{_PG_MODULE}.get_file_content_by_file_id_optional", return_value=None)
def test_pg_delete_missing_file_raises_by_default(
_mock_get_content: MagicMock,
mock_ctx: MagicMock,
) -> None:
from onyx.file_store.postgres_file_store import PostgresBackedFileStore
mock_ctx.return_value = _mock_db_session()
store = PostgresBackedFileStore()
with pytest.raises(RuntimeError, match="does not exist"):
store.delete_file("nonexistent")
@patch(f"{_PG_MODULE}.get_session_with_current_tenant_if_none")
@patch(f"{_PG_MODULE}.get_file_content_by_file_id_optional", return_value=None)
@patch(f"{_PG_MODULE}.delete_file_content_by_file_id")
@patch(f"{_PG_MODULE}.delete_filerecord_by_file_id")
def test_pg_delete_missing_file_silent_when_error_on_missing_false(
mock_delete_record: MagicMock,
mock_delete_content: MagicMock,
_mock_get_content: MagicMock,
mock_ctx: MagicMock,
) -> None:
from onyx.file_store.postgres_file_store import PostgresBackedFileStore
mock_ctx.return_value = _mock_db_session()
store = PostgresBackedFileStore()
store.delete_file("nonexistent", error_on_missing=False)
mock_delete_record.assert_not_called()
mock_delete_content.assert_not_called()


@@ -98,7 +98,6 @@ Useful hardening flags:
| `serve` | Serve the interactive chat TUI over SSH |
| `configure` | Configure server URL and API key |
| `validate-config` | Validate configuration and test connection |
| `install-skill` | Install the agent skill file into a project |
## Slash Commands (in TUI)


@@ -7,7 +7,6 @@ import (
"github.com/onyx-dot-app/onyx/cli/internal/api"
"github.com/onyx-dot-app/onyx/cli/internal/config"
"github.com/onyx-dot-app/onyx/cli/internal/exitcodes"
"github.com/spf13/cobra"
)
@@ -17,23 +16,16 @@ func newAgentsCmd() *cobra.Command {
cmd := &cobra.Command{
Use: "agents",
Short: "List available agents",
Long: `List all visible agents configured on the Onyx server.
By default, output is a human-readable table with ID, name, and description.
Use --json for machine-readable output.`,
Example: ` onyx-cli agents
onyx-cli agents --json
onyx-cli agents --json | jq '.[].name'`,
RunE: func(cmd *cobra.Command, args []string) error {
cfg := config.Load()
if !cfg.IsConfigured() {
return exitcodes.New(exitcodes.NotConfigured, "onyx CLI is not configured\n Run: onyx-cli configure")
return fmt.Errorf("onyx CLI is not configured — run 'onyx-cli configure' first")
}
client := api.NewClient(cfg)
agents, err := client.ListAgents(cmd.Context())
if err != nil {
return fmt.Errorf("failed to list agents: %w\n Check your connection with: onyx-cli validate-config", err)
return fmt.Errorf("failed to list agents: %w", err)
}
if agentsJSON {


@@ -4,65 +4,33 @@ import (
"context"
"encoding/json"
"fmt"
"io"
"os"
"os/signal"
"strings"
"syscall"
"github.com/onyx-dot-app/onyx/cli/internal/api"
"github.com/onyx-dot-app/onyx/cli/internal/config"
"github.com/onyx-dot-app/onyx/cli/internal/exitcodes"
"github.com/onyx-dot-app/onyx/cli/internal/models"
"github.com/onyx-dot-app/onyx/cli/internal/overflow"
"github.com/spf13/cobra"
"golang.org/x/term"
)
const defaultMaxOutputBytes = 4096
func newAskCmd() *cobra.Command {
var (
askAgentID int
askJSON bool
askQuiet bool
askPrompt string
maxOutput int
)
cmd := &cobra.Command{
Use: "ask [question]",
Short: "Ask a one-shot question (non-interactive)",
Long: `Send a one-shot question to an Onyx agent and print the response.
The question can be provided as a positional argument, via --prompt, or piped
through stdin. When stdin contains piped data, it is sent as context along
with the question from --prompt (or used as the question itself).
When stdout is not a TTY (e.g., called by a script or AI agent), output is
automatically truncated to --max-output bytes and the full response is saved
to a temp file. Set --max-output 0 to disable truncation.`,
Args: cobra.MaximumNArgs(1),
Example: ` onyx-cli ask "What connectors are available?"
onyx-cli ask --agent-id 3 "Summarize our Q4 revenue"
onyx-cli ask --json "List all users" | jq '.event.content'
cat error.log | onyx-cli ask --prompt "Find the root cause"
echo "what is onyx?" | onyx-cli ask`,
Args: cobra.ExactArgs(1),
RunE: func(cmd *cobra.Command, args []string) error {
cfg := config.Load()
if !cfg.IsConfigured() {
return exitcodes.New(exitcodes.NotConfigured, "onyx CLI is not configured\n Run: onyx-cli configure")
}
if askJSON && askQuiet {
return exitcodes.New(exitcodes.BadRequest, "--json and --quiet cannot be used together")
}
question, err := resolveQuestion(args, askPrompt)
if err != nil {
return err
return fmt.Errorf("onyx CLI is not configured — run 'onyx-cli configure' first")
}
question := args[0]
agentID := cfg.DefaultAgentID
if cmd.Flags().Changed("agent-id") {
agentID = askAgentID
@@ -82,23 +50,9 @@ to a temp file. Set --max-output 0 to disable truncation.`,
nil,
)
// Determine truncation threshold.
isTTY := term.IsTerminal(int(os.Stdout.Fd()))
truncateAt := 0 // 0 means no truncation
if cmd.Flags().Changed("max-output") {
truncateAt = maxOutput
} else if !isTTY {
truncateAt = defaultMaxOutputBytes
}
var sessionID string
var lastErr error
gotStop := false
// Overflow writer: tees to stdout and optionally to a temp file.
// In quiet mode, buffer everything and print once at the end.
ow := &overflow.Writer{Limit: truncateAt, Quiet: askQuiet}
for event := range ch {
if e, ok := event.(models.SessionCreatedEvent); ok {
sessionID = e.ChatSessionID
@@ -128,50 +82,22 @@ to a temp file. Set --max-output 0 to disable truncation.`,
switch e := event.(type) {
case models.MessageDeltaEvent:
ow.Write(e.Content)
case models.SearchStartEvent:
if isTTY && !askQuiet {
if e.IsInternetSearch {
fmt.Fprintf(os.Stderr, "\033[2mSearching the web...\033[0m\n")
} else {
fmt.Fprintf(os.Stderr, "\033[2mSearching documents...\033[0m\n")
}
}
case models.SearchQueriesEvent:
if isTTY && !askQuiet {
for _, q := range e.Queries {
fmt.Fprintf(os.Stderr, "\033[2m → %s\033[0m\n", q)
}
}
case models.SearchDocumentsEvent:
if isTTY && !askQuiet && len(e.Documents) > 0 {
fmt.Fprintf(os.Stderr, "\033[2mFound %d documents\033[0m\n", len(e.Documents))
}
case models.ReasoningStartEvent:
if isTTY && !askQuiet {
fmt.Fprintf(os.Stderr, "\033[2mThinking...\033[0m\n")
}
case models.ToolStartEvent:
if isTTY && !askQuiet && e.ToolName != "" {
fmt.Fprintf(os.Stderr, "\033[2mUsing %s...\033[0m\n", e.ToolName)
}
fmt.Print(e.Content)
case models.ErrorEvent:
ow.Finish()
return fmt.Errorf("%s", e.Error)
case models.StopEvent:
ow.Finish()
fmt.Println()
return nil
}
}
if !askJSON {
ow.Finish()
}
if ctx.Err() != nil {
if sessionID != "" {
client.StopChatSession(context.Background(), sessionID)
}
if !askJSON {
fmt.Println()
}
return nil
}
@@ -179,56 +105,20 @@ to a temp file. Set --max-output 0 to disable truncation.`,
return lastErr
}
if !gotStop {
if !askJSON {
fmt.Println()
}
return fmt.Errorf("stream ended unexpectedly")
}
if !askJSON {
fmt.Println()
}
return nil
},
}
cmd.Flags().IntVar(&askAgentID, "agent-id", 0, "Agent ID to use")
cmd.Flags().BoolVar(&askJSON, "json", false, "Output raw JSON events")
cmd.Flags().BoolVarP(&askQuiet, "quiet", "q", false, "Buffer output and print once at end (no streaming)")
cmd.Flags().StringVar(&askPrompt, "prompt", "", "Question text (use with piped stdin context)")
cmd.Flags().IntVar(&maxOutput, "max-output", defaultMaxOutputBytes,
"Max bytes to print before truncating (0 to disable, auto-enabled for non-TTY)")
// Suppress cobra's default error/usage on RunE errors
return cmd
}
// resolveQuestion builds the final question string from args, --prompt, and stdin.
func resolveQuestion(args []string, prompt string) (string, error) {
hasArg := len(args) > 0
hasPrompt := prompt != ""
hasStdin := !term.IsTerminal(int(os.Stdin.Fd()))
if hasArg && hasPrompt {
return "", exitcodes.New(exitcodes.BadRequest, "specify the question as an argument or --prompt, not both")
}
var stdinContent string
if hasStdin {
const maxStdinBytes = 10 * 1024 * 1024 // 10MB
data, err := io.ReadAll(io.LimitReader(os.Stdin, maxStdinBytes))
if err != nil {
return "", fmt.Errorf("failed to read stdin: %w", err)
}
stdinContent = strings.TrimSpace(string(data))
}
switch {
case hasArg && stdinContent != "":
// arg is the question, stdin is context
return args[0] + "\n\n" + stdinContent, nil
case hasArg:
return args[0], nil
case hasPrompt && stdinContent != "":
// --prompt is the question, stdin is context
return prompt + "\n\n" + stdinContent, nil
case hasPrompt:
return prompt, nil
case stdinContent != "":
return stdinContent, nil
default:
return "", exitcodes.New(exitcodes.BadRequest, "no question provided\n Usage: onyx-cli ask \"your question\"\n Or: echo \"context\" | onyx-cli ask --prompt \"your question\"")
}
}


@@ -10,16 +10,9 @@ import (
)
func newChatCmd() *cobra.Command {
var noStreamMarkdown bool
cmd := &cobra.Command{
return &cobra.Command{
Use: "chat",
Short: "Launch the interactive chat TUI (default)",
Long: `Launch the interactive terminal UI for chatting with your Onyx agent.
This is the default command when no subcommand is specified. On first run,
an interactive setup wizard will guide you through configuration.`,
Example: ` onyx-cli chat
onyx-cli`,
RunE: func(cmd *cobra.Command, args []string) error {
cfg := config.Load()
@@ -32,12 +25,6 @@ an interactive setup wizard will guide you through configuration.`,
cfg = *result
}
// CLI flag overrides config/env
if cmd.Flags().Changed("no-stream-markdown") {
v := !noStreamMarkdown
cfg.Features.StreamMarkdown = &v
}
starprompt.MaybePrompt()
m := tui.NewModel(cfg)
@@ -46,8 +33,4 @@ an interactive setup wizard will guide you through configuration.`,
return err
},
}
cmd.Flags().BoolVar(&noStreamMarkdown, "no-stream-markdown", false, "Disable progressive markdown rendering during streaming")
return cmd
}


@@ -1,126 +1,19 @@
package cmd
import (
"context"
"errors"
"fmt"
"io"
"os"
"strings"
"time"
"github.com/onyx-dot-app/onyx/cli/internal/api"
"github.com/onyx-dot-app/onyx/cli/internal/config"
"github.com/onyx-dot-app/onyx/cli/internal/exitcodes"
"github.com/onyx-dot-app/onyx/cli/internal/onboarding"
"github.com/spf13/cobra"
"golang.org/x/term"
)
func newConfigureCmd() *cobra.Command {
var (
serverURL string
apiKey string
apiKeyStdin bool
dryRun bool
)
cmd := &cobra.Command{
return &cobra.Command{
Use: "configure",
Short: "Configure server URL and API key",
Long: `Set up the Onyx CLI with your server URL and API key.
When --server-url and --api-key are both provided, the configuration is saved
non-interactively (useful for scripts and AI agents). Otherwise, an interactive
setup wizard is launched.
If --api-key is omitted but stdin has piped data, the API key is read from
stdin automatically. You can also use --api-key-stdin to make this explicit.
This avoids leaking the key in shell history.
Use --dry-run to test the connection without saving the configuration.`,
Example: ` onyx-cli configure
onyx-cli configure --server-url https://my-onyx.com --api-key sk-...
echo "$ONYX_API_KEY" | onyx-cli configure --server-url https://my-onyx.com
echo "$ONYX_API_KEY" | onyx-cli configure --server-url https://my-onyx.com --api-key-stdin
onyx-cli configure --server-url https://my-onyx.com --api-key sk-... --dry-run`,
RunE: func(cmd *cobra.Command, args []string) error {
// Read API key from stdin if piped (implicit) or --api-key-stdin (explicit)
if apiKeyStdin && apiKey != "" {
return exitcodes.New(exitcodes.BadRequest, "--api-key and --api-key-stdin cannot be used together")
}
if (apiKey == "" && !term.IsTerminal(int(os.Stdin.Fd()))) || apiKeyStdin {
data, err := io.ReadAll(os.Stdin)
if err != nil {
return fmt.Errorf("failed to read API key from stdin: %w", err)
}
apiKey = strings.TrimSpace(string(data))
}
if serverURL != "" && apiKey != "" {
return configureNonInteractive(serverURL, apiKey, dryRun)
}
if dryRun {
return exitcodes.New(exitcodes.BadRequest, "--dry-run requires --server-url and --api-key")
}
if serverURL != "" || apiKey != "" {
return exitcodes.New(exitcodes.BadRequest, "both --server-url and --api-key are required for non-interactive setup\n Run 'onyx-cli configure' without flags for interactive setup")
}
cfg := config.Load()
onboarding.Run(&cfg)
return nil
},
}
cmd.Flags().StringVar(&serverURL, "server-url", "", "Onyx server URL (e.g., https://cloud.onyx.app)")
cmd.Flags().StringVar(&apiKey, "api-key", "", "API key for authentication (or pipe via stdin)")
cmd.Flags().BoolVar(&apiKeyStdin, "api-key-stdin", false, "Read API key from stdin (explicit; also happens automatically when stdin is piped)")
cmd.Flags().BoolVar(&dryRun, "dry-run", false, "Test connection without saving config (requires --server-url and --api-key)")
return cmd
}
func configureNonInteractive(serverURL, apiKey string, dryRun bool) error {
cfg := config.OnyxCliConfig{
ServerURL: serverURL,
APIKey: apiKey,
DefaultAgentID: 0,
}
// Preserve existing default agent ID from disk (not env overrides)
if existing := config.LoadFromDisk(); existing.DefaultAgentID != 0 {
cfg.DefaultAgentID = existing.DefaultAgentID
}
// Test connection
client := api.NewClient(cfg)
ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
defer cancel()
if err := client.TestConnection(ctx); err != nil {
var authErr *api.AuthError
if errors.As(err, &authErr) {
return exitcodes.Newf(exitcodes.AuthFailure, "authentication failed: %v\n Check your API key", err)
}
return exitcodes.Newf(exitcodes.Unreachable, "connection failed: %v\n Check your server URL", err)
}
if dryRun {
fmt.Printf("Server: %s\n", serverURL)
fmt.Println("Status: connected and authenticated")
fmt.Println("Dry run: config was NOT saved")
return nil
}
if err := config.Save(cfg); err != nil {
return fmt.Errorf("could not save config: %w", err)
}
fmt.Printf("Config: %s\n", config.ConfigFilePath())
fmt.Printf("Server: %s\n", serverURL)
fmt.Println("Status: connected and authenticated")
return nil
}


@@ -1,20 +0,0 @@
package cmd
import (
"fmt"
"github.com/onyx-dot-app/onyx/cli/internal/config"
"github.com/spf13/cobra"
)
func newExperimentsCmd() *cobra.Command {
return &cobra.Command{
Use: "experiments",
Short: "List experimental features and their status",
RunE: func(cmd *cobra.Command, args []string) error {
cfg := config.Load()
_, _ = fmt.Fprintln(cmd.OutOrStdout(), config.ExperimentsText(cfg.Features))
return nil
},
}
}


@@ -1,176 +0,0 @@
package cmd
import (
"fmt"
"os"
"path/filepath"
"github.com/onyx-dot-app/onyx/cli/internal/embedded"
"github.com/onyx-dot-app/onyx/cli/internal/fsutil"
"github.com/spf13/cobra"
)
// agentSkillDirs maps agent names to their skill directory paths (relative to
// the project or home root). "Universal" agents like Cursor and Codex read
// from .agents/skills directly, so they don't need their own entry here.
var agentSkillDirs = map[string]string{
"claude-code": filepath.Join(".claude", "skills"),
}
const (
canonicalDir = ".agents/skills"
skillName = "onyx-cli"
)
func newInstallSkillCmd() *cobra.Command {
var (
global bool
copyMode bool
agents []string
)
cmd := &cobra.Command{
Use: "install-skill",
Short: "Install the Onyx CLI agent skill file",
Long: `Install the bundled SKILL.md so that AI coding agents can discover and use
the Onyx CLI as a tool.
Files are written to the canonical .agents/skills/onyx-cli/ directory. For
agents that use their own skill directory (e.g. Claude Code uses .claude/skills/),
a symlink is created pointing back to the canonical copy.
By default the skill is installed at the project level (current directory).
Use --global to install under your home directory instead.
Use --copy to write independent copies instead of symlinks.
Use --agent to target specific agents (can be repeated).`,
Example: ` onyx-cli install-skill
onyx-cli install-skill --global
onyx-cli install-skill --agent claude-code
onyx-cli install-skill --copy`,
RunE: func(cmd *cobra.Command, args []string) error {
base, err := installBase(global)
if err != nil {
return err
}
// Write the canonical copy.
canonicalSkillDir := filepath.Join(base, canonicalDir, skillName)
dest := filepath.Join(canonicalSkillDir, "SKILL.md")
content := []byte(embedded.SkillMD)
status, err := fsutil.CompareFile(dest, content)
if err != nil {
return err
}
switch status {
case fsutil.StatusUpToDate:
_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Up to date %s\n", dest)
case fsutil.StatusDiffers:
_, _ = fmt.Fprintf(cmd.ErrOrStderr(), "Warning: overwriting modified %s\n", dest)
if err := os.WriteFile(dest, content, 0o644); err != nil {
return fmt.Errorf("could not write skill file: %w", err)
}
_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Installed %s\n", dest)
default: // statusMissing
if err := os.MkdirAll(canonicalSkillDir, 0o755); err != nil {
return fmt.Errorf("could not create directory: %w", err)
}
if err := os.WriteFile(dest, content, 0o644); err != nil {
return fmt.Errorf("could not write skill file: %w", err)
}
_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Installed %s\n", dest)
}
// Determine which agents to link.
targets := agentSkillDirs
if len(agents) > 0 {
targets = make(map[string]string)
for _, a := range agents {
dir, ok := agentSkillDirs[a]
if !ok {
_, _ = fmt.Fprintf(cmd.ErrOrStderr(), "Unknown agent %q (skipped) — known agents:", a)
for name := range agentSkillDirs {
_, _ = fmt.Fprintf(cmd.ErrOrStderr(), " %s", name)
}
_, _ = fmt.Fprintln(cmd.ErrOrStderr())
continue
}
targets[a] = dir
}
}
// Create symlinks (or copies) from agent-specific dirs to canonical.
for name, skillsDir := range targets {
agentSkillDir := filepath.Join(base, skillsDir, skillName)
if copyMode {
copyDest := filepath.Join(agentSkillDir, "SKILL.md")
if err := fsutil.EnsureDirForCopy(agentSkillDir); err != nil {
return fmt.Errorf("could not prepare %s directory: %w", name, err)
}
if err := os.MkdirAll(agentSkillDir, 0o755); err != nil {
return fmt.Errorf("could not create %s directory: %w", name, err)
}
if err := os.WriteFile(copyDest, []byte(embedded.SkillMD), 0o644); err != nil {
return fmt.Errorf("could not write %s skill file: %w", name, err)
}
_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Copied %s\n", copyDest)
continue
}
// Compute relative symlink target. Symlinks resolve relative to
// the parent directory of the link, not the link itself.
rel, err := filepath.Rel(filepath.Dir(agentSkillDir), canonicalSkillDir)
if err != nil {
return fmt.Errorf("could not compute relative path for %s: %w", name, err)
}
if err := os.MkdirAll(filepath.Dir(agentSkillDir), 0o755); err != nil {
return fmt.Errorf("could not create %s directory: %w", name, err)
}
// Remove existing symlink/dir before creating.
_ = os.Remove(agentSkillDir)
if err := os.Symlink(rel, agentSkillDir); err != nil {
// Fall back to copy if symlink fails (e.g. Windows without dev mode).
copyDest := filepath.Join(agentSkillDir, "SKILL.md")
if mkErr := os.MkdirAll(agentSkillDir, 0o755); mkErr != nil {
return fmt.Errorf("could not create %s directory: %w", name, mkErr)
}
if wErr := os.WriteFile(copyDest, []byte(embedded.SkillMD), 0o644); wErr != nil {
return fmt.Errorf("could not write %s skill file: %w", name, wErr)
}
_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Copied %s (symlink failed)\n", copyDest)
continue
}
_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Linked %s -> %s\n", agentSkillDir, rel)
}
return nil
},
}
cmd.Flags().BoolVarP(&global, "global", "g", false, "Install to home directory instead of project")
cmd.Flags().BoolVar(&copyMode, "copy", false, "Copy files instead of symlinking")
cmd.Flags().StringSliceVarP(&agents, "agent", "a", nil, "Target specific agents (e.g. claude-code)")
return cmd
}
func installBase(global bool) (string, error) {
if global {
home, err := os.UserHomeDir()
if err != nil {
return "", fmt.Errorf("could not determine home directory: %w", err)
}
return home, nil
}
cwd, err := os.Getwd()
if err != nil {
return "", fmt.Errorf("could not determine working directory: %w", err)
}
return cwd, nil
}


@@ -97,8 +97,6 @@ func Execute() error {
rootCmd.AddCommand(newConfigureCmd())
rootCmd.AddCommand(newValidateConfigCmd())
rootCmd.AddCommand(newServeCmd())
rootCmd.AddCommand(newInstallSkillCmd())
rootCmd.AddCommand(newExperimentsCmd())
// Default command is chat, but intercept --version first
rootCmd.RunE = func(cmd *cobra.Command, args []string) error {

View File

@@ -23,7 +23,6 @@ import (
"github.com/charmbracelet/wish/ratelimiter"
"github.com/onyx-dot-app/onyx/cli/internal/api"
"github.com/onyx-dot-app/onyx/cli/internal/config"
"github.com/onyx-dot-app/onyx/cli/internal/exitcodes"
"github.com/onyx-dot-app/onyx/cli/internal/tui"
"github.com/spf13/cobra"
"golang.org/x/time/rate"
@@ -296,15 +295,15 @@ provided via the ONYX_API_KEY environment variable to skip the prompt:
The server URL is taken from the server operator's config. The server
auto-generates an Ed25519 host key on first run if the key file does not
already exist. The host key path can also be set via the ONYX_SSH_HOST_KEY
environment variable (the --host-key flag takes precedence).`,
Example: ` onyx-cli serve --port 2222
ssh localhost -p 2222
onyx-cli serve --host 0.0.0.0 --port 2222
onyx-cli serve --idle-timeout 30m --max-session-timeout 2h`,
environment variable (the --host-key flag takes precedence).
Example:
onyx-cli serve --port 2222
ssh localhost -p 2222`,
RunE: func(cmd *cobra.Command, args []string) error {
serverCfg := config.Load()
if serverCfg.ServerURL == "" {
return exitcodes.New(exitcodes.NotConfigured, "server URL is not configured\n Run: onyx-cli configure")
return fmt.Errorf("server URL is not configured; run 'onyx-cli configure' first")
}
if !cmd.Flags().Changed("host-key") {
if v := os.Getenv(config.EnvSSHHostKey); v != "" {

View File

@@ -2,13 +2,11 @@ package cmd
import (
"context"
"errors"
"fmt"
"time"
"github.com/onyx-dot-app/onyx/cli/internal/api"
"github.com/onyx-dot-app/onyx/cli/internal/config"
"github.com/onyx-dot-app/onyx/cli/internal/exitcodes"
"github.com/onyx-dot-app/onyx/cli/internal/version"
log "github.com/sirupsen/logrus"
"github.com/spf13/cobra"
@@ -18,21 +16,17 @@ func newValidateConfigCmd() *cobra.Command {
return &cobra.Command{
Use: "validate-config",
Short: "Validate configuration and test server connection",
Long: `Check that the CLI is configured, the server is reachable, and the API key
is valid. Also reports the server version and warns if it is below the
minimum required.`,
Example: ` onyx-cli validate-config`,
RunE: func(cmd *cobra.Command, args []string) error {
// Check config file
if !config.ConfigExists() {
return exitcodes.Newf(exitcodes.NotConfigured, "config file not found at %s\n Run: onyx-cli configure", config.ConfigFilePath())
return fmt.Errorf("config file not found at %s\n Run 'onyx-cli configure' to set up", config.ConfigFilePath())
}
cfg := config.Load()
// Check API key
if !cfg.IsConfigured() {
return exitcodes.New(exitcodes.NotConfigured, "API key is missing\n Run: onyx-cli configure")
return fmt.Errorf("API key is missing\n Run 'onyx-cli configure' to set up")
}
_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Config: %s\n", config.ConfigFilePath())
@@ -41,11 +35,7 @@ minimum required.`,
// Test connection
client := api.NewClient(cfg)
if err := client.TestConnection(cmd.Context()); err != nil {
var authErr *api.AuthError
if errors.As(err, &authErr) {
return exitcodes.Newf(exitcodes.AuthFailure, "authentication failed: %v\n Reconfigure with: onyx-cli configure", err)
}
return exitcodes.Newf(exitcodes.Unreachable, "connection failed: %v\n Reconfigure with: onyx-cli configure", err)
return fmt.Errorf("connection failed: %w", err)
}
_, _ = fmt.Fprintln(cmd.OutOrStdout(), "Status: connected and authenticated")

View File

@@ -149,12 +149,12 @@ func (c *Client) TestConnection(ctx context.Context) error {
if resp2.StatusCode == 401 || resp2.StatusCode == 403 {
if isHTML || strings.Contains(respServer, "awselb") {
return &AuthError{Message: fmt.Sprintf("HTTP %d from a reverse proxy (not the Onyx backend).\n Check your deployment's ingress / proxy configuration", resp2.StatusCode)}
return fmt.Errorf("HTTP %d from a reverse proxy (not the Onyx backend).\n Check your deployment's ingress / proxy configuration", resp2.StatusCode)
}
if resp2.StatusCode == 401 {
return &AuthError{Message: fmt.Sprintf("invalid API key or token.\n %s", body)}
return fmt.Errorf("invalid API key or token.\n %s", body)
}
return &AuthError{Message: fmt.Sprintf("access denied — check that the API key is valid.\n %s", body)}
return fmt.Errorf("access denied — check that the API key is valid.\n %s", body)
}
detail := fmt.Sprintf("HTTP %d", resp2.StatusCode)

View File

@@ -11,12 +11,3 @@ type OnyxAPIError struct {
func (e *OnyxAPIError) Error() string {
return fmt.Sprintf("HTTP %d: %s", e.StatusCode, e.Detail)
}
// AuthError is returned when authentication or authorization fails.
type AuthError struct {
Message string
}
func (e *AuthError) Error() string {
return e.Message
}

View File

@@ -9,47 +9,28 @@ import (
)
const (
EnvServerURL = "ONYX_SERVER_URL"
EnvAPIKey = "ONYX_API_KEY"
EnvAgentID = "ONYX_PERSONA_ID"
EnvSSHHostKey = "ONYX_SSH_HOST_KEY"
EnvStreamMarkdown = "ONYX_STREAM_MARKDOWN"
EnvServerURL = "ONYX_SERVER_URL"
EnvAPIKey = "ONYX_API_KEY"
EnvAgentID = "ONYX_PERSONA_ID"
EnvSSHHostKey = "ONYX_SSH_HOST_KEY"
)
// Features holds experimental feature flags for the CLI.
type Features struct {
// StreamMarkdown enables progressive markdown rendering during streaming,
// so output is formatted as it arrives rather than after completion.
// nil means use the app default (true).
StreamMarkdown *bool `json:"stream_markdown,omitempty"`
}
// OnyxCliConfig holds the CLI configuration.
type OnyxCliConfig struct {
ServerURL string `json:"server_url"`
APIKey string `json:"api_key"`
DefaultAgentID int `json:"default_persona_id"`
Features Features `json:"features,omitempty"`
ServerURL string `json:"server_url"`
APIKey string `json:"api_key"`
DefaultAgentID int `json:"default_persona_id"`
}
// DefaultConfig returns a config with default values.
func DefaultConfig() OnyxCliConfig {
return OnyxCliConfig{
ServerURL: "https://cloud.onyx.app",
APIKey: "",
ServerURL: "https://cloud.onyx.app",
APIKey: "",
DefaultAgentID: 0,
}
}
// StreamMarkdownEnabled returns whether stream markdown is enabled,
// defaulting to true when the user hasn't set an explicit preference.
func (f Features) StreamMarkdownEnabled() bool {
if f.StreamMarkdown != nil {
return *f.StreamMarkdown
}
return true
}
// IsConfigured returns true if the config has an API key.
func (c OnyxCliConfig) IsConfigured() bool {
return c.APIKey != ""
@@ -78,10 +59,8 @@ func ConfigExists() bool {
return err == nil
}
// LoadFromDisk reads config from the file only, without applying environment
// variable overrides. Use this when you need the persisted config values
// (e.g., to preserve them during a save operation).
func LoadFromDisk() OnyxCliConfig {
// Load reads config from file and applies environment variable overrides.
func Load() OnyxCliConfig {
cfg := DefaultConfig()
data, err := os.ReadFile(ConfigFilePath())
@@ -91,13 +70,6 @@ func LoadFromDisk() OnyxCliConfig {
}
}
return cfg
}
// Load reads config from file and applies environment variable overrides.
func Load() OnyxCliConfig {
cfg := LoadFromDisk()
// Environment overrides
if v := os.Getenv(EnvServerURL); v != "" {
cfg.ServerURL = v
@@ -110,13 +82,6 @@ func Load() OnyxCliConfig {
cfg.DefaultAgentID = id
}
}
if v := os.Getenv(EnvStreamMarkdown); v != "" {
if b, err := strconv.ParseBool(v); err == nil {
cfg.Features.StreamMarkdown = &b
} else {
fmt.Fprintf(os.Stderr, "warning: invalid value %q for %s (expected true/false), ignoring\n", v, EnvStreamMarkdown)
}
}
return cfg
}

View File

@@ -9,7 +9,7 @@ import (
func clearEnvVars(t *testing.T) {
t.Helper()
for _, key := range []string{EnvServerURL, EnvAPIKey, EnvAgentID, EnvStreamMarkdown} {
for _, key := range []string{EnvServerURL, EnvAPIKey, EnvAgentID} {
t.Setenv(key, "")
if err := os.Unsetenv(key); err != nil {
t.Fatal(err)
@@ -199,48 +199,6 @@ func TestSaveAndReload(t *testing.T) {
}
}
func TestDefaultFeaturesStreamMarkdownNil(t *testing.T) {
cfg := DefaultConfig()
if cfg.Features.StreamMarkdown != nil {
t.Error("expected StreamMarkdown to be nil by default")
}
if !cfg.Features.StreamMarkdownEnabled() {
t.Error("expected StreamMarkdownEnabled() to return true when nil")
}
}
func TestEnvOverrideStreamMarkdownFalse(t *testing.T) {
clearEnvVars(t)
dir := t.TempDir()
t.Setenv("XDG_CONFIG_HOME", dir)
t.Setenv(EnvStreamMarkdown, "false")
cfg := Load()
if cfg.Features.StreamMarkdown == nil || *cfg.Features.StreamMarkdown {
t.Error("expected StreamMarkdown=false from env override")
}
}
func TestLoadFeaturesFromFile(t *testing.T) {
clearEnvVars(t)
dir := t.TempDir()
t.Setenv("XDG_CONFIG_HOME", dir)
data, _ := json.Marshal(map[string]interface{}{
"server_url": "https://example.com",
"api_key": "key",
"features": map[string]interface{}{
"stream_markdown": true,
},
})
writeConfig(t, dir, data)
cfg := Load()
if cfg.Features.StreamMarkdown == nil || !*cfg.Features.StreamMarkdown {
t.Error("expected StreamMarkdown=true from config file")
}
}
func TestSaveCreatesParentDirs(t *testing.T) {
clearEnvVars(t)
dir := t.TempDir()

View File

@@ -1,46 +0,0 @@
package config
import "fmt"
// Experiment describes an experimental feature flag.
type Experiment struct {
Name string
Flag string // CLI flag name
EnvVar string // environment variable name
Config string // JSON path in config file
Enabled bool
Desc string
}
// Experiments returns the list of available experimental features
// with their current status based on the given feature flags.
func Experiments(f Features) []Experiment {
return []Experiment{
{
Name: "Stream Markdown",
Flag: "--no-stream-markdown",
EnvVar: EnvStreamMarkdown,
Config: "features.stream_markdown",
Enabled: f.StreamMarkdownEnabled(),
Desc: "Render markdown progressively as the response streams in (enabled by default)",
},
}
}
// ExperimentsText formats the experiments list for display.
func ExperimentsText(f Features) string {
exps := Experiments(f)
text := "Experimental Features\n\n"
for _, e := range exps {
status := "off"
if e.Enabled {
status = "on"
}
text += fmt.Sprintf(" %-20s [%s]\n", e.Name, status)
text += fmt.Sprintf(" %s\n", e.Desc)
text += fmt.Sprintf(" flag: %s env: %s config: %s\n\n", e.Flag, e.EnvVar, e.Config)
}
text += "Toggle via CLI flag, environment variable, or config file.\n"
text += "Example: onyx-cli chat --no-stream-markdown"
return text
}

View File

@@ -1,187 +0,0 @@
---
name: onyx-cli
description: Query the Onyx knowledge base using the onyx-cli command. Use when the user wants to search company documents, ask questions about internal knowledge, query connected data sources, or look up information stored in Onyx.
---
# Onyx CLI — Agent Tool
Onyx is an enterprise search and Gen-AI platform that connects to company documents, apps, and people. The `onyx-cli` command provides non-interactive commands to query the Onyx knowledge base and list available agents.
## Prerequisites
### 1. Check if installed
```bash
which onyx-cli
```
### 2. Install (if needed)
**Primary — pip:**
```bash
pip install onyx-cli
```
**From source (Go):**
```bash
go build -o onyx-cli github.com/onyx-dot-app/onyx/cli && sudo mv onyx-cli /usr/local/bin/
```
### 3. Check if configured
```bash
onyx-cli validate-config
```
This checks that the config file exists and the API key is present, then tests the server connection via `/api/me`. Exit code 0 on success; non-zero with a descriptive error on failure.
If unconfigured, you have two options:
**Option A — Interactive setup (requires user input):**
```bash
onyx-cli configure
```
This prompts for the Onyx server URL and API key, tests the connection, and saves config.
**Option B — Environment variables (non-interactive, preferred for agents):**
```bash
export ONYX_SERVER_URL="https://your-onyx-server.com" # default: https://cloud.onyx.app
export ONYX_API_KEY="your-api-key"
```
Environment variables override the config file. If these are set, no config file is needed.
| Variable | Required | Description |
| ----------------- | -------- | -------------------------------------------------------- |
| `ONYX_SERVER_URL` | No | Onyx server base URL (default: `https://cloud.onyx.app`) |
| `ONYX_API_KEY` | Yes | API key for authentication |
| `ONYX_PERSONA_ID` | No | Default agent/persona ID |
If neither the config file nor environment variables are set, tell the user that `onyx-cli` needs to be configured and ask them to either:
- Run `onyx-cli configure` interactively, or
- Set `ONYX_SERVER_URL` and `ONYX_API_KEY` environment variables
## Commands
### Validate configuration
```bash
onyx-cli validate-config
```
Checks that the config file exists and the API key is present, then tests the server connection. Use this before `ask` or `agents` to confirm the CLI is properly set up.
### List available agents
```bash
onyx-cli agents
```
Prints a table of agent IDs, names, and descriptions. Use `--json` for structured output:
```bash
onyx-cli agents --json
```
Use agent IDs with `ask --agent-id` to query a specific agent.
### Basic query (plain text output)
```bash
onyx-cli ask "What is our company's PTO policy?"
```
Streams the answer as plain text to stdout. Exit code 0 on success, non-zero on error.
### JSON output (structured events)
```bash
onyx-cli ask --json "What authentication methods do we support?"
```
Outputs newline-delimited JSON: one parsed stream event object per line. Key event types include message deltas, stop, errors, search start, and citations.
Each line is a JSON object with this envelope:
```json
{"type": "<event_type>", "event": { ... }}
```
| Event Type | Description |
| ------------------- | -------------------------------------------------------------------- |
| `message_delta` | Content token — concatenate all `content` fields for the full answer |
| `stop` | Stream complete |
| `error` | Error with `error` message field |
| `search_tool_start` | Onyx started searching documents |
| `citation_info` | Source citation — see shape below |
`citation_info` event shape:
```json
{
"type": "citation_info",
"event": {
"citation_number": 1,
"document_id": "abc123def456",
"placement": { "turn_index": 0, "tab_index": 0, "sub_turn_index": null }
}
}
```
`placement` is metadata about where in the conversation the citation appeared and can be ignored for most use cases.
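As an illustration of consuming this envelope (this script is not part of the CLI; the sample events below are hypothetical, not real server output), an agent can read stdout line by line, concatenate `message_delta` content, and collect citations:

```python
import json

# Hypothetical sample of `onyx-cli ask --json` output (NDJSON, one event per line).
sample = "\n".join([
    '{"type": "message_delta", "event": {"content": "PTO is "}}',
    '{"type": "message_delta", "event": {"content": "20 days."}}',
    '{"type": "citation_info", "event": {"citation_number": 1, "document_id": "abc123def456"}}',
    '{"type": "stop", "event": {}}',
])

answer_parts, citations = [], []
for line in sample.splitlines():
    evt = json.loads(line)
    if evt["type"] == "message_delta":
        # Concatenate content tokens to build the full answer.
        answer_parts.append(evt["event"]["content"])
    elif evt["type"] == "citation_info":
        citations.append(evt["event"]["document_id"])
    elif evt["type"] == "stop":
        break  # stream complete

answer = "".join(answer_parts)
print(answer)     # full answer text
print(citations)  # cited document IDs
```

In practice you would replace `sample.splitlines()` with iterating over the subprocess's stdout.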
### Specify an agent
```bash
onyx-cli ask --agent-id 5 "Summarize our Q4 roadmap"
```
Uses a specific Onyx agent/persona instead of the default.
### All flags
| Flag | Type | Description |
| ------------ | ---- | ---------------------------------------------- |
| `--agent-id` | int | Agent ID to use (overrides default) |
| `--json` | bool | Output raw NDJSON events instead of plain text |
## Statelessness
Each `onyx-cli ask` call creates an independent chat session. There is no built-in way to chain context across multiple `ask` invocations — every call starts fresh. If you need multi-turn conversation with memory, use the interactive TUI (`onyx-cli` or `onyx-cli chat`) instead.
## When to Use
Use `onyx-cli ask` when:
- The user asks about company-specific information (policies, docs, processes)
- You need to search internal knowledge bases or connected data sources
- The user references Onyx, asks you to "search Onyx", or wants to query their documents
- You need context from company wikis, Confluence, Google Drive, Slack, or other connected sources
Do NOT use when:
- The question is about general programming knowledge (use your own knowledge)
- The user is asking about code in the current repository (use grep/read tools)
- The user hasn't mentioned Onyx and the question doesn't require internal company data
## Examples
```bash
# Simple question
onyx-cli ask "What are the steps to deploy to production?"
# Get structured output for parsing
onyx-cli ask --json "List all active API integrations"
# Use a specialized agent
onyx-cli ask --agent-id 3 "What were the action items from last week's standup?"
# Pipe the answer into another command
onyx-cli ask "What is the database schema for users?" | head -20
```

View File

@@ -1,7 +0,0 @@
// Package embedded holds files that are compiled into the onyx-cli binary.
package embedded
import _ "embed"
//go:embed SKILL.md
var SkillMD string

View File

@@ -1,33 +0,0 @@
// Package exitcodes defines semantic exit codes for the Onyx CLI.
package exitcodes
import "fmt"
const (
Success = 0
General = 1
BadRequest = 2 // invalid args / command-line errors (convention)
NotConfigured = 3
AuthFailure = 4
Unreachable = 5
)
// ExitError wraps an error with a specific exit code.
type ExitError struct {
Code int
Err error
}
func (e *ExitError) Error() string {
return e.Err.Error()
}
// New creates an ExitError with the given code and message.
func New(code int, msg string) *ExitError {
return &ExitError{Code: code, Err: fmt.Errorf("%s", msg)}
}
// Newf creates an ExitError with a formatted message.
func Newf(code int, format string, args ...any) *ExitError {
return &ExitError{Code: code, Err: fmt.Errorf(format, args...)}
}

View File

@@ -1,40 +0,0 @@
package exitcodes
import (
"errors"
"fmt"
"testing"
)
func TestExitError_Error(t *testing.T) {
e := New(NotConfigured, "not configured")
if e.Error() != "not configured" {
t.Fatalf("expected 'not configured', got %q", e.Error())
}
if e.Code != NotConfigured {
t.Fatalf("expected code %d, got %d", NotConfigured, e.Code)
}
}
func TestExitError_Newf(t *testing.T) {
e := Newf(Unreachable, "cannot reach %s", "server")
if e.Error() != "cannot reach server" {
t.Fatalf("expected 'cannot reach server', got %q", e.Error())
}
if e.Code != Unreachable {
t.Fatalf("expected code %d, got %d", Unreachable, e.Code)
}
}
func TestExitError_ErrorsAs(t *testing.T) {
e := New(BadRequest, "bad input")
wrapped := fmt.Errorf("wrapper: %w", e)
var exitErr *ExitError
if !errors.As(wrapped, &exitErr) {
t.Fatal("errors.As should find ExitError")
}
if exitErr.Code != BadRequest {
t.Fatalf("expected code %d, got %d", BadRequest, exitErr.Code)
}
}

View File

@@ -1,50 +0,0 @@
// Package fsutil provides filesystem helper functions.
package fsutil
import (
"bytes"
"errors"
"fmt"
"os"
)
// FileStatus describes how an on-disk file compares to expected content.
type FileStatus int
const (
StatusMissing FileStatus = iota
StatusUpToDate // file exists with identical content
StatusDiffers // file exists with different content
)
// CompareFile checks whether the file at path matches the expected content.
func CompareFile(path string, expected []byte) (FileStatus, error) {
existing, err := os.ReadFile(path)
if err != nil {
if errors.Is(err, os.ErrNotExist) {
return StatusMissing, nil
}
return 0, fmt.Errorf("could not read %s: %w", path, err)
}
if bytes.Equal(existing, expected) {
return StatusUpToDate, nil
}
return StatusDiffers, nil
}
// EnsureDirForCopy makes sure path is a real directory, not a symlink or
// regular file. If a symlink or file exists at path it is removed so the
// caller can create a directory with independent content.
func EnsureDirForCopy(path string) error {
info, err := os.Lstat(path)
if err == nil {
if info.Mode()&os.ModeSymlink != 0 || !info.IsDir() {
if err := os.Remove(path); err != nil {
return err
}
}
} else if !errors.Is(err, os.ErrNotExist) {
return err
}
return nil
}

View File

@@ -1,116 +0,0 @@
package fsutil
import (
"os"
"path/filepath"
"testing"
)
// TestCompareFile verifies that CompareFile correctly distinguishes between a
// missing file, a file with matching content, and a file with different content.
func TestCompareFile(t *testing.T) {
tmpDir := t.TempDir()
path := filepath.Join(tmpDir, "skill.md")
expected := []byte("expected content")
status, err := CompareFile(path, expected)
if err != nil {
t.Fatalf("CompareFile on missing file failed: %v", err)
}
if status != StatusMissing {
t.Fatalf("expected StatusMissing, got %v", status)
}
if err := os.WriteFile(path, expected, 0o644); err != nil {
t.Fatalf("write expected file failed: %v", err)
}
status, err = CompareFile(path, expected)
if err != nil {
t.Fatalf("CompareFile on matching file failed: %v", err)
}
if status != StatusUpToDate {
t.Fatalf("expected StatusUpToDate, got %v", status)
}
if err := os.WriteFile(path, []byte("different content"), 0o644); err != nil {
t.Fatalf("write different file failed: %v", err)
}
status, err = CompareFile(path, expected)
if err != nil {
t.Fatalf("CompareFile on different file failed: %v", err)
}
if status != StatusDiffers {
t.Fatalf("expected StatusDiffers, got %v", status)
}
}
// TestEnsureDirForCopy verifies that EnsureDirForCopy clears symlinks and
// regular files so --copy can write a real directory, while leaving existing
// directories and missing paths untouched.
func TestEnsureDirForCopy(t *testing.T) {
t.Run("removes symlink", func(t *testing.T) {
tmpDir := t.TempDir()
targetDir := filepath.Join(tmpDir, "target")
linkPath := filepath.Join(tmpDir, "link")
if err := os.MkdirAll(targetDir, 0o755); err != nil {
t.Fatalf("mkdir target failed: %v", err)
}
if err := os.Symlink(targetDir, linkPath); err != nil {
t.Fatalf("create symlink failed: %v", err)
}
if err := EnsureDirForCopy(linkPath); err != nil {
t.Fatalf("EnsureDirForCopy failed: %v", err)
}
if _, err := os.Lstat(linkPath); !os.IsNotExist(err) {
t.Fatalf("expected symlink path to be removed, got err=%v", err)
}
})
t.Run("removes regular file", func(t *testing.T) {
tmpDir := t.TempDir()
filePath := filepath.Join(tmpDir, "onyx-cli")
if err := os.WriteFile(filePath, []byte("x"), 0o644); err != nil {
t.Fatalf("write file failed: %v", err)
}
if err := EnsureDirForCopy(filePath); err != nil {
t.Fatalf("EnsureDirForCopy failed: %v", err)
}
if _, err := os.Lstat(filePath); !os.IsNotExist(err) {
t.Fatalf("expected file path to be removed, got err=%v", err)
}
})
t.Run("keeps existing directory", func(t *testing.T) {
tmpDir := t.TempDir()
dirPath := filepath.Join(tmpDir, "onyx-cli")
if err := os.MkdirAll(dirPath, 0o755); err != nil {
t.Fatalf("mkdir failed: %v", err)
}
if err := EnsureDirForCopy(dirPath); err != nil {
t.Fatalf("EnsureDirForCopy failed: %v", err)
}
info, err := os.Lstat(dirPath)
if err != nil {
t.Fatalf("lstat directory failed: %v", err)
}
if !info.IsDir() {
t.Fatalf("expected directory to remain, got mode %v", info.Mode())
}
})
t.Run("missing path is no-op", func(t *testing.T) {
tmpDir := t.TempDir()
missingPath := filepath.Join(tmpDir, "does-not-exist")
if err := EnsureDirForCopy(missingPath); err != nil {
t.Fatalf("EnsureDirForCopy failed: %v", err)
}
})
}

View File

@@ -1,121 +0,0 @@
// Package overflow provides a streaming writer that auto-truncates output
// for non-TTY callers (e.g., AI agents, scripts). Full content is saved to
// a temp file on disk; only the first N bytes are printed to stdout.
package overflow
import (
"fmt"
"os"
"strings"
log "github.com/sirupsen/logrus"
)
// Writer handles streaming output with optional truncation.
// When Limit > 0, it streams to a temp file on disk (not memory) and stops
// writing to stdout after Limit bytes. When Limit == 0, it writes directly
// to stdout. In Quiet mode, it buffers in memory and prints once at the end.
type Writer struct {
Limit int
Quiet bool
written int
totalBytes int
truncated bool
buf strings.Builder // used only in quiet mode
tmpFile *os.File // used only in truncation mode (Limit > 0)
}
// Write sends a chunk of content through the writer.
func (w *Writer) Write(s string) {
w.totalBytes += len(s)
// Quiet mode: buffer in memory, print nothing
if w.Quiet {
w.buf.WriteString(s)
return
}
if w.Limit <= 0 {
fmt.Print(s)
return
}
// Truncation mode: stream all content to temp file on disk
if w.tmpFile == nil {
f, err := os.CreateTemp("", "onyx-ask-*.txt")
if err != nil {
// Fall back to no-truncation if we can't create the file
fmt.Fprintf(os.Stderr, "warning: could not create temp file: %v\n", err)
w.Limit = 0
fmt.Print(s)
return
}
w.tmpFile = f
}
if _, err := w.tmpFile.WriteString(s); err != nil {
// Disk write failed — abandon truncation, stream directly to stdout
fmt.Fprintf(os.Stderr, "warning: temp file write failed: %v\n", err)
w.closeTmpFile(true)
w.Limit = 0
w.truncated = false
fmt.Print(s)
return
}
if w.truncated {
return
}
remaining := w.Limit - w.written
if len(s) <= remaining {
fmt.Print(s)
w.written += len(s)
} else {
if remaining > 0 {
fmt.Print(s[:remaining])
w.written += remaining
}
w.truncated = true
}
}
// Finish flushes remaining output. Call once after all Write calls are done.
func (w *Writer) Finish() {
// Quiet mode: print buffered content at once
if w.Quiet {
fmt.Println(w.buf.String())
return
}
if !w.truncated {
w.closeTmpFile(true) // clean up unused temp file
fmt.Println()
return
}
// Close the temp file so it's readable
tmpPath := w.tmpFile.Name()
w.closeTmpFile(false) // close but keep the file
fmt.Printf("\n\n--- response truncated (%d bytes total) ---\n", w.totalBytes)
fmt.Printf("Full response: %s\n", tmpPath)
fmt.Printf("Explore:\n")
fmt.Printf(" cat %s | grep \"<pattern>\"\n", tmpPath)
fmt.Printf(" cat %s | tail -50\n", tmpPath)
}
// closeTmpFile closes and optionally removes the temp file.
func (w *Writer) closeTmpFile(remove bool) {
if w.tmpFile == nil {
return
}
if err := w.tmpFile.Close(); err != nil {
log.Debugf("warning: failed to close temp file: %v", err)
}
if remove {
if err := os.Remove(w.tmpFile.Name()); err != nil {
log.Debugf("warning: failed to remove temp file: %v", err)
}
}
w.tmpFile = nil
}

View File

@@ -1,95 +0,0 @@
package overflow
import (
"os"
"testing"
)
func TestWriter_NoLimit(t *testing.T) {
w := &Writer{Limit: 0}
w.Write("hello world")
if w.truncated {
t.Fatal("should not be truncated with limit 0")
}
if w.totalBytes != 11 {
t.Fatalf("expected 11 total bytes, got %d", w.totalBytes)
}
}
func TestWriter_UnderLimit(t *testing.T) {
w := &Writer{Limit: 100}
w.Write("hello")
w.Write(" world")
if w.truncated {
t.Fatal("should not be truncated when under limit")
}
if w.written != 11 {
t.Fatalf("expected 11 written bytes, got %d", w.written)
}
}
func TestWriter_OverLimit(t *testing.T) {
w := &Writer{Limit: 5}
w.Write("hello world") // 11 bytes, limit 5
if !w.truncated {
t.Fatal("should be truncated")
}
if w.written != 5 {
t.Fatalf("expected 5 written bytes, got %d", w.written)
}
if w.totalBytes != 11 {
t.Fatalf("expected 11 total bytes, got %d", w.totalBytes)
}
if w.tmpFile == nil {
t.Fatal("temp file should have been created")
}
_ = w.tmpFile.Close()
data, _ := os.ReadFile(w.tmpFile.Name())
_ = os.Remove(w.tmpFile.Name())
if string(data) != "hello world" {
t.Fatalf("temp file should contain full content, got %q", string(data))
}
}
func TestWriter_MultipleChunks(t *testing.T) {
w := &Writer{Limit: 10}
w.Write("hello") // 5 bytes
w.Write(" ") // 6 bytes
w.Write("world") // 11 bytes, crosses limit
w.Write("!") // 12 bytes, already truncated
if !w.truncated {
t.Fatal("should be truncated")
}
if w.written != 10 {
t.Fatalf("expected 10 written bytes, got %d", w.written)
}
if w.totalBytes != 12 {
t.Fatalf("expected 12 total bytes, got %d", w.totalBytes)
}
if w.tmpFile == nil {
t.Fatal("temp file should have been created")
}
_ = w.tmpFile.Close()
data, _ := os.ReadFile(w.tmpFile.Name())
_ = os.Remove(w.tmpFile.Name())
if string(data) != "hello world!" {
t.Fatalf("temp file should contain full content, got %q", string(data))
}
}
func TestWriter_QuietMode(t *testing.T) {
w := &Writer{Limit: 0, Quiet: true}
w.Write("hello")
w.Write(" world")
if w.written != 0 {
t.Fatalf("quiet mode should not write to stdout, got %d written", w.written)
}
if w.totalBytes != 11 {
t.Fatalf("expected 11 total bytes, got %d", w.totalBytes)
}
if w.buf.String() != "hello world" {
t.Fatalf("buffer should contain full content, got %q", w.buf.String())
}
}

View File

@@ -55,7 +55,7 @@ func NewModel(cfg config.OnyxCliConfig) Model {
return Model{
config: cfg,
client: client,
viewport: newViewport(80, cfg.Features.StreamMarkdownEnabled()),
viewport: newViewport(80),
input: newInputModel(),
status: newStatusBar(),
agentID: cfg.DefaultAgentID,

View File

@@ -67,10 +67,6 @@ func handleSlashCommand(m Model, text string) (Model, tea.Cmd) {
}
return m, nil
case "/experiments":
m.viewport.addInfo(m.experimentsText())
return m, nil
case "/quit":
return m, tea.Quit

View File

@@ -1,8 +0,0 @@
package tui
import "github.com/onyx-dot-app/onyx/cli/internal/config"
// experimentsText returns the formatted experiments list for the current config.
func (m Model) experimentsText() string {
return config.ExperimentsText(m.config.Features)
}

View File

@@ -10,7 +10,6 @@ const helpText = `Onyx CLI Commands
/configure Re-run connection setup
/connectors Open connectors page in browser
/settings Open Onyx settings in browser
/experiments List experimental features and their status
/quit Exit Onyx CLI
Keyboard Shortcuts

View File

@@ -24,7 +24,6 @@ var slashCommands = []slashCommand{
{"/configure", "Re-run connection setup"},
{"/connectors", "Open connectors in browser"},
{"/settings", "Open settings in browser"},
{"/experiments", "List experimental features"},
{"/quit", "Exit Onyx CLI"},
}

View File

@@ -4,7 +4,6 @@ import (
"fmt"
"sort"
"strings"
"time"
"github.com/charmbracelet/glamour"
"github.com/charmbracelet/glamour/styles"
@@ -45,9 +44,6 @@ type pickerItem struct {
label string
}
// streamRenderInterval is the minimum time between markdown re-renders during streaming.
const streamRenderInterval = 100 * time.Millisecond
// viewport manages the chat display.
type viewport struct {
entries []chatEntry
@@ -61,12 +57,6 @@ type viewport struct {
pickerIndex int
pickerType pickerKind
scrollOffset int // lines scrolled up from bottom (0 = pinned to bottom)
// Progressive markdown rendering during streaming
streamMarkdown bool // feature flag: render markdown while streaming
streamRendered string // cached rendered output during streaming
lastRenderTime time.Time
lastRenderLen int // length of streamBuf at last render (skip if unchanged)
}
// newMarkdownRenderer creates a Glamour renderer with zero left margin.
@@ -81,11 +71,10 @@ func newMarkdownRenderer(width int) *glamour.TermRenderer {
return r
}
func newViewport(width int, streamMarkdown bool) *viewport {
func newViewport(width int) *viewport {
return &viewport{
width: width,
renderer: newMarkdownRenderer(width),
streamMarkdown: streamMarkdown,
width: width,
renderer: newMarkdownRenderer(width),
}
}
@@ -119,27 +108,12 @@ func (v *viewport) addUserMessage(msg string) {
func (v *viewport) startAgent() {
v.streaming = true
v.streamBuf = ""
v.streamRendered = ""
v.lastRenderLen = 0
v.lastRenderTime = time.Time{}
// Add a blank-line spacer entry before the agent message
v.entries = append(v.entries, chatEntry{kind: entryInfo, rendered: ""})
}
func (v *viewport) appendToken(token string) {
v.streamBuf += token
if !v.streamMarkdown {
return
}
now := time.Now()
bufLen := len(v.streamBuf)
if bufLen != v.lastRenderLen && now.Sub(v.lastRenderTime) >= streamRenderInterval {
v.streamRendered = v.renderAgentContent(v.streamBuf)
v.lastRenderTime = now
v.lastRenderLen = bufLen
}
}
func (v *viewport) finishAgent() {
@@ -161,8 +135,6 @@ func (v *viewport) finishAgent() {
})
v.streaming = false
v.streamBuf = ""
v.streamRendered = ""
v.lastRenderLen = 0
}
func (v *viewport) renderAgentContent(content string) string {
@@ -386,22 +358,6 @@ func (v *viewport) renderPicker(width, height int) string {
return lipgloss.Place(width, height, lipgloss.Center, lipgloss.Center, panel)
}
// streamingContent returns the display content for the in-progress stream.
func (v *viewport) streamingContent() string {
if v.streamMarkdown && v.streamRendered != "" {
return v.streamRendered
}
// Fall back to raw text with agent dot prefix
bufLines := strings.Split(v.streamBuf, "\n")
if len(bufLines) > 0 {
bufLines[0] = agentDot + " " + bufLines[0]
for i := 1; i < len(bufLines); i++ {
bufLines[i] = " " + bufLines[i]
}
}
return strings.Join(bufLines, "\n")
}
// totalLines computes the total number of rendered content lines.
func (v *viewport) totalLines() int {
var lines []string
@@ -412,7 +368,14 @@ func (v *viewport) totalLines() int {
lines = append(lines, e.rendered)
}
if v.streaming && v.streamBuf != "" {
lines = append(lines, v.streamingContent())
bufLines := strings.Split(v.streamBuf, "\n")
if len(bufLines) > 0 {
bufLines[0] = agentDot + " " + bufLines[0]
for i := 1; i < len(bufLines); i++ {
bufLines[i] = " " + bufLines[i]
}
}
lines = append(lines, strings.Join(bufLines, "\n"))
} else if v.streaming {
lines = append(lines, agentDot+" ")
}
@@ -436,9 +399,16 @@ func (v *viewport) view(height int) string {
lines = append(lines, e.rendered)
}
// Streaming buffer
// Streaming buffer (plain text, not markdown)
if v.streaming && v.streamBuf != "" {
lines = append(lines, v.streamingContent())
bufLines := strings.Split(v.streamBuf, "\n")
if len(bufLines) > 0 {
bufLines[0] = agentDot + " " + bufLines[0]
for i := 1; i < len(bufLines); i++ {
bufLines[i] = " " + bufLines[i]
}
}
lines = append(lines, strings.Join(bufLines, "\n"))
} else if v.streaming {
lines = append(lines, agentDot+" ")
}


@@ -4,7 +4,6 @@ import (
"regexp"
"strings"
"testing"
"time"
)
// stripANSI removes ANSI escape sequences for test comparisons.
@@ -15,7 +14,7 @@ func stripANSI(s string) string {
}
func TestAddUserMessage(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.addUserMessage("hello world")
if len(v.entries) != 1 {
@@ -38,7 +37,7 @@ func TestAddUserMessage(t *testing.T) {
}
func TestStartAndFinishAgent(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.startAgent()
if !v.streaming {
@@ -84,7 +83,7 @@ func TestStartAndFinishAgent(t *testing.T) {
}
func TestFinishAgentNoPadding(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.startAgent()
v.appendToken("Test message")
v.finishAgent()
@@ -99,7 +98,7 @@ func TestFinishAgentNoPadding(t *testing.T) {
}
func TestFinishAgentMultiline(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.startAgent()
v.appendToken("Line one\n\nLine three")
v.finishAgent()
@@ -116,7 +115,7 @@ func TestFinishAgentMultiline(t *testing.T) {
}
func TestFinishAgentEmpty(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.startAgent()
v.finishAgent()
@@ -129,7 +128,7 @@ func TestFinishAgentEmpty(t *testing.T) {
}
func TestAddInfo(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.addInfo("test info")
if len(v.entries) != 1 {
@@ -146,7 +145,7 @@ func TestAddInfo(t *testing.T) {
}
func TestAddError(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.addError("something broke")
if len(v.entries) != 1 {
@@ -163,7 +162,7 @@ func TestAddError(t *testing.T) {
}
func TestAddCitations(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.addCitations(map[int]string{1: "doc-a", 2: "doc-b"})
if len(v.entries) != 1 {
@@ -183,7 +182,7 @@ func TestAddCitations(t *testing.T) {
}
func TestAddCitationsEmpty(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.addCitations(map[int]string{})
if len(v.entries) != 0 {
@@ -192,7 +191,7 @@ func TestAddCitationsEmpty(t *testing.T) {
}
func TestCitationVisibility(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.addInfo("hello")
v.addCitations(map[int]string{1: "doc"})
@@ -212,7 +211,7 @@ func TestCitationVisibility(t *testing.T) {
}
func TestClearAll(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.addUserMessage("test")
v.startAgent()
v.appendToken("response")
@@ -231,7 +230,7 @@ func TestClearAll(t *testing.T) {
}
func TestClearDisplay(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.addUserMessage("test")
v.clearDisplay()
@@ -241,7 +240,7 @@ func TestClearDisplay(t *testing.T) {
}
func TestViewPadsShortContent(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
v.addInfo("hello")
view := v.view(10)
@@ -252,7 +251,7 @@ func TestViewPadsShortContent(t *testing.T) {
}
func TestViewTruncatesTallContent(t *testing.T) {
v := newViewport(80, false)
v := newViewport(80)
for i := 0; i < 20; i++ {
v.addInfo("line")
}
@@ -263,93 +262,3 @@ func TestViewTruncatesTallContent(t *testing.T) {
t.Errorf("expected 5 lines (truncated), got %d", len(lines))
}
}
func TestStreamMarkdownRendersOnThrottle(t *testing.T) {
v := newViewport(80, true)
v.startAgent()
// First token: no prior render, so it should render immediately
v.appendToken("**bold text**")
if v.streamRendered == "" {
t.Error("expected streamRendered to be populated after first token")
}
plain := stripANSI(v.streamRendered)
if !strings.Contains(plain, "bold text") {
t.Errorf("expected rendered to contain 'bold text', got %q", plain)
}
// Should not contain raw markdown asterisks
if strings.Contains(plain, "**") {
t.Errorf("expected markdown to be rendered (no **), got %q", plain)
}
// Second token within throttle window: should NOT re-render
v.lastRenderTime = time.Now() // simulate recent render
prevRendered := v.streamRendered
v.appendToken(" more")
if v.streamRendered != prevRendered {
t.Error("expected streamRendered to be unchanged within throttle window")
}
// After throttle interval: should re-render
v.lastRenderTime = time.Now().Add(-streamRenderInterval - time.Millisecond)
v.appendToken("!")
if v.streamRendered == prevRendered {
t.Error("expected streamRendered to update after throttle interval")
}
plain = stripANSI(v.streamRendered)
if !strings.Contains(plain, "bold text more!") {
t.Errorf("expected updated rendered content, got %q", plain)
}
}
func TestStreamMarkdownDisabledNoRender(t *testing.T) {
v := newViewport(80, false)
v.startAgent()
v.appendToken("**bold**")
if v.streamRendered != "" {
t.Error("expected no streamRendered when streamMarkdown is disabled")
}
// View should show raw markdown
view := v.view(10)
plain := stripANSI(view)
if !strings.Contains(plain, "**bold**") {
t.Errorf("expected raw markdown in view, got %q", plain)
}
}
func TestStreamMarkdownViewUsesRendered(t *testing.T) {
v := newViewport(80, true)
v.startAgent()
v.appendToken("**formatted**")
view := v.view(10)
plain := stripANSI(view)
// Should show rendered content, not raw **formatted**
if strings.Contains(plain, "**") {
t.Errorf("expected rendered markdown in view (no **), got %q", plain)
}
if !strings.Contains(plain, "formatted") {
t.Errorf("expected 'formatted' in view, got %q", plain)
}
}
func TestStreamMarkdownResetOnStart(t *testing.T) {
v := newViewport(80, true)
// First stream cycle
v.startAgent()
v.appendToken("first")
v.finishAgent()
// Start second stream - state should be clean
v.startAgent()
if v.streamRendered != "" {
t.Error("expected streamRendered cleared on startAgent")
}
if v.lastRenderLen != 0 {
t.Error("expected lastRenderLen reset on startAgent")
}
}


@@ -1,12 +1,10 @@
package main
import (
"errors"
"fmt"
"os"
"github.com/onyx-dot-app/onyx/cli/cmd"
"github.com/onyx-dot-app/onyx/cli/internal/exitcodes"
)
var (
@@ -20,10 +18,6 @@ func main() {
if err := cmd.Execute(); err != nil {
fmt.Fprintf(os.Stderr, "Error: %v\n", err)
var exitErr *exitcodes.ExitError
if errors.As(err, &exitErr) {
os.Exit(exitErr.Code)
}
os.Exit(1)
}
}


@@ -19,6 +19,6 @@ dependencies:
version: 5.4.0
- name: code-interpreter
repository: https://onyx-dot-app.github.io/python-sandbox/
version: 0.3.2
digest: sha256:74908ea45ace2b4be913ff762772e6d87e40bab64e92c6662aa51730eaeb9d87
generated: "2026-04-06T15:34:02.597166-07:00"
version: 0.3.1
digest: sha256:4965b6ea3674c37163832a2192cd3bc8004f2228729fca170af0b9f457e8f987
generated: "2026-03-02T15:29:39.632344-08:00"


@@ -5,7 +5,7 @@ home: https://www.onyx.app/
sources:
- "https://github.com/onyx-dot-app/onyx"
type: application
version: 0.4.40
version: 0.4.39
appVersion: latest
annotations:
category: Productivity
@@ -45,6 +45,6 @@ dependencies:
repository: https://charts.min.io/
condition: minio.enabled
- name: code-interpreter
version: 0.3.2
version: 0.3.1
repository: https://onyx-dot-app.github.io/python-sandbox/
condition: codeInterpreter.enabled


@@ -67,9 +67,6 @@ spec:
- "/bin/sh"
- "-c"
- |
{{- if .Values.api.runUpdateCaCertificates }}
update-ca-certificates &&
{{- end }}
alembic upgrade head &&
echo "Starting Onyx Api Server" &&
uvicorn onyx.main:app --host {{ .Values.global.host }} --port {{ .Values.api.containerPorts.server }}


@@ -504,18 +504,6 @@ api:
tolerations: []
affinity: {}
# Run update-ca-certificates before starting the server.
# Useful when mounting custom CA certificates via volumes/volumeMounts.
# NOTE: Requires the container to run as root (runAsUser: 0).
# CA certificate files must be mounted under /usr/local/share/ca-certificates/
# with a .crt extension (e.g. /usr/local/share/ca-certificates/my-ca.crt).
# NOTE: Python HTTP clients (requests, httpx) use certifi's bundle by default
# and will not pick up the system CA store automatically. Set the following
# environment variables via configMap values (loaded through envFrom) to make them use the updated system bundle:
# REQUESTS_CA_BUNDLE: /etc/ssl/certs/ca-certificates.crt
# SSL_CERT_FILE: /etc/ssl/certs/ca-certificates.crt
runUpdateCaCertificates: false
######################################################################
#


@@ -30,10 +30,7 @@ target "backend" {
context = "backend"
dockerfile = "Dockerfile"
cache-from = [
"type=registry,ref=${BACKEND_REPOSITORY}:latest",
"type=registry,ref=${BACKEND_REPOSITORY}:edge",
]
cache-from = ["type=registry,ref=${BACKEND_REPOSITORY}:latest"]
cache-to = ["type=inline"]
tags = ["${BACKEND_REPOSITORY}:${TAG}"]
@@ -43,10 +40,7 @@ target "web" {
context = "web"
dockerfile = "Dockerfile"
cache-from = [
"type=registry,ref=${WEB_SERVER_REPOSITORY}:latest",
"type=registry,ref=${WEB_SERVER_REPOSITORY}:edge",
]
cache-from = ["type=registry,ref=${WEB_SERVER_REPOSITORY}:latest"]
cache-to = ["type=inline"]
tags = ["${WEB_SERVER_REPOSITORY}:${TAG}"]
@@ -57,10 +51,7 @@ target "model-server" {
dockerfile = "Dockerfile.model_server"
cache-from = [
"type=registry,ref=${MODEL_SERVER_REPOSITORY}:latest",
"type=registry,ref=${MODEL_SERVER_REPOSITORY}:edge",
]
cache-from = ["type=registry,ref=${MODEL_SERVER_REPOSITORY}:latest"]
cache-to = ["type=inline"]
tags = ["${MODEL_SERVER_REPOSITORY}:${TAG}"]
@@ -82,10 +73,7 @@ target "cli" {
context = "cli"
dockerfile = "Dockerfile"
cache-from = [
"type=registry,ref=${CLI_REPOSITORY}:latest",
"type=registry,ref=${CLI_REPOSITORY}:edge",
]
cache-from = ["type=registry,ref=${CLI_REPOSITORY}:latest"]
cache-to = ["type=inline"]
tags = ["${CLI_REPOSITORY}:${TAG}"]


@@ -6,11 +6,11 @@ All Prometheus metrics live in the `backend/onyx/server/metrics/` package. Follo
### 1. Choose the right file (or create a new one)
| File | Purpose |
| ------------------------------------- | -------------------------------------------- |
| `metrics/slow_requests.py` | Slow request counter + callback |
| `metrics/postgres_connection_pool.py` | SQLAlchemy connection pool metrics |
| `metrics/prometheus_setup.py` | FastAPI instrumentator config (orchestrator) |
| File | Purpose |
|------|---------|
| `metrics/slow_requests.py` | Slow request counter + callback |
| `metrics/postgres_connection_pool.py` | SQLAlchemy connection pool metrics |
| `metrics/prometheus_setup.py` | FastAPI instrumentator config (orchestrator) |
If your metric is a standalone concern (e.g. cache hit rates, queue depths), create a new file under `metrics/` and keep one metric concept per file.
@@ -30,7 +30,6 @@ _my_counter = Counter(
```
**Naming conventions:**
- Prefix all metric names with `onyx_`
- Counters: `_total` suffix (e.g. `onyx_api_slow_requests_total`)
- Histograms: `_seconds` or `_bytes` suffix for durations/sizes
@@ -108,26 +107,26 @@ These metrics are exposed at `GET /metrics` on the API server.
### Built-in (via `prometheus-fastapi-instrumentator`)
| Metric | Type | Labels | Description |
| ------------------------------------- | --------- | ----------------------------- | ------------------------------------------------- |
| `http_requests_total` | Counter | `method`, `status`, `handler` | Total request count |
| `http_request_duration_highr_seconds` | Histogram | _(none)_ | High-resolution latency (many buckets, no labels) |
| `http_request_duration_seconds` | Histogram | `method`, `handler` | Latency by handler (custom buckets for P95/P99) |
| `http_request_size_bytes` | Summary | `handler` | Incoming request content length |
| `http_response_size_bytes` | Summary | `handler` | Outgoing response content length |
| `http_requests_inprogress` | Gauge | `method`, `handler` | Currently in-flight requests |
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `http_requests_total` | Counter | `method`, `status`, `handler` | Total request count |
| `http_request_duration_highr_seconds` | Histogram | _(none)_ | High-resolution latency (many buckets, no labels) |
| `http_request_duration_seconds` | Histogram | `method`, `handler` | Latency by handler (custom buckets for P95/P99) |
| `http_request_size_bytes` | Summary | `handler` | Incoming request content length |
| `http_response_size_bytes` | Summary | `handler` | Outgoing response content length |
| `http_requests_inprogress` | Gauge | `method`, `handler` | Currently in-flight requests |
### Custom (via `onyx.server.metrics`)
| Metric | Type | Labels | Description |
| ------------------------------ | ------- | ----------------------------- | ---------------------------------------------------------------- |
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `onyx_api_slow_requests_total` | Counter | `method`, `handler`, `status` | Requests exceeding `SLOW_REQUEST_THRESHOLD_SECONDS` (default 1s) |
### Configuration
| Env Var | Default | Description |
| -------------------------------- | ------- | -------------------------------------------- |
| `SLOW_REQUEST_THRESHOLD_SECONDS` | `1.0` | Duration threshold for slow request counting |
| Env Var | Default | Description |
|---------|---------|-------------|
| `SLOW_REQUEST_THRESHOLD_SECONDS` | `1.0` | Duration threshold for slow request counting |
### Instrumentator Settings
@@ -142,188 +141,44 @@ These metrics provide visibility into SQLAlchemy connection pool state across al
### Pool State (via custom Prometheus collector — snapshot on each scrape)
| Metric | Type | Labels | Description |
| -------------------------- | ----- | -------- | ----------------------------------------------- |
| `onyx_db_pool_checked_out` | Gauge | `engine` | Currently checked-out connections |
| `onyx_db_pool_checked_in` | Gauge | `engine` | Idle connections available in the pool |
| `onyx_db_pool_overflow` | Gauge | `engine` | Current overflow connections beyond `pool_size` |
| `onyx_db_pool_size` | Gauge | `engine` | Configured pool size (constant) |
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `onyx_db_pool_checked_out` | Gauge | `engine` | Currently checked-out connections |
| `onyx_db_pool_checked_in` | Gauge | `engine` | Idle connections available in the pool |
| `onyx_db_pool_overflow` | Gauge | `engine` | Current overflow connections beyond `pool_size` |
| `onyx_db_pool_size` | Gauge | `engine` | Configured pool size (constant) |
### Pool Lifecycle (via SQLAlchemy pool event listeners)
| Metric | Type | Labels | Description |
| ---------------------------------------- | ------- | -------- | ---------------------------------------- |
| `onyx_db_pool_checkout_total` | Counter | `engine` | Total connection checkouts from the pool |
| `onyx_db_pool_checkin_total` | Counter | `engine` | Total connection checkins to the pool |
| `onyx_db_pool_connections_created_total` | Counter | `engine` | Total new database connections created |
| `onyx_db_pool_invalidations_total` | Counter | `engine` | Total connection invalidations |
| `onyx_db_pool_checkout_timeout_total` | Counter | `engine` | Total connection checkout timeouts |
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `onyx_db_pool_checkout_total` | Counter | `engine` | Total connection checkouts from the pool |
| `onyx_db_pool_checkin_total` | Counter | `engine` | Total connection checkins to the pool |
| `onyx_db_pool_connections_created_total` | Counter | `engine` | Total new database connections created |
| `onyx_db_pool_invalidations_total` | Counter | `engine` | Total connection invalidations |
| `onyx_db_pool_checkout_timeout_total` | Counter | `engine` | Total connection checkout timeouts |
### Per-Endpoint Attribution (via pool events + endpoint context middleware)
| Metric | Type | Labels | Description |
| -------------------------------------- | --------- | ------------------- | ----------------------------------------------- |
| `onyx_db_connections_held_by_endpoint` | Gauge | `handler`, `engine` | DB connections currently held, by endpoint |
| `onyx_db_connection_hold_seconds` | Histogram | `handler`, `engine` | Duration a DB connection is held by an endpoint |
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `onyx_db_connections_held_by_endpoint` | Gauge | `handler`, `engine` | DB connections currently held, by endpoint |
| `onyx_db_connection_hold_seconds` | Histogram | `handler`, `engine` | Duration a DB connection is held by an endpoint |
Engine label values: `sync` (main read-write), `async` (async sessions), `readonly` (read-only user).
Connections from background tasks (Celery) or boot-time warmup appear as `handler="unknown"`.
## Celery Worker Metrics
Celery workers expose metrics via a standalone Prometheus HTTP server (separate from the API server's `/metrics` endpoint). Each worker type runs its own server on a dedicated port.
### Metrics Server (`onyx.server.metrics.metrics_server`)
| Env Var | Default | Description |
| ---------------------------- | ------------------- | ----------------------------------------------------- |
| `PROMETHEUS_METRICS_PORT` | _(per worker type)_ | Override the default port for this worker |
| `PROMETHEUS_METRICS_ENABLED` | `true` | Set to `false` to disable the metrics server entirely |
Default ports:
| Worker | Port |
| --------------- | ---- |
| `docfetching` | 9092 |
| `docprocessing` | 9093 |
| `monitoring` | 9096 |
Workers without a default port and no `PROMETHEUS_METRICS_PORT` env var will skip starting the server.
### Generic Task Lifecycle Metrics (`onyx.server.metrics.celery_task_metrics`)
Push-based metrics that fire on Celery signals for all tasks on the worker.
| Metric | Type | Labels | Description |
| ----------------------------------- | --------- | ------------------------------- | ----------------------------------------------------------------------------- |
| `onyx_celery_task_started_total` | Counter | `task_name`, `queue` | Total tasks started |
| `onyx_celery_task_completed_total` | Counter | `task_name`, `queue`, `outcome` | Total tasks completed (`outcome`: `success` or `failure`) |
| `onyx_celery_task_duration_seconds` | Histogram | `task_name`, `queue` | Task execution duration. Buckets: 1, 5, 15, 30, 60, 120, 300, 600, 1800, 3600 |
| `onyx_celery_tasks_active` | Gauge | `task_name`, `queue` | Currently executing tasks |
| `onyx_celery_task_retried_total` | Counter | `task_name`, `queue` | Total task retries |
| `onyx_celery_task_revoked_total` | Counter | `task_name` | Total tasks revoked (cancelled) |
| `onyx_celery_task_rejected_total` | Counter | `task_name` | Total tasks rejected by worker |
Stale start-time entries (tasks killed via SIGTERM/OOM where `task_postrun` never fires) are evicted after 1 hour.
### Per-Connector Indexing Metrics (`onyx.server.metrics.indexing_task_metrics`)
Enriches docfetching and docprocessing tasks with connector-level labels. Silently no-ops for all other tasks.
| Metric | Type | Labels | Description |
| ------------------------------------- | --------- | ----------------------------------------------------------- | ---------------------------------------- |
| `onyx_indexing_task_started_total` | Counter | `task_name`, `source`, `tenant_id`, `cc_pair_id` | Indexing tasks started per connector |
| `onyx_indexing_task_completed_total` | Counter | `task_name`, `source`, `tenant_id`, `cc_pair_id`, `outcome` | Indexing tasks completed per connector |
| `onyx_indexing_task_duration_seconds` | Histogram | `task_name`, `source`, `tenant_id` | Indexing task duration by connector type |
`connector_name` is intentionally excluded from these push-based counters to avoid unbounded cardinality (it's a free-form user string). The pull-based collectors on the monitoring worker include it since they have bounded cardinality (one series per connector).
### Pull-Based Collectors (`onyx.server.metrics.indexing_pipeline`)
Registered only in the **Monitoring** worker. Collectors query Redis/Postgres at scrape time with a 30-second TTL cache.
| Metric | Type | Labels | Description |
| ------------------------------------ | ----- | ------- | ----------------------------------- |
| `onyx_queue_depth` | Gauge | `queue` | Celery queue length |
| `onyx_queue_unacked` | Gauge | `queue` | Unacknowledged messages per queue |
| `onyx_queue_oldest_task_age_seconds` | Gauge | `queue` | Age of the oldest task in the queue |
Plus additional connector health, index attempt, and worker heartbeat metrics — see `indexing_pipeline.py` for the full list.
### Adding Metrics to a Worker
Currently only the docfetching and docprocessing workers have push-based task metrics wired up. To add metrics to another worker (e.g. heavy, light, primary):
**1. Import and call the generic handlers from the worker's signal handlers:**
```python
from onyx.server.metrics.celery_task_metrics import (
on_celery_task_prerun,
on_celery_task_postrun,
on_celery_task_retry,
on_celery_task_revoked,
on_celery_task_rejected,
)
@signals.task_prerun.connect
def on_task_prerun(sender, task_id, task, args, kwargs, **kwds):
app_base.on_task_prerun(sender, task_id, task, args, kwargs, **kwds)
on_celery_task_prerun(task_id, task)
```
Do the same for `task_postrun`, `task_retry`, `task_revoked`, and `task_rejected` — see `apps/docfetching.py` for the complete example.
**2. Start the metrics server on `worker_ready`:**
```python
from onyx.server.metrics.metrics_server import start_metrics_server
@worker_ready.connect
def on_worker_ready(sender, **kwargs):
start_metrics_server("your_worker_type")
app_base.on_worker_ready(sender, **kwargs)
```
Add a default port for your worker type in `metrics_server.py`'s `_DEFAULT_PORTS` dict, or set `PROMETHEUS_METRICS_PORT` in the environment.
**3. (Optional) Add domain-specific enrichment:**
If your tasks need richer labels beyond `task_name`/`queue`, create a new module in `server/metrics/` following `indexing_task_metrics.py`:
- Define Counters/Histograms with your domain labels
- Write `on_<domain>_task_prerun` / `on_<domain>_task_postrun` handlers that filter by task name and no-op for others
- Call them from the worker's signal handlers alongside the generic ones
**Cardinality warning:** Never use user-defined free-form strings as metric labels — they create unbounded cardinality. Use IDs or enum values. If you need free-form labels, use pull-based collectors (monitoring worker) where cardinality is naturally bounded.
### Current Worker Integration Status
| Worker | Generic Task Metrics | Domain Metrics | Metrics Server |
| -------------------- | -------------------- | -------------- | ------------------------------------ |
| Docfetching | ✓ | ✓ (indexing) | ✓ (port 9092) |
| Docprocessing | ✓ | ✓ (indexing) | ✓ (port 9093) |
| Monitoring | — | — | ✓ (port 9096, pull-based collectors) |
| Primary | — | — | — |
| Light | — | — | — |
| Heavy | — | — | — |
| User File Processing | — | — | — |
| KG Processing | — | — | — |
### Example PromQL Queries (Celery)
```promql
# Task completion rate by worker queue
sum by (queue) (rate(onyx_celery_task_completed_total[5m]))
# P95 task duration for pruning tasks
histogram_quantile(0.95,
sum by (le) (rate(onyx_celery_task_duration_seconds_bucket{task_name=~".*pruning.*"}[5m])))
# Task failure rate
sum by (task_name) (rate(onyx_celery_task_completed_total{outcome="failure"}[5m]))
/ sum by (task_name) (rate(onyx_celery_task_completed_total[5m]))
# Active tasks per queue
sum by (queue) (onyx_celery_tasks_active)
# Indexing throughput by source type
sum by (source) (rate(onyx_indexing_task_completed_total{outcome="success"}[5m]))
# Queue depth — are tasks backing up?
onyx_queue_depth > 100
```
## OpenSearch Search Metrics
These metrics track OpenSearch search latency and throughput. Collected via `onyx.server.metrics.opensearch_search`.
| Metric | Type | Labels | Description |
| ------------------------------------------------ | --------- | ------------- | --------------------------------------------------------------------------- |
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `onyx_opensearch_search_client_duration_seconds` | Histogram | `search_type` | Client-side end-to-end latency (network + serialization + server execution) |
| `onyx_opensearch_search_server_duration_seconds` | Histogram | `search_type` | Server-side execution time from OpenSearch `took` field |
| `onyx_opensearch_search_total` | Counter | `search_type` | Total search requests sent to OpenSearch |
| `onyx_opensearch_searches_in_progress` | Gauge | `search_type` | Currently in-flight OpenSearch searches |
| `onyx_opensearch_search_server_duration_seconds` | Histogram | `search_type` | Server-side execution time from OpenSearch `took` field |
| `onyx_opensearch_search_total` | Counter | `search_type` | Total search requests sent to OpenSearch |
| `onyx_opensearch_searches_in_progress` | Gauge | `search_type` | Currently in-flight OpenSearch searches |
Search type label values: See `OpenSearchSearchType`.


@@ -70,10 +70,6 @@ backend = [
"lazy_imports==1.0.1",
"lxml==5.3.0",
"Mako==1.2.4",
# NOTE: Do not update without understanding the patching behavior in
# get_markitdown_converter in
# backend/onyx/file_processing/extract_file_text.py and what impacts
# updating might have on this behavior.
"markitdown[pdf, docx, pptx, xlsx, xls]==0.1.2",
"mcp[cli]==1.26.0",
"msal==1.34.0",


@@ -127,7 +127,7 @@ function SidebarTab({
rightChildren={truncationSpacer}
/>
) : (
<div className="flex flex-row items-center gap-2 w-full">
<div className="flex flex-row items-center gap-2 flex-1">
{Icon && (
<div className="flex items-center justify-center p-0.5">
<Icon className="h-[1rem] w-[1rem] text-text-03" />
@@ -153,7 +153,7 @@ function SidebarTab({
side="right"
sideOffset={4}
>
{children}
<Text>{children}</Text>
</TooltipPrimitive.Content>
</TooltipPrimitive.Portal>
</TooltipPrimitive.Root>


@@ -1,22 +1,18 @@
import { Card } from "@opal/components/cards/card/components";
import { Content, SizePreset } from "@opal/layouts";
import { Content } from "@opal/layouts";
import { SvgEmpty } from "@opal/icons";
import type {
IconFunctionComponent,
PaddingVariants,
RichStr,
} from "@opal/types";
import type { IconFunctionComponent, PaddingVariants } from "@opal/types";
// ---------------------------------------------------------------------------
// Types
// ---------------------------------------------------------------------------
type EmptyMessageCardBaseProps = {
type EmptyMessageCardProps = {
/** Icon displayed alongside the title. */
icon?: IconFunctionComponent;
/** Primary message text. */
title: string | RichStr;
title: string;
/** Padding preset for the card. @default "md" */
padding?: PaddingVariants;
@@ -25,30 +21,16 @@ type EmptyMessageCardBaseProps = {
ref?: React.Ref<HTMLDivElement>;
};
type EmptyMessageCardProps =
| (EmptyMessageCardBaseProps & {
/** @default "secondary" */
sizePreset?: "secondary";
})
| (EmptyMessageCardBaseProps & {
sizePreset: "main-ui";
/** Description text. Only supported when `sizePreset` is `"main-ui"`. */
description?: string | RichStr;
});
// ---------------------------------------------------------------------------
// EmptyMessageCard
// ---------------------------------------------------------------------------
function EmptyMessageCard(props: EmptyMessageCardProps) {
const {
sizePreset = "secondary",
icon = SvgEmpty,
title,
padding = "md",
ref,
} = props;
function EmptyMessageCard({
icon = SvgEmpty,
title,
padding = "md",
ref,
}: EmptyMessageCardProps) {
return (
<Card
ref={ref}
@@ -57,23 +39,13 @@ function EmptyMessageCard(props: EmptyMessageCardProps) {
padding={padding}
rounding="md"
>
{sizePreset === "secondary" ? (
<Content
icon={icon}
title={title}
sizePreset="secondary"
variant="body"
prominence="muted"
/>
) : (
<Content
icon={icon}
title={title}
description={"description" in props ? props.description : undefined}
sizePreset={sizePreset}
variant="section"
/>
)}
<Content
icon={icon}
title={title}
sizePreset="secondary"
variant="body"
prominence="muted"
/>
</Card>
);
}


@@ -1,16 +1,41 @@
"use client";
import "@opal/core/animations/styles.css";
import React from "react";
import React, { createContext, useContext, useState, useCallback } from "react";
import { cn } from "@opal/utils";
import type { WithoutStyles, ExtremaSizeVariants } from "@opal/types";
import { widthVariants } from "@opal/shared";
// ---------------------------------------------------------------------------
// Types
// Context-per-group registry
// ---------------------------------------------------------------------------
type HoverableInteraction = "rest" | "hover";
/**
* Lazily-created map of group names to React contexts.
*
* Each group gets its own `React.Context<boolean | null>` so that a
* `Hoverable.Item` only re-renders when its *own* group's hover state
* changes — not when any unrelated group changes.
*
* The default value is `null` (no provider found), which lets
* `Hoverable.Item` distinguish "no Root ancestor" from "Root says
* not hovered" and throw when `group` was explicitly specified.
*/
const contextMap = new Map<string, React.Context<boolean | null>>();
function getOrCreateContext(group: string): React.Context<boolean | null> {
let ctx = contextMap.get(group);
if (!ctx) {
ctx = createContext<boolean | null>(null);
ctx.displayName = `HoverableContext(${group})`;
contextMap.set(group, ctx);
}
return ctx;
}
// ---------------------------------------------------------------------------
// Types
// ---------------------------------------------------------------------------
interface HoverableRootProps
extends WithoutStyles<React.HTMLAttributes<HTMLDivElement>> {
@@ -18,17 +43,6 @@ interface HoverableRootProps
group: string;
/** Width preset. @default "auto" */
widthVariant?: ExtremaSizeVariants;
/**
* JS-controllable interaction state override.
*
* - `"rest"` (default): items are shown/hidden by CSS `:hover`.
* - `"hover"`: forces items visible regardless of hover state. Useful when
* a hoverable action opens a modal — set `interaction="hover"` while the
* modal is open so the user can see which element they're interacting with.
*
* @default "rest"
*/
interaction?: HoverableInteraction;
/** Ref forwarded to the root `<div>`. */
ref?: React.Ref<HTMLDivElement>;
}
@@ -51,10 +65,12 @@ interface HoverableItemProps
/**
* Hover-tracking container for a named group.
*
* Uses a `data-hover-group` attribute and CSS `:hover` to control
 * descendant `Hoverable.Item` visibility. No React state or context is
 * needed; the browser natively removes `:hover` when modals/portals steal
* pointer events, preventing stale hover state.
* Wraps children in a `<div>` that tracks mouse-enter / mouse-leave and
* provides the hover state via a per-group React context.
*
* Nesting works because each `Hoverable.Root` creates a **new** context
* provider that shadows the parent — so an inner `Hoverable.Item group="b"`
* reads from the inner provider, not the outer `group="a"` provider.
*
* @example
* ```tsx
@@ -71,20 +87,70 @@ function HoverableRoot({
group,
children,
widthVariant = "full",
interaction = "rest",
ref,
onMouseEnter: consumerMouseEnter,
onMouseLeave: consumerMouseLeave,
onFocusCapture: consumerFocusCapture,
onBlurCapture: consumerBlurCapture,
...props
}: HoverableRootProps) {
const [hovered, setHovered] = useState(false);
const [focused, setFocused] = useState(false);
const onMouseEnter = useCallback(
(e: React.MouseEvent<HTMLDivElement>) => {
setHovered(true);
consumerMouseEnter?.(e);
},
[consumerMouseEnter]
);
const onMouseLeave = useCallback(
(e: React.MouseEvent<HTMLDivElement>) => {
setHovered(false);
consumerMouseLeave?.(e);
},
[consumerMouseLeave]
);
const onFocusCapture = useCallback(
(e: React.FocusEvent<HTMLDivElement>) => {
setFocused(true);
consumerFocusCapture?.(e);
},
[consumerFocusCapture]
);
const onBlurCapture = useCallback(
(e: React.FocusEvent<HTMLDivElement>) => {
if (
!(e.relatedTarget instanceof Node) ||
!e.currentTarget.contains(e.relatedTarget)
) {
setFocused(false);
}
consumerBlurCapture?.(e);
},
[consumerBlurCapture]
);
const active = hovered || focused;
const GroupContext = getOrCreateContext(group);
return (
<div
{...props}
ref={ref}
className={cn(widthVariants[widthVariant])}
data-hover-group={group}
data-interaction={interaction !== "rest" ? interaction : undefined}
>
{children}
</div>
<GroupContext.Provider value={active}>
<div
{...props}
ref={ref}
className={cn(widthVariants[widthVariant])}
onMouseEnter={onMouseEnter}
onMouseLeave={onMouseLeave}
onFocusCapture={onFocusCapture}
onBlurCapture={onBlurCapture}
>
{children}
</div>
</GroupContext.Provider>
);
}
@@ -96,10 +162,13 @@ function HoverableRoot({
* An element whose visibility is controlled by hover state.
*
* **Local mode** (`group` omitted): the item handles hover on its own
* element via CSS `:hover`.
* element via CSS `:hover`. This is the core abstraction.
*
* **Group mode** (`group` provided): visibility is driven by CSS `:hover`
* on the nearest `Hoverable.Root` ancestor via `[data-hover-group]:hover`.
* **Group mode** (`group` provided): visibility is driven by a matching
* `Hoverable.Root` ancestor's hover state via React context. If no
* matching Root is found, an error is thrown.
*
* Uses data-attributes for variant styling (see `styles.css`).
*
* @example
* ```tsx
@@ -115,6 +184,8 @@ function HoverableRoot({
* </Hoverable.Item>
* </Hoverable.Root>
* ```
*
* @throws If `group` is specified but no matching `Hoverable.Root` ancestor exists.
*/
function HoverableItem({
group,
@@ -123,6 +194,17 @@ function HoverableItem({
ref,
...props
}: HoverableItemProps) {
const contextValue = useContext(
group ? getOrCreateContext(group) : NOOP_CONTEXT
);
if (group && contextValue === null) {
throw new Error(
`Hoverable.Item group="${group}" has no matching Hoverable.Root ancestor. ` +
`Either wrap it in <Hoverable.Root group="${group}"> or remove the group prop for local hover.`
);
}
const isLocal = group === undefined;
return (
@@ -131,6 +213,9 @@ function HoverableItem({
ref={ref}
className={cn("hoverable-item")}
data-hoverable-variant={variant}
data-hoverable-active={
isLocal ? undefined : contextValue ? "true" : undefined
}
data-hoverable-local={isLocal ? "true" : undefined}
>
{children}
@@ -138,6 +223,9 @@ function HoverableItem({
);
}
/** Stable context used when no group is specified (local mode). */
const NOOP_CONTEXT = createContext<boolean | null>(null);
// ---------------------------------------------------------------------------
// Compound export
// ---------------------------------------------------------------------------
@@ -145,16 +233,18 @@ function HoverableItem({
/**
* Hoverable compound component for hover-to-reveal patterns.
*
* Entirely CSS-driven — no React state or context. The browser's native
* `:hover` pseudo-class handles all state, which means hover is
* automatically cleared when modals/portals steal pointer events.
* Provides two sub-components:
*
* - `Hoverable.Root` — Container with `data-hover-group`. CSS `:hover`
* on this element reveals descendant `Hoverable.Item` elements.
* - `Hoverable.Root` — A container that tracks hover state for a named group
* and provides it via React context.
*
* - `Hoverable.Item` — Hidden by default. In group mode, revealed when
* the ancestor Root is hovered. In local mode (no `group`), revealed
* when the item itself is hovered.
* - `Hoverable.Item` — The core abstraction. On its own (no `group`), it
* applies local CSS `:hover` for the variant effect. When `group` is
* specified, it reads hover state from the nearest matching
* `Hoverable.Root` — and throws if no matching Root is found.
*
* Supports nesting: a child `Hoverable.Root` shadows the parent's context,
* so each group's items only respond to their own root's hover.
*
* @example
* ```tsx
@@ -186,5 +276,4 @@ export {
type HoverableRootProps,
type HoverableItemProps,
type HoverableItemVariant,
type HoverableInteraction,
};
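The top of this diff shows the tail of a `getOrCreateContext` helper: contexts are created lazily and memoized in a module-level map, so every `Hoverable.Root`/`Hoverable.Item` pair naming the same group shares a single context object. Reduced to plain TypeScript (the names below are illustrative; a plain object stands in for React's `createContext()` result):

```typescript
// Per-key memoization, as used for per-group contexts in this diff.
type Ctx = { key: string };

const contextMap = new Map<string, Ctx>();

function getContextFor(key: string): Ctx {
  let ctx = contextMap.get(key);
  if (!ctx) {
    // Created once per key; later calls return the cached instance,
    // which is what lets nested Providers shadow each other correctly.
    ctx = { key };
    contextMap.set(key, ctx);
  }
  return ctx;
}
```

Identity matters here: `useContext` only picks up a Provider's value when both sides reference the exact same context object, which is why a shared cache, rather than repeated `createContext` calls, is required.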


@@ -7,20 +7,8 @@
opacity: 0;
}
/* Group mode — Root :hover controls descendant item visibility via CSS.
Exclude local-mode items so they aren't revealed by an ancestor root. */
[data-hover-group]:hover
.hoverable-item[data-hoverable-variant="opacity-on-hover"]:not(
[data-hoverable-local]
) {
opacity: 1;
}
/* Interaction override — force items visible via JS */
[data-hover-group][data-interaction="hover"]
.hoverable-item[data-hoverable-variant="opacity-on-hover"]:not(
[data-hoverable-local]
) {
/* Group mode — Root controls visibility via React context */
.hoverable-item[data-hoverable-variant="opacity-on-hover"][data-hoverable-active="true"] {
opacity: 1;
}
@@ -29,16 +17,7 @@
opacity: 1;
}
/* Group focus — any focusable descendant of the Root receives keyboard focus,
revealing all group items (same behavior as hover). */
[data-hover-group]:focus-within
.hoverable-item[data-hoverable-variant="opacity-on-hover"]:not(
[data-hoverable-local]
) {
opacity: 1;
}
/* Local focus — item (or a focusable descendant) receives keyboard focus */
/* Focus — item (or a focusable descendant) receives keyboard focus */
.hoverable-item[data-hoverable-variant="opacity-on-hover"]:has(:focus-visible) {
opacity: 1;
}
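The replacement selector `.hoverable-item[data-hoverable-active="true"]` only fires when the item itself computes that attribute from context. Per the component diff, that computation reduces to a small pure mapping (the helper name here is hypothetical, not from the source):

```typescript
// How Hoverable.Item derives its data attributes in group vs local mode
// (sketch; mirrors the logic in the component diff).
function itemAttrs(
  group: string | undefined,
  groupHovered: boolean
): {
  "data-hoverable-active": "true" | undefined;
  "data-hoverable-local": "true" | undefined;
} {
  const isLocal = group === undefined;
  return {
    // Group mode: attribute is present only while the Root reports hover/focus.
    "data-hoverable-active": !isLocal && groupHovered ? "true" : undefined,
    // Local mode: flag lets the stylesheet scope plain :hover rules.
    "data-hoverable-local": isLocal ? "true" : undefined,
  };
}
```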


@@ -8,7 +8,7 @@ const SvgBifrost = ({ size, className, ...props }: IconProps) => (
viewBox="0 0 37 46"
fill="none"
xmlns="http://www.w3.org/2000/svg"
className={cn(className, "!text-[#33C19E]")}
className={cn(className, "text-[#33C19E] dark:text-white")}
{...props}
>
<title>Bifrost</title>


@@ -1,116 +0,0 @@
# Card
**Import:** `import { Card } from "@opal/layouts";`
A namespace of card layout primitives. Each sub-component handles a specific region of a card.
## Card.Header
A card header layout that pairs a [`Content`](../content/README.md) block with a right-side column and an optional full-width children slot.
### Why Card.Header?
[`ContentAction`](../content-action/README.md) provides a single `rightChildren` slot. Card headers typically need two distinct right-side regions — a primary action on top and secondary actions on the bottom. `Card.Header` provides this with `rightChildren` and `bottomRightChildren` slots, plus a `children` slot for full-width content below the header row (e.g., search bars, expandable tool lists).
### Props
Inherits **all** props from [`Content`](../content/README.md) (icon, title, description, sizePreset, variant, editable, onTitleChange, suffix, etc.) plus:
| Prop | Type | Default | Description |
|---|---|---|---|
| `rightChildren` | `ReactNode` | `undefined` | Content rendered to the right of the Content block (top of right column). |
| `bottomRightChildren` | `ReactNode` | `undefined` | Content rendered below `rightChildren` in the same column. Laid out as `flex flex-row`. |
| `children` | `ReactNode` | `undefined` | Content rendered below the full header row, spanning the entire width. |
### Layout Structure
```
+---------------------------------------------------------+
| [Content (p-2, self-start)] [rightChildren] |
| icon + title + description [bottomRightChildren] |
+---------------------------------------------------------+
| [children — full width] |
+---------------------------------------------------------+
```
- Outer wrapper: `flex flex-col w-full`
- Header row: `flex flex-row items-stretch w-full`
- Content area: `flex-1 min-w-0 self-start p-2` — top-aligned with fixed padding
- Right column: `flex flex-col items-end shrink-0` — no padding, no gap
- `bottomRightChildren` wrapper: `flex flex-row` — lays children out horizontally
- `children` wrapper: `w-full` — only rendered when children are provided
### Usage
#### Card with primary and secondary actions
```tsx
import { Card } from "@opal/layouts";
import { Button } from "@opal/components";
import { SvgGlobe, SvgSettings, SvgUnplug, SvgCheckSquare } from "@opal/icons";
<Card.Header
icon={SvgGlobe}
title="Google Search"
description="Web search provider"
sizePreset="main-ui"
variant="section"
rightChildren={
<Button icon={SvgCheckSquare} variant="action" prominence="tertiary">
Current Default
</Button>
}
bottomRightChildren={
<>
<Button icon={SvgUnplug} size="sm" prominence="tertiary" tooltip="Disconnect" />
<Button icon={SvgSettings} size="sm" prominence="tertiary" tooltip="Edit" />
</>
}
/>
```
#### Card with only a connect action
```tsx
<Card.Header
icon={SvgCloud}
title="OpenAI"
description="Not configured"
sizePreset="main-ui"
variant="section"
rightChildren={
<Button rightIcon={SvgArrowExchange} prominence="tertiary">
Connect
</Button>
}
/>
```
#### Card with expandable children
```tsx
<Card.Header
icon={SvgServer}
title="MCP Server"
description="12 tools available"
sizePreset="main-ui"
variant="section"
rightChildren={<Button icon={SvgSettings} prominence="tertiary" />}
>
<SearchBar placeholder="Search tools..." />
</Card.Header>
```
#### No right children
```tsx
<Card.Header
icon={SvgInfo}
title="Section Header"
description="Description text"
sizePreset="main-content"
variant="section"
/>
```
When both `rightChildren` and `bottomRightChildren` are omitted and no `children` are provided, the component renders only the padded `Content`.
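The closing note describes when each region renders; that slot-presence rule can be sketched in plain TypeScript (the helper and type names are hypothetical, not from the component source):

```typescript
// Which Card.Header regions render, per the README's layout rules.
interface HeaderSlots {
  rightChildren?: unknown;
  bottomRightChildren?: unknown;
  children?: unknown;
}

function regionsToRender(slots: HeaderSlots) {
  return {
    // The right column appears when either right-side slot is filled.
    rightColumn:
      slots.rightChildren != null || slots.bottomRightChildren != null,
    // The full-width row below the header only renders with children present.
    fullWidthRow: slots.children != null,
  };
}
```

With every slot empty, only the padded `Content` block remains, matching the note above.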


@@ -1,5 +1,5 @@
import type { Meta, StoryObj } from "@storybook/react";
import { Card } from "@opal/layouts";
import { CardHeaderLayout } from "@opal/layouts";
import { Button } from "@opal/components";
import {
SvgArrowExchange,
@@ -18,14 +18,14 @@ const withTooltipProvider: Decorator = (Story) => (
);
const meta = {
title: "Layouts/Card.Header",
component: Card.Header,
title: "Layouts/CardHeaderLayout",
component: CardHeaderLayout,
tags: ["autodocs"],
decorators: [withTooltipProvider],
parameters: {
layout: "centered",
},
} satisfies Meta<typeof Card.Header>;
} satisfies Meta<typeof CardHeaderLayout>;
export default meta;
@@ -38,7 +38,7 @@ type Story = StoryObj<typeof meta>;
export const Default: Story = {
render: () => (
<div className="w-[28rem] border rounded-16">
<Card.Header
<CardHeaderLayout
sizePreset="main-ui"
variant="section"
icon={SvgGlobe}
@@ -57,7 +57,7 @@ export const Default: Story = {
export const WithBothSlots: Story = {
render: () => (
<div className="w-[28rem] border rounded-16">
<Card.Header
<CardHeaderLayout
sizePreset="main-ui"
variant="section"
icon={SvgGlobe}
@@ -92,7 +92,7 @@ export const WithBothSlots: Story = {
export const RightChildrenOnly: Story = {
render: () => (
<div className="w-[28rem] border rounded-16">
<Card.Header
<CardHeaderLayout
sizePreset="main-ui"
variant="section"
icon={SvgGlobe}
@@ -111,7 +111,7 @@ export const RightChildrenOnly: Story = {
export const NoRightChildren: Story = {
render: () => (
<div className="w-[28rem] border rounded-16">
<Card.Header
<CardHeaderLayout
sizePreset="main-ui"
variant="section"
icon={SvgGlobe}
@@ -125,7 +125,7 @@ export const NoRightChildren: Story = {
export const LongContent: Story = {
render: () => (
<div className="w-[28rem] border rounded-16">
<Card.Header
<CardHeaderLayout
sizePreset="main-ui"
variant="section"
icon={SvgGlobe}

Some files were not shown because too many files have changed in this diff.