mirror of
https://github.com/onyx-dot-app/onyx.git
synced 2026-04-08 00:12:45 +00:00
Compare commits: `main...bo/index_p` — 1 commit (`3d2cc175a8`)
```diff
@@ -1 +0,0 @@
-../../../cli/internal/embedded/SKILL.md
```
.cursor/skills/onyx-cli/SKILL.md (new file, 186 lines)

@@ -0,0 +1,186 @@
---
name: onyx-cli
description: Query the Onyx knowledge base using the onyx-cli command. Use when the user wants to search company documents, ask questions about internal knowledge, query connected data sources, or look up information stored in Onyx.
---

# Onyx CLI — Agent Tool

Onyx is an enterprise search and Gen-AI platform that connects to company documents, apps, and people. The `onyx-cli` command provides non-interactive subcommands to query the Onyx knowledge base and list available agents.

## Prerequisites

### 1. Check if installed

```bash
which onyx-cli
```

### 2. Install (if needed)

**Primary — pip:**

```bash
pip install onyx-cli
```

**From source (Go):**

```bash
cd cli && go build -o onyx-cli . && sudo mv onyx-cli /usr/local/bin/
```

### 3. Check if configured

```bash
onyx-cli validate-config
```

This checks that the config file exists and an API key is present, then tests the server connection via `/api/me`. Exit code 0 on success; non-zero with a descriptive error on failure.
If unconfigured, you have two options:

**Option A — Interactive setup (requires user input):**

```bash
onyx-cli configure
```

This prompts for the Onyx server URL and API key, tests the connection, and saves the config.

**Option B — Environment variables (non-interactive, preferred for agents):**

```bash
export ONYX_SERVER_URL="https://your-onyx-server.com"  # default: https://cloud.onyx.app
export ONYX_API_KEY="your-api-key"
```

Environment variables override the config file. If these are set, no config file is needed.

| Variable | Required | Description |
|----------|----------|-------------|
| `ONYX_SERVER_URL` | No | Onyx server base URL (default: `https://cloud.onyx.app`) |
| `ONYX_API_KEY` | Yes | API key for authentication |
| `ONYX_PERSONA_ID` | No | Default agent/persona ID |

If neither the config file nor environment variables are set, tell the user that `onyx-cli` needs to be configured and ask them to either:

- Run `onyx-cli configure` interactively, or
- Set `ONYX_SERVER_URL` and `ONYX_API_KEY` environment variables
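The resolution order above can be sketched as a small shell helper. This is a minimal sketch, not part of the CLI; the config file path `~/.onyx/config.yaml` is an assumption for illustration (the real path is not documented here):

```shell
# Report which configuration source the CLI would resolve to.
# NOTE: the config file path below is a hypothetical example.
onyx_config_source() {
  if [ -n "${ONYX_API_KEY:-}" ]; then
    echo "env"           # env vars take precedence over the config file
  elif [ -f "${HOME}/.onyx/config.yaml" ]; then
    echo "file"
  else
    echo "unconfigured"  # ask the user to run `onyx-cli configure`
  fi
}
```

An agent can branch on this before calling `ask`, e.g. surface the two setup options when it prints `unconfigured`.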
## Commands

### Validate configuration

```bash
onyx-cli validate-config
```

Checks that the config file exists and an API key is present, and tests the server connection. Use this before `ask` or `agents` to confirm the CLI is properly set up.

### List available agents

```bash
onyx-cli agents
```

Prints a table of agent IDs, names, and descriptions. Use `--json` for structured output:

```bash
onyx-cli agents --json
```

Use agent IDs with `ask --agent-id` to query a specific agent.

### Basic query (plain text output)

```bash
onyx-cli ask "What is our company's PTO policy?"
```

Streams the answer as plain text to stdout. Exit code 0 on success, non-zero on error.
### JSON output (structured events)

```bash
onyx-cli ask --json "What authentication methods do we support?"
```

Outputs parsed stream events as JSON, one object per line (NDJSON). Key event types include message deltas, stop, errors, search start, and citation payloads.

Each line is a JSON object with this envelope:

```json
{"type": "<event_type>", "event": { ... }}
```

| Event Type | Description |
|------------|-------------|
| `message_delta` | Content token — concatenate all `content` fields for the full answer |
| `stop` | Stream complete |
| `error` | Error with `error` message field |
| `search_tool_start` | Onyx started searching documents |
| `citation_info` | Source citation — see shape below |

`citation_info` event shape:

```json
{
  "type": "citation_info",
  "event": {
    "citation_number": 1,
    "document_id": "abc123def456",
    "placement": {"turn_index": 0, "tab_index": 0, "sub_turn_index": null}
  }
}
```

`placement` is metadata about where in the conversation the citation appeared; it can be ignored for most use cases.
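When consuming `--json` output in a pipeline, the delta and citation events can be reassembled with `jq`. A minimal sketch — the sample events below are stand-ins for real `onyx-cli ask --json` output and only assume the envelope documented above:

```shell
# Sample NDJSON events (stand-in for: onyx-cli ask --json "...").
events='{"type":"message_delta","event":{"content":"PTO is "}}
{"type":"message_delta","event":{"content":"20 days."}}
{"type":"citation_info","event":{"citation_number":1,"document_id":"abc123def456"}}
{"type":"stop","event":{}}'

# Concatenate message_delta content fields into the full answer.
answer=$(printf '%s\n' "$events" | jq -j 'select(.type=="message_delta") | .event.content')

# Collect citations as "[n] document_id" lines.
citations=$(printf '%s\n' "$events" | jq -r 'select(.type=="citation_info") | "[\(.event.citation_number)] \(.event.document_id)"')

echo "$answer"
echo "$citations"
```

In real use, replace the `printf` with the `onyx-cli ask --json ...` invocation.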
### Specify an agent

```bash
onyx-cli ask --agent-id 5 "Summarize our Q4 roadmap"
```

Uses a specific Onyx agent/persona instead of the default.

### All flags

| Flag | Type | Description |
|------|------|-------------|
| `--agent-id` | int | Agent ID to use (overrides default) |
| `--json` | bool | Output raw NDJSON events instead of plain text |
## Statelessness

Each `onyx-cli ask` call creates an independent chat session. There is no built-in way to chain context across multiple `ask` invocations — every call starts fresh. If you need a multi-turn conversation with memory, use the interactive TUI (`onyx-cli` or `onyx-cli chat`) instead.
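Since each call is stateless, one way to fake a follow-up turn is to fold the previous answer into the next question. `compose_followup` is a hypothetical helper, not part of the CLI:

```shell
# Build a follow-up prompt that carries the previous answer forward,
# since every `ask` invocation starts a fresh session.
compose_followup() {
  prev_answer="$1"
  question="$2"
  printf 'Context from a previous answer:\n%s\n\nQuestion: %s\n' \
    "$prev_answer" "$question"
}

# Sketch of two chained calls (requires a configured onyx-cli):
#   first=$(onyx-cli ask "What is our deployment process?")
#   onyx-cli ask "$(compose_followup "$first" "Which step requires approval?")"
```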
## When to Use

Use `onyx-cli ask` when:

- The user asks about company-specific information (policies, docs, processes)
- You need to search internal knowledge bases or connected data sources
- The user references Onyx, asks you to "search Onyx", or wants to query their documents
- You need context from company wikis, Confluence, Google Drive, Slack, or other connected sources

Do NOT use when:

- The question is about general programming knowledge (use your own knowledge)
- The user is asking about code in the current repository (use grep/read tools)
- The user hasn't mentioned Onyx and the question doesn't require internal company data
## Examples

```bash
# Simple question
onyx-cli ask "What are the steps to deploy to production?"

# Get structured output for parsing
onyx-cli ask --json "List all active API integrations"

# Use a specialized agent
onyx-cli ask --agent-id 3 "What were the action items from last week's standup?"

# Pipe the answer into another command
onyx-cli ask "What is the database schema for users?" | head -20
```
.github/workflows/deployment.yml (vendored, 2 changed lines)

```diff
@@ -228,7 +228,7 @@ jobs:
       - name: Create GitHub Release
         id: create-release
-        uses: softprops/action-gh-release@153bb8e04406b158c6c84fc1615b65b24149a1fe # ratchet:softprops/action-gh-release@v2
+        uses: softprops/action-gh-release@da05d552573ad5aba039eaac05058a918a7bf631 # ratchet:softprops/action-gh-release@v2
        with:
          tag_name: ${{ steps.release-tag.outputs.tag }}
          name: ${{ steps.release-tag.outputs.tag }}
```
.github/workflows/helm-chart-releases.yml (vendored, 2 changed lines)

```diff
@@ -21,7 +21,7 @@ jobs:
           persist-credentials: false

       - name: Install Helm CLI
-        uses: azure/setup-helm@dda3372f752e03dde6b3237bc9431cdc2f7a02a2 # ratchet:azure/setup-helm@v5.0.0
+        uses: azure/setup-helm@1a275c3b69536ee54be43f2070a358922e12c8d4 # ratchet:azure/setup-helm@v4
        with:
          version: v3.12.1
```
```diff
@@ -13,7 +13,7 @@ jobs:
     runs-on: ubuntu-latest
     timeout-minutes: 45
     steps:
-      - uses: actions/stale@b5d41d4e1d5dceea10e7104786b73624c18a190f # ratchet:actions/stale@v10
+      - uses: actions/stale@997185467fa4f803885201cee163a9f38240193d # ratchet:actions/stale@v10
        with:
          stale-issue-message: 'This issue is stale because it has been open 75 days with no activity. Remove stale label or comment or this will be closed in 15 days.'
          stale-pr-message: 'This PR is stale because it has been open 75 days with no activity. Remove stale label or comment or this will be closed in 15 days.'
```
.github/workflows/pr-helm-chart-testing.yml (vendored, 2 changed lines)

```diff
@@ -36,7 +36,7 @@ jobs:
           persist-credentials: false

       - name: Set up Helm
-        uses: azure/setup-helm@dda3372f752e03dde6b3237bc9431cdc2f7a02a2 # ratchet:azure/setup-helm@v5.0.0
+        uses: azure/setup-helm@1a275c3b69536ee54be43f2070a358922e12c8d4 # ratchet:azure/setup-helm@v4.3.1
        with:
          version: v3.19.0
```
.gitignore (vendored, 3 changed lines)

```diff
@@ -59,6 +59,3 @@ node_modules

-# plans
-plans/
-
 # Added context for LLMs
 onyx-llm-context/
```
```diff
@@ -1,4 +1,4 @@
-from typing import Any
+from typing import Any, Literal

 from onyx.db.engine.iam_auth import get_iam_auth_token
 from onyx.configs.app_configs import USE_IAM_AUTH
 from onyx.configs.app_configs import POSTGRES_HOST
@@ -19,6 +19,7 @@ from logging.config import fileConfig

 from alembic import context
 from sqlalchemy.ext.asyncio import create_async_engine
+from sqlalchemy.sql.schema import SchemaItem
 from onyx.configs.constants import SSL_CERT_FILE
 from shared_configs.configs import (
     MULTI_TENANT,
@@ -44,6 +45,8 @@ if config.config_file_name is not None and config.attributes.get(

 target_metadata = [Base.metadata, ResultModelBase.metadata]

+EXCLUDE_TABLES = {"kombu_queue", "kombu_message"}
+
 logger = logging.getLogger(__name__)

 ssl_context: ssl.SSLContext | None = None
@@ -53,6 +56,25 @@ if USE_IAM_AUTH:
         ssl_context = ssl.create_default_context(cafile=SSL_CERT_FILE)


+def include_object(
+    object: SchemaItem,  # noqa: ARG001
+    name: str | None,
+    type_: Literal[
+        "schema",
+        "table",
+        "column",
+        "index",
+        "unique_constraint",
+        "foreign_key_constraint",
+    ],
+    reflected: bool,  # noqa: ARG001
+    compare_to: SchemaItem | None,  # noqa: ARG001
+) -> bool:
+    if type_ == "table" and name in EXCLUDE_TABLES:
+        return False
+    return True
+
+
 def filter_tenants_by_range(
     tenant_ids: list[str], start_range: int | None = None, end_range: int | None = None
 ) -> list[str]:
@@ -209,6 +231,7 @@ def do_run_migrations(
         context.configure(
             connection=connection,
             target_metadata=target_metadata,  # type: ignore
+            include_object=include_object,
             version_table_schema=schema_name,
             include_schemas=True,
             compare_type=True,
@@ -382,6 +405,7 @@ def run_migrations_offline() -> None:
             url=url,
             target_metadata=target_metadata,  # type: ignore
             literal_binds=True,
+            include_object=include_object,
             version_table_schema=schema,
             include_schemas=True,
             script_location=config.get_main_option("script_location"),
@@ -423,6 +447,7 @@ def run_migrations_offline() -> None:
             url=url,
             target_metadata=target_metadata,  # type: ignore
             literal_binds=True,
+            include_object=include_object,
             version_table_schema=schema,
             include_schemas=True,
             script_location=config.get_main_option("script_location"),
@@ -465,6 +490,7 @@ def run_migrations_online() -> None:
         context.configure(
             connection=connection,
             target_metadata=target_metadata,  # type: ignore
+            include_object=include_object,
             version_table_schema=schema_name,
             include_schemas=True,
             compare_type=True,
```
```diff
@@ -1,9 +1,11 @@
 import asyncio
 from logging.config import fileConfig
+from typing import Literal

 from sqlalchemy import pool
 from sqlalchemy.engine import Connection
 from sqlalchemy.ext.asyncio import create_async_engine
+from sqlalchemy.schema import SchemaItem

 from alembic import context
 from onyx.db.engine.sql_engine import build_connection_string
@@ -33,6 +35,27 @@ target_metadata = [PublicBase.metadata]
 # my_important_option = config.get_main_option("my_important_option")
 # ... etc.

+EXCLUDE_TABLES = {"kombu_queue", "kombu_message"}
+
+
+def include_object(
+    object: SchemaItem,  # noqa: ARG001
+    name: str | None,
+    type_: Literal[
+        "schema",
+        "table",
+        "column",
+        "index",
+        "unique_constraint",
+        "foreign_key_constraint",
+    ],
+    reflected: bool,  # noqa: ARG001
+    compare_to: SchemaItem | None,  # noqa: ARG001
+) -> bool:
+    if type_ == "table" and name in EXCLUDE_TABLES:
+        return False
+    return True
+
+
 def run_migrations_offline() -> None:
     """Run migrations in 'offline' mode.
@@ -62,6 +85,7 @@ def do_run_migrations(connection: Connection) -> None:
     context.configure(
         connection=connection,
         target_metadata=target_metadata,  # type: ignore[arg-type]
+        include_object=include_object,
     )

     with context.begin_transaction():
```
```diff
@@ -5,7 +5,6 @@ from celery import Task
 from celery.exceptions import SoftTimeLimitExceeded
 from redis.lock import Lock as RedisLock

-from ee.onyx.server.tenants.product_gating import get_gated_tenants
 from onyx.background.celery.apps.app_base import task_logger
 from onyx.background.celery.tasks.beat_schedule import BEAT_EXPIRES_DEFAULT
 from onyx.configs.constants import CELERY_GENERIC_BEAT_LOCK_TIMEOUT
@@ -31,7 +30,6 @@ def cloud_beat_task_generator(
     queue: str = OnyxCeleryTask.DEFAULT,
     priority: int = OnyxCeleryPriority.MEDIUM,
     expires: int = BEAT_EXPIRES_DEFAULT,
-    skip_gated: bool = True,
 ) -> bool | None:
     """a lightweight task used to kick off individual beat tasks per tenant."""
     time_start = time.monotonic()
@@ -50,22 +48,20 @@ def cloud_beat_task_generator(
     last_lock_time = time.monotonic()
     tenant_ids: list[str] = []
     num_processed_tenants = 0
-    num_skipped_gated = 0

     try:
         tenant_ids = get_all_tenant_ids()

-        # Per-task control over whether gated tenants are included. Most periodic tasks
-        # do no useful work on gated tenants and just waste DB connections fanning out
-        # to ~10k+ inactive tenants. A small number of cleanup tasks (connector deletion,
-        # checkpoint/index attempt cleanup) need to run on gated tenants and pass
-        # `skip_gated=False` from the beat schedule.
-        gated_tenants: set[str] = get_gated_tenants() if skip_gated else set()
+        # NOTE: for now, we are running tasks for gated tenants, since we want to allow
+        # connector deletion to run successfully. The new plan is to continuously prune
+        # the gated tenants set, so we won't have a build up of old, unused gated tenants.
+        # Keeping this around in case we want to revert to the previous behavior.
+        # gated_tenants = get_gated_tenants()

         for tenant_id in tenant_ids:
-            if tenant_id in gated_tenants:
-                num_skipped_gated += 1
-                continue
+            # Same comment here as the above NOTE
+            # if tenant_id in gated_tenants:
+            #     continue

             current_time = time.monotonic()
             if current_time - last_lock_time >= (CELERY_GENERIC_BEAT_LOCK_TIMEOUT / 4):
@@ -108,7 +104,6 @@ def cloud_beat_task_generator(
             f"cloud_beat_task_generator finished: "
             f"task={task_name} "
             f"num_processed_tenants={num_processed_tenants} "
-            f"num_skipped_gated={num_skipped_gated} "
             f"num_tenants={len(tenant_ids)} "
             f"elapsed={time_elapsed:.2f}"
         )
```
```diff
@@ -27,13 +27,13 @@ from shared_configs.configs import MULTI_TENANT
 from shared_configs.configs import TENANT_ID_PREFIX

 # Maximum tenants to provision in a single task run.
-# Each tenant takes ~80s (alembic migrations), so 15 tenants ≈ 20 minutes.
-_MAX_TENANTS_PER_RUN = 15
+# Each tenant takes ~80s (alembic migrations), so 5 tenants ≈ 7 minutes.
+_MAX_TENANTS_PER_RUN = 5

 # Time limits sized for worst-case: provisioning up to _MAX_TENANTS_PER_RUN new tenants
 # (~90s each) plus migrating up to TARGET_AVAILABLE_TENANTS pool tenants (~90s each).
-_TENANT_PROVISIONING_SOFT_TIME_LIMIT = 60 * 40  # 40 minutes
-_TENANT_PROVISIONING_TIME_LIMIT = 60 * 45  # 45 minutes
+_TENANT_PROVISIONING_SOFT_TIME_LIMIT = 60 * 20  # 20 minutes
+_TENANT_PROVISIONING_TIME_LIMIT = 60 * 25  # 25 minutes


 @shared_task(
```
```diff
@@ -1,14 +1,20 @@
+from datetime import datetime
+from datetime import timezone
+from uuid import UUID
+
 from celery import shared_task
+from celery import Task

 from ee.onyx.background.celery_utils import should_perform_chat_ttl_check
 from ee.onyx.background.task_name_builders import name_chat_ttl_task
 from onyx.configs.app_configs import JOB_TIMEOUT
 from onyx.configs.constants import OnyxCeleryTask
 from onyx.db.chat import delete_chat_session
 from onyx.db.chat import get_chat_sessions_older_than
 from onyx.db.engine.sql_engine import get_session_with_current_tenant
+from onyx.db.enums import TaskStatus
+from onyx.db.tasks import mark_task_as_finished_with_id
+from onyx.db.tasks import register_task
 from onyx.server.settings.store import load_settings
 from onyx.utils.logger import setup_logger
@@ -23,42 +29,59 @@ logger = setup_logger()
     trail=False,
 )
 def perform_ttl_management_task(
-    self: Task, retention_limit_days: int, *, tenant_id: str  # noqa: ARG001
+    self: Task, retention_limit_days: int, *, tenant_id: str
 ) -> None:
+    task_id = self.request.id
+    if not task_id:
+        raise RuntimeError("No task id defined for this task; cannot identify it")
+
+    start_time = datetime.now(tz=timezone.utc)
+
+    user_id: UUID | None = None
+    session_id: UUID | None = None
+    try:
+        with get_session_with_current_tenant() as db_session:
+            # we generally want to move off this, but keeping for now
+            register_task(
+                db_session=db_session,
+                task_name=name_chat_ttl_task(retention_limit_days, tenant_id),
+                task_id=task_id,
+                status=TaskStatus.STARTED,
+                start_time=start_time,
+            )

             old_chat_sessions = get_chat_sessions_older_than(
                 retention_limit_days, db_session
             )

         for user_id, session_id in old_chat_sessions:
-            try:
-                with get_session_with_current_tenant() as db_session:
-                    delete_chat_session(
-                        user_id,
-                        session_id,
-                        db_session,
-                        include_deleted=True,
-                        hard_delete=True,
-                    )
-            except Exception:
-                logger.exception(
-                    "Failed to delete chat session "
-                    f"user_id={user_id} session_id={session_id}, "
-                    "continuing with remaining sessions"
-                )
+            # one session per delete so that we don't blow up if a deletion fails.
+            with get_session_with_current_tenant() as db_session:
+                delete_chat_session(
+                    user_id,
+                    session_id,
+                    db_session,
+                    include_deleted=True,
+                    hard_delete=True,
+                )
+
+        with get_session_with_current_tenant() as db_session:
+            mark_task_as_finished_with_id(
+                db_session=db_session,
+                task_id=task_id,
+                success=True,
+            )
+
+    except Exception:
+        logger.exception(
+            f"delete_chat_session exceptioned. user_id={user_id} session_id={session_id}"
+        )
+        with get_session_with_current_tenant() as db_session:
+            mark_task_as_finished_with_id(
+                db_session=db_session,
+                task_id=task_id,
+                success=False,
+            )
+        raise
```
```diff
@@ -1,7 +1,6 @@
 # Overview of Onyx Background Jobs

 The background jobs take care of:

 1. Pulling/Indexing documents (from connectors)
 2. Updating document metadata (from connectors)
 3. Cleaning up checkpoints and logic around indexing work (indexing checkpoints and index attempt metadata)
@@ -10,41 +9,37 @@ The background jobs take care of:

 ## Worker → Queue Mapping

 | Worker | File | Queues |
 |--------|------|--------|
 | Primary | `apps/primary.py` | `celery` |
 | Light | `apps/light.py` | `vespa_metadata_sync`, `connector_deletion`, `doc_permissions_upsert`, `checkpoint_cleanup`, `index_attempt_cleanup` |
 | Heavy | `apps/heavy.py` | `connector_pruning`, `connector_doc_permissions_sync`, `connector_external_group_sync`, `csv_generation`, `sandbox` |
 | Docprocessing | `apps/docprocessing.py` | `docprocessing` |
 | Docfetching | `apps/docfetching.py` | `connector_doc_fetching` |
 | User File Processing | `apps/user_file_processing.py` | `user_file_processing`, `user_file_project_sync`, `user_file_delete` |
 | Monitoring | `apps/monitoring.py` | `monitoring` |
 | Background (consolidated) | `apps/background.py` | All queues above except `celery` |

 ## Non-Worker Apps

 | App | File | Purpose |
 |-----|------|---------|
 | **Beat** | `beat.py` | Celery beat scheduler with `DynamicTenantScheduler` that generates per-tenant periodic task schedules |
 | **Client** | `client.py` | Minimal app for task submission from non-worker processes (e.g., API server) |

 ### Shared Module

 `app_base.py` provides:

 - `TenantAwareTask` - Base task class that sets tenant context
 - Signal handlers for logging, cleanup, and lifecycle events
 - Readiness probes and health checks


 ## Worker Details

 ### Primary (Coordinator and task dispatcher)

 It is the single worker that handles tasks from the default celery queue. It is a singleton worker, enforced by the `PRIMARY_WORKER` Redis lock,
 which it touches every `CELERY_PRIMARY_WORKER_LOCK_TIMEOUT / 8` seconds (using Celery Bootsteps).

 On startup:

 - waits for redis, postgres, and the document index to all be healthy
 - acquires the singleton lock
 - cleans all the redis state associated with background jobs
@@ -52,34 +47,34 @@ On startup:

 Then it cycles through its tasks as scheduled by Celery Beat:

 | Task | Frequency | Description |
 |------|-----------|-------------|
 | `check_for_indexing` | 15s | Scans for connectors needing indexing → dispatches to `DOCFETCHING` queue |
 | `check_for_vespa_sync_task` | 20s | Finds stale documents/document sets → dispatches sync tasks to `VESPA_METADATA_SYNC` queue |
 | `check_for_pruning` | 20s | Finds connectors due for pruning → dispatches to `CONNECTOR_PRUNING` queue |
 | `check_for_connector_deletion` | 20s | Processes deletion requests → dispatches to `CONNECTOR_DELETION` queue |
 | `check_for_user_file_processing` | 20s | Checks for user uploads → dispatches to `USER_FILE_PROCESSING` queue |
 | `check_for_checkpoint_cleanup` | 1h | Cleans up old indexing checkpoints |
 | `check_for_index_attempt_cleanup` | 30m | Cleans up old index attempts |
+| `kombu_message_cleanup_task` | periodic | Cleans orphaned Kombu messages from the DB (Kombu being the messaging framework used by Celery) |
 | `celery_beat_heartbeat` | 1m | Heartbeat for Beat watchdog |

 Watchdog is a separate Python process managed by supervisord which runs alongside the celery workers. It checks the ONYX_CELERY_BEAT_HEARTBEAT_KEY in
 Redis to ensure Celery Beat is not dead. Beat schedules `celery_beat_heartbeat` for Primary to touch the key and signal that it is still alive.
 See supervisord.conf for the watchdog config.

 ### Light

 Fast, short-lived tasks that are not resource intensive. High concurrency:
 can have 24 concurrent workers, each with a prefetch of 8, for a total of 192 tasks in flight at once.

 Tasks it handles:

 - Syncs access/permissions, document sets, boosts, hidden state
 - Deletes documents that are marked for deletion in Postgres
 - Cleanup of checkpoints and index attempts

 ### Heavy

 Long-running, resource-intensive tasks; handles pruning and sandbox operations. Low concurrency - max concurrency of 4 with a prefetch of 1.

 Does not interact with the Document Index; it handles the syncs with external systems. Large-volume API calls to handle pruning, fetching permissions, etc.
@@ -88,24 +83,16 @@ Generates CSV exports which may take a long time with significant data in Postgres

 Sandbox (new feature) for running Next.js, a Python virtual env, the OpenCode AI Agent, and access to knowledge files


 ### Docprocessing, Docfetching, User File Processing

 Docprocessing and Docfetching are for indexing documents:

 - Docfetching runs connectors to pull documents from external APIs (Google Drive, Confluence, etc.), stores batches to file storage, and dispatches docprocessing tasks
 - Docprocessing retrieves batches, runs the indexing pipeline (chunking, embedding), and indexes into the Document Index

 User Files come from uploads directly via the input bar


 ### Monitoring

 Observability and metrics collection:

 - Queue lengths, connector success/failure, connector latencies
 - Memory of supervisor-managed processes (workers, beat, slack)
 - Cloud and multi-tenant specific monitoring

 ## Prometheus Metrics

 Workers can expose Prometheus metrics via a standalone HTTP server. Currently docfetching and docprocessing have push-based task lifecycle metrics; the monitoring worker runs pull-based collectors for queue depth and connector health.

 For the full metric reference, integration guide, and PromQL examples, see [`docs/METRICS.md`](../../../docs/METRICS.md#celery-worker-metrics).
```
```diff
@@ -13,6 +13,12 @@ from celery.signals import worker_shutdown
 import onyx.background.celery.apps.app_base as app_base
 from onyx.configs.constants import POSTGRES_CELERY_WORKER_HEAVY_APP_NAME
 from onyx.db.engine.sql_engine import SqlEngine
+from onyx.server.metrics.celery_task_metrics import on_celery_task_postrun
+from onyx.server.metrics.celery_task_metrics import on_celery_task_prerun
+from onyx.server.metrics.celery_task_metrics import on_celery_task_rejected
+from onyx.server.metrics.celery_task_metrics import on_celery_task_retry
+from onyx.server.metrics.celery_task_metrics import on_celery_task_revoked
+from onyx.server.metrics.metrics_server import start_metrics_server
 from onyx.utils.logger import setup_logger
 from shared_configs.configs import MULTI_TENANT
@@ -34,6 +40,7 @@ def on_task_prerun(
     **kwds: Any,
 ) -> None:
     app_base.on_task_prerun(sender, task_id, task, args, kwargs, **kwds)
+    on_celery_task_prerun(task_id, task)


 @signals.task_postrun.connect
@@ -48,6 +55,31 @@ def on_task_postrun(
     **kwds: Any,
 ) -> None:
     app_base.on_task_postrun(sender, task_id, task, args, kwargs, retval, state, **kwds)
+    on_celery_task_postrun(task_id, task, state)
+
+
+@signals.task_retry.connect
+def on_task_retry(sender: Any | None = None, **kwargs: Any) -> None:  # noqa: ARG001
+    task_id = getattr(getattr(sender, "request", None), "id", None)
+    on_celery_task_retry(task_id, sender)
+
+
+@signals.task_revoked.connect
+def on_task_revoked(sender: Any | None = None, **kwargs: Any) -> None:
+    task_name = getattr(sender, "name", None) or str(sender)
+    on_celery_task_revoked(kwargs.get("task_id"), task_name)
+
+
+@signals.task_rejected.connect
+def on_task_rejected(sender: Any | None = None, **kwargs: Any) -> None:  # noqa: ARG001
+    message = kwargs.get("message")
+    task_name: str | None = None
+    if message is not None:
+        headers = getattr(message, "headers", None) or {}
+        task_name = headers.get("task")
+    if task_name is None:
+        task_name = "unknown"
+    on_celery_task_rejected(None, task_name)


 @celeryd_init.connect
@@ -76,6 +108,7 @@ def on_worker_init(sender: Worker, **kwargs: Any) -> None:

 @worker_ready.connect
 def on_worker_ready(sender: Any, **kwargs: Any) -> None:
+    start_metrics_server("heavy")
     app_base.on_worker_ready(sender, **kwargs)
```
@@ -317,6 +317,7 @@ celery_app.autodiscover_tasks(
    "onyx.background.celery.tasks.docprocessing",
    "onyx.background.celery.tasks.evals",
    "onyx.background.celery.tasks.hierarchyfetching",
    "onyx.background.celery.tasks.periodic",
    "onyx.background.celery.tasks.pruning",
    "onyx.background.celery.tasks.shared",
    "onyx.background.celery.tasks.vespa",
@@ -75,8 +75,6 @@ beat_task_templates: list[dict] = [
        "options": {
            "priority": OnyxCeleryPriority.LOW,
            "expires": BEAT_EXPIRES_DEFAULT,
            # Run on gated tenants too — they may still have stale checkpoints to clean.
            "skip_gated": False,
        },
    },
    {
@@ -86,8 +84,6 @@ beat_task_templates: list[dict] = [
        "options": {
            "priority": OnyxCeleryPriority.MEDIUM,
            "expires": BEAT_EXPIRES_DEFAULT,
            # Run on gated tenants too — they may still have stale index attempts.
            "skip_gated": False,
        },
    },
    {
@@ -97,8 +93,6 @@ beat_task_templates: list[dict] = [
        "options": {
            "priority": OnyxCeleryPriority.MEDIUM,
            "expires": BEAT_EXPIRES_DEFAULT,
            # Gated tenants may still have connectors awaiting deletion.
            "skip_gated": False,
        },
    },
    {
@@ -272,7 +266,7 @@ def make_cloud_generator_task(task: dict[str, Any]) -> dict[str, Any]:
    cloud_task["kwargs"] = {}
    cloud_task["kwargs"]["task_name"] = task["task"]

    optional_fields = ["queue", "priority", "expires", "skip_gated"]
    optional_fields = ["queue", "priority", "expires"]
    for field in optional_fields:
        if field in task["options"]:
            cloud_task["kwargs"][field] = task["options"][field]
@@ -308,7 +302,7 @@ beat_cloud_tasks: list[dict] = [
    {
        "name": f"{ONYX_CLOUD_CELERY_TASK_PREFIX}_check-available-tenants",
        "task": OnyxCeleryTask.CLOUD_CHECK_AVAILABLE_TENANTS,
        "schedule": timedelta(minutes=2),
        "schedule": timedelta(minutes=10),
        "options": {
            "queue": OnyxCeleryQueues.MONITORING,
            "priority": OnyxCeleryPriority.HIGH,
@@ -365,13 +359,7 @@ if not MULTI_TENANT:
        ]
    )

    # `skip_gated` is a cloud-only hint consumed by `cloud_beat_task_generator`. Strip
    # it before extending the self-hosted schedule so it doesn't leak into apply_async
    # as an unrecognised option on every fired task message.
    for _template in beat_task_templates:
        _self_hosted_template = copy.deepcopy(_template)
        _self_hosted_template["options"].pop("skip_gated", None)
        tasks_to_schedule.append(_self_hosted_template)
    tasks_to_schedule.extend(beat_task_templates)


def generate_cloud_tasks(
@@ -36,7 +36,6 @@ from onyx.configs.constants import OnyxRedisLocks
from onyx.db.engine.sql_engine import get_session_with_current_tenant
from onyx.db.opensearch_migration import build_sanitized_to_original_doc_id_mapping
from onyx.db.opensearch_migration import get_vespa_visit_state
from onyx.db.opensearch_migration import is_migration_completed
from onyx.db.opensearch_migration import (
    mark_migration_completed_time_if_not_set_with_commit,
)
@@ -107,19 +106,14 @@ def migrate_chunks_from_vespa_to_opensearch_task(
        acquired; effectively a no-op. True if the task completed
        successfully. False if the task errored.
    """
    # 1. Check if we should run the task.
    # 1.a. If OpenSearch indexing is disabled, we don't run the task.
    if not ENABLE_OPENSEARCH_INDEXING_FOR_ONYX:
        task_logger.warning(
            "OpenSearch migration is not enabled, skipping chunk migration task."
        )
        return None

    task_logger.info("Starting chunk-level migration from Vespa to OpenSearch.")
    task_start_time = time.monotonic()

    # 1.b. Only one instance per tenant of this task may run concurrently at
    # once. If we fail to acquire a lock, we assume it is because another task
    # has one and we exit.
    r = get_redis_client()
    lock: RedisLock = r.lock(
        name=OnyxRedisLocks.OPENSEARCH_MIGRATION_BEAT_LOCK,
@@ -142,11 +136,10 @@ def migrate_chunks_from_vespa_to_opensearch_task(
        f"Token: {lock.local.token}"
    )

    # 2. Prepare to migrate.
    total_chunks_migrated_this_task = 0
    total_chunks_errored_this_task = 0
    try:
        # 2.a. Double-check that tenant info is correct.
        # Double check that tenant info is correct.
        if tenant_id != get_current_tenant_id():
            err_str = (
                f"Tenant ID mismatch in the OpenSearch migration task: "
@@ -155,62 +148,16 @@ def migrate_chunks_from_vespa_to_opensearch_task(
            task_logger.error(err_str)
            return False

        # Do as much as we can with a DB session in one spot to not hold a
        # session during a migration batch.
        with get_session_with_current_tenant() as db_session:
            # 2.b. Immediately check to see if this tenant is done, to save
            # having to do any other work. This function does not require a
            # migration record to necessarily exist.
            if is_migration_completed(db_session):
                return True

            # 2.c. Try to insert the OpenSearchTenantMigrationRecord table if it
            # does not exist.
        with (
            get_session_with_current_tenant() as db_session,
            get_vespa_http_client(
                timeout=VESPA_MIGRATION_REQUEST_TIMEOUT_S
            ) as vespa_client,
        ):
            try_insert_opensearch_tenant_migration_record_with_commit(db_session)

            # 2.d. Get search settings.
            search_settings = get_current_search_settings(db_session)
            indexing_setting = IndexingSetting.from_db_model(search_settings)

            # 2.e. Build sanitized to original doc ID mapping to check for
            # conflicts in the event we sanitize a doc ID to an
            # already-existing doc ID.
            # We reconstruct this mapping for every task invocation because
            # a document may have been added in the time between two tasks.
            sanitized_doc_start_time = time.monotonic()
            sanitized_to_original_doc_id_mapping = (
                build_sanitized_to_original_doc_id_mapping(db_session)
            )
            task_logger.debug(
                f"Built sanitized_to_original_doc_id_mapping with {len(sanitized_to_original_doc_id_mapping)} entries "
                f"in {time.monotonic() - sanitized_doc_start_time:.3f} seconds."
            )

            # 2.f. Get the current migration state.
            continuation_token_map, total_chunks_migrated = get_vespa_visit_state(
                db_session
            )
            # 2.f.1. Double-check that the migration state does not imply
            # completion. Really we should never have to enter this block as we
            # would expect is_migration_completed to return True, but in the
            # strange event that the migration is complete but the migration
            # completed time was never stamped, we do so here.
            if is_continuation_token_done_for_all_slices(continuation_token_map):
                task_logger.info(
                    f"OpenSearch migration COMPLETED for tenant {tenant_id}. Total chunks migrated: {total_chunks_migrated}."
                )
                mark_migration_completed_time_if_not_set_with_commit(db_session)
                return True
            task_logger.debug(
                f"Read the tenant migration record. Total chunks migrated: {total_chunks_migrated}. "
                f"Continuation token map: {continuation_token_map}"
            )

        with get_vespa_http_client(
            timeout=VESPA_MIGRATION_REQUEST_TIMEOUT_S
        ) as vespa_client:
            # 2.g. Create the OpenSearch and Vespa document indexes.
            tenant_state = TenantState(tenant_id=tenant_id, multitenant=MULTI_TENANT)
            indexing_setting = IndexingSetting.from_db_model(search_settings)
            opensearch_document_index = OpenSearchDocumentIndex(
                tenant_state=tenant_state,
                index_name=search_settings.index_name,
@@ -224,14 +171,22 @@ def migrate_chunks_from_vespa_to_opensearch_task(
                httpx_client=vespa_client,
            )

            # 2.h. Get the approximate chunk count in Vespa as of this time to
            # update the migration record.
            sanitized_doc_start_time = time.monotonic()
            # We reconstruct this mapping for every task invocation because a
            # document may have been added in the time between two tasks.
            sanitized_to_original_doc_id_mapping = (
                build_sanitized_to_original_doc_id_mapping(db_session)
            )
            task_logger.debug(
                f"Built sanitized_to_original_doc_id_mapping with {len(sanitized_to_original_doc_id_mapping)} entries "
                f"in {time.monotonic() - sanitized_doc_start_time:.3f} seconds."
            )

            approx_chunk_count_in_vespa: int | None = None
            get_chunk_count_start_time = time.monotonic()
            try:
                approx_chunk_count_in_vespa = vespa_document_index.get_chunk_count()
            except Exception:
                # This failure should not be blocking.
                task_logger.exception(
                    "Error getting approximate chunk count in Vespa. Moving on..."
                )
@@ -240,12 +195,25 @@ def migrate_chunks_from_vespa_to_opensearch_task(
                f"approximate chunk count in Vespa. Got {approx_chunk_count_in_vespa}."
            )

            # 3. Do the actual migration in batches until we run out of time.
            while (
                time.monotonic() - task_start_time < MIGRATION_TASK_SOFT_TIME_LIMIT_S
                and lock.owned()
            ):
                # 3.a. Get the next batch of raw chunks from Vespa.
                (
                    continuation_token_map,
                    total_chunks_migrated,
                ) = get_vespa_visit_state(db_session)
                if is_continuation_token_done_for_all_slices(continuation_token_map):
                    task_logger.info(
                        f"OpenSearch migration COMPLETED for tenant {tenant_id}. Total chunks migrated: {total_chunks_migrated}."
                    )
                    mark_migration_completed_time_if_not_set_with_commit(db_session)
                    break
                task_logger.debug(
                    f"Read the tenant migration record. Total chunks migrated: {total_chunks_migrated}. "
                    f"Continuation token map: {continuation_token_map}"
                )

                get_vespa_chunks_start_time = time.monotonic()
                raw_vespa_chunks, next_continuation_token_map = (
                    vespa_document_index.get_all_raw_document_chunks_paginated(
@@ -258,7 +226,6 @@ def migrate_chunks_from_vespa_to_opensearch_task(
                    f"seconds. Next continuation token map: {next_continuation_token_map}"
                )

                # 3.b. Transform the raw chunks to OpenSearch chunks in memory.
                opensearch_document_chunks, errored_chunks = (
                    transform_vespa_chunks_to_opensearch_chunks(
                        raw_vespa_chunks,
@@ -273,7 +240,6 @@ def migrate_chunks_from_vespa_to_opensearch_task(
                    "errored."
                )

                # 3.c. Index the OpenSearch chunks into OpenSearch.
                index_opensearch_chunks_start_time = time.monotonic()
                opensearch_document_index.index_raw_chunks(
                    chunks=opensearch_document_chunks
@@ -285,38 +251,12 @@ def migrate_chunks_from_vespa_to_opensearch_task(

                total_chunks_migrated_this_task += len(opensearch_document_chunks)
                total_chunks_errored_this_task += len(errored_chunks)

                # Do as much as we can with a DB session in one spot to not hold a
                # session during a migration batch.
                with get_session_with_current_tenant() as db_session:
                    # 3.d. Update the migration state.
                    update_vespa_visit_progress_with_commit(
                        db_session,
                        continuation_token_map=next_continuation_token_map,
                        chunks_processed=len(opensearch_document_chunks),
                        chunks_errored=len(errored_chunks),
                        approx_chunk_count_in_vespa=approx_chunk_count_in_vespa,
                    )

                    # 3.e. Get the current migration state. Even though we
                    # technically have it in-memory since we just wrote it, we
                    # want to reference the DB as the source of truth at all
                    # times.
                    continuation_token_map, total_chunks_migrated = (
                        get_vespa_visit_state(db_session)
                    )
                    # 3.e.1. Check if the migration is done.
                    if is_continuation_token_done_for_all_slices(
                        continuation_token_map
                    ):
                        task_logger.info(
                            f"OpenSearch migration COMPLETED for tenant {tenant_id}. Total chunks migrated: {total_chunks_migrated}."
                        )
                        mark_migration_completed_time_if_not_set_with_commit(db_session)
                        return True
                    task_logger.debug(
                        f"Read the tenant migration record. Total chunks migrated: {total_chunks_migrated}. "
                        f"Continuation token map: {continuation_token_map}"
                update_vespa_visit_progress_with_commit(
                    db_session,
                    continuation_token_map=next_continuation_token_map,
                    chunks_processed=len(opensearch_document_chunks),
                    chunks_errored=len(errored_chunks),
                    approx_chunk_count_in_vespa=approx_chunk_count_in_vespa,
                )
    except Exception:
        traceback.print_exc()
138 backend/onyx/background/celery/tasks/periodic/tasks.py Normal file
@@ -0,0 +1,138 @@
#####
# Periodic Tasks
#####
import json
from typing import Any

from celery import shared_task
from celery.contrib.abortable import AbortableTask  # type: ignore
from celery.exceptions import TaskRevokedError
from sqlalchemy import inspect
from sqlalchemy import text
from sqlalchemy.orm import Session

from onyx.background.celery.apps.app_base import task_logger
from onyx.configs.app_configs import JOB_TIMEOUT
from onyx.configs.constants import OnyxCeleryTask
from onyx.configs.constants import PostgresAdvisoryLocks
from onyx.db.engine.sql_engine import get_session_with_current_tenant


@shared_task(
    name=OnyxCeleryTask.KOMBU_MESSAGE_CLEANUP_TASK,
    soft_time_limit=JOB_TIMEOUT,
    bind=True,
    base=AbortableTask,
)
def kombu_message_cleanup_task(self: Any, tenant_id: str) -> int:  # noqa: ARG001
    """Runs periodically to clean up the kombu_message table"""

    # we will select messages older than this amount to clean up
    KOMBU_MESSAGE_CLEANUP_AGE = 7  # days
    KOMBU_MESSAGE_CLEANUP_PAGE_LIMIT = 1000

    ctx = {}
    ctx["last_processed_id"] = 0
    ctx["deleted"] = 0
    ctx["cleanup_age"] = KOMBU_MESSAGE_CLEANUP_AGE
    ctx["page_limit"] = KOMBU_MESSAGE_CLEANUP_PAGE_LIMIT
    with get_session_with_current_tenant() as db_session:
        # Exit the task if we can't take the advisory lock
        result = db_session.execute(
            text("SELECT pg_try_advisory_lock(:id)"),
            {"id": PostgresAdvisoryLocks.KOMBU_MESSAGE_CLEANUP_LOCK_ID.value},
        ).scalar()
        if not result:
            return 0

        while True:
            if self.is_aborted():
                raise TaskRevokedError("kombu_message_cleanup_task was aborted.")

            b = kombu_message_cleanup_task_helper(ctx, db_session)
            if not b:
                break

            db_session.commit()

    if ctx["deleted"] > 0:
        task_logger.info(
            f"Deleted {ctx['deleted']} orphaned messages from kombu_message."
        )

    return ctx["deleted"]


def kombu_message_cleanup_task_helper(ctx: dict, db_session: Session) -> bool:
    """
    Helper function to clean up old messages from the `kombu_message` table that are no longer relevant.

    This function retrieves messages from the `kombu_message` table that are no longer visible and
    older than a specified interval. It checks if the corresponding task_id exists in the
    `celery_taskmeta` table. If the task_id does not exist, the message is deleted.

    Args:
        ctx (dict): A context dictionary containing configuration parameters such as:
            - 'cleanup_age' (int): The age in days after which messages are considered old.
            - 'page_limit' (int): The maximum number of messages to process in one batch.
            - 'last_processed_id' (int): The ID of the last processed message to handle pagination.
            - 'deleted' (int): A counter to track the number of deleted messages.
        db_session (Session): The SQLAlchemy database session for executing queries.

    Returns:
        bool: Returns True if there are more rows to process, False if not.
    """

    inspector = inspect(db_session.bind)
    if not inspector:
        return False

    # With the move to redis as celery's broker and backend, kombu tables may not even exist.
    # We can fail silently.
    if not inspector.has_table("kombu_message"):
        return False

    query = text(
        """
        SELECT id, timestamp, payload
        FROM kombu_message WHERE visible = 'false'
        AND timestamp < CURRENT_TIMESTAMP - INTERVAL :interval_days
        AND id > :last_processed_id
        ORDER BY id
        LIMIT :page_limit
        """
    )
    kombu_messages = db_session.execute(
        query,
        {
            "interval_days": f"{ctx['cleanup_age']} days",
            "page_limit": ctx["page_limit"],
            "last_processed_id": ctx["last_processed_id"],
        },
    ).fetchall()

    if len(kombu_messages) == 0:
        return False

    for msg in kombu_messages:
        payload = json.loads(msg[2])
        task_id = payload["headers"]["id"]

        # Check if task_id exists in celery_taskmeta
        task_exists = db_session.execute(
            text("SELECT 1 FROM celery_taskmeta WHERE task_id = :task_id"),
            {"task_id": task_id},
        ).fetchone()

        # If task_id does not exist, delete the message
        if not task_exists:
            result = db_session.execute(
                text("DELETE FROM kombu_message WHERE id = :message_id"),
                {"message_id": msg[0]},
            )
            if result.rowcount > 0:  # type: ignore
                ctx["deleted"] += 1

        ctx["last_processed_id"] = msg[0]

    return True
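The `kombu_message_cleanup_task` above combines a Postgres advisory lock with id-ordered pagination: each helper call processes one page of rows, advances `last_processed_id`, and signals completion by returning False on an empty page. A minimal in-memory sketch of that pagination contract (the `cleanup_page` helper and its rows are hypothetical stand-ins, not Onyx code; no database involved):

```python
def cleanup_page(ctx: dict, rows: list[tuple[int, bool]]) -> bool:
    """One helper call: rows are (id, task_exists) pairs standing in for kombu messages."""
    # Page through rows ordered by id, resuming after the last id we processed.
    page = [r for r in rows if r[0] > ctx["last_processed_id"]][: ctx["page_limit"]]
    if not page:
        return False  # empty page: no more rows to process
    for msg_id, task_exists in page:
        if not task_exists:
            ctx["deleted"] += 1  # in the real task this is a DELETE statement
        ctx["last_processed_id"] = msg_id
    return True

rows = [(1, False), (2, True), (3, False), (4, False), (5, True)]
ctx = {"last_processed_id": 0, "deleted": 0, "page_limit": 2}
while cleanup_page(ctx, rows):
    pass  # each iteration is one page; the real task commits between pages
print(ctx["deleted"])  # → 3
```

Resuming from `last_processed_id` rather than re-scanning from zero is what keeps each batch cheap even when the table is large.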
@@ -217,7 +217,7 @@ def check_for_pruning(self: Task, *, tenant_id: str) -> bool | None:
    try:
        # the entire task needs to run frequently in order to finalize pruning

        # but pruning only kicks off once per hour
        # but pruning only kicks off once per min
        if not r.exists(OnyxRedisSignals.BLOCK_PRUNING):
            task_logger.info("Checking for pruning due")
@@ -996,7 +996,6 @@ def _run_models(

    def _run_model(model_idx: int) -> None:
        """Run one LLM loop inside a worker thread, writing packets to ``merged_queue``."""

        model_emitter = Emitter(
            model_idx=model_idx,
            merged_queue=merged_queue,
@@ -1103,33 +1102,33 @@ def _run_models(
        finally:
            merged_queue.put((model_idx, _MODEL_DONE))

    def _save_errored_message(model_idx: int, context: str) -> None:
        """Save an error message to a reserved ChatMessage that failed during execution."""
    def _delete_orphaned_message(model_idx: int, context: str) -> None:
        """Delete a reserved ChatMessage that was never populated due to a model error."""
        try:
            msg = db_session.get(ChatMessage, setup.reserved_messages[model_idx].id)
            if msg is not None:
                error_text = f"Error from {setup.model_display_names[model_idx]}: model encountered an error during generation."
                msg.message = error_text
                msg.error = error_text
            orphaned = db_session.get(
                ChatMessage, setup.reserved_messages[model_idx].id
            )
            if orphaned is not None:
                db_session.delete(orphaned)
                db_session.commit()
        except Exception:
            logger.exception(
                "%s error save failed for model %d (%s)",
                "%s orphan cleanup failed for model %d (%s)",
                context,
                model_idx,
                setup.model_display_names[model_idx],
            )

    # Each worker thread needs its own Context copy — a single Context object
    # cannot be entered concurrently by multiple threads (RuntimeError).
    # Copy contextvars before submitting futures — ThreadPoolExecutor does NOT
    # auto-propagate contextvars in Python 3.11; threads would inherit a blank context.
    worker_context = contextvars.copy_context()
    executor = ThreadPoolExecutor(
        max_workers=n_models, thread_name_prefix="multi-model"
    )
    completion_persisted: bool = False
    try:
        for i in range(n_models):
            ctx = contextvars.copy_context()
            executor.submit(ctx.run, _run_model, i)
            executor.submit(worker_context.run, _run_model, i)

        # ── Main thread: merge and yield packets ────────────────────────────
        models_remaining = n_models
@@ -1146,7 +1145,7 @@ def _run_models(
        # save "stopped by user" for a model that actually threw an exception.
        for i in range(n_models):
            if model_errored[i]:
                _save_errored_message(i, "stop-button")
                _delete_orphaned_message(i, "stop-button")
                continue
            try:
                succeeded = model_succeeded[i]
@@ -1212,7 +1211,7 @@ def _run_models(
        for i in range(n_models):
            if not model_succeeded[i]:
                # Model errored — delete its orphaned reserved message.
                _save_errored_message(i, "normal")
                _delete_orphaned_message(i, "normal")
                continue
            try:
                llm_loop_completion_handle(
@@ -1265,7 +1264,7 @@ def _run_models(
                    setup.model_display_names[i],
                )
            elif model_errored[i]:
                _save_errored_message(i, "disconnect")
                _delete_orphaned_message(i, "disconnect")
            # 4. Drain buffered packets from memory — no consumer is running.
            while not merged_queue.empty():
                try:
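Both comments in the hunk above concern the same pitfall: `ThreadPoolExecutor` worker threads do not inherit the caller's contextvars, so each submission is routed through a copied `Context` via `Context.run`. A small self-contained illustration of the difference (the `tenant` variable and `read_tenant` helper are illustrative, not from the Onyx codebase):

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

tenant: contextvars.ContextVar[str] = contextvars.ContextVar("tenant", default="unset")
tenant.set("acme")  # set in the submitting (main) thread's context

def read_tenant() -> str:
    return tenant.get()

with ThreadPoolExecutor(max_workers=1) as pool:
    # A bare submit runs in the worker thread's own (empty) context,
    # so the ContextVar falls back to its default.
    bare = pool.submit(read_tenant).result()
    # Running through a copied Context carries the caller's value across.
    ctx = contextvars.copy_context()
    propagated = pool.submit(ctx.run, read_tenant).result()

print(bare, propagated)  # → unset acme
```

Note also the constraint the first comment mentions: a single `Context` object cannot be entered by two threads at once, which is why per-submission copies are one valid design.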
@@ -379,14 +379,6 @@ POSTGRES_HOST = os.environ.get("POSTGRES_HOST") or "127.0.0.1"
POSTGRES_PORT = os.environ.get("POSTGRES_PORT") or "5432"
POSTGRES_DB = os.environ.get("POSTGRES_DB") or "postgres"
AWS_REGION_NAME = os.environ.get("AWS_REGION_NAME") or "us-east-2"
# Comma-separated replica / multi-host list. If unset, defaults to POSTGRES_HOST
# only.
_POSTGRES_HOSTS_STR = os.environ.get("POSTGRES_HOSTS", "").strip()
POSTGRES_HOSTS: list[str] = (
    [h.strip() for h in _POSTGRES_HOSTS_STR.split(",") if h.strip()]
    if _POSTGRES_HOSTS_STR
    else [POSTGRES_HOST]
)

POSTGRES_API_SERVER_POOL_SIZE = int(
    os.environ.get("POSTGRES_API_SERVER_POOL_SIZE") or 40
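The `POSTGRES_HOSTS` block in this hunk parses a comma-separated host list with a single-host fallback. A sketch of that parsing as a standalone helper (`parse_postgres_hosts` is a hypothetical name, shown with an env dict in place of `os.environ`):

```python
def parse_postgres_hosts(env: dict[str, str], fallback_host: str) -> list[str]:
    # Mirrors the config block above: split POSTGRES_HOSTS on commas, strip
    # whitespace, drop empty entries, and fall back to the single primary host.
    raw = env.get("POSTGRES_HOSTS", "").strip()
    return [h.strip() for h in raw.split(",") if h.strip()] if raw else [fallback_host]

print(parse_postgres_hosts({"POSTGRES_HOSTS": " db1 , db2,, "}, "127.0.0.1"))  # → ['db1', 'db2']
print(parse_postgres_hosts({}, "127.0.0.1"))  # → ['127.0.0.1']
```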
@@ -12,11 +12,6 @@ SLACK_USER_TOKEN_PREFIX = "xoxp-"
SLACK_BOT_TOKEN_PREFIX = "xoxb-"
ONYX_EMAILABLE_LOGO_MAX_DIM = 512

# The mask_string() function in encryption.py uses "•" (U+2022 BULLET) to mask secrets.
MASK_CREDENTIAL_CHAR = "\u2022"
# Pattern produced by mask_string for strings >= 14 chars: "abcd...wxyz" (exactly 11 chars)
MASK_CREDENTIAL_LONG_RE = re.compile(r"^.{4}\.{3}.{4}$")

SOURCE_TYPE = "source_type"
# stored in the `metadata` of a chunk. Used to signify that this chunk should
# not be used for QA. For example, Google Drive file types which can't be parsed
@@ -396,6 +391,10 @@ class MilestoneRecordType(str, Enum):
    REQUESTED_CONNECTOR = "requested_connector"


class PostgresAdvisoryLocks(Enum):
    KOMBU_MESSAGE_CLEANUP_LOCK_ID = auto()


class OnyxCeleryQueues:
    # "celery" is the default queue defined by celery and also the queue
    # we are running in the primary worker to run system tasks
@@ -578,6 +577,7 @@ class OnyxCeleryTask:
    MONITOR_PROCESS_MEMORY = "monitor_process_memory"
    CELERY_BEAT_HEARTBEAT = "celery_beat_heartbeat"

    KOMBU_MESSAGE_CLEANUP_TASK = "kombu_message_cleanup_task"
    CONNECTOR_PERMISSION_SYNC_GENERATOR_TASK = (
        "connector_permission_sync_generator_task"
    )
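The two mask formats described in the constants above (a run of U+2022 bullets for short secrets, first four plus `...` plus last four characters for long ones) can be detected with a single check. A sketch, with `looks_masked` as a hypothetical helper name mirroring those constants:

```python
import re

# Short secrets (< 14 chars) are masked as a run of U+2022 bullets;
# long secrets (>= 14 chars) become first4 + "..." + last4 (exactly 11 chars).
MASK_CREDENTIAL_CHAR = "\u2022"
MASK_CREDENTIAL_LONG_RE = re.compile(r"^.{4}\.{3}.{4}$")

def looks_masked(value: str) -> bool:
    """True if a string is a mask placeholder rather than a real secret."""
    return MASK_CREDENTIAL_CHAR in value or bool(MASK_CREDENTIAL_LONG_RE.match(value))

print(looks_masked("\u2022" * 12))      # → True  (short-format mask)
print(looks_masked("abcd...wxyz"))      # → True  (4 + 3 dots + 4 = 11 chars)
print(looks_masked("xoxb-real-token"))  # → False
```

Rejecting values that match either format prevents a UI round-trip from silently persisting the placeholder in place of the real credential.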
@@ -44,7 +44,7 @@ _NOTION_CALL_TIMEOUT = 30  # 30 seconds
_MAX_PAGES = 1000


# TODO: Pages need to have their metadata ingested
# TODO: Tables need to be ingested, Pages need to have their metadata ingested


class NotionPage(BaseModel):
@@ -452,19 +452,6 @@ class NotionConnector(LoadConnector, PollConnector):
        sub_inner_dict: dict[str, Any] | list[Any] | str = inner_dict
        while isinstance(sub_inner_dict, dict) and "type" in sub_inner_dict:
            type_name = sub_inner_dict["type"]

            # Notion user objects (people properties, created_by, etc.) have
            # "name" at the same level as "type": "person"/"bot". If we drill
            # into the person/bot sub-dict we lose the name. Capture it here
            # before descending, but skip "title"-type properties where "name"
            # is not the display value we want.
            if (
                "name" in sub_inner_dict
                and isinstance(sub_inner_dict["name"], str)
                and type_name not in ("title",)
            ):
                return sub_inner_dict["name"]

            sub_inner_dict = sub_inner_dict[type_name]

        # If the innermost layer is None, the value is not set
@@ -676,19 +663,6 @@ class NotionConnector(LoadConnector, PollConnector):
                text = rich_text["text"]["content"]
                cur_result_text_arr.append(text)

            # table_row blocks store content in "cells" (list of lists
            # of rich text objects) rather than "rich_text"
            if "cells" in result_obj:
                row_cells: list[str] = []
                for cell in result_obj["cells"]:
                    cell_texts = [
                        rt.get("plain_text", "")
                        for rt in cell
                        if isinstance(rt, dict)
                    ]
                    row_cells.append(" ".join(cell_texts))
                cur_result_text_arr.append("\t".join(row_cells))

            if result["has_children"]:
                if result_type == "child_page":
                    # Child pages will not be included at this top level, it will be a separate document.
@@ -190,23 +190,16 @@ def delete_messages_and_files_from_chat_session(
    chat_session_id: UUID, db_session: Session
) -> None:
    # Select messages older than cutoff_time with files
    messages_with_files = (
        db_session.execute(
            select(ChatMessage.id, ChatMessage.files).where(
                ChatMessage.chat_session_id == chat_session_id,
            )
    messages_with_files = db_session.execute(
        select(ChatMessage.id, ChatMessage.files).where(
            ChatMessage.chat_session_id == chat_session_id,
        )
        .tuples()
        .all()
    )
    ).fetchall()

    file_store = get_default_file_store()
    for _, files in messages_with_files:
        file_store = get_default_file_store()
        for file_info in files or []:
            if file_info.get("user_file_id"):
                # user files are managed by the user file lifecycle
                continue
            file_store.delete_file(file_id=file_info["id"], error_on_missing=False)
            file_store.delete_file(file_id=file_info.get("id"))

    # Delete ChatMessage records - CASCADE constraints will automatically handle:
    # - ChatMessage__StandardAnswer relationship records
@@ -8,8 +8,6 @@ from sqlalchemy.orm import selectinload
from sqlalchemy.orm import Session

from onyx.configs.constants import FederatedConnectorSource
from onyx.configs.constants import MASK_CREDENTIAL_CHAR
from onyx.configs.constants import MASK_CREDENTIAL_LONG_RE
from onyx.db.engine.sql_engine import get_session_with_current_tenant
from onyx.db.models import DocumentSet
from onyx.db.models import FederatedConnector
@@ -47,23 +45,6 @@ def fetch_all_federated_connectors_parallel() -> list[FederatedConnector]:
        return fetch_all_federated_connectors(db_session)


def _reject_masked_credentials(credentials: dict[str, Any]) -> None:
    """Raise if any credential string value contains mask placeholder characters.

    mask_string() has two output formats:
    - Short strings (< 14 chars): "••••••••••••" (U+2022 BULLET)
    - Long strings (>= 14 chars): "abcd...wxyz" (first4 + "..." + last4)
    Both must be rejected.
    """
    for key, val in credentials.items():
        if isinstance(val, str) and (
            MASK_CREDENTIAL_CHAR in val or MASK_CREDENTIAL_LONG_RE.match(val)
        ):
            raise ValueError(
                f"Credential field '{key}' contains masked placeholder characters. Please provide the actual credential value."
            )


def validate_federated_connector_credentials(
    source: FederatedConnectorSource,
    credentials: dict[str, Any],
@@ -85,8 +66,6 @@ def create_federated_connector(
    config: dict[str, Any] | None = None,
) -> FederatedConnector:
    """Create a new federated connector with credential and config validation."""
    _reject_masked_credentials(credentials)

    # Validate credentials before creating
    if not validate_federated_connector_credentials(source, credentials):
        raise ValueError(
@@ -298,8 +277,6 @@ def update_federated_connector(
    )

    if credentials is not None:
        _reject_masked_credentials(credentials)

        # Validate credentials before updating
        if not validate_federated_connector_credentials(
            federated_connector.source, credentials
@@ -236,15 +236,14 @@ def upsert_llm_provider(
    db_session.add(existing_llm_provider)

    # Filter out empty strings and None values from custom_config to allow
    # providers like Bedrock to fall back to IAM roles when credentials are not provided.
    # NOTE: An empty dict ({}) is preserved as-is — it signals that the provider was
    # created via the custom modal and must be reopened with CustomModal, not a
    # provider-specific modal. Only None means "no custom config at all".
    # providers like Bedrock to fall back to IAM roles when credentials are not provided
    custom_config = llm_provider_upsert_request.custom_config
    if custom_config:
        custom_config = {
            k: v for k, v in custom_config.items() if v is not None and v.strip() != ""
        }
        # Set to None if the dict is empty after filtering
        custom_config = custom_config or None

    api_base = llm_provider_upsert_request.api_base or None
    existing_llm_provider.provider = llm_provider_upsert_request.provider
@@ -304,7 +303,16 @@ def upsert_llm_provider(
    ).delete(synchronize_session="fetch")
    db_session.flush()

    # Import here to avoid circular imports
    from onyx.llm.utils import get_max_input_tokens

    for model_config in llm_provider_upsert_request.model_configurations:
        max_input_tokens = model_config.max_input_tokens
        if max_input_tokens is None:
            max_input_tokens = get_max_input_tokens(
                model_name=model_config.name,
                model_provider=llm_provider_upsert_request.provider,
            )

        supported_flows = [LLMModelFlowType.CHAT]
        if model_config.supports_image_input:
@@ -317,7 +325,7 @@ def upsert_llm_provider(
            model_configuration_id=existing.id,
            supported_flows=supported_flows,
            is_visible=model_config.is_visible,
            max_input_tokens=model_config.max_input_tokens,
            max_input_tokens=max_input_tokens,
            display_name=model_config.display_name,
        )
    else:
@@ -327,7 +335,7 @@ def upsert_llm_provider(
            model_name=model_config.name,
            supported_flows=supported_flows,
            is_visible=model_config.is_visible,
            max_input_tokens=model_config.max_input_tokens,
            max_input_tokens=max_input_tokens,
            display_name=model_config.display_name,
        )
@@ -324,15 +324,6 @@ def mark_migration_completed_time_if_not_set_with_commit(
    db_session.commit()


def is_migration_completed(db_session: Session) -> bool:
    """Returns True if the migration is completed.

    Can be run even if the migration record does not exist.
    """
    record = db_session.query(OpenSearchTenantMigrationRecord).first()
    return record is not None and record.migration_completed_at is not None


def build_sanitized_to_original_doc_id_mapping(
    db_session: Session,
) -> dict[str, str]:
@@ -1,4 +1,3 @@
import hashlib
from datetime import datetime
from datetime import timezone
from typing import Any
@@ -21,13 +20,9 @@ from onyx.document_index.opensearch.constants import DEFAULT_MAX_CHUNK_SIZE
from onyx.document_index.opensearch.constants import EF_CONSTRUCTION
from onyx.document_index.opensearch.constants import EF_SEARCH
from onyx.document_index.opensearch.constants import M
from onyx.document_index.opensearch.string_filtering import DocumentIDTooLongError
from onyx.document_index.opensearch.string_filtering import (
    filter_and_validate_document_id,
)
from onyx.document_index.opensearch.string_filtering import (
    MAX_DOCUMENT_ID_ENCODED_LENGTH,
)
from onyx.utils.tenant import get_tenant_id_short_string
from shared_configs.configs import MULTI_TENANT
from shared_configs.contextvars import get_current_tenant_id
@@ -80,50 +75,17 @@ def get_opensearch_doc_chunk_id(

    This will be the string used to identify the chunk in OpenSearch. Any direct
    chunk queries should use this function.

    If the document ID is too long, a hash of the ID is used instead.
    """
    opensearch_doc_chunk_id_suffix: str = f"__{max_chunk_size}__{chunk_index}"
    encoded_suffix_length: int = len(opensearch_doc_chunk_id_suffix.encode("utf-8"))
    max_encoded_permissible_doc_id_length: int = (
        MAX_DOCUMENT_ID_ENCODED_LENGTH - encoded_suffix_length
    sanitized_document_id = filter_and_validate_document_id(document_id)
    opensearch_doc_chunk_id = (
        f"{sanitized_document_id}__{max_chunk_size}__{chunk_index}"
    )
    opensearch_doc_chunk_id_tenant_prefix: str = ""
    if tenant_state.multitenant:
        short_tenant_id: str = get_tenant_id_short_string(tenant_state.tenant_id)
        # Use tenant ID because in multitenant mode each tenant has its own
        # Documents table, so there is a very small chance that doc IDs are not
        # actually unique across all tenants.
        opensearch_doc_chunk_id_tenant_prefix = f"{short_tenant_id}__"
        encoded_prefix_length: int = len(
            opensearch_doc_chunk_id_tenant_prefix.encode("utf-8")
        )
        max_encoded_permissible_doc_id_length -= encoded_prefix_length

    try:
        sanitized_document_id: str = filter_and_validate_document_id(
            document_id, max_encoded_length=max_encoded_permissible_doc_id_length
        )
    except DocumentIDTooLongError:
        # If the document ID is too long, use a hash instead.
        # We use blake2b because it is faster and equally secure as SHA256, and
        # accepts digest_size which controls the number of bytes returned in the
        # hash.
        # digest_size is the size of the returned hash in bytes. Since we're
        # decoding the hash bytes as a hex string, the digest_size should be
        # half the max target size of the hash string.
        # Subtract 1 because filter_and_validate_document_id compares on >= on
        # max_encoded_length.
        # 64 is the max digest_size blake2b returns.
        digest_size: int = min((max_encoded_permissible_doc_id_length - 1) // 2, 64)
        sanitized_document_id = hashlib.blake2b(
            document_id.encode("utf-8"), digest_size=digest_size
        ).hexdigest()

    opensearch_doc_chunk_id: str = (
        f"{opensearch_doc_chunk_id_tenant_prefix}{sanitized_document_id}{opensearch_doc_chunk_id_suffix}"
    )

        short_tenant_id = get_tenant_id_short_string(tenant_state.tenant_id)
        opensearch_doc_chunk_id = f"{short_tenant_id}__{opensearch_doc_chunk_id}"
    # Do one more validation to ensure we haven't exceeded the max length.
    opensearch_doc_chunk_id = filter_and_validate_document_id(opensearch_doc_chunk_id)
    return opensearch_doc_chunk_id
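The hash fallback described in the comments above (used when a document ID exceeds the length budget) can be sketched in isolation. The constant and helper name here are illustrative stand-ins, not the Onyx implementation:

```python
import hashlib

# Illustrative budget, mirroring MAX_DOCUMENT_ID_ENCODED_LENGTH in the diff
MAX_ENCODED_LENGTH = 512


def hash_overlong_id(document_id: str, max_encoded_length: int = MAX_ENCODED_LENGTH) -> str:
    # blake2b lets us pick the digest size; hexdigest() doubles the byte
    # count, so target half the budget. Subtract 1 because the length check
    # uses >=, and cap at blake2b's 64-byte maximum digest.
    digest_size = min((max_encoded_length - 1) // 2, 64)
    return hashlib.blake2b(
        document_id.encode("utf-8"), digest_size=digest_size
    ).hexdigest()
```

The result is deterministic, so the same overlong document ID always maps to the same OpenSearch-safe identifier.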
@@ -1,15 +1,7 @@
import re

MAX_DOCUMENT_ID_ENCODED_LENGTH: int = 512


class DocumentIDTooLongError(ValueError):
    """Raised when a document ID is too long for OpenSearch after filtering."""


def filter_and_validate_document_id(
    document_id: str, max_encoded_length: int = MAX_DOCUMENT_ID_ENCODED_LENGTH
) -> str:
def filter_and_validate_document_id(document_id: str) -> str:
    """
    Filters and validates a document ID such that it can be used as an ID in
    OpenSearch.
@@ -27,13 +19,9 @@ def filter_and_validate_document_id(

    Args:
        document_id: The document ID to filter and validate.
        max_encoded_length: The maximum length of the document ID after
            filtering in bytes. Compared with >= for extra resilience, so
            encoded values of this length will fail.

    Raises:
        DocumentIDTooLongError: If the document ID is too long after filtering.
        ValueError: If the document ID is empty after filtering.
        ValueError: If the document ID is empty or too long after filtering.

    Returns:
        str: The filtered document ID.
@@ -41,8 +29,6 @@ def filter_and_validate_document_id(
    filtered_document_id = re.sub(r"[^A-Za-z0-9_.\-~]", "", document_id)
    if not filtered_document_id:
        raise ValueError(f"Document ID {document_id} is empty after filtering.")
    if len(filtered_document_id.encode("utf-8")) >= max_encoded_length:
        raise DocumentIDTooLongError(
            f"Document ID {document_id} is too long after filtering."
        )
    if len(filtered_document_id.encode("utf-8")) >= 512:
        raise ValueError(f"Document ID {document_id} is too long after filtering.")
    return filtered_document_id
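A minimal standalone version of the character filtering shown above, using the same regex from the diff; the wrapper name is illustrative:

```python
import re


def sanitize_doc_id(document_id: str) -> str:
    # Keep only URL-safe characters: letters, digits, and _ . - ~
    filtered = re.sub(r"[^A-Za-z0-9_.\-~]", "", document_id)
    if not filtered:
        raise ValueError("document ID is empty after filtering")
    return filtered
```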
@@ -52,21 +52,9 @@ KNOWN_OPENPYXL_BUGS = [

def get_markitdown_converter() -> "MarkItDown":
    global _MARKITDOWN_CONVERTER
    from markitdown import MarkItDown

    if _MARKITDOWN_CONVERTER is None:
        from markitdown import MarkItDown

        # Patch this function to effectively no-op because we were seeing this
        # module take an inordinate amount of time to convert charts to markdown,
        # making some powerpoint files with many or complicated charts nearly
        # unindexable.
        from markitdown.converters._pptx_converter import PptxConverter

        setattr(
            PptxConverter,
            "_convert_chart_to_markdown",
            lambda self, chart: "\n\n[chart omitted]\n\n",  # noqa: ARG005
        )
        _MARKITDOWN_CONVERTER = MarkItDown(enable_plugins=False)
    return _MARKITDOWN_CONVERTER
@@ -217,26 +205,18 @@ def read_pdf_file(
    try:
        pdf_reader = PdfReader(file)

        if pdf_reader.is_encrypted:
            # Try the explicit password first, then fall back to an empty
            # string. Owner-password-only PDFs (permission restrictions but
            # no open password) decrypt successfully with "".
            # See https://github.com/onyx-dot-app/onyx/issues/9754
            passwords = [p for p in [pdf_pass, ""] if p is not None]
        if pdf_reader.is_encrypted and pdf_pass is not None:
            decrypt_success = False
            for pw in passwords:
                try:
                    if pdf_reader.decrypt(pw) != 0:
                        decrypt_success = True
                        break
                except Exception:
                    pass
            try:
                decrypt_success = pdf_reader.decrypt(pdf_pass) != 0
            except Exception:
                logger.error("Unable to decrypt pdf")

            if not decrypt_success:
                logger.error(
                    "Encrypted PDF could not be decrypted, returning empty text."
                )
                return "", metadata, []
        elif pdf_reader.is_encrypted:
            logger.warning("No Password for an encrypted PDF, returning empty text.")
            return "", metadata, []

        # Basic PDF metadata
        if pdf_reader.metadata is not None:
@@ -33,20 +33,8 @@ def is_pdf_protected(file: IO[Any]) -> bool:

    with preserve_position(file):
        reader = PdfReader(file)
        if not reader.is_encrypted:
            return False

        # PDFs with only an owner password (permission restrictions like
        # print/copy disabled) use an empty user password — any viewer can open
        # them without prompting. decrypt("") returns 0 only when a real user
        # password is required. See https://github.com/onyx-dot-app/onyx/issues/9754
        try:
            return reader.decrypt("") == 0
        except Exception:
            logger.exception(
                "Failed to evaluate PDF encryption; treating as password protected"
            )
            return True
        return bool(reader.is_encrypted)


def is_docx_protected(file: IO[Any]) -> bool:
@@ -136,14 +136,12 @@ class FileStore(ABC):
        """

    @abstractmethod
    def delete_file(self, file_id: str, error_on_missing: bool = True) -> None:
    def delete_file(self, file_id: str) -> None:
        """
        Delete a file by its ID.

        Parameters:
        - file_id: ID of file to delete
        - error_on_missing: If False, silently return when the file record
          does not exist instead of raising.
        - file_name: Name of file to delete
        """

    @abstractmethod
@@ -454,23 +452,12 @@ class S3BackedFileStore(FileStore):
            logger.warning(f"Error getting file size for {file_id}: {e}")
            return None

    def delete_file(
        self,
        file_id: str,
        error_on_missing: bool = True,
        db_session: Session | None = None,
    ) -> None:
    def delete_file(self, file_id: str, db_session: Session | None = None) -> None:
        with get_session_with_current_tenant_if_none(db_session) as db_session:
            try:
                file_record = get_filerecord_by_file_id_optional(
                file_record = get_filerecord_by_file_id(
                    file_id=file_id, db_session=db_session
                )
                if file_record is None:
                    if error_on_missing:
                        raise RuntimeError(
                            f"File by id {file_id} does not exist or was deleted"
                        )
                    return
                if not file_record.bucket_name:
                    logger.error(
                        f"File record {file_id} with key {file_record.object_key} "
@@ -222,23 +222,12 @@ class PostgresBackedFileStore(FileStore):
            logger.warning(f"Error getting file size for {file_id}: {e}")
            return None

    def delete_file(
        self,
        file_id: str,
        error_on_missing: bool = True,
        db_session: Session | None = None,
    ) -> None:
    def delete_file(self, file_id: str, db_session: Session | None = None) -> None:
        with get_session_with_current_tenant_if_none(db_session) as session:
            try:
                file_content = get_file_content_by_file_id_optional(
                file_content = get_file_content_by_file_id(
                    file_id=file_id, db_session=session
                )
                if file_content is None:
                    if error_on_missing:
                        raise RuntimeError(
                            f"File content for file_id {file_id} does not exist or was deleted"
                        )
                    return
                raw_conn = _get_raw_connection(session)

                try:
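The `error_on_missing` pattern shared by both file-store backends above can be sketched in isolation. `InMemoryStore` and its record dict are hypothetical stand-ins for the real S3/Postgres lookups:

```python
class InMemoryStore:
    """Toy store illustrating optional-missing delete semantics."""

    def __init__(self) -> None:
        self._records: dict[str, bytes] = {}

    def save(self, file_id: str, data: bytes) -> None:
        self._records[file_id] = data

    def delete_file(self, file_id: str, error_on_missing: bool = True) -> None:
        record = self._records.pop(file_id, None)
        if record is None and error_on_missing:
            # Mirrors the RuntimeError raised when the record is absent
            raise RuntimeError(f"File by id {file_id} does not exist or was deleted")
```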
@@ -26,7 +26,6 @@ class LlmProviderNames(str, Enum):
    MISTRAL = "mistral"
    LITELLM_PROXY = "litellm_proxy"
    BIFROST = "bifrost"
    OPENAI_COMPATIBLE = "openai_compatible"

    def __str__(self) -> str:
        """Needed so things like:
@@ -47,7 +46,6 @@ WELL_KNOWN_PROVIDER_NAMES = [
    LlmProviderNames.LM_STUDIO,
    LlmProviderNames.LITELLM_PROXY,
    LlmProviderNames.BIFROST,
    LlmProviderNames.OPENAI_COMPATIBLE,
]


@@ -66,7 +64,6 @@ PROVIDER_DISPLAY_NAMES: dict[str, str] = {
    LlmProviderNames.LM_STUDIO: "LM Studio",
    LlmProviderNames.LITELLM_PROXY: "LiteLLM Proxy",
    LlmProviderNames.BIFROST: "Bifrost",
    LlmProviderNames.OPENAI_COMPATIBLE: "OpenAI Compatible",
    "groq": "Groq",
    "anyscale": "Anyscale",
    "deepseek": "DeepSeek",
@@ -119,7 +116,6 @@ AGGREGATOR_PROVIDERS: set[str] = {
    LlmProviderNames.AZURE,
    LlmProviderNames.LITELLM_PROXY,
    LlmProviderNames.BIFROST,
    LlmProviderNames.OPENAI_COMPATIBLE,
}

# Model family name mappings for display name generation
@@ -327,19 +327,12 @@ class LitellmLLM(LLM):
        ):
            model_kwargs[VERTEX_LOCATION_KWARG] = "global"

        # Bifrost and OpenAI-compatible: OpenAI-compatible proxies that send
        # model names directly to the endpoint. We route through LiteLLM's
        # openai provider with the server's base URL, and ensure /v1 is appended.
        if model_provider in (
            LlmProviderNames.BIFROST,
            LlmProviderNames.OPENAI_COMPATIBLE,
        ):
        # Bifrost: OpenAI-compatible proxy that expects model names in
        # provider/model format (e.g. "anthropic/claude-sonnet-4-6").
        # We route through LiteLLM's openai provider with the Bifrost base URL,
        # and ensure /v1 is appended.
        if model_provider == LlmProviderNames.BIFROST:
            self._custom_llm_provider = "openai"
            # LiteLLM's OpenAI client requires an api_key to be set.
            # Many OpenAI-compatible servers don't need auth, so supply a
            # placeholder to prevent LiteLLM from raising AuthenticationError.
            if not self._api_key:
                model_kwargs.setdefault("api_key", "not-needed")
            if self._api_base is not None:
                base = self._api_base.rstrip("/")
                self._api_base = base if base.endswith("/v1") else f"{base}/v1"
@@ -456,20 +449,17 @@ class LitellmLLM(LLM):
        optional_kwargs: dict[str, Any] = {}

        # Model name
        is_openai_compatible_proxy = self._model_provider in (
            LlmProviderNames.BIFROST,
            LlmProviderNames.OPENAI_COMPATIBLE,
        )
        is_bifrost = self._model_provider == LlmProviderNames.BIFROST
        model_provider = (
            f"{self.config.model_provider}/responses"
            if is_openai_model  # Uses litellm's completions -> responses bridge
            else self.config.model_provider
        )
        if is_openai_compatible_proxy:
            # OpenAI-compatible proxies (Bifrost, generic OpenAI-compatible
            # servers) expect model names sent directly to their endpoint.
            # We use custom_llm_provider="openai" so LiteLLM doesn't try
            # to route based on the provider prefix.
        if is_bifrost:
            # Bifrost expects model names in provider/model format
            # (e.g. "anthropic/claude-sonnet-4-6") sent directly to its
            # OpenAI-compatible endpoint. We use custom_llm_provider="openai"
            # so LiteLLM doesn't try to route based on the provider prefix.
            model = self.config.deployment_name or self.config.model_name
        else:
            model = f"{model_provider}/{self.config.deployment_name or self.config.model_name}"
@@ -560,10 +550,7 @@ class LitellmLLM(LLM):
        if structured_response_format:
            optional_kwargs["response_format"] = structured_response_format

        if (
            not (is_claude_model or is_ollama or is_mistral)
            or is_openai_compatible_proxy
        ):
        if not (is_claude_model or is_ollama or is_mistral) or is_bifrost:
            # Litellm bug: tool_choice is dropped silently if not specified here for OpenAI
            # However, this param breaks Anthropic and Mistral models,
            # so it must be conditionally included unless the request is
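The base-URL normalization shown in the hunk above (strip trailing slashes, then ensure a trailing `/v1`) as a standalone sketch; `normalize_openai_base` is an illustrative name:

```python
def normalize_openai_base(api_base: str) -> str:
    # Strip trailing slashes, then append /v1 only if it is not already there
    base = api_base.rstrip("/")
    return base if base.endswith("/v1") else f"{base}/v1"
```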
@@ -15,8 +15,6 @@ LITELLM_PROXY_PROVIDER_NAME = "litellm_proxy"

BIFROST_PROVIDER_NAME = "bifrost"

OPENAI_COMPATIBLE_PROVIDER_NAME = "openai_compatible"

# Providers that use optional Bearer auth from custom_config
PROVIDERS_WITH_SPECIAL_API_KEY_HANDLING: dict[str, str] = {
    LlmProviderNames.OLLAMA_CHAT: OLLAMA_API_KEY_CONFIG_KEY,
@@ -19,7 +19,6 @@ from onyx.llm.well_known_providers.constants import BIFROST_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import LITELLM_PROXY_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import LM_STUDIO_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import OLLAMA_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import OPENAI_COMPATIBLE_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import OPENAI_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import OPENROUTER_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import VERTEXAI_PROVIDER_NAME
@@ -52,7 +51,6 @@ def _get_provider_to_models_map() -> dict[str, list[str]]:
        OPENROUTER_PROVIDER_NAME: [],  # Dynamic - fetched from OpenRouter API
        LITELLM_PROXY_PROVIDER_NAME: [],  # Dynamic - fetched from LiteLLM proxy API
        BIFROST_PROVIDER_NAME: [],  # Dynamic - fetched from Bifrost API
        OPENAI_COMPATIBLE_PROVIDER_NAME: [],  # Dynamic - fetched from OpenAI-compatible API
    }


@@ -338,7 +336,6 @@ def get_provider_display_name(provider_name: str) -> str:
        VERTEXAI_PROVIDER_NAME: "Google Vertex AI",
        OPENROUTER_PROVIDER_NAME: "OpenRouter",
        LITELLM_PROXY_PROVIDER_NAME: "LiteLLM Proxy",
        OPENAI_COMPATIBLE_PROVIDER_NAME: "OpenAI Compatible",
    }

    if provider_name in _ONYX_PROVIDER_DISPLAY_NAMES:
@@ -3,8 +3,6 @@
from datetime import datetime
from typing import Any

import httpx

from onyx.configs.constants import DocumentSource
from onyx.mcp_server.api import mcp_server
from onyx.mcp_server.utils import get_http_client
@@ -17,21 +15,6 @@ from onyx.utils.variable_functionality import global_version
logger = setup_logger()


def _extract_error_detail(response: httpx.Response) -> str:
    """Extract a human-readable error message from a failed backend response.

    The backend returns OnyxError responses as
    ``{"error_code": "...", "detail": "..."}``.
    """
    try:
        body = response.json()
        if detail := body.get("detail"):
            return str(detail)
    except Exception:
        pass
    return f"Request failed with status {response.status_code}"
@mcp_server.tool()
async def search_indexed_documents(
    query: str,
@@ -175,14 +158,7 @@ async def search_indexed_documents(
            json=search_request,
            headers=auth_headers,
        )
        if not response.is_success:
            error_detail = _extract_error_detail(response)
            return {
                "documents": [],
                "total_results": 0,
                "query": query,
                "error": error_detail,
            }
        response.raise_for_status()
        result = response.json()

        # Check for error in response
@@ -258,13 +234,7 @@ async def search_web(
            json=request_payload,
            headers={"Authorization": f"Bearer {access_token.token}"},
        )
        if not response.is_success:
            error_detail = _extract_error_detail(response)
            return {
                "error": error_detail,
                "results": [],
                "query": query,
            }
        response.raise_for_status()
        response_payload = response.json()
        results = response_payload.get("results", [])
        return {
@@ -310,12 +280,7 @@ async def open_urls(
            json={"urls": urls},
            headers={"Authorization": f"Bearer {access_token.token}"},
        )
        if not response.is_success:
            error_detail = _extract_error_detail(response)
            return {
                "error": error_detail,
                "results": [],
            }
        response.raise_for_status()
        response_payload = response.json()
        results = response_payload.get("results", [])
        return {
@@ -6,7 +6,6 @@ from onyx.configs.app_configs import MCP_SERVER_ENABLED
from onyx.configs.app_configs import MCP_SERVER_HOST
from onyx.configs.app_configs import MCP_SERVER_PORT
from onyx.utils.logger import setup_logger
from onyx.utils.variable_functionality import set_is_ee_based_on_env_variable

logger = setup_logger()

@@ -17,7 +16,6 @@ def main() -> None:
        logger.info("MCP server is disabled (MCP_SERVER_ENABLED=false)")
        return

    set_is_ee_based_on_env_variable()
    logger.info(f"Starting MCP server on {MCP_SERVER_HOST}:{MCP_SERVER_PORT}")

    from onyx.mcp_server.api import mcp_app
@@ -1,5 +1,6 @@
from fastapi import APIRouter
from fastapi import Depends
from fastapi import HTTPException
from sqlalchemy.orm import Session

from onyx.auth.users import current_user
@@ -8,8 +9,6 @@ from onyx.db.engine.sql_engine import get_session
from onyx.db.models import User
from onyx.db.web_search import fetch_active_web_content_provider
from onyx.db.web_search import fetch_active_web_search_provider
from onyx.error_handling.error_codes import OnyxErrorCode
from onyx.error_handling.exceptions import OnyxError
from onyx.server.features.web_search.models import OpenUrlsToolRequest
from onyx.server.features.web_search.models import OpenUrlsToolResponse
from onyx.server.features.web_search.models import WebSearchToolRequest
@@ -62,10 +61,9 @@ def _get_active_search_provider(
) -> tuple[WebSearchProviderView, WebSearchProvider]:
    provider_model = fetch_active_web_search_provider(db_session)
    if provider_model is None:
        raise OnyxError(
            OnyxErrorCode.INVALID_INPUT,
            "No web search provider configured. Please configure one in "
            "Admin > Web Search settings.",
        raise HTTPException(
            status_code=400,
            detail="No web search provider configured.",
        )

    provider_view = WebSearchProviderView(
@@ -78,10 +76,9 @@ def _get_active_search_provider(
    )

    if provider_model.api_key is None:
        raise OnyxError(
            OnyxErrorCode.INVALID_INPUT,
            "Web search provider requires an API key. Please configure one in "
            "Admin > Web Search settings.",
        raise HTTPException(
            status_code=400,
            detail="Web search provider requires an API key.",
        )

    try:
@@ -91,7 +88,7 @@ def _get_active_search_provider(
            config=provider_model.config or {},
        )
    except ValueError as exc:
        raise OnyxError(OnyxErrorCode.INVALID_INPUT, str(exc)) from exc
        raise HTTPException(status_code=400, detail=str(exc)) from exc

    return provider_view, provider
@@ -113,9 +110,9 @@ def _get_active_content_provider(

    if provider_model.api_key is None:
        # TODO - this is not a great error, in fact, this key should not be nullable.
        raise OnyxError(
            OnyxErrorCode.INVALID_INPUT,
            "Web content provider requires an API key.",
        raise HTTPException(
            status_code=400,
            detail="Web content provider requires an API key.",
        )

    try:
@@ -128,12 +125,12 @@ def _get_active_content_provider(
            config=config,
        )
    except ValueError as exc:
        raise OnyxError(OnyxErrorCode.INVALID_INPUT, str(exc)) from exc
        raise HTTPException(status_code=400, detail=str(exc)) from exc

    if provider is None:
        raise OnyxError(
            OnyxErrorCode.INVALID_INPUT,
            "Unable to initialize the configured web content provider.",
        raise HTTPException(
            status_code=400,
            detail="Unable to initialize the configured web content provider.",
        )

    provider_view = WebContentProviderView(
@@ -157,13 +154,12 @@ def _run_web_search(
    for query in request.queries:
        try:
            search_results = provider.search(query)
        except OnyxError:
        except HTTPException:
            raise
        except Exception as exc:
            logger.exception("Web search provider failed for query '%s'", query)
            raise OnyxError(
                OnyxErrorCode.BAD_GATEWAY,
                "Web search provider failed to execute query.",
            raise HTTPException(
                status_code=502, detail="Web search provider failed to execute query."
            ) from exc

        filtered_results = filter_web_search_results_with_no_title_or_snippet(
@@ -196,13 +192,12 @@ def _open_urls(
        docs = filter_web_contents_with_no_title_or_content(
            list(provider.contents(urls))
        )
    except OnyxError:
    except HTTPException:
        raise
    except Exception as exc:
        logger.exception("Web content provider failed to fetch URLs")
        raise OnyxError(
            OnyxErrorCode.BAD_GATEWAY,
            "Web content provider failed to fetch URLs.",
        raise HTTPException(
            status_code=502, detail="Web content provider failed to fetch URLs."
        ) from exc

    results: list[LlmOpenUrlResult] = []
@@ -74,8 +74,6 @@ from onyx.server.manage.llm.models import ModelConfigurationUpsertRequest
from onyx.server.manage.llm.models import OllamaFinalModelResponse
from onyx.server.manage.llm.models import OllamaModelDetails
from onyx.server.manage.llm.models import OllamaModelsRequest
from onyx.server.manage.llm.models import OpenAICompatibleFinalModelResponse
from onyx.server.manage.llm.models import OpenAICompatibleModelsRequest
from onyx.server.manage.llm.models import OpenRouterFinalModelResponse
from onyx.server.manage.llm.models import OpenRouterModelDetails
from onyx.server.manage.llm.models import OpenRouterModelsRequest
@@ -1577,95 +1575,3 @@ def _get_bifrost_models_response(api_base: str, api_key: str | None = None) -> d
        source_name="Bifrost",
        api_key=api_key,
    )


@admin_router.post("/openai-compatible/available-models")
def get_openai_compatible_server_available_models(
    request: OpenAICompatibleModelsRequest,
    _: User = Depends(current_admin_user),
    db_session: Session = Depends(get_session),
) -> list[OpenAICompatibleFinalModelResponse]:
    """Fetch available models from a generic OpenAI-compatible /v1/models endpoint."""
    response_json = _get_openai_compatible_server_response(
        api_base=request.api_base, api_key=request.api_key
    )

    models = response_json.get("data", [])
    if not isinstance(models, list) or len(models) == 0:
        raise OnyxError(
            OnyxErrorCode.VALIDATION_ERROR,
            "No models found from your OpenAI-compatible endpoint",
        )

    results: list[OpenAICompatibleFinalModelResponse] = []
    for model in models:
        try:
            model_id = model.get("id", "")
            model_name = model.get("name", model_id)

            if not model_id:
                continue

            # Skip embedding models
            if is_embedding_model(model_id):
                continue

            results.append(
                OpenAICompatibleFinalModelResponse(
                    name=model_id,
                    display_name=model_name,
                    max_input_tokens=model.get("context_length"),
                    supports_image_input=infer_vision_support(model_id),
                    supports_reasoning=is_reasoning_model(model_id, model_name),
                )
            )
        except Exception as e:
            logger.warning(
                "Failed to parse OpenAI-compatible model entry",
                extra={"error": str(e), "item": str(model)[:1000]},
            )

    if not results:
        raise OnyxError(
            OnyxErrorCode.VALIDATION_ERROR,
            "No compatible models found from OpenAI-compatible endpoint",
        )

    sorted_results = sorted(results, key=lambda m: m.name.lower())

    # Sync new models to DB if provider_name is specified
    if request.provider_name:
        _sync_fetched_models(
            db_session=db_session,
            provider_name=request.provider_name,
            models=[
                SyncModelEntry(
                    name=r.name,
                    display_name=r.display_name,
                    max_input_tokens=r.max_input_tokens,
                    supports_image_input=r.supports_image_input,
                )
                for r in sorted_results
            ],
            source_label="OpenAI Compatible",
        )

    return sorted_results


def _get_openai_compatible_server_response(
    api_base: str, api_key: str | None = None
) -> dict:
    """Perform GET to an OpenAI-compatible /v1/models and return parsed JSON."""
    cleaned_api_base = api_base.strip().rstrip("/")
    # Ensure we hit /v1/models
    if cleaned_api_base.endswith("/v1"):
        url = f"{cleaned_api_base}/models"
    else:
        url = f"{cleaned_api_base}/v1/models"

    return _get_openai_compatible_models_response(
        url=url,
        source_name="OpenAI Compatible",
        api_key=api_key,
    )
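The `/v1/models` URL construction in `_get_openai_compatible_server_response` above can be sketched standalone; `build_models_url` is an illustrative name:

```python
def build_models_url(api_base: str) -> str:
    # Normalize whitespace and trailing slashes, then target /v1/models
    cleaned = api_base.strip().rstrip("/")
    if cleaned.endswith("/v1"):
        return f"{cleaned}/models"
    return f"{cleaned}/v1/models"
```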
@@ -79,9 +79,7 @@ class LLMProviderDescriptor(BaseModel):
            provider=provider,
            provider_display_name=get_provider_display_name(provider),
            model_configurations=filter_model_configurations(
                llm_provider_model.model_configurations,
                provider,
                use_stored_display_name=llm_provider_model.custom_config is not None,
                llm_provider_model.model_configurations, provider
            ),
        )

@@ -158,9 +156,7 @@ class LLMProviderView(LLMProvider):
            personas=personas,
            deployment_name=llm_provider_model.deployment_name,
            model_configurations=filter_model_configurations(
                llm_provider_model.model_configurations,
                provider,
                use_stored_display_name=llm_provider_model.custom_config is not None,
                llm_provider_model.model_configurations, provider
            ),
        )

@@ -202,13 +198,13 @@ class ModelConfigurationView(BaseModel):
        cls,
        model_configuration_model: "ModelConfigurationModel",
        provider_name: str,
        use_stored_display_name: bool = False,
    ) -> "ModelConfigurationView":
        # For dynamic providers (OpenRouter, Bedrock, Ollama) and custom-config
        # providers, use the display_name stored in DB. Skip LiteLLM parsing.
        # For dynamic providers (OpenRouter, Bedrock, Ollama), use the display_name
        # stored in DB from the source API. Skip LiteLLM parsing entirely.
        if (
            provider_name in DYNAMIC_LLM_PROVIDERS or use_stored_display_name
        ) and model_configuration_model.display_name:
            provider_name in DYNAMIC_LLM_PROVIDERS
            and model_configuration_model.display_name
        ):
            # Extract vendor from model name for grouping (e.g., "Anthropic", "OpenAI")
            vendor = extract_vendor_from_model_name(
                model_configuration_model.name, provider_name
@@ -468,18 +464,3 @@ class BifrostFinalModelResponse(BaseModel):
    max_input_tokens: int | None
    supports_image_input: bool
    supports_reasoning: bool


# OpenAI Compatible dynamic models fetch
class OpenAICompatibleModelsRequest(BaseModel):
    api_base: str
    api_key: str | None = None
    provider_name: str | None = None  # Optional: to save models to existing provider


class OpenAICompatibleFinalModelResponse(BaseModel):
    name: str  # Model ID (e.g. "meta-llama/Llama-3-8B-Instruct")
    display_name: str  # Human-readable name from API
    max_input_tokens: int | None
    supports_image_input: bool
    supports_reasoning: bool
@@ -26,7 +26,6 @@ DYNAMIC_LLM_PROVIDERS = frozenset(
LlmProviderNames.OLLAMA_CHAT,
LlmProviderNames.LM_STUDIO,
LlmProviderNames.BIFROST,
LlmProviderNames.OPENAI_COMPATIBLE,
}
)

@@ -309,15 +308,12 @@ def should_filter_as_dated_duplicate(
def filter_model_configurations(
model_configurations: list,
provider: str,
use_stored_display_name: bool = False,
) -> list:
"""Filter out obsolete and dated duplicate models from configurations.

Args:
model_configurations: List of ModelConfiguration DB models
provider: The provider name (e.g., "openai", "anthropic")
use_stored_display_name: If True, prefer the display_name stored in the
DB over LiteLLM enrichments. Set for custom-config providers.

Returns:
List of ModelConfigurationView objects with obsolete/duplicate models removed
@@ -337,9 +333,7 @@ def filter_model_configurations(
if should_filter_as_dated_duplicate(model_configuration.name, all_model_names):
continue
filtered_configs.append(
ModelConfigurationView.from_model(
model_configuration, provider, use_stored_display_name
)
ModelConfigurationView.from_model(model_configuration, provider)
)

return filtered_configs

@@ -26,6 +26,7 @@ _DEFAULT_PORTS: dict[str, int] = {
"monitoring": 9096,
"docfetching": 9092,
"docprocessing": 9093,
"heavy": 9094,
}

_server_started = False

@@ -186,7 +186,7 @@ class TestDocumentIndexNew:
)
document_index.index(chunks=[pre_chunk], indexing_metadata=pre_metadata)

time.sleep(2)
time.sleep(1)

# Now index a batch with the existing doc and a new doc.
chunks = [

@@ -9,7 +9,6 @@ This test verifies the full flow: provisioning failure → rollback → schema c
"""

import uuid
from unittest.mock import MagicMock
from unittest.mock import patch

from sqlalchemy import text
@@ -56,28 +55,18 @@ class TestTenantProvisioningRollback:
created_tenant_id = tenant_id
return create_schema_if_not_exists(tenant_id)

# Mock setup_tenant to fail after schema creation.
# Also mock the Redis lock so the test doesn't compete with a live
# monitoring worker that may already hold the provision lock.
mock_lock = MagicMock()
mock_lock.acquire.return_value = True

# Mock setup_tenant to fail after schema creation
with patch(
"ee.onyx.background.celery.tasks.tenant_provisioning.tasks.get_redis_client"
) as mock_redis:
mock_redis.return_value.lock.return_value = mock_lock
"ee.onyx.background.celery.tasks.tenant_provisioning.tasks.setup_tenant"
) as mock_setup:
mock_setup.side_effect = Exception("Simulated provisioning failure")

with patch(
"ee.onyx.background.celery.tasks.tenant_provisioning.tasks.setup_tenant"
) as mock_setup:
mock_setup.side_effect = Exception("Simulated provisioning failure")

with patch(
"ee.onyx.background.celery.tasks.tenant_provisioning.tasks.create_schema_if_not_exists",
side_effect=track_schema_creation,
):
# Run pre-provisioning - it should fail and trigger rollback
pre_provision_tenant()
"ee.onyx.background.celery.tasks.tenant_provisioning.tasks.create_schema_if_not_exists",
side_effect=track_schema_creation,
):
# Run pre-provisioning - it should fail and trigger rollback
pre_provision_tenant()

# Verify that the schema was created and then cleaned up
assert created_tenant_id is not None, "Schema should have been created"

@@ -1,58 +0,0 @@
import pytest

from onyx.configs.constants import MASK_CREDENTIAL_CHAR
from onyx.db.federated import _reject_masked_credentials


class TestRejectMaskedCredentials:
    """Verify that masked credential values are never accepted for DB writes.

    mask_string() has two output formats:
    - Short strings (< 14 chars): "••••••••••••" (U+2022 BULLET)
    - Long strings (>= 14 chars): "abcd...wxyz" (first4 + "..." + last4)
    _reject_masked_credentials must catch both.
    """

    def test_rejects_fully_masked_value(self) -> None:
        masked = MASK_CREDENTIAL_CHAR * 12  # "••••••••••••"
        with pytest.raises(ValueError, match="masked placeholder"):
            _reject_masked_credentials({"client_id": masked})

    def test_rejects_long_string_masked_value(self) -> None:
        """mask_string returns 'first4...last4' for long strings — the real
        format used for OAuth credentials like client_id and client_secret."""
        with pytest.raises(ValueError, match="masked placeholder"):
            _reject_masked_credentials({"client_id": "1234...7890"})

    def test_rejects_when_any_field_is_masked(self) -> None:
        """Even if client_id is real, a masked client_secret must be caught."""
        with pytest.raises(ValueError, match="client_secret"):
            _reject_masked_credentials(
                {
                    "client_id": "1234567890.1234567890",
                    "client_secret": MASK_CREDENTIAL_CHAR * 12,
                }
            )

    def test_accepts_real_credentials(self) -> None:
        # Should not raise
        _reject_masked_credentials(
            {
                "client_id": "1234567890.1234567890",
                "client_secret": "test_client_secret_value",
            }
        )

    def test_accepts_empty_dict(self) -> None:
        # Should not raise — empty credentials are handled elsewhere
        _reject_masked_credentials({})

    def test_ignores_non_string_values(self) -> None:
        # Non-string values (None, bool, int) should pass through
        _reject_masked_credentials(
            {
                "client_id": "real_value",
                "redirect_uri": None,
                "some_flag": True,
            }
        )
@@ -1,318 +0,0 @@
"""Unit tests for Notion connector handling of people properties and table blocks.

Reproduces two bugs:
1. ENG-3970: People-type database properties (user mentions) are not extracted —
   the user's "name" field is lost when _recurse_properties drills into the
   "person" sub-dict.
2. ENG-3971: Inline table blocks (table/table_row) are not indexed — table_row
   blocks store content in "cells" rather than "rich_text", so no text is extracted.
"""

from unittest.mock import patch

from onyx.connectors.notion.connector import NotionConnector


def _make_connector() -> NotionConnector:
    connector = NotionConnector()
    connector.load_credentials({"notion_integration_token": "fake-token"})
    return connector


class TestPeoplePropertyExtraction:
    """ENG-3970: Verifies that 'people' type database properties extract user names."""

    def test_single_person_property(self) -> None:
        """A database cell with a single @mention should extract the user name."""
        properties = {
            "Team Lead": {
                "id": "abc",
                "type": "people",
                "people": [
                    {
                        "object": "user",
                        "id": "user-uuid-1",
                        "name": "Arturo Martinez",
                        "type": "person",
                        "person": {"email": "arturo@example.com"},
                    }
                ],
            }
        }
        result = NotionConnector._properties_to_str(properties)
        assert (
            "Arturo Martinez" in result
        ), f"Expected 'Arturo Martinez' in extracted text, got: {result!r}"

    def test_multiple_people_property(self) -> None:
        """A database cell with multiple @mentions should extract all user names."""
        properties = {
            "Members": {
                "id": "def",
                "type": "people",
                "people": [
                    {
                        "object": "user",
                        "id": "user-uuid-1",
                        "name": "Arturo Martinez",
                        "type": "person",
                        "person": {"email": "arturo@example.com"},
                    },
                    {
                        "object": "user",
                        "id": "user-uuid-2",
                        "name": "Jane Smith",
                        "type": "person",
                        "person": {"email": "jane@example.com"},
                    },
                ],
            }
        }
        result = NotionConnector._properties_to_str(properties)
        assert (
            "Arturo Martinez" in result
        ), f"Expected 'Arturo Martinez' in extracted text, got: {result!r}"
        assert (
            "Jane Smith" in result
        ), f"Expected 'Jane Smith' in extracted text, got: {result!r}"

    def test_bot_user_property(self) -> None:
        """Bot users (integrations) have 'type': 'bot' — name should still be extracted."""
        properties = {
            "Created By": {
                "id": "ghi",
                "type": "people",
                "people": [
                    {
                        "object": "user",
                        "id": "bot-uuid-1",
                        "name": "Onyx Integration",
                        "type": "bot",
                        "bot": {},
                    }
                ],
            }
        }
        result = NotionConnector._properties_to_str(properties)
        assert (
            "Onyx Integration" in result
        ), f"Expected 'Onyx Integration' in extracted text, got: {result!r}"

    def test_person_without_person_details(self) -> None:
        """Some user objects may have an empty/null person sub-dict."""
        properties = {
            "Assignee": {
                "id": "jkl",
                "type": "people",
                "people": [
                    {
                        "object": "user",
                        "id": "user-uuid-3",
                        "name": "Ghost User",
                        "type": "person",
                        "person": {},
                    }
                ],
            }
        }
        result = NotionConnector._properties_to_str(properties)
        assert (
            "Ghost User" in result
        ), f"Expected 'Ghost User' in extracted text, got: {result!r}"

    def test_people_mixed_with_other_properties(self) -> None:
        """People property should work alongside other property types."""
        properties = {
            "Name": {
                "id": "aaa",
                "type": "title",
                "title": [
                    {
                        "plain_text": "Project Alpha",
                        "type": "text",
                        "text": {"content": "Project Alpha"},
                    }
                ],
            },
            "Lead": {
                "id": "bbb",
                "type": "people",
                "people": [
                    {
                        "object": "user",
                        "id": "user-uuid-1",
                        "name": "Arturo Martinez",
                        "type": "person",
                        "person": {"email": "arturo@example.com"},
                    }
                ],
            },
            "Status": {
                "id": "ccc",
                "type": "status",
                "status": {"name": "In Progress", "id": "status-1"},
            },
        }
        result = NotionConnector._properties_to_str(properties)
        assert "Arturo Martinez" in result
        assert "In Progress" in result


class TestTableBlockExtraction:
    """ENG-3971: Verifies that inline table blocks (table/table_row) are indexed."""

    def _make_blocks_response(self, results: list) -> dict:
        return {"results": results, "next_cursor": None}

    def test_table_row_cells_are_extracted(self) -> None:
        """table_row blocks store content in 'cells', not 'rich_text'.
        The connector should extract text from cells."""
        connector = _make_connector()
        connector.workspace_id = "ws-1"

        table_block = {
            "id": "table-block-1",
            "type": "table",
            "table": {
                "has_column_header": True,
                "has_row_header": False,
                "table_width": 3,
            },
            "has_children": True,
        }

        header_row = {
            "id": "row-1",
            "type": "table_row",
            "table_row": {
                "cells": [
                    [
                        {
                            "type": "text",
                            "text": {"content": "Name"},
                            "plain_text": "Name",
                        }
                    ],
                    [
                        {
                            "type": "text",
                            "text": {"content": "Role"},
                            "plain_text": "Role",
                        }
                    ],
                    [
                        {
                            "type": "text",
                            "text": {"content": "Team"},
                            "plain_text": "Team",
                        }
                    ],
                ]
            },
            "has_children": False,
        }

        data_row = {
            "id": "row-2",
            "type": "table_row",
            "table_row": {
                "cells": [
                    [
                        {
                            "type": "text",
                            "text": {"content": "Arturo Martinez"},
                            "plain_text": "Arturo Martinez",
                        }
                    ],
                    [
                        {
                            "type": "text",
                            "text": {"content": "Engineer"},
                            "plain_text": "Engineer",
                        }
                    ],
                    [
                        {
                            "type": "text",
                            "text": {"content": "Platform"},
                            "plain_text": "Platform",
                        }
                    ],
                ]
            },
            "has_children": False,
        }

        with patch.object(
            connector,
            "_fetch_child_blocks",
            side_effect=[
                self._make_blocks_response([table_block]),
                self._make_blocks_response([header_row, data_row]),
            ],
        ):
            output = connector._read_blocks("page-1")

        all_text = " ".join(block.text for block in output.blocks)
        assert "Arturo Martinez" in all_text, (
            f"Expected 'Arturo Martinez' in table row text, got blocks: "
            f"{[(b.id, b.text) for b in output.blocks]}"
        )
        assert "Engineer" in all_text, (
            f"Expected 'Engineer' in table row text, got blocks: "
            f"{[(b.id, b.text) for b in output.blocks]}"
        )
        assert "Platform" in all_text, (
            f"Expected 'Platform' in table row text, got blocks: "
            f"{[(b.id, b.text) for b in output.blocks]}"
        )

    def test_table_with_empty_cells(self) -> None:
        """Table rows with some empty cells should still extract non-empty content."""
        connector = _make_connector()
        connector.workspace_id = "ws-1"

        table_block = {
            "id": "table-block-2",
            "type": "table",
            "table": {
                "has_column_header": False,
                "has_row_header": False,
                "table_width": 2,
            },
            "has_children": True,
        }

        row_with_empty = {
            "id": "row-3",
            "type": "table_row",
            "table_row": {
                "cells": [
                    [
                        {
                            "type": "text",
                            "text": {"content": "Has Value"},
                            "plain_text": "Has Value",
                        }
                    ],
                    [],  # empty cell
                ]
            },
            "has_children": False,
        }

        with patch.object(
            connector,
            "_fetch_child_blocks",
            side_effect=[
                self._make_blocks_response([table_block]),
                self._make_blocks_response([row_with_empty]),
            ],
        ):
            output = connector._read_blocks("page-2")

        all_text = " ".join(block.text for block in output.blocks)
        assert "Has Value" in all_text, (
            f"Expected 'Has Value' in table row text, got blocks: "
            f"{[(b.id, b.text) for b in output.blocks]}"
        )
@@ -1,100 +0,0 @@
"""Regression tests for delete_messages_and_files_from_chat_session.

Verifies that user-owned files (those with user_file_id) are never deleted
during chat session cleanup — only chat-only files should be removed.
"""

from unittest.mock import call
from unittest.mock import MagicMock
from unittest.mock import patch
from uuid import uuid4

from onyx.db.chat import delete_messages_and_files_from_chat_session

_MODULE = "onyx.db.chat"


def _make_db_session(
    rows: list[tuple[int, list[dict[str, str]] | None]],
) -> MagicMock:
    db_session = MagicMock()
    db_session.execute.return_value.tuples.return_value.all.return_value = rows
    return db_session


@patch(f"{_MODULE}.delete_orphaned_search_docs")
@patch(f"{_MODULE}.get_default_file_store")
def test_user_files_are_not_deleted(
    mock_get_file_store: MagicMock,
    _mock_orphan_cleanup: MagicMock,
) -> None:
    """User files (with user_file_id) must be skipped during cleanup."""
    file_store = MagicMock()
    mock_get_file_store.return_value = file_store

    db_session = _make_db_session(
        [
            (
                1,
                [
                    {"id": "chat-file-1", "type": "image"},
                    {"id": "user-file-1", "type": "document", "user_file_id": "uf-1"},
                    {"id": "chat-file-2", "type": "image"},
                ],
            ),
        ]
    )

    delete_messages_and_files_from_chat_session(uuid4(), db_session)

    assert file_store.delete_file.call_count == 2
    file_store.delete_file.assert_has_calls(
        [
            call(file_id="chat-file-1", error_on_missing=False),
            call(file_id="chat-file-2", error_on_missing=False),
        ]
    )


@patch(f"{_MODULE}.delete_orphaned_search_docs")
@patch(f"{_MODULE}.get_default_file_store")
def test_only_user_files_means_no_deletions(
    mock_get_file_store: MagicMock,
    _mock_orphan_cleanup: MagicMock,
) -> None:
    """When every file in the session is a user file, nothing should be deleted."""
    file_store = MagicMock()
    mock_get_file_store.return_value = file_store

    db_session = _make_db_session(
        [
            (1, [{"id": "uf-a", "type": "document", "user_file_id": "uf-1"}]),
            (2, [{"id": "uf-b", "type": "document", "user_file_id": "uf-2"}]),
        ]
    )

    delete_messages_and_files_from_chat_session(uuid4(), db_session)

    file_store.delete_file.assert_not_called()


@patch(f"{_MODULE}.delete_orphaned_search_docs")
@patch(f"{_MODULE}.get_default_file_store")
def test_messages_with_no_files(
    mock_get_file_store: MagicMock,
    _mock_orphan_cleanup: MagicMock,
) -> None:
    """Messages with None or empty file lists should not trigger any deletions."""
    file_store = MagicMock()
    mock_get_file_store.return_value = file_store

    db_session = _make_db_session(
        [
            (1, None),
            (2, []),
        ]
    )

    delete_messages_and_files_from_chat_session(uuid4(), db_session)

    file_store.delete_file.assert_not_called()
@@ -1,203 +0,0 @@
import pytest

from onyx.document_index.interfaces_new import TenantState
from onyx.document_index.opensearch.constants import DEFAULT_MAX_CHUNK_SIZE
from onyx.document_index.opensearch.schema import get_opensearch_doc_chunk_id
from onyx.document_index.opensearch.string_filtering import (
    MAX_DOCUMENT_ID_ENCODED_LENGTH,
)
from shared_configs.configs import POSTGRES_DEFAULT_SCHEMA_STANDARD_VALUE


SINGLE_TENANT_STATE = TenantState(
    tenant_id=POSTGRES_DEFAULT_SCHEMA_STANDARD_VALUE, multitenant=False
)
MULTI_TENANT_STATE = TenantState(
    tenant_id="tenant_abcdef12-3456-7890-abcd-ef1234567890", multitenant=True
)
EXPECTED_SHORT_TENANT = "abcdef12"


class TestGetOpensearchDocChunkIdSingleTenant:
    def test_basic(self) -> None:
        result = get_opensearch_doc_chunk_id(
            SINGLE_TENANT_STATE, "my-doc-id", chunk_index=0
        )
        assert result == f"my-doc-id__{DEFAULT_MAX_CHUNK_SIZE}__0"

    def test_custom_chunk_size(self) -> None:
        result = get_opensearch_doc_chunk_id(
            SINGLE_TENANT_STATE, "doc1", chunk_index=3, max_chunk_size=1024
        )
        assert result == "doc1__1024__3"

    def test_special_chars_are_stripped(self) -> None:
        """Tests characters not matching [A-Za-z0-9_.-~] are removed."""
        result = get_opensearch_doc_chunk_id(
            SINGLE_TENANT_STATE, "doc/with?special#chars&more%stuff", chunk_index=0
        )
        assert "/" not in result
        assert "?" not in result
        assert "#" not in result
        assert result == f"docwithspecialcharsmorestuff__{DEFAULT_MAX_CHUNK_SIZE}__0"

    def test_short_doc_id_not_hashed(self) -> None:
        """
        Tests that a short doc ID should appear directly in the result, not as a
        hash.
        """
        doc_id = "short-id"
        result = get_opensearch_doc_chunk_id(SINGLE_TENANT_STATE, doc_id, chunk_index=0)
        assert "short-id" in result

    def test_long_doc_id_is_hashed(self) -> None:
        """
        Tests that a doc ID exceeding the max length should be replaced with a
        blake2b hash.
        """
        # Create a doc ID that will exceed max length after the suffix is
        # appended.
        doc_id = "a" * MAX_DOCUMENT_ID_ENCODED_LENGTH
        result = get_opensearch_doc_chunk_id(SINGLE_TENANT_STATE, doc_id, chunk_index=0)
        # The original doc ID should NOT appear in the result.
        assert doc_id not in result
        # The suffix should still be present.
        assert f"__{DEFAULT_MAX_CHUNK_SIZE}__0" in result

    def test_long_doc_id_hash_is_deterministic(self) -> None:
        doc_id = "x" * MAX_DOCUMENT_ID_ENCODED_LENGTH
        result1 = get_opensearch_doc_chunk_id(
            SINGLE_TENANT_STATE, doc_id, chunk_index=5
        )
        result2 = get_opensearch_doc_chunk_id(
            SINGLE_TENANT_STATE, doc_id, chunk_index=5
        )
        assert result1 == result2

    def test_long_doc_id_different_inputs_produce_different_hashes(self) -> None:
        doc_id_a = "a" * MAX_DOCUMENT_ID_ENCODED_LENGTH
        doc_id_b = "b" * MAX_DOCUMENT_ID_ENCODED_LENGTH
        result_a = get_opensearch_doc_chunk_id(
            SINGLE_TENANT_STATE, doc_id_a, chunk_index=0
        )
        result_b = get_opensearch_doc_chunk_id(
            SINGLE_TENANT_STATE, doc_id_b, chunk_index=0
        )
        assert result_a != result_b

    def test_result_never_exceeds_max_length(self) -> None:
        """
        Tests that the final result should always be under
        MAX_DOCUMENT_ID_ENCODED_LENGTH bytes.
        """
        doc_id = "z" * (MAX_DOCUMENT_ID_ENCODED_LENGTH * 2)
        result = get_opensearch_doc_chunk_id(
            SINGLE_TENANT_STATE, doc_id, chunk_index=999, max_chunk_size=99999
        )
        assert len(result.encode("utf-8")) < MAX_DOCUMENT_ID_ENCODED_LENGTH

    def test_no_tenant_prefix_in_single_tenant(self) -> None:
        result = get_opensearch_doc_chunk_id(
            SINGLE_TENANT_STATE, "mydoc", chunk_index=0
        )
        assert not result.startswith(SINGLE_TENANT_STATE.tenant_id)


class TestGetOpensearchDocChunkIdMultiTenant:
    def test_includes_tenant_prefix(self) -> None:
        result = get_opensearch_doc_chunk_id(MULTI_TENANT_STATE, "mydoc", chunk_index=0)
        assert result.startswith(f"{EXPECTED_SHORT_TENANT}__")

    def test_format(self) -> None:
        result = get_opensearch_doc_chunk_id(
            MULTI_TENANT_STATE, "mydoc", chunk_index=2, max_chunk_size=256
        )
        assert result == f"{EXPECTED_SHORT_TENANT}__mydoc__256__2"

    def test_long_doc_id_is_hashed_multitenant(self) -> None:
        doc_id = "d" * MAX_DOCUMENT_ID_ENCODED_LENGTH
        result = get_opensearch_doc_chunk_id(MULTI_TENANT_STATE, doc_id, chunk_index=0)
        # Should still have tenant prefix.
        assert result.startswith(f"{EXPECTED_SHORT_TENANT}__")
        # The original doc ID should NOT appear in the result.
        assert doc_id not in result
        # The suffix should still be present.
        assert f"__{DEFAULT_MAX_CHUNK_SIZE}__0" in result

    def test_result_never_exceeds_max_length_multitenant(self) -> None:
        doc_id = "q" * (MAX_DOCUMENT_ID_ENCODED_LENGTH * 2)
        result = get_opensearch_doc_chunk_id(
            MULTI_TENANT_STATE, doc_id, chunk_index=999, max_chunk_size=99999
        )
        assert len(result.encode("utf-8")) < MAX_DOCUMENT_ID_ENCODED_LENGTH

    def test_different_tenants_produce_different_ids(self) -> None:
        tenant_a = TenantState(
            tenant_id="tenant_aaaaaaaa-0000-0000-0000-000000000000", multitenant=True
        )
        tenant_b = TenantState(
            tenant_id="tenant_bbbbbbbb-0000-0000-0000-000000000000", multitenant=True
        )
        result_a = get_opensearch_doc_chunk_id(tenant_a, "same-doc", chunk_index=0)
        result_b = get_opensearch_doc_chunk_id(tenant_b, "same-doc", chunk_index=0)
        assert result_a != result_b


class TestGetOpensearchDocChunkIdEdgeCases:
    def test_chunk_index_zero(self) -> None:
        result = get_opensearch_doc_chunk_id(SINGLE_TENANT_STATE, "doc", chunk_index=0)
        assert result.endswith("__0")

    def test_large_chunk_index(self) -> None:
        result = get_opensearch_doc_chunk_id(
            SINGLE_TENANT_STATE, "doc", chunk_index=99999
        )
        assert result.endswith("__99999")

    def test_doc_id_with_only_special_chars_raises(self) -> None:
        """
        Tests that a doc ID that becomes empty after filtering should raise
        ValueError.
        """
        with pytest.raises(ValueError, match="empty after filtering"):
            get_opensearch_doc_chunk_id(SINGLE_TENANT_STATE, "###???///", chunk_index=0)

    def test_doc_id_at_boundary_length(self) -> None:
        """
        Tests that a doc ID right at the boundary should not be hashed.
        """
        suffix = f"__{DEFAULT_MAX_CHUNK_SIZE}__0"
        suffix_len = len(suffix.encode("utf-8"))
        # Max doc ID length that won't trigger hashing (must be <
        # max_encoded_length).
        max_doc_len = MAX_DOCUMENT_ID_ENCODED_LENGTH - suffix_len - 1
        doc_id = "a" * max_doc_len
        result = get_opensearch_doc_chunk_id(SINGLE_TENANT_STATE, doc_id, chunk_index=0)
        assert doc_id in result

    def test_doc_id_at_boundary_length_multitenant(self) -> None:
        """
        Tests that a doc ID right at the boundary should not be hashed in
        multitenant mode.
        """
        suffix = f"__{DEFAULT_MAX_CHUNK_SIZE}__0"
        suffix_len = len(suffix.encode("utf-8"))
        prefix = f"{EXPECTED_SHORT_TENANT}__"
        prefix_len = len(prefix.encode("utf-8"))
        # Max doc ID length that won't trigger hashing (must be <
        # max_encoded_length).
        max_doc_len = MAX_DOCUMENT_ID_ENCODED_LENGTH - suffix_len - prefix_len - 1
        doc_id = "a" * max_doc_len
        result = get_opensearch_doc_chunk_id(MULTI_TENANT_STATE, doc_id, chunk_index=0)
        assert doc_id in result

    def test_doc_id_one_over_boundary_is_hashed(self) -> None:
        """
        Tests that a doc ID one byte over the boundary should be hashed.
        """
        suffix = f"__{DEFAULT_MAX_CHUNK_SIZE}__0"
        suffix_len = len(suffix.encode("utf-8"))
        # This length will trigger the >= check in filter_and_validate_document_id
        doc_id = "a" * (MAX_DOCUMENT_ID_ENCODED_LENGTH - suffix_len)
        result = get_opensearch_doc_chunk_id(SINGLE_TENANT_STATE, doc_id, chunk_index=0)
        assert doc_id not in result
@@ -1,76 +0,0 @@
%PDF-1.3
%<25><><EFBFBD><EFBFBD>
1 0 obj
<<
/Producer <1083d595b1>
>>
endobj
2 0 obj
<<
/Type /Pages
/Count 1
/Kids [ 4 0 R ]
>>
endobj
3 0 obj
<<
/Type /Catalog
/Pages 2 0 R
>>
endobj
4 0 obj
<<
/Type /Page
/Resources <<
/Font <<
/F1 <<
/Type /Font
/Subtype /Type1
/BaseFont /Helvetica
>>
>>
>>
/MediaBox [ 0.0 0.0 200 200 ]
/Contents 5 0 R
/Parent 2 0 R
>>
endobj
5 0 obj
<<
/Length 42
>>
stream
,N<><6~<7E>)<29><><EFBFBD><EFBFBD><EFBFBD>u<EFBFBD><0C><><EFBFBD>Zc'<27><>>8g<38><67><EFBFBD>n<EFBFBD><6E><EFBFBD><EFBFBD><EFBFBD>9"
endstream
endobj
6 0 obj
<<
/V 2
/R 3
/Length 128
/P 4294967292
/Filter /Standard
/O <6a340a292629053da84a6d8b19a5d505953b8b3fdac3d2d389fde0e354528d44>
/U <d6f0dc91c7b9de264a8d708515468e6528bf4e5e4e758a4164004e56fffa0108>
>>
endobj
xref
0 7
0000000000 65535 f
0000000015 00000 n
0000000059 00000 n
0000000118 00000 n
0000000167 00000 n
0000000348 00000 n
0000000440 00000 n
trailer
<<
/Size 7
/Root 3 0 R
/Info 1 0 R
/ID [ <6364336635356135633239323638353039306635656133623165313637366430> <6364336635356135633239323638353039306635656133623165313637366430> ]
/Encrypt 6 0 R
>>
startxref
655
%%EOF
@@ -54,12 +54,6 @@ class TestReadPdfFile:
        text, _, _ = read_pdf_file(_load("encrypted.pdf"), pdf_pass="wrong")
        assert text == ""

    def test_owner_password_only_pdf_extracts_text(self) -> None:
        """A PDF encrypted with only an owner password (no user password)
        should still yield its text content. Regression for #9754."""
        text, _, _ = read_pdf_file(_load("owner_protected.pdf"))
        assert "Hello World" in text

    def test_empty_pdf(self) -> None:
        text, _, _ = read_pdf_file(_load("empty.pdf"))
        assert text.strip() == ""
@@ -123,12 +117,6 @@ class TestIsPdfProtected:
    def test_protected_pdf(self) -> None:
        assert is_pdf_protected(_load("encrypted.pdf")) is True

    def test_owner_password_only_is_not_protected(self) -> None:
        """A PDF with only an owner password (permission restrictions) but no
        user password should NOT be considered protected — any viewer can open
        it without prompting for a password."""
        assert is_pdf_protected(_load("owner_protected.pdf")) is False

    def test_preserves_file_position(self) -> None:
        pdf = _load("simple.pdf")
        pdf.seek(42)

@@ -1,79 +0,0 @@
import io

from pptx import Presentation  # type: ignore[import-untyped]
from pptx.chart.data import CategoryChartData  # type: ignore[import-untyped]
from pptx.enum.chart import XL_CHART_TYPE  # type: ignore[import-untyped]
from pptx.util import Inches  # type: ignore[import-untyped]

from onyx.file_processing.extract_file_text import pptx_to_text


def _make_pptx_with_chart() -> io.BytesIO:
    """Create an in-memory pptx with one text slide and one chart slide."""
    prs = Presentation()

    # Slide 1: text only
    slide1 = prs.slides.add_slide(prs.slide_layouts[1])
    slide1.shapes.title.text = "Introduction"
    slide1.placeholders[1].text = "This is the first slide."

    # Slide 2: chart
    slide2 = prs.slides.add_slide(prs.slide_layouts[5])  # Blank layout
    chart_data = CategoryChartData()
    chart_data.categories = ["Q1", "Q2", "Q3"]
    chart_data.add_series("Revenue", (100, 200, 300))
    slide2.shapes.add_chart(
        XL_CHART_TYPE.COLUMN_CLUSTERED,
        Inches(1),
        Inches(1),
        Inches(6),
        Inches(4),
        chart_data,
    )

    buf = io.BytesIO()
    prs.save(buf)
    buf.seek(0)
    return buf


def _make_pptx_without_chart() -> io.BytesIO:
    """Create an in-memory pptx with a single text-only slide."""
    prs = Presentation()
    slide = prs.slides.add_slide(prs.slide_layouts[1])
    slide.shapes.title.text = "Hello World"
    slide.placeholders[1].text = "Some content here."

    buf = io.BytesIO()
    prs.save(buf)
    buf.seek(0)
    return buf


class TestPptxToText:
    def test_chart_is_omitted(self) -> None:
        # Precondition
        pptx_file = _make_pptx_with_chart()

        # Under test
        result = pptx_to_text(pptx_file)

        # Postcondition
        assert "Introduction" in result
        assert "first slide" in result
        assert "[chart omitted]" in result
        # The actual chart data should NOT appear in the output.
        assert "Revenue" not in result
        assert "Q1" not in result

    def test_text_only_pptx(self) -> None:
        # Precondition
        pptx_file = _make_pptx_without_chart()

        # Under test
        result = pptx_to_text(pptx_file)

        # Postcondition
        assert "Hello World" in result
        assert "Some content" in result
        assert "[chart omitted]" not in result
@@ -1,91 +0,0 @@
"""Tests for FileStore.delete_file error_on_missing behavior."""

from unittest.mock import MagicMock
from unittest.mock import patch

import pytest

_S3_MODULE = "onyx.file_store.file_store"
_PG_MODULE = "onyx.file_store.postgres_file_store"


def _mock_db_session() -> MagicMock:
    session = MagicMock()
    session.__enter__ = MagicMock(return_value=session)
    session.__exit__ = MagicMock(return_value=False)
    return session


# ── S3BackedFileStore ────────────────────────────────────────────────


@patch(f"{_S3_MODULE}.get_session_with_current_tenant_if_none")
@patch(f"{_S3_MODULE}.get_filerecord_by_file_id_optional", return_value=None)
def test_s3_delete_missing_file_raises_by_default(
    _mock_get_record: MagicMock,
    mock_ctx: MagicMock,
) -> None:
    from onyx.file_store.file_store import S3BackedFileStore

    mock_ctx.return_value = _mock_db_session()
    store = S3BackedFileStore(bucket_name="b")

    with pytest.raises(RuntimeError, match="does not exist"):
        store.delete_file("nonexistent")


@patch(f"{_S3_MODULE}.get_session_with_current_tenant_if_none")
@patch(f"{_S3_MODULE}.get_filerecord_by_file_id_optional", return_value=None)
@patch(f"{_S3_MODULE}.delete_filerecord_by_file_id")
def test_s3_delete_missing_file_silent_when_error_on_missing_false(
    mock_delete_record: MagicMock,
    _mock_get_record: MagicMock,
    mock_ctx: MagicMock,
) -> None:
    from onyx.file_store.file_store import S3BackedFileStore

    mock_ctx.return_value = _mock_db_session()
    store = S3BackedFileStore(bucket_name="b")

    store.delete_file("nonexistent", error_on_missing=False)

    mock_delete_record.assert_not_called()


# ── PostgresBackedFileStore ──────────────────────────────────────────


@patch(f"{_PG_MODULE}.get_session_with_current_tenant_if_none")
@patch(f"{_PG_MODULE}.get_file_content_by_file_id_optional", return_value=None)
def test_pg_delete_missing_file_raises_by_default(
    _mock_get_content: MagicMock,
    mock_ctx: MagicMock,
) -> None:
    from onyx.file_store.postgres_file_store import PostgresBackedFileStore

    mock_ctx.return_value = _mock_db_session()
    store = PostgresBackedFileStore()

    with pytest.raises(RuntimeError, match="does not exist"):
        store.delete_file("nonexistent")


@patch(f"{_PG_MODULE}.get_session_with_current_tenant_if_none")
@patch(f"{_PG_MODULE}.get_file_content_by_file_id_optional", return_value=None)
@patch(f"{_PG_MODULE}.delete_file_content_by_file_id")
@patch(f"{_PG_MODULE}.delete_filerecord_by_file_id")
def test_pg_delete_missing_file_silent_when_error_on_missing_false(
    mock_delete_record: MagicMock,
    mock_delete_content: MagicMock,
    _mock_get_content: MagicMock,
    mock_ctx: MagicMock,
) -> None:
    from onyx.file_store.postgres_file_store import PostgresBackedFileStore

    mock_ctx.return_value = _mock_db_session()
    store = PostgresBackedFileStore()

    store.delete_file("nonexistent", error_on_missing=False)

    mock_delete_record.assert_not_called()
    mock_delete_content.assert_not_called()

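The contract these four tests encode (raise on a missing file by default, silently no-op with `error_on_missing=False`, and in the no-op case issue no delete calls) can be sketched with a toy in-memory store; the class and its internals are hypothetical, only the method signature mirrors the tests:

```python
class InMemoryFileStore:
    """Toy store illustrating the error_on_missing contract tested above."""

    def __init__(self) -> None:
        self._records: dict[str, bytes] = {}

    def delete_file(self, file_id: str, error_on_missing: bool = True) -> None:
        if file_id not in self._records:
            if error_on_missing:
                raise RuntimeError(f"file {file_id!r} does not exist")
            return  # silent no-op: no delete calls are issued
        del self._records[file_id]

store = InMemoryFileStore()
store.delete_file("missing", error_on_missing=False)  # no error
```

The early `return` before any deletion work is what the `assert_not_called()` checks verify in the mocked versions.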
@@ -98,7 +98,6 @@ Useful hardening flags:
| `serve` | Serve the interactive chat TUI over SSH |
| `configure` | Configure server URL and API key |
| `validate-config` | Validate configuration and test connection |
| `install-skill` | Install the agent skill file into a project |

## Slash Commands (in TUI)

@@ -7,7 +7,6 @@ import (
	"github.com/onyx-dot-app/onyx/cli/internal/api"
	"github.com/onyx-dot-app/onyx/cli/internal/config"
	"github.com/onyx-dot-app/onyx/cli/internal/exitcodes"
	"github.com/spf13/cobra"
)

@@ -17,23 +16,16 @@ func newAgentsCmd() *cobra.Command {
	cmd := &cobra.Command{
		Use:   "agents",
		Short: "List available agents",
		Long: `List all visible agents configured on the Onyx server.

By default, output is a human-readable table with ID, name, and description.
Use --json for machine-readable output.`,
		Example: `  onyx-cli agents
  onyx-cli agents --json
  onyx-cli agents --json | jq '.[].name'`,
		RunE: func(cmd *cobra.Command, args []string) error {
			cfg := config.Load()
			if !cfg.IsConfigured() {
				return exitcodes.New(exitcodes.NotConfigured, "onyx CLI is not configured\n  Run: onyx-cli configure")
				return fmt.Errorf("onyx CLI is not configured — run 'onyx-cli configure' first")
			}

			client := api.NewClient(cfg)
			agents, err := client.ListAgents(cmd.Context())
			if err != nil {
				return fmt.Errorf("failed to list agents: %w\n  Check your connection with: onyx-cli validate-config", err)
				return fmt.Errorf("failed to list agents: %w", err)
			}

			if agentsJSON {

140
cli/cmd/ask.go
@@ -4,65 +4,33 @@ import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"os"
	"os/signal"
	"strings"
	"syscall"

	"github.com/onyx-dot-app/onyx/cli/internal/api"
	"github.com/onyx-dot-app/onyx/cli/internal/config"
	"github.com/onyx-dot-app/onyx/cli/internal/exitcodes"
	"github.com/onyx-dot-app/onyx/cli/internal/models"
	"github.com/onyx-dot-app/onyx/cli/internal/overflow"
	"github.com/spf13/cobra"
	"golang.org/x/term"
)

const defaultMaxOutputBytes = 4096

func newAskCmd() *cobra.Command {
	var (
		askAgentID int
		askJSON    bool
		askQuiet   bool
		askPrompt  string
		maxOutput  int
	)

	cmd := &cobra.Command{
		Use:   "ask [question]",
		Short: "Ask a one-shot question (non-interactive)",
		Long: `Send a one-shot question to an Onyx agent and print the response.

The question can be provided as a positional argument, via --prompt, or piped
through stdin. When stdin contains piped data, it is sent as context along
with the question from --prompt (or used as the question itself).

When stdout is not a TTY (e.g., called by a script or AI agent), output is
automatically truncated to --max-output bytes and the full response is saved
to a temp file. Set --max-output 0 to disable truncation.`,
		Args: cobra.MaximumNArgs(1),
		Example: `  onyx-cli ask "What connectors are available?"
  onyx-cli ask --agent-id 3 "Summarize our Q4 revenue"
  onyx-cli ask --json "List all users" | jq '.event.content'
  cat error.log | onyx-cli ask --prompt "Find the root cause"
  echo "what is onyx?" | onyx-cli ask`,
		Args: cobra.ExactArgs(1),
		RunE: func(cmd *cobra.Command, args []string) error {
			cfg := config.Load()
			if !cfg.IsConfigured() {
				return exitcodes.New(exitcodes.NotConfigured, "onyx CLI is not configured\n  Run: onyx-cli configure")
			}

			if askJSON && askQuiet {
				return exitcodes.New(exitcodes.BadRequest, "--json and --quiet cannot be used together")
			}

			question, err := resolveQuestion(args, askPrompt)
			if err != nil {
				return err
				return fmt.Errorf("onyx CLI is not configured — run 'onyx-cli configure' first")
			}

			question := args[0]
			agentID := cfg.DefaultAgentID
			if cmd.Flags().Changed("agent-id") {
				agentID = askAgentID
@@ -82,23 +50,9 @@ to a temp file. Set --max-output 0 to disable truncation.`,
				nil,
			)

			// Determine truncation threshold.
			isTTY := term.IsTerminal(int(os.Stdout.Fd()))
			truncateAt := 0 // 0 means no truncation
			if cmd.Flags().Changed("max-output") {
				truncateAt = maxOutput
			} else if !isTTY {
				truncateAt = defaultMaxOutputBytes
			}

			var sessionID string
			var lastErr error
			gotStop := false

			// Overflow writer: tees to stdout and optionally to a temp file.
			// In quiet mode, buffer everything and print once at the end.
			ow := &overflow.Writer{Limit: truncateAt, Quiet: askQuiet}

			for event := range ch {
				if e, ok := event.(models.SessionCreatedEvent); ok {
					sessionID = e.ChatSessionID
@@ -128,50 +82,22 @@ to a temp file. Set --max-output 0 to disable truncation.`,

				switch e := event.(type) {
				case models.MessageDeltaEvent:
					ow.Write(e.Content)
				case models.SearchStartEvent:
					if isTTY && !askQuiet {
						if e.IsInternetSearch {
							fmt.Fprintf(os.Stderr, "\033[2mSearching the web...\033[0m\n")
						} else {
							fmt.Fprintf(os.Stderr, "\033[2mSearching documents...\033[0m\n")
						}
					}
				case models.SearchQueriesEvent:
					if isTTY && !askQuiet {
						for _, q := range e.Queries {
							fmt.Fprintf(os.Stderr, "\033[2m  → %s\033[0m\n", q)
						}
					}
				case models.SearchDocumentsEvent:
					if isTTY && !askQuiet && len(e.Documents) > 0 {
						fmt.Fprintf(os.Stderr, "\033[2mFound %d documents\033[0m\n", len(e.Documents))
					}
				case models.ReasoningStartEvent:
					if isTTY && !askQuiet {
						fmt.Fprintf(os.Stderr, "\033[2mThinking...\033[0m\n")
					}
				case models.ToolStartEvent:
					if isTTY && !askQuiet && e.ToolName != "" {
						fmt.Fprintf(os.Stderr, "\033[2mUsing %s...\033[0m\n", e.ToolName)
					}
					fmt.Print(e.Content)
				case models.ErrorEvent:
					ow.Finish()
					return fmt.Errorf("%s", e.Error)
				case models.StopEvent:
					ow.Finish()
					fmt.Println()
					return nil
				}
			}

			if !askJSON {
				ow.Finish()
			}

			if ctx.Err() != nil {
				if sessionID != "" {
					client.StopChatSession(context.Background(), sessionID)
				}
				if !askJSON {
					fmt.Println()
				}
				return nil
			}

@@ -179,56 +105,20 @@ to a temp file. Set --max-output 0 to disable truncation.`,
				return lastErr
			}
			if !gotStop {
				if !askJSON {
					fmt.Println()
				}
				return fmt.Errorf("stream ended unexpectedly")
			}
			if !askJSON {
				fmt.Println()
			}
			return nil
		},
	}

	cmd.Flags().IntVar(&askAgentID, "agent-id", 0, "Agent ID to use")
	cmd.Flags().BoolVar(&askJSON, "json", false, "Output raw JSON events")
	cmd.Flags().BoolVarP(&askQuiet, "quiet", "q", false, "Buffer output and print once at end (no streaming)")
	cmd.Flags().StringVar(&askPrompt, "prompt", "", "Question text (use with piped stdin context)")
	cmd.Flags().IntVar(&maxOutput, "max-output", defaultMaxOutputBytes,
		"Max bytes to print before truncating (0 to disable, auto-enabled for non-TTY)")
	// Suppress cobra's default error/usage on RunE errors
	return cmd
}

// resolveQuestion builds the final question string from args, --prompt, and stdin.
func resolveQuestion(args []string, prompt string) (string, error) {
	hasArg := len(args) > 0
	hasPrompt := prompt != ""
	hasStdin := !term.IsTerminal(int(os.Stdin.Fd()))

	if hasArg && hasPrompt {
		return "", exitcodes.New(exitcodes.BadRequest, "specify the question as an argument or --prompt, not both")
	}

	var stdinContent string
	if hasStdin {
		const maxStdinBytes = 10 * 1024 * 1024 // 10MB
		data, err := io.ReadAll(io.LimitReader(os.Stdin, maxStdinBytes))
		if err != nil {
			return "", fmt.Errorf("failed to read stdin: %w", err)
		}
		stdinContent = strings.TrimSpace(string(data))
	}

	switch {
	case hasArg && stdinContent != "":
		// arg is the question, stdin is context
		return args[0] + "\n\n" + stdinContent, nil
	case hasArg:
		return args[0], nil
	case hasPrompt && stdinContent != "":
		// --prompt is the question, stdin is context
		return prompt + "\n\n" + stdinContent, nil
	case hasPrompt:
		return prompt, nil
	case stdinContent != "":
		return stdinContent, nil
	default:
		return "", exitcodes.New(exitcodes.BadRequest, "no question provided\n  Usage: onyx-cli ask \"your question\"\n  Or: echo \"context\" | onyx-cli ask --prompt \"your question\"")
	}
}

@@ -10,16 +10,9 @@ import (
)

func newChatCmd() *cobra.Command {
	var noStreamMarkdown bool

	cmd := &cobra.Command{
	return &cobra.Command{
		Use:   "chat",
		Short: "Launch the interactive chat TUI (default)",
		Long: `Launch the interactive terminal UI for chatting with your Onyx agent.
This is the default command when no subcommand is specified. On first run,
an interactive setup wizard will guide you through configuration.`,
		Example: `  onyx-cli chat
  onyx-cli`,
		RunE: func(cmd *cobra.Command, args []string) error {
			cfg := config.Load()

@@ -32,12 +25,6 @@ an interactive setup wizard will guide you through configuration.`,
				cfg = *result
			}

			// CLI flag overrides config/env
			if cmd.Flags().Changed("no-stream-markdown") {
				v := !noStreamMarkdown
				cfg.Features.StreamMarkdown = &v
			}

			starprompt.MaybePrompt()

			m := tui.NewModel(cfg)
@@ -46,8 +33,4 @@ an interactive setup wizard will guide you through configuration.`,
			return err
		},
	}

	cmd.Flags().BoolVar(&noStreamMarkdown, "no-stream-markdown", false, "Disable progressive markdown rendering during streaming")

	return cmd
}

@@ -1,126 +1,19 @@
package cmd

import (
	"context"
	"errors"
	"fmt"
	"io"
	"os"
	"strings"
	"time"

	"github.com/onyx-dot-app/onyx/cli/internal/api"
	"github.com/onyx-dot-app/onyx/cli/internal/config"
	"github.com/onyx-dot-app/onyx/cli/internal/exitcodes"
	"github.com/onyx-dot-app/onyx/cli/internal/onboarding"
	"github.com/spf13/cobra"
	"golang.org/x/term"
)

func newConfigureCmd() *cobra.Command {
	var (
		serverURL   string
		apiKey      string
		apiKeyStdin bool
		dryRun      bool
	)

	cmd := &cobra.Command{
	return &cobra.Command{
		Use:   "configure",
		Short: "Configure server URL and API key",
		Long: `Set up the Onyx CLI with your server URL and API key.

When --server-url and --api-key are both provided, the configuration is saved
non-interactively (useful for scripts and AI agents). Otherwise, an interactive
setup wizard is launched.

If --api-key is omitted but stdin has piped data, the API key is read from
stdin automatically. You can also use --api-key-stdin to make this explicit.
This avoids leaking the key in shell history.

Use --dry-run to test the connection without saving the configuration.`,
		Example: `  onyx-cli configure
  onyx-cli configure --server-url https://my-onyx.com --api-key sk-...
  echo "$ONYX_API_KEY" | onyx-cli configure --server-url https://my-onyx.com
  echo "$ONYX_API_KEY" | onyx-cli configure --server-url https://my-onyx.com --api-key-stdin
  onyx-cli configure --server-url https://my-onyx.com --api-key sk-... --dry-run`,
		RunE: func(cmd *cobra.Command, args []string) error {
			// Read API key from stdin if piped (implicit) or --api-key-stdin (explicit)
			if apiKeyStdin && apiKey != "" {
				return exitcodes.New(exitcodes.BadRequest, "--api-key and --api-key-stdin cannot be used together")
			}
			if (apiKey == "" && !term.IsTerminal(int(os.Stdin.Fd()))) || apiKeyStdin {
				data, err := io.ReadAll(os.Stdin)
				if err != nil {
					return fmt.Errorf("failed to read API key from stdin: %w", err)
				}
				apiKey = strings.TrimSpace(string(data))
			}

			if serverURL != "" && apiKey != "" {
				return configureNonInteractive(serverURL, apiKey, dryRun)
			}

			if dryRun {
				return exitcodes.New(exitcodes.BadRequest, "--dry-run requires --server-url and --api-key")
			}

			if serverURL != "" || apiKey != "" {
				return exitcodes.New(exitcodes.BadRequest, "both --server-url and --api-key are required for non-interactive setup\n  Run 'onyx-cli configure' without flags for interactive setup")
			}

			cfg := config.Load()
			onboarding.Run(&cfg)
			return nil
		},
	}

	cmd.Flags().StringVar(&serverURL, "server-url", "", "Onyx server URL (e.g., https://cloud.onyx.app)")
	cmd.Flags().StringVar(&apiKey, "api-key", "", "API key for authentication (or pipe via stdin)")
	cmd.Flags().BoolVar(&apiKeyStdin, "api-key-stdin", false, "Read API key from stdin (explicit; also happens automatically when stdin is piped)")
	cmd.Flags().BoolVar(&dryRun, "dry-run", false, "Test connection without saving config (requires --server-url and --api-key)")

	return cmd
}

func configureNonInteractive(serverURL, apiKey string, dryRun bool) error {
	cfg := config.OnyxCliConfig{
		ServerURL:      serverURL,
		APIKey:         apiKey,
		DefaultAgentID: 0,
	}

	// Preserve existing default agent ID from disk (not env overrides)
	if existing := config.LoadFromDisk(); existing.DefaultAgentID != 0 {
		cfg.DefaultAgentID = existing.DefaultAgentID
	}

	// Test connection
	client := api.NewClient(cfg)
	ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
	defer cancel()

	if err := client.TestConnection(ctx); err != nil {
		var authErr *api.AuthError
		if errors.As(err, &authErr) {
			return exitcodes.Newf(exitcodes.AuthFailure, "authentication failed: %v\n  Check your API key", err)
		}
		return exitcodes.Newf(exitcodes.Unreachable, "connection failed: %v\n  Check your server URL", err)
	}

	if dryRun {
		fmt.Printf("Server: %s\n", serverURL)
		fmt.Println("Status: connected and authenticated")
		fmt.Println("Dry run: config was NOT saved")
		return nil
	}

	if err := config.Save(cfg); err != nil {
		return fmt.Errorf("could not save config: %w", err)
	}

	fmt.Printf("Config: %s\n", config.ConfigFilePath())
	fmt.Printf("Server: %s\n", serverURL)
	fmt.Println("Status: connected and authenticated")
	return nil
}

@@ -1,20 +0,0 @@
package cmd

import (
	"fmt"

	"github.com/onyx-dot-app/onyx/cli/internal/config"
	"github.com/spf13/cobra"
)

func newExperimentsCmd() *cobra.Command {
	return &cobra.Command{
		Use:   "experiments",
		Short: "List experimental features and their status",
		RunE: func(cmd *cobra.Command, args []string) error {
			cfg := config.Load()
			_, _ = fmt.Fprintln(cmd.OutOrStdout(), config.ExperimentsText(cfg.Features))
			return nil
		},
	}
}

@@ -1,176 +0,0 @@
package cmd

import (
	"fmt"
	"os"
	"path/filepath"

	"github.com/onyx-dot-app/onyx/cli/internal/embedded"
	"github.com/onyx-dot-app/onyx/cli/internal/fsutil"
	"github.com/spf13/cobra"
)

// agentSkillDirs maps agent names to their skill directory paths (relative to
// the project or home root). "Universal" agents like Cursor and Codex read
// from .agents/skills directly, so they don't need their own entry here.
var agentSkillDirs = map[string]string{
	"claude-code": filepath.Join(".claude", "skills"),
}

const (
	canonicalDir = ".agents/skills"
	skillName    = "onyx-cli"
)

func newInstallSkillCmd() *cobra.Command {
	var (
		global   bool
		copyMode bool
		agents   []string
	)

	cmd := &cobra.Command{
		Use:   "install-skill",
		Short: "Install the Onyx CLI agent skill file",
		Long: `Install the bundled SKILL.md so that AI coding agents can discover and use
the Onyx CLI as a tool.

Files are written to the canonical .agents/skills/onyx-cli/ directory. For
agents that use their own skill directory (e.g. Claude Code uses .claude/skills/),
a symlink is created pointing back to the canonical copy.

By default the skill is installed at the project level (current directory).
Use --global to install under your home directory instead.

Use --copy to write independent copies instead of symlinks.
Use --agent to target specific agents (can be repeated).`,
		Example: `  onyx-cli install-skill
  onyx-cli install-skill --global
  onyx-cli install-skill --agent claude-code
  onyx-cli install-skill --copy`,
		RunE: func(cmd *cobra.Command, args []string) error {
			base, err := installBase(global)
			if err != nil {
				return err
			}

			// Write the canonical copy.
			canonicalSkillDir := filepath.Join(base, canonicalDir, skillName)
			dest := filepath.Join(canonicalSkillDir, "SKILL.md")
			content := []byte(embedded.SkillMD)

			status, err := fsutil.CompareFile(dest, content)
			if err != nil {
				return err
			}
			switch status {
			case fsutil.StatusUpToDate:
				_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Up to date %s\n", dest)
			case fsutil.StatusDiffers:
				_, _ = fmt.Fprintf(cmd.ErrOrStderr(), "Warning: overwriting modified %s\n", dest)
				if err := os.WriteFile(dest, content, 0o644); err != nil {
					return fmt.Errorf("could not write skill file: %w", err)
				}
				_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Installed %s\n", dest)
			default: // statusMissing
				if err := os.MkdirAll(canonicalSkillDir, 0o755); err != nil {
					return fmt.Errorf("could not create directory: %w", err)
				}
				if err := os.WriteFile(dest, content, 0o644); err != nil {
					return fmt.Errorf("could not write skill file: %w", err)
				}
				_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Installed %s\n", dest)
			}

			// Determine which agents to link.
			targets := agentSkillDirs
			if len(agents) > 0 {
				targets = make(map[string]string)
				for _, a := range agents {
					dir, ok := agentSkillDirs[a]
					if !ok {
						_, _ = fmt.Fprintf(cmd.ErrOrStderr(), "Unknown agent %q (skipped) — known agents:", a)
						for name := range agentSkillDirs {
							_, _ = fmt.Fprintf(cmd.ErrOrStderr(), " %s", name)
						}
						_, _ = fmt.Fprintln(cmd.ErrOrStderr())
						continue
					}
					targets[a] = dir
				}
			}

			// Create symlinks (or copies) from agent-specific dirs to canonical.
			for name, skillsDir := range targets {
				agentSkillDir := filepath.Join(base, skillsDir, skillName)

				if copyMode {
					copyDest := filepath.Join(agentSkillDir, "SKILL.md")
					if err := fsutil.EnsureDirForCopy(agentSkillDir); err != nil {
						return fmt.Errorf("could not prepare %s directory: %w", name, err)
					}
					if err := os.MkdirAll(agentSkillDir, 0o755); err != nil {
						return fmt.Errorf("could not create %s directory: %w", name, err)
					}
					if err := os.WriteFile(copyDest, []byte(embedded.SkillMD), 0o644); err != nil {
						return fmt.Errorf("could not write %s skill file: %w", name, err)
					}
					_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Copied %s\n", copyDest)
					continue
				}

				// Compute relative symlink target. Symlinks resolve relative to
				// the parent directory of the link, not the link itself.
				rel, err := filepath.Rel(filepath.Dir(agentSkillDir), canonicalSkillDir)
				if err != nil {
					return fmt.Errorf("could not compute relative path for %s: %w", name, err)
				}

				if err := os.MkdirAll(filepath.Dir(agentSkillDir), 0o755); err != nil {
					return fmt.Errorf("could not create %s directory: %w", name, err)
				}

				// Remove existing symlink/dir before creating.
				_ = os.Remove(agentSkillDir)

				if err := os.Symlink(rel, agentSkillDir); err != nil {
					// Fall back to copy if symlink fails (e.g. Windows without dev mode).
					copyDest := filepath.Join(agentSkillDir, "SKILL.md")
					if mkErr := os.MkdirAll(agentSkillDir, 0o755); mkErr != nil {
						return fmt.Errorf("could not create %s directory: %w", name, mkErr)
					}
					if wErr := os.WriteFile(copyDest, []byte(embedded.SkillMD), 0o644); wErr != nil {
						return fmt.Errorf("could not write %s skill file: %w", name, wErr)
					}
					_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Copied %s (symlink failed)\n", copyDest)
					continue
				}
				_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Linked %s -> %s\n", agentSkillDir, rel)
			}

			return nil
		},
	}

	cmd.Flags().BoolVarP(&global, "global", "g", false, "Install to home directory instead of project")
	cmd.Flags().BoolVar(&copyMode, "copy", false, "Copy files instead of symlinking")
	cmd.Flags().StringSliceVarP(&agents, "agent", "a", nil, "Target specific agents (e.g. claude-code)")

	return cmd
}

func installBase(global bool) (string, error) {
	if global {
		home, err := os.UserHomeDir()
		if err != nil {
			return "", fmt.Errorf("could not determine home directory: %w", err)
		}
		return home, nil
	}
	cwd, err := os.Getwd()
	if err != nil {
		return "", fmt.Errorf("could not determine working directory: %w", err)
	}
	return cwd, nil
}

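The relative-symlink computation above (filepath.Rel against the link's parent directory, because symlinks resolve relative to the directory containing the link) has a direct stdlib equivalent in Python, shown here with the same two paths used by the command:

```python
import os

# Symlinks resolve relative to the directory that contains the link, so the
# target must be computed relative to the link's PARENT, mirroring
# filepath.Rel(filepath.Dir(agentSkillDir), canonicalSkillDir) above.
link = os.path.join(".claude", "skills", "onyx-cli")
canonical = os.path.join(".agents", "skills", "onyx-cli")
rel = os.path.relpath(canonical, os.path.dirname(link))
print(rel)  # on POSIX: ../../.agents/skills/onyx-cli
```

Computing the path relative to the link itself (rather than its parent) would produce a target with one `..` too many, which is why the comment in the Go code calls this out.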
@@ -97,8 +97,6 @@ func Execute() error {
	rootCmd.AddCommand(newConfigureCmd())
	rootCmd.AddCommand(newValidateConfigCmd())
	rootCmd.AddCommand(newServeCmd())
	rootCmd.AddCommand(newInstallSkillCmd())
	rootCmd.AddCommand(newExperimentsCmd())

	// Default command is chat, but intercept --version first
	rootCmd.RunE = func(cmd *cobra.Command, args []string) error {

@@ -23,7 +23,6 @@ import (
	"github.com/charmbracelet/wish/ratelimiter"
	"github.com/onyx-dot-app/onyx/cli/internal/api"
	"github.com/onyx-dot-app/onyx/cli/internal/config"
	"github.com/onyx-dot-app/onyx/cli/internal/exitcodes"
	"github.com/onyx-dot-app/onyx/cli/internal/tui"
	"github.com/spf13/cobra"
	"golang.org/x/time/rate"
@@ -296,15 +295,15 @@ provided via the ONYX_API_KEY environment variable to skip the prompt:
The server URL is taken from the server operator's config. The server
auto-generates an Ed25519 host key on first run if the key file does not
already exist. The host key path can also be set via the ONYX_SSH_HOST_KEY
environment variable (the --host-key flag takes precedence).`,
		Example: `  onyx-cli serve --port 2222
  ssh localhost -p 2222
  onyx-cli serve --host 0.0.0.0 --port 2222
  onyx-cli serve --idle-timeout 30m --max-session-timeout 2h`,
environment variable (the --host-key flag takes precedence).

Example:
  onyx-cli serve --port 2222
  ssh localhost -p 2222`,
		RunE: func(cmd *cobra.Command, args []string) error {
			serverCfg := config.Load()
			if serverCfg.ServerURL == "" {
				return exitcodes.New(exitcodes.NotConfigured, "server URL is not configured\n  Run: onyx-cli configure")
				return fmt.Errorf("server URL is not configured; run 'onyx-cli configure' first")
			}
			if !cmd.Flags().Changed("host-key") {
				if v := os.Getenv(config.EnvSSHHostKey); v != "" {

@@ -2,13 +2,11 @@ package cmd

import (
	"context"
	"errors"
	"fmt"
	"time"

	"github.com/onyx-dot-app/onyx/cli/internal/api"
	"github.com/onyx-dot-app/onyx/cli/internal/config"
	"github.com/onyx-dot-app/onyx/cli/internal/exitcodes"
	"github.com/onyx-dot-app/onyx/cli/internal/version"
	log "github.com/sirupsen/logrus"
	"github.com/spf13/cobra"
@@ -18,21 +16,17 @@ func newValidateConfigCmd() *cobra.Command {
	return &cobra.Command{
		Use:   "validate-config",
		Short: "Validate configuration and test server connection",
		Long: `Check that the CLI is configured, the server is reachable, and the API key
is valid. Also reports the server version and warns if it is below the
minimum required.`,
		Example: `  onyx-cli validate-config`,
		RunE: func(cmd *cobra.Command, args []string) error {
			// Check config file
			if !config.ConfigExists() {
				return exitcodes.Newf(exitcodes.NotConfigured, "config file not found at %s\n  Run: onyx-cli configure", config.ConfigFilePath())
				return fmt.Errorf("config file not found at %s\n  Run 'onyx-cli configure' to set up", config.ConfigFilePath())
			}

			cfg := config.Load()

			// Check API key
			if !cfg.IsConfigured() {
				return exitcodes.New(exitcodes.NotConfigured, "API key is missing\n  Run: onyx-cli configure")
				return fmt.Errorf("API key is missing\n  Run 'onyx-cli configure' to set up")
			}

			_, _ = fmt.Fprintf(cmd.OutOrStdout(), "Config: %s\n", config.ConfigFilePath())
@@ -41,11 +35,7 @@ minimum required.`,
			// Test connection
			client := api.NewClient(cfg)
			if err := client.TestConnection(cmd.Context()); err != nil {
				var authErr *api.AuthError
				if errors.As(err, &authErr) {
					return exitcodes.Newf(exitcodes.AuthFailure, "authentication failed: %v\n  Reconfigure with: onyx-cli configure", err)
				}
				return exitcodes.Newf(exitcodes.Unreachable, "connection failed: %v\n  Reconfigure with: onyx-cli configure", err)
				return fmt.Errorf("connection failed: %w", err)
			}

			_, _ = fmt.Fprintln(cmd.OutOrStdout(), "Status: connected and authenticated")

@@ -149,12 +149,12 @@ func (c *Client) TestConnection(ctx context.Context) error {

	if resp2.StatusCode == 401 || resp2.StatusCode == 403 {
		if isHTML || strings.Contains(respServer, "awselb") {
			return &AuthError{Message: fmt.Sprintf("HTTP %d from a reverse proxy (not the Onyx backend).\n  Check your deployment's ingress / proxy configuration", resp2.StatusCode)}
			return fmt.Errorf("HTTP %d from a reverse proxy (not the Onyx backend).\n  Check your deployment's ingress / proxy configuration", resp2.StatusCode)
		}
		if resp2.StatusCode == 401 {
			return &AuthError{Message: fmt.Sprintf("invalid API key or token.\n  %s", body)}
			return fmt.Errorf("invalid API key or token.\n  %s", body)
		}
		return &AuthError{Message: fmt.Sprintf("access denied — check that the API key is valid.\n  %s", body)}
		return fmt.Errorf("access denied — check that the API key is valid.\n  %s", body)
	}

	detail := fmt.Sprintf("HTTP %d", resp2.StatusCode)

@@ -11,12 +11,3 @@ type OnyxAPIError struct {
|
||||
func (e *OnyxAPIError) Error() string {
|
||||
return fmt.Sprintf("HTTP %d: %s", e.StatusCode, e.Detail)
|
||||
}
|
||||
|
||||
// AuthError is returned when authentication or authorization fails.
|
||||
type AuthError struct {
|
||||
Message string
|
||||
}
|
||||
|
||||
func (e *AuthError) Error() string {
|
||||
return e.Message
|
||||
}
|
||||
|
||||
@@ -9,47 +9,28 @@ import (
)

const (
    EnvServerURL = "ONYX_SERVER_URL"
    EnvAPIKey = "ONYX_API_KEY"
    EnvAgentID = "ONYX_PERSONA_ID"
    EnvSSHHostKey = "ONYX_SSH_HOST_KEY"
    EnvStreamMarkdown = "ONYX_STREAM_MARKDOWN"
    EnvServerURL = "ONYX_SERVER_URL"
    EnvAPIKey = "ONYX_API_KEY"
    EnvAgentID = "ONYX_PERSONA_ID"
    EnvSSHHostKey = "ONYX_SSH_HOST_KEY"
)

// Features holds experimental feature flags for the CLI.
type Features struct {
    // StreamMarkdown enables progressive markdown rendering during streaming,
    // so output is formatted as it arrives rather than after completion.
    // nil means use the app default (true).
    StreamMarkdown *bool `json:"stream_markdown,omitempty"`
}

// OnyxCliConfig holds the CLI configuration.
type OnyxCliConfig struct {
    ServerURL string `json:"server_url"`
    APIKey string `json:"api_key"`
    DefaultAgentID int `json:"default_persona_id"`
    Features Features `json:"features,omitempty"`
    ServerURL string `json:"server_url"`
    APIKey string `json:"api_key"`
    DefaultAgentID int `json:"default_persona_id"`
}

// DefaultConfig returns a config with default values.
func DefaultConfig() OnyxCliConfig {
    return OnyxCliConfig{
        ServerURL: "https://cloud.onyx.app",
        APIKey: "",
        ServerURL: "https://cloud.onyx.app",
        APIKey: "",
        DefaultAgentID: 0,
    }
}

// StreamMarkdownEnabled returns whether stream markdown is enabled,
// defaulting to true when the user hasn't set an explicit preference.
func (f Features) StreamMarkdownEnabled() bool {
    if f.StreamMarkdown != nil {
        return *f.StreamMarkdown
    }
    return true
}

// IsConfigured returns true if the config has an API key.
func (c OnyxCliConfig) IsConfigured() bool {
    return c.APIKey != ""
@@ -78,10 +59,8 @@ func ConfigExists() bool {
    return err == nil
}

// LoadFromDisk reads config from the file only, without applying environment
// variable overrides. Use this when you need the persisted config values
// (e.g., to preserve them during a save operation).
func LoadFromDisk() OnyxCliConfig {
// Load reads config from file and applies environment variable overrides.
func Load() OnyxCliConfig {
    cfg := DefaultConfig()

    data, err := os.ReadFile(ConfigFilePath())
@@ -91,13 +70,6 @@ func LoadFromDisk() OnyxCliConfig {
        }
    }

    return cfg
}

// Load reads config from file and applies environment variable overrides.
func Load() OnyxCliConfig {
    cfg := LoadFromDisk()

    // Environment overrides
    if v := os.Getenv(EnvServerURL); v != "" {
        cfg.ServerURL = v
@@ -110,13 +82,6 @@ func Load() OnyxCliConfig {
            cfg.DefaultAgentID = id
        }
    }
    if v := os.Getenv(EnvStreamMarkdown); v != "" {
        if b, err := strconv.ParseBool(v); err == nil {
            cfg.Features.StreamMarkdown = &b
        } else {
            fmt.Fprintf(os.Stderr, "warning: invalid value %q for %s (expected true/false), ignoring\n", v, EnvStreamMarkdown)
        }
    }

    return cfg
}

@@ -9,7 +9,7 @@ import (

func clearEnvVars(t *testing.T) {
    t.Helper()
    for _, key := range []string{EnvServerURL, EnvAPIKey, EnvAgentID, EnvStreamMarkdown} {
    for _, key := range []string{EnvServerURL, EnvAPIKey, EnvAgentID} {
        t.Setenv(key, "")
        if err := os.Unsetenv(key); err != nil {
            t.Fatal(err)
@@ -199,48 +199,6 @@ func TestSaveAndReload(t *testing.T) {
    }
}

func TestDefaultFeaturesStreamMarkdownNil(t *testing.T) {
    cfg := DefaultConfig()
    if cfg.Features.StreamMarkdown != nil {
        t.Error("expected StreamMarkdown to be nil by default")
    }
    if !cfg.Features.StreamMarkdownEnabled() {
        t.Error("expected StreamMarkdownEnabled() to return true when nil")
    }
}

func TestEnvOverrideStreamMarkdownFalse(t *testing.T) {
    clearEnvVars(t)
    dir := t.TempDir()
    t.Setenv("XDG_CONFIG_HOME", dir)
    t.Setenv(EnvStreamMarkdown, "false")

    cfg := Load()
    if cfg.Features.StreamMarkdown == nil || *cfg.Features.StreamMarkdown {
        t.Error("expected StreamMarkdown=false from env override")
    }
}

func TestLoadFeaturesFromFile(t *testing.T) {
    clearEnvVars(t)
    dir := t.TempDir()
    t.Setenv("XDG_CONFIG_HOME", dir)

    data, _ := json.Marshal(map[string]interface{}{
        "server_url": "https://example.com",
        "api_key": "key",
        "features": map[string]interface{}{
            "stream_markdown": true,
        },
    })
    writeConfig(t, dir, data)

    cfg := Load()
    if cfg.Features.StreamMarkdown == nil || !*cfg.Features.StreamMarkdown {
        t.Error("expected StreamMarkdown=true from config file")
    }
}

func TestSaveCreatesParentDirs(t *testing.T) {
    clearEnvVars(t)
    dir := t.TempDir()

@@ -1,46 +0,0 @@
package config

import "fmt"

// Experiment describes an experimental feature flag.
type Experiment struct {
    Name string
    Flag string // CLI flag name
    EnvVar string // environment variable name
    Config string // JSON path in config file
    Enabled bool
    Desc string
}

// Experiments returns the list of available experimental features
// with their current status based on the given feature flags.
func Experiments(f Features) []Experiment {
    return []Experiment{
        {
            Name: "Stream Markdown",
            Flag: "--no-stream-markdown",
            EnvVar: EnvStreamMarkdown,
            Config: "features.stream_markdown",
            Enabled: f.StreamMarkdownEnabled(),
            Desc: "Render markdown progressively as the response streams in (enabled by default)",
        },
    }
}

// ExperimentsText formats the experiments list for display.
func ExperimentsText(f Features) string {
    exps := Experiments(f)
    text := "Experimental Features\n\n"
    for _, e := range exps {
        status := "off"
        if e.Enabled {
            status = "on"
        }
        text += fmt.Sprintf(" %-20s [%s]\n", e.Name, status)
        text += fmt.Sprintf(" %s\n", e.Desc)
        text += fmt.Sprintf(" flag: %s env: %s config: %s\n\n", e.Flag, e.EnvVar, e.Config)
    }
    text += "Toggle via CLI flag, environment variable, or config file.\n"
    text += "Example: onyx-cli chat --no-stream-markdown"
    return text
}
@@ -1,187 +0,0 @@
---
name: onyx-cli
description: Query the Onyx knowledge base using the onyx-cli command. Use when the user wants to search company documents, ask questions about internal knowledge, query connected data sources, or look up information stored in Onyx.
---

# Onyx CLI — Agent Tool

Onyx is an enterprise search and Gen-AI platform that connects to company documents, apps, and people. The `onyx-cli` CLI provides non-interactive commands to query the Onyx knowledge base and list available agents.

## Prerequisites

### 1. Check if installed

```bash
which onyx-cli
```

### 2. Install (if needed)

**Primary — pip:**

```bash
pip install onyx-cli
```

**From source (Go):**

```bash
go build -o onyx-cli github.com/onyx-dot-app/onyx/cli && sudo mv onyx-cli /usr/local/bin/
```

### 3. Check if configured

```bash
onyx-cli validate-config
```

This checks the config file exists, API key is present, and tests the server connection via `/api/me`. Exit code 0 on success, non-zero with a descriptive error on failure.

If unconfigured, you have two options:

**Option A — Interactive setup (requires user input):**

```bash
onyx-cli configure
```

This prompts for the Onyx server URL and API key, tests the connection, and saves config.

**Option B — Environment variables (non-interactive, preferred for agents):**

```bash
export ONYX_SERVER_URL="https://your-onyx-server.com" # default: https://cloud.onyx.app
export ONYX_API_KEY="your-api-key"
```

Environment variables override the config file. If these are set, no config file is needed.

| Variable | Required | Description |
| ----------------- | -------- | -------------------------------------------------------- |
| `ONYX_SERVER_URL` | No | Onyx server base URL (default: `https://cloud.onyx.app`) |
| `ONYX_API_KEY` | Yes | API key for authentication |
| `ONYX_PERSONA_ID` | No | Default agent/persona ID |

If neither the config file nor environment variables are set, tell the user that `onyx-cli` needs to be configured and ask them to either:

- Run `onyx-cli configure` interactively, or
- Set `ONYX_SERVER_URL` and `ONYX_API_KEY` environment variables

## Commands

### Validate configuration

```bash
onyx-cli validate-config
```

Checks config file exists, API key is present, and tests the server connection. Use this before `ask` or `agents` to confirm the CLI is properly set up.

### List available agents

```bash
onyx-cli agents
```

Prints a table of agent IDs, names, and descriptions. Use `--json` for structured output:

```bash
onyx-cli agents --json
```

Use agent IDs with `ask --agent-id` to query a specific agent.

### Basic query (plain text output)

```bash
onyx-cli ask "What is our company's PTO policy?"
```

Streams the answer as plain text to stdout. Exit code 0 on success, non-zero on error.

### JSON output (structured events)

```bash
onyx-cli ask --json "What authentication methods do we support?"
```

Outputs JSON-encoded parsed stream events (one object per line). Key event objects include message deltas, stop, errors, search-start, and citation payloads.

Each line is a JSON object with this envelope:

```json
{"type": "<event_type>", "event": { ... }}
```

| Event Type | Description |
| ------------------- | -------------------------------------------------------------------- |
| `message_delta` | Content token — concatenate all `content` fields for the full answer |
| `stop` | Stream complete |
| `error` | Error with `error` message field |
| `search_tool_start` | Onyx started searching documents |
| `citation_info` | Source citation — see shape below |

`citation_info` event shape:

```json
{
  "type": "citation_info",
  "event": {
    "citation_number": 1,
    "document_id": "abc123def456",
    "placement": { "turn_index": 0, "tab_index": 0, "sub_turn_index": null }
  }
}
```

`placement` is metadata about where in the conversation the citation appeared and can be ignored for most use cases.

### Specify an agent

```bash
onyx-cli ask --agent-id 5 "Summarize our Q4 roadmap"
```

Uses a specific Onyx agent/persona instead of the default.

### All flags

| Flag | Type | Description |
| ------------ | ---- | ---------------------------------------------- |
| `--agent-id` | int | Agent ID to use (overrides default) |
| `--json` | bool | Output raw NDJSON events instead of plain text |

## Statelessness

Each `onyx-cli ask` call creates an independent chat session. There is no built-in way to chain context across multiple `ask` invocations — every call starts fresh. If you need multi-turn conversation with memory, use the interactive TUI (`onyx-cli` or `onyx-cli chat`) instead.

## When to Use

Use `onyx-cli ask` when:

- The user asks about company-specific information (policies, docs, processes)
- You need to search internal knowledge bases or connected data sources
- The user references Onyx, asks you to "search Onyx", or wants to query their documents
- You need context from company wikis, Confluence, Google Drive, Slack, or other connected sources

Do NOT use when:

- The question is about general programming knowledge (use your own knowledge)
- The user is asking about code in the current repository (use grep/read tools)
- The user hasn't mentioned Onyx and the question doesn't require internal company data

## Examples

```bash
# Simple question
onyx-cli ask "What are the steps to deploy to production?"

# Get structured output for parsing
onyx-cli ask --json "List all active API integrations"

# Use a specialized agent
onyx-cli ask --agent-id 3 "What were the action items from last week's standup?"

# Pipe the answer into another command
onyx-cli ask "What is the database schema for users?" | head -20
```
@@ -1,7 +0,0 @@
// Package embedded holds files that are compiled into the onyx-cli binary.
package embedded

import _ "embed"

//go:embed SKILL.md
var SkillMD string
@@ -1,33 +0,0 @@
// Package exitcodes defines semantic exit codes for the Onyx CLI.
package exitcodes

import "fmt"

const (
    Success = 0
    General = 1
    BadRequest = 2 // invalid args / command-line errors (convention)
    NotConfigured = 3
    AuthFailure = 4
    Unreachable = 5
)

// ExitError wraps an error with a specific exit code.
type ExitError struct {
    Code int
    Err error
}

func (e *ExitError) Error() string {
    return e.Err.Error()
}

// New creates an ExitError with the given code and message.
func New(code int, msg string) *ExitError {
    return &ExitError{Code: code, Err: fmt.Errorf("%s", msg)}
}

// Newf creates an ExitError with a formatted message.
func Newf(code int, format string, args ...any) *ExitError {
    return &ExitError{Code: code, Err: fmt.Errorf(format, args...)}
}
@@ -1,40 +0,0 @@
package exitcodes

import (
    "errors"
    "fmt"
    "testing"
)

func TestExitError_Error(t *testing.T) {
    e := New(NotConfigured, "not configured")
    if e.Error() != "not configured" {
        t.Fatalf("expected 'not configured', got %q", e.Error())
    }
    if e.Code != NotConfigured {
        t.Fatalf("expected code %d, got %d", NotConfigured, e.Code)
    }
}

func TestExitError_Newf(t *testing.T) {
    e := Newf(Unreachable, "cannot reach %s", "server")
    if e.Error() != "cannot reach server" {
        t.Fatalf("expected 'cannot reach server', got %q", e.Error())
    }
    if e.Code != Unreachable {
        t.Fatalf("expected code %d, got %d", Unreachable, e.Code)
    }
}

func TestExitError_ErrorsAs(t *testing.T) {
    e := New(BadRequest, "bad input")
    wrapped := fmt.Errorf("wrapper: %w", e)

    var exitErr *ExitError
    if !errors.As(wrapped, &exitErr) {
        t.Fatal("errors.As should find ExitError")
    }
    if exitErr.Code != BadRequest {
        t.Fatalf("expected code %d, got %d", BadRequest, exitErr.Code)
    }
}
@@ -1,50 +0,0 @@
// Package fsutil provides filesystem helper functions.
package fsutil

import (
    "bytes"
    "errors"
    "fmt"
    "os"
)

// FileStatus describes how an on-disk file compares to expected content.
type FileStatus int

const (
    StatusMissing FileStatus = iota
    StatusUpToDate // file exists with identical content
    StatusDiffers // file exists with different content
)

// CompareFile checks whether the file at path matches the expected content.
func CompareFile(path string, expected []byte) (FileStatus, error) {
    existing, err := os.ReadFile(path)
    if err != nil {
        if errors.Is(err, os.ErrNotExist) {
            return StatusMissing, nil
        }
        return 0, fmt.Errorf("could not read %s: %w", path, err)
    }
    if bytes.Equal(existing, expected) {
        return StatusUpToDate, nil
    }
    return StatusDiffers, nil
}

// EnsureDirForCopy makes sure path is a real directory, not a symlink or
// regular file. If a symlink or file exists at path it is removed so the
// caller can create a directory with independent content.
func EnsureDirForCopy(path string) error {
    info, err := os.Lstat(path)
    if err == nil {
        if info.Mode()&os.ModeSymlink != 0 || !info.IsDir() {
            if err := os.Remove(path); err != nil {
                return err
            }
        }
    } else if !errors.Is(err, os.ErrNotExist) {
        return err
    }
    return nil
}
@@ -1,116 +0,0 @@
package fsutil

import (
    "os"
    "path/filepath"
    "testing"
)

// TestCompareFile verifies that CompareFile correctly distinguishes between a
// missing file, a file with matching content, and a file with different content.
func TestCompareFile(t *testing.T) {
    tmpDir := t.TempDir()
    path := filepath.Join(tmpDir, "skill.md")
    expected := []byte("expected content")

    status, err := CompareFile(path, expected)
    if err != nil {
        t.Fatalf("CompareFile on missing file failed: %v", err)
    }
    if status != StatusMissing {
        t.Fatalf("expected StatusMissing, got %v", status)
    }

    if err := os.WriteFile(path, expected, 0o644); err != nil {
        t.Fatalf("write expected file failed: %v", err)
    }
    status, err = CompareFile(path, expected)
    if err != nil {
        t.Fatalf("CompareFile on matching file failed: %v", err)
    }
    if status != StatusUpToDate {
        t.Fatalf("expected StatusUpToDate, got %v", status)
    }

    if err := os.WriteFile(path, []byte("different content"), 0o644); err != nil {
        t.Fatalf("write different file failed: %v", err)
    }
    status, err = CompareFile(path, expected)
    if err != nil {
        t.Fatalf("CompareFile on different file failed: %v", err)
    }
    if status != StatusDiffers {
        t.Fatalf("expected StatusDiffers, got %v", status)
    }
}

// TestEnsureDirForCopy verifies that EnsureDirForCopy clears symlinks and
// regular files so --copy can write a real directory, while leaving existing
// directories and missing paths untouched.
func TestEnsureDirForCopy(t *testing.T) {
    t.Run("removes symlink", func(t *testing.T) {
        tmpDir := t.TempDir()
        targetDir := filepath.Join(tmpDir, "target")
        linkPath := filepath.Join(tmpDir, "link")

        if err := os.MkdirAll(targetDir, 0o755); err != nil {
            t.Fatalf("mkdir target failed: %v", err)
        }
        if err := os.Symlink(targetDir, linkPath); err != nil {
            t.Fatalf("create symlink failed: %v", err)
        }

        if err := EnsureDirForCopy(linkPath); err != nil {
            t.Fatalf("EnsureDirForCopy failed: %v", err)
        }

        if _, err := os.Lstat(linkPath); !os.IsNotExist(err) {
            t.Fatalf("expected symlink path to be removed, got err=%v", err)
        }
    })

    t.Run("removes regular file", func(t *testing.T) {
        tmpDir := t.TempDir()
        filePath := filepath.Join(tmpDir, "onyx-cli")
        if err := os.WriteFile(filePath, []byte("x"), 0o644); err != nil {
            t.Fatalf("write file failed: %v", err)
        }

        if err := EnsureDirForCopy(filePath); err != nil {
            t.Fatalf("EnsureDirForCopy failed: %v", err)
        }

        if _, err := os.Lstat(filePath); !os.IsNotExist(err) {
            t.Fatalf("expected file path to be removed, got err=%v", err)
        }
    })

    t.Run("keeps existing directory", func(t *testing.T) {
        tmpDir := t.TempDir()
        dirPath := filepath.Join(tmpDir, "onyx-cli")
        if err := os.MkdirAll(dirPath, 0o755); err != nil {
            t.Fatalf("mkdir failed: %v", err)
        }

        if err := EnsureDirForCopy(dirPath); err != nil {
            t.Fatalf("EnsureDirForCopy failed: %v", err)
        }

        info, err := os.Lstat(dirPath)
        if err != nil {
            t.Fatalf("lstat directory failed: %v", err)
        }
        if !info.IsDir() {
            t.Fatalf("expected directory to remain, got mode %v", info.Mode())
        }
    })

    t.Run("missing path is no-op", func(t *testing.T) {
        tmpDir := t.TempDir()
        missingPath := filepath.Join(tmpDir, "does-not-exist")

        if err := EnsureDirForCopy(missingPath); err != nil {
            t.Fatalf("EnsureDirForCopy failed: %v", err)
        }
    })
}
@@ -1,121 +0,0 @@
// Package overflow provides a streaming writer that auto-truncates output
// for non-TTY callers (e.g., AI agents, scripts). Full content is saved to
// a temp file on disk; only the first N bytes are printed to stdout.
package overflow

import (
    "fmt"
    "os"
    "strings"

    log "github.com/sirupsen/logrus"
)

// Writer handles streaming output with optional truncation.
// When Limit > 0, it streams to a temp file on disk (not memory) and stops
// writing to stdout after Limit bytes. When Limit == 0, it writes directly
// to stdout. In Quiet mode, it buffers in memory and prints once at the end.
type Writer struct {
    Limit int
    Quiet bool
    written int
    totalBytes int
    truncated bool
    buf strings.Builder // used only in quiet mode
    tmpFile *os.File // used only in truncation mode (Limit > 0)
}

// Write sends a chunk of content through the writer.
func (w *Writer) Write(s string) {
    w.totalBytes += len(s)

    // Quiet mode: buffer in memory, print nothing
    if w.Quiet {
        w.buf.WriteString(s)
        return
    }

    if w.Limit <= 0 {
        fmt.Print(s)
        return
    }

    // Truncation mode: stream all content to temp file on disk
    if w.tmpFile == nil {
        f, err := os.CreateTemp("", "onyx-ask-*.txt")
        if err != nil {
            // Fall back to no-truncation if we can't create the file
            fmt.Fprintf(os.Stderr, "warning: could not create temp file: %v\n", err)
            w.Limit = 0
            fmt.Print(s)
            return
        }
        w.tmpFile = f
    }
    if _, err := w.tmpFile.WriteString(s); err != nil {
        // Disk write failed — abandon truncation, stream directly to stdout
        fmt.Fprintf(os.Stderr, "warning: temp file write failed: %v\n", err)
        w.closeTmpFile(true)
        w.Limit = 0
        w.truncated = false
        fmt.Print(s)
        return
    }

    if w.truncated {
        return
    }

    remaining := w.Limit - w.written
    if len(s) <= remaining {
        fmt.Print(s)
        w.written += len(s)
    } else {
        if remaining > 0 {
            fmt.Print(s[:remaining])
            w.written += remaining
        }
        w.truncated = true
    }
}

// Finish flushes remaining output. Call once after all Write calls are done.
func (w *Writer) Finish() {
    // Quiet mode: print buffered content at once
    if w.Quiet {
        fmt.Println(w.buf.String())
        return
    }

    if !w.truncated {
        w.closeTmpFile(true) // clean up unused temp file
        fmt.Println()
        return
    }

    // Close the temp file so it's readable
    tmpPath := w.tmpFile.Name()
    w.closeTmpFile(false) // close but keep the file

    fmt.Printf("\n\n--- response truncated (%d bytes total) ---\n", w.totalBytes)
    fmt.Printf("Full response: %s\n", tmpPath)
    fmt.Printf("Explore:\n")
    fmt.Printf(" cat %s | grep \"<pattern>\"\n", tmpPath)
    fmt.Printf(" cat %s | tail -50\n", tmpPath)
}

// closeTmpFile closes and optionally removes the temp file.
func (w *Writer) closeTmpFile(remove bool) {
    if w.tmpFile == nil {
        return
    }
    if err := w.tmpFile.Close(); err != nil {
        log.Debugf("warning: failed to close temp file: %v", err)
    }
    if remove {
        if err := os.Remove(w.tmpFile.Name()); err != nil {
            log.Debugf("warning: failed to remove temp file: %v", err)
        }
    }
    w.tmpFile = nil
}
@@ -1,95 +0,0 @@
package overflow

import (
    "os"
    "testing"
)

func TestWriter_NoLimit(t *testing.T) {
    w := &Writer{Limit: 0}
    w.Write("hello world")
    if w.truncated {
        t.Fatal("should not be truncated with limit 0")
    }
    if w.totalBytes != 11 {
        t.Fatalf("expected 11 total bytes, got %d", w.totalBytes)
    }
}

func TestWriter_UnderLimit(t *testing.T) {
    w := &Writer{Limit: 100}
    w.Write("hello")
    w.Write(" world")
    if w.truncated {
        t.Fatal("should not be truncated when under limit")
    }
    if w.written != 11 {
        t.Fatalf("expected 11 written bytes, got %d", w.written)
    }
}

func TestWriter_OverLimit(t *testing.T) {
    w := &Writer{Limit: 5}
    w.Write("hello world") // 11 bytes, limit 5
    if !w.truncated {
        t.Fatal("should be truncated")
    }
    if w.written != 5 {
        t.Fatalf("expected 5 written bytes, got %d", w.written)
    }
    if w.totalBytes != 11 {
        t.Fatalf("expected 11 total bytes, got %d", w.totalBytes)
    }
    if w.tmpFile == nil {
        t.Fatal("temp file should have been created")
    }
    _ = w.tmpFile.Close()
    data, _ := os.ReadFile(w.tmpFile.Name())
    _ = os.Remove(w.tmpFile.Name())
    if string(data) != "hello world" {
        t.Fatalf("temp file should contain full content, got %q", string(data))
    }
}

func TestWriter_MultipleChunks(t *testing.T) {
    w := &Writer{Limit: 10}
    w.Write("hello") // 5 bytes
    w.Write(" ") // 6 bytes
    w.Write("world") // 11 bytes, crosses limit
    w.Write("!") // 12 bytes, already truncated

    if !w.truncated {
        t.Fatal("should be truncated")
    }
    if w.written != 10 {
        t.Fatalf("expected 10 written bytes, got %d", w.written)
    }
    if w.totalBytes != 12 {
        t.Fatalf("expected 12 total bytes, got %d", w.totalBytes)
    }
    if w.tmpFile == nil {
        t.Fatal("temp file should have been created")
    }
    _ = w.tmpFile.Close()
    data, _ := os.ReadFile(w.tmpFile.Name())
    _ = os.Remove(w.tmpFile.Name())
    if string(data) != "hello world!" {
        t.Fatalf("temp file should contain full content, got %q", string(data))
    }
}

func TestWriter_QuietMode(t *testing.T) {
    w := &Writer{Limit: 0, Quiet: true}
    w.Write("hello")
    w.Write(" world")

    if w.written != 0 {
        t.Fatalf("quiet mode should not write to stdout, got %d written", w.written)
    }
    if w.totalBytes != 11 {
        t.Fatalf("expected 11 total bytes, got %d", w.totalBytes)
    }
    if w.buf.String() != "hello world" {
        t.Fatalf("buffer should contain full content, got %q", w.buf.String())
    }
}
@@ -55,7 +55,7 @@ func NewModel(cfg config.OnyxCliConfig) Model {
|
||||
return Model{
|
||||
config: cfg,
|
||||
client: client,
|
||||
viewport: newViewport(80, cfg.Features.StreamMarkdownEnabled()),
|
||||
viewport: newViewport(80),
|
||||
input: newInputModel(),
|
||||
status: newStatusBar(),
|
||||
agentID: cfg.DefaultAgentID,
|
||||
|
||||
@@ -67,10 +67,6 @@ func handleSlashCommand(m Model, text string) (Model, tea.Cmd) {
|
||||
}
|
||||
return m, nil
|
||||
|
||||
case "/experiments":
|
||||
m.viewport.addInfo(m.experimentsText())
|
||||
return m, nil
|
||||
|
||||
case "/quit":
|
||||
return m, tea.Quit
|
||||
|
||||
|
||||
@@ -1,8 +0,0 @@
|
||||
package tui
|
||||
|
||||
import "github.com/onyx-dot-app/onyx/cli/internal/config"
|
||||
|
||||
// experimentsText returns the formatted experiments list for the current config.
|
||||
func (m Model) experimentsText() string {
|
||||
return config.ExperimentsText(m.config.Features)
|
||||
}
|
||||
@@ -10,7 +10,6 @@ const helpText = `Onyx CLI Commands
|
||||
/configure Re-run connection setup
|
||||
/connectors Open connectors page in browser
|
||||
/settings Open Onyx settings in browser
|
||||
/experiments List experimental features and their status
|
||||
/quit Exit Onyx CLI
|
||||
|
||||
Keyboard Shortcuts
|
||||
|
||||
@@ -24,7 +24,6 @@ var slashCommands = []slashCommand{
|
||||
{"/configure", "Re-run connection setup"},
|
||||
{"/connectors", "Open connectors in browser"},
|
||||
{"/settings", "Open settings in browser"},
|
||||
{"/experiments", "List experimental features"},
|
||||
{"/quit", "Exit Onyx CLI"},
|
||||
}
|
||||
|
||||
|
||||
@@ -4,7 +4,6 @@ import (
|
||||
"fmt"
|
||||
"sort"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/charmbracelet/glamour"
|
||||
"github.com/charmbracelet/glamour/styles"
|
||||
@@ -45,9 +44,6 @@ type pickerItem struct {
 	label string
 }

-// streamRenderInterval is the minimum time between markdown re-renders during streaming.
-const streamRenderInterval = 100 * time.Millisecond
-
 // viewport manages the chat display.
 type viewport struct {
 	entries []chatEntry
@@ -61,12 +57,6 @@ type viewport struct {
 	pickerIndex  int
 	pickerType   pickerKind
 	scrollOffset int // lines scrolled up from bottom (0 = pinned to bottom)
-
-	// Progressive markdown rendering during streaming
-	streamMarkdown bool   // feature flag: render markdown while streaming
-	streamRendered string // cached rendered output during streaming
-	lastRenderTime time.Time
-	lastRenderLen  int // length of streamBuf at last render (skip if unchanged)
 }

 // newMarkdownRenderer creates a Glamour renderer with zero left margin.
@@ -81,11 +71,10 @@ func newMarkdownRenderer(width int) *glamour.TermRenderer {
 	return r
 }

-func newViewport(width int, streamMarkdown bool) *viewport {
+func newViewport(width int) *viewport {
 	return &viewport{
-		width:          width,
-		renderer:       newMarkdownRenderer(width),
-		streamMarkdown: streamMarkdown,
+		width:    width,
+		renderer: newMarkdownRenderer(width),
 	}
 }
@@ -119,27 +108,12 @@ func (v *viewport) addUserMessage(msg string) {
 func (v *viewport) startAgent() {
 	v.streaming = true
 	v.streamBuf = ""
-	v.streamRendered = ""
-	v.lastRenderLen = 0
-	v.lastRenderTime = time.Time{}
 	// Add a blank-line spacer entry before the agent message
 	v.entries = append(v.entries, chatEntry{kind: entryInfo, rendered: ""})
 }

 func (v *viewport) appendToken(token string) {
 	v.streamBuf += token
-
-	if !v.streamMarkdown {
-		return
-	}
-
-	now := time.Now()
-	bufLen := len(v.streamBuf)
-	if bufLen != v.lastRenderLen && now.Sub(v.lastRenderTime) >= streamRenderInterval {
-		v.streamRendered = v.renderAgentContent(v.streamBuf)
-		v.lastRenderTime = now
-		v.lastRenderLen = bufLen
-	}
 }

 func (v *viewport) finishAgent() {
func (v *viewport) finishAgent() {
|
||||
@@ -161,8 +135,6 @@ func (v *viewport) finishAgent() {
|
||||
})
|
||||
v.streaming = false
|
||||
v.streamBuf = ""
|
||||
v.streamRendered = ""
|
||||
v.lastRenderLen = 0
|
||||
}
|
||||
|
||||
func (v *viewport) renderAgentContent(content string) string {
|
||||
@@ -386,22 +358,6 @@ func (v *viewport) renderPicker(width, height int) string {
 	return lipgloss.Place(width, height, lipgloss.Center, lipgloss.Center, panel)
 }

-// streamingContent returns the display content for the in-progress stream.
-func (v *viewport) streamingContent() string {
-	if v.streamMarkdown && v.streamRendered != "" {
-		return v.streamRendered
-	}
-	// Fall back to raw text with agent dot prefix
-	bufLines := strings.Split(v.streamBuf, "\n")
-	if len(bufLines) > 0 {
-		bufLines[0] = agentDot + " " + bufLines[0]
-		for i := 1; i < len(bufLines); i++ {
-			bufLines[i] = " " + bufLines[i]
-		}
-	}
-	return strings.Join(bufLines, "\n")
-}
-
 // totalLines computes the total number of rendered content lines.
 func (v *viewport) totalLines() int {
 	var lines []string
@@ -412,7 +368,14 @@ func (v *viewport) totalLines() int {
 		lines = append(lines, e.rendered)
 	}
 	if v.streaming && v.streamBuf != "" {
-		lines = append(lines, v.streamingContent())
+		bufLines := strings.Split(v.streamBuf, "\n")
+		if len(bufLines) > 0 {
+			bufLines[0] = agentDot + " " + bufLines[0]
+			for i := 1; i < len(bufLines); i++ {
+				bufLines[i] = " " + bufLines[i]
+			}
+		}
+		lines = append(lines, strings.Join(bufLines, "\n"))
 	} else if v.streaming {
 		lines = append(lines, agentDot+" ")
 	}
@@ -436,9 +399,16 @@ func (v *viewport) view(height int) string {
 		lines = append(lines, e.rendered)
 	}

-	// Streaming buffer
+	// Streaming buffer (plain text, not markdown)
 	if v.streaming && v.streamBuf != "" {
-		lines = append(lines, v.streamingContent())
+		bufLines := strings.Split(v.streamBuf, "\n")
+		if len(bufLines) > 0 {
+			bufLines[0] = agentDot + " " + bufLines[0]
+			for i := 1; i < len(bufLines); i++ {
+				bufLines[i] = " " + bufLines[i]
+			}
+		}
+		lines = append(lines, strings.Join(bufLines, "\n"))
 	} else if v.streaming {
 		lines = append(lines, agentDot+" ")
 	}
@@ -4,7 +4,6 @@ import (
 	"regexp"
 	"strings"
 	"testing"
-	"time"
 )

 // stripANSI removes ANSI escape sequences for test comparisons.
@@ -15,7 +14,7 @@ func stripANSI(s string) string {
 }

 func TestAddUserMessage(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.addUserMessage("hello world")

 	if len(v.entries) != 1 {
@@ -38,7 +37,7 @@ func TestAddUserMessage(t *testing.T) {
 }

 func TestStartAndFinishAgent(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.startAgent()

 	if !v.streaming {
@@ -84,7 +83,7 @@ func TestStartAndFinishAgent(t *testing.T) {
 }

 func TestFinishAgentNoPadding(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.startAgent()
 	v.appendToken("Test message")
 	v.finishAgent()
@@ -99,7 +98,7 @@ func TestFinishAgentNoPadding(t *testing.T) {
 }

 func TestFinishAgentMultiline(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.startAgent()
 	v.appendToken("Line one\n\nLine three")
 	v.finishAgent()
@@ -116,7 +115,7 @@ func TestFinishAgentMultiline(t *testing.T) {
 }

 func TestFinishAgentEmpty(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.startAgent()
 	v.finishAgent()

@@ -129,7 +128,7 @@ func TestFinishAgentEmpty(t *testing.T) {
 }

 func TestAddInfo(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.addInfo("test info")

 	if len(v.entries) != 1 {
@@ -146,7 +145,7 @@ func TestAddInfo(t *testing.T) {
 }

 func TestAddError(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.addError("something broke")

 	if len(v.entries) != 1 {
@@ -163,7 +162,7 @@ func TestAddError(t *testing.T) {
 }

 func TestAddCitations(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.addCitations(map[int]string{1: "doc-a", 2: "doc-b"})

 	if len(v.entries) != 1 {
@@ -183,7 +182,7 @@ func TestAddCitations(t *testing.T) {
 }

 func TestAddCitationsEmpty(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.addCitations(map[int]string{})

 	if len(v.entries) != 0 {
@@ -192,7 +191,7 @@ func TestAddCitationsEmpty(t *testing.T) {
 }

 func TestCitationVisibility(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.addInfo("hello")
 	v.addCitations(map[int]string{1: "doc"})

@@ -212,7 +211,7 @@ func TestCitationVisibility(t *testing.T) {
 }

 func TestClearAll(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.addUserMessage("test")
 	v.startAgent()
 	v.appendToken("response")
@@ -231,7 +230,7 @@ func TestClearAll(t *testing.T) {
 }

 func TestClearDisplay(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.addUserMessage("test")
 	v.clearDisplay()

@@ -241,7 +240,7 @@ func TestClearDisplay(t *testing.T) {
 }

 func TestViewPadsShortContent(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	v.addInfo("hello")

 	view := v.view(10)
@@ -252,7 +251,7 @@ func TestViewPadsShortContent(t *testing.T) {
 }

 func TestViewTruncatesTallContent(t *testing.T) {
-	v := newViewport(80, false)
+	v := newViewport(80)
 	for i := 0; i < 20; i++ {
 		v.addInfo("line")
 	}
@@ -263,93 +262,3 @@ func TestViewTruncatesTallContent(t *testing.T) {
 		t.Errorf("expected 5 lines (truncated), got %d", len(lines))
 	}
 }
-
-func TestStreamMarkdownRendersOnThrottle(t *testing.T) {
-	v := newViewport(80, true)
-	v.startAgent()
-
-	// First token: no prior render, so it should render immediately
-	v.appendToken("**bold text**")
-
-	if v.streamRendered == "" {
-		t.Error("expected streamRendered to be populated after first token")
-	}
-	plain := stripANSI(v.streamRendered)
-	if !strings.Contains(plain, "bold text") {
-		t.Errorf("expected rendered to contain 'bold text', got %q", plain)
-	}
-	// Should not contain raw markdown asterisks
-	if strings.Contains(plain, "**") {
-		t.Errorf("expected markdown to be rendered (no **), got %q", plain)
-	}
-
-	// Second token within throttle window: should NOT re-render
-	v.lastRenderTime = time.Now() // simulate recent render
-	prevRendered := v.streamRendered
-	v.appendToken(" more")
-	if v.streamRendered != prevRendered {
-		t.Error("expected streamRendered to be unchanged within throttle window")
-	}
-
-	// After throttle interval: should re-render
-	v.lastRenderTime = time.Now().Add(-streamRenderInterval - time.Millisecond)
-	v.appendToken("!")
-	if v.streamRendered == prevRendered {
-		t.Error("expected streamRendered to update after throttle interval")
-	}
-	plain = stripANSI(v.streamRendered)
-	if !strings.Contains(plain, "bold text more!") {
-		t.Errorf("expected updated rendered content, got %q", plain)
-	}
-}
-
-func TestStreamMarkdownDisabledNoRender(t *testing.T) {
-	v := newViewport(80, false)
-	v.startAgent()
-	v.appendToken("**bold**")
-
-	if v.streamRendered != "" {
-		t.Error("expected no streamRendered when streamMarkdown is disabled")
-	}
-
-	// View should show raw markdown
-	view := v.view(10)
-	plain := stripANSI(view)
-	if !strings.Contains(plain, "**bold**") {
-		t.Errorf("expected raw markdown in view, got %q", plain)
-	}
-}
-
-func TestStreamMarkdownViewUsesRendered(t *testing.T) {
-	v := newViewport(80, true)
-	v.startAgent()
-	v.appendToken("**formatted**")
-
-	view := v.view(10)
-	plain := stripANSI(view)
-	// Should show rendered content, not raw **formatted**
-	if strings.Contains(plain, "**") {
-		t.Errorf("expected rendered markdown in view (no **), got %q", plain)
-	}
-	if !strings.Contains(plain, "formatted") {
-		t.Errorf("expected 'formatted' in view, got %q", plain)
-	}
-}
-
-func TestStreamMarkdownResetOnStart(t *testing.T) {
-	v := newViewport(80, true)
-
-	// First stream cycle
-	v.startAgent()
-	v.appendToken("first")
-	v.finishAgent()
-
-	// Start second stream - state should be clean
-	v.startAgent()
-	if v.streamRendered != "" {
-		t.Error("expected streamRendered cleared on startAgent")
-	}
-	if v.lastRenderLen != 0 {
-		t.Error("expected lastRenderLen reset on startAgent")
-	}
-}
@@ -1,12 +1,10 @@
 package main

 import (
-	"errors"
 	"fmt"
 	"os"

 	"github.com/onyx-dot-app/onyx/cli/cmd"
-	"github.com/onyx-dot-app/onyx/cli/internal/exitcodes"
 )

 var (
@@ -20,10 +18,6 @@ func main() {

 	if err := cmd.Execute(); err != nil {
 		fmt.Fprintf(os.Stderr, "Error: %v\n", err)
-		var exitErr *exitcodes.ExitError
-		if errors.As(err, &exitErr) {
-			os.Exit(exitErr.Code)
-		}
 		os.Exit(1)
 	}
 }
@@ -19,6 +19,6 @@ dependencies:
   version: 5.4.0
 - name: code-interpreter
   repository: https://onyx-dot-app.github.io/python-sandbox/
-  version: 0.3.2
-digest: sha256:74908ea45ace2b4be913ff762772e6d87e40bab64e92c6662aa51730eaeb9d87
-generated: "2026-04-06T15:34:02.597166-07:00"
+  version: 0.3.1
+digest: sha256:4965b6ea3674c37163832a2192cd3bc8004f2228729fca170af0b9f457e8f987
+generated: "2026-03-02T15:29:39.632344-08:00"
@@ -5,7 +5,7 @@ home: https://www.onyx.app/
 sources:
 - "https://github.com/onyx-dot-app/onyx"
 type: application
-version: 0.4.40
+version: 0.4.39
 appVersion: latest
 annotations:
   category: Productivity
@@ -45,6 +45,6 @@ dependencies:
   repository: https://charts.min.io/
   condition: minio.enabled
 - name: code-interpreter
-  version: 0.3.2
+  version: 0.3.1
   repository: https://onyx-dot-app.github.io/python-sandbox/
   condition: codeInterpreter.enabled
@@ -67,9 +67,6 @@ spec:
         - "/bin/sh"
         - "-c"
         - |
-          {{- if .Values.api.runUpdateCaCertificates }}
-          update-ca-certificates &&
-          {{- end }}
           alembic upgrade head &&
           echo "Starting Onyx Api Server" &&
           uvicorn onyx.main:app --host {{ .Values.global.host }} --port {{ .Values.api.containerPorts.server }}
@@ -504,18 +504,6 @@ api:
   tolerations: []
   affinity: {}

-  # Run update-ca-certificates before starting the server.
-  # Useful when mounting custom CA certificates via volumes/volumeMounts.
-  # NOTE: Requires the container to run as root (runAsUser: 0).
-  # CA certificate files must be mounted under /usr/local/share/ca-certificates/
-  # with a .crt extension (e.g. /usr/local/share/ca-certificates/my-ca.crt).
-  # NOTE: Python HTTP clients (requests, httpx) use certifi's bundle by default
-  # and will not pick up the system CA store automatically. Set the following
-  # environment variables via configMap values (loaded through envFrom) to make them use the updated system bundle:
-  #   REQUESTS_CA_BUNDLE: /etc/ssl/certs/ca-certificates.crt
-  #   SSL_CERT_FILE: /etc/ssl/certs/ca-certificates.crt
-  runUpdateCaCertificates: false
-

 ######################################################################
 #
@@ -30,10 +30,7 @@ target "backend" {
   context    = "backend"
   dockerfile = "Dockerfile"

-  cache-from = [
-    "type=registry,ref=${BACKEND_REPOSITORY}:latest",
-    "type=registry,ref=${BACKEND_REPOSITORY}:edge",
-  ]
+  cache-from = ["type=registry,ref=${BACKEND_REPOSITORY}:latest"]
   cache-to = ["type=inline"]

   tags = ["${BACKEND_REPOSITORY}:${TAG}"]
@@ -43,10 +40,7 @@ target "web" {
   context    = "web"
   dockerfile = "Dockerfile"

-  cache-from = [
-    "type=registry,ref=${WEB_SERVER_REPOSITORY}:latest",
-    "type=registry,ref=${WEB_SERVER_REPOSITORY}:edge",
-  ]
+  cache-from = ["type=registry,ref=${WEB_SERVER_REPOSITORY}:latest"]
   cache-to = ["type=inline"]

   tags = ["${WEB_SERVER_REPOSITORY}:${TAG}"]
@@ -57,10 +51,7 @@ target "model-server" {

   dockerfile = "Dockerfile.model_server"

-  cache-from = [
-    "type=registry,ref=${MODEL_SERVER_REPOSITORY}:latest",
-    "type=registry,ref=${MODEL_SERVER_REPOSITORY}:edge",
-  ]
+  cache-from = ["type=registry,ref=${MODEL_SERVER_REPOSITORY}:latest"]
   cache-to = ["type=inline"]

   tags = ["${MODEL_SERVER_REPOSITORY}:${TAG}"]
@@ -82,10 +73,7 @@ target "cli" {
   context    = "cli"
   dockerfile = "Dockerfile"

-  cache-from = [
-    "type=registry,ref=${CLI_REPOSITORY}:latest",
-    "type=registry,ref=${CLI_REPOSITORY}:edge",
-  ]
+  cache-from = ["type=registry,ref=${CLI_REPOSITORY}:latest"]
   cache-to = ["type=inline"]

   tags = ["${CLI_REPOSITORY}:${TAG}"]
docs/METRICS.md
@@ -6,11 +6,11 @@ All Prometheus metrics live in the `backend/onyx/server/metrics/` package. Follo

 ### 1. Choose the right file (or create a new one)

-| File | Purpose |
-| ------------------------------------- | -------------------------------------------- |
-| `metrics/slow_requests.py` | Slow request counter + callback |
-| `metrics/postgres_connection_pool.py` | SQLAlchemy connection pool metrics |
-| `metrics/prometheus_setup.py` | FastAPI instrumentator config (orchestrator) |
+| File | Purpose |
+|------|---------|
+| `metrics/slow_requests.py` | Slow request counter + callback |
+| `metrics/postgres_connection_pool.py` | SQLAlchemy connection pool metrics |
+| `metrics/prometheus_setup.py` | FastAPI instrumentator config (orchestrator) |

 If your metric is a standalone concern (e.g. cache hit rates, queue depths), create a new file under `metrics/` and keep one metric concept per file.
@@ -30,7 +30,6 @@ _my_counter = Counter(
 ```

 **Naming conventions:**

 - Prefix all metric names with `onyx_`
 - Counters: `_total` suffix (e.g. `onyx_api_slow_requests_total`)
 - Histograms: `_seconds` or `_bytes` suffix for durations/sizes
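The conventions above are mechanical enough to check in code. The following is a small illustrative helper, not part of the Onyx codebase; the function name and the exact rule encoding are assumptions for the sketch:

```python
import re

# Hypothetical checker for the naming conventions listed above; not part of
# the Onyx codebase. Enforces the onyx_ prefix plus type-specific suffixes.
def check_metric_name(name: str, metric_type: str) -> bool:
    if not re.fullmatch(r"[a-z][a-z0-9_]*", name):
        return False  # Prometheus-style lowercase snake_case only
    if not name.startswith("onyx_"):
        return False
    if metric_type == "counter":
        return name.endswith("_total")
    if metric_type == "histogram":
        return name.endswith(("_seconds", "_bytes"))
    return True
```

For example, `check_metric_name("onyx_api_slow_requests_total", "counter")` passes, while a name without the `onyx_` prefix fails (note the built-in instrumentator metrics like `http_requests_total` deliberately keep their upstream names and are not subject to this convention).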
@@ -108,26 +107,26 @@ These metrics are exposed at `GET /metrics` on the API server.

 ### Built-in (via `prometheus-fastapi-instrumentator`)

-| Metric | Type | Labels | Description |
-| ------------------------------------- | --------- | ----------------------------- | ------------------------------------------------- |
-| `http_requests_total` | Counter | `method`, `status`, `handler` | Total request count |
-| `http_request_duration_highr_seconds` | Histogram | _(none)_ | High-resolution latency (many buckets, no labels) |
-| `http_request_duration_seconds` | Histogram | `method`, `handler` | Latency by handler (custom buckets for P95/P99) |
-| `http_request_size_bytes` | Summary | `handler` | Incoming request content length |
-| `http_response_size_bytes` | Summary | `handler` | Outgoing response content length |
-| `http_requests_inprogress` | Gauge | `method`, `handler` | Currently in-flight requests |
+| Metric | Type | Labels | Description |
+|--------|------|--------|-------------|
+| `http_requests_total` | Counter | `method`, `status`, `handler` | Total request count |
+| `http_request_duration_highr_seconds` | Histogram | _(none)_ | High-resolution latency (many buckets, no labels) |
+| `http_request_duration_seconds` | Histogram | `method`, `handler` | Latency by handler (custom buckets for P95/P99) |
+| `http_request_size_bytes` | Summary | `handler` | Incoming request content length |
+| `http_response_size_bytes` | Summary | `handler` | Outgoing response content length |
+| `http_requests_inprogress` | Gauge | `method`, `handler` | Currently in-flight requests |

 ### Custom (via `onyx.server.metrics`)

-| Metric | Type | Labels | Description |
-| ------------------------------ | ------- | ----------------------------- | ---------------------------------------------------------------- |
+| Metric | Type | Labels | Description |
+|--------|------|--------|-------------|
 | `onyx_api_slow_requests_total` | Counter | `method`, `handler`, `status` | Requests exceeding `SLOW_REQUEST_THRESHOLD_SECONDS` (default 1s) |

 ### Configuration

-| Env Var | Default | Description |
-| -------------------------------- | ------- | -------------------------------------------- |
+| Env Var | Default | Description |
+|---------|---------|-------------|
 | `SLOW_REQUEST_THRESHOLD_SECONDS` | `1.0` | Duration threshold for slow request counting |

 ### Instrumentator Settings
@@ -142,188 +141,44 @@ These metrics provide visibility into SQLAlchemy connection pool state across al

 ### Pool State (via custom Prometheus collector — snapshot on each scrape)

-| Metric | Type | Labels | Description |
-| -------------------------- | ----- | -------- | ----------------------------------------------- |
-| `onyx_db_pool_checked_out` | Gauge | `engine` | Currently checked-out connections |
-| `onyx_db_pool_checked_in` | Gauge | `engine` | Idle connections available in the pool |
-| `onyx_db_pool_overflow` | Gauge | `engine` | Current overflow connections beyond `pool_size` |
-| `onyx_db_pool_size` | Gauge | `engine` | Configured pool size (constant) |
+| Metric | Type | Labels | Description |
+|--------|------|--------|-------------|
+| `onyx_db_pool_checked_out` | Gauge | `engine` | Currently checked-out connections |
+| `onyx_db_pool_checked_in` | Gauge | `engine` | Idle connections available in the pool |
+| `onyx_db_pool_overflow` | Gauge | `engine` | Current overflow connections beyond `pool_size` |
+| `onyx_db_pool_size` | Gauge | `engine` | Configured pool size (constant) |

 ### Pool Lifecycle (via SQLAlchemy pool event listeners)

-| Metric | Type | Labels | Description |
-| ---------------------------------------- | ------- | -------- | ---------------------------------------- |
-| `onyx_db_pool_checkout_total` | Counter | `engine` | Total connection checkouts from the pool |
-| `onyx_db_pool_checkin_total` | Counter | `engine` | Total connection checkins to the pool |
-| `onyx_db_pool_connections_created_total` | Counter | `engine` | Total new database connections created |
-| `onyx_db_pool_invalidations_total` | Counter | `engine` | Total connection invalidations |
-| `onyx_db_pool_checkout_timeout_total` | Counter | `engine` | Total connection checkout timeouts |
+| Metric | Type | Labels | Description |
+|--------|------|--------|-------------|
+| `onyx_db_pool_checkout_total` | Counter | `engine` | Total connection checkouts from the pool |
+| `onyx_db_pool_checkin_total` | Counter | `engine` | Total connection checkins to the pool |
+| `onyx_db_pool_connections_created_total` | Counter | `engine` | Total new database connections created |
+| `onyx_db_pool_invalidations_total` | Counter | `engine` | Total connection invalidations |
+| `onyx_db_pool_checkout_timeout_total` | Counter | `engine` | Total connection checkout timeouts |

 ### Per-Endpoint Attribution (via pool events + endpoint context middleware)

-| Metric | Type | Labels | Description |
-| -------------------------------------- | --------- | ------------------- | ----------------------------------------------- |
-| `onyx_db_connections_held_by_endpoint` | Gauge | `handler`, `engine` | DB connections currently held, by endpoint |
-| `onyx_db_connection_hold_seconds` | Histogram | `handler`, `engine` | Duration a DB connection is held by an endpoint |
+| Metric | Type | Labels | Description |
+|--------|------|--------|-------------|
+| `onyx_db_connections_held_by_endpoint` | Gauge | `handler`, `engine` | DB connections currently held, by endpoint |
+| `onyx_db_connection_hold_seconds` | Histogram | `handler`, `engine` | Duration a DB connection is held by an endpoint |

 Engine label values: `sync` (main read-write), `async` (async sessions), `readonly` (read-only user).

 Connections from background tasks (Celery) or boot-time warmup appear as `handler="unknown"`.
## Celery Worker Metrics

Celery workers expose metrics via a standalone Prometheus HTTP server (separate from the API server's `/metrics` endpoint). Each worker type runs its own server on a dedicated port.

### Metrics Server (`onyx.server.metrics.metrics_server`)

| Env Var | Default | Description |
| ---------------------------- | ------------------- | ----------------------------------------------------- |
| `PROMETHEUS_METRICS_PORT` | _(per worker type)_ | Override the default port for this worker |
| `PROMETHEUS_METRICS_ENABLED` | `true` | Set to `false` to disable the metrics server entirely |

Default ports:

| Worker | Port |
| --------------- | ---- |
| `docfetching` | 9092 |
| `docprocessing` | 9093 |
| `monitoring` | 9096 |

Workers without a default port and no `PROMETHEUS_METRICS_PORT` env var will skip starting the server.

### Generic Task Lifecycle Metrics (`onyx.server.metrics.celery_task_metrics`)

Push-based metrics that fire on Celery signals for all tasks on the worker.

| Metric | Type | Labels | Description |
| ----------------------------------- | --------- | ------------------------------- | ----------------------------------------------------------------------------- |
| `onyx_celery_task_started_total` | Counter | `task_name`, `queue` | Total tasks started |
| `onyx_celery_task_completed_total` | Counter | `task_name`, `queue`, `outcome` | Total tasks completed (`outcome`: `success` or `failure`) |
| `onyx_celery_task_duration_seconds` | Histogram | `task_name`, `queue` | Task execution duration. Buckets: 1, 5, 15, 30, 60, 120, 300, 600, 1800, 3600 |
| `onyx_celery_tasks_active` | Gauge | `task_name`, `queue` | Currently executing tasks |
| `onyx_celery_task_retried_total` | Counter | `task_name`, `queue` | Total task retries |
| `onyx_celery_task_revoked_total` | Counter | `task_name` | Total tasks revoked (cancelled) |
| `onyx_celery_task_rejected_total` | Counter | `task_name` | Total tasks rejected by worker |

Stale start-time entries (tasks killed via SIGTERM/OOM where `task_postrun` never fires) are evicted after 1 hour.
### Per-Connector Indexing Metrics (`onyx.server.metrics.indexing_task_metrics`)

Enriches docfetching and docprocessing tasks with connector-level labels. Silently no-ops for all other tasks.

| Metric | Type | Labels | Description |
| ------------------------------------- | --------- | ----------------------------------------------------------- | ---------------------------------------- |
| `onyx_indexing_task_started_total` | Counter | `task_name`, `source`, `tenant_id`, `cc_pair_id` | Indexing tasks started per connector |
| `onyx_indexing_task_completed_total` | Counter | `task_name`, `source`, `tenant_id`, `cc_pair_id`, `outcome` | Indexing tasks completed per connector |
| `onyx_indexing_task_duration_seconds` | Histogram | `task_name`, `source`, `tenant_id` | Indexing task duration by connector type |

`connector_name` is intentionally excluded from these push-based counters to avoid unbounded cardinality (it's a free-form user string). The pull-based collectors on the monitoring worker include it since they have bounded cardinality (one series per connector).

### Pull-Based Collectors (`onyx.server.metrics.indexing_pipeline`)

Registered only in the **Monitoring** worker. Collectors query Redis/Postgres at scrape time with a 30-second TTL cache.

| Metric | Type | Labels | Description |
| ------------------------------------ | ----- | ------- | ----------------------------------- |
| `onyx_queue_depth` | Gauge | `queue` | Celery queue length |
| `onyx_queue_unacked` | Gauge | `queue` | Unacknowledged messages per queue |
| `onyx_queue_oldest_task_age_seconds` | Gauge | `queue` | Age of the oldest task in the queue |

Plus additional connector health, index attempt, and worker heartbeat metrics — see `indexing_pipeline.py` for the full list.
### Adding Metrics to a Worker

Currently only the docfetching and docprocessing workers have push-based task metrics wired up. To add metrics to another worker (e.g. heavy, light, primary):

**1. Import and call the generic handlers from the worker's signal handlers:**

```python
from onyx.server.metrics.celery_task_metrics import (
    on_celery_task_prerun,
    on_celery_task_postrun,
    on_celery_task_retry,
    on_celery_task_revoked,
    on_celery_task_rejected,
)

@signals.task_prerun.connect
def on_task_prerun(sender, task_id, task, args, kwargs, **kwds):
    app_base.on_task_prerun(sender, task_id, task, args, kwargs, **kwds)
    on_celery_task_prerun(task_id, task)
```

Do the same for `task_postrun`, `task_retry`, `task_revoked`, and `task_rejected` — see `apps/docfetching.py` for the complete example.

**2. Start the metrics server on `worker_ready`:**

```python
from onyx.server.metrics.metrics_server import start_metrics_server

@worker_ready.connect
def on_worker_ready(sender, **kwargs):
    start_metrics_server("your_worker_type")
    app_base.on_worker_ready(sender, **kwargs)
```

Add a default port for your worker type in `metrics_server.py`'s `_DEFAULT_PORTS` dict, or set `PROMETHEUS_METRICS_PORT` in the environment.
**3. (Optional) Add domain-specific enrichment:**

If your tasks need richer labels beyond `task_name`/`queue`, create a new module in `server/metrics/` following `indexing_task_metrics.py`:

- Define Counters/Histograms with your domain labels
- Write `on_<domain>_task_prerun` / `on_<domain>_task_postrun` handlers that filter by task name and no-op for others
- Call them from the worker's signal handlers alongside the generic ones

**Cardinality warning:** Never use user-defined free-form strings as metric labels — they create unbounded cardinality. Use IDs or enum values. If you need free-form labels, use pull-based collectors (monitoring worker) where cardinality is naturally bounded.

### Current Worker Integration Status

| Worker | Generic Task Metrics | Domain Metrics | Metrics Server |
| -------------------- | -------------------- | -------------- | ------------------------------------ |
| Docfetching | ✓ | ✓ (indexing) | ✓ (port 9092) |
| Docprocessing | ✓ | ✓ (indexing) | ✓ (port 9093) |
| Monitoring | — | — | ✓ (port 9096, pull-based collectors) |
| Primary | — | — | — |
| Light | — | — | — |
| Heavy | — | — | — |
| User File Processing | — | — | — |
| KG Processing | — | — | — |
### Example PromQL Queries (Celery)

```promql
# Task completion rate by worker queue
sum by (queue) (rate(onyx_celery_task_completed_total[5m]))

# P95 task duration for pruning tasks
histogram_quantile(0.95,
  sum by (le) (rate(onyx_celery_task_duration_seconds_bucket{task_name=~".*pruning.*"}[5m])))

# Task failure rate
sum by (task_name) (rate(onyx_celery_task_completed_total{outcome="failure"}[5m]))
  / sum by (task_name) (rate(onyx_celery_task_completed_total[5m]))

# Active tasks per queue
sum by (queue) (onyx_celery_tasks_active)

# Indexing throughput by source type
sum by (source) (rate(onyx_indexing_task_completed_total{outcome="success"}[5m]))

# Queue depth — are tasks backing up?
onyx_queue_depth > 100
```
## OpenSearch Search Metrics

These metrics track OpenSearch search latency and throughput. Collected via `onyx.server.metrics.opensearch_search`.

| Metric                                           | Type      | Labels        | Description                                                                 |
| ------------------------------------------------ | --------- | ------------- | --------------------------------------------------------------------------- |
| `onyx_opensearch_search_client_duration_seconds` | Histogram | `search_type` | Client-side end-to-end latency (network + serialization + server execution) |
| `onyx_opensearch_search_server_duration_seconds` | Histogram | `search_type` | Server-side execution time from OpenSearch `took` field                     |
| `onyx_opensearch_search_total`                   | Counter   | `search_type` | Total search requests sent to OpenSearch                                    |
| `onyx_opensearch_searches_in_progress`           | Gauge     | `search_type` | Currently in-flight OpenSearch searches                                     |

Search type label values: See `OpenSearchSearchType`.
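A hedged sketch of how a search call might emit all four metrics: the metric names come from the table above, but the `instrumented_search` wrapper and the injected `client_search` callable are hypothetical, and this assumes `prometheus_client`.

```python
# Hypothetical instrumentation wrapper around an OpenSearch search call.
import time

from prometheus_client import Counter, Gauge, Histogram

SEARCH_CLIENT_DURATION = Histogram(
    "onyx_opensearch_search_client_duration_seconds",
    "Client-side end-to-end search latency",
    labelnames=["search_type"],
)
SEARCH_SERVER_DURATION = Histogram(
    "onyx_opensearch_search_server_duration_seconds",
    "Server-side execution time from the OpenSearch took field",
    labelnames=["search_type"],
)
SEARCH_TOTAL = Counter(
    "onyx_opensearch_search_total",
    "Total search requests sent to OpenSearch",
    labelnames=["search_type"],
)
SEARCHES_IN_PROGRESS = Gauge(
    "onyx_opensearch_searches_in_progress",
    "Currently in-flight OpenSearch searches",
    labelnames=["search_type"],
)


def instrumented_search(client_search, body: dict, search_type: str) -> dict:
    """Run client_search(body) and record all four metrics."""
    SEARCH_TOTAL.labels(search_type=search_type).inc()
    with SEARCHES_IN_PROGRESS.labels(search_type=search_type).track_inprogress():
        start = time.monotonic()
        response = client_search(body)
        SEARCH_CLIENT_DURATION.labels(search_type=search_type).observe(
            time.monotonic() - start
        )
    # OpenSearch reports server-side execution time in milliseconds as `took`.
    SEARCH_SERVER_DURATION.labels(search_type=search_type).observe(
        response["took"] / 1000.0
    )
    return response
```

The client/server duration split lets you separate network and serialization overhead from query execution time on the OpenSearch side.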
@@ -70,10 +70,6 @@ backend = [
"lazy_imports==1.0.1",
"lxml==5.3.0",
"Mako==1.2.4",
# NOTE: Do not update without understanding the patching behavior in
# get_markitdown_converter in
# backend/onyx/file_processing/extract_file_text.py and what impacts
# updating might have on this behavior.
"markitdown[pdf, docx, pptx, xlsx, xls]==0.1.2",
"mcp[cli]==1.26.0",
"msal==1.34.0",

@@ -127,7 +127,7 @@ function SidebarTab({
rightChildren={truncationSpacer}
/>
) : (
<div className="flex flex-row items-center gap-2 w-full">
<div className="flex flex-row items-center gap-2 flex-1">
{Icon && (
<div className="flex items-center justify-center p-0.5">
<Icon className="h-[1rem] w-[1rem] text-text-03" />
@@ -153,7 +153,7 @@ function SidebarTab({
side="right"
sideOffset={4}
>
{children}
<Text>{children}</Text>
</TooltipPrimitive.Content>
</TooltipPrimitive.Portal>
</TooltipPrimitive.Root>

@@ -1,22 +1,18 @@
import { Card } from "@opal/components/cards/card/components";
import { Content, SizePreset } from "@opal/layouts";
import { Content } from "@opal/layouts";
import { SvgEmpty } from "@opal/icons";
import type {
IconFunctionComponent,
PaddingVariants,
RichStr,
} from "@opal/types";
import type { IconFunctionComponent, PaddingVariants } from "@opal/types";

// ---------------------------------------------------------------------------
// Types
// ---------------------------------------------------------------------------

type EmptyMessageCardBaseProps = {
type EmptyMessageCardProps = {
/** Icon displayed alongside the title. */
icon?: IconFunctionComponent;

/** Primary message text. */
title: string | RichStr;
title: string;

/** Padding preset for the card. @default "md" */
padding?: PaddingVariants;
@@ -25,30 +21,16 @@ type EmptyMessageCardBaseProps = {
ref?: React.Ref<HTMLDivElement>;
};

type EmptyMessageCardProps =
| (EmptyMessageCardBaseProps & {
/** @default "secondary" */
sizePreset?: "secondary";
})
| (EmptyMessageCardBaseProps & {
sizePreset: "main-ui";
/** Description text. Only supported when `sizePreset` is `"main-ui"`. */
description?: string | RichStr;
});

// ---------------------------------------------------------------------------
// EmptyMessageCard
// ---------------------------------------------------------------------------

function EmptyMessageCard(props: EmptyMessageCardProps) {
const {
sizePreset = "secondary",
icon = SvgEmpty,
title,
padding = "md",
ref,
} = props;

function EmptyMessageCard({
icon = SvgEmpty,
title,
padding = "md",
ref,
}: EmptyMessageCardProps) {
return (
<Card
ref={ref}
@@ -57,23 +39,13 @@ function EmptyMessageCard(props: EmptyMessageCardProps) {
padding={padding}
rounding="md"
>
{sizePreset === "secondary" ? (
<Content
icon={icon}
title={title}
sizePreset="secondary"
variant="body"
prominence="muted"
/>
) : (
<Content
icon={icon}
title={title}
description={"description" in props ? props.description : undefined}
sizePreset={sizePreset}
variant="section"
/>
)}
<Content
icon={icon}
title={title}
sizePreset="secondary"
variant="body"
prominence="muted"
/>
</Card>
);
}

@@ -1,16 +1,41 @@
"use client";

import "@opal/core/animations/styles.css";
import React from "react";
import React, { createContext, useContext, useState, useCallback } from "react";
import { cn } from "@opal/utils";
import type { WithoutStyles, ExtremaSizeVariants } from "@opal/types";
import { widthVariants } from "@opal/shared";

// ---------------------------------------------------------------------------
// Types
// Context-per-group registry
// ---------------------------------------------------------------------------

type HoverableInteraction = "rest" | "hover";
/**
* Lazily-created map of group names to React contexts.
*
* Each group gets its own `React.Context<boolean | null>` so that a
* `Hoverable.Item` only re-renders when its *own* group's hover state
* changes — not when any unrelated group changes.
*
* The default value is `null` (no provider found), which lets
* `Hoverable.Item` distinguish "no Root ancestor" from "Root says
* not hovered" and throw when `group` was explicitly specified.
*/
const contextMap = new Map<string, React.Context<boolean | null>>();

function getOrCreateContext(group: string): React.Context<boolean | null> {
let ctx = contextMap.get(group);
if (!ctx) {
ctx = createContext<boolean | null>(null);
ctx.displayName = `HoverableContext(${group})`;
contextMap.set(group, ctx);
}
return ctx;
}

// ---------------------------------------------------------------------------
// Types
// ---------------------------------------------------------------------------

interface HoverableRootProps
extends WithoutStyles<React.HTMLAttributes<HTMLDivElement>> {
@@ -18,17 +43,6 @@ interface HoverableRootProps
group: string;
/** Width preset. @default "auto" */
widthVariant?: ExtremaSizeVariants;
/**
* JS-controllable interaction state override.
*
* - `"rest"` (default): items are shown/hidden by CSS `:hover`.
* - `"hover"`: forces items visible regardless of hover state. Useful when
* a hoverable action opens a modal — set `interaction="hover"` while the
* modal is open so the user can see which element they're interacting with.
*
* @default "rest"
*/
interaction?: HoverableInteraction;
/** Ref forwarded to the root `<div>`. */
ref?: React.Ref<HTMLDivElement>;
}
@@ -51,10 +65,12 @@ interface HoverableItemProps
/**
* Hover-tracking container for a named group.
*
* Uses a `data-hover-group` attribute and CSS `:hover` to control
* descendant `Hoverable.Item` visibility. No React state or context —
* the browser natively removes `:hover` when modals/portals steal
* pointer events, preventing stale hover state.
* Wraps children in a `<div>` that tracks mouse-enter / mouse-leave and
* provides the hover state via a per-group React context.
*
* Nesting works because each `Hoverable.Root` creates a **new** context
* provider that shadows the parent — so an inner `Hoverable.Item group="b"`
* reads from the inner provider, not the outer `group="a"` provider.
*
* @example
* ```tsx
@@ -71,20 +87,70 @@ function HoverableRoot({
group,
children,
widthVariant = "full",
interaction = "rest",
ref,
onMouseEnter: consumerMouseEnter,
onMouseLeave: consumerMouseLeave,
onFocusCapture: consumerFocusCapture,
onBlurCapture: consumerBlurCapture,
...props
}: HoverableRootProps) {
const [hovered, setHovered] = useState(false);
const [focused, setFocused] = useState(false);

const onMouseEnter = useCallback(
(e: React.MouseEvent<HTMLDivElement>) => {
setHovered(true);
consumerMouseEnter?.(e);
},
[consumerMouseEnter]
);

const onMouseLeave = useCallback(
(e: React.MouseEvent<HTMLDivElement>) => {
setHovered(false);
consumerMouseLeave?.(e);
},
[consumerMouseLeave]
);

const onFocusCapture = useCallback(
(e: React.FocusEvent<HTMLDivElement>) => {
setFocused(true);
consumerFocusCapture?.(e);
},
[consumerFocusCapture]
);

const onBlurCapture = useCallback(
(e: React.FocusEvent<HTMLDivElement>) => {
if (
!(e.relatedTarget instanceof Node) ||
!e.currentTarget.contains(e.relatedTarget)
) {
setFocused(false);
}
consumerBlurCapture?.(e);
},
[consumerBlurCapture]
);

const active = hovered || focused;
const GroupContext = getOrCreateContext(group);

return (
<div
{...props}
ref={ref}
className={cn(widthVariants[widthVariant])}
data-hover-group={group}
data-interaction={interaction !== "rest" ? interaction : undefined}
>
{children}
</div>
<GroupContext.Provider value={active}>
<div
{...props}
ref={ref}
className={cn(widthVariants[widthVariant])}
onMouseEnter={onMouseEnter}
onMouseLeave={onMouseLeave}
onFocusCapture={onFocusCapture}
onBlurCapture={onBlurCapture}
>
{children}
</div>
</GroupContext.Provider>
);
}

@@ -96,10 +162,13 @@ function HoverableRoot({
* An element whose visibility is controlled by hover state.
*
* **Local mode** (`group` omitted): the item handles hover on its own
* element via CSS `:hover`.
* element via CSS `:hover`. This is the core abstraction.
*
* **Group mode** (`group` provided): visibility is driven by CSS `:hover`
* on the nearest `Hoverable.Root` ancestor via `[data-hover-group]:hover`.
* **Group mode** (`group` provided): visibility is driven by a matching
* `Hoverable.Root` ancestor's hover state via React context. If no
* matching Root is found, an error is thrown.
*
* Uses data-attributes for variant styling (see `styles.css`).
*
* @example
* ```tsx
@@ -115,6 +184,8 @@ function HoverableRoot({
* </Hoverable.Item>
* </Hoverable.Root>
* ```
*
* @throws If `group` is specified but no matching `Hoverable.Root` ancestor exists.
*/
function HoverableItem({
group,
@@ -123,6 +194,17 @@ function HoverableItem({
ref,
...props
}: HoverableItemProps) {
const contextValue = useContext(
group ? getOrCreateContext(group) : NOOP_CONTEXT
);

if (group && contextValue === null) {
throw new Error(
`Hoverable.Item group="${group}" has no matching Hoverable.Root ancestor. ` +
`Either wrap it in <Hoverable.Root group="${group}"> or remove the group prop for local hover.`
);
}

const isLocal = group === undefined;

return (
@@ -131,6 +213,9 @@ function HoverableItem({
ref={ref}
className={cn("hoverable-item")}
data-hoverable-variant={variant}
data-hoverable-active={
isLocal ? undefined : contextValue ? "true" : undefined
}
data-hoverable-local={isLocal ? "true" : undefined}
>
{children}
@@ -138,6 +223,9 @@ function HoverableItem({
);
}

/** Stable context used when no group is specified (local mode). */
const NOOP_CONTEXT = createContext<boolean | null>(null);

// ---------------------------------------------------------------------------
// Compound export
// ---------------------------------------------------------------------------
@@ -145,16 +233,18 @@ function HoverableItem({
/**
* Hoverable compound component for hover-to-reveal patterns.
*
* Entirely CSS-driven — no React state or context. The browser's native
* `:hover` pseudo-class handles all state, which means hover is
* automatically cleared when modals/portals steal pointer events.
* Provides two sub-components:
*
* - `Hoverable.Root` — Container with `data-hover-group`. CSS `:hover`
* on this element reveals descendant `Hoverable.Item` elements.
* - `Hoverable.Root` — A container that tracks hover state for a named group
* and provides it via React context.
*
* - `Hoverable.Item` — Hidden by default. In group mode, revealed when
* the ancestor Root is hovered. In local mode (no `group`), revealed
* when the item itself is hovered.
* - `Hoverable.Item` — The core abstraction. On its own (no `group`), it
* applies local CSS `:hover` for the variant effect. When `group` is
* specified, it reads hover state from the nearest matching
* `Hoverable.Root` — and throws if no matching Root is found.
*
* Supports nesting: a child `Hoverable.Root` shadows the parent's context,
* so each group's items only respond to their own root's hover.
*
* @example
* ```tsx
@@ -186,5 +276,4 @@ export {
type HoverableRootProps,
type HoverableItemProps,
type HoverableItemVariant,
type HoverableInteraction,
};

@@ -7,20 +7,8 @@
opacity: 0;
}

/* Group mode — Root :hover controls descendant item visibility via CSS.
Exclude local-mode items so they aren't revealed by an ancestor root. */
[data-hover-group]:hover
.hoverable-item[data-hoverable-variant="opacity-on-hover"]:not(
[data-hoverable-local]
) {
opacity: 1;
}

/* Interaction override — force items visible via JS */
[data-hover-group][data-interaction="hover"]
.hoverable-item[data-hoverable-variant="opacity-on-hover"]:not(
[data-hoverable-local]
) {
/* Group mode — Root controls visibility via React context */
.hoverable-item[data-hoverable-variant="opacity-on-hover"][data-hoverable-active="true"] {
opacity: 1;
}

@@ -29,16 +17,7 @@
opacity: 1;
}

/* Group focus — any focusable descendant of the Root receives keyboard focus,
revealing all group items (same behavior as hover). */
[data-hover-group]:focus-within
.hoverable-item[data-hoverable-variant="opacity-on-hover"]:not(
[data-hoverable-local]
) {
opacity: 1;
}

/* Local focus — item (or a focusable descendant) receives keyboard focus */
/* Focus — item (or a focusable descendant) receives keyboard focus */
.hoverable-item[data-hoverable-variant="opacity-on-hover"]:has(:focus-visible) {
opacity: 1;
}

@@ -8,7 +8,7 @@ const SvgBifrost = ({ size, className, ...props }: IconProps) => (
viewBox="0 0 37 46"
fill="none"
xmlns="http://www.w3.org/2000/svg"
className={cn(className, "!text-[#33C19E]")}
className={cn(className, "text-[#33C19E] dark:text-white")}
{...props}
>
<title>Bifrost</title>

@@ -1,116 +0,0 @@
# Card

**Import:** `import { Card } from "@opal/layouts";`

A namespace of card layout primitives. Each sub-component handles a specific region of a card.

## Card.Header

A card header layout that pairs a [`Content`](../content/README.md) block with a right-side column and an optional full-width children slot.

### Why Card.Header?

[`ContentAction`](../content-action/README.md) provides a single `rightChildren` slot. Card headers typically need two distinct right-side regions — a primary action on top and secondary actions on the bottom. `Card.Header` provides this with `rightChildren` and `bottomRightChildren` slots, plus a `children` slot for full-width content below the header row (e.g., search bars, expandable tool lists).

### Props

Inherits **all** props from [`Content`](../content/README.md) (icon, title, description, sizePreset, variant, editable, onTitleChange, suffix, etc.) plus:

| Prop | Type | Default | Description |
|---|---|---|---|
| `rightChildren` | `ReactNode` | `undefined` | Content rendered to the right of the Content block (top of right column). |
| `bottomRightChildren` | `ReactNode` | `undefined` | Content rendered below `rightChildren` in the same column. Laid out as `flex flex-row`. |
| `children` | `ReactNode` | `undefined` | Content rendered below the full header row, spanning the entire width. |

### Layout Structure

```
+---------------------------------------------------------+
| [Content (p-2, self-start)] [rightChildren] |
| icon + title + description [bottomRightChildren] |
+---------------------------------------------------------+
| [children — full width] |
+---------------------------------------------------------+
```

- Outer wrapper: `flex flex-col w-full`
- Header row: `flex flex-row items-stretch w-full`
- Content area: `flex-1 min-w-0 self-start p-2` — top-aligned with fixed padding
- Right column: `flex flex-col items-end shrink-0` — no padding, no gap
- `bottomRightChildren` wrapper: `flex flex-row` — lays children out horizontally
- `children` wrapper: `w-full` — only rendered when children are provided

### Usage

#### Card with primary and secondary actions

```tsx
import { Card } from "@opal/layouts";
import { Button } from "@opal/components";
import { SvgGlobe, SvgSettings, SvgUnplug, SvgCheckSquare } from "@opal/icons";

<Card.Header
icon={SvgGlobe}
title="Google Search"
description="Web search provider"
sizePreset="main-ui"
variant="section"
rightChildren={
<Button icon={SvgCheckSquare} variant="action" prominence="tertiary">
Current Default
</Button>
}
bottomRightChildren={
<>
<Button icon={SvgUnplug} size="sm" prominence="tertiary" tooltip="Disconnect" />
<Button icon={SvgSettings} size="sm" prominence="tertiary" tooltip="Edit" />
</>
}
/>
```

#### Card with only a connect action

```tsx
<Card.Header
icon={SvgCloud}
title="OpenAI"
description="Not configured"
sizePreset="main-ui"
variant="section"
rightChildren={
<Button rightIcon={SvgArrowExchange} prominence="tertiary">
Connect
</Button>
}
/>
```

#### Card with expandable children

```tsx
<Card.Header
icon={SvgServer}
title="MCP Server"
description="12 tools available"
sizePreset="main-ui"
variant="section"
rightChildren={<Button icon={SvgSettings} prominence="tertiary" />}
>
<SearchBar placeholder="Search tools..." />
</Card.Header>
```

#### No right children

```tsx
<Card.Header
icon={SvgInfo}
title="Section Header"
description="Description text"
sizePreset="main-content"
variant="section"
/>
```

When both `rightChildren` and `bottomRightChildren` are omitted and no `children` are provided, the component renders only the padded `Content`.
@@ -1,5 +1,5 @@
import type { Meta, StoryObj } from "@storybook/react";
import { Card } from "@opal/layouts";
import { CardHeaderLayout } from "@opal/layouts";
import { Button } from "@opal/components";
import {
SvgArrowExchange,
@@ -18,14 +18,14 @@ const withTooltipProvider: Decorator = (Story) => (
);

const meta = {
title: "Layouts/Card.Header",
component: Card.Header,
title: "Layouts/CardHeaderLayout",
component: CardHeaderLayout,
tags: ["autodocs"],
decorators: [withTooltipProvider],
parameters: {
layout: "centered",
},
} satisfies Meta<typeof Card.Header>;
} satisfies Meta<typeof CardHeaderLayout>;

export default meta;

@@ -38,7 +38,7 @@ type Story = StoryObj<typeof meta>;
export const Default: Story = {
render: () => (
<div className="w-[28rem] border rounded-16">
<Card.Header
<CardHeaderLayout
sizePreset="main-ui"
variant="section"
icon={SvgGlobe}
@@ -57,7 +57,7 @@ export const Default: Story = {
export const WithBothSlots: Story = {
render: () => (
<div className="w-[28rem] border rounded-16">
<Card.Header
<CardHeaderLayout
sizePreset="main-ui"
variant="section"
icon={SvgGlobe}
@@ -92,7 +92,7 @@ export const WithBothSlots: Story = {
export const RightChildrenOnly: Story = {
render: () => (
<div className="w-[28rem] border rounded-16">
<Card.Header
<CardHeaderLayout
sizePreset="main-ui"
variant="section"
icon={SvgGlobe}
@@ -111,7 +111,7 @@ export const RightChildrenOnly: Story = {
export const NoRightChildren: Story = {
render: () => (
<div className="w-[28rem] border rounded-16">
<Card.Header
<CardHeaderLayout
sizePreset="main-ui"
variant="section"
icon={SvgGlobe}
@@ -125,7 +125,7 @@ export const NoRightChildren: Story = {
export const LongContent: Story = {
render: () => (
<div className="w-[28rem] border rounded-16">
<Card.Header
<CardHeaderLayout
sizePreset="main-ui"
variant="section"
icon={SvgGlobe}