feat(permissions): update permission checks for creating personal access tokens and enhance UI with permission validation

feat(permissions): update permission identifiers for service accounts and bots management
feat(permissions): update permission checks for query history access
2026-04-21 17:36:44 +00:00 · 2026-04-11 18:15:42 +05:30 · 2026-04-11 17:51:50 +05:30 · 2026-04-11 17:45:41 +05:30 · 2026-04-10 10:35:29 +05:30 · 2026-04-09 17:56:13 +05:30
244 changed files with 7418 additions and 8481 deletions
--- a/.github/workflows/deployment.yml
+++ b/.github/workflows/deployment.yml
@@ -13,7 +13,7 @@ permissions:
  id-token: write # zizmor: ignore[excessive-permissions]

 env:
-  EDGE_TAG: ${{ startsWith(github.ref_name, 'nightly-latest') || github.ref_name == 'edge' }}
+  EDGE_TAG: ${{ startsWith(github.ref_name, 'nightly-latest') }}

 jobs:
  # Determine which components to build based on the tag
@@ -156,7 +156,7 @@ jobs:
  check-version-tag:
    runs-on: ubuntu-slim
    timeout-minutes: 10
-    if: ${{ !startsWith(github.ref_name, 'nightly-latest') && github.ref_name != 'edge' && github.event_name != 'workflow_dispatch' }}
+    if: ${{ !startsWith(github.ref_name, 'nightly-latest') && github.event_name != 'workflow_dispatch' }}
    steps:
      - name: Checkout
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -9,6 +9,7 @@ repos:
    rev: d30b4298e4fb63ce8609e29acdbcf4c9018a483c
    hooks:
      - id: uv-sync
+        args: ["--locked", "--all-extras"]
      - id: uv-lock
      - id: uv-export
        name: uv-export default.txt
@@ -17,7 +18,7 @@ repos:
            "--no-emit-project",
            "--no-default-groups",
            "--no-hashes",
-            "--group",
+            "--extra",
            "backend",
            "-o",
            "backend/requirements/default.txt",
@@ -30,7 +31,7 @@ repos:
            "--no-emit-project",
            "--no-default-groups",
            "--no-hashes",
-            "--group",
+            "--extra",
            "dev",
            "-o",
            "backend/requirements/dev.txt",
@@ -43,7 +44,7 @@ repos:
            "--no-emit-project",
            "--no-default-groups",
            "--no-hashes",
-            "--group",
+            "--extra",
            "ee",
            "-o",
            "backend/requirements/ee.txt",
@@ -56,7 +57,7 @@ repos:
            "--no-emit-project",
            "--no-default-groups",
            "--no-hashes",
-            "--group",
+            "--extra",
            "model_server",
            "-o",
            "backend/requirements/model_server.txt",
--- a/.vscode/launch.json
+++ b/.vscode/launch.json
@@ -531,7 +531,8 @@
      "request": "launch",
      "runtimeExecutable": "uv",
      "runtimeArgs": [
-        "sync"
+        "sync",
+        "--all-extras"
      ],
      "cwd": "${workspaceFolder}",
      "console": "integratedTerminal",
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -117,7 +117,7 @@ If using PowerShell, the command slightly differs:
 Install the required Python dependencies:

 ```bash
-uv sync
+uv sync --all-extras
 ```

 Install Playwright for Python (headless browser required by the Web Connector):
--- a/backend/ee/onyx/db/license.py
+++ b/backend/ee/onyx/db/license.py
@@ -13,7 +13,6 @@ from ee.onyx.server.license.models import LicenseSource
 from onyx.auth.schemas import UserRole
 from onyx.cache.factory import get_cache_backend
 from onyx.configs.constants import ANONYMOUS_USER_EMAIL
-from onyx.db.enums import AccountType
 from onyx.db.models import License
 from onyx.db.models import User
 from onyx.utils.logger import setup_logger
@@ -108,13 +107,12 @@ def get_used_seats(tenant_id: str | None = None) -> int:
    Get current seat usage directly from database.

    For multi-tenant: counts users in UserTenantMapping for this tenant.
-    For self-hosted: counts all active users.
+    For self-hosted: counts all active users (excludes EXT_PERM_USER role
+    and the anonymous system user).

-    Only human accounts count toward seat limits.
-    SERVICE_ACCOUNT (API key dummy users), EXT_PERM_USER, and the
-    anonymous system user are excluded. BOT (Slack users) ARE counted
-    because they represent real humans and get upgraded to STANDARD
-    when they log in via web.
+    TODO: Exclude API key dummy users from seat counting. API keys create
+    users with emails like `__DANSWER_API_KEY_*` that should not count toward
+    seat limits. See: https://linear.app/onyx-app/issue/ENG-3518
    """
    if MULTI_TENANT:
        from ee.onyx.server.tenants.user_mapping import get_tenant_count
@@ -131,7 +129,6 @@ def get_used_seats(tenant_id: str | None = None) -> int:
                    User.is_active == True,  # type: ignore  # noqa: E712
                    User.role != UserRole.EXT_PERM_USER,
                    User.email != ANONYMOUS_USER_EMAIL,  # type: ignore
-                    User.account_type != AccountType.SERVICE_ACCOUNT,
                )
            )
            return result.scalar() or 0
--- a/backend/ee/onyx/db/user_group.py
+++ b/backend/ee/onyx/db/user_group.py
@@ -996,3 +996,72 @@ def set_group_permission__no_commit(

    db_session.flush()
    recompute_permissions_for_group__no_commit(group_id, db_session)
+
+
+def set_group_permissions_bulk__no_commit(
+    group_id: int,
+    desired_permissions: set[Permission],
+    granted_by: UUID,
+    db_session: Session,
+) -> list[Permission]:
+    """Set the full desired permission state for a group in one pass.
+
+    Enables permissions in `desired_permissions`, disables any toggleable
+    permission not in the set. Non-toggleable permissions are ignored.
+    Calls recompute once at the end. Does NOT commit.
+
+    Returns the resulting list of enabled permissions.
+    """
+
+    existing_grants = (
+        db_session.execute(
+            select(PermissionGrant)
+            .where(PermissionGrant.group_id == group_id)
+            .with_for_update()
+        )
+        .scalars()
+        .all()
+    )
+
+    grant_map: dict[Permission, PermissionGrant] = {
+        g.permission: g for g in existing_grants
+    }
+
+    # Enable desired permissions
+    for perm in desired_permissions:
+        existing = grant_map.get(perm)
+        if existing is not None:
+            if existing.is_deleted:
+                existing.is_deleted = False
+                existing.granted_by = granted_by
+                existing.granted_at = func.now()
+        else:
+            db_session.add(
+                PermissionGrant(
+                    group_id=group_id,
+                    permission=perm,
+                    grant_source=GrantSource.USER,
+                    granted_by=granted_by,
+                )
+            )
+
+    # Disable toggleable permissions not in the desired set
+    for perm, grant in grant_map.items():
+        if perm not in desired_permissions and not grant.is_deleted:
+            grant.is_deleted = True
+
+    db_session.flush()
+    recompute_permissions_for_group__no_commit(group_id, db_session)
+
+    # Return the resulting enabled set
+    return [
+        g.permission
+        for g in db_session.execute(
+            select(PermissionGrant).where(
+                PermissionGrant.group_id == group_id,
+                PermissionGrant.is_deleted.is_(False),
+            )
+        )
+        .scalars()
+        .all()
+    ]
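Note on `set_group_permissions_bulk__no_commit`: the reconciliation it performs can be sketched in memory, without a database session. The sketch below is illustrative only. `Grant`, `NON_TOGGLEABLE`, `reconcile`, and the permission strings are hypothetical stand-ins, not names from the codebase. The docstring says non-toggleable permissions are ignored, so the sketch adds an explicit guard in the disable loop; that guard is an assumption about the intended behavior, not something the hunk above shows.

```python
from dataclasses import dataclass


@dataclass
class Grant:
    """Hypothetical in-memory stand-in for a PermissionGrant row."""

    permission: str
    is_deleted: bool = False


# Assumed set of non-toggleable permissions, for illustration only.
NON_TOGGLEABLE = frozenset({"basic", "admin"})


def reconcile(grants: dict[str, Grant], desired: set[str]) -> list[str]:
    """Apply the desired toggleable state to `grants` in place.

    Returns the sorted list of permissions that are enabled afterwards.
    """
    # Enable: create missing grants and revive soft-deleted ones.
    for perm in desired:
        existing = grants.get(perm)
        if existing is None:
            grants[perm] = Grant(permission=perm)
        elif existing.is_deleted:
            existing.is_deleted = False

    # Disable: soft-delete active grants that were not requested.
    for perm, grant in grants.items():
        if perm in NON_TOGGLEABLE:
            continue  # assumed guard: never touch non-toggleable grants
        if perm not in desired and not grant.is_deleted:
            grant.is_deleted = True

    return sorted(p for p, g in grants.items() if not g.is_deleted)
```

Because the caller sends the full desired set, a repeated call with the same input leaves the state unchanged, which is the main difference from the single-permission toggle this endpoint replaces.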
--- a/backend/ee/onyx/server/query_history/api.py
+++ b/backend/ee/onyx/server/query_history/api.py
@@ -154,7 +154,7 @@ def snapshot_from_chat_session(
 @router.get("/admin/chat-sessions")
 def admin_get_chat_sessions(
    user_id: UUID,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.READ_QUERY_HISTORY)),
    db_session: Session = Depends(get_session),
 ) -> ChatSessionsResponse:
    # we specifically don't allow this endpoint if "anonymized" since
@@ -197,7 +197,7 @@ def get_chat_session_history(
    feedback_type: QAFeedbackType | None = None,
    start_time: datetime | None = None,
    end_time: datetime | None = None,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.READ_QUERY_HISTORY)),
    db_session: Session = Depends(get_session),
 ) -> PaginatedReturn[ChatSessionMinimal]:
    ensure_query_history_is_enabled(disallowed=[QueryHistoryType.DISABLED])
@@ -235,7 +235,7 @@ def get_chat_session_history(
 @router.get("/admin/chat-session-history/{chat_session_id}")
 def get_chat_session_admin(
    chat_session_id: UUID,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.READ_QUERY_HISTORY)),
    db_session: Session = Depends(get_session),
 ) -> ChatSessionSnapshot:
    ensure_query_history_is_enabled(disallowed=[QueryHistoryType.DISABLED])
@@ -270,7 +270,7 @@ def get_chat_session_admin(

@router.get("/admin/query-history/list")
 def list_all_query_history_exports(
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.READ_QUERY_HISTORY)),
    db_session: Session = Depends(get_session),
 ) -> list[QueryHistoryExport]:
    ensure_query_history_is_enabled(disallowed=[QueryHistoryType.DISABLED])
@@ -298,7 +298,7 @@ def list_all_query_history_exports(

@router.post("/admin/query-history/start-export", tags=PUBLIC_API_TAGS)
 def start_query_history_export(
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.READ_QUERY_HISTORY)),
    db_session: Session = Depends(get_session),
    start: datetime | None = None,
    end: datetime | None = None,
@@ -345,7 +345,7 @@ def start_query_history_export(
 @router.get("/admin/query-history/export-status", tags=PUBLIC_API_TAGS)
 def get_query_history_export_status(
    request_id: str,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.READ_QUERY_HISTORY)),
    db_session: Session = Depends(get_session),
 ) -> dict[str, str]:
    ensure_query_history_is_enabled(disallowed=[QueryHistoryType.DISABLED])
@@ -379,7 +379,7 @@ def get_query_history_export_status(
 @router.get("/admin/query-history/download", tags=PUBLIC_API_TAGS)
 def download_query_history_csv(
    request_id: str,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.READ_QUERY_HISTORY)),
    db_session: Session = Depends(get_session),
 ) -> StreamingResponse:
    ensure_query_history_is_enabled(disallowed=[QueryHistoryType.DISABLED])
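Note on the query history change: every endpoint in this file now depends on `Permission.READ_QUERY_HISTORY` instead of `Permission.FULL_ADMIN_PANEL_ACCESS`. The effect of such a dependency factory can be sketched with the standard library alone. Everything below is a simplified assumption: `SketchUser`, `PermissionDenied`, and `make_permission_check` are hypothetical names, the permission strings are placeholders, and the real `require_permission` returns an async FastAPI dependency rather than a plain function.

```python
from collections.abc import Callable
from dataclasses import dataclass, field


@dataclass
class SketchUser:
    """Hypothetical user carrying an already-expanded permission set."""

    email: str
    permissions: set[str] = field(default_factory=set)


class PermissionDenied(Exception):
    """Raised when the user lacks the required permission."""


def make_permission_check(required: str) -> Callable[[SketchUser], SketchUser]:
    """Build a checker that returns the user or raises PermissionDenied."""

    def checker(user: SketchUser) -> SketchUser:
        if required not in user.permissions:
            raise PermissionDenied(f"{user.email} lacks '{required}'")
        return user

    return checker


# Placeholder tokens; the real values come from onyx.db.enums.Permission.
check_query_history = make_permission_check("read:query_history")
check_admin_panel = make_permission_check("admin")
```

Under this model, a user granted only the query history permission passes `check_query_history` but is still rejected by `check_admin_panel`, which is the narrower access the diff introduces.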
--- a/backend/ee/onyx/server/scim/api.py
+++ b/backend/ee/onyx/server/scim/api.py
@@ -11,8 +11,6 @@ require a valid SCIM bearer token.

 from __future__ import annotations

-import hashlib
-import struct
 from uuid import UUID

 from fastapi import APIRouter
@@ -24,7 +22,6 @@ from fastapi import Response
 from fastapi.responses import JSONResponse
 from fastapi_users.password import PasswordHelper
 from sqlalchemy import func
-from sqlalchemy import text
 from sqlalchemy.exc import IntegrityError
 from sqlalchemy.orm import Session

@@ -68,25 +65,12 @@ from onyx.db.permissions import recompute_user_permissions__no_commit
 from onyx.db.users import assign_user_to_default_groups__no_commit
 from onyx.utils.logger import setup_logger
 from onyx.utils.variable_functionality import fetch_ee_implementation_or_noop
-from shared_configs.contextvars import get_current_tenant_id

 logger = setup_logger()

 # Group names reserved for system default groups (seeded by migration).
 _RESERVED_GROUP_NAMES = frozenset({"Admin", "Basic"})

-# Namespace prefix for the seat-allocation advisory lock. Hashed together
-# with the tenant ID so the lock is scoped per-tenant (unrelated tenants
-# never block each other) and cannot collide with unrelated advisory locks.
-_SEAT_LOCK_NAMESPACE = "onyx_scim_seat_lock"
-
-
-def _seat_lock_id_for_tenant(tenant_id: str) -> int:
-    """Derive a stable 64-bit signed int lock id for this tenant's seat lock."""
-    digest = hashlib.sha256(f"{_SEAT_LOCK_NAMESPACE}:{tenant_id}".encode()).digest()
-    # pg_advisory_xact_lock takes a signed 8-byte int; unpack as such.
-    return struct.unpack("q", digest[:8])[0]
-

 class ScimJSONResponse(JSONResponse):
    """JSONResponse with Content-Type: application/scim+json (RFC 7644 §3.1)."""
@@ -225,37 +209,12 @@ def _apply_exclusions(


 def _check_seat_availability(dal: ScimDAL) -> str | None:
-    """Return an error message if seat limit is reached, else None.
-
-    Acquires a transaction-scoped advisory lock so that concurrent
-    SCIM requests are serialized.  IdPs like Okta send provisioning
-    requests in parallel batches — without serialization the check is
-    vulnerable to a TOCTOU race where N concurrent requests each see
-    "seats available", all insert, and the tenant ends up over its
-    seat limit.
-
-    The lock is held until the caller's next COMMIT or ROLLBACK, which
-    means the seat count cannot change between the check here and the
-    subsequent INSERT/UPDATE.  Each call site in this module follows
-    the pattern: _check_seat_availability → write → dal.commit()
-    (which releases the lock for the next waiting request).
-    """
+    """Return an error message if seat limit is reached, else None."""
    check_fn = fetch_ee_implementation_or_noop(
        "onyx.db.license", "check_seat_availability", None
    )
    if check_fn is None:
        return None
-
-    # Transaction-scoped advisory lock — released on dal.commit() / dal.rollback().
-    # The lock id is derived from the tenant so unrelated tenants never block
-    # each other, and from a namespace string so it cannot collide with
-    # unrelated advisory locks elsewhere in the codebase.
-    lock_id = _seat_lock_id_for_tenant(get_current_tenant_id())
-    dal.session.execute(
-        text("SELECT pg_advisory_xact_lock(:lock_id)"),
-        {"lock_id": lock_id},
-    )
-
    result = check_fn(dal.session, seats_needed=1)
    if not result.available:
        return result.error_message or "Seat limit reached"
--- a/backend/ee/onyx/server/user_group/api.py
+++ b/backend/ee/onyx/server/user_group/api.py
@@ -13,20 +13,21 @@ from ee.onyx.db.user_group import fetch_user_groups_for_user
 from ee.onyx.db.user_group import insert_user_group
 from ee.onyx.db.user_group import prepare_user_group_for_deletion
 from ee.onyx.db.user_group import rename_user_group
-from ee.onyx.db.user_group import set_group_permission__no_commit
+from ee.onyx.db.user_group import set_group_permissions_bulk__no_commit
 from ee.onyx.db.user_group import update_user_curator_relationship
 from ee.onyx.db.user_group import update_user_group
 from ee.onyx.server.user_group.models import AddUsersToUserGroupRequest
+from ee.onyx.server.user_group.models import BulkSetPermissionsRequest
 from ee.onyx.server.user_group.models import MinimalUserGroupSnapshot
 from ee.onyx.server.user_group.models import SetCuratorRequest
-from ee.onyx.server.user_group.models import SetPermissionRequest
-from ee.onyx.server.user_group.models import SetPermissionResponse
 from ee.onyx.server.user_group.models import UpdateGroupAgentsRequest
 from ee.onyx.server.user_group.models import UserGroup
 from ee.onyx.server.user_group.models import UserGroupCreate
 from ee.onyx.server.user_group.models import UserGroupRename
 from ee.onyx.server.user_group.models import UserGroupUpdate
 from onyx.auth.permissions import NON_TOGGLEABLE_PERMISSIONS
+from onyx.auth.permissions import PERMISSION_REGISTRY
+from onyx.auth.permissions import PermissionRegistryEntry
 from onyx.auth.permissions import require_permission
 from onyx.auth.users import current_curator_or_admin_user
 from onyx.configs.app_configs import DISABLE_VECTOR_DB
@@ -48,24 +49,15 @@ router = APIRouter(prefix="/manage", tags=PUBLIC_API_TAGS)
 @router.get("/admin/user-group")
 def list_user_groups(
    include_default: bool = False,
-    user: User = Depends(current_curator_or_admin_user),
+    _: User = Depends(require_permission(Permission.READ_USER_GROUPS)),
    db_session: Session = Depends(get_session),
 ) -> list[UserGroup]:
-    if user.role == UserRole.ADMIN:
-        user_groups = fetch_user_groups(
-            db_session,
-            only_up_to_date=False,
-            eager_load_for_snapshot=True,
-            include_default=include_default,
-        )
-    else:
-        user_groups = fetch_user_groups_for_user(
-            db_session=db_session,
-            user_id=user.id,
-            only_curator_groups=user.role == UserRole.CURATOR,
-            eager_load_for_snapshot=True,
-            include_default=include_default,
-        )
+    user_groups = fetch_user_groups(
+        db_session,
+        only_up_to_date=False,
+        eager_load_for_snapshot=True,
+        include_default=include_default,
+    )
    return [UserGroup.from_model(user_group) for user_group in user_groups]


@@ -92,6 +84,13 @@ def list_minimal_user_groups(
    ]


+@router.get("/admin/permissions/registry")
+def get_permission_registry(
+    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+) -> list[PermissionRegistryEntry]:
+    return PERMISSION_REGISTRY
+
+
 @router.get("/admin/user-group/{user_group_id}/permissions")
 def get_user_group_permissions(
    user_group_id: int,
@@ -102,37 +101,39 @@ def get_user_group_permissions(
    if group is None:
        raise OnyxError(OnyxErrorCode.NOT_FOUND, "User group not found")
    return [
-        grant.permission for grant in group.permission_grants if not grant.is_deleted
+        grant.permission
+        for grant in group.permission_grants
+        if not grant.is_deleted and grant.permission not in NON_TOGGLEABLE_PERMISSIONS
    ]


 @router.put("/admin/user-group/{user_group_id}/permissions")
-def set_user_group_permission(
+def set_user_group_permissions(
    user_group_id: int,
-    request: SetPermissionRequest,
+    request: BulkSetPermissionsRequest,
    user: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
    db_session: Session = Depends(get_session),
-) -> SetPermissionResponse:
+) -> list[Permission]:
    group = fetch_user_group(db_session, user_group_id)
    if group is None:
        raise OnyxError(OnyxErrorCode.NOT_FOUND, "User group not found")

-    if request.permission in NON_TOGGLEABLE_PERMISSIONS:
+    non_toggleable = [p for p in request.permissions if p in NON_TOGGLEABLE_PERMISSIONS]
+    if non_toggleable:
        raise OnyxError(
            OnyxErrorCode.INVALID_INPUT,
-            f"Permission '{request.permission}' cannot be toggled via this endpoint",
+            f"Permissions {non_toggleable} cannot be toggled via this endpoint",
        )

-    set_group_permission__no_commit(
+    result = set_group_permissions_bulk__no_commit(
        group_id=user_group_id,
-        permission=request.permission,
-        enabled=request.enabled,
+        desired_permissions=set(request.permissions),
        granted_by=user.id,
        db_session=db_session,
    )
    db_session.commit()

-    return SetPermissionResponse(permission=request.permission, enabled=request.enabled)
+    return result


 @router.post("/admin/user-group")
--- a/backend/ee/onyx/server/user_group/models.py
+++ b/backend/ee/onyx/server/user_group/models.py
@@ -132,3 +132,7 @@ class SetPermissionRequest(BaseModel):
 class SetPermissionResponse(BaseModel):
    permission: Permission
    enabled: bool
+
+
+class BulkSetPermissionsRequest(BaseModel):
+    permissions: list[Permission]
--- a/backend/onyx/auth/permissions.py
+++ b/backend/onyx/auth/permissions.py
@@ -11,6 +11,8 @@ from collections.abc import Coroutine
 from typing import Any

 from fastapi import Depends
+from pydantic import BaseModel
+from pydantic import field_validator

 from onyx.auth.users import current_user
 from onyx.db.enums import Permission
@@ -29,14 +31,13 @@ IMPLIED_PERMISSIONS: dict[str, set[str]] = {
    Permission.MANAGE_AGENTS.value: {
        Permission.ADD_AGENTS.value,
        Permission.READ_AGENTS.value,
+        Permission.READ_DOCUMENT_SETS.value,
    },
    Permission.MANAGE_DOCUMENT_SETS.value: {
        Permission.READ_DOCUMENT_SETS.value,
        Permission.READ_CONNECTORS.value,
    },
-    Permission.ADD_CONNECTORS.value: {Permission.READ_CONNECTORS.value},
    Permission.MANAGE_CONNECTORS.value: {
-        Permission.ADD_CONNECTORS.value,
        Permission.READ_CONNECTORS.value,
    },
    Permission.MANAGE_USER_GROUPS.value: {
@@ -44,6 +45,11 @@ IMPLIED_PERMISSIONS: dict[str, set[str]] = {
        Permission.READ_DOCUMENT_SETS.value,
        Permission.READ_AGENTS.value,
        Permission.READ_USERS.value,
+        Permission.READ_USER_GROUPS.value,
+    },
+    Permission.MANAGE_LLMS.value: {
+        Permission.READ_USER_GROUPS.value,
+        Permission.READ_AGENTS.value,
    },
 }

@@ -58,10 +64,129 @@ NON_TOGGLEABLE_PERMISSIONS: frozenset[Permission] = frozenset(
        Permission.READ_DOCUMENT_SETS,
        Permission.READ_AGENTS,
        Permission.READ_USERS,
+        Permission.READ_USER_GROUPS,
    }
 )


+class PermissionRegistryEntry(BaseModel):
+    """A UI-facing permission row served by GET /admin/permissions/registry.
+
+    The field_validator ensures non-toggleable permissions (BASIC_ACCESS,
+    FULL_ADMIN_PANEL_ACCESS, READ_*) can never appear in the registry.
+    """
+
+    id: str
+    display_name: str
+    description: str
+    permissions: list[Permission]
+    group: int
+
+    @field_validator("permissions")
+    @classmethod
+    def must_be_toggleable(cls, v: list[Permission]) -> list[Permission]:
+        for p in v:
+            if p in NON_TOGGLEABLE_PERMISSIONS:
+                raise ValueError(
+                    f"Permission '{p.value}' is not toggleable and "
+                    "cannot be included in the permission registry"
+                )
+        return v
+
+
+# Registry of toggleable permissions exposed to the admin UI.
+# Single source of truth for display names, descriptions, grouping,
+# and which backend tokens each UI row controls.
+# The frontend fetches this via GET /admin/permissions/registry
+# and only adds icon mapping locally.
+PERMISSION_REGISTRY: list[PermissionRegistryEntry] = [
+    # Group 0 — System Configuration
+    PermissionRegistryEntry(
+        id="manage_llms",
+        display_name="Manage LLMs",
+        description="Add and update configurations for language models (LLMs).",
+        permissions=[Permission.MANAGE_LLMS],
+        group=0,
+    ),
+    PermissionRegistryEntry(
+        id="manage_connectors_and_document_sets",
+        display_name="Manage Connectors & Document Sets",
+        description="Add and update connectors and document sets.",
+        permissions=[
+            Permission.MANAGE_CONNECTORS,
+            Permission.MANAGE_DOCUMENT_SETS,
+        ],
+        group=0,
+    ),
+    PermissionRegistryEntry(
+        id="manage_actions",
+        display_name="Manage Actions",
+        description="Add and update custom tools and MCP/OpenAPI actions.",
+        permissions=[Permission.MANAGE_ACTIONS],
+        group=0,
+    ),
+    # Group 1 — User & Access Management
+    PermissionRegistryEntry(
+        id="manage_groups",
+        display_name="Manage Groups",
+        description="Add and update user groups.",
+        permissions=[Permission.MANAGE_USER_GROUPS],
+        group=1,
+    ),
+    PermissionRegistryEntry(
+        id="manage_service_accounts",
+        display_name="Manage Service Accounts",
+        description="Add and update service accounts and their API keys.",
+        permissions=[Permission.MANAGE_SERVICE_ACCOUNT_API_KEYS],
+        group=1,
+    ),
+    PermissionRegistryEntry(
+        id="manage_bots",
+        display_name="Manage Slack/Discord Bots",
+        description="Add and update Onyx integrations with Slack or Discord.",
+        permissions=[Permission.MANAGE_BOTS],
+        group=1,
+    ),
+    # Group 2 — Agents
+    PermissionRegistryEntry(
+        id="create_agents",
+        display_name="Create Agents",
+        description="Create and edit the user's own agents.",
+        permissions=[Permission.ADD_AGENTS],
+        group=2,
+    ),
+    PermissionRegistryEntry(
+        id="manage_agents",
+        display_name="Manage Agents",
+        description="View and update all public and shared agents in the organization.",
+        permissions=[Permission.MANAGE_AGENTS],
+        group=2,
+    ),
+    # Group 3 — Monitoring & Tokens
+    PermissionRegistryEntry(
+        id="view_agent_analytics",
+        display_name="View Agent Analytics",
+        description="View analytics for agents the group can manage.",
+        permissions=[Permission.READ_AGENT_ANALYTICS],
+        group=3,
+    ),
+    PermissionRegistryEntry(
+        id="view_query_history",
+        display_name="View Query History",
+        description="View query history of everyone in the organization.",
+        permissions=[Permission.READ_QUERY_HISTORY],
+        group=3,
+    ),
+    PermissionRegistryEntry(
+        id="create_user_access_token",
+        display_name="Create User Access Token",
+        description="Add and update the user's personal access tokens.",
+        permissions=[Permission.CREATE_USER_API_KEYS],
+        group=3,
+    ),
+]
+
+
 def resolve_effective_permissions(granted: set[str]) -> set[str]:
    """Expand granted permissions with their implied permissions.

@@ -83,7 +208,12 @@ def resolve_effective_permissions(granted: set[str]) -> set[str]:


 def get_effective_permissions(user: User) -> set[Permission]:
-    """Read granted permissions from the column and expand implied permissions."""
+    """Read granted permissions from the column and expand implied permissions.
+
+    Admin-role users always receive all permissions regardless of the JSONB
+    column, maintaining backward compatibility with role-based access control.
+    """
+
    granted: set[Permission] = set()
    for p in user.effective_permissions:
        try:
@@ -96,6 +226,11 @@ def get_effective_permissions(user: User) -> set[Permission]:
    return {Permission(p) for p in expanded}


+def has_permission(user: User, permission: Permission) -> bool:
+    """Check whether *user* holds *permission* (directly or via implication/admin override)."""
+    return permission in get_effective_permissions(user)
+
+
 def require_permission(
    required: Permission,
 ) -> Callable[..., Coroutine[Any, Any, User]]:
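Note on `IMPLIED_PERMISSIONS` and `has_permission`: the expansion step can be sketched as a reachability walk over the implication map. The sketch is a minimal illustration under stated assumptions. `IMPLIED`, `expand`, and `holds` are hypothetical names, the permission strings are placeholders for the `Permission` enum values, and whether the real `resolve_effective_permissions` follows implications transitively is not visible in this diff.

```python
# Placeholder tokens mirroring entries the diff adds to IMPLIED_PERMISSIONS.
IMPLIED: dict[str, set[str]] = {
    "manage:agents": {"add:agents", "read:agents", "read:document_sets"},
    "manage:llms": {"read:user_groups", "read:agents"},
    "manage:user_groups": {"read:user_groups", "read:users"},
}


def expand(granted: set[str]) -> set[str]:
    """Return `granted` plus every permission reachable through IMPLIED."""
    effective = set(granted)
    frontier = list(granted)
    while frontier:
        perm = frontier.pop()
        for implied in IMPLIED.get(perm, set()):
            if implied not in effective:
                effective.add(implied)
                frontier.append(implied)
    return effective


def holds(granted: set[str], permission: str) -> bool:
    """Check a permission against the expanded set, as has_permission does."""
    return permission in expand(granted)
```

With this map, a group granted only the LLM management permission also resolves to the user group read permission, which is what the `list_user_groups` endpoint requires after this change.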
--- a/backend/onyx/background/celery/celery_utils.py
+++ b/backend/onyx/background/celery/celery_utils.py
@@ -1,4 +1,3 @@
-import time
 from collections.abc import Generator
 from collections.abc import Iterator
 from collections.abc import Sequence
@@ -31,8 +30,6 @@ from onyx.connectors.models import HierarchyNode
 from onyx.connectors.models import SlimDocument
 from onyx.httpx.httpx_pool import HttpxPool
 from onyx.indexing.indexing_heartbeat import IndexingHeartbeatInterface
-from onyx.server.metrics.pruning_metrics import inc_pruning_rate_limit_error
-from onyx.server.metrics.pruning_metrics import observe_pruning_enumeration_duration
 from onyx.utils.logger import setup_logger


@@ -133,7 +130,6 @@ def _extract_from_batch(
 def extract_ids_from_runnable_connector(
    runnable_connector: BaseConnector,
    callback: IndexingHeartbeatInterface | None = None,
-    connector_type: str = "unknown",
 ) -> SlimConnectorExtractionResult:
    """
    Extract document IDs and hierarchy nodes from a runnable connector.
@@ -183,38 +179,21 @@ def extract_ids_from_runnable_connector(
    )

    # process raw batches to extract both IDs and hierarchy nodes
-    enumeration_start = time.monotonic()
-    try:
-        for doc_list in raw_batch_generator:
-            if callback and callback.should_stop():
-                raise RuntimeError(
-                    "extract_ids_from_runnable_connector: Stop signal detected"
-                )
+    for doc_list in raw_batch_generator:
+        if callback and callback.should_stop():
+            raise RuntimeError(
+                "extract_ids_from_runnable_connector: Stop signal detected"
+            )

-            batch_result = _extract_from_batch(doc_list)
-            batch_ids = batch_result.raw_id_to_parent
-            batch_nodes = batch_result.hierarchy_nodes
-            doc_batch_processing_func(batch_ids)
-            all_raw_id_to_parent.update(batch_ids)
-            all_hierarchy_nodes.extend(batch_nodes)
+        batch_result = _extract_from_batch(doc_list)
+        batch_ids = batch_result.raw_id_to_parent
+        batch_nodes = batch_result.hierarchy_nodes
+        doc_batch_processing_func(batch_ids)
+        all_raw_id_to_parent.update(batch_ids)
+        all_hierarchy_nodes.extend(batch_nodes)

-            if callback:
-                callback.progress("extract_ids_from_runnable_connector", len(batch_ids))
-    except Exception as e:
-        # Best-effort rate limit detection via string matching.
-        # Connectors surface rate limits inconsistently — some raise HTTP 429,
-        # some use SDK-specific exceptions (e.g. google.api_core.exceptions.ResourceExhausted)
-        # that may or may not include "rate limit" or "429" in the message.
-        # TODO(Bo): replace with a standard ConnectorRateLimitError exception that all
-        # connectors raise when rate limited, making this check precise.
-        error_str = str(e)
-        if "rate limit" in error_str.lower() or "429" in error_str:
-            inc_pruning_rate_limit_error(connector_type)
-        raise
-    finally:
-        observe_pruning_enumeration_duration(
-            time.monotonic() - enumeration_start, connector_type
-        )
+        if callback:
+            callback.progress("extract_ids_from_runnable_connector", len(batch_ids))

    return SlimConnectorExtractionResult(
        raw_id_to_parent=all_raw_id_to_parent,
--- a/backend/onyx/background/celery/tasks/pruning/tasks.py
+++ b/backend/onyx/background/celery/tasks/pruning/tasks.py
@@ -72,7 +72,6 @@ from onyx.redis.redis_hierarchy import get_source_node_id_from_cache
 from onyx.redis.redis_hierarchy import HierarchyNodeCacheEntry
 from onyx.redis.redis_pool import get_redis_client
 from onyx.redis.redis_pool import get_redis_replica_client
-from onyx.server.metrics.pruning_metrics import observe_pruning_diff_duration
 from onyx.server.runtime.onyx_runtime import OnyxRuntime
 from onyx.server.utils import make_short_id
 from onyx.utils.logger import format_error_for_logging
@@ -571,9 +570,8 @@ def connector_pruning_generator_task(
            )

            # Extract docs and hierarchy nodes from the source
-            connector_type = cc_pair.connector.source.value
            extraction_result = extract_ids_from_runnable_connector(
-                runnable_connector, callback, connector_type=connector_type
+                runnable_connector, callback
            )
            all_connector_doc_ids = extraction_result.raw_id_to_parent

@@ -638,46 +636,40 @@ def connector_pruning_generator_task(
                commit=True,
            )

-            diff_start = time.monotonic()
-            try:
-                # a list of docs in our local index
-                all_indexed_document_ids = {
-                    doc.id
-                    for doc in get_documents_for_connector_credential_pair(
-                        db_session=db_session,
-                        connector_id=connector_id,
-                        credential_id=credential_id,
-                    )
-                }
+            # a list of docs in our local index
+            all_indexed_document_ids = {
+                doc.id
+                for doc in get_documents_for_connector_credential_pair(
+                    db_session=db_session,
+                    connector_id=connector_id,
+                    credential_id=credential_id,
+                )
+            }

-                # generate list of docs to remove (no longer in the source)
-                doc_ids_to_remove = list(
-                    all_indexed_document_ids - all_connector_doc_ids.keys()
-                )
+            # generate list of docs to remove (no longer in the source)
+            doc_ids_to_remove = list(
+                all_indexed_document_ids - all_connector_doc_ids.keys()
+            )

-                task_logger.info(
-                    "Pruning set collected: "
-                    f"cc_pair={cc_pair_id} "
-                    f"connector_source={cc_pair.connector.source} "
-                    f"docs_to_remove={len(doc_ids_to_remove)}"
-                )
+            task_logger.info(
+                "Pruning set collected: "
+                f"cc_pair={cc_pair_id} "
+                f"connector_source={cc_pair.connector.source} "
+                f"docs_to_remove={len(doc_ids_to_remove)}"
+            )

-                task_logger.info(
-                    f"RedisConnector.prune.generate_tasks starting. cc_pair={cc_pair_id}"
-                )
-                tasks_generated = redis_connector.prune.generate_tasks(
-                    set(doc_ids_to_remove), self.app, db_session, None
-                )
-                if tasks_generated is None:
-                    return None
+            task_logger.info(
+                f"RedisConnector.prune.generate_tasks starting. cc_pair={cc_pair_id}"
+            )
+            tasks_generated = redis_connector.prune.generate_tasks(
+                set(doc_ids_to_remove), self.app, db_session, None
+            )
+            if tasks_generated is None:
+                return None

-                task_logger.info(
-                    f"RedisConnector.prune.generate_tasks finished. cc_pair={cc_pair_id} tasks_generated={tasks_generated}"
-                )
-            finally:
-                observe_pruning_diff_duration(
-                    time.monotonic() - diff_start, connector_type
-                )
+            task_logger.info(
+                f"RedisConnector.prune.generate_tasks finished. cc_pair={cc_pair_id} tasks_generated={tasks_generated}"
+            )

            redis_connector.prune.generator_complete = tasks_generated

--- a/backend/onyx/chat/llm_loop.py
+++ b/backend/onyx/chat/llm_loop.py
@@ -4,6 +4,8 @@ from collections.abc import Callable
 from typing import Any
 from typing import Literal

+from sqlalchemy.orm import Session
+
 from onyx.chat.chat_state import ChatStateContainer
 from onyx.chat.chat_utils import create_tool_call_failure_messages
 from onyx.chat.citation_processor import CitationMapping
@@ -633,6 +635,7 @@ def run_llm_loop(
    user_memory_context: UserMemoryContext | None,
    llm: LLM,
    token_counter: Callable[[str], int],
+    db_session: Session,
    forced_tool_id: int | None = None,
    user_identity: LLMUserIdentity | None = None,
    chat_session_id: str | None = None,
@@ -1017,16 +1020,20 @@ def run_llm_loop(
                    persisted_memory_id: int | None = None
                    if user_memory_context and user_memory_context.user_id:
                        if tool_response.rich_response.index_to_replace is not None:
-                            persisted_memory_id = update_memory_at_index(
+                            memory = update_memory_at_index(
                                user_id=user_memory_context.user_id,
                                index=tool_response.rich_response.index_to_replace,
                                new_text=tool_response.rich_response.memory_text,
+                                db_session=db_session,
                            )
+                            persisted_memory_id = memory.id if memory else None
                        else:
-                            persisted_memory_id = add_memory(
+                            memory = add_memory(
                                user_id=user_memory_context.user_id,
                                memory_text=tool_response.rich_response.memory_text,
+                                db_session=db_session,
                            )
+                            persisted_memory_id = memory.id
                    operation: Literal["add", "update"] = (
                        "update"
                        if tool_response.rich_response.index_to_replace is not None
--- a/backend/onyx/chat/llm_step.py
+++ b/backend/onyx/chat/llm_step.py
@@ -826,12 +826,6 @@ def translate_history_to_llm_format(
                            base64_data = img_file.to_base64()
                            image_url = f"data:{image_type};base64,{base64_data}"

-                            content_parts.append(
-                                TextContentPart(
-                                    type="text",
-                                    text=f"[attached image — file_id: {img_file.file_id}]",
-                                )
-                            )
                            image_part = ImageContentPart(
                                type="image_url",
                                image_url=ImageUrlDetail(
--- a/backend/onyx/chat/process_message.py
+++ b/backend/onyx/chat/process_message.py
@@ -67,6 +67,7 @@ from onyx.db.chat import get_chat_session_by_id
 from onyx.db.chat import get_or_create_root_message
 from onyx.db.chat import reserve_message_id
 from onyx.db.chat import reserve_multi_model_message_ids
+from onyx.db.engine.sql_engine import get_session_with_current_tenant
 from onyx.db.enums import HookPoint
 from onyx.db.memory import get_memories
 from onyx.db.models import ChatMessage
@@ -1005,86 +1006,93 @@ def _run_models(
        model_llm = setup.llms[model_idx]

        try:
-            # Each function opens short-lived DB sessions on demand.
-            # Do NOT pass a long-lived session here — it would hold a
-            # connection for the entire LLM loop (minutes), and cloud
-            # infrastructure may drop idle connections.
-            thread_tool_dict = construct_tools(
-                persona=setup.persona,
-                emitter=model_emitter,
-                user=user,
-                llm=model_llm,
-                search_tool_config=SearchToolConfig(
-                    user_selected_filters=setup.new_msg_req.internal_search_filters,
-                    project_id_filter=setup.search_params.project_id_filter,
-                    persona_id_filter=setup.search_params.persona_id_filter,
-                    bypass_acl=setup.bypass_acl,
-                    slack_context=setup.slack_context,
-                    enable_slack_search=_should_enable_slack_search(
-                        setup.persona, setup.new_msg_req.internal_search_filters
-                    ),
-                ),
-                custom_tool_config=CustomToolConfig(
-                    chat_session_id=setup.chat_session.id,
-                    message_id=setup.user_message.id,
-                    additional_headers=setup.custom_tool_additional_headers,
-                    mcp_headers=setup.mcp_headers,
-                ),
-                file_reader_tool_config=FileReaderToolConfig(
-                    user_file_ids=setup.available_files.user_file_ids,
-                    chat_file_ids=setup.available_files.chat_file_ids,
-                ),
-                allowed_tool_ids=setup.new_msg_req.allowed_tool_ids,
-                search_usage_forcing_setting=setup.search_params.search_usage,
-            )
-            model_tools = [
-                tool for tool_list in thread_tool_dict.values() for tool in tool_list
-            ]
-
-            if setup.forced_tool_id and setup.forced_tool_id not in {
-                tool.id for tool in model_tools
-            }:
-                raise ValueError(
-                    f"Forced tool {setup.forced_tool_id} not found in tools"
-                )
-
-            # Per-thread copy: run_llm_loop mutates simple_chat_history in-place.
-            if n_models == 1 and setup.new_msg_req.deep_research:
-                if setup.chat_session.project_id:
-                    raise RuntimeError("Deep research is not supported for projects")
-                run_deep_research_llm_loop(
-                    emitter=model_emitter,
-                    state_container=sc,
-                    simple_chat_history=list(setup.simple_chat_history),
-                    tools=model_tools,
-                    custom_agent_prompt=setup.custom_agent_prompt,
-                    llm=model_llm,
-                    token_counter=get_llm_token_counter(model_llm),
-                    skip_clarification=setup.skip_clarification,
-                    user_identity=setup.user_identity,
-                    chat_session_id=str(setup.chat_session.id),
-                    all_injected_file_metadata=setup.all_injected_file_metadata,
-                )
-            else:
-                run_llm_loop(
-                    emitter=model_emitter,
-                    state_container=sc,
-                    simple_chat_history=list(setup.simple_chat_history),
-                    tools=model_tools,
-                    custom_agent_prompt=setup.custom_agent_prompt,
-                    context_files=setup.extracted_context_files,
+            # Each worker opens its own session — SQLAlchemy sessions are not thread-safe.
+            # Do NOT write to the outer db_session (or any shared DB state) from here;
+            # all DB writes in this thread must go through thread_db_session.
+            with get_session_with_current_tenant() as thread_db_session:
+                thread_tool_dict = construct_tools(
                    persona=setup.persona,
-                    user_memory_context=setup.user_memory_context,
+                    db_session=thread_db_session,
+                    emitter=model_emitter,
+                    user=user,
                    llm=model_llm,
-                    token_counter=get_llm_token_counter(model_llm),
-                    forced_tool_id=setup.forced_tool_id,
-                    user_identity=setup.user_identity,
-                    chat_session_id=str(setup.chat_session.id),
-                    chat_files=setup.chat_files_for_tools,
-                    include_citations=setup.new_msg_req.include_citations,
-                    all_injected_file_metadata=setup.all_injected_file_metadata,
-                    inject_memories_in_prompt=user.use_memories,
+                    search_tool_config=SearchToolConfig(
+                        user_selected_filters=setup.new_msg_req.internal_search_filters,
+                        project_id_filter=setup.search_params.project_id_filter,
+                        persona_id_filter=setup.search_params.persona_id_filter,
+                        bypass_acl=setup.bypass_acl,
+                        slack_context=setup.slack_context,
+                        enable_slack_search=_should_enable_slack_search(
+                            setup.persona, setup.new_msg_req.internal_search_filters
+                        ),
+                    ),
+                    custom_tool_config=CustomToolConfig(
+                        chat_session_id=setup.chat_session.id,
+                        message_id=setup.user_message.id,
+                        additional_headers=setup.custom_tool_additional_headers,
+                        mcp_headers=setup.mcp_headers,
+                    ),
+                    file_reader_tool_config=FileReaderToolConfig(
+                        user_file_ids=setup.available_files.user_file_ids,
+                        chat_file_ids=setup.available_files.chat_file_ids,
+                    ),
+                    allowed_tool_ids=setup.new_msg_req.allowed_tool_ids,
+                    search_usage_forcing_setting=setup.search_params.search_usage,
                )
+                model_tools = [
+                    tool
+                    for tool_list in thread_tool_dict.values()
+                    for tool in tool_list
+                ]
+
+                if setup.forced_tool_id and setup.forced_tool_id not in {
+                    tool.id for tool in model_tools
+                }:
+                    raise ValueError(
+                        f"Forced tool {setup.forced_tool_id} not found in tools"
+                    )
+
+                # Per-thread copy: run_llm_loop mutates simple_chat_history in-place.
+                if n_models == 1 and setup.new_msg_req.deep_research:
+                    if setup.chat_session.project_id:
+                        raise RuntimeError(
+                            "Deep research is not supported for projects"
+                        )
+                    run_deep_research_llm_loop(
+                        emitter=model_emitter,
+                        state_container=sc,
+                        simple_chat_history=list(setup.simple_chat_history),
+                        tools=model_tools,
+                        custom_agent_prompt=setup.custom_agent_prompt,
+                        llm=model_llm,
+                        token_counter=get_llm_token_counter(model_llm),
+                        db_session=thread_db_session,
+                        skip_clarification=setup.skip_clarification,
+                        user_identity=setup.user_identity,
+                        chat_session_id=str(setup.chat_session.id),
+                        all_injected_file_metadata=setup.all_injected_file_metadata,
+                    )
+                else:
+                    run_llm_loop(
+                        emitter=model_emitter,
+                        state_container=sc,
+                        simple_chat_history=list(setup.simple_chat_history),
+                        tools=model_tools,
+                        custom_agent_prompt=setup.custom_agent_prompt,
+                        context_files=setup.extracted_context_files,
+                        persona=setup.persona,
+                        user_memory_context=setup.user_memory_context,
+                        llm=model_llm,
+                        token_counter=get_llm_token_counter(model_llm),
+                        db_session=thread_db_session,
+                        forced_tool_id=setup.forced_tool_id,
+                        user_identity=setup.user_identity,
+                        chat_session_id=str(setup.chat_session.id),
+                        chat_files=setup.chat_files_for_tools,
+                        include_citations=setup.new_msg_req.include_citations,
+                        all_injected_file_metadata=setup.all_injected_file_metadata,
+                        inject_memories_in_prompt=user.use_memories,
+                    )

            model_succeeded[model_idx] = True

--- a/backend/onyx/connectors/google_utils/google_kv.py
+++ b/backend/onyx/connectors/google_utils/google_kv.py
@@ -1,5 +1,4 @@
 import json
-from typing import Any
 from typing import cast
 from urllib.parse import parse_qs
 from urllib.parse import ParseResult
@@ -54,21 +53,6 @@ from onyx.utils.logger import setup_logger
 logger = setup_logger()


-def _load_google_json(raw: object) -> dict[str, Any]:
-    """Accept both the current (dict) and legacy (JSON string) KV payload shapes.
-
-    Payloads written before the fix for serializing Google credentials into
-    ``EncryptedJson`` columns are stored as JSON strings; new writes store dicts.
-    Once every install has re-uploaded their Google credentials the legacy
-    ``str`` branch can be removed.
-    """
-    if isinstance(raw, dict):
-        return raw
-    if isinstance(raw, str):
-        return json.loads(raw)
-    raise ValueError(f"Unexpected Google credential payload type: {type(raw)!r}")
-
-
 def _build_frontend_google_drive_redirect(source: DocumentSource) -> str:
    if source == DocumentSource.GOOGLE_DRIVE:
        return f"{WEB_DOMAIN}/admin/connectors/google-drive/auth/callback"
@@ -178,13 +162,12 @@ def build_service_account_creds(

 def get_auth_url(credential_id: int, source: DocumentSource) -> str:
    if source == DocumentSource.GOOGLE_DRIVE:
-        credential_json = _load_google_json(
-            get_kv_store().load(KV_GOOGLE_DRIVE_CRED_KEY)
-        )
+        creds_str = str(get_kv_store().load(KV_GOOGLE_DRIVE_CRED_KEY))
    elif source == DocumentSource.GMAIL:
-        credential_json = _load_google_json(get_kv_store().load(KV_GMAIL_CRED_KEY))
+        creds_str = str(get_kv_store().load(KV_GMAIL_CRED_KEY))
    else:
        raise ValueError(f"Unsupported source: {source}")
+    credential_json = json.loads(creds_str)
    flow = InstalledAppFlow.from_client_config(
        credential_json,
        scopes=GOOGLE_SCOPES[source],
@@ -205,12 +188,12 @@ def get_auth_url(credential_id: int, source: DocumentSource) -> str:

 def get_google_app_cred(source: DocumentSource) -> GoogleAppCredentials:
    if source == DocumentSource.GOOGLE_DRIVE:
-        creds = _load_google_json(get_kv_store().load(KV_GOOGLE_DRIVE_CRED_KEY))
+        creds_str = str(get_kv_store().load(KV_GOOGLE_DRIVE_CRED_KEY))
    elif source == DocumentSource.GMAIL:
-        creds = _load_google_json(get_kv_store().load(KV_GMAIL_CRED_KEY))
+        creds_str = str(get_kv_store().load(KV_GMAIL_CRED_KEY))
    else:
        raise ValueError(f"Unsupported source: {source}")
-    return GoogleAppCredentials(**creds)
+    return GoogleAppCredentials(**json.loads(creds_str))


 def upsert_google_app_cred(
@@ -218,14 +201,10 @@ def upsert_google_app_cred(
 ) -> None:
    if source == DocumentSource.GOOGLE_DRIVE:
        get_kv_store().store(
-            KV_GOOGLE_DRIVE_CRED_KEY,
-            app_credentials.model_dump(mode="json"),
-            encrypt=True,
+            KV_GOOGLE_DRIVE_CRED_KEY, app_credentials.json(), encrypt=True
        )
    elif source == DocumentSource.GMAIL:
-        get_kv_store().store(
-            KV_GMAIL_CRED_KEY, app_credentials.model_dump(mode="json"), encrypt=True
-        )
+        get_kv_store().store(KV_GMAIL_CRED_KEY, app_credentials.json(), encrypt=True)
    else:
        raise ValueError(f"Unsupported source: {source}")

@@ -241,14 +220,12 @@ def delete_google_app_cred(source: DocumentSource) -> None:

 def get_service_account_key(source: DocumentSource) -> GoogleServiceAccountKey:
    if source == DocumentSource.GOOGLE_DRIVE:
-        creds = _load_google_json(
-            get_kv_store().load(KV_GOOGLE_DRIVE_SERVICE_ACCOUNT_KEY)
-        )
+        creds_str = str(get_kv_store().load(KV_GOOGLE_DRIVE_SERVICE_ACCOUNT_KEY))
    elif source == DocumentSource.GMAIL:
-        creds = _load_google_json(get_kv_store().load(KV_GMAIL_SERVICE_ACCOUNT_KEY))
+        creds_str = str(get_kv_store().load(KV_GMAIL_SERVICE_ACCOUNT_KEY))
    else:
        raise ValueError(f"Unsupported source: {source}")
-    return GoogleServiceAccountKey(**creds)
+    return GoogleServiceAccountKey(**json.loads(creds_str))


 def upsert_service_account_key(
@@ -257,14 +234,12 @@ def upsert_service_account_key(
    if source == DocumentSource.GOOGLE_DRIVE:
        get_kv_store().store(
            KV_GOOGLE_DRIVE_SERVICE_ACCOUNT_KEY,
-            service_account_key.model_dump(mode="json"),
+            service_account_key.json(),
            encrypt=True,
        )
    elif source == DocumentSource.GMAIL:
        get_kv_store().store(
-            KV_GMAIL_SERVICE_ACCOUNT_KEY,
-            service_account_key.model_dump(mode="json"),
-            encrypt=True,
+            KV_GMAIL_SERVICE_ACCOUNT_KEY, service_account_key.json(), encrypt=True
        )
    else:
        raise ValueError(f"Unsupported source: {source}")
--- a/backend/onyx/connectors/jira/connector.py
+++ b/backend/onyx/connectors/jira/connector.py
@@ -60,10 +60,8 @@ logger = setup_logger()

 ONE_HOUR = 3600

-_MAX_RESULTS_FETCH_IDS = 5000
+_MAX_RESULTS_FETCH_IDS = 5000  # 5000
 _JIRA_FULL_PAGE_SIZE = 50
-# https://developer.atlassian.com/cloud/jira/platform/rest/v3/api-group-issues/
-_JIRA_BULK_FETCH_LIMIT = 100

 # Constants for Jira field names
 _FIELD_REPORTER = "reporter"
@@ -257,13 +255,15 @@ def _bulk_fetch_request(
    return resp.json()["issues"]


-def _bulk_fetch_batch(
-    jira_client: JIRA, issue_ids: list[str], fields: str | None
-) -> list[dict[str, Any]]:
-    """Fetch a single batch (must be <= _JIRA_BULK_FETCH_LIMIT).
-    On JSONDecodeError, recursively bisects until it succeeds or reaches size 1."""
+def bulk_fetch_issues(
+    jira_client: JIRA, issue_ids: list[str], fields: str | None = None
+) -> list[Issue]:
+    # TODO(evan): move away from this jira library if they continue to not support
+    # the endpoints we need. Using private fields is not ideal, but
+    # is likely fine for now since we pin the library version
+
    try:
-        return _bulk_fetch_request(jira_client, issue_ids, fields)
+        raw_issues = _bulk_fetch_request(jira_client, issue_ids, fields)
    except requests.exceptions.JSONDecodeError:
        if len(issue_ids) <= 1:
            logger.exception(
@@ -277,25 +277,12 @@ def _bulk_fetch_batch(
            f"Jira bulk-fetch JSON decode failed for batch of {len(issue_ids)} issues. "
            f"Splitting into sub-batches of {mid} and {len(issue_ids) - mid}."
        )
-        left = _bulk_fetch_batch(jira_client, issue_ids[:mid], fields)
-        right = _bulk_fetch_batch(jira_client, issue_ids[mid:], fields)
+        left = bulk_fetch_issues(jira_client, issue_ids[:mid], fields)
+        right = bulk_fetch_issues(jira_client, issue_ids[mid:], fields)
        return left + right
-
-
-def bulk_fetch_issues(
-    jira_client: JIRA, issue_ids: list[str], fields: str | None = None
-) -> list[Issue]:
-    # TODO(evan): move away from this jira library if they continue to not support
-    # the endpoints we need. Using private fields is not ideal, but
-    # is likely fine for now since we pin the library version
-
-    raw_issues: list[dict[str, Any]] = []
-    for batch in chunked(issue_ids, _JIRA_BULK_FETCH_LIMIT):
-        try:
-            raw_issues.extend(_bulk_fetch_batch(jira_client, list(batch), fields))
-        except Exception as e:
-            logger.error(f"Error fetching issues: {e}")
-            raise
+    except Exception as e:
+        logger.error(f"Error fetching issues: {e}")
+        raise

    return [
        Issue(jira_client._options, jira_client._session, raw=issue)
--- a/backend/onyx/context/search/federated/models.py
+++ b/backend/onyx/context/search/federated/models.py
@@ -1,4 +1,3 @@
-from dataclasses import dataclass
 from datetime import datetime
 from typing import TypedDict

@@ -7,14 +6,6 @@ from pydantic import BaseModel
 from onyx.onyxbot.slack.models import ChannelType


-@dataclass(frozen=True)
-class DirectThreadFetch:
-    """Request to fetch a Slack thread directly by channel and timestamp."""
-
-    channel_id: str
-    thread_ts: str
-
-
 class ChannelMetadata(TypedDict):
    """Type definition for cached channel metadata."""

--- a/backend/onyx/context/search/federated/slack_search.py
+++ b/backend/onyx/context/search/federated/slack_search.py
@@ -19,7 +19,6 @@ from onyx.configs.chat_configs import DOC_TIME_DECAY
 from onyx.connectors.models import IndexingDocument
 from onyx.connectors.models import TextSection
 from onyx.context.search.federated.models import ChannelMetadata
-from onyx.context.search.federated.models import DirectThreadFetch
 from onyx.context.search.federated.models import SlackMessage
 from onyx.context.search.federated.slack_search_utils import ALL_CHANNEL_TYPES
 from onyx.context.search.federated.slack_search_utils import build_channel_query_filter
@@ -50,6 +49,7 @@ from onyx.server.federated.models import FederatedConnectorDetail
 from onyx.utils.logger import setup_logger
 from onyx.utils.threadpool_concurrency import run_functions_tuples_in_parallel
 from onyx.utils.timing import log_function_time
+from shared_configs.configs import DOC_EMBEDDING_CONTEXT_SIZE

 logger = setup_logger()

@@ -58,6 +58,7 @@ HIGHLIGHT_END_CHAR = "\ue001"

 CHANNEL_METADATA_CACHE_TTL = 60 * 60 * 24  # 24 hours
 USER_PROFILE_CACHE_TTL = 60 * 60 * 24  # 24 hours
+SLACK_THREAD_CONTEXT_WINDOW = 3  # Number of messages before the matched message to include
 CHANNEL_METADATA_MAX_RETRIES = 3  # Maximum retry attempts for channel metadata fetching
 CHANNEL_METADATA_RETRY_DELAY = 1  # Initial retry delay in seconds (exponential backoff)

@@ -420,94 +421,6 @@ class SlackQueryResult(BaseModel):
    filtered_channels: list[str]  # Channels filtered out during this query


-def _fetch_thread_from_url(
-    thread_fetch: DirectThreadFetch,
-    access_token: str,
-    channel_metadata_dict: dict[str, ChannelMetadata] | None = None,
-) -> SlackQueryResult:
-    """Fetch a thread directly from a Slack URL via conversations.replies."""
-    channel_id = thread_fetch.channel_id
-    thread_ts = thread_fetch.thread_ts
-
-    slack_client = WebClient(token=access_token)
-    try:
-        response = slack_client.conversations_replies(
-            channel=channel_id,
-            ts=thread_ts,
-        )
-        response.validate()
-        messages: list[dict[str, Any]] = response.get("messages", [])
-    except SlackApiError as e:
-        logger.warning(
-            f"Failed to fetch thread from URL (channel={channel_id}, ts={thread_ts}): {e}"
-        )
-        return SlackQueryResult(messages=[], filtered_channels=[])
-
-    if not messages:
-        logger.warning(
-            f"No messages found for URL override (channel={channel_id}, ts={thread_ts})"
-        )
-        return SlackQueryResult(messages=[], filtered_channels=[])
-
-    # Build thread text from all messages
-    thread_text = _build_thread_text(messages, access_token, None, slack_client)
-
-    # Get channel name from metadata cache or API
-    channel_name = "unknown"
-    if channel_metadata_dict and channel_id in channel_metadata_dict:
-        channel_name = channel_metadata_dict[channel_id].get("name", "unknown")
-    else:
-        try:
-            ch_response = slack_client.conversations_info(channel=channel_id)
-            ch_response.validate()
-            channel_info: dict[str, Any] = ch_response.get("channel", {})
-            channel_name = channel_info.get("name", "unknown")
-        except SlackApiError:
-            pass
-
-    # Build the SlackMessage
-    parent_msg = messages[0]
-    message_ts = parent_msg.get("ts", thread_ts)
-    username = parent_msg.get("user", "unknown_user")
-    parent_text = parent_msg.get("text", "")
-    snippet = (
-        parent_text[:50].rstrip() + "..." if len(parent_text) > 50 else parent_text
-    ).replace("\n", " ")
-
-    doc_time = datetime.fromtimestamp(float(message_ts))
-    decay_factor = DOC_TIME_DECAY
-    doc_age_years = (datetime.now() - doc_time).total_seconds() / (365 * 24 * 60 * 60)
-    recency_bias = max(1 / (1 + decay_factor * doc_age_years), 0.75)
-
-    permalink = (
-        f"https://slack.com/archives/{channel_id}/p{message_ts.replace('.', '')}"
-    )
-
-    slack_message = SlackMessage(
-        document_id=f"{channel_id}_{message_ts}",
-        channel_id=channel_id,
-        message_id=message_ts,
-        thread_id=None,  # Prevent double-enrichment in thread context fetch
-        link=permalink,
-        metadata={
-            "channel": channel_name,
-            "time": doc_time.isoformat(),
-        },
-        timestamp=doc_time,
-        recency_bias=recency_bias,
-        semantic_identifier=f"{username} in #{channel_name}: {snippet}",
-        text=thread_text,
-        highlighted_texts=set(),
-        slack_score=100000.0,  # High priority — user explicitly asked for this thread
-    )
-
-    logger.info(
-        f"URL override: fetched thread from channel={channel_id}, ts={thread_ts}, {len(messages)} messages"
-    )
-
-    return SlackQueryResult(messages=[slack_message], filtered_channels=[])
-
-
 def query_slack(
    query_string: str,
    access_token: str,
@@ -519,6 +432,7 @@ def query_slack(
    available_channels: list[str] | None = None,
    channel_metadata_dict: dict[str, ChannelMetadata] | None = None,
 ) -> SlackQueryResult:
+
    # Check if query has channel override (user specified channels in query)
    has_channel_override = query_string.startswith("__CHANNEL_OVERRIDE__")

@@ -748,6 +662,7 @@ def _fetch_thread_context(
    """
    channel_id = message.channel_id
    thread_id = message.thread_id
+    message_id = message.message_id

    # If not a thread, return original text as success
    if thread_id is None:
@@ -780,37 +695,62 @@ def _fetch_thread_context(
    if len(messages) <= 1:
        return ThreadContextResult.success(message.text)

-    # Build thread text from thread starter + all replies
-    thread_text = _build_thread_text(messages, access_token, team_id, slack_client)
+    # Build thread text from thread starter + context window around matched message
+    thread_text = _build_thread_text(
+        messages, message_id, thread_id, access_token, team_id, slack_client
+    )
    return ThreadContextResult.success(thread_text)


 def _build_thread_text(
    messages: list[dict[str, Any]],
+    message_id: str,
+    thread_id: str,
    access_token: str,
    team_id: str | None,
    slack_client: WebClient,
 ) -> str:
-    """Build thread text including all replies.
-
-    Includes the thread parent message followed by all replies in order.
-    """
+    """Build thread text: the parent, a context window before the matched message, then later replies."""
    msg_text = messages[0].get("text", "")
    msg_sender = messages[0].get("user", "")
    thread_text = f"<@{msg_sender}>: {msg_text}"

-    # All messages after index 0 are replies
-    replies = messages[1:]
-    if not replies:
-        return thread_text
-
-    logger.debug(f"Thread {messages[0].get('ts')}: {len(replies)} replies included")
    thread_text += "\n\nReplies:"
+    if thread_id == message_id:
+        message_id_idx = 0
+    else:
+        message_id_idx = next(
+            (i for i, msg in enumerate(messages) if msg.get("ts") == message_id), 0
+        )
+        if not message_id_idx:
+            return thread_text

-    for msg in replies:
+        start_idx = max(1, message_id_idx - SLACK_THREAD_CONTEXT_WINDOW)
+
+        if start_idx > 1:
+            thread_text += "\n..."
+
+        for i in range(start_idx, message_id_idx):
+            msg_text = messages[i].get("text", "")
+            msg_sender = messages[i].get("user", "")
+            thread_text += f"\n\n<@{msg_sender}>: {msg_text}"
+
+        msg_text = messages[message_id_idx].get("text", "")
+        msg_sender = messages[message_id_idx].get("user", "")
+        thread_text += f"\n\n<@{msg_sender}>: {msg_text}"
+
+    # Add replies after the matched message, capped at roughly DOC_EMBEDDING_CONTEXT_SIZE * 4 characters
+    len_replies = 0
+    for msg in messages[message_id_idx + 1 :]:
        msg_text = msg.get("text", "")
        msg_sender = msg.get("user", "")
-        thread_text += f"\n\n<@{msg_sender}>: {msg_text}"
+        reply = f"\n\n<@{msg_sender}>: {msg_text}"
+        thread_text += reply
+
+        len_replies += len(reply)
+        if len_replies >= DOC_EMBEDDING_CONTEXT_SIZE * 4:
+            thread_text += "\n..."
+            break

    # Replace user IDs with names using cached lookups
    userids: set[str] = set(re.findall(r"<@([A-Z0-9]+)>", thread_text))
@@ -1036,16 +976,7 @@ def slack_retrieval(

    # Query slack with entity filtering
    llm = get_default_llm()
-    query_items = build_slack_queries(query, llm, entities, available_channels)
-
-    # Partition into direct thread fetches and search query strings
-    direct_fetches: list[DirectThreadFetch] = []
-    query_strings: list[str] = []
-    for item in query_items:
-        if isinstance(item, DirectThreadFetch):
-            direct_fetches.append(item)
-        else:
-            query_strings.append(item)
+    query_strings = build_slack_queries(query, llm, entities, available_channels)

    # Determine filtering based on entities OR context (bot)
    include_dm = False
@@ -1062,16 +993,8 @@ def slack_retrieval(
                f"Private channel context: will only allow messages from {allowed_private_channel} + public channels"
            )

-    # Build search tasks — direct thread fetches + keyword searches
-    search_tasks: list[tuple] = [
-        (
-            _fetch_thread_from_url,
-            (fetch, access_token, channel_metadata_dict),
-        )
-        for fetch in direct_fetches
-    ]
-
-    search_tasks.extend(
+    # Build search tasks
+    search_tasks = [
        (
            query_slack,
            (
@@ -1087,7 +1010,7 @@ def slack_retrieval(
            ),
        )
        for query_string in query_strings
-    )
+    ]

    # If include_dm is True AND we're not already searching all channels,
    # add additional searches without channel filters.
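
Review note on the thread-context hunk earlier in this file's diff: the new `else` branch keeps at most `SLACK_THREAD_CONTEXT_WINDOW` messages before the matched message, then caps the following replies by character count. Below is a minimal sketch of that index arithmetic. It assumes index 0 is the thread root and is handled separately; the helper names `preceding_context` and `capped_replies` and the limits passed to them are hypothetical, not the project's constants.

```python
def preceding_context(target_idx: int, window: int) -> tuple[bool, list[int]]:
    """Indices of the messages shown just before the matched message.

    Index 0 (assumed to be the thread root) is never returned here.
    Returns (elided, indices); elided mirrors the "\n..." marker in the diff.
    """
    start_idx = max(1, target_idx - window)
    elided = start_idx > 1
    return elided, list(range(start_idx, target_idx))


def capped_replies(replies: list[str], max_chars: int) -> tuple[list[str], bool]:
    """Keep replies until their combined length reaches max_chars.

    The reply that crosses the limit is still kept, as in the diff.
    Returns (kept, truncated).
    """
    kept: list[str] = []
    total = 0
    for reply in replies:
        kept.append(reply)
        total += len(reply)
        if total >= max_chars:
            return kept, True
    return kept, False
```

With a window of 3 and a match at index 10, the sketch selects indices 7 to 9 and reports that earlier messages were elided.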
--- a/backend/onyx/context/search/federated/slack_search_utils.py
+++ b/backend/onyx/context/search/federated/slack_search_utils.py
@@ -10,7 +10,6 @@ from pydantic import ValidationError

 from onyx.configs.app_configs import MAX_SLACK_QUERY_EXPANSIONS
 from onyx.context.search.federated.models import ChannelMetadata
-from onyx.context.search.federated.models import DirectThreadFetch
 from onyx.context.search.models import ChunkIndexRequest
 from onyx.federated_connectors.slack.models import SlackEntities
 from onyx.llm.interfaces import LLM
@@ -639,38 +638,12 @@ def expand_query_with_llm(query_text: str, llm: LLM) -> list[str]:
        return [query_text]


-SLACK_URL_PATTERN = re.compile(
-    r"https?://[a-z0-9-]+\.slack\.com/archives/([A-Z0-9]+)/p(\d{16})"
-)
-
-
-def extract_slack_message_urls(
-    query_text: str,
-) -> list[tuple[str, str]]:
-    """Extract Slack message URLs from query text.
-
-    Parses URLs like:
-      https://onyx-company.slack.com/archives/C097NBWMY8Y/p1775491616524769
-
-    Returns list of (channel_id, thread_ts) tuples.
-    The 16-digit timestamp is converted to Slack ts format (with dot).
-    """
-    results = []
-    for match in SLACK_URL_PATTERN.finditer(query_text):
-        channel_id = match.group(1)
-        raw_ts = match.group(2)
-        # Convert p1775491616524769 -> 1775491616.524769
-        thread_ts = f"{raw_ts[:10]}.{raw_ts[10:]}"
-        results.append((channel_id, thread_ts))
-    return results
-
-
 def build_slack_queries(
    query: ChunkIndexRequest,
    llm: LLM,
    entities: dict[str, Any] | None = None,
    available_channels: list[str] | None = None,
-) -> list[str | DirectThreadFetch]:
+) -> list[str]:
    """Build Slack query strings with date filtering and query expansion."""
    default_search_days = 30
    if entities:
@@ -695,15 +668,6 @@ def build_slack_queries(
            cutoff_date = datetime.now(timezone.utc) - timedelta(days=days_back)
            time_filter = f" after:{cutoff_date.strftime('%Y-%m-%d')}"

-    # Check for Slack message URLs — if found, add direct fetch requests
-    url_fetches: list[DirectThreadFetch] = []
-    slack_urls = extract_slack_message_urls(query.query)
-    for channel_id, thread_ts in slack_urls:
-        url_fetches.append(
-            DirectThreadFetch(channel_id=channel_id, thread_ts=thread_ts)
-        )
-        logger.info(f"Detected Slack URL: channel={channel_id}, ts={thread_ts}")
-
    # ALWAYS extract channel references from the query (not just for recency queries)
    channel_references = extract_channel_references_from_query(query.query)

@@ -720,9 +684,7 @@ def build_slack_queries(

            # If valid channels detected, use ONLY those channels with NO keywords
            # Return query with ONLY time filter + channel filter (no keywords)
-            return url_fetches + [
-                build_channel_override_query(channel_references, time_filter)
-            ]
+            return [build_channel_override_query(channel_references, time_filter)]
        except ValueError as e:
            # If validation fails, log the error and continue with normal flow
            logger.warning(f"Channel reference validation failed: {e}")
@@ -740,8 +702,7 @@ def build_slack_queries(
        rephrased_queries = expand_query_with_llm(query.query, llm)

    # Build final query strings with time filters
-    search_queries = [
+    return [
        rephrased_query.strip() + time_filter
        for rephrased_query in rephrased_queries[:MAX_SLACK_QUERY_EXPANSIONS]
    ]
-    return url_fetches + search_queries
--- a/backend/onyx/db/document_set.py
+++ b/backend/onyx/db/document_set.py
@@ -4,7 +4,6 @@ from uuid import UUID

 from sqlalchemy import and_
 from sqlalchemy import delete
-from sqlalchemy import exists
 from sqlalchemy import func
 from sqlalchemy import or_
 from sqlalchemy import Select
@@ -13,22 +12,21 @@ from sqlalchemy.orm import aliased
 from sqlalchemy.orm import selectinload
 from sqlalchemy.orm import Session

+from onyx.auth.permissions import has_permission
 from onyx.configs.app_configs import DISABLE_VECTOR_DB
 from onyx.db.connector_credential_pair import get_cc_pair_groups_for_ids
 from onyx.db.connector_credential_pair import get_connector_credential_pairs
 from onyx.db.enums import AccessType
 from onyx.db.enums import ConnectorCredentialPairStatus
+from onyx.db.enums import Permission
 from onyx.db.federated import create_federated_connector_document_set_mapping
 from onyx.db.models import ConnectorCredentialPair
 from onyx.db.models import Document
 from onyx.db.models import DocumentByConnectorCredentialPair
 from onyx.db.models import DocumentSet as DocumentSetDBModel
 from onyx.db.models import DocumentSet__ConnectorCredentialPair
-from onyx.db.models import DocumentSet__UserGroup
 from onyx.db.models import FederatedConnector__DocumentSet
 from onyx.db.models import User
-from onyx.db.models import User__UserGroup
-from onyx.db.models import UserRole
 from onyx.server.features.document_set.models import DocumentSetCreationRequest
 from onyx.server.features.document_set.models import DocumentSetUpdateRequest
 from onyx.utils.logger import setup_logger
@@ -38,54 +36,16 @@ logger = setup_logger()


 def _add_user_filters(stmt: Select, user: User, get_editable: bool = True) -> Select:
-    if user.role == UserRole.ADMIN:
+    # MANAGE → always return all
+    if has_permission(user, Permission.MANAGE_DOCUMENT_SETS):
        return stmt
-
-    stmt = stmt.distinct()
-    DocumentSet__UG = aliased(DocumentSet__UserGroup)
-    User__UG = aliased(User__UserGroup)
-    """
-    Here we select cc_pairs by relation:
-    User -> User__UserGroup -> DocumentSet__UserGroup -> DocumentSet
-    """
-    stmt = stmt.outerjoin(DocumentSet__UG).outerjoin(
-        User__UserGroup,
-        User__UserGroup.user_group_id == DocumentSet__UG.user_group_id,
-    )
-    """
-    Filter DocumentSets by:
-    - if the user is in the user_group that owns the DocumentSet
-    - if the user is not a global_curator, they must also have a curator relationship
-    to the user_group
-    - if editing is being done, we also filter out DocumentSets that are owned by groups
-    that the user isn't a curator for
-    - if we are not editing, we show all DocumentSets in the groups the user is a curator
-    for (as well as public DocumentSets)
-    """
-
-    # Anonymous users only see public DocumentSets
-    if user.is_anonymous:
-        where_clause = DocumentSetDBModel.is_public == True  # noqa: E712
-        return stmt.where(where_clause)
-
-    where_clause = User__UserGroup.user_id == user.id
-    if user.role == UserRole.CURATOR and get_editable:
-        where_clause &= User__UserGroup.is_curator == True  # noqa: E712
-    if get_editable:
-        user_groups = select(User__UG.user_group_id).where(User__UG.user_id == user.id)
-        if user.role == UserRole.CURATOR:
-            user_groups = user_groups.where(User__UG.is_curator == True)  # noqa: E712
-        where_clause &= (
-            ~exists()
-            .where(DocumentSet__UG.document_set_id == DocumentSetDBModel.id)
-            .where(~DocumentSet__UG.user_group_id.in_(user_groups))
-            .correlate(DocumentSetDBModel)
-        )
-        where_clause |= DocumentSetDBModel.user_id == user.id
-    else:
-        where_clause |= DocumentSetDBModel.is_public == True  # noqa: E712
-
-    return stmt.where(where_clause)
+    # READ → return all when reading, nothing when editing
+    if has_permission(user, Permission.READ_DOCUMENT_SETS):
+        if get_editable:
+            return stmt.where(False)
+        return stmt
+    # No permission → return nothing
+    return stmt.where(False)


 def _delete_document_set_cc_pairs__no_commit(
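
Review note: the rewritten `_add_user_filters` above reduces document-set visibility to a three-way decision. The sketch below restates that decision as a pure function so the cases are easy to check. It is an illustration only: `visible_scope` is a hypothetical name, permissions are modelled as a plain set, and the `FULL_ADMIN_PANEL_ACCESS` override that `has_permission` may apply is not modelled.

```python
from enum import Enum


class Permission(str, Enum):
    # Only the two values relevant to this sketch, copied from the diff.
    READ_DOCUMENT_SETS = "read:document_sets"
    MANAGE_DOCUMENT_SETS = "manage:document_sets"


def visible_scope(granted: set[Permission], get_editable: bool) -> str:
    """Return "all" or "none", matching the branches in the diff."""
    # MANAGE: everything, whether reading or editing.
    if Permission.MANAGE_DOCUMENT_SETS in granted:
        return "all"
    # READ: everything when reading, nothing when editing.
    if Permission.READ_DOCUMENT_SETS in granted and not get_editable:
        return "all"
    # No relevant permission: nothing.
    return "none"
```

Note that the per-group and per-owner filtering from the removed code has no counterpart in these branches, so a user with only `READ_DOCUMENT_SETS` gets no editable document sets at all.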
--- a/backend/onyx/db/engine/sql_engine.py
+++ b/backend/onyx/db/engine/sql_engine.py
@@ -11,7 +11,6 @@ from sqlalchemy import event
 from sqlalchemy import pool
 from sqlalchemy.engine import create_engine
 from sqlalchemy.engine import Engine
-from sqlalchemy.exc import DBAPIError
 from sqlalchemy.orm import Session

 from onyx.configs.app_configs import DB_READONLY_PASSWORD
@@ -347,25 +346,6 @@ def get_session_with_shared_schema() -> Generator[Session, None, None]:
    CURRENT_TENANT_ID_CONTEXTVAR.reset(token)


-def _safe_close_session(session: Session) -> None:
-    """Close a session, catching connection-closed errors during cleanup.
-
-    Long-running operations (e.g. multi-model LLM loops) can hold a session
-    open for minutes.  If the underlying connection is dropped by cloud
-    infrastructure (load-balancer timeouts, PgBouncer, idle-in-transaction
-    timeouts, etc.), the implicit rollback in Session.close() raises
-    OperationalError or InterfaceError.  Since the work is already complete,
-    we log and move on — SQLAlchemy internally invalidates the connection
-    for pool recycling.
-    """
-    try:
-        session.close()
-    except DBAPIError:
-        logger.warning(
-            "DB connection lost during session cleanup — the connection will be invalidated and recycled by the pool."
-        )
-
-
 @contextmanager
 def get_session_with_tenant(*, tenant_id: str) -> Generator[Session, None, None]:
    """
@@ -378,11 +358,8 @@ def get_session_with_tenant(*, tenant_id: str) -> Generator[Session, None, None]

    # no need to use the schema translation map for self-hosted + default schema
    if not MULTI_TENANT and tenant_id == POSTGRES_DEFAULT_SCHEMA_STANDARD_VALUE:
-        session = Session(bind=engine, expire_on_commit=False)
-        try:
+        with Session(bind=engine, expire_on_commit=False) as session:
            yield session
-        finally:
-            _safe_close_session(session)
        return

    # Create connection with schema translation to handle querying the right schema
@@ -390,11 +367,8 @@ def get_session_with_tenant(*, tenant_id: str) -> Generator[Session, None, None]
    with engine.connect().execution_options(
        schema_translate_map=schema_translate_map
    ) as connection:
-        session = Session(bind=connection, expire_on_commit=False)
-        try:
+        with Session(bind=connection, expire_on_commit=False) as session:
            yield session
-        finally:
-            _safe_close_session(session)


 def get_session() -> Generator[Session, None, None]:
--- a/backend/onyx/db/enums.py
+++ b/backend/onyx/db/enums.py
@@ -366,12 +366,12 @@ class Permission(str, PyEnum):
    READ_DOCUMENT_SETS = "read:document_sets"
    READ_AGENTS = "read:agents"
    READ_USERS = "read:users"
+    READ_USER_GROUPS = "read:user_groups"

    # Add / Manage pairs
    ADD_AGENTS = "add:agents"
    MANAGE_AGENTS = "manage:agents"
    MANAGE_DOCUMENT_SETS = "manage:document_sets"
-    ADD_CONNECTORS = "add:connectors"
    MANAGE_CONNECTORS = "manage:connectors"
    MANAGE_LLMS = "manage:llms"

@@ -381,8 +381,8 @@ class Permission(str, PyEnum):
    READ_QUERY_HISTORY = "read:query_history"
    MANAGE_USER_GROUPS = "manage:user_groups"
    CREATE_USER_API_KEYS = "create:user_api_keys"
-    CREATE_SERVICE_ACCOUNT_API_KEYS = "create:service_account_api_keys"
-    CREATE_SLACK_DISCORD_BOTS = "create:slack_discord_bots"
+    MANAGE_SERVICE_ACCOUNT_API_KEYS = "manage:service_account_api_keys"
+    MANAGE_BOTS = "manage:bots"

    # Override — any permission check passes
    FULL_ADMIN_PANEL_ACCESS = "admin"
--- a/backend/onyx/db/memory.py
+++ b/backend/onyx/db/memory.py
@@ -5,7 +5,6 @@ from pydantic import ConfigDict
 from sqlalchemy import select
 from sqlalchemy.orm import Session

-from onyx.db.engine.sql_engine import get_session_with_current_tenant_if_none
 from onyx.db.models import Memory
 from onyx.db.models import User

@@ -84,51 +83,47 @@ def get_memories(user: User, db_session: Session) -> UserMemoryContext:
 def add_memory(
    user_id: UUID,
    memory_text: str,
-    db_session: Session | None = None,
-) -> int:
+    db_session: Session,
+) -> Memory:
    """Insert a new Memory row for the given user.

    If the user already has MAX_MEMORIES_PER_USER memories, the oldest
    one (lowest id) is deleted before inserting the new one.
-
-    Returns the id of the newly created Memory row.
    """
-    with get_session_with_current_tenant_if_none(db_session) as db_session:
-        existing = db_session.scalars(
-            select(Memory).where(Memory.user_id == user_id).order_by(Memory.id.asc())
-        ).all()
+    existing = db_session.scalars(
+        select(Memory).where(Memory.user_id == user_id).order_by(Memory.id.asc())
+    ).all()

-        if len(existing) >= MAX_MEMORIES_PER_USER:
-            db_session.delete(existing[0])
+    if len(existing) >= MAX_MEMORIES_PER_USER:
+        db_session.delete(existing[0])

-        memory = Memory(
-            user_id=user_id,
-            memory_text=memory_text,
-        )
-        db_session.add(memory)
-        db_session.commit()
-        return memory.id
+    memory = Memory(
+        user_id=user_id,
+        memory_text=memory_text,
+    )
+    db_session.add(memory)
+    db_session.commit()
+    return memory


 def update_memory_at_index(
    user_id: UUID,
    index: int,
    new_text: str,
-    db_session: Session | None = None,
-) -> int | None:
+    db_session: Session,
+) -> Memory | None:
    """Update the memory at the given 0-based index (ordered by id ASC, matching get_memories()).

-    Returns the id of the updated Memory row, or None if the index is out of range.
+    Returns the updated Memory row, or None if the index is out of range.
    """
-    with get_session_with_current_tenant_if_none(db_session) as db_session:
-        memory_rows = db_session.scalars(
-            select(Memory).where(Memory.user_id == user_id).order_by(Memory.id.asc())
-        ).all()
+    memory_rows = db_session.scalars(
+        select(Memory).where(Memory.user_id == user_id).order_by(Memory.id.asc())
+    ).all()

-        if index < 0 or index >= len(memory_rows):
-            return None
+    if index < 0 or index >= len(memory_rows):
+        return None

-        memory = memory_rows[index]
-        memory.memory_text = new_text
-        db_session.commit()
-        return memory.id
+    memory = memory_rows[index]
+    memory.memory_text = new_text
+    db_session.commit()
+    return memory
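
Review note: `add_memory` above evicts the oldest row (lowest id) once the user is at the cap, then inserts the new one. A minimal in-memory sketch of that rule follows, using a list of ids in place of the `Memory` table. The name `add_with_eviction` and the cap passed to it are hypothetical; the real value of `MAX_MEMORIES_PER_USER` is not shown in this diff.

```python
def add_with_eviction(existing_ids: list[int], new_id: int, max_items: int) -> list[int]:
    """Return the ids that remain after inserting new_id.

    existing_ids is assumed to be sorted ascending, like the
    ORDER BY Memory.id ASC query in the diff.
    """
    kept = list(existing_ids)
    if len(kept) >= max_items:
        # Only one row is dropped per insert, as in the diff.
        kept.pop(0)
    kept.append(new_id)
    return kept
```

Because only a single row is removed per call, a user who is already over the cap stays over it after the insert.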
--- a/backend/onyx/db/persona.py
+++ b/backend/onyx/db/persona.py
@@ -16,12 +16,12 @@ from sqlalchemy.orm import selectinload
 from sqlalchemy.orm import Session

 from onyx.access.hierarchy_access import get_user_external_group_ids
-from onyx.auth.schemas import UserRole
-from onyx.configs.app_configs import CURATORS_CANNOT_VIEW_OR_EDIT_NON_OWNED_ASSISTANTS
+from onyx.auth.permissions import has_permission
 from onyx.configs.constants import DEFAULT_PERSONA_ID
 from onyx.configs.constants import NotificationType
 from onyx.db.constants import SLACK_BOT_PERSONA_PREFIX
 from onyx.db.document_access import get_accessible_documents_by_ids
+from onyx.db.enums import Permission
 from onyx.db.models import ConnectorCredentialPair
 from onyx.db.models import Document
 from onyx.db.models import DocumentSet
@@ -74,7 +74,9 @@ class PersonaLoadType(Enum):
 def _add_user_filters(
    stmt: Select[tuple[Persona]], user: User, get_editable: bool = True
 ) -> Select[tuple[Persona]]:
-    if user.role == UserRole.ADMIN:
+    if has_permission(user, Permission.MANAGE_AGENTS):
+        return stmt
+    if not get_editable and has_permission(user, Permission.READ_AGENTS):
        return stmt

    stmt = stmt.distinct()
@@ -98,12 +100,7 @@ def _add_user_filters(
    """
    Filter Personas by:
    - if the user is in the user_group that owns the Persona
-    - if the user is not a global_curator, they must also have a curator relationship
-    to the user_group
-    - if editing is being done, we also filter out Personas that are owned by groups
-    that the user isn't a curator for
-    - if we are not editing, we show all Personas in the groups the user is a curator
-    for (as well as public Personas)
+    - if we are not editing, we show all public and listed Personas
    - if we are not editing, we return all Personas directly connected to the user
    """

@@ -112,21 +109,9 @@ def _add_user_filters(
        where_clause = Persona.is_public == True  # noqa: E712
        return stmt.where(where_clause)

-    # If curator ownership restriction is enabled, curators can only access their own assistants
-    if CURATORS_CANNOT_VIEW_OR_EDIT_NON_OWNED_ASSISTANTS and user.role in [
-        UserRole.CURATOR,
-        UserRole.GLOBAL_CURATOR,
-    ]:
-        where_clause = (Persona.user_id == user.id) | (Persona.user_id.is_(None))
-        return stmt.where(where_clause)
-
    where_clause = User__UserGroup.user_id == user.id
-    if user.role == UserRole.CURATOR and get_editable:
-        where_clause &= User__UserGroup.is_curator == True  # noqa: E712
    if get_editable:
        user_groups = select(User__UG.user_group_id).where(User__UG.user_id == user.id)
-        if user.role == UserRole.CURATOR:
-            user_groups = user_groups.where(User__UG.is_curator == True)  # noqa: E712
        where_clause &= (
            ~exists()
            .where(Persona__UG.persona_id == Persona.id)
@@ -197,7 +182,7 @@ def _get_persona_by_name(
    - Non-admin users: can only see their own personas
    """
    stmt = select(Persona).where(Persona.name == persona_name)
-    if user and user.role != UserRole.ADMIN:
+    if user and not has_permission(user, Permission.MANAGE_AGENTS):
        stmt = stmt.where(Persona.user_id == user.id)
    result = db_session.execute(stmt).scalar_one_or_none()
    return result
@@ -271,12 +256,10 @@ def create_update_persona(
    try:
        # Featured persona validation
        if create_persona_request.is_featured:
-            # Curators can edit featured personas, but not make them
-            # TODO this will be reworked soon with RBAC permissions feature
-            if user.role == UserRole.CURATOR or user.role == UserRole.GLOBAL_CURATOR:
-                pass
-            elif user.role != UserRole.ADMIN:
-                raise ValueError("Only admins can make a featured persona")
+            if not has_permission(user, Permission.MANAGE_AGENTS):
+                raise ValueError(
+                    "Only users with agent management permissions can make a featured persona"
+                )

        # Convert incoming string UUIDs to UUID objects for DB operations
        converted_user_file_ids = None
@@ -353,7 +336,11 @@ def update_persona_shared(
        db_session=db_session, persona_id=persona_id, user=user, get_editable=True
    )

-    if user and user.role != UserRole.ADMIN and persona.user_id != user.id:
+    if (
+        user
+        and not has_permission(user, Permission.MANAGE_AGENTS)
+        and persona.user_id != user.id
+    ):
        raise PermissionError("You don't have permission to modify this persona")

    versioned_update_persona_access = fetch_versioned_implementation(
@@ -389,7 +376,10 @@ def update_persona_public_status(
    persona = fetch_persona_by_id_for_user(
        db_session=db_session, persona_id=persona_id, user=user, get_editable=True
    )
-    if user.role != UserRole.ADMIN and persona.user_id != user.id:
+    if (
+        not has_permission(user, Permission.MANAGE_AGENTS)
+        and persona.user_id != user.id
+    ):
        raise ValueError("You don't have permission to modify this persona")

    persona.is_public = is_public
@@ -1226,7 +1216,11 @@ def get_persona_by_id(
    if not include_deleted:
        persona_stmt = persona_stmt.where(Persona.deleted.is_(False))

-    if not user or user.role == UserRole.ADMIN:
+    if (
+        not user
+        or has_permission(user, Permission.MANAGE_AGENTS)
+        or (not is_for_edit and has_permission(user, Permission.READ_AGENTS))
+    ):
        result = db_session.execute(persona_stmt)
        persona = result.scalar_one_or_none()
        if persona is None:
@@ -1243,14 +1237,6 @@ def get_persona_by_id(
        # if the user is in the .users of the persona
        or_conditions |= User.id == user.id
        or_conditions |= Persona.is_public == True  # noqa: E712
-    elif user.role == UserRole.GLOBAL_CURATOR:
-        # global curators can edit personas for the groups they are in
-        or_conditions |= User__UserGroup.user_id == user.id
-    elif user.role == UserRole.CURATOR:
-        # curators can edit personas for the groups they are curators of
-        or_conditions |= (User__UserGroup.user_id == user.id) & (
-            User__UserGroup.is_curator == True  # noqa: E712
-        )

    persona_stmt = persona_stmt.where(or_conditions)
    result = db_session.execute(persona_stmt)
--- a/backend/onyx/deep_research/dr_loop.py
+++ b/backend/onyx/deep_research/dr_loop.py
@@ -7,6 +7,8 @@ import time
 from collections.abc import Callable
 from typing import cast

+from sqlalchemy.orm import Session
+
 from onyx.chat.chat_state import ChatStateContainer
 from onyx.chat.citation_processor import CitationMapping
 from onyx.chat.citation_processor import DynamicCitationProcessor
@@ -20,7 +22,6 @@ from onyx.chat.models import LlmStepResult
 from onyx.chat.models import ToolCallSimple
 from onyx.configs.chat_configs import SKIP_DEEP_RESEARCH_CLARIFICATION
 from onyx.configs.constants import MessageType
-from onyx.db.engine.sql_engine import get_session_with_current_tenant
 from onyx.db.tools import get_tool_by_name
 from onyx.deep_research.dr_mock_tools import get_clarification_tool_definitions
 from onyx.deep_research.dr_mock_tools import get_orchestrator_tools
@@ -183,14 +184,6 @@ def generate_final_report(
        return has_reasoned


-def _get_research_agent_tool_id() -> int:
-    with get_session_with_current_tenant() as db_session:
-        return get_tool_by_name(
-            tool_name=RESEARCH_AGENT_TOOL_NAME,
-            db_session=db_session,
-        ).id
-
-
 @log_function_time(print_only=True)
 def run_deep_research_llm_loop(
    emitter: Emitter,
@@ -200,6 +193,7 @@ def run_deep_research_llm_loop(
    custom_agent_prompt: str | None,  # noqa: ARG001
    llm: LLM,
    token_counter: Callable[[str], int],
+    db_session: Session,
    skip_clarification: bool = False,
    user_identity: LLMUserIdentity | None = None,
    chat_session_id: str | None = None,
@@ -723,7 +717,6 @@ def run_deep_research_llm_loop(
                    simple_chat_history.append(assistant_with_tools)

                    # Now add TOOL_CALL_RESPONSE messages and tool call info for each result
-                    research_agent_tool_id = _get_research_agent_tool_id()
                    for tab_index, report in enumerate(
                        research_results.intermediate_reports
                    ):
@@ -744,7 +737,10 @@ def run_deep_research_llm_loop(
                            tab_index=tab_index,
                            tool_name=current_tool_call.tool_name,
                            tool_call_id=current_tool_call.tool_call_id,
-                            tool_id=research_agent_tool_id,
+                            tool_id=get_tool_by_name(
+                                tool_name=RESEARCH_AGENT_TOOL_NAME,
+                                db_session=db_session,
+                            ).id,
                            reasoning_tokens=llm_step_result.reasoning
                            or most_recent_reasoning,
                            tool_call_arguments=current_tool_call.tool_args,
--- a/backend/onyx/error_handling/error_codes.py
+++ b/backend/onyx/error_handling/error_codes.py
@@ -56,6 +56,7 @@ class OnyxErrorCode(Enum):
    DOCUMENT_NOT_FOUND = ("DOCUMENT_NOT_FOUND", 404)
    SESSION_NOT_FOUND = ("SESSION_NOT_FOUND", 404)
    USER_NOT_FOUND = ("USER_NOT_FOUND", 404)
+    DOCUMENT_SET_NOT_FOUND = ("DOCUMENT_SET_NOT_FOUND", 404)

    # ------------------------------------------------------------------
    # Conflict (409)
--- a/backend/onyx/llm/constants.py
+++ b/backend/onyx/llm/constants.py
@@ -66,7 +66,7 @@ PROVIDER_DISPLAY_NAMES: dict[str, str] = {
    LlmProviderNames.LM_STUDIO: "LM Studio",
    LlmProviderNames.LITELLM_PROXY: "LiteLLM Proxy",
    LlmProviderNames.BIFROST: "Bifrost",
-    LlmProviderNames.OPENAI_COMPATIBLE: "OpenAI-Compatible",
+    LlmProviderNames.OPENAI_COMPATIBLE: "OpenAI Compatible",
    "groq": "Groq",
    "anyscale": "Anyscale",
    "deepseek": "DeepSeek",
@@ -87,44 +87,6 @@ PROVIDER_DISPLAY_NAMES: dict[str, str] = {
    "gemini": "Gemini",
    "stability": "Stability",
    "writer": "Writer",
-    # Custom provider display names (used in the custom provider picker)
-    "aiml": "AI/ML",
-    "assemblyai": "AssemblyAI",
-    "aws_polly": "AWS Polly",
-    "azure_ai": "Azure AI",
-    "chatgpt": "ChatGPT",
-    "cohere_chat": "Cohere Chat",
-    "datarobot": "DataRobot",
-    "deepgram": "Deepgram",
-    "deepinfra": "DeepInfra",
-    "elevenlabs": "ElevenLabs",
-    "fal_ai": "fal.ai",
-    "featherless_ai": "Featherless AI",
-    "fireworks_ai": "Fireworks AI",
-    "friendliai": "FriendliAI",
-    "gigachat": "GigaChat",
-    "github_copilot": "GitHub Copilot",
-    "gradient_ai": "Gradient AI",
-    "huggingface": "HuggingFace",
-    "jina_ai": "Jina AI",
-    "lambda_ai": "Lambda AI",
-    "llamagate": "LlamaGate",
-    "meta_llama": "Meta Llama",
-    "minimax": "MiniMax",
-    "nlp_cloud": "NLP Cloud",
-    "nvidia_nim": "NVIDIA NIM",
-    "oci": "OCI",
-    "ovhcloud": "OVHcloud",
-    "palm": "PaLM",
-    "publicai": "PublicAI",
-    "runwayml": "RunwayML",
-    "sambanova": "SambaNova",
-    "together_ai": "Together AI",
-    "vercel_ai_gateway": "Vercel AI Gateway",
-    "volcengine": "Volcengine",
-    "wandb": "W&B",
-    "watsonx": "IBM watsonx",
-    "zai": "ZAI",
 }

 # Map vendors to their brand names (used for provider_display_name generation)
--- a/backend/onyx/llm/model_metadata_enrichments.json
+++ b/backend/onyx/llm/model_metadata_enrichments.json
@@ -1516,10 +1516,6 @@
    "display_name": "Claude Opus 4.6",
    "model_vendor": "anthropic"
  },
-  "claude-opus-4-7": {
-    "display_name": "Claude Opus 4.7",
-    "model_vendor": "anthropic"
-  },
  "claude-opus-4-5-20251101": {
    "display_name": "Claude Opus 4.5",
    "model_vendor": "anthropic",
--- a/backend/onyx/llm/models.py
+++ b/backend/onyx/llm/models.py
@@ -46,15 +46,6 @@ ANTHROPIC_REASONING_EFFORT_BUDGET: dict[ReasoningEffort, int] = {
    ReasoningEffort.HIGH: 4096,
 }

-# Newer Anthropic models (Claude Opus 4.7+) use adaptive thinking with
-# output_config.effort instead of thinking.type.enabled + budget_tokens.
-ANTHROPIC_ADAPTIVE_REASONING_EFFORT: dict[ReasoningEffort, str] = {
-    ReasoningEffort.AUTO: "medium",
-    ReasoningEffort.LOW: "low",
-    ReasoningEffort.MEDIUM: "medium",
-    ReasoningEffort.HIGH: "high",
-}
-

 # Content part structures for multimodal messages
 # The classes in this mirror the OpenAI Chat Completions message types and work well with routers like LiteLLM
--- a/backend/onyx/llm/multi_llm.py
+++ b/backend/onyx/llm/multi_llm.py
@@ -23,7 +23,6 @@ from onyx.llm.interfaces import ToolChoiceOptions
 from onyx.llm.model_response import ModelResponse
 from onyx.llm.model_response import ModelResponseStream
 from onyx.llm.model_response import Usage
-from onyx.llm.models import ANTHROPIC_ADAPTIVE_REASONING_EFFORT
 from onyx.llm.models import ANTHROPIC_REASONING_EFFORT_BUDGET
 from onyx.llm.models import OPENAI_REASONING_EFFORT
 from onyx.llm.request_context import get_llm_mock_response
@@ -68,13 +67,8 @@ STANDARD_MAX_TOKENS_KWARG = "max_completion_tokens"
 _VERTEX_ANTHROPIC_MODELS_REJECTING_OUTPUT_CONFIG = (
    "claude-opus-4-5",
    "claude-opus-4-6",
-    "claude-opus-4-7",
 )

-# Anthropic models that require the adaptive thinking API (thinking.type.adaptive
-# + output_config.effort) instead of the legacy thinking.type.enabled + budget_tokens.
-_ANTHROPIC_ADAPTIVE_THINKING_MODELS = ("claude-opus-4-7",)
-

 class LLMTimeoutError(Exception):
    """
@@ -236,14 +230,6 @@ def _is_vertex_model_rejecting_output_config(model_name: str) -> bool:
    )


-def _anthropic_uses_adaptive_thinking(model_name: str) -> bool:
-    normalized_model_name = model_name.lower()
-    return any(
-        adaptive_model in normalized_model_name
-        for adaptive_model in _ANTHROPIC_ADAPTIVE_THINKING_MODELS
-    )
-
-
 class LitellmLLM(LLM):
    """Uses Litellm library to allow easy configuration to use a multitude of LLMs
    See https://python.langchain.com/docs/integrations/chat/litellm"""
@@ -523,6 +509,10 @@ class LitellmLLM(LLM):
                    }

            elif is_claude_model:
+                budget_tokens: int | None = ANTHROPIC_REASONING_EFFORT_BUDGET.get(
+                    reasoning_effort
+                )
+
                # Anthropic requires every assistant message with tool_use
                # blocks to start with a thinking block that carries a
                # cryptographic signature.  We don't preserve those blocks
@@ -530,35 +520,24 @@ class LitellmLLM(LLM):
                # contains tool-calling assistant messages.  LiteLLM's
                # modify_params workaround doesn't cover all providers
                # (notably Bedrock).
-                has_tool_call_history = _prompt_contains_tool_call_history(prompt)
+                can_enable_thinking = (
+                    budget_tokens is not None
+                    and not _prompt_contains_tool_call_history(prompt)
+                )

-                if _anthropic_uses_adaptive_thinking(self.config.model_name):
-                    # Newer Anthropic models (Claude Opus 4.7+) reject
-                    # thinking.type.enabled — they require the adaptive
-                    # thinking config with output_config.effort.
-                    if not has_tool_call_history:
-                        optional_kwargs["thinking"] = {"type": "adaptive"}
-                        optional_kwargs["output_config"] = {
-                            "effort": ANTHROPIC_ADAPTIVE_REASONING_EFFORT[
-                                reasoning_effort
-                            ],
-                        }
-                else:
-                    budget_tokens: int | None = ANTHROPIC_REASONING_EFFORT_BUDGET.get(
-                        reasoning_effort
-                    )
-                    if budget_tokens is not None and not has_tool_call_history:
-                        if max_tokens is not None:
-                            # Anthropic has a weird rule where max token has to be at least as much as budget tokens if set
-                            # and the minimum budget tokens is 1024
-                            # Will note that overwriting a developer set max tokens is not ideal but is the best we can do for now
-                            # It is better to allow the LLM to output more reasoning tokens even if it results in a fairly small tool
-                            # call as compared to reducing the budget for reasoning.
-                            max_tokens = max(budget_tokens + 1, max_tokens)
-                        optional_kwargs["thinking"] = {
-                            "type": "enabled",
-                            "budget_tokens": budget_tokens,
-                        }
+                if can_enable_thinking:
+                    assert budget_tokens is not None  # mypy
+                    if max_tokens is not None:
+                        # Anthropic requires max_tokens to be greater than budget_tokens when thinking is enabled,
+                        # and the minimum budget_tokens is 1024.
+                        # Overwriting a developer-set max_tokens is not ideal, but it is the best we can do for now.
+                        # It is better to let the LLM emit more reasoning tokens, even if the result is a fairly small
+                        # tool call, than to reduce the reasoning budget.
+                        max_tokens = max(budget_tokens + 1, max_tokens)
+                    optional_kwargs["thinking"] = {
+                        "type": "enabled",
+                        "budget_tokens": budget_tokens,
+                    }

                # LiteLLM just does some mapping like this anyway but is incomplete for Anthropic
                optional_kwargs.pop("reasoning_effort", None)
--- a/backend/onyx/llm/well_known_providers/llm_provider_options.py
+++ b/backend/onyx/llm/well_known_providers/llm_provider_options.py
@@ -338,7 +338,7 @@ def get_provider_display_name(provider_name: str) -> str:
        VERTEXAI_PROVIDER_NAME: "Google Vertex AI",
        OPENROUTER_PROVIDER_NAME: "OpenRouter",
        LITELLM_PROXY_PROVIDER_NAME: "LiteLLM Proxy",
-        OPENAI_COMPATIBLE_PROVIDER_NAME: "OpenAI-Compatible",
+        OPENAI_COMPATIBLE_PROVIDER_NAME: "OpenAI Compatible",
    }

    if provider_name in _ONYX_PROVIDER_DISPLAY_NAMES:
--- a/backend/onyx/llm/well_known_providers/recommended-models.json
+++ b/backend/onyx/llm/well_known_providers/recommended-models.json
@@ -1,6 +1,6 @@
 {
-  "version": "1.2",
-  "updated_at": "2026-04-16T00:00:00Z",
+  "version": "1.1",
+  "updated_at": "2026-03-05T00:00:00Z",
  "providers": {
    "openai": {
      "default_model": { "name": "gpt-5.4" },
@@ -10,12 +10,8 @@
      ]
    },
    "anthropic": {
-      "default_model": "claude-opus-4-7",
+      "default_model": "claude-opus-4-6",
      "additional_visible_models": [
-        {
-          "name": "claude-opus-4-7",
-          "display_name": "Claude Opus 4.7"
-        },
        {
          "name": "claude-opus-4-6",
          "display_name": "Claude Opus 4.6"
--- a/backend/onyx/onyxbot/slack/listener.py
+++ b/backend/onyx/onyxbot/slack/listener.py
@@ -90,7 +90,6 @@ from onyx.onyxbot.slack.utils import respond_in_thread_or_channel
 from onyx.onyxbot.slack.utils import TenantSocketModeClient
 from onyx.redis.redis_pool import get_redis_client
 from onyx.server.manage.models import SlackBotTokens
-from onyx.tracing.setup import setup_tracing
 from onyx.utils.logger import setup_logger
 from onyx.utils.variable_functionality import fetch_ee_implementation_or_noop
 from onyx.utils.variable_functionality import set_is_ee_based_on_env_variable
@@ -1207,7 +1206,6 @@ if __name__ == "__main__":
    tenant_handler = SlackbotHandler()

    set_is_ee_based_on_env_variable()
-    setup_tracing()

    try:
        # Keep the main thread alive
--- a/backend/onyx/prompts/tool_prompts.py
+++ b/backend/onyx/prompts/tool_prompts.py
@@ -65,9 +65,8 @@ IMPORTANT: each call to this tool is independent. Variables from previous calls
 GENERATE_IMAGE_GUIDANCE = """
 ## generate_image
 NEVER use generate_image unless the user specifically requests an image.
-To edit, restyle, or vary an existing image, pass its file_id in `reference_image_file_ids`. \
-File IDs come from `[attached image — file_id: <id>]` tags on user-attached images or from prior `generate_image` tool results — never invent one. \
-Leave `reference_image_file_ids` unset for a fresh generation.
+For edits/variations of a previously generated image, pass `reference_image_file_ids` with
+the `file_id` values returned by earlier `generate_image` tool results.
 """.lstrip()

 MEMORY_GUIDANCE = """
--- a/backend/onyx/server/api_key/api.py
+++ b/backend/onyx/server/api_key/api.py
@@ -20,7 +20,7 @@ router = APIRouter(prefix="/admin/api-key")

@router.get("")
 def list_api_keys(
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_SERVICE_ACCOUNT_API_KEYS)),
    db_session: Session = Depends(get_session),
 ) -> list[ApiKeyDescriptor]:
    return fetch_api_keys(db_session)
@@ -29,7 +29,9 @@ def list_api_keys(
@router.post("")
 def create_api_key(
    api_key_args: APIKeyArgs,
-    user: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    user: User = Depends(
+        require_permission(Permission.MANAGE_SERVICE_ACCOUNT_API_KEYS)
+    ),
    db_session: Session = Depends(get_session),
 ) -> ApiKeyDescriptor:
    return insert_api_key(db_session, api_key_args, user.id)
@@ -38,7 +40,7 @@ def create_api_key(
@router.post("/{api_key_id}/regenerate")
 def regenerate_existing_api_key(
    api_key_id: int,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_SERVICE_ACCOUNT_API_KEYS)),
    db_session: Session = Depends(get_session),
 ) -> ApiKeyDescriptor:
    return regenerate_api_key(db_session, api_key_id)
@@ -48,7 +50,7 @@ def regenerate_existing_api_key(
 def update_existing_api_key(
    api_key_id: int,
    api_key_args: APIKeyArgs,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_SERVICE_ACCOUNT_API_KEYS)),
    db_session: Session = Depends(get_session),
 ) -> ApiKeyDescriptor:
    return update_api_key(db_session, api_key_id, api_key_args)
@@ -57,7 +59,7 @@ def update_existing_api_key(
@router.delete("/{api_key_id}")
 def delete_api_key(
    api_key_id: int,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_SERVICE_ACCOUNT_API_KEYS)),
    db_session: Session = Depends(get_session),
 ) -> None:
    remove_api_key(db_session, api_key_id)
--- a/backend/onyx/server/features/build/sandbox/kubernetes/docker/README.md
+++ b/backend/onyx/server/features/build/sandbox/kubernetes/docker/README.md
@@ -58,7 +58,7 @@ docker buildx build --platform linux/amd64,linux/arm64 \

 1. **Build and push** the new image (see above)

-2. **Update the ConfigMap** in in the internal repo
+2. **Update the ConfigMap** in `cloud-deployment-yamls/danswer/configmap/env-configmap.yaml`:
   ```yaml
   SANDBOX_CONTAINER_IMAGE: "onyxdotapp/sandbox:v0.1.x"
   ```
--- a/backend/onyx/server/features/build/sandbox/kubernetes/kubernetes_sandbox_manager.py
+++ b/backend/onyx/server/features/build/sandbox/kubernetes/kubernetes_sandbox_manager.py
@@ -618,7 +618,6 @@ done
                    "app.kubernetes.io/managed-by": "onyx",
                    "onyx.app/sandbox-id": sandbox_id,
                    "onyx.app/tenant-id": tenant_id,
-                    "admission.datadoghq.com/enabled": "false",
                },
            ),
            spec=pod_spec,
--- a/backend/onyx/server/features/document_set/api.py
+++ b/backend/onyx/server/features/document_set/api.py
@@ -1,11 +1,9 @@
 from fastapi import APIRouter
 from fastapi import Depends
-from fastapi import HTTPException
 from fastapi import Query
 from sqlalchemy.orm import Session

 from onyx.auth.permissions import require_permission
-from onyx.auth.users import current_curator_or_admin_user
 from onyx.background.celery.versioned_apps.client import app as client_app
 from onyx.configs.app_configs import DISABLE_VECTOR_DB
 from onyx.configs.constants import OnyxCeleryPriority
@@ -20,6 +18,8 @@ from onyx.db.document_set import update_document_set
 from onyx.db.engine.sql_engine import get_session
 from onyx.db.enums import Permission
 from onyx.db.models import User
+from onyx.error_handling.error_codes import OnyxErrorCode
+from onyx.error_handling.exceptions import OnyxError
 from onyx.server.features.document_set.models import CheckDocSetPublicRequest
 from onyx.server.features.document_set.models import CheckDocSetPublicResponse
 from onyx.server.features.document_set.models import DocumentSetCreationRequest
@@ -35,7 +35,7 @@ router = APIRouter(prefix="/manage")
@router.post("/admin/document-set")
 def create_document_set(
    document_set_creation_request: DocumentSetCreationRequest,
-    user: User = Depends(current_curator_or_admin_user),
+    user: User = Depends(require_permission(Permission.MANAGE_DOCUMENT_SETS)),
    db_session: Session = Depends(get_session),
    tenant_id: str = Depends(get_current_tenant_id),
 ) -> int:
@@ -55,7 +55,7 @@ def create_document_set(
            db_session=db_session,
        )
    except Exception as e:
-        raise HTTPException(status_code=400, detail=str(e))
+        raise OnyxError(OnyxErrorCode.VALIDATION_ERROR, str(e))

    if not DISABLE_VECTOR_DB:
        client_app.send_task(
@@ -70,15 +70,15 @@ def create_document_set(
@router.patch("/admin/document-set")
 def patch_document_set(
    document_set_update_request: DocumentSetUpdateRequest,
-    user: User = Depends(current_curator_or_admin_user),
+    user: User = Depends(require_permission(Permission.MANAGE_DOCUMENT_SETS)),
    db_session: Session = Depends(get_session),
    tenant_id: str = Depends(get_current_tenant_id),
 ) -> None:
    document_set = get_document_set_by_id(db_session, document_set_update_request.id)
    if document_set is None:
-        raise HTTPException(
-            status_code=404,
-            detail=f"Document set {document_set_update_request.id} does not exist",
+        raise OnyxError(
+            OnyxErrorCode.DOCUMENT_SET_NOT_FOUND,
+            f"Document set {document_set_update_request.id} does not exist",
        )

    fetch_ee_implementation_or_noop(
@@ -98,7 +98,7 @@ def patch_document_set(
            user=user,
        )
    except Exception as e:
-        raise HTTPException(status_code=400, detail=str(e))
+        raise OnyxError(OnyxErrorCode.VALIDATION_ERROR, str(e))

    if not DISABLE_VECTOR_DB:
        client_app.send_task(
@@ -111,15 +111,15 @@ def patch_document_set(
@router.delete("/admin/document-set/{document_set_id}")
 def delete_document_set(
    document_set_id: int,
-    user: User = Depends(current_curator_or_admin_user),
+    user: User = Depends(require_permission(Permission.MANAGE_DOCUMENT_SETS)),
    db_session: Session = Depends(get_session),
    tenant_id: str = Depends(get_current_tenant_id),
 ) -> None:
    document_set = get_document_set_by_id(db_session, document_set_id)
    if document_set is None:
-        raise HTTPException(
-            status_code=404,
-            detail=f"Document set {document_set_id} does not exist",
+        raise OnyxError(
+            OnyxErrorCode.DOCUMENT_SET_NOT_FOUND,
+            f"Document set {document_set_id} does not exist",
        )

    # check if the user has "edit" access to the document set.
@@ -142,7 +142,7 @@ def delete_document_set(
            user=user,
        )
    except Exception as e:
-        raise HTTPException(status_code=400, detail=str(e))
+        raise OnyxError(OnyxErrorCode.VALIDATION_ERROR, str(e))

    if DISABLE_VECTOR_DB:
        db_session.refresh(document_set)
--- a/backend/onyx/server/features/mcp/api.py
+++ b/backend/onyx/server/features/mcp/api.py
@@ -96,32 +96,6 @@ def _truncate_description(description: str | None, max_length: int = 500) -> str
    return description[: max_length - 3] + "..."


-# TODO: Replace mask-comparison approach with an explicit Unset sentinel from the
-# frontend indicating whether each credential field was actually modified. The current
-# approach is brittle (e.g. short credentials produce a fixed-length mask that could
-# collide) and mutates request values, which is surprising. The frontend should signal
-# "unchanged" vs "new value" directly rather than relying on masked-string equality.
-def _restore_masked_oauth_credentials(
-    request_client_id: str | None,
-    request_client_secret: str | None,
-    existing_client: OAuthClientInformationFull,
-) -> tuple[str | None, str | None]:
-    """If the frontend sent back masked credentials, restore the real stored values."""
-    if (
-        request_client_id
-        and existing_client.client_id
-        and request_client_id == mask_string(existing_client.client_id)
-    ):
-        request_client_id = existing_client.client_id
-    if (
-        request_client_secret
-        and existing_client.client_secret
-        and request_client_secret == mask_string(existing_client.client_secret)
-    ):
-        request_client_secret = existing_client.client_secret
-    return request_client_id, request_client_secret
-
-
 router = APIRouter(prefix="/mcp")
 admin_router = APIRouter(prefix="/admin/mcp")
 STATE_TTL_SECONDS = 60 * 5  # 5 minutes
@@ -418,26 +392,6 @@ async def _connect_oauth(
            detail=f"Server was configured with authentication type {auth_type_str}",
        )

-    # If the frontend sent back masked credentials (unchanged by the user),
-    # restore the real stored values so we don't overwrite them with masks.
-    if mcp_server.admin_connection_config:
-        existing_data = extract_connection_data(
-            mcp_server.admin_connection_config, apply_mask=False
-        )
-        existing_client_raw = existing_data.get(MCPOAuthKeys.CLIENT_INFO.value)
-        if existing_client_raw:
-            existing_client = OAuthClientInformationFull.model_validate(
-                existing_client_raw
-            )
-            (
-                request.oauth_client_id,
-                request.oauth_client_secret,
-            ) = _restore_masked_oauth_credentials(
-                request.oauth_client_id,
-                request.oauth_client_secret,
-                existing_client,
-            )
-
    # Create admin config with client info if provided
    config_data = MCPConnectionData(headers={})
    if request.oauth_client_id and request.oauth_client_secret:
@@ -1402,19 +1356,6 @@ def _upsert_mcp_server(
            if client_info_raw:
                client_info = OAuthClientInformationFull.model_validate(client_info_raw)

-        # If the frontend sent back masked credentials (unchanged by the user),
-        # restore the real stored values so the comparison below sees no change
-        # and the credentials aren't overwritten with masked strings.
-        if client_info and request.auth_type == MCPAuthenticationType.OAUTH:
-            (
-                request.oauth_client_id,
-                request.oauth_client_secret,
-            ) = _restore_masked_oauth_credentials(
-                request.oauth_client_id,
-                request.oauth_client_secret,
-                client_info,
-            )
-
        changing_connection_config = (
            not mcp_server.admin_connection_config
            or (
--- a/backend/onyx/server/features/notifications/api.py
+++ b/backend/onyx/server/features/notifications/api.py
@@ -11,9 +11,6 @@ from onyx.db.notification import dismiss_notification
 from onyx.db.notification import get_notification_by_id
 from onyx.db.notification import get_notifications
 from onyx.server.features.build.utils import ensure_build_mode_intro_notification
-from onyx.server.features.notifications.utils import (
-    ensure_permissions_migration_notification,
-)
 from onyx.server.features.release_notes.utils import (
    ensure_release_notes_fresh_and_notify,
 )
@@ -52,13 +49,6 @@ def get_notifications_api(
    except Exception:
        logger.exception("Failed to check for release notes in notifications endpoint")

-    try:
-        ensure_permissions_migration_notification(user, db_session)
-    except Exception:
-        logger.exception(
-            "Failed to create permissions_migration_v1 announcement in notifications endpoint"
-        )
-
    notifications = [
        NotificationModel.from_model(notif)
        for notif in get_notifications(user, db_session, include_dismissed=True)
--- a/backend/onyx/server/features/notifications/utils.py
+++ b/backend/onyx/server/features/notifications/utils.py
@@ -1,21 +0,0 @@
-from sqlalchemy.orm import Session
-
-from onyx.configs.constants import NotificationType
-from onyx.db.models import User
-from onyx.db.notification import create_notification
-
-
-def ensure_permissions_migration_notification(user: User, db_session: Session) -> None:
-    # Feature id "permissions_migration_v1" must not change after shipping —
-    # it is the dedup key on (user_id, notif_type, additional_data).
-    create_notification(
-        user_id=user.id,
-        notif_type=NotificationType.FEATURE_ANNOUNCEMENT,
-        db_session=db_session,
-        title="Permissions are changing in Onyx",
-        description="Roles are moving to group-based permissions. Click for details.",
-        additional_data={
-            "feature": "permissions_migration_v1",
-            "link": "https://docs.onyx.app/admins/permissions/whats_changing",
-        },
-    )
--- a/backend/onyx/server/features/persona/api.py
+++ b/backend/onyx/server/features/persona/api.py
@@ -11,7 +11,6 @@ from sqlalchemy.orm import Session

 from onyx.auth.permissions import require_permission
 from onyx.auth.users import current_chat_accessible_user
-from onyx.auth.users import current_curator_or_admin_user
 from onyx.auth.users import current_limited_user
 from onyx.configs.app_configs import DISABLE_VECTOR_DB
 from onyx.configs.constants import FileOrigin
@@ -135,7 +134,7 @@ class IsFeaturedRequest(BaseModel):
 def patch_persona_visibility(
    persona_id: int,
    is_listed_request: IsListedRequest,
-    user: User = Depends(current_curator_or_admin_user),
+    user: User = Depends(require_permission(Permission.MANAGE_AGENTS)),
    db_session: Session = Depends(get_session),
 ) -> None:
    update_persona_visibility(
@@ -150,7 +149,7 @@ def patch_persona_visibility(
 def patch_user_persona_public_status(
    persona_id: int,
    is_public_request: IsPublicRequest,
-    user: User = Depends(require_permission(Permission.BASIC_ACCESS)),
+    user: User = Depends(require_permission(Permission.ADD_AGENTS)),
    db_session: Session = Depends(get_session),
 ) -> None:
    try:
@@ -169,7 +168,7 @@ def patch_user_persona_public_status(
 def patch_persona_featured_status(
    persona_id: int,
    is_featured_request: IsFeaturedRequest,
-    user: User = Depends(current_curator_or_admin_user),
+    user: User = Depends(require_permission(Permission.MANAGE_AGENTS)),
    db_session: Session = Depends(get_session),
 ) -> None:
    try:
@@ -204,7 +203,7 @@ def patch_agents_display_priorities(

@admin_router.get("", tags=PUBLIC_API_TAGS)
 def list_personas_admin(
-    user: User = Depends(current_curator_or_admin_user),
+    user: User = Depends(require_permission(Permission.READ_AGENTS)),
    db_session: Session = Depends(get_session),
    include_deleted: bool = False,
    get_editable: bool = Query(False, description="If true, return editable personas"),
@@ -221,7 +220,7 @@ def list_personas_admin(
 def get_agents_admin_paginated(
    page_num: int = Query(0, ge=0, description="Page number (0-indexed)."),
    page_size: int = Query(10, ge=1, le=1000, description="Items per page."),
-    user: User = Depends(current_curator_or_admin_user),
+    user: User = Depends(require_permission(Permission.READ_AGENTS)),
    db_session: Session = Depends(get_session),
    include_deleted: bool = Query(
        False, description="If true, includes deleted personas."
@@ -298,7 +297,7 @@ def upload_file(
@basic_router.post("", tags=PUBLIC_API_TAGS)
 def create_persona(
    persona_upsert_request: PersonaUpsertRequest,
-    user: User = Depends(require_permission(Permission.BASIC_ACCESS)),
+    user: User = Depends(require_permission(Permission.ADD_AGENTS)),
    db_session: Session = Depends(get_session),
 ) -> PersonaSnapshot:
    tenant_id = get_current_tenant_id()
@@ -328,7 +327,7 @@ def create_persona(
 def update_persona(
    persona_id: int,
    persona_upsert_request: PersonaUpsertRequest,
-    user: User = Depends(require_permission(Permission.BASIC_ACCESS)),
+    user: User = Depends(require_permission(Permission.ADD_AGENTS)),
    db_session: Session = Depends(get_session),
 ) -> PersonaSnapshot:
    _validate_user_knowledge_enabled(persona_upsert_request, "update")
@@ -410,7 +409,7 @@ class PersonaShareRequest(BaseModel):
 def share_persona(
    persona_id: int,
    persona_share_request: PersonaShareRequest,
-    user: User = Depends(require_permission(Permission.BASIC_ACCESS)),
+    user: User = Depends(require_permission(Permission.ADD_AGENTS)),
    db_session: Session = Depends(get_session),
 ) -> None:
    try:
@@ -434,7 +433,7 @@ def share_persona(
@basic_router.delete("/{persona_id}", tags=PUBLIC_API_TAGS)
 def delete_persona(
    persona_id: int,
-    user: User = Depends(require_permission(Permission.BASIC_ACCESS)),
+    user: User = Depends(require_permission(Permission.ADD_AGENTS)),
    db_session: Session = Depends(get_session),
 ) -> None:
    mark_persona_as_deleted(
--- a/backend/onyx/server/manage/discord_bot/api.py
+++ b/backend/onyx/server/manage/discord_bot/api.py
@@ -2,8 +2,6 @@

 from fastapi import APIRouter
 from fastapi import Depends
-from fastapi import HTTPException
-from fastapi import status
 from sqlalchemy.orm import Session

 from onyx.auth.permissions import require_permission
@@ -25,6 +23,8 @@ from onyx.db.discord_bot import update_guild_config
 from onyx.db.engine.sql_engine import get_session
 from onyx.db.enums import Permission
 from onyx.db.models import User
+from onyx.error_handling.error_codes import OnyxErrorCode
+from onyx.error_handling.exceptions import OnyxError
 from onyx.server.manage.discord_bot.models import DiscordBotConfigCreateRequest
 from onyx.server.manage.discord_bot.models import DiscordBotConfigResponse
 from onyx.server.manage.discord_bot.models import DiscordChannelConfigResponse
@@ -48,14 +48,14 @@ def _check_bot_config_api_access() -> None:
    - When DISCORD_BOT_TOKEN env var is set (managed via env)
    """
    if AUTH_TYPE == AuthType.CLOUD:
-        raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="Discord bot configuration is managed by Onyx on Cloud.",
+        raise OnyxError(
+            OnyxErrorCode.INSUFFICIENT_PERMISSIONS,
+            "Discord bot configuration is managed by Onyx on Cloud.",
        )
    if DISCORD_BOT_TOKEN:
-        raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="Discord bot is configured via environment variables. API access disabled.",
+        raise OnyxError(
+            OnyxErrorCode.INSUFFICIENT_PERMISSIONS,
+            "Discord bot is configured via environment variables. API access disabled.",
        )


@@ -65,7 +65,7 @@ def _check_bot_config_api_access() -> None:
@router.get("/config", response_model=DiscordBotConfigResponse)
 def get_bot_config(
    _: None = Depends(_check_bot_config_api_access),
-    __: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    __: User = Depends(require_permission(Permission.MANAGE_BOTS)),
    db_session: Session = Depends(get_session),
 ) -> DiscordBotConfigResponse:
    """Get Discord bot config. Returns 403 on Cloud or if env vars set."""
@@ -83,7 +83,7 @@ def get_bot_config(
 def create_bot_request(
    request: DiscordBotConfigCreateRequest,
    _: None = Depends(_check_bot_config_api_access),
-    __: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    __: User = Depends(require_permission(Permission.MANAGE_BOTS)),
    db_session: Session = Depends(get_session),
 ) -> DiscordBotConfigResponse:
    """Create Discord bot config. Returns 403 on Cloud or if env vars set."""
@@ -93,9 +93,9 @@ def create_bot_request(
            bot_token=request.bot_token,
        )
    except ValueError:
-        raise HTTPException(
-            status_code=status.HTTP_409_CONFLICT,
-            detail="Discord bot config already exists. Delete it first to create a new one.",
+        raise OnyxError(
+            OnyxErrorCode.CONFLICT,
+            "Discord bot config already exists. Delete it first to create a new one.",
        )

    db_session.commit()
@@ -109,7 +109,7 @@ def create_bot_request(
@router.delete("/config")
 def delete_bot_config_endpoint(
    _: None = Depends(_check_bot_config_api_access),
-    __: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    __: User = Depends(require_permission(Permission.MANAGE_BOTS)),
    db_session: Session = Depends(get_session),
 ) -> dict:
    """Delete Discord bot config.
@@ -118,7 +118,7 @@ def delete_bot_config_endpoint(
    """
    deleted = delete_discord_bot_config(db_session)
    if not deleted:
-        raise HTTPException(status_code=404, detail="Bot config not found")
+        raise OnyxError(OnyxErrorCode.NOT_FOUND, "Bot config not found")

    # Also delete the service API key used by the Discord bot
    delete_discord_service_api_key(db_session)
@@ -132,7 +132,7 @@ def delete_bot_config_endpoint(

@router.delete("/service-api-key")
 def delete_service_api_key_endpoint(
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
    db_session: Session = Depends(get_session),
 ) -> dict:
    """Delete the Discord service API key.
@@ -145,7 +145,7 @@ def delete_service_api_key_endpoint(
    """
    deleted = delete_discord_service_api_key(db_session)
    if not deleted:
-        raise HTTPException(status_code=404, detail="Service API key not found")
+        raise OnyxError(OnyxErrorCode.NOT_FOUND, "Service API key not found")
    db_session.commit()
    return {"deleted": True}

@@ -155,7 +155,7 @@ def delete_service_api_key_endpoint(

@router.get("/guilds", response_model=list[DiscordGuildConfigResponse])
 def list_guild_configs(
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
    db_session: Session = Depends(get_session),
 ) -> list[DiscordGuildConfigResponse]:
    """List all guild configs (pending and registered)."""
@@ -165,7 +165,7 @@ def list_guild_configs(

@router.post("/guilds", response_model=DiscordGuildConfigCreateResponse)
 def create_guild_request(
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
    db_session: Session = Depends(get_session),
 ) -> DiscordGuildConfigCreateResponse:
    """Create new guild config with registration key. Key shown once."""
@@ -184,13 +184,13 @@ def create_guild_request(
@router.get("/guilds/{config_id}", response_model=DiscordGuildConfigResponse)
 def get_guild_config(
    config_id: int,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
    db_session: Session = Depends(get_session),
 ) -> DiscordGuildConfigResponse:
    """Get specific guild config."""
    config = get_guild_config_by_internal_id(db_session, internal_id=config_id)
    if not config:
-        raise HTTPException(status_code=404, detail="Guild config not found")
+        raise OnyxError(OnyxErrorCode.NOT_FOUND, "Guild config not found")
    return DiscordGuildConfigResponse.model_validate(config)


@@ -198,13 +198,13 @@ def get_guild_config(
 def update_guild_request(
    config_id: int,
    request: DiscordGuildConfigUpdateRequest,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
    db_session: Session = Depends(get_session),
 ) -> DiscordGuildConfigResponse:
    """Update guild config."""
    config = get_guild_config_by_internal_id(db_session, internal_id=config_id)
    if not config:
-        raise HTTPException(status_code=404, detail="Guild config not found")
+        raise OnyxError(OnyxErrorCode.NOT_FOUND, "Guild config not found")

    config = update_guild_config(
        db_session,
@@ -220,7 +220,7 @@ def update_guild_request(
@router.delete("/guilds/{config_id}")
 def delete_guild_request(
    config_id: int,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
    db_session: Session = Depends(get_session),
 ) -> dict:
    """Delete guild config (invalidates registration key).
@@ -229,7 +229,7 @@ def delete_guild_request(
    """
    deleted = delete_guild_config(db_session, config_id)
    if not deleted:
-        raise HTTPException(status_code=404, detail="Guild config not found")
+        raise OnyxError(OnyxErrorCode.NOT_FOUND, "Guild config not found")

    # On Cloud, delete service API key when all guilds are removed
    if AUTH_TYPE == AuthType.CLOUD:
@@ -249,15 +249,15 @@ def delete_guild_request(
 )
 def list_channel_configs(
    config_id: int,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
    db_session: Session = Depends(get_session),
 ) -> list[DiscordChannelConfigResponse]:
    """List whitelisted channels for a guild."""
    guild_config = get_guild_config_by_internal_id(db_session, internal_id=config_id)
    if not guild_config:
-        raise HTTPException(status_code=404, detail="Guild config not found")
+        raise OnyxError(OnyxErrorCode.NOT_FOUND, "Guild config not found")
    if not guild_config.guild_id:
-        raise HTTPException(status_code=400, detail="Guild not yet registered")
+        raise OnyxError(OnyxErrorCode.INVALID_INPUT, "Guild not yet registered")

    configs = get_channel_configs(db_session, config_id)
    return [DiscordChannelConfigResponse.model_validate(c) for c in configs]
@@ -271,7 +271,7 @@ def update_channel_request(
    guild_config_id: int,
    channel_config_id: int,
    request: DiscordChannelConfigUpdateRequest,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
    db_session: Session = Depends(get_session),
 ) -> DiscordChannelConfigResponse:
    """Update channel config."""
@@ -279,7 +279,7 @@ def update_channel_request(
        db_session, guild_config_id, channel_config_id
    )
    if not config:
-        raise HTTPException(status_code=404, detail="Channel config not found")
+        raise OnyxError(OnyxErrorCode.NOT_FOUND, "Channel config not found")

    config = update_discord_channel_config(
        db_session,
--- a/backend/onyx/server/manage/llm/api.py
+++ b/backend/onyx/server/manage/llm/api.py
@@ -15,8 +15,8 @@ from fastapi import Query
 from pydantic import ValidationError
 from sqlalchemy.orm import Session

+from onyx.auth.permissions import has_permission
 from onyx.auth.permissions import require_permission
-from onyx.auth.schemas import UserRole
 from onyx.auth.users import current_chat_accessible_user
 from onyx.db.engine.sql_engine import get_session
 from onyx.db.enums import LLMModelFlowType
@@ -40,8 +40,6 @@ from onyx.db.models import User
 from onyx.db.persona import user_can_access_persona
 from onyx.error_handling.error_codes import OnyxErrorCode
 from onyx.error_handling.exceptions import OnyxError
-from onyx.llm.constants import PROVIDER_DISPLAY_NAMES
-from onyx.llm.constants import WELL_KNOWN_PROVIDER_NAMES
 from onyx.llm.factory import get_default_llm
 from onyx.llm.factory import get_llm
 from onyx.llm.factory import get_max_input_tokens_from_llm_provider
@@ -62,7 +60,6 @@ from onyx.server.manage.llm.models import BedrockFinalModelResponse
 from onyx.server.manage.llm.models import BedrockModelsRequest
 from onyx.server.manage.llm.models import BifrostFinalModelResponse
 from onyx.server.manage.llm.models import BifrostModelsRequest
-from onyx.server.manage.llm.models import CustomProviderOption
 from onyx.server.manage.llm.models import DefaultModel
 from onyx.server.manage.llm.models import LitellmFinalModelResponse
 from onyx.server.manage.llm.models import LitellmModelDetails
@@ -111,43 +108,6 @@ def _mask_string(value: str) -> str:
    return value[:4] + "****" + value[-4:]


-def _resolve_api_key(
-    api_key: str | None,
-    provider_name: str | None,
-    api_base: str | None,
-    db_session: Session,
-) -> str | None:
-    """Return the real API key for model-fetch endpoints.
-
-    When editing an existing provider the form value is masked (e.g.
-    ``sk-a****b1c2``).  If *provider_name* is supplied we can look up
-    the unmasked key from the database so the external request succeeds.
-
-    The stored key is only returned when the request's *api_base*
-    matches the value stored in the database.
-    """
-    if not provider_name:
-        return api_key
-
-    existing_provider = fetch_existing_llm_provider(
-        name=provider_name, db_session=db_session
-    )
-    if existing_provider and existing_provider.api_key:
-        # Normalise both URLs before comparing so trailing-slash
-        # differences don't cause a false mismatch.
-        stored_base = (existing_provider.api_base or "").strip().rstrip("/")
-        request_base = (api_base or "").strip().rstrip("/")
-        if stored_base != request_base:
-            return api_key
-
-        stored_key = existing_provider.api_key.get_value(apply_mask=False)
-        # Only resolve when the incoming value is the masked form of the
-        # stored key — i.e. the user hasn't typed a new key.
-        if api_key and api_key == _mask_string(stored_key):
-            return stored_key
-    return api_key
-
-
 def _sync_fetched_models(
    db_session: Session,
    provider_name: str,
@@ -290,32 +250,9 @@ def _validate_llm_provider_change(
        )


-@admin_router.get("/custom-provider-names")
-def fetch_custom_provider_names(
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
-) -> list[CustomProviderOption]:
-    """Returns the sorted list of LiteLLM provider names that can be used
-    with the custom provider modal (i.e. everything that is not already
-    covered by a well-known provider modal)."""
-    import litellm
-
-    well_known = {p.value for p in WELL_KNOWN_PROVIDER_NAMES}
-    return sorted(
-        (
-            CustomProviderOption(
-                value=name,
-                label=PROVIDER_DISPLAY_NAMES.get(name, name.replace("_", " ").title()),
-            )
-            for name in litellm.models_by_provider.keys()
-            if name not in well_known
-        ),
-        key=lambda o: o.label.lower(),
-    )
-
-
 @admin_router.get("/built-in/options")
 def fetch_llm_options(
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
 ) -> list[WellKnownLLMProviderDescriptor]:
    return fetch_available_well_known_llms()

@@ -323,7 +260,7 @@ def fetch_llm_options(
 @admin_router.get("/built-in/options/{provider_name}")
 def fetch_llm_provider_options(
    provider_name: str,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
 ) -> WellKnownLLMProviderDescriptor:
    well_known_llms = fetch_available_well_known_llms()
    for well_known_llm in well_known_llms:
@@ -335,7 +272,7 @@ def fetch_llm_provider_options(
 @admin_router.post("/test")
 def test_llm_configuration(
    test_llm_request: TestLLMRequest,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> None:
    """Test LLM configuration settings"""
@@ -393,7 +330,7 @@ def test_llm_configuration(

 @admin_router.post("/test/default")
 def test_default_provider(
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
 ) -> None:
    try:
        llm = get_default_llm()
@@ -409,7 +346,7 @@ def test_default_provider(
 @admin_router.get("/provider")
 def list_llm_providers(
    include_image_gen: bool = Query(False),
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> LLMProviderResponse[LLMProviderView]:
    start_time = datetime.now(timezone.utc)
@@ -454,7 +391,7 @@ def put_llm_provider(
        False,
        description="True if creating a new one, False if updating an existing provider",
    ),
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> LLMProviderView:
    # validate request (e.g. if we're intending to create but the name already exists we should throw an error)
@@ -592,7 +529,7 @@ def put_llm_provider(
 def delete_llm_provider(
    provider_id: int,
    force: bool = Query(False),
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> None:
    if not force:
@@ -613,7 +550,7 @@ def delete_llm_provider(
 @admin_router.post("/default")
 def set_provider_as_default(
    default_model_request: DefaultModel,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> None:
    update_default_provider(
@@ -626,7 +563,7 @@ def set_provider_as_default(
 @admin_router.post("/default-vision")
 def set_provider_as_default_vision(
    default_model: DefaultModel,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> None:
    update_default_vision_provider(
@@ -638,7 +575,7 @@ def set_provider_as_default_vision(

 @admin_router.get("/auto-config")
 def get_auto_config(
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
 ) -> dict:
    """Get the current Auto mode configuration from GitHub.

@@ -656,7 +593,7 @@ def get_auto_config(

 @admin_router.get("/vision-providers")
 def get_vision_capable_providers(
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> LLMProviderResponse[VisionProviderResponse]:
    """Return a list of LLM providers and their models that support image input"""
@@ -718,7 +655,7 @@ def list_llm_provider_basics(

    all_providers = fetch_existing_llm_providers(db_session, [])
    user_group_ids = fetch_user_group_ids(db_session, user)
-    is_admin = user.role == UserRole.ADMIN
+    can_manage_llms = has_permission(user, Permission.MANAGE_LLMS)

    accessible_providers = []

@@ -730,7 +667,7 @@ def list_llm_provider_basics(
        # - Excludes providers with persona restrictions (requires specific persona)
        # - Excludes non-public providers with no restrictions (admin-only)
        if can_user_access_llm_provider(
-            provider, user_group_ids, persona=None, is_admin=is_admin
+            provider, user_group_ids, persona=None, is_admin=can_manage_llms
        ):
            accessible_providers.append(LLMProviderDescriptor.from_model(provider))

@@ -766,17 +703,19 @@ def get_valid_model_names_for_persona(
    if not persona:
        return []

-    is_admin = user.role == UserRole.ADMIN
+    can_manage_llms = has_permission(user, Permission.MANAGE_LLMS)
    all_providers = fetch_existing_llm_providers(
        db_session, [LLMModelFlowType.CHAT, LLMModelFlowType.VISION]
    )
-    user_group_ids = set() if is_admin else fetch_user_group_ids(db_session, user)
+    user_group_ids = (
+        set() if can_manage_llms else fetch_user_group_ids(db_session, user)
+    )

    valid_models = []
    for llm_provider_model in all_providers:
        # Check access with persona context — respects all RBAC restrictions
        if can_user_access_llm_provider(
-            llm_provider_model, user_group_ids, persona, is_admin=is_admin
+            llm_provider_model, user_group_ids, persona, is_admin=can_manage_llms
        ):
            # Collect all model names from this provider
            for model_config in llm_provider_model.model_configurations:
@@ -815,18 +754,20 @@ def list_llm_providers_for_persona(
            "You don't have access to this assistant",
        )

-    is_admin = user.role == UserRole.ADMIN
+    can_manage_llms = has_permission(user, Permission.MANAGE_LLMS)
    all_providers = fetch_existing_llm_providers(
        db_session, [LLMModelFlowType.CHAT, LLMModelFlowType.VISION]
    )
-    user_group_ids = set() if is_admin else fetch_user_group_ids(db_session, user)
+    user_group_ids = (
+        set() if can_manage_llms else fetch_user_group_ids(db_session, user)
+    )

    llm_provider_list: list[LLMProviderDescriptor] = []

    for llm_provider_model in all_providers:
        # Check access with persona context — respects persona restrictions
        if can_user_access_llm_provider(
-            llm_provider_model, user_group_ids, persona, is_admin=is_admin
+            llm_provider_model, user_group_ids, persona, is_admin=can_manage_llms
        ):
            llm_provider_list.append(
                LLMProviderDescriptor.from_model(llm_provider_model)
@@ -854,7 +795,7 @@ def list_llm_providers_for_persona(
    if persona_default_provider:
        provider = fetch_existing_llm_provider(persona_default_provider, db_session)
        if provider and can_user_access_llm_provider(
-            provider, user_group_ids, persona, is_admin=is_admin
+            provider, user_group_ids, persona, is_admin=can_manage_llms
        ):
            if persona_default_model:
                # Persona specifies both provider and model — use them directly
@@ -887,7 +828,7 @@ def list_llm_providers_for_persona(

 @admin_router.get("/provider-contextual-cost")
 def get_provider_contextual_cost(
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> list[LLMCost]:
    """
@@ -936,7 +877,7 @@ def get_provider_contextual_cost(
 @admin_router.post("/bedrock/available-models")
 def get_bedrock_available_models(
    request: BedrockModelsRequest,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> list[BedrockFinalModelResponse]:
    """Fetch available Bedrock models for a specific region and credentials.
@@ -1111,7 +1052,7 @@ def _get_ollama_available_model_names(api_base: str) -> set[str]:
 @admin_router.post("/ollama/available-models")
 def get_ollama_available_models(
    request: OllamaModelsRequest,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> list[OllamaFinalModelResponse]:
    """Fetch the list of available models from an Ollama server."""
@@ -1211,17 +1152,16 @@ def get_ollama_available_models(
    return sorted_results


-def _get_openrouter_models_response(api_base: str, api_key: str | None) -> dict:
+def _get_openrouter_models_response(api_base: str, api_key: str) -> dict:
    """Perform GET to OpenRouter /models and return parsed JSON."""
    cleaned_api_base = api_base.strip().rstrip("/")
    url = f"{cleaned_api_base}/models"
-    headers: dict[str, str] = {
+    headers = {
+        "Authorization": f"Bearer {api_key}",
        # Optional headers recommended by OpenRouter for attribution
        "HTTP-Referer": "https://onyx.app",
        "X-Title": "Onyx",
    }
-    if api_key:
-        headers["Authorization"] = f"Bearer {api_key}"
    try:
        response = httpx.get(url, headers=headers, timeout=10.0)
        response.raise_for_status()
@@ -1236,7 +1176,7 @@ def _get_openrouter_models_response(api_base: str, api_key: str | None) -> dict:
 @admin_router.post("/openrouter/available-models")
 def get_openrouter_available_models(
    request: OpenRouterModelsRequest,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> list[OpenRouterFinalModelResponse]:
    """Fetch available models from OpenRouter `/models` endpoint.
@@ -1244,12 +1184,8 @@ def get_openrouter_available_models(
    Parses id, name (display), context_length, and architecture.input_modalities.
    """

-    api_key = _resolve_api_key(
-        request.api_key, request.provider_name, request.api_base, db_session
-    )
-
    response_json = _get_openrouter_models_response(
-        api_base=request.api_base, api_key=api_key
+        api_base=request.api_base, api_key=request.api_key
    )

    data = response_json.get("data", [])
@@ -1321,7 +1257,7 @@ def get_openrouter_available_models(
 @admin_router.post("/lm-studio/available-models")
 def get_lm_studio_available_models(
    request: LMStudioModelsRequest,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> list[LMStudioFinalModelResponse]:
    """Fetch available models from an LM Studio server.
@@ -1342,18 +1278,13 @@ def get_lm_studio_available_models(

    # If provider_name is given and the api_key hasn't been changed by the user,
    # fall back to the stored API key from the database (the form value is masked).
-    # Only do so when the api_base matches what is stored.
    api_key = request.api_key
    if request.provider_name and not request.api_key_changed:
        existing_provider = fetch_existing_llm_provider(
            name=request.provider_name, db_session=db_session
        )
        if existing_provider and existing_provider.custom_config:
-            stored_base = (existing_provider.api_base or "").strip().rstrip("/")
-            if stored_base == cleaned_api_base:
-                api_key = existing_provider.custom_config.get(
-                    LM_STUDIO_API_KEY_CONFIG_KEY
-                )
+            api_key = existing_provider.custom_config.get(LM_STUDIO_API_KEY_CONFIG_KEY)

    url = f"{cleaned_api_base}/api/v1/models"
    headers: dict[str, str] = {}
@@ -1433,16 +1364,12 @@ def get_lm_studio_available_models(
 @admin_router.post("/litellm/available-models")
 def get_litellm_available_models(
    request: LitellmModelsRequest,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> list[LitellmFinalModelResponse]:
    """Fetch available models from Litellm proxy /v1/models endpoint."""
-    api_key = _resolve_api_key(
-        request.api_key, request.provider_name, request.api_base, db_session
-    )
-
    response_json = _get_litellm_models_response(
-        api_key=api_key, api_base=request.api_base
+        api_key=request.api_key, api_base=request.api_base
    )

    models = response_json.get("data", [])
@@ -1499,7 +1426,7 @@ def get_litellm_available_models(
    return sorted_results


-def _get_litellm_models_response(api_key: str | None, api_base: str) -> dict:
+def _get_litellm_models_response(api_key: str, api_base: str) -> dict:
    """Perform GET to Litellm proxy /api/v1/models and return parsed JSON."""
    cleaned_api_base = api_base.strip().rstrip("/")
    url = f"{cleaned_api_base}/v1/models"
@@ -1570,16 +1497,12 @@ def _get_openai_compatible_models_response(
 @admin_router.post("/bifrost/available-models")
 def get_bifrost_available_models(
    request: BifrostModelsRequest,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> list[BifrostFinalModelResponse]:
    """Fetch available models from Bifrost gateway /v1/models endpoint."""
-    api_key = _resolve_api_key(
-        request.api_key, request.provider_name, request.api_base, db_session
-    )
-
    response_json = _get_bifrost_models_response(
-        api_base=request.api_base, api_key=api_key
+        api_base=request.api_base, api_key=request.api_key
    )

    models = response_json.get("data", [])
@@ -1664,16 +1587,12 @@ def _get_bifrost_models_response(api_base: str, api_key: str | None = None) -> d
 @admin_router.post("/openai-compatible/available-models")
 def get_openai_compatible_server_available_models(
    request: OpenAICompatibleModelsRequest,
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_LLMS)),
    db_session: Session = Depends(get_session),
 ) -> list[OpenAICompatibleFinalModelResponse]:
    """Fetch available models from a generic OpenAI-compatible /v1/models endpoint."""
-    api_key = _resolve_api_key(
-        request.api_key, request.provider_name, request.api_base, db_session
-    )
-
    response_json = _get_openai_compatible_server_response(
-        api_base=request.api_base, api_key=api_key
+        api_base=request.api_base, api_key=request.api_key
    )

    models = response_json.get("data", [])
@@ -1733,7 +1652,7 @@ def get_openai_compatible_server_available_models(
                )
                for r in sorted_results
            ],
-            source_label="OpenAI-Compatible",
+            source_label="OpenAI Compatible",
        )

    return sorted_results
@@ -1752,6 +1671,6 @@ def _get_openai_compatible_server_response(

    return _get_openai_compatible_models_response(
        url=url,
-        source_name="OpenAI-Compatible",
+        source_name="OpenAI Compatible",
        api_key=api_key,
    )
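Review note on `backend/onyx/server/manage/llm/api.py`: the removed `_resolve_api_key` helper swapped a masked form value back to the stored key, and only when the request's `api_base` matched the stored one. The rule can be sketched standalone as below. This is a minimal illustration, not the project's code: `mask_string` mirrors only the return expression of `_mask_string` visible in the diff context, while `StoredProvider`, `normalize_base`, and the lookup-free signature of `resolve_api_key` are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredProvider:
    """Hypothetical stand-in for a stored provider row (not the real model)."""

    api_key: str
    api_base: Optional[str]


def mask_string(value: str) -> str:
    # Mirrors the return expression of _mask_string shown in the diff context.
    return value[:4] + "****" + value[-4:]


def normalize_base(url: Optional[str]) -> str:
    # Trailing-slash and whitespace differences should not count as a mismatch.
    return (url or "").strip().rstrip("/")


def resolve_api_key(
    api_key: Optional[str],
    api_base: Optional[str],
    stored: Optional[StoredProvider],
) -> Optional[str]:
    """Return the stored key only for a masked value sent to the same api_base."""
    if stored is None or not stored.api_key:
        return api_key
    if normalize_base(stored.api_base) != normalize_base(api_base):
        return api_key
    if api_key and api_key == mask_string(stored.api_key):
        return stored.api_key
    return api_key


stored = StoredProvider(api_key="sk-abcdef123456b1c2", api_base="https://llm.example/v1/")
same_base = resolve_api_key("sk-a****b1c2", "https://llm.example/v1", stored)
other_base = resolve_api_key("sk-a****b1c2", "https://other.example/v1", stored)
```

With this diff applied, the OpenRouter, LiteLLM, Bifrost, and OpenAI-compatible model-fetch endpoints pass `request.api_key` through unchanged, so whatever value the form submits is what the upstream request sees.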
--- a/backend/onyx/server/manage/llm/models.py
+++ b/backend/onyx/server/manage/llm/models.py
@@ -28,13 +28,6 @@ if TYPE_CHECKING:
 T = TypeVar("T", "LLMProviderDescriptor", "LLMProviderView", "VisionProviderResponse")


-class CustomProviderOption(BaseModel):
-    """A provider slug + human-friendly label for the custom-provider picker."""
-
-    value: str
-    label: str
-
-
 class TestLLMRequest(BaseModel):
    # provider level
    id: int | None = None
--- a/backend/onyx/server/manage/llm/utils.py
+++ b/backend/onyx/server/manage/llm/utils.py
@@ -183,9 +183,6 @@ def generate_ollama_display_name(model_name: str) -> str:
        "qwen2.5:7b" → "Qwen 2.5 7B"
        "mistral:latest" → "Mistral"
        "deepseek-r1:14b" → "DeepSeek R1 14B"
-        "gemma4:e4b" → "Gemma 4 E4B"
-        "deepseek-v3.1:671b-cloud" → "DeepSeek V3.1 671B Cloud"
-        "qwen3-vl:235b-instruct-cloud" → "Qwen 3-vl 235B Instruct Cloud"
    """
    # Split into base name and tag
    if ":" in model_name:
@@ -212,24 +209,13 @@ def generate_ollama_display_name(model_name: str) -> str:
        # Default: Title case with dashes converted to spaces
        display_name = base.replace("-", " ").title()

-    # Process tag (skip "latest")
+    # Process tag to extract size info (skip "latest")
    if tag and tag.lower() != "latest":
-        # Check for size prefix like "7b", "70b", optionally followed by modifiers
-        size_match = re.match(r"^(\d+(?:\.\d+)?[bBmM])(-.+)?$", tag)
+        # Extract size like "7b", "70b", "14b"
+        size_match = re.match(r"^(\d+(?:\.\d+)?[bBmM])", tag)
        if size_match:
            size = size_match.group(1).upper()
-            remainder = size_match.group(2)
-            if remainder:
-                # Format modifiers like "-cloud", "-instruct-cloud"
-                modifiers = " ".join(
-                    p.title() for p in remainder.strip("-").split("-") if p
-                )
-                display_name = f"{display_name} {size} {modifiers}"
-            else:
-                display_name = f"{display_name} {size}"
-        else:
-            # Non-size tags like "e4b", "q4_0", "fp16", "cloud"
-            display_name = f"{display_name} {tag.upper()}"
+            display_name = f"{display_name} {size}"

    return display_name

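Review note on `generate_ollama_display_name`: the simplified pattern keeps only a leading size token from the tag and ignores everything else. The sketch below isolates that step so its effect on a few tags is easy to check. The regular expression is copied from the added line; the names `SIZE_PREFIX` and `size_suffix` are hypothetical and exist only for this illustration.

```python
import re

# Pattern copied from the added line in this hunk: a leading size such as "7b" or "1.5B".
SIZE_PREFIX = re.compile(r"^(\d+(?:\.\d+)?[bBmM])")


def size_suffix(tag: str) -> str:
    """Return the upper-cased size prefix of an Ollama tag, or "" if there is none."""
    if not tag or tag.lower() == "latest":
        return ""
    match = SIZE_PREFIX.match(tag)
    return match.group(1).upper() if match else ""


examples = {
    tag: size_suffix(tag)
    for tag in ["7b", "1.5b", "671b-cloud", "235b-instruct-cloud", "e4b", "latest"]
}
```

Under this pattern a tag such as `671b-cloud` contributes only `671B`, and a non-size tag such as `e4b` contributes nothing, which matches the three docstring examples removed in the previous hunk.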
--- a/backend/onyx/server/manage/models.py
+++ b/backend/onyx/server/manage/models.py
@@ -135,6 +135,7 @@ class UserInfo(BaseModel):
    is_anonymous_user: bool | None = None
    password_configured: bool | None = None
    tenant_info: TenantInfo | None = None
+    effective_permissions: list[str] = Field(default_factory=list)

    @classmethod
    def from_model(
@@ -148,6 +149,7 @@ class UserInfo(BaseModel):
        tenant_info: TenantInfo | None = None,
        assistant_specific_configs: UserSpecificAssistantPreferences | None = None,
        memories: list[MemoryItem] | None = None,
+        effective_permissions: list[str] | None = None,
    ) -> "UserInfo":
        return cls(
            id=str(user.id),
@@ -187,6 +189,7 @@ class UserInfo(BaseModel):
            is_cloud_superuser=is_cloud_superuser,
            is_anonymous_user=is_anonymous_user,
            tenant_info=tenant_info,
+            effective_permissions=effective_permissions or [],
            personalization=UserPersonalization(
                name=user.personal_name or "",
                role=user.personal_role or "",
--- a/backend/onyx/server/manage/slack_bot.py
+++ b/backend/onyx/server/manage/slack_bot.py
@@ -114,7 +114,7 @@ def _form_channel_config(
 def create_slack_channel_config(
    slack_channel_config_creation_request: SlackChannelConfigCreationRequest,
    db_session: Session = Depends(get_session),
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
 ) -> SlackChannelConfig:
    channel_config = _form_channel_config(
        db_session=db_session,
@@ -155,7 +155,7 @@ def patch_slack_channel_config(
    slack_channel_config_id: int,
    slack_channel_config_creation_request: SlackChannelConfigCreationRequest,
    db_session: Session = Depends(get_session),
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
 ) -> SlackChannelConfig:
    channel_config = _form_channel_config(
        db_session=db_session,
@@ -216,7 +216,7 @@ def patch_slack_channel_config(
 def delete_slack_channel_config(
    slack_channel_config_id: int,
    db_session: Session = Depends(get_session),
-    user: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    user: User = Depends(require_permission(Permission.MANAGE_BOTS)),
 ) -> None:
    remove_slack_channel_config(
        db_session=db_session,
@@ -228,7 +228,7 @@ def delete_slack_channel_config(
 @router.get("/admin/slack-app/channel")
 def list_slack_channel_configs(
    db_session: Session = Depends(get_session),
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
 ) -> list[SlackChannelConfig]:
    slack_channel_config_models = fetch_slack_channel_configs(db_session=db_session)
    return [
@@ -241,7 +241,7 @@ def list_slack_channel_configs(
 def create_bot(
    slack_bot_creation_request: SlackBotCreationRequest,
    db_session: Session = Depends(get_session),
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
 ) -> SlackBot:
    tenant_id = get_current_tenant_id()

@@ -287,7 +287,7 @@ def patch_bot(
    slack_bot_id: int,
    slack_bot_creation_request: SlackBotCreationRequest,
    db_session: Session = Depends(get_session),
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
 ) -> SlackBot:
    validate_bot_token(slack_bot_creation_request.bot_token)
    validate_app_token(slack_bot_creation_request.app_token)
@@ -308,7 +308,7 @@ def patch_bot(
 def delete_bot(
    slack_bot_id: int,
    db_session: Session = Depends(get_session),
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
 ) -> None:
    remove_slack_bot(
        db_session=db_session,
@@ -320,7 +320,7 @@ def delete_bot(
 def get_bot_by_id(
    slack_bot_id: int,
    db_session: Session = Depends(get_session),
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
 ) -> SlackBot:
    slack_bot_model = fetch_slack_bot(
        db_session=db_session,
@@ -332,7 +332,7 @@ def get_bot_by_id(
 @router.get("/admin/slack-app/bots")
 def list_bots(
    db_session: Session = Depends(get_session),
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
 ) -> list[SlackBot]:
    slack_bot_models = fetch_slack_bots(db_session=db_session)
    return [
@@ -344,7 +344,7 @@ def list_bots(
 def list_bot_configs(
    bot_id: int,
    db_session: Session = Depends(get_session),
-    _: User = Depends(require_permission(Permission.FULL_ADMIN_PANEL_ACCESS)),
+    _: User = Depends(require_permission(Permission.MANAGE_BOTS)),
 ) -> list[SlackChannelConfig]:
    slack_bot_config_models = fetch_slack_channel_configs(
        db_session=db_session, slack_bot_id=bot_id
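Review note on `backend/onyx/server/manage/slack_bot.py`: every endpoint in this file now depends on `require_permission(Permission.MANAGE_BOTS)` instead of `Permission.FULL_ADMIN_PANEL_ACCESS`. The real `require_permission` is not part of this diff, so the following is only a rough sketch of the dependency-factory shape, written without FastAPI so it runs on its own. The `Permission` values, the `User` dataclass, and the use of `PermissionError` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Permission(Enum):
    # Hypothetical values; the real enum is defined elsewhere in the project.
    FULL_ADMIN_PANEL_ACCESS = "admin"
    MANAGE_BOTS = "manage:bots"


@dataclass
class User:
    """Hypothetical user carrying an already-resolved permission set."""

    email: str
    permissions: set[Permission] = field(default_factory=set)


def require_permission(required: Permission) -> Callable[[User], User]:
    """Build a checker that returns the user, or raises if the permission is missing."""

    def check(user: User) -> User:
        if required not in user.permissions:
            raise PermissionError(f"{user.email} lacks {required.value}")
        return user

    return check


bot_admin = User(email="bots@example.com", permissions={Permission.MANAGE_BOTS})
checked = require_permission(Permission.MANAGE_BOTS)(bot_admin)
```

In this simplified model a user holding only `MANAGE_BOTS` passes the new check and would have failed the old one. Whether full admin access implies `MANAGE_BOTS` in the real system is not visible in this diff.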
--- a/backend/onyx/server/manage/users.py
+++ b/backend/onyx/server/manage/users.py
@@ -857,6 +857,7 @@ def verify_user_logged_in(
            invitation=tenant_invitation,
        ),
        memories=memories,
+        effective_permissions=sorted(p.value for p in get_effective_permissions(user)),
    )

    return user_info
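Review note on `effective_permissions`: the new field is filled with `sorted(p.value for p in get_effective_permissions(user))`, which gives a stable, JSON-friendly list of strings. A small sketch of that serialization step follows, assuming a string-valued enum; the `Permission` values and the helper name `serialize_permissions` are hypothetical.

```python
from enum import Enum


class Permission(str, Enum):
    # Hypothetical values chosen for the example only.
    BASIC_ACCESS = "basic"
    MANAGE_BOTS = "manage:bots"
    MANAGE_LLMS = "manage:llms"


def serialize_permissions(perms: set[Permission]) -> list[str]:
    """Turn a permission set into a deterministic list of plain strings."""
    return sorted(p.value for p in perms)


payload = serialize_permissions({Permission.MANAGE_LLMS, Permission.BASIC_ACCESS})
```

Sorting keeps the response identical across requests even though set iteration order is not guaranteed, and the `effective_permissions or []` fallback in `UserInfo.from_model` means callers that omit the argument still produce an empty list rather than `None`.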
--- a/backend/onyx/server/manage/voice/user_api.py
+++ b/backend/onyx/server/manage/voice/user_api.py
@@ -1,14 +1,13 @@
-import json
 import secrets
 from collections.abc import AsyncIterator

 from fastapi import APIRouter
 from fastapi import Depends
 from fastapi import File
+from fastapi import Query
 from fastapi import UploadFile
 from fastapi.responses import StreamingResponse
 from pydantic import BaseModel
-from pydantic import Field
 from sqlalchemy.orm import Session

 from onyx.auth.permissions import require_permission
@@ -114,47 +113,28 @@ async def transcribe_audio(
        ) from exc


-def _extract_provider_error(exc: Exception) -> str:
-    """Extract a human-readable message from a provider exception.
-
-    Provider errors often embed JSON from upstream APIs (e.g. ElevenLabs).
-    This tries to parse a readable ``message`` field out of common JSON
-    error shapes; falls back to ``str(exc)`` if nothing better is found.
-    """
-    raw = str(exc)
-    try:
-        # Many providers embed JSON after a prefix like "ElevenLabs TTS failed: {...}"
-        json_start = raw.find("{")
-        if json_start == -1:
-            return raw
-        parsed = json.loads(raw[json_start:])
-        # Shape: {"detail": {"message": "..."}} (ElevenLabs)
-        detail = parsed.get("detail", parsed)
-        if isinstance(detail, dict):
-            return detail.get("message") or detail.get("error") or raw
-        if isinstance(detail, str):
-            return detail
-    except (json.JSONDecodeError, AttributeError, TypeError):
-        pass
-    return raw
-
-
-class SynthesizeRequest(BaseModel):
-    text: str = Field(..., min_length=1)
-    voice: str | None = None
-    speed: float | None = Field(default=None, ge=0.5, le=2.0)
-
-
 @router.post("/synthesize")
 async def synthesize_speech(
-    body: SynthesizeRequest,
+    text: str | None = Query(
+        default=None, description="Text to synthesize", max_length=4096
+    ),
+    voice: str | None = Query(default=None, description="Voice ID to use"),
+    speed: float | None = Query(
+        default=None, description="Playback speed (0.5-2.0)", ge=0.5, le=2.0
+    ),
    user: User = Depends(require_permission(Permission.BASIC_ACCESS)),
 ) -> StreamingResponse:
-    """Synthesize text to speech using the default TTS provider."""
-    text = body.text
-    voice = body.voice
-    speed = body.speed
-    logger.info(f"TTS request: text length={len(text)}, voice={voice}, speed={speed}")
+    """
+    Synthesize text to speech using the default TTS provider.
+
+    Accepts parameters via query string for streaming compatibility.
+    """
+    logger.info(
+        f"TTS request: text length={len(text) if text else 0}, voice={voice}, speed={speed}"
+    )
+
+    if not text:
+        raise OnyxError(OnyxErrorCode.VALIDATION_ERROR, "Text is required")

    # Use short-lived session to fetch provider config, then release connection
    # before starting the long-running streaming response
@@ -197,36 +177,31 @@ async def synthesize_speech(
            logger.error(f"Failed to get voice provider: {exc}")
            raise OnyxError(OnyxErrorCode.INTERNAL_ERROR, str(exc)) from exc

-    # Pull the first chunk before returning the StreamingResponse. If the
-    # provider rejects the request (e.g. text too long), the error surfaces
-    # as a proper HTTP error instead of a broken audio stream.
-    stream_iter = provider.synthesize_stream(
-        text=text, voice=final_voice, speed=final_speed
-    )
-    try:
-        first_chunk = await stream_iter.__anext__()
-    except StopAsyncIteration:
-        raise OnyxError(OnyxErrorCode.INTERNAL_ERROR, "TTS provider returned no audio")
-    except Exception as exc:
-        raise OnyxError(
-            OnyxErrorCode.BAD_GATEWAY, _extract_provider_error(exc)
-        ) from exc
-
+    # Session is now closed - streaming response won't hold DB connection
    async def audio_stream() -> AsyncIterator[bytes]:
-        yield first_chunk
-        chunk_count = 1
-        async for chunk in stream_iter:
-            chunk_count += 1
-            yield chunk
-        logger.info(f"TTS streaming complete: {chunk_count} chunks sent")
+        try:
+            chunk_count = 0
+            async for chunk in provider.synthesize_stream(
+                text=text, voice=final_voice, speed=final_speed
+            ):
+                chunk_count += 1
+                yield chunk
+            logger.info(f"TTS streaming complete: {chunk_count} chunks sent")
+        except NotImplementedError as exc:
+            logger.error(f"TTS not implemented: {exc}")
+            raise
+        except Exception as exc:
+            logger.error(f"Synthesis failed: {exc}")
+            raise

    return StreamingResponse(
        audio_stream(),
        media_type="audio/mpeg",
        headers={
            "Content-Disposition": "inline; filename=speech.mp3",
+            # Allow streaming by not setting content-length
            "Cache-Control": "no-cache",
-            "X-Accel-Buffering": "no",
+            "X-Accel-Buffering": "no",  # Disable nginx buffering
        },
    )

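Review note on `synthesize_speech`: the removed code pulled the first chunk before building the `StreamingResponse`, while the new code only touches the provider once the response body is iterated. The difference in when a provider error surfaces can be sketched with plain `asyncio`, with no FastAPI involved. `failing_provider`, `lazy_stream`, `eager_first_chunk`, and `demo` are hypothetical names used only for this illustration.

```python
import asyncio
from collections.abc import AsyncIterator


async def failing_provider() -> AsyncIterator[bytes]:
    """Hypothetical provider stream that rejects the request before any audio."""
    raise ValueError("text too long")
    yield b""  # unreachable; its presence makes this an async generator


async def lazy_stream() -> AsyncIterator[bytes]:
    # Shape of the new code: the provider is only touched once iteration begins.
    async for chunk in failing_provider():
        yield chunk


async def eager_first_chunk() -> AsyncIterator[bytes]:
    # Shape of the removed code: pull one chunk up front so errors surface early.
    stream = failing_provider()
    first = await stream.__anext__()  # raises here, before any response exists

    async def rest() -> AsyncIterator[bytes]:
        yield first
        async for chunk in stream:
            yield chunk

    return rest()


async def demo() -> list[str]:
    events: list[str] = []
    body = lazy_stream()  # creating the generator does not run the provider
    events.append("lazy: response object created")
    try:
        async for _ in body:
            pass
    except ValueError as exc:
        events.append(f"lazy: failed mid-stream ({exc})")
    try:
        await eager_first_chunk()
    except ValueError as exc:
        events.append(f"eager: failed before response ({exc})")
    return events


events = asyncio.run(demo())
```

In the lazy shape the failure happens while the body is being consumed, which is why the removed comment described the eager variant as surfacing "a proper HTTP error instead of a broken audio stream".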
--- a/backend/onyx/server/metrics/pruning_metrics.py
+++ b/backend/onyx/server/metrics/pruning_metrics.py
@@ -1,72 +0,0 @@
-"""Pruning-specific Prometheus metrics.
-
-Tracks three pruning pipeline phases for connector_pruning_generator_task:
-  1. Document ID enumeration duration (extract_ids_from_runnable_connector)
-  2. Diff + dispatch duration (DB lookup, set diff, generate_tasks)
-  3. Rate limit errors during enumeration
-
-All metrics are labeled by connector_type to identify which connector sources
-are the most expensive to prune. cc_pair_id is intentionally excluded to avoid
-unbounded cardinality.
-
-Usage:
-    from onyx.server.metrics.pruning_metrics import (
-        observe_pruning_enumeration_duration,
-        observe_pruning_diff_duration,
-        inc_pruning_rate_limit_error,
-    )
-"""
-
-from prometheus_client import Counter
-from prometheus_client import Histogram
-
-from onyx.utils.logger import setup_logger
-
-logger = setup_logger()
-
-PRUNING_ENUMERATION_DURATION = Histogram(
-    "onyx_pruning_enumeration_duration_seconds",
-    "Duration of document ID enumeration from the source connector during pruning",
-    ["connector_type"],
-    buckets=[1, 5, 15, 30, 60, 120, 300, 600, 1800, 3600],
-)
-
-PRUNING_DIFF_DURATION = Histogram(
-    "onyx_pruning_diff_duration_seconds",
-    "Duration of diff computation and subtask dispatch during pruning",
-    ["connector_type"],
-    buckets=[1, 5, 15, 30, 60, 120, 300, 600, 1800, 3600],
-)
-
-PRUNING_RATE_LIMIT_ERRORS = Counter(
-    "onyx_pruning_rate_limit_errors_total",
-    "Total rate limit errors encountered during pruning document ID enumeration",
-    ["connector_type"],
-)
-
-
-def observe_pruning_enumeration_duration(
-    duration_seconds: float, connector_type: str
-) -> None:
-    try:
-        PRUNING_ENUMERATION_DURATION.labels(connector_type=connector_type).observe(
-            duration_seconds
-        )
-    except Exception:
-        logger.debug("Failed to record pruning enumeration duration", exc_info=True)
-
-
-def observe_pruning_diff_duration(duration_seconds: float, connector_type: str) -> None:
-    try:
-        PRUNING_DIFF_DURATION.labels(connector_type=connector_type).observe(
-            duration_seconds
-        )
-    except Exception:
-        logger.debug("Failed to record pruning diff duration", exc_info=True)
-
-
-def inc_pruning_rate_limit_error(connector_type: str) -> None:
-    try:
-        PRUNING_RATE_LIMIT_ERRORS.labels(connector_type=connector_type).inc()
-    except Exception:
-        logger.debug("Failed to record pruning rate limit error", exc_info=True)
--- a/backend/onyx/server/pat/api.py
+++ b/backend/onyx/server/pat/api.py
@@ -2,7 +2,6 @@

 from fastapi import APIRouter
 from fastapi import Depends
-from fastapi import HTTPException
 from sqlalchemy.orm import Session

 from onyx.auth.permissions import require_permission
@@ -12,6 +11,8 @@ from onyx.db.models import User
 from onyx.db.pat import create_pat
 from onyx.db.pat import list_user_pats
 from onyx.db.pat import revoke_pat
+from onyx.error_handling.error_codes import OnyxErrorCode
+from onyx.error_handling.exceptions import OnyxError
 from onyx.server.pat.models import CreatedTokenResponse
 from onyx.server.pat.models import CreateTokenRequest
 from onyx.server.pat.models import TokenResponse
@@ -46,7 +47,7 @@ def list_tokens(
 @router.post("")
 def create_token(
    request: CreateTokenRequest,
-    user: User = Depends(require_permission(Permission.BASIC_ACCESS)),
+    user: User = Depends(require_permission(Permission.CREATE_USER_API_KEYS)),
    db_session: Session = Depends(get_session),
 ) -> CreatedTokenResponse:
    """Create new personal access token for current user."""
@@ -58,7 +59,7 @@ def create_token(
            expiration_days=request.expiration_days,
        )
    except ValueError as e:
-        raise HTTPException(status_code=400, detail=str(e))
+        raise OnyxError(OnyxErrorCode.BAD_REQUEST, str(e))

    logger.info(f"User {user.email} created PAT '{request.name}'")

@@ -82,9 +83,7 @@ def delete_token(
    """Delete (revoke) personal access token. Only owner can revoke their own tokens."""
    success = revoke_pat(db_session, token_id, user.id)
    if not success:
-        raise HTTPException(
-            status_code=404, detail="Token not found or not owned by user"
-        )
+        raise OnyxError(OnyxErrorCode.NOT_FOUND, "Token not found or not owned by user")

    logger.info(f"User {user.email} revoked token {token_id}")
    return {"message": "Token deleted successfully"}
--- a/backend/onyx/server/settings/models.py
+++ b/backend/onyx/server/settings/models.py
@@ -65,7 +65,6 @@ class Settings(BaseModel):
    anonymous_user_enabled: bool | None = None
    invite_only_enabled: bool = False
    deep_research_enabled: bool | None = None
-    multi_model_chat_enabled: bool | None = None
    search_ui_enabled: bool | None = None

    # Whether EE features are unlocked for use.
@@ -90,8 +89,7 @@ class Settings(BaseModel):
        default=DEFAULT_USER_FILE_MAX_UPLOAD_SIZE_MB, ge=0
    )
    file_token_count_threshold_k: int | None = Field(
-        default=None,
-        ge=0,  # thousands of tokens; None = context-aware default
+        default=None, ge=0  # thousands of tokens; None = context-aware default
    )

    # Connector settings
--- a/backend/onyx/tools/models.py
+++ b/backend/onyx/tools/models.py
@@ -208,6 +208,12 @@ class PythonToolOverrideKwargs(BaseModel):
    chat_files: list[ChatFile] = []


+class ImageGenerationToolOverrideKwargs(BaseModel):
+    """Override kwargs for image generation tool calls."""
+
+    recent_generated_image_file_ids: list[str] = []
+
+
 class SearchToolRunContext(BaseModel):
    emitter: Emitter

--- a/backend/onyx/tools/tool_constructor.py
+++ b/backend/onyx/tools/tool_constructor.py
@@ -10,7 +10,6 @@ from onyx.configs.app_configs import DISABLE_VECTOR_DB
 from onyx.configs.model_configs import GEN_AI_TEMPERATURE
 from onyx.context.search.models import BaseFilters
 from onyx.context.search.models import PersonaSearchInfo
-from onyx.db.engine.sql_engine import get_session_with_current_tenant_if_none
 from onyx.db.enums import MCPAuthenticationPerformer
 from onyx.db.enums import MCPAuthenticationType
 from onyx.db.mcp import get_all_mcp_tools_for_server
@@ -114,10 +113,10 @@ def _get_image_generation_config(llm: LLM, db_session: Session) -> LLMConfig:

 def construct_tools(
    persona: Persona,
+    db_session: Session,
    emitter: Emitter,
    user: User,
    llm: LLM,
-    db_session: Session | None = None,
    search_tool_config: SearchToolConfig | None = None,
    custom_tool_config: CustomToolConfig | None = None,
    file_reader_tool_config: FileReaderToolConfig | None = None,
@@ -132,33 +131,6 @@ def construct_tools(
    ``attached_documents``, and ``hierarchy_nodes`` already eager-loaded
    (e.g. via ``eager_load_persona=True`` or ``eager_load_for_tools=True``)
    to avoid lazy SQL queries after the session may have been flushed."""
-    with get_session_with_current_tenant_if_none(db_session) as db_session:
-        return _construct_tools_impl(
-            persona=persona,
-            db_session=db_session,
-            emitter=emitter,
-            user=user,
-            llm=llm,
-            search_tool_config=search_tool_config,
-            custom_tool_config=custom_tool_config,
-            file_reader_tool_config=file_reader_tool_config,
-            allowed_tool_ids=allowed_tool_ids,
-            search_usage_forcing_setting=search_usage_forcing_setting,
-        )
-
-
-def _construct_tools_impl(
-    persona: Persona,
-    db_session: Session,
-    emitter: Emitter,
-    user: User,
-    llm: LLM,
-    search_tool_config: SearchToolConfig | None = None,
-    custom_tool_config: CustomToolConfig | None = None,
-    file_reader_tool_config: FileReaderToolConfig | None = None,
-    allowed_tool_ids: list[int] | None = None,
-    search_usage_forcing_setting: SearchToolUsage = SearchToolUsage.AUTO,
-) -> dict[int, list[Tool]]:
    tool_dict: dict[int, list[Tool]] = {}

    # Log which tools are attached to the persona for debugging
--- a/backend/onyx/tools/tool_implementations/images/image_generation_tool.py
+++ b/backend/onyx/tools/tool_implementations/images/image_generation_tool.py
@@ -26,6 +26,7 @@ from onyx.server.query_and_chat.streaming_models import ImageGenerationToolHeart
 from onyx.server.query_and_chat.streaming_models import ImageGenerationToolStart
 from onyx.server.query_and_chat.streaming_models import Packet
 from onyx.tools.interface import Tool
+from onyx.tools.models import ImageGenerationToolOverrideKwargs
 from onyx.tools.models import ToolCallException
 from onyx.tools.models import ToolExecutionException
 from onyx.tools.models import ToolResponse
@@ -47,7 +48,7 @@ PROMPT_FIELD = "prompt"
 REFERENCE_IMAGE_FILE_IDS_FIELD = "reference_image_file_ids"


-class ImageGenerationTool(Tool[None]):
+class ImageGenerationTool(Tool[ImageGenerationToolOverrideKwargs | None]):
    NAME = "generate_image"
    DESCRIPTION = "Generate an image based on a prompt. Do not use unless the user specifically requests an image."
    DISPLAY_NAME = "Image Generation"
@@ -141,11 +142,8 @@ class ImageGenerationTool(Tool[None]):
                        REFERENCE_IMAGE_FILE_IDS_FIELD: {
                            "type": "array",
                            "description": (
-                                "Optional file_ids of existing images to edit or use as reference;"
-                                " the first is the primary edit source."
-                                " Get file_ids from `[attached image — file_id: <id>]` tags on"
-                                " user-attached images or from prior generate_image tool responses."
-                                " Omit for a fresh, unrelated generation."
+                                "Optional image file IDs to use as reference context for edits/variations. "
+                                "Use the file_id values returned by previous generate_image calls."
                            ),
                            "items": {
                                "type": "string",
@@ -256,31 +254,41 @@ class ImageGenerationTool(Tool[None]):
    def _resolve_reference_image_file_ids(
        self,
        llm_kwargs: dict[str, Any],
+        override_kwargs: ImageGenerationToolOverrideKwargs | None,
    ) -> list[str]:
        raw_reference_ids = llm_kwargs.get(REFERENCE_IMAGE_FILE_IDS_FIELD)
-        if raw_reference_ids is None:
-            # No references requested — plain generation.
-            return []
-
-        if not isinstance(raw_reference_ids, list) or not all(
-            isinstance(file_id, str) for file_id in raw_reference_ids
+        if raw_reference_ids is not None:
+            if not isinstance(raw_reference_ids, list) or not all(
+                isinstance(file_id, str) for file_id in raw_reference_ids
+            ):
+                raise ToolCallException(
+                    message=(
+                        f"Invalid {REFERENCE_IMAGE_FILE_IDS_FIELD}: expected array of strings, got {type(raw_reference_ids)}"
+                    ),
+                    llm_facing_message=(
+                        f"The '{REFERENCE_IMAGE_FILE_IDS_FIELD}' field must be an array of file_id strings."
+                    ),
+                )
+            reference_image_file_ids = [
+                file_id.strip() for file_id in raw_reference_ids if file_id.strip()
+            ]
+        elif (
+            override_kwargs
+            and override_kwargs.recent_generated_image_file_ids
+            and self.img_provider.supports_reference_images
        ):
-            raise ToolCallException(
-                message=(
-                    f"Invalid {REFERENCE_IMAGE_FILE_IDS_FIELD}: expected array of strings, got {type(raw_reference_ids)}"
-                ),
-                llm_facing_message=(
-                    f"The '{REFERENCE_IMAGE_FILE_IDS_FIELD}' field must be an array of file_id strings."
-                ),
-            )
+            # If no explicit reference was provided, default to the most recently generated image.
+            reference_image_file_ids = [
+                override_kwargs.recent_generated_image_file_ids[-1]
+            ]
+        else:
+            reference_image_file_ids = []

-        # Deduplicate while preserving order (first occurrence wins, so the
-        # LLM's intended "primary edit source" stays at index 0).
+        # Deduplicate while preserving order.
        deduped_reference_image_ids: list[str] = []
        seen_ids: set[str] = set()
-        for file_id in raw_reference_ids:
-            file_id = file_id.strip()
-            if not file_id or file_id in seen_ids:
+        for file_id in reference_image_file_ids:
+            if file_id in seen_ids:
                continue
            seen_ids.add(file_id)
            deduped_reference_image_ids.append(file_id)
@@ -294,14 +302,14 @@ class ImageGenerationTool(Tool[None]):
                    f"Reference images requested but provider '{self.provider}' does not support image-editing context."
                ),
                llm_facing_message=(
-                    "This image provider does not support editing from existing images. "
+                    "This image provider does not support editing from previous image context. "
                    "Try text-only generation, or switch to a provider/model that supports image edits."
                ),
            )

        max_reference_images = self.img_provider.max_reference_images
        if max_reference_images > 0:
-            return deduped_reference_image_ids[:max_reference_images]
+            return deduped_reference_image_ids[-max_reference_images:]
        return deduped_reference_image_ids

    def _load_reference_images(
@@ -350,7 +358,7 @@ class ImageGenerationTool(Tool[None]):
    def run(
        self,
        placement: Placement,
-        override_kwargs: None = None,  # noqa: ARG002
+        override_kwargs: ImageGenerationToolOverrideKwargs | None = None,
        **llm_kwargs: Any,
    ) -> ToolResponse:
        if PROMPT_FIELD not in llm_kwargs:
@@ -365,6 +373,7 @@ class ImageGenerationTool(Tool[None]):
        shape = ImageShape(llm_kwargs.get("shape", ImageShape.SQUARE.value))
        reference_image_file_ids = self._resolve_reference_image_file_ids(
            llm_kwargs=llm_kwargs,
+            override_kwargs=override_kwargs,
        )
        reference_images = self._load_reference_images(reference_image_file_ids)
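
Note on the `image_generation_tool.py` change above: the cap on reference images now keeps the tail of the deduplicated list (`[-max_reference_images:]`) instead of the head, so the most recently listed IDs survive truncation. The following is a minimal standalone sketch of that behavior, not the tool's actual method; `cap_reference_ids` and `dedupe_preserving_order` are hypothetical names introduced only for illustration, and a non-positive limit is assumed to mean "no cap", as in the diff.

```python
def dedupe_preserving_order(file_ids: list[str]) -> list[str]:
    """Drop repeated IDs, keeping the first occurrence of each."""
    seen: set[str] = set()
    deduped: list[str] = []
    for file_id in file_ids:
        if file_id in seen:
            continue
        seen.add(file_id)
        deduped.append(file_id)
    return deduped


def cap_reference_ids(file_ids: list[str], max_reference_images: int) -> list[str]:
    """Hypothetical helper: dedupe, then keep only the last N IDs.

    A limit of 0 (or less) is treated as "no cap", mirroring the
    `if max_reference_images > 0` guard in the diff.
    """
    deduped = dedupe_preserving_order(file_ids)
    if max_reference_images > 0:
        return deduped[-max_reference_images:]
    return deduped


capped = cap_reference_ids(["a", "b", "a", "c"], 2)
uncapped = cap_reference_ids(["a", "b", "a", "c"], 0)
```

With the previous head slice (`[:max_reference_images]`) the same input would have kept `"a"` and `"b"`; the tail slice keeps `"b"` and `"c"`.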

--- a/backend/onyx/tools/tool_runner.py
+++ b/backend/onyx/tools/tool_runner.py
@@ -1,3 +1,4 @@
+import json
 import traceback
 from collections import defaultdict
 from typing import Any
@@ -13,6 +14,7 @@ from onyx.server.query_and_chat.streaming_models import SectionEnd
 from onyx.tools.interface import Tool
 from onyx.tools.models import ChatFile
 from onyx.tools.models import ChatMinimalTextMessage
+from onyx.tools.models import ImageGenerationToolOverrideKwargs
 from onyx.tools.models import OpenURLToolOverrideKwargs
 from onyx.tools.models import ParallelToolCallResponse
 from onyx.tools.models import PythonToolOverrideKwargs
@@ -22,6 +24,9 @@ from onyx.tools.models import ToolCallKickoff
 from onyx.tools.models import ToolExecutionException
 from onyx.tools.models import ToolResponse
 from onyx.tools.models import WebSearchToolOverrideKwargs
+from onyx.tools.tool_implementations.images.image_generation_tool import (
+    ImageGenerationTool,
+)
 from onyx.tools.tool_implementations.memory.memory_tool import MemoryTool
 from onyx.tools.tool_implementations.memory.memory_tool import MemoryToolOverrideKwargs
 from onyx.tools.tool_implementations.open_url.open_url_tool import OpenURLTool
@@ -105,6 +110,63 @@ def _merge_tool_calls(tool_calls: list[ToolCallKickoff]) -> list[ToolCallKickoff
    return merged_calls


+def _extract_image_file_ids_from_tool_response_message(
+    message: str,
+) -> list[str]:
+    try:
+        parsed_message = json.loads(message)
+    except json.JSONDecodeError:
+        return []
+
+    parsed_items: list[Any] = (
+        parsed_message if isinstance(parsed_message, list) else [parsed_message]
+    )
+    file_ids: list[str] = []
+    for item in parsed_items:
+        if not isinstance(item, dict):
+            continue
+
+        file_id = item.get("file_id")
+        if isinstance(file_id, str):
+            file_ids.append(file_id)
+
+    return file_ids
+
+
+def _extract_recent_generated_image_file_ids(
+    message_history: list[ChatMessageSimple],
+) -> list[str]:
+    tool_name_by_tool_call_id: dict[str, str] = {}
+    recent_image_file_ids: list[str] = []
+    seen_file_ids: set[str] = set()
+
+    for message in message_history:
+        if message.message_type == MessageType.ASSISTANT and message.tool_calls:
+            for tool_call in message.tool_calls:
+                tool_name_by_tool_call_id[tool_call.tool_call_id] = tool_call.tool_name
+            continue
+
+        if (
+            message.message_type != MessageType.TOOL_CALL_RESPONSE
+            or not message.tool_call_id
+        ):
+            continue
+
+        tool_name = tool_name_by_tool_call_id.get(message.tool_call_id)
+        if tool_name != ImageGenerationTool.NAME:
+            continue
+
+        for file_id in _extract_image_file_ids_from_tool_response_message(
+            message.message
+        ):
+            if file_id in seen_file_ids:
+                continue
+            seen_file_ids.add(file_id)
+            recent_image_file_ids.append(file_id)
+
+    return recent_image_file_ids
+
+
 def _safe_run_single_tool(
    tool: Tool,
    tool_call: ToolCallKickoff,
@@ -324,6 +386,9 @@ def run_tool_calls(
    url_to_citation: dict[str, int] = {
        url: citation_num for citation_num, url in citation_mapping.items()
    }
+    recent_generated_image_file_ids = _extract_recent_generated_image_file_ids(
+        message_history
+    )

    # Prepare all tool calls with their override_kwargs
    # Each tool gets a unique starting citation number to avoid conflicts when running in parallel
@@ -340,6 +405,7 @@ def run_tool_calls(
            | WebSearchToolOverrideKwargs
            | OpenURLToolOverrideKwargs
            | PythonToolOverrideKwargs
+            | ImageGenerationToolOverrideKwargs
            | MemoryToolOverrideKwargs
            | None
        ) = None
@@ -388,6 +454,10 @@ def run_tool_calls(
            override_kwargs = PythonToolOverrideKwargs(
                chat_files=chat_files or [],
            )
+        elif isinstance(tool, ImageGenerationTool):
+            override_kwargs = ImageGenerationToolOverrideKwargs(
+                recent_generated_image_file_ids=recent_generated_image_file_ids
+            )
        elif isinstance(tool, MemoryTool):
            override_kwargs = MemoryToolOverrideKwargs(
                user_name=(
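
Note on the `tool_runner.py` change above: `_extract_image_file_ids_from_tool_response_message` accepts either a single JSON object or a JSON array and silently ignores anything that is not a dict with a string `file_id`. The sketch below restates that parsing logic in standalone form so the edge cases can be checked in isolation; `extract_file_ids` is a hypothetical name and the sample payloads are made up for illustration, not taken from real tool responses.

```python
import json
from typing import Any


def extract_file_ids(message: str) -> list[str]:
    """Hypothetical standalone version of the helper in the diff."""
    try:
        parsed: Any = json.loads(message)
    except json.JSONDecodeError:
        # Non-JSON tool responses contribute no file IDs.
        return []

    # Normalize a single object to a one-element list.
    items = parsed if isinstance(parsed, list) else [parsed]
    file_ids: list[str] = []
    for item in items:
        if not isinstance(item, dict):
            continue
        file_id = item.get("file_id")
        if isinstance(file_id, str):
            file_ids.append(file_id)
    return file_ids


from_array = extract_file_ids('[{"file_id": "img-1"}, {"file_id": 7}, "stray"]')
from_object = extract_file_ids('{"file_id": "img-2"}')
from_garbage = extract_file_ids("not json")
```

Because malformed or unexpected payloads yield an empty list rather than an exception, a single bad tool response in the history should not prevent later image generation calls from running; they simply fall back to having no default reference image.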
--- a/backend/pyproject.toml
+++ b/backend/pyproject.toml
@@ -0,0 +1,10 @@
+[project]
+name = "onyx-backend"
+version = "0.0.0"
+requires-python = ">=3.11"
+dependencies = [
+    "onyx[backend,dev,ee]",
+]
+
+[tool.uv.sources]
+onyx = { workspace = true }
--- a/backend/requirements/README.md
+++ b/backend/requirements/README.md
@@ -46,11 +46,11 @@ curl -LsSf https://astral.sh/uv/install.sh | sh

 1. Edit `pyproject.toml`
 2. Add/update/remove dependencies in the appropriate section:
+   - `[dependency-groups]` for dev tools
   - `[project.dependencies]` for **shared** dependencies (used by both backend and model_server)
-   - `[dependency-groups.backend]` for backend-only dependencies
-   - `[dependency-groups.dev]` for dev tools
-   - `[dependency-groups.ee]` for EE features
-   - `[dependency-groups.model_server]` for model_server-only dependencies (ML packages)
+   - `[project.optional-dependencies.backend]` for backend-only dependencies
+   - `[project.optional-dependencies.model_server]` for model_server-only dependencies (ML packages)
+   - `[project.optional-dependencies.ee]` for EE features
 3. Commit your changes - pre-commit hooks will automatically regenerate the lock file and requirements

 ### 3. Generating Lock File and Requirements
@@ -64,10 +64,10 @@ To manually regenerate:

 ```bash
 uv lock
-uv export --no-emit-project --no-default-groups --no-hashes --group backend -o backend/requirements/default.txt
+uv export --no-emit-project --no-default-groups --no-hashes --extra backend -o backend/requirements/default.txt
 uv export --no-emit-project --no-default-groups --no-hashes --group dev -o backend/requirements/dev.txt
-uv export --no-emit-project --no-default-groups --no-hashes --group ee -o backend/requirements/ee.txt
-uv export --no-emit-project --no-default-groups --no-hashes --group model_server -o backend/requirements/model_server.txt
+uv export --no-emit-project --no-default-groups --no-hashes --extra ee -o backend/requirements/ee.txt
+uv export --no-emit-project --no-default-groups --no-hashes --extra model_server -o backend/requirements/model_server.txt
 ```

 ### 4. Installing Dependencies
@@ -76,14 +76,30 @@ If enabled, all packages are installed automatically by the `uv-sync` pre-commit
 branches or pulling new changes.

 ```bash
-# For development (most common) — installs shared + backend + dev + ee
-uv sync
+# For everything (most common)
+uv sync --all-extras

-# For backend production only (shared + backend dependencies)
-uv sync --no-default-groups --group backend
+# For backend production (shared + backend dependencies)
+uv sync --extra backend
+
+# For backend development (shared + backend + dev tools)
+uv sync --extra backend --extra dev
+
+# For backend with EE (shared + backend + ee)
+uv sync --extra backend --extra ee

 # For model server (shared + model_server, NO backend deps!)
-uv sync --no-default-groups --group model_server
+uv sync --extra model_server
+```
+
+`uv` aggressively [ignores active virtual environments](https://docs.astral.sh/uv/concepts/projects/config/#project-environment-path) and prefers the root virtual environment.
+When working in workspace packages, be sure to pass `--active` when syncing the virtual environment:
+
+```bash
+cd backend/
+source .venv/bin/activate
+uv sync --active
+uv run --active ...
 ```

 ### 5. Upgrading Dependencies
--- a/backend/requirements/default.txt
+++ b/backend/requirements/default.txt
@@ -1,5 +1,5 @@
 # This file was autogenerated by uv via the following command:
-#    uv export --no-emit-project --no-default-groups --no-hashes --group backend -o backend/requirements/default.txt
+#    uv export --no-emit-project --no-default-groups --no-hashes --extra backend -o backend/requirements/default.txt
 agent-client-protocol==0.7.1
    # via onyx
 aioboto3==15.1.0
@@ -19,6 +19,7 @@ aiohttp==3.13.4
    #   aiobotocore
    #   discord-py
    #   litellm
+    #   onyx
    #   voyageai
 aioitertools==0.13.0
    # via aiobotocore
@@ -27,6 +28,7 @@ aiolimiter==1.2.1
 aiosignal==1.4.0
    # via aiohttp
 alembic==1.10.4
+    # via onyx
 amqp==5.3.1
    # via kombu
 annotated-doc==0.0.4
@@ -49,10 +51,13 @@ argon2-cffi==23.1.0
 argon2-cffi-bindings==25.1.0
    # via argon2-cffi
 asana==5.0.8
+    # via onyx
 async-timeout==5.0.1 ; python_full_version < '3.11.3'
    # via redis
 asyncpg==0.30.0
+    # via onyx
 atlassian-python-api==3.41.16
+    # via onyx
 attrs==25.4.0
    # via
    #   aiohttp
@@ -63,6 +68,7 @@ attrs==25.4.0
 authlib==1.6.9
    # via fastmcp
 azure-cognitiveservices-speech==1.38.0
+    # via onyx
 babel==2.17.0
    # via courlan
 backoff==2.2.1
@@ -80,6 +86,7 @@ beautifulsoup4==4.12.3
    #   atlassian-python-api
    #   markdownify
    #   markitdown
+    #   onyx
    #   unstructured
 billiard==4.2.3
    # via celery
@@ -87,7 +94,9 @@ boto3==1.39.11
    # via
    #   aiobotocore
    #   cohere
+    #   onyx
 boto3-stubs==1.39.11
+    # via onyx
 botocore==1.39.11
    # via
    #   aiobotocore
@@ -96,6 +105,7 @@ botocore==1.39.11
 botocore-stubs==1.40.74
    # via boto3-stubs
 braintrust==0.3.9
+    # via onyx
 brotli==1.2.0
    # via onyx
 bytecode==0.17.0
@@ -105,6 +115,7 @@ cachetools==6.2.2
 caio==0.9.25
    # via aiofile
 celery==5.5.1
+    # via onyx
 certifi==2025.11.12
    # via
    #   asana
@@ -123,6 +134,7 @@ cffi==2.0.0
    #   pynacl
    #   zstandard
 chardet==5.2.0
+    # via onyx
 charset-normalizer==3.4.4
    # via
    #   htmldate
@@ -134,6 +146,7 @@ charset-normalizer==3.4.4
 chevron==0.14.0
    # via braintrust
 chonkie==1.0.10
+    # via onyx
 claude-agent-sdk==0.1.19
    # via onyx
 click==8.3.1
@@ -188,12 +201,15 @@ cryptography==46.0.6
 cyclopts==4.2.4
    # via fastmcp
 dask==2026.1.1
-    # via distributed
+    # via
+    #   distributed
+    #   onyx
 dataclasses-json==0.6.7
    # via unstructured
 dateparser==1.2.2
    # via htmldate
 ddtrace==3.10.0
+    # via onyx
 decorator==5.2.1
    # via retry
 defusedxml==0.7.1
@@ -207,6 +223,7 @@ deprecated==1.3.1
 discord-py==2.4.0
    # via onyx
 distributed==2026.1.1
+    # via onyx
 distro==1.9.0
    # via
    #   openai
@@ -218,6 +235,7 @@ docstring-parser==0.17.0
 docutils==0.22.3
    # via rich-rst
 dropbox==12.0.2
+    # via onyx
 durationpy==0.10
    # via kubernetes
 email-validator==2.2.0
@@ -233,6 +251,7 @@ et-xmlfile==2.0.0
 events==0.5
    # via opensearch-py
 exa-py==1.15.4
+    # via onyx
 exceptiongroup==1.3.0
    # via
    #   braintrust
@@ -243,16 +262,23 @@ fastapi==0.133.1
    #   fastapi-users
    #   onyx
 fastapi-limiter==0.1.6
+    # via onyx
 fastapi-users==15.0.4
-    # via fastapi-users-db-sqlalchemy
+    # via
+    #   fastapi-users-db-sqlalchemy
+    #   onyx
 fastapi-users-db-sqlalchemy==7.0.0
+    # via onyx
 fastavro==1.12.1
    # via cohere
 fastmcp==3.2.0
+    # via onyx
 fastuuid==0.14.0
    # via litellm
 filelock==3.20.3
-    # via huggingface-hub
+    # via
+    #   huggingface-hub
+    #   onyx
 filetype==1.2.0
    # via unstructured
 flatbuffers==25.9.23
@@ -272,6 +298,7 @@ gitpython==3.1.45
 google-api-core==2.28.1
    # via google-api-python-client
 google-api-python-client==2.86.0
+    # via onyx
 google-auth==2.48.0
    # via
    #   google-api-core
@@ -281,8 +308,11 @@ google-auth==2.48.0
    #   google-genai
    #   kubernetes
 google-auth-httplib2==0.1.0
-    # via google-api-python-client
+    # via
+    #   google-api-python-client
+    #   onyx
 google-auth-oauthlib==1.0.0
+    # via onyx
 google-genai==1.52.0
    # via onyx
 googleapis-common-protos==1.72.0
@@ -310,6 +340,7 @@ htmldate==1.9.1
 httpcore==1.0.9
    # via
    #   httpx
+    #   onyx
    #   unstructured-client
 httplib2==0.31.0
    # via
@@ -326,16 +357,21 @@ httpx==0.28.1
    #   langsmith
    #   litellm
    #   mcp
+    #   onyx
    #   openai
    #   unstructured-client
 httpx-oauth==0.15.1
+    # via onyx
 httpx-sse==0.4.3
    # via
    #   cohere
    #   mcp
 hubspot-api-client==11.1.0
+    # via onyx
 huggingface-hub==0.35.3
-    # via tokenizers
+    # via
+    #   onyx
+    #   tokenizers
 humanfriendly==10.0
    # via coloredlogs
 hyperframe==6.1.0
@@ -354,7 +390,9 @@ importlib-metadata==8.7.0
    #   litellm
    #   opentelemetry-api
 inflection==0.5.1
-    # via pyairtable
+    # via
+    #   onyx
+    #   pyairtable
 iniconfig==2.3.0
    # via pytest
 isodate==0.7.2
@@ -376,6 +414,7 @@ jinja2==3.1.6
    #   distributed
    #   litellm
 jira==3.10.5
+    # via onyx
 jiter==0.12.0
    # via openai
 jmespath==1.0.1
@@ -391,7 +430,9 @@ jsonpatch==1.33
 jsonpointer==3.0.0
    # via jsonpatch
 jsonref==1.1.0
-    # via fastmcp
+    # via
+    #   fastmcp
+    #   onyx
 jsonschema==4.25.1
    # via
    #   litellm
@@ -409,12 +450,15 @@ kombu==5.5.4
 kubernetes==31.0.0
    # via onyx
 langchain-core==1.2.22
+    # via onyx
 langdetect==1.0.9
    # via unstructured
 langfuse==3.10.0
+    # via onyx
 langsmith==0.3.45
    # via langchain-core
 lazy-imports==1.0.1
+    # via onyx
 legacy-cgi==2.6.4 ; python_full_version >= '3.13'
    # via ddtrace
 litellm==1.81.6
@@ -429,6 +473,7 @@ lxml==5.3.0
    #   justext
    #   lxml-html-clean
    #   markitdown
+    #   onyx
    #   python-docx
    #   python-pptx
    #   python3-saml
@@ -443,7 +488,9 @@ magika==0.6.3
 makefun==1.16.0
    # via fastapi-users
 mako==1.2.4
-    # via alembic
+    # via
+    #   alembic
+    #   onyx
 mammoth==1.11.0
    # via markitdown
 markdown-it-py==4.0.0
@@ -451,6 +498,7 @@ markdown-it-py==4.0.0
 markdownify==1.2.2
    # via markitdown
 markitdown==0.1.2
+    # via onyx
 markupsafe==3.0.3
    # via
    #   jinja2
@@ -464,9 +512,11 @@ mcp==1.26.0
    # via
    #   claude-agent-sdk
    #   fastmcp
+    #   onyx
 mdurl==0.1.2
    # via markdown-it-py
 mistune==3.2.0
+    # via onyx
 more-itertools==10.8.0
    # via
    #   jaraco-classes
@@ -475,10 +525,13 @@ more-itertools==10.8.0
 mpmath==1.3.0
    # via sympy
 msal==1.34.0
-    # via office365-rest-python-client
+    # via
+    #   office365-rest-python-client
+    #   onyx
 msgpack==1.1.2
    # via distributed
 msoffcrypto-tool==5.4.2
+    # via onyx
 multidict==6.7.0
    # via
    #   aiobotocore
@@ -495,6 +548,7 @@ mypy-extensions==1.0.0
    #   mypy
    #   typing-inspect
 nest-asyncio==1.6.0
+    # via onyx
 nltk==3.9.4
    # via unstructured
 numpy==2.4.1
@@ -509,8 +563,10 @@ oauthlib==3.2.2
    # via
    #   atlassian-python-api
    #   kubernetes
+    #   onyx
    #   requests-oauthlib
 office365-rest-python-client==2.6.2
+    # via onyx
 olefile==0.47
    # via
    #   msoffcrypto-tool
@@ -526,11 +582,15 @@ openai==2.14.0
 openapi-pydantic==0.5.1
    # via fastmcp
 openinference-instrumentation==0.1.42
+    # via onyx
 openinference-semantic-conventions==0.1.25
    # via openinference-instrumentation
 openpyxl==3.0.10
-    # via markitdown
+    # via
+    #   markitdown
+    #   onyx
 opensearch-py==3.0.0
+    # via onyx
 opentelemetry-api==1.39.1
    # via
    #   ddtrace
@@ -546,6 +606,7 @@ opentelemetry-exporter-otlp-proto-http==1.39.1
    # via langfuse
 opentelemetry-proto==1.39.1
    # via
+    #   onyx
    #   opentelemetry-exporter-otlp-proto-common
    #   opentelemetry-exporter-otlp-proto-http
 opentelemetry-sdk==1.39.1
@@ -579,6 +640,7 @@ parameterized==0.9.0
 partd==1.4.2
    # via dask
 passlib==1.7.4
+    # via onyx
 pathable==0.4.4
    # via jsonschema-path
 pdfminer-six==20251107
@@ -590,7 +652,9 @@ platformdirs==4.5.0
    #   fastmcp
    #   zeep
 playwright==1.55.0
-    # via pytest-playwright
+    # via
+    #   onyx
+    #   pytest-playwright
 pluggy==1.6.0
    # via pytest
 ply==3.11
@@ -620,9 +684,12 @@ protobuf==6.33.5
 psutil==7.1.3
    # via
    #   distributed
+    #   onyx
    #   unstructured
 psycopg2-binary==2.9.9
+    # via onyx
 puremagic==1.28
+    # via onyx
 pwdlib==0.3.0
    # via fastapi-users
 py==1.11.0
@@ -630,6 +697,7 @@ py==1.11.0
 py-key-value-aio==0.4.4
    # via fastmcp
 pyairtable==3.0.1
+    # via onyx
 pyasn1==0.6.3
    # via
    #   pyasn1-modules
@@ -639,6 +707,7 @@ pyasn1-modules==0.4.2
 pycparser==2.23 ; implementation_name != 'PyPy'
    # via cffi
 pycryptodome==3.19.1
+    # via onyx
 pydantic==2.11.7
    # via
    #   agent-client-protocol
@@ -665,6 +734,7 @@ pydantic-settings==2.12.0
 pyee==13.0.0
    # via playwright
 pygithub==2.5.0
+    # via onyx
 pygments==2.20.0
    # via rich
 pyjwt==2.12.0
@@ -675,13 +745,17 @@ pyjwt==2.12.0
    #   pygithub
    #   simple-salesforce
 pympler==1.1
+    # via onyx
 pynacl==1.6.2
    # via pygithub
 pypandoc-binary==1.16.2
+    # via onyx
 pyparsing==3.2.5
    # via httplib2
 pypdf==6.9.2
-    # via unstructured-client
+    # via
+    #   onyx
+    #   unstructured-client
 pyperclip==1.11.0
    # via fastmcp
 pyreadline3==3.5.4 ; sys_platform == 'win32'
@@ -694,7 +768,9 @@ pytest==8.3.5
 pytest-base-url==2.1.0
    # via pytest-playwright
 pytest-mock==3.12.0
+    # via onyx
 pytest-playwright==0.7.0
+    # via onyx
 python-dateutil==2.8.2
    # via
    #   aiobotocore
@@ -705,9 +781,11 @@ python-dateutil==2.8.2
    #   htmldate
    #   hubspot-api-client
    #   kubernetes
+    #   onyx
    #   opensearch-py
    #   pandas
 python-docx==1.1.2
+    # via onyx
 python-dotenv==1.1.1
    # via
    #   braintrust
@@ -715,8 +793,10 @@ python-dotenv==1.1.1
    #   litellm
    #   magika
    #   mcp
+    #   onyx
    #   pydantic-settings
 python-gitlab==5.6.0
+    # via onyx
 python-http-client==3.3.7
    # via sendgrid
 python-iso639==2025.11.16
@@ -727,15 +807,19 @@ python-multipart==0.0.22
    # via
    #   fastapi-users
    #   mcp
+    #   onyx
 python-oxmsg==0.0.2
    # via unstructured
 python-pptx==0.6.23
-    # via markitdown
+    # via
+    #   markitdown
+    #   onyx
 python-slugify==8.0.4
    # via
    #   braintrust
    #   pytest-playwright
 python3-saml==1.15.0
+    # via onyx
 pytz==2025.2
    # via
    #   dateparser
@@ -743,6 +827,7 @@ pytz==2025.2
    #   pandas
    #   zeep
 pywikibot==9.0.0
+    # via onyx
 pywin32==311 ; sys_platform == 'win32'
    # via
    #   mcp
@@ -759,9 +844,13 @@ pyyaml==6.0.3
    #   kubernetes
    #   langchain-core
 rapidfuzz==3.13.0
-    # via unstructured
+    # via
+    #   onyx
+    #   unstructured
 redis==5.0.8
-    # via fastapi-limiter
+    # via
+    #   fastapi-limiter
+    #   onyx
 referencing==0.36.2
    # via
    #   jsonschema
@@ -792,6 +881,7 @@ requests==2.33.0
    #   matrix-client
    #   msal
    #   office365-rest-python-client
+    #   onyx
    #   opensearch-py
    #   opentelemetry-exporter-otlp-proto-http
    #   pyairtable
@@ -817,6 +907,7 @@ requests-oauthlib==1.3.1
    #   google-auth-oauthlib
    #   jira
    #   kubernetes
+    #   onyx
 requests-toolbelt==1.0.0
    # via
    #   jira
@@ -827,6 +918,7 @@ requests-toolbelt==1.0.0
 retry==0.9.2
    # via onyx
 rfc3986==1.5.0
+    # via onyx
 rich==14.2.0
    # via
    #   cyclopts
@@ -846,12 +938,15 @@ s3transfer==0.13.1
 secretstorage==3.5.0 ; sys_platform == 'linux'
    # via keyring
 sendgrid==6.12.5
+    # via onyx
 sentry-sdk==2.14.0
    # via onyx
 shapely==2.0.6
+    # via onyx
 shellingham==1.5.4
    # via typer
 simple-salesforce==1.12.6
+    # via onyx
 six==1.17.0
    # via
    #   asana
@@ -866,6 +961,7 @@ six==1.17.0
    #   python-dateutil
    #   stone
 slack-sdk==3.20.2
+    # via onyx
 smmap==5.0.2
    # via gitdb
 sniffio==1.3.1
@@ -880,6 +976,7 @@ sqlalchemy==2.0.15
    # via
    #   alembic
    #   fastapi-users-db-sqlalchemy
+    #   onyx
 sse-starlette==3.0.3
    # via mcp
 sseclient-py==1.8.0
@@ -888,11 +985,14 @@ starlette==0.49.3
    # via
    #   fastapi
    #   mcp
+    #   onyx
    #   prometheus-fastapi-instrumentator
 stone==3.3.1
    # via dropbox
 stripe==10.12.0
+    # via onyx
 supervisor==4.3.0
+    # via onyx
 sympy==1.14.0
    # via onnxruntime
 tblib==3.2.2
@@ -905,8 +1005,11 @@ tenacity==9.1.2
 text-unidecode==1.3
    # via python-slugify
 tiktoken==0.7.0
-    # via litellm
+    # via
+    #   litellm
+    #   onyx
 timeago==1.0.16
+    # via onyx
 tld==0.13.1
    # via courlan
 tokenizers==0.21.4
@@ -930,11 +1033,13 @@ tqdm==4.67.1
    #   openai
    #   unstructured
 trafilatura==1.12.2
+    # via onyx
 typer==0.20.0
    # via mcp
 types-awscrt==0.28.4
    # via botocore-stubs
 types-openpyxl==3.0.4.7
+    # via onyx
 types-requests==2.32.0.20250328
    # via cohere
 types-s3transfer==0.14.0
@@ -1000,8 +1105,11 @@ tzlocal==5.3.1
 uncalled-for==0.2.0
    # via fastmcp
 unstructured==0.18.27
+    # via onyx
 unstructured-client==0.42.6
-    # via unstructured
+    # via
+    #   onyx
+    #   unstructured
 uritemplate==4.2.0
    # via google-api-python-client
 urllib3==2.6.3
@@ -1013,6 +1121,7 @@ urllib3==2.6.3
    #   htmldate
    #   hubspot-api-client
    #   kubernetes
+    #   onyx
    #   opensearch-py
    #   pyairtable
    #   pygithub
@@ -1062,7 +1171,9 @@ xlrd==2.0.2
 xlsxwriter==3.2.9
    # via python-pptx
 xmlsec==1.3.14
-    # via python3-saml
+    # via
+    #   onyx
+    #   python3-saml
 xmltodict==1.0.2
    # via ddtrace
 yarl==1.22.0
@@ -1076,3 +1187,4 @@ zipp==3.23.0
 zstandard==0.23.0
    # via langsmith
 zulip==0.8.2
+    # via onyx
--- a/backend/requirements/dev.txt
+++ b/backend/requirements/dev.txt
@@ -1,5 +1,5 @@
 # This file was autogenerated by uv via the following command:
-#    uv export --no-emit-project --no-default-groups --no-hashes --group dev -o backend/requirements/dev.txt
+#    uv export --no-emit-project --no-default-groups --no-hashes --extra dev -o backend/requirements/dev.txt
 agent-client-protocol==0.7.1
    # via onyx
 aioboto3==15.1.0
@@ -47,6 +47,7 @@ attrs==25.4.0
    #   jsonschema
    #   referencing
 black==25.1.0
+    # via onyx
 boto3==1.39.11
    # via
    #   aiobotocore
@@ -59,6 +60,7 @@ botocore==1.39.11
 brotli==1.2.0
    # via onyx
 celery-types==0.19.0
+    # via onyx
 certifi==2025.11.12
    # via
    #   httpcore
@@ -120,6 +122,7 @@ execnet==2.1.2
 executing==2.2.1
    # via stack-data
 faker==40.1.2
+    # via onyx
 fastapi==0.133.1
    # via
    #   onyx
@@ -153,6 +156,7 @@ h11==0.16.0
    #   httpcore
    #   uvicorn
 hatchling==1.28.0
+    # via onyx
 hf-xet==1.2.0 ; platform_machine == 'aarch64' or platform_machine == 'amd64' or platform_machine == 'arm64' or platform_machine == 'x86_64'
    # via huggingface-hub
 httpcore==1.0.9
@@ -183,6 +187,7 @@ importlib-metadata==8.7.0
 iniconfig==2.3.0
    # via pytest
 ipykernel==6.29.5
+    # via onyx
 ipython==9.7.0
    # via ipykernel
 ipython-pygments-lexers==1.1.1
@@ -219,11 +224,13 @@ litellm==1.81.6
 mako==1.2.4
    # via alembic
 manygo==0.2.0
+    # via onyx
 markupsafe==3.0.3
    # via
    #   jinja2
    #   mako
 matplotlib==3.10.8
+    # via onyx
 matplotlib-inline==0.2.1
    # via
    #   ipykernel
@@ -236,10 +243,12 @@ multidict==6.7.0
    #   aiohttp
    #   yarl
 mypy==1.13.0
+    # via onyx
 mypy-extensions==1.0.0
    # via
    #   black
    #   mypy
+    #   onyx
 nest-asyncio==1.6.0
    # via ipykernel
 nodeenv==1.9.1
@@ -254,13 +263,16 @@ oauthlib==3.2.2
    # via
    #   kubernetes
    #   requests-oauthlib
-onyx-devtools==0.7.3
+onyx-devtools==0.7.2
+    # via onyx
 openai==2.14.0
    # via
    #   litellm
    #   onyx
 openapi-generator-cli==7.17.0
-    # via onyx-devtools
+    # via
+    #   onyx
+    #   onyx-devtools
 packaging==24.2
    # via
    #   black
@@ -270,6 +282,7 @@ packaging==24.2
    #   matplotlib
    #   pytest
 pandas-stubs==2.3.3.251201
+    # via onyx
 parameterized==0.9.0
    # via cohere
 parso==0.8.5
@@ -292,6 +305,7 @@ pluggy==1.6.0
    #   hatchling
    #   pytest
 pre-commit==3.2.2
+    # via onyx
 prometheus-client==0.23.1
    # via
    #   onyx
@@ -345,16 +359,22 @@ pyparsing==3.2.5
    # via matplotlib
 pytest==8.3.5
    # via
+    #   onyx
    #   pytest-alembic
    #   pytest-asyncio
    #   pytest-dotenv
    #   pytest-repeat
    #   pytest-xdist
 pytest-alembic==0.12.1
+    # via onyx
 pytest-asyncio==1.3.0
+    # via onyx
 pytest-dotenv==0.5.2
+    # via onyx
 pytest-repeat==0.9.4
+    # via onyx
 pytest-xdist==3.8.0
+    # via onyx
 python-dateutil==2.8.2
    # via
    #   aiobotocore
@@ -387,7 +407,9 @@ referencing==0.36.2
 regex==2025.11.3
    # via tiktoken
 release-tag==0.5.2
+    # via onyx
 reorder-python-imports-black==3.14.0
+    # via onyx
 requests==2.33.0
    # via
    #   cohere
@@ -408,6 +430,7 @@ rpds-py==0.29.0
 rsa==4.9.1
    # via google-auth
 ruff==0.12.0
+    # via onyx
 s3transfer==0.13.1
    # via boto3
 sentry-sdk==2.14.0
@@ -461,22 +484,39 @@ traitlets==5.14.3
 trove-classifiers==2025.12.1.14
    # via hatchling
 types-beautifulsoup4==4.12.0.3
+    # via onyx
 types-html5lib==1.1.11.13
-    # via types-beautifulsoup4
+    # via
+    #   onyx
+    #   types-beautifulsoup4
 types-oauthlib==3.2.0.9
+    # via onyx
 types-passlib==1.7.7.20240106
+    # via onyx
 types-pillow==10.2.0.20240822
+    # via onyx
 types-psutil==7.1.3.20251125
+    # via onyx
 types-psycopg2==2.9.21.10
+    # via onyx
 types-python-dateutil==2.8.19.13
+    # via onyx
 types-pytz==2023.3.1.1
-    # via pandas-stubs
+    # via
+    #   onyx
+    #   pandas-stubs
 types-pyyaml==6.0.12.11
+    # via onyx
 types-regex==2023.3.23.1
+    # via onyx
 types-requests==2.32.0.20250328
-    # via cohere
+    # via
+    #   cohere
+    #   onyx
 types-retry==0.9.9.3
+    # via onyx
 types-setuptools==68.0.0.3
+    # via onyx
 typing-extensions==4.15.0
    # via
    #   aiosignal
@@ -534,3 +574,4 @@ yarl==1.22.0
 zipp==3.23.0
    # via importlib-metadata
 zizmor==1.18.0
+    # via onyx
--- a/backend/requirements/ee.txt
+++ b/backend/requirements/ee.txt
@@ -1,5 +1,5 @@
 # This file was autogenerated by uv via the following command:
-#    uv export --no-emit-project --no-default-groups --no-hashes --group ee -o backend/requirements/ee.txt
+#    uv export --no-emit-project --no-default-groups --no-hashes --extra ee -o backend/requirements/ee.txt
 agent-client-protocol==0.7.1
    # via onyx
 aioboto3==15.1.0
@@ -182,6 +182,7 @@ packaging==24.2
 parameterized==0.9.0
    # via cohere
 posthog==3.7.4
+    # via onyx
 prometheus-client==0.23.1
    # via
    #   onyx
--- a/backend/requirements/model_server.txt
+++ b/backend/requirements/model_server.txt
@@ -1,6 +1,7 @@
 # This file was autogenerated by uv via the following command:
-#    uv export --no-emit-project --no-default-groups --no-hashes --group model_server -o backend/requirements/model_server.txt
+#    uv export --no-emit-project --no-default-groups --no-hashes --extra model_server -o backend/requirements/model_server.txt
 accelerate==1.6.0
+    # via onyx
 agent-client-protocol==0.7.1
    # via onyx
 aioboto3==15.1.0
@@ -104,6 +105,7 @@ distro==1.9.0
 durationpy==0.10
    # via kubernetes
 einops==0.8.1
+    # via onyx
 fastapi==0.133.1
    # via
    #   onyx
@@ -205,6 +207,7 @@ networkx==3.5
 numpy==2.4.1
    # via
    #   accelerate
+    #   onyx
    #   scikit-learn
    #   scipy
    #   transformers
@@ -360,6 +363,7 @@ s3transfer==0.13.1
 safetensors==0.5.3
    # via
    #   accelerate
+    #   onyx
    #   transformers
 scikit-learn==1.7.2
    # via sentence-transformers
@@ -368,6 +372,7 @@ scipy==1.16.3
    #   scikit-learn
    #   sentence-transformers
 sentence-transformers==4.0.2
+    # via onyx
 sentry-sdk==2.14.0
    # via onyx
 setuptools==80.9.0 ; python_full_version >= '3.12'
@@ -406,6 +411,7 @@ tokenizers==0.21.4
 torch==2.9.1
    # via
    #   accelerate
+    #   onyx
    #   sentence-transformers
 tqdm==4.67.1
    # via
@@ -414,7 +420,9 @@ tqdm==4.67.1
    #   sentence-transformers
    #   transformers
 transformers==4.53.0
-    # via sentence-transformers
+    # via
+    #   onyx
+    #   sentence-transformers
 triton==3.5.1 ; platform_machine == 'x86_64' and sys_platform == 'linux'
    # via torch
 types-requests==2.32.0.20250328
--- a/backend/tests/external_dependency_unit/tools/test_memory_tool_integration.py
+++ b/backend/tests/external_dependency_unit/tools/test_memory_tool_integration.py
@@ -38,41 +38,38 @@ class TestAddMemory:
    def test_add_memory_creates_row(self, db_session: Session, test_user: User) -> None:
        """Verify that add_memory inserts a new Memory row."""
        user_id = test_user.id
-        memory_id = add_memory(
+        memory = add_memory(
            user_id=user_id,
            memory_text="User prefers dark mode",
            db_session=db_session,
        )

-        assert memory_id is not None
+        assert memory.id is not None
+        assert memory.user_id == user_id
+        assert memory.memory_text == "User prefers dark mode"

        # Verify it persists
-        fetched = db_session.get(Memory, memory_id)
+        fetched = db_session.get(Memory, memory.id)
        assert fetched is not None
-        assert fetched.user_id == user_id
        assert fetched.memory_text == "User prefers dark mode"

    def test_add_multiple_memories(self, db_session: Session, test_user: User) -> None:
        """Verify that multiple memories can be added for the same user."""
        user_id = test_user.id
-        m1_id = add_memory(
+        m1 = add_memory(
            user_id=user_id,
            memory_text="Favorite color is blue",
            db_session=db_session,
        )
-        m2_id = add_memory(
+        m2 = add_memory(
            user_id=user_id,
            memory_text="Works in engineering",
            db_session=db_session,
        )

-        assert m1_id != m2_id
-        fetched_m1 = db_session.get(Memory, m1_id)
-        fetched_m2 = db_session.get(Memory, m2_id)
-        assert fetched_m1 is not None
-        assert fetched_m2 is not None
-        assert fetched_m1.memory_text == "Favorite color is blue"
-        assert fetched_m2.memory_text == "Works in engineering"
+        assert m1.id != m2.id
+        assert m1.memory_text == "Favorite color is blue"
+        assert m2.memory_text == "Works in engineering"


 class TestUpdateMemoryAtIndex:
@@ -85,17 +82,15 @@ class TestUpdateMemoryAtIndex:
        add_memory(user_id=user_id, memory_text="Memory 1", db_session=db_session)
        add_memory(user_id=user_id, memory_text="Memory 2", db_session=db_session)

-        updated_id = update_memory_at_index(
+        updated = update_memory_at_index(
            user_id=user_id,
            index=1,
            new_text="Updated Memory 1",
            db_session=db_session,
        )

-        assert updated_id is not None
-        fetched = db_session.get(Memory, updated_id)
-        assert fetched is not None
-        assert fetched.memory_text == "Updated Memory 1"
+        assert updated is not None
+        assert updated.memory_text == "Updated Memory 1"

    def test_update_memory_at_out_of_range_index(
        self, db_session: Session, test_user: User
@@ -172,7 +167,7 @@ class TestMemoryCap:
        assert len(rows_before) == MAX_MEMORIES_PER_USER

        # Add one more — should evict the oldest
-        new_memory_id = add_memory(
+        new_memory = add_memory(
            user_id=user_id,
            memory_text="New memory after cap",
            db_session=db_session,
@@ -186,7 +181,7 @@ class TestMemoryCap:
        # Oldest ("Memory 0") should be gone; "Memory 1" is now the oldest
        assert rows_after[0].memory_text == "Memory 1"
        # Newest should be the one we just added
-        assert rows_after[-1].id == new_memory_id
+        assert rows_after[-1].id == new_memory.id
        assert rows_after[-1].memory_text == "New memory after cap"


@@ -226,26 +221,22 @@ class TestGetMemoriesWithUserId:
        user_id = test_user_no_memories.id

        # Add a memory
-        memory_id = add_memory(
+        memory = add_memory(
            user_id=user_id,
            memory_text="Memory with use_memories off",
            db_session=db_session,
        )
-        fetched = db_session.get(Memory, memory_id)
-        assert fetched is not None
-        assert fetched.memory_text == "Memory with use_memories off"
+        assert memory.memory_text == "Memory with use_memories off"

        # Update that memory
-        updated_id = update_memory_at_index(
+        updated = update_memory_at_index(
            user_id=user_id,
            index=0,
            new_text="Updated memory with use_memories off",
            db_session=db_session,
        )
-        assert updated_id is not None
-        fetched_updated = db_session.get(Memory, updated_id)
-        assert fetched_updated is not None
-        assert fetched_updated.memory_text == "Updated memory with use_memories off"
+        assert updated is not None
+        assert updated.memory_text == "Updated memory with use_memories off"

        # Verify get_memories returns the updated memory
        context = get_memories(test_user_no_memories, db_session)
--- a/backend/tests/integration/common_utils/managers/user_group.py
+++ b/backend/tests/integration/common_utils/managers/user_group.py
@@ -117,15 +117,14 @@ class UserGroupManager:
        return response.json()

    @staticmethod
-    def set_permission(
+    def set_permissions(
        user_group: DATestUserGroup,
-        permission: str,
-        enabled: bool,
+        permissions: list[str],
        user_performing_action: DATestUser,
    ) -> requests.Response:
        response = requests.put(
            f"{API_SERVER_URL}/manage/admin/user-group/{user_group.id}/permissions",
-            json={"permission": permission, "enabled": enabled},
+            json={"permissions": permissions},
            headers=user_performing_action.headers,
        )
        return response
--- a/backend/tests/integration/tests/usergroup/test_group_permission_toggle.py
+++ b/backend/tests/integration/tests/usergroup/test_group_permission_toggle.py
@@ -13,7 +13,7 @@ ENTERPRISE_SKIP = pytest.mark.skipif(


@ENTERPRISE_SKIP
-def test_grant_permission_via_toggle(reset: None) -> None:  # noqa: ARG001
+def test_grant_permission_via_bulk(reset: None) -> None:  # noqa: ARG001
    admin_user: DATestUser = UserManager.create(name="admin_grant")
    basic_user: DATestUser = UserManager.create(name="basic_grant")

@@ -23,10 +23,11 @@ def test_grant_permission_via_toggle(reset: None) -> None:  # noqa: ARG001
        user_performing_action=admin_user,
    )

-    # Grant manage:llms
-    resp = UserGroupManager.set_permission(group, "manage:llms", True, admin_user)
+    # Set desired permissions to [manage:llms]
+    resp = UserGroupManager.set_permissions(group, ["manage:llms"], admin_user)
    resp.raise_for_status()
-    assert resp.json() == {"permission": "manage:llms", "enabled": True}
+    result = resp.json()
+    assert "manage:llms" in result, f"Expected manage:llms in {result}"

    # Verify group permissions
    group_perms = UserGroupManager.get_permissions(group, admin_user)
@@ -38,7 +39,7 @@ def test_grant_permission_via_toggle(reset: None) -> None:  # noqa: ARG001


@ENTERPRISE_SKIP
-def test_revoke_permission_via_toggle(reset: None) -> None:  # noqa: ARG001
+def test_revoke_permission_via_bulk(reset: None) -> None:  # noqa: ARG001
    admin_user: DATestUser = UserManager.create(name="admin_revoke")
    basic_user: DATestUser = UserManager.create(name="basic_revoke")

@@ -48,13 +49,11 @@ def test_revoke_permission_via_toggle(reset: None) -> None:  # noqa: ARG001
        user_performing_action=admin_user,
    )

-    # Grant then revoke
-    UserGroupManager.set_permission(
-        group, "manage:llms", True, admin_user
-    ).raise_for_status()
-    UserGroupManager.set_permission(
-        group, "manage:llms", False, admin_user
+    # Grant, then revoke by sending an empty list
+    UserGroupManager.set_permissions(
+        group, ["manage:llms"], admin_user
    ).raise_for_status()
+    UserGroupManager.set_permissions(group, [], admin_user).raise_for_status()

    # Verify removed from group
    group_perms = UserGroupManager.get_permissions(group, admin_user)
@@ -68,7 +67,7 @@ def test_revoke_permission_via_toggle(reset: None) -> None:  # noqa: ARG001


@ENTERPRISE_SKIP
-def test_idempotent_grant(reset: None) -> None:  # noqa: ARG001
+def test_idempotent_bulk_set(reset: None) -> None:  # noqa: ARG001
    admin_user: DATestUser = UserManager.create(name="admin_idempotent_grant")

    group = UserGroupManager.create(
@@ -77,12 +76,12 @@ def test_idempotent_grant(reset: None) -> None:  # noqa: ARG001
        user_performing_action=admin_user,
    )

-    # Toggle ON twice
-    UserGroupManager.set_permission(
-        group, "manage:llms", True, admin_user
+    # Set same permissions twice
+    UserGroupManager.set_permissions(
+        group, ["manage:llms"], admin_user
    ).raise_for_status()
-    UserGroupManager.set_permission(
-        group, "manage:llms", True, admin_user
+    UserGroupManager.set_permissions(
+        group, ["manage:llms"], admin_user
    ).raise_for_status()

    group_perms = UserGroupManager.get_permissions(group, admin_user)
@@ -92,22 +91,22 @@ def test_idempotent_grant(reset: None) -> None:  # noqa: ARG001


@ENTERPRISE_SKIP
-def test_idempotent_revoke(reset: None) -> None:  # noqa: ARG001
-    admin_user: DATestUser = UserManager.create(name="admin_idempotent_revoke")
+def test_empty_permissions_is_valid(reset: None) -> None:  # noqa: ARG001
+    admin_user: DATestUser = UserManager.create(name="admin_empty")

    group = UserGroupManager.create(
-        name="idempotent-revoke-group",
+        name="empty-perms-group",
        user_ids=[admin_user.id],
        user_performing_action=admin_user,
    )

-    # Toggle OFF when never granted — should not error
-    resp = UserGroupManager.set_permission(group, "manage:llms", False, admin_user)
+    # Setting an empty list should not error
+    resp = UserGroupManager.set_permissions(group, [], admin_user)
    resp.raise_for_status()


@ENTERPRISE_SKIP
-def test_cannot_toggle_basic_access(reset: None) -> None:  # noqa: ARG001
+def test_cannot_set_basic_access(reset: None) -> None:  # noqa: ARG001
    admin_user: DATestUser = UserManager.create(name="admin_basic_block")

    group = UserGroupManager.create(
@@ -116,12 +115,12 @@ def test_cannot_toggle_basic_access(reset: None) -> None:  # noqa: ARG001
        user_performing_action=admin_user,
    )

-    resp = UserGroupManager.set_permission(group, "basic", True, admin_user)
+    resp = UserGroupManager.set_permissions(group, ["basic"], admin_user)
    assert resp.status_code == 400, f"Expected 400, got {resp.status_code}"


@ENTERPRISE_SKIP
-def test_cannot_toggle_admin(reset: None) -> None:  # noqa: ARG001
+def test_cannot_set_admin(reset: None) -> None:  # noqa: ARG001
    admin_user: DATestUser = UserManager.create(name="admin_admin_block")

    group = UserGroupManager.create(
@@ -130,7 +129,7 @@ def test_cannot_toggle_admin(reset: None) -> None:  # noqa: ARG001
        user_performing_action=admin_user,
    )

-    resp = UserGroupManager.set_permission(group, "admin", True, admin_user)
+    resp = UserGroupManager.set_permissions(group, ["admin"], admin_user)
    assert resp.status_code == 400, f"Expected 400, got {resp.status_code}"


@@ -146,11 +145,44 @@ def test_implied_permissions_expand(reset: None) -> None:  # noqa: ARG001
    )

    # Grant manage:agents — should imply add:agents and read:agents
-    UserGroupManager.set_permission(
-        group, "manage:agents", True, admin_user
+    UserGroupManager.set_permissions(
+        group, ["manage:agents"], admin_user
    ).raise_for_status()

    user_perms = UserManager.get_permissions(basic_user)
    assert "manage:agents" in user_perms, f"Missing manage:agents: {user_perms}"
    assert "add:agents" in user_perms, f"Missing implied add:agents: {user_perms}"
    assert "read:agents" in user_perms, f"Missing implied read:agents: {user_perms}"
+
+
+@ENTERPRISE_SKIP
+def test_bulk_replaces_previous_state(reset: None) -> None:  # noqa: ARG001
+    """Setting a new permission list should disable ones no longer included."""
+    admin_user: DATestUser = UserManager.create(name="admin_replace")
+
+    group = UserGroupManager.create(
+        name="replace-state-group",
+        user_ids=[admin_user.id],
+        user_performing_action=admin_user,
+    )
+
+    # Set initial permissions
+    UserGroupManager.set_permissions(
+        group, ["manage:llms", "manage:actions"], admin_user
+    ).raise_for_status()
+
+    # Replace with a different set
+    UserGroupManager.set_permissions(
+        group, ["manage:actions", "manage:user_groups"], admin_user
+    ).raise_for_status()
+
+    group_perms = UserGroupManager.get_permissions(group, admin_user)
+    assert (
+        "manage:llms" not in group_perms
+    ), f"manage:llms should be removed: {group_perms}"
+    assert (
+        "manage:actions" in group_perms
+    ), f"manage:actions should remain: {group_perms}"
+    assert (
+        "manage:user_groups" in group_perms
+    ), f"manage:user_groups should be added: {group_perms}"
--- a/backend/tests/unit/background/__init__.py
+++ b/backend/tests/unit/background/__init__.py
--- a/backend/tests/unit/background/celery/__init__.py
+++ b/backend/tests/unit/background/celery/__init__.py
--- a/backend/tests/unit/background/celery/test_celery_utils.py
+++ b/backend/tests/unit/background/celery/test_celery_utils.py
@@ -1,149 +0,0 @@
-"""Unit tests for extract_ids_from_runnable_connector metrics instrumentation."""
-
-from collections.abc import Iterator
-from unittest.mock import MagicMock
-
-import pytest
-
-from onyx.background.celery.celery_utils import extract_ids_from_runnable_connector
-from onyx.connectors.interfaces import SlimConnector
-from onyx.connectors.models import SlimDocument
-from onyx.server.metrics.pruning_metrics import PRUNING_ENUMERATION_DURATION
-from onyx.server.metrics.pruning_metrics import PRUNING_RATE_LIMIT_ERRORS
-
-
-def _make_slim_connector(doc_ids: list[str]) -> SlimConnector:
-    """Mock SlimConnector that yields the given doc IDs in one batch."""
-    connector = MagicMock(spec=SlimConnector)
-    docs = [
-        MagicMock(spec=SlimDocument, id=doc_id, parent_hierarchy_raw_node_id=None)
-        for doc_id in doc_ids
-    ]
-    connector.retrieve_all_slim_docs.return_value = iter([docs])
-    return connector
-
-
-def _raising_connector(message: str) -> SlimConnector:
-    """Mock SlimConnector whose generator raises with the given message."""
-    connector = MagicMock(spec=SlimConnector)
-
-    def raising_iter() -> Iterator:
-        raise Exception(message)
-        yield
-
-    connector.retrieve_all_slim_docs.return_value = raising_iter()
-    return connector
-
-
-class TestEnumerationDuration:
-    def test_recorded_on_success(self) -> None:
-        connector = _make_slim_connector(["doc1"])
-        before = PRUNING_ENUMERATION_DURATION.labels(
-            connector_type="google_drive"
-        )._sum.get()
-
-        extract_ids_from_runnable_connector(connector, connector_type="google_drive")
-
-        after = PRUNING_ENUMERATION_DURATION.labels(
-            connector_type="google_drive"
-        )._sum.get()
-        assert after >= before  # duration observed (non-negative)
-
-    def test_recorded_on_exception(self) -> None:
-        connector = _raising_connector("unexpected error")
-        before = PRUNING_ENUMERATION_DURATION.labels(
-            connector_type="confluence"
-        )._sum.get()
-
-        with pytest.raises(Exception):
-            extract_ids_from_runnable_connector(connector, connector_type="confluence")
-
-        after = PRUNING_ENUMERATION_DURATION.labels(
-            connector_type="confluence"
-        )._sum.get()
-        assert after >= before  # duration observed even on exception
-
-
-class TestRateLimitDetection:
-    def test_increments_on_rate_limit_message(self) -> None:
-        connector = _raising_connector("rate limit exceeded")
-        before = PRUNING_RATE_LIMIT_ERRORS.labels(
-            connector_type="google_drive"
-        )._value.get()
-
-        with pytest.raises(Exception, match="rate limit exceeded"):
-            extract_ids_from_runnable_connector(
-                connector, connector_type="google_drive"
-            )
-
-        after = PRUNING_RATE_LIMIT_ERRORS.labels(
-            connector_type="google_drive"
-        )._value.get()
-        assert after == before + 1
-
-    def test_increments_on_429_in_message(self) -> None:
-        connector = _raising_connector("HTTP 429 Too Many Requests")
-        before = PRUNING_RATE_LIMIT_ERRORS.labels(
-            connector_type="confluence"
-        )._value.get()
-
-        with pytest.raises(Exception, match="429"):
-            extract_ids_from_runnable_connector(connector, connector_type="confluence")
-
-        after = PRUNING_RATE_LIMIT_ERRORS.labels(
-            connector_type="confluence"
-        )._value.get()
-        assert after == before + 1
-
-    def test_does_not_increment_on_non_rate_limit_exception(self) -> None:
-        connector = _raising_connector("connection timeout")
-        before = PRUNING_RATE_LIMIT_ERRORS.labels(connector_type="slack")._value.get()
-
-        with pytest.raises(Exception, match="connection timeout"):
-            extract_ids_from_runnable_connector(connector, connector_type="slack")
-
-        after = PRUNING_RATE_LIMIT_ERRORS.labels(connector_type="slack")._value.get()
-        assert after == before
-
-    def test_rate_limit_detection_is_case_insensitive(self) -> None:
-        connector = _raising_connector("RATE LIMIT exceeded")
-        before = PRUNING_RATE_LIMIT_ERRORS.labels(connector_type="jira")._value.get()
-
-        with pytest.raises(Exception):
-            extract_ids_from_runnable_connector(connector, connector_type="jira")
-
-        after = PRUNING_RATE_LIMIT_ERRORS.labels(connector_type="jira")._value.get()
-        assert after == before + 1
-
-    def test_connector_type_label_matches_input(self) -> None:
-        connector = _raising_connector("rate limit exceeded")
-        before_gd = PRUNING_RATE_LIMIT_ERRORS.labels(
-            connector_type="google_drive"
-        )._value.get()
-        before_jira = PRUNING_RATE_LIMIT_ERRORS.labels(
-            connector_type="jira"
-        )._value.get()
-
-        with pytest.raises(Exception):
-            extract_ids_from_runnable_connector(
-                connector, connector_type="google_drive"
-            )
-
-        assert (
-            PRUNING_RATE_LIMIT_ERRORS.labels(connector_type="google_drive")._value.get()
-            == before_gd + 1
-        )
-        assert (
-            PRUNING_RATE_LIMIT_ERRORS.labels(connector_type="jira")._value.get()
-            == before_jira
-        )
-
-    def test_defaults_to_unknown_connector_type(self) -> None:
-        connector = _raising_connector("rate limit exceeded")
-        before = PRUNING_RATE_LIMIT_ERRORS.labels(connector_type="unknown")._value.get()
-
-        with pytest.raises(Exception):
-            extract_ids_from_runnable_connector(connector)
-
-        after = PRUNING_RATE_LIMIT_ERRORS.labels(connector_type="unknown")._value.get()
-        assert after == before + 1
--- a/backend/tests/unit/ee/onyx/db/test_license.py
+++ b/backend/tests/unit/ee/onyx/db/test_license.py
@@ -9,7 +9,6 @@ from unittest.mock import patch
 from ee.onyx.db.license import check_seat_availability
 from ee.onyx.db.license import delete_license
 from ee.onyx.db.license import get_license
-from ee.onyx.db.license import get_used_seats
 from ee.onyx.db.license import upsert_license
 from ee.onyx.server.license.models import LicenseMetadata
 from ee.onyx.server.license.models import LicenseSource
@@ -215,43 +214,3 @@ class TestCheckSeatAvailabilityMultiTenant:
        assert result.available is False
        assert result.error_message is not None
        mock_tenant_count.assert_called_once_with("tenant-abc")
-
-
-class TestGetUsedSeatsAccountTypeFiltering:
-    """Verify get_used_seats query excludes SERVICE_ACCOUNT but includes BOT."""
-
-    @patch("ee.onyx.db.license.MULTI_TENANT", False)
-    @patch("onyx.db.engine.sql_engine.get_session_with_current_tenant")
-    def test_excludes_service_accounts(self, mock_get_session: MagicMock) -> None:
-        """SERVICE_ACCOUNT users should not count toward seats."""
-        mock_session = MagicMock()
-        mock_get_session.return_value.__enter__ = MagicMock(return_value=mock_session)
-        mock_get_session.return_value.__exit__ = MagicMock(return_value=False)
-        mock_session.execute.return_value.scalar.return_value = 5
-
-        result = get_used_seats()
-
-        assert result == 5
-        # Inspect the compiled query to verify account_type filter
-        call_args = mock_session.execute.call_args
-        query = call_args[0][0]
-        compiled = str(query.compile(compile_kwargs={"literal_binds": True}))
-        assert "SERVICE_ACCOUNT" in compiled
-        # BOT should NOT be excluded
-        assert "BOT" not in compiled
-
-    @patch("ee.onyx.db.license.MULTI_TENANT", False)
-    @patch("onyx.db.engine.sql_engine.get_session_with_current_tenant")
-    def test_still_excludes_ext_perm_user(self, mock_get_session: MagicMock) -> None:
-        """EXT_PERM_USER exclusion should still be present."""
-        mock_session = MagicMock()
-        mock_get_session.return_value.__enter__ = MagicMock(return_value=mock_session)
-        mock_get_session.return_value.__exit__ = MagicMock(return_value=False)
-        mock_session.execute.return_value.scalar.return_value = 3
-
-        get_used_seats()
-
-        call_args = mock_session.execute.call_args
-        query = call_args[0][0]
-        compiled = str(query.compile(compile_kwargs={"literal_binds": True}))
-        assert "EXT_PERM_USER" in compiled
--- a/backend/tests/unit/onyx/chat/test_multi_model_streaming.py
+++ b/backend/tests/unit/onyx/chat/test_multi_model_streaming.py
@@ -301,6 +301,7 @@ class TestRunModels:
            patch("onyx.chat.process_message.run_llm_loop", side_effect=emit_stop),
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch("onyx.chat.process_message.llm_loop_completion_handle"),
            patch(
                "onyx.chat.process_message.get_llm_token_counter",
@@ -331,6 +332,7 @@ class TestRunModels:
            patch("onyx.chat.process_message.run_llm_loop", side_effect=emit_one),
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch("onyx.chat.process_message.llm_loop_completion_handle"),
            patch(
                "onyx.chat.process_message.get_llm_token_counter",
@@ -361,6 +363,7 @@ class TestRunModels:
            patch("onyx.chat.process_message.run_llm_loop", side_effect=emit_one),
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch("onyx.chat.process_message.llm_loop_completion_handle"),
            patch(
                "onyx.chat.process_message.get_llm_token_counter",
@@ -388,6 +391,7 @@ class TestRunModels:
            patch("onyx.chat.process_message.run_llm_loop", side_effect=always_fail),
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch("onyx.chat.process_message.llm_loop_completion_handle"),
            patch(
                "onyx.chat.process_message.get_llm_token_counter",
@@ -419,6 +423,7 @@ class TestRunModels:
            ),
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch("onyx.chat.process_message.llm_loop_completion_handle"),
            patch(
                "onyx.chat.process_message.get_llm_token_counter",
@@ -451,6 +456,7 @@ class TestRunModels:
            patch("onyx.chat.process_message.run_llm_loop", side_effect=slow_llm),
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch("onyx.chat.process_message.llm_loop_completion_handle"),
            patch(
                "onyx.chat.process_message.get_llm_token_counter",
@@ -491,6 +497,7 @@ class TestRunModels:
            patch("onyx.chat.process_message.run_llm_loop", side_effect=slow_llm),
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch(
                "onyx.chat.process_message.llm_loop_completion_handle"
            ) as mock_handle,
@@ -512,6 +519,7 @@ class TestRunModels:
            patch("onyx.chat.process_message.run_llm_loop"),
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch(
                "onyx.chat.process_message.llm_loop_completion_handle"
            ) as mock_handle,
@@ -534,6 +542,7 @@ class TestRunModels:
            patch("onyx.chat.process_message.run_llm_loop", side_effect=always_fail),
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch(
                "onyx.chat.process_message.llm_loop_completion_handle"
            ) as mock_handle,
@@ -587,6 +596,7 @@ class TestRunModels:
            ),
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch(
                "onyx.chat.process_message.llm_loop_completion_handle",
                side_effect=lambda *_, **__: completion_called.set(),
@@ -643,6 +653,7 @@ class TestRunModels:
            ),
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch(
                "onyx.chat.process_message.llm_loop_completion_handle",
                side_effect=lambda *_, **__: completion_called.set(),
@@ -695,6 +706,7 @@ class TestRunModels:
            patch("onyx.chat.process_message.run_llm_loop", side_effect=fail_model_0),
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch(
                "onyx.chat.process_message.llm_loop_completion_handle"
            ) as mock_handle,
@@ -724,6 +736,7 @@ class TestRunModels:
            patch("onyx.chat.process_message.run_llm_loop") as mock_llm,
            patch("onyx.chat.process_message.run_deep_research_llm_loop"),
            patch("onyx.chat.process_message.construct_tools", return_value={}),
+            patch("onyx.chat.process_message.get_session_with_current_tenant"),
            patch("onyx.chat.process_message.llm_loop_completion_handle"),
            patch(
                "onyx.chat.process_message.get_llm_token_counter",
--- a/backend/tests/unit/onyx/connectors/google_utils/test_google_credential_storage.py
+++ b/backend/tests/unit/onyx/connectors/google_utils/test_google_credential_storage.py
@@ -1,182 +0,0 @@
-from typing import Any
-
-import pytest
-
-from onyx.configs.constants import DocumentSource
-from onyx.configs.constants import KV_GOOGLE_DRIVE_CRED_KEY
-from onyx.configs.constants import KV_GOOGLE_DRIVE_SERVICE_ACCOUNT_KEY
-from onyx.connectors.google_utils.google_kv import get_auth_url
-from onyx.connectors.google_utils.google_kv import get_google_app_cred
-from onyx.connectors.google_utils.google_kv import get_service_account_key
-from onyx.connectors.google_utils.google_kv import upsert_google_app_cred
-from onyx.connectors.google_utils.google_kv import upsert_service_account_key
-from onyx.server.documents.models import GoogleAppCredentials
-from onyx.server.documents.models import GoogleAppWebCredentials
-from onyx.server.documents.models import GoogleServiceAccountKey
-
-
-def _make_app_creds() -> GoogleAppCredentials:
-    return GoogleAppCredentials(
-        web=GoogleAppWebCredentials(
-            client_id="client-id.apps.googleusercontent.com",
-            project_id="test-project",
-            auth_uri="https://accounts.google.com/o/oauth2/auth",
-            token_uri="https://oauth2.googleapis.com/token",
-            auth_provider_x509_cert_url="https://www.googleapis.com/oauth2/v1/certs",
-            client_secret="secret",
-            redirect_uris=["https://example.com/callback"],
-            javascript_origins=["https://example.com"],
-        )
-    )
-
-
-def _make_service_account_key() -> GoogleServiceAccountKey:
-    return GoogleServiceAccountKey(
-        type="service_account",
-        project_id="test-project",
-        private_key_id="private-key-id",
-        private_key="-----BEGIN PRIVATE KEY-----\nabc\n-----END PRIVATE KEY-----\n",
-        client_email="test@test-project.iam.gserviceaccount.com",
-        client_id="123",
-        auth_uri="https://accounts.google.com/o/oauth2/auth",
-        token_uri="https://oauth2.googleapis.com/token",
-        auth_provider_x509_cert_url="https://www.googleapis.com/oauth2/v1/certs",
-        client_x509_cert_url="https://www.googleapis.com/robot/v1/metadata/x509/test",
-        universe_domain="googleapis.com",
-    )
-
-
-def test_upsert_google_app_cred_stores_dict(monkeypatch: Any) -> None:
-    stored: dict[str, Any] = {}
-
-    class _StubKvStore:
-        def store(self, key: str, value: object, encrypt: bool) -> None:
-            stored["key"] = key
-            stored["value"] = value
-            stored["encrypt"] = encrypt
-
-    monkeypatch.setattr(
-        "onyx.connectors.google_utils.google_kv.get_kv_store", lambda: _StubKvStore()
-    )
-
-    upsert_google_app_cred(_make_app_creds(), DocumentSource.GOOGLE_DRIVE)
-
-    assert stored["key"] == KV_GOOGLE_DRIVE_CRED_KEY
-    assert stored["encrypt"] is True
-    assert isinstance(stored["value"], dict)
-    assert stored["value"]["web"]["client_id"] == "client-id.apps.googleusercontent.com"
-
-
-def test_upsert_service_account_key_stores_dict(monkeypatch: Any) -> None:
-    stored: dict[str, Any] = {}
-
-    class _StubKvStore:
-        def store(self, key: str, value: object, encrypt: bool) -> None:
-            stored["key"] = key
-            stored["value"] = value
-            stored["encrypt"] = encrypt
-
-    monkeypatch.setattr(
-        "onyx.connectors.google_utils.google_kv.get_kv_store", lambda: _StubKvStore()
-    )
-
-    upsert_service_account_key(_make_service_account_key(), DocumentSource.GOOGLE_DRIVE)
-
-    assert stored["key"] == KV_GOOGLE_DRIVE_SERVICE_ACCOUNT_KEY
-    assert stored["encrypt"] is True
-    assert isinstance(stored["value"], dict)
-    assert stored["value"]["project_id"] == "test-project"
-
-
-@pytest.mark.parametrize("legacy_string", [False, True])
-def test_get_google_app_cred_accepts_dict_and_legacy_string(
-    monkeypatch: Any, legacy_string: bool
-) -> None:
-    payload: dict[str, Any] = _make_app_creds().model_dump(mode="json")
-    stored_value: object = (
-        payload if not legacy_string else _make_app_creds().model_dump_json()
-    )
-
-    class _StubKvStore:
-        def load(self, key: str) -> object:
-            assert key == KV_GOOGLE_DRIVE_CRED_KEY
-            return stored_value
-
-    monkeypatch.setattr(
-        "onyx.connectors.google_utils.google_kv.get_kv_store", lambda: _StubKvStore()
-    )
-
-    creds = get_google_app_cred(DocumentSource.GOOGLE_DRIVE)
-
-    assert creds.web.client_id == "client-id.apps.googleusercontent.com"
-
-
-@pytest.mark.parametrize("legacy_string", [False, True])
-def test_get_service_account_key_accepts_dict_and_legacy_string(
-    monkeypatch: Any, legacy_string: bool
-) -> None:
-    stored_value: object = (
-        _make_service_account_key().model_dump(mode="json")
-        if not legacy_string
-        else _make_service_account_key().model_dump_json()
-    )
-
-    class _StubKvStore:
-        def load(self, key: str) -> object:
-            assert key == KV_GOOGLE_DRIVE_SERVICE_ACCOUNT_KEY
-            return stored_value
-
-    monkeypatch.setattr(
-        "onyx.connectors.google_utils.google_kv.get_kv_store", lambda: _StubKvStore()
-    )
-
-    key = get_service_account_key(DocumentSource.GOOGLE_DRIVE)
-
-    assert key.client_email == "test@test-project.iam.gserviceaccount.com"
-
-
-@pytest.mark.parametrize("legacy_string", [False, True])
-def test_get_auth_url_accepts_dict_and_legacy_string(
-    monkeypatch: Any, legacy_string: bool
-) -> None:
-    payload = _make_app_creds().model_dump(mode="json")
-    stored_value: object = (
-        payload if not legacy_string else _make_app_creds().model_dump_json()
-    )
-    stored_state: dict[str, object] = {}
-
-    class _StubKvStore:
-        def load(self, key: str) -> object:
-            assert key == KV_GOOGLE_DRIVE_CRED_KEY
-            return stored_value
-
-        def store(self, key: str, value: object, encrypt: bool) -> None:
-            stored_state["key"] = key
-            stored_state["value"] = value
-            stored_state["encrypt"] = encrypt
-
-    class _StubFlow:
-        def authorization_url(self, prompt: str) -> tuple[str, None]:
-            assert prompt == "consent"
-            return "https://accounts.google.com/o/oauth2/auth?state=test-state", None
-
-    monkeypatch.setattr(
-        "onyx.connectors.google_utils.google_kv.get_kv_store", lambda: _StubKvStore()
-    )
-
-    def _from_client_config(
-        _app_config: object, *, scopes: object, redirect_uri: object
-    ) -> _StubFlow:
-        del scopes, redirect_uri
-        return _StubFlow()
-
-    monkeypatch.setattr(
-        "onyx.connectors.google_utils.google_kv.InstalledAppFlow.from_client_config",
-        _from_client_config,
-    )
-
-    auth_url = get_auth_url(42, DocumentSource.GOOGLE_DRIVE)
-
-    assert auth_url.startswith("https://accounts.google.com")
-    assert stored_state["value"] == {"value": "test-state"}
-    assert stored_state["encrypt"] is True
--- a/backend/tests/unit/onyx/connectors/jira/test_jira_bulk_fetch.py
+++ b/backend/tests/unit/onyx/connectors/jira/test_jira_bulk_fetch.py
@@ -6,7 +6,6 @@ import requests
 from jira import JIRA
 from jira.resources import Issue

-from onyx.connectors.jira.connector import _JIRA_BULK_FETCH_LIMIT
 from onyx.connectors.jira.connector import bulk_fetch_issues


@@ -146,29 +145,3 @@ def test_bulk_fetch_recursive_splitting_raises_on_bad_issue() -> None:

    with pytest.raises(requests.exceptions.JSONDecodeError):
        bulk_fetch_issues(client, ["1", "2", bad_id, "3", "4", "5"])
-
-
-def test_bulk_fetch_respects_api_batch_limit() -> None:
-    """Requests to the bulkfetch endpoint never exceed _JIRA_BULK_FETCH_LIMIT IDs."""
-    client = _mock_jira_client()
-    total_issues = _JIRA_BULK_FETCH_LIMIT * 3 + 7
-    all_ids = [str(i) for i in range(total_issues)]
-
-    batch_sizes: list[int] = []
-
-    def _post_side_effect(url: str, json: dict[str, Any]) -> MagicMock:  # noqa: ARG001
-        ids = json["issueIdsOrKeys"]
-        batch_sizes.append(len(ids))
-        resp = MagicMock()
-        resp.json.return_value = {"issues": [_make_raw_issue(i) for i in ids]}
-        return resp
-
-    client._session.post.side_effect = _post_side_effect
-
-    result = bulk_fetch_issues(client, all_ids)
-
-    assert len(result) == total_issues
-    # keeping this hardcoded because it's the documented limit
-    # https://developer.atlassian.com/cloud/jira/platform/rest/v3/api-group-issues/
-    assert all(size <= 100 for size in batch_sizes)
-    assert len(batch_sizes) == 4
--- a/backend/tests/unit/onyx/context/search/federated/test_build_thread_text.py
+++ b/backend/tests/unit/onyx/context/search/federated/test_build_thread_text.py
@@ -1,67 +0,0 @@
-"""Tests for _build_thread_text function."""
-
-from unittest.mock import MagicMock
-from unittest.mock import patch
-
-from onyx.context.search.federated.slack_search import _build_thread_text
-
-
-def _make_msg(user: str, text: str, ts: str) -> dict[str, str]:
-    return {"user": user, "text": text, "ts": ts}
-
-
-class TestBuildThreadText:
-    """Verify _build_thread_text includes full thread replies up to cap."""
-
-    @patch("onyx.context.search.federated.slack_search.batch_get_user_profiles")
-    def test_includes_all_replies(self, mock_profiles: MagicMock) -> None:
-        """All replies within cap are included in output."""
-        mock_profiles.return_value = {}
-        messages = [
-            _make_msg("U1", "parent msg", "1000.0"),
-            _make_msg("U2", "reply 1", "1001.0"),
-            _make_msg("U3", "reply 2", "1002.0"),
-            _make_msg("U4", "reply 3", "1003.0"),
-        ]
-        result = _build_thread_text(messages, "token", "T123", MagicMock())
-        assert "parent msg" in result
-        assert "reply 1" in result
-        assert "reply 2" in result
-        assert "reply 3" in result
-        assert "..." not in result
-
-    @patch("onyx.context.search.federated.slack_search.batch_get_user_profiles")
-    def test_non_thread_returns_parent_only(self, mock_profiles: MagicMock) -> None:
-        """Single message (no replies) returns just the parent text."""
-        mock_profiles.return_value = {}
-        messages = [_make_msg("U1", "just a message", "1000.0")]
-        result = _build_thread_text(messages, "token", "T123", MagicMock())
-        assert "just a message" in result
-        assert "Replies:" not in result
-
-    @patch("onyx.context.search.federated.slack_search.batch_get_user_profiles")
-    def test_parent_always_first(self, mock_profiles: MagicMock) -> None:
-        """Thread parent message is always the first line of output."""
-        mock_profiles.return_value = {}
-        messages = [
-            _make_msg("U1", "I am the parent", "1000.0"),
-            _make_msg("U2", "I am a reply", "1001.0"),
-        ]
-        result = _build_thread_text(messages, "token", "T123", MagicMock())
-        parent_pos = result.index("I am the parent")
-        reply_pos = result.index("I am a reply")
-        assert parent_pos < reply_pos
-
-    @patch("onyx.context.search.federated.slack_search.batch_get_user_profiles")
-    def test_user_profiles_resolved(self, mock_profiles: MagicMock) -> None:
-        """User IDs in thread text are replaced with display names."""
-        mock_profiles.return_value = {"U1": "Alice", "U2": "Bob"}
-        messages = [
-            _make_msg("U1", "hello", "1000.0"),
-            _make_msg("U2", "world", "1001.0"),
-        ]
-        result = _build_thread_text(messages, "token", "T123", MagicMock())
-        assert "Alice" in result
-        assert "Bob" in result
-        assert "<@U1>" not in result
-        assert "<@U2>" not in result
--- a/backend/tests/unit/onyx/context/search/federated/test_url_override.py
+++ b/backend/tests/unit/onyx/context/search/federated/test_url_override.py
@@ -1,108 +0,0 @@
-"""Tests for Slack URL parsing and direct thread fetch via URL override."""
-
-from unittest.mock import MagicMock
-from unittest.mock import patch
-
-from onyx.context.search.federated.models import DirectThreadFetch
-from onyx.context.search.federated.slack_search import _fetch_thread_from_url
-from onyx.context.search.federated.slack_search_utils import extract_slack_message_urls
-
-
-class TestExtractSlackMessageUrls:
-    """Verify URL parsing extracts channel_id and timestamp correctly."""
-
-    def test_standard_url(self) -> None:
-        query = "summarize https://mycompany.slack.com/archives/C097NBWMY8Y/p1775491616524769"
-        results = extract_slack_message_urls(query)
-        assert len(results) == 1
-        assert results[0] == ("C097NBWMY8Y", "1775491616.524769")
-
-    def test_multiple_urls(self) -> None:
-        query = (
-            "compare https://co.slack.com/archives/C111/p1234567890123456 "
-            "and https://co.slack.com/archives/C222/p9876543210987654"
-        )
-        results = extract_slack_message_urls(query)
-        assert len(results) == 2
-        assert results[0] == ("C111", "1234567890.123456")
-        assert results[1] == ("C222", "9876543210.987654")
-
-    def test_no_urls(self) -> None:
-        query = "what happened in #general last week?"
-        results = extract_slack_message_urls(query)
-        assert len(results) == 0
-
-    def test_non_slack_url_ignored(self) -> None:
-        query = "check https://google.com/archives/C111/p1234567890123456"
-        results = extract_slack_message_urls(query)
-        assert len(results) == 0
-
-    def test_timestamp_conversion(self) -> None:
-        """p prefix removed, dot inserted after 10th digit."""
-        query = "https://x.slack.com/archives/CABC123/p1775491616524769"
-        results = extract_slack_message_urls(query)
-        channel_id, ts = results[0]
-        assert channel_id == "CABC123"
-        assert ts == "1775491616.524769"
-        assert not ts.startswith("p")
-        assert "." in ts
-
-
-class TestFetchThreadFromUrl:
-    """Verify _fetch_thread_from_url calls conversations.replies and returns SlackMessage."""
-
-    @patch("onyx.context.search.federated.slack_search._build_thread_text")
-    @patch("onyx.context.search.federated.slack_search.WebClient")
-    def test_successful_fetch(
-        self, mock_webclient_cls: MagicMock, mock_build_thread: MagicMock
-    ) -> None:
-        mock_client = MagicMock()
-        mock_webclient_cls.return_value = mock_client
-
-        # Mock conversations_replies
-        mock_response = MagicMock()
-        mock_response.get.return_value = [
-            {"user": "U1", "text": "parent", "ts": "1775491616.524769"},
-            {"user": "U2", "text": "reply 1", "ts": "1775491617.000000"},
-            {"user": "U3", "text": "reply 2", "ts": "1775491618.000000"},
-        ]
-        mock_client.conversations_replies.return_value = mock_response
-
-        # Mock channel info
-        mock_ch_response = MagicMock()
-        mock_ch_response.get.return_value = {"name": "general"}
-        mock_client.conversations_info.return_value = mock_ch_response
-
-        mock_build_thread.return_value = (
-            "U1: parent\n\nReplies:\n\nU2: reply 1\n\nU3: reply 2"
-        )
-
-        fetch = DirectThreadFetch(
-            channel_id="C097NBWMY8Y", thread_ts="1775491616.524769"
-        )
-        result = _fetch_thread_from_url(fetch, "xoxp-token")
-
-        assert len(result.messages) == 1
-        msg = result.messages[0]
-        assert msg.channel_id == "C097NBWMY8Y"
-        assert msg.thread_id is None  # Prevents double-enrichment
-        assert msg.slack_score == 100000.0
-        assert "parent" in msg.text
-        mock_client.conversations_replies.assert_called_once_with(
-            channel="C097NBWMY8Y", ts="1775491616.524769"
-        )
-
-    @patch("onyx.context.search.federated.slack_search.WebClient")
-    def test_api_error_returns_empty(self, mock_webclient_cls: MagicMock) -> None:
-        from slack_sdk.errors import SlackApiError
-
-        mock_client = MagicMock()
-        mock_webclient_cls.return_value = mock_client
-        mock_client.conversations_replies.side_effect = SlackApiError(
-            message="channel_not_found",
-            response=MagicMock(status_code=404),
-        )
-
-        fetch = DirectThreadFetch(channel_id="CBAD", thread_ts="1234567890.123456")
-        result = _fetch_thread_from_url(fetch, "xoxp-token")
-        assert len(result.messages) == 0
--- a/backend/tests/unit/onyx/llm/test_multi_llm.py
+++ b/backend/tests/unit/onyx/llm/test_multi_llm.py
@@ -29,7 +29,6 @@ from onyx.llm.utils import get_max_input_tokens
 VERTEX_OPUS_MODELS_REJECTING_OUTPUT_CONFIG = [
    "claude-opus-4-5@20251101",
    "claude-opus-4-6",
-    "claude-opus-4-7",
 ]


--- a/backend/tests/unit/onyx/server/manage/llm/test_fetch_models_api.py
+++ b/backend/tests/unit/onyx/server/manage/llm/test_fetch_models_api.py
@@ -505,7 +505,6 @@ class TestGetLMStudioAvailableModels:

        mock_session = MagicMock()
        mock_provider = MagicMock()
-        mock_provider.api_base = "http://localhost:1234"
        mock_provider.custom_config = {"LM_STUDIO_API_KEY": "stored-secret"}

        response = {
--- a/backend/tests/unit/onyx/server/manage/llm/test_llm_provider_utils.py
+++ b/backend/tests/unit/onyx/server/manage/llm/test_llm_provider_utils.py
@@ -100,39 +100,6 @@ class TestGenerateOllamaDisplayName:
        result = generate_ollama_display_name("llama3.3:70b")
        assert "3.3" in result or "3 3" in result  # Either format is acceptable

-    def test_non_size_tag_shown(self) -> None:
-        """Test that non-size tags like 'e4b' are included in the display name."""
-        result = generate_ollama_display_name("gemma4:e4b")
-        assert "Gemma" in result
-        assert "4" in result
-        assert "E4B" in result
-
-    def test_size_with_cloud_modifier(self) -> None:
-        """Test size tag with cloud modifier."""
-        result = generate_ollama_display_name("deepseek-v3.1:671b-cloud")
-        assert "DeepSeek" in result
-        assert "671B" in result
-        assert "Cloud" in result
-
-    def test_size_with_multiple_modifiers(self) -> None:
-        """Test size tag with multiple modifiers."""
-        result = generate_ollama_display_name("qwen3-vl:235b-instruct-cloud")
-        assert "Qwen" in result
-        assert "235B" in result
-        assert "Instruct" in result
-        assert "Cloud" in result
-
-    def test_quantization_tag_shown(self) -> None:
-        """Test that quantization tags are included in the display name."""
-        result = generate_ollama_display_name("llama3:q4_0")
-        assert "Llama" in result
-        assert "Q4_0" in result
-
-    def test_cloud_only_tag(self) -> None:
-        """Test standalone cloud tag."""
-        result = generate_ollama_display_name("glm-4.6:cloud")
-        assert "CLOUD" in result
-

 class TestStripOpenrouterVendorPrefix:
    """Tests for OpenRouter vendor prefix stripping."""
--- a/backend/tests/unit/onyx/server/scim/test_user_endpoints.py
+++ b/backend/tests/unit/onyx/server/scim/test_user_endpoints.py
@@ -2,7 +2,6 @@

 from __future__ import annotations

-from typing import Any
 from unittest.mock import MagicMock
 from unittest.mock import patch
 from uuid import uuid4
@@ -10,9 +9,7 @@ from uuid import uuid4
 from fastapi import Response
 from sqlalchemy.exc import IntegrityError

-from ee.onyx.server.scim.api import _check_seat_availability
 from ee.onyx.server.scim.api import _scim_name_to_str
-from ee.onyx.server.scim.api import _seat_lock_id_for_tenant
 from ee.onyx.server.scim.api import create_user
 from ee.onyx.server.scim.api import delete_user
 from ee.onyx.server.scim.api import get_user
@@ -744,80 +741,3 @@ class TestEmailCasePreservation:
        resource = parse_scim_user(result)
        assert resource.userName == "Alice@Example.COM"
        assert resource.emails[0].value == "Alice@Example.COM"
-
-
-class TestSeatLock:
-    """Tests for the advisory lock in _check_seat_availability."""
-
-    @patch("ee.onyx.server.scim.api.get_current_tenant_id", return_value="tenant_abc")
-    def test_acquires_advisory_lock_before_checking(
-        self,
-        _mock_tenant: MagicMock,
-        mock_dal: MagicMock,
-    ) -> None:
-        """The advisory lock must be acquired before the seat check runs."""
-        call_order: list[str] = []
-
-        def track_execute(stmt: Any, _params: Any = None) -> None:
-            if "pg_advisory_xact_lock" in str(stmt):
-                call_order.append("lock")
-
-        mock_dal.session.execute.side_effect = track_execute
-
-        with patch(
-            "ee.onyx.server.scim.api.fetch_ee_implementation_or_noop"
-        ) as mock_fetch:
-            mock_result = MagicMock()
-            mock_result.available = True
-            mock_fn = MagicMock(return_value=mock_result)
-            mock_fetch.return_value = mock_fn
-
-            def track_check(*_args: Any, **_kwargs: Any) -> Any:
-                call_order.append("check")
-                return mock_result
-
-            mock_fn.side_effect = track_check
-
-            _check_seat_availability(mock_dal)
-
-        assert call_order == ["lock", "check"]
-
-    @patch("ee.onyx.server.scim.api.get_current_tenant_id", return_value="tenant_xyz")
-    def test_lock_uses_tenant_scoped_key(
-        self,
-        _mock_tenant: MagicMock,
-        mock_dal: MagicMock,
-    ) -> None:
-        """The lock id must be derived from the tenant via _seat_lock_id_for_tenant."""
-        mock_result = MagicMock()
-        mock_result.available = True
-        mock_check = MagicMock(return_value=mock_result)
-
-        with patch(
-            "ee.onyx.server.scim.api.fetch_ee_implementation_or_noop",
-            return_value=mock_check,
-        ):
-            _check_seat_availability(mock_dal)
-
-        mock_dal.session.execute.assert_called_once()
-        params = mock_dal.session.execute.call_args[0][1]
-        assert params["lock_id"] == _seat_lock_id_for_tenant("tenant_xyz")
-
-    def test_seat_lock_id_is_stable_and_tenant_scoped(self) -> None:
-        """Lock id must be deterministic and differ across tenants."""
-        assert _seat_lock_id_for_tenant("t1") == _seat_lock_id_for_tenant("t1")
-        assert _seat_lock_id_for_tenant("t1") != _seat_lock_id_for_tenant("t2")
-
-    def test_no_lock_when_ee_absent(
-        self,
-        mock_dal: MagicMock,
-    ) -> None:
-        """No advisory lock should be acquired when the EE check is absent."""
-        with patch(
-            "ee.onyx.server.scim.api.fetch_ee_implementation_or_noop",
-            return_value=None,
-        ):
-            result = _check_seat_availability(mock_dal)
-
-        assert result is None
-        mock_dal.session.execute.assert_not_called()
--- a/backend/tests/unit/onyx/tools/test_construct_tools_no_vectordb.py
+++ b/backend/tests/unit/onyx/tools/test_construct_tools_no_vectordb.py
@@ -95,9 +95,9 @@ class TestForceAddSearchToolGuard:
        without a vector DB."""
        import inspect

-        from onyx.tools.tool_constructor import _construct_tools_impl
+        from onyx.tools.tool_constructor import construct_tools

-        source = inspect.getsource(_construct_tools_impl)
+        source = inspect.getsource(construct_tools)
        assert (
            "DISABLE_VECTOR_DB" in source
        ), "construct_tools should reference DISABLE_VECTOR_DB to suppress force-adding SearchTool"
--- a/backend/tests/unit/onyx/tools/test_image_generation_reference_resolution.py
+++ b/backend/tests/unit/onyx/tools/test_image_generation_reference_resolution.py
@@ -1,110 +0,0 @@
-"""Tests for ``ImageGenerationTool._resolve_reference_image_file_ids``.
-
-The resolver turns the LLM's ``reference_image_file_ids`` argument into a
-cleaned list of file IDs to hand to ``_load_reference_images``. It trusts
-the LLM's picks — the LLM can only see file IDs that actually appear in
-the conversation (via ``[attached image — file_id: <id>]`` tags on user
-messages and the JSON returned by prior generate_image calls), so we
-don't re-validate against an allow-list in the tool itself.
-"""
-
-from unittest.mock import MagicMock
-from unittest.mock import patch
-
-import pytest
-
-from onyx.tools.models import ToolCallException
-from onyx.tools.tool_implementations.images.image_generation_tool import (
-    ImageGenerationTool,
-)
-from onyx.tools.tool_implementations.images.image_generation_tool import (
-    REFERENCE_IMAGE_FILE_IDS_FIELD,
-)
-
-
-def _make_tool(
-    supports_reference_images: bool = True,
-    max_reference_images: int = 16,
-) -> ImageGenerationTool:
-    """Construct a tool with a mock provider so no credentials/network are needed."""
-    with patch(
-        "onyx.tools.tool_implementations.images.image_generation_tool.get_image_generation_provider"
-    ) as mock_get_provider:
-        mock_provider = MagicMock()
-        mock_provider.supports_reference_images = supports_reference_images
-        mock_provider.max_reference_images = max_reference_images
-        mock_get_provider.return_value = mock_provider
-
-        return ImageGenerationTool(
-            image_generation_credentials=MagicMock(),
-            tool_id=1,
-            emitter=MagicMock(),
-            model="gpt-image-1",
-            provider="openai",
-        )
-
-
-class TestResolveReferenceImageFileIds:
-    def test_unset_returns_empty_plain_generation(self) -> None:
-        tool = _make_tool()
-        assert tool._resolve_reference_image_file_ids(llm_kwargs={}) == []
-
-    def test_empty_list_is_treated_like_unset(self) -> None:
-        tool = _make_tool()
-        result = tool._resolve_reference_image_file_ids(
-            llm_kwargs={REFERENCE_IMAGE_FILE_IDS_FIELD: []},
-        )
-        assert result == []
-
-    def test_passes_llm_supplied_ids_through(self) -> None:
-        tool = _make_tool()
-        result = tool._resolve_reference_image_file_ids(
-            llm_kwargs={REFERENCE_IMAGE_FILE_IDS_FIELD: ["upload-1", "gen-1"]},
-        )
-        # Order preserved — first entry is the primary edit source.
-        assert result == ["upload-1", "gen-1"]
-
-    def test_invalid_shape_raises(self) -> None:
-        tool = _make_tool()
-        with pytest.raises(ToolCallException):
-            tool._resolve_reference_image_file_ids(
-                llm_kwargs={REFERENCE_IMAGE_FILE_IDS_FIELD: "not-a-list"},
-            )
-
-    def test_non_string_element_raises(self) -> None:
-        tool = _make_tool()
-        with pytest.raises(ToolCallException):
-            tool._resolve_reference_image_file_ids(
-                llm_kwargs={REFERENCE_IMAGE_FILE_IDS_FIELD: ["ok", 123]},
-            )
-
-    def test_deduplicates_preserving_first_occurrence(self) -> None:
-        tool = _make_tool()
-        result = tool._resolve_reference_image_file_ids(
-            llm_kwargs={REFERENCE_IMAGE_FILE_IDS_FIELD: ["gen-1", "gen-2", "gen-1"]},
-        )
-        assert result == ["gen-1", "gen-2"]
-
-    def test_strips_whitespace_and_skips_empty_strings(self) -> None:
-        tool = _make_tool()
-        result = tool._resolve_reference_image_file_ids(
-            llm_kwargs={REFERENCE_IMAGE_FILE_IDS_FIELD: ["  gen-1  ", "", "   "]},
-        )
-        assert result == ["gen-1"]
-
-    def test_provider_without_reference_support_raises(self) -> None:
-        tool = _make_tool(supports_reference_images=False)
-        with pytest.raises(ToolCallException):
-            tool._resolve_reference_image_file_ids(
-                llm_kwargs={REFERENCE_IMAGE_FILE_IDS_FIELD: ["gen-1"]},
-            )
-
-    def test_truncates_to_provider_max_preserving_head(self) -> None:
-        """When the LLM lists more images than the provider allows, keep the
-        HEAD of the list (the primary edit source + earliest extras) rather
-        than the tail, since the LLM put the most important one first."""
-        tool = _make_tool(max_reference_images=2)
-        result = tool._resolve_reference_image_file_ids(
-            llm_kwargs={REFERENCE_IMAGE_FILE_IDS_FIELD: ["a", "b", "c", "d"]},
-        )
-        assert result == ["a", "b"]
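Review note: the tests removed above pin down the contract of `_resolve_reference_image_file_ids`. For readers without the implementation at hand, that contract can be sketched roughly as below. This is a standalone approximation, not the code in `image_generation_tool.py`: the name `resolve_reference_ids`, its parameters, the order of the checks, and the use of `ValueError` in place of `ToolCallException` are all assumptions made for illustration.

```python
from typing import Any


def resolve_reference_ids(
    raw: Any,
    supports_reference_images: bool = True,
    max_reference_images: int = 16,
) -> list[str]:
    """Hypothetical stand-in mirroring the behavior the removed tests assert."""
    # Unset or empty means plain generation with no reference images.
    if not raw:
        return []

    # Anything other than a list of strings is rejected outright.
    if not isinstance(raw, list) or not all(isinstance(x, str) for x in raw):
        raise ValueError("reference image file IDs must be a list of strings")

    # Strip whitespace, skip blanks, and de-duplicate while keeping the
    # first occurrence, so the primary edit source stays at the front.
    cleaned: list[str] = []
    seen: set[str] = set()
    for item in raw:
        file_id = item.strip()
        if file_id and file_id not in seen:
            seen.add(file_id)
            cleaned.append(file_id)

    if cleaned and not supports_reference_images:
        raise ValueError("provider does not support reference images")

    # Keep the head of the list when the provider limit is exceeded.
    return cleaned[:max_reference_images]
```

Truncating from the head rather than the tail matches the removed docstring's reasoning that the LLM lists the most important image first.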
--- a/backend/tests/unit/onyx/tools/test_tool_runner.py
+++ b/backend/tests/unit/onyx/tools/test_tool_runner.py
@@ -1,5 +1,10 @@
+from onyx.chat.models import ChatMessageSimple
+from onyx.chat.models import ToolCallSimple
+from onyx.configs.constants import MessageType
 from onyx.server.query_and_chat.placement import Placement
 from onyx.tools.models import ToolCallKickoff
+from onyx.tools.tool_runner import _extract_image_file_ids_from_tool_response_message
+from onyx.tools.tool_runner import _extract_recent_generated_image_file_ids
 from onyx.tools.tool_runner import _merge_tool_calls


@@ -307,3 +312,62 @@ class TestMergeToolCalls:
        assert len(result) == 1
        # String should be converted to list item
        assert result[0].tool_args["queries"] == ["single_query", "q2"]
+
+
+class TestImageHistoryExtraction:
+    def test_extracts_image_file_ids_from_json_response(self) -> None:
+        msg = '[{"file_id":"img-1","revised_prompt":"v1"},{"file_id":"img-2","revised_prompt":"v2"}]'
+        assert _extract_image_file_ids_from_tool_response_message(msg) == [
+            "img-1",
+            "img-2",
+        ]
+
+    def test_extracts_recent_generated_image_ids_from_history(self) -> None:
+        history = [
+            ChatMessageSimple(
+                message="",
+                token_count=1,
+                message_type=MessageType.ASSISTANT,
+                tool_calls=[
+                    ToolCallSimple(
+                        tool_call_id="call_1",
+                        tool_name="generate_image",
+                        tool_arguments={"prompt": "test"},
+                        token_count=1,
+                    )
+                ],
+            ),
+            ChatMessageSimple(
+                message='[{"file_id":"img-1","revised_prompt":"r1"}]',
+                token_count=1,
+                message_type=MessageType.TOOL_CALL_RESPONSE,
+                tool_call_id="call_1",
+            ),
+        ]
+
+        assert _extract_recent_generated_image_file_ids(history) == ["img-1"]
+
+    def test_ignores_non_image_tool_responses(self) -> None:
+        history = [
+            ChatMessageSimple(
+                message="",
+                token_count=1,
+                message_type=MessageType.ASSISTANT,
+                tool_calls=[
+                    ToolCallSimple(
+                        tool_call_id="call_1",
+                        tool_name="web_search",
+                        tool_arguments={"queries": ["q"]},
+                        token_count=1,
+                    )
+                ],
+            ),
+            ChatMessageSimple(
+                message='[{"file_id":"img-1","revised_prompt":"r1"}]',
+                token_count=1,
+                message_type=MessageType.TOOL_CALL_RESPONSE,
+                tool_call_id="call_1",
+            ),
+        ]
+
+        assert _extract_recent_generated_image_file_ids(history) == []
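Review note: the new `TestImageHistoryExtraction` cases imply a two-step lookup: parse `file_id` values out of a JSON tool response, and only trust responses whose `tool_call_id` belongs to a `generate_image` call. A minimal sketch of that idea follows. It is not the code in `tool_runner.py`; `Msg`, `extract_image_file_ids`, and `extract_generated_image_ids` are hypothetical names, and `Msg` is a simplified stand-in for `ChatMessageSimple`. Any "recent" windowing the real helper may apply is not modeled here.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Msg:
    """Hypothetical, simplified stand-in for ChatMessageSimple."""

    message: str
    kind: str  # "assistant" or "tool_response"
    tool_call_id: Optional[str] = None
    # (tool_call_id, tool_name) pairs issued by an assistant message.
    tool_calls: list[tuple[str, str]] = field(default_factory=list)


def extract_image_file_ids(message: str) -> list[str]:
    """Return the file_id of every object in a JSON list, else an empty list."""
    try:
        payload = json.loads(message)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, list):
        return []
    return [
        item["file_id"]
        for item in payload
        if isinstance(item, dict) and isinstance(item.get("file_id"), str)
    ]


def extract_generated_image_ids(history: list[Msg]) -> list[str]:
    """Collect file IDs only from responses to generate_image calls."""
    image_call_ids: set[str] = set()
    file_ids: list[str] = []
    for msg in history:
        if msg.kind == "assistant":
            image_call_ids.update(
                call_id for call_id, name in msg.tool_calls if name == "generate_image"
            )
        elif msg.kind == "tool_response" and msg.tool_call_id in image_call_ids:
            file_ids.extend(extract_image_file_ids(msg.message))
    return file_ids
```

Matching on `tool_call_id` is what makes the `web_search` case in the tests return an empty list even though its response body happens to look like image output.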
--- a/backend/tests/unit/server/metrics/test_pruning_metrics.py
+++ b/backend/tests/unit/server/metrics/test_pruning_metrics.py
@@ -1,128 +0,0 @@
-"""Tests for pruning-specific Prometheus metrics."""
-
-import pytest
-
-from onyx.server.metrics.pruning_metrics import inc_pruning_rate_limit_error
-from onyx.server.metrics.pruning_metrics import observe_pruning_diff_duration
-from onyx.server.metrics.pruning_metrics import observe_pruning_enumeration_duration
-from onyx.server.metrics.pruning_metrics import PRUNING_DIFF_DURATION
-from onyx.server.metrics.pruning_metrics import PRUNING_ENUMERATION_DURATION
-from onyx.server.metrics.pruning_metrics import PRUNING_RATE_LIMIT_ERRORS
-
-
-class TestObservePruningEnumerationDuration:
-    def test_observes_duration(self) -> None:
-        before = PRUNING_ENUMERATION_DURATION.labels(
-            connector_type="google_drive"
-        )._sum.get()
-
-        observe_pruning_enumeration_duration(10.0, "google_drive")
-
-        after = PRUNING_ENUMERATION_DURATION.labels(
-            connector_type="google_drive"
-        )._sum.get()
-        assert after == pytest.approx(before + 10.0)
-
-    def test_labels_by_connector_type(self) -> None:
-        before_gd = PRUNING_ENUMERATION_DURATION.labels(
-            connector_type="google_drive"
-        )._sum.get()
-        before_conf = PRUNING_ENUMERATION_DURATION.labels(
-            connector_type="confluence"
-        )._sum.get()
-
-        observe_pruning_enumeration_duration(5.0, "google_drive")
-
-        after_gd = PRUNING_ENUMERATION_DURATION.labels(
-            connector_type="google_drive"
-        )._sum.get()
-        after_conf = PRUNING_ENUMERATION_DURATION.labels(
-            connector_type="confluence"
-        )._sum.get()
-
-        assert after_gd == pytest.approx(before_gd + 5.0)
-        assert after_conf == pytest.approx(before_conf)
-
-    def test_does_not_raise_on_exception(self, monkeypatch: pytest.MonkeyPatch) -> None:
-        monkeypatch.setattr(
-            PRUNING_ENUMERATION_DURATION,
-            "labels",
-            lambda **_: (_ for _ in ()).throw(RuntimeError("boom")),
-        )
-        observe_pruning_enumeration_duration(1.0, "google_drive")
-
-
-class TestObservePruningDiffDuration:
-    def test_observes_duration(self) -> None:
-        before = PRUNING_DIFF_DURATION.labels(connector_type="confluence")._sum.get()
-
-        observe_pruning_diff_duration(3.0, "confluence")
-
-        after = PRUNING_DIFF_DURATION.labels(connector_type="confluence")._sum.get()
-        assert after == pytest.approx(before + 3.0)
-
-    def test_labels_by_connector_type(self) -> None:
-        before_conf = PRUNING_DIFF_DURATION.labels(
-            connector_type="confluence"
-        )._sum.get()
-        before_slack = PRUNING_DIFF_DURATION.labels(connector_type="slack")._sum.get()
-
-        observe_pruning_diff_duration(2.0, "confluence")
-
-        after_conf = PRUNING_DIFF_DURATION.labels(
-            connector_type="confluence"
-        )._sum.get()
-        after_slack = PRUNING_DIFF_DURATION.labels(connector_type="slack")._sum.get()
-
-        assert after_conf == pytest.approx(before_conf + 2.0)
-        assert after_slack == pytest.approx(before_slack)
-
-    def test_does_not_raise_on_exception(self, monkeypatch: pytest.MonkeyPatch) -> None:
-        monkeypatch.setattr(
-            PRUNING_DIFF_DURATION,
-            "labels",
-            lambda **_: (_ for _ in ()).throw(RuntimeError("boom")),
-        )
-        observe_pruning_diff_duration(1.0, "confluence")
-
-
-class TestIncPruningRateLimitError:
-    def test_increments_counter(self) -> None:
-        before = PRUNING_RATE_LIMIT_ERRORS.labels(
-            connector_type="google_drive"
-        )._value.get()
-
-        inc_pruning_rate_limit_error("google_drive")
-
-        after = PRUNING_RATE_LIMIT_ERRORS.labels(
-            connector_type="google_drive"
-        )._value.get()
-        assert after == before + 1
-
-    def test_labels_by_connector_type(self) -> None:
-        before_gd = PRUNING_RATE_LIMIT_ERRORS.labels(
-            connector_type="google_drive"
-        )._value.get()
-        before_jira = PRUNING_RATE_LIMIT_ERRORS.labels(
-            connector_type="jira"
-        )._value.get()
-
-        inc_pruning_rate_limit_error("google_drive")
-
-        after_gd = PRUNING_RATE_LIMIT_ERRORS.labels(
-            connector_type="google_drive"
-        )._value.get()
-        after_jira = PRUNING_RATE_LIMIT_ERRORS.labels(
-            connector_type="jira"
-        )._value.get()
-
-        assert after_gd == before_gd + 1
-        assert after_jira == before_jira
-
-    def test_does_not_raise_on_exception(self, monkeypatch: pytest.MonkeyPatch) -> None:
-        monkeypatch.setattr(
-            PRUNING_RATE_LIMIT_ERRORS,
-            "labels",
-            lambda **_: (_ for _ in ()).throw(RuntimeError("boom")),
-        )
-        inc_pruning_rate_limit_error("google_drive")
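Review note: each `test_does_not_raise_on_exception` case removed above checks that a metrics helper swallows failures instead of propagating them into the pruning task. The pattern can be sketched with the standard library alone, as below. `observe_safely` and `always_raises` are hypothetical names, and nothing here uses the Prometheus client or reproduces `pruning_metrics.py`.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def observe_safely(observe: Callable[[float], None], value: float) -> None:
    """Record a metric sample, logging instead of raising on failure."""
    try:
        observe(value)
    except Exception:
        # Metrics are best-effort: a broken collector must not fail the caller.
        logger.debug("Failed to record metric sample", exc_info=True)


def always_raises(*_args: object, **_kwargs: object) -> None:
    # Same trick the removed tests use inside a lambda: throwing into a
    # fresh generator raises immediately, which lets an expression raise.
    (_ for _ in ()).throw(RuntimeError("boom"))
```

The generator trick exists only because a `lambda` body cannot contain a `raise` statement; in a regular function a plain `raise` would do the same job.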
--- a/deployment/helm/charts/onyx/Chart.lock
+++ b/deployment/helm/charts/onyx/Chart.lock
@@ -19,6 +19,6 @@ dependencies:
  version: 5.4.0
 - name: code-interpreter
  repository: https://onyx-dot-app.github.io/python-sandbox/
-  version: 0.3.3
-digest: sha256:a57f29088b1624a72f6c70e4c3ccc2f2aad675e4624278c4e9be92083d6d5dad
-generated: "2026-04-08T16:47:29.33368-07:00"
+  version: 0.3.2
+digest: sha256:74908ea45ace2b4be913ff762772e6d87e40bab64e92c6662aa51730eaeb9d87
+generated: "2026-04-06T15:34:02.597166-07:00"
--- a/deployment/helm/charts/onyx/Chart.yaml
+++ b/deployment/helm/charts/onyx/Chart.yaml
@@ -45,6 +45,6 @@ dependencies:
    repository: https://charts.min.io/
    condition: minio.enabled
  - name: code-interpreter
-    version: 0.3.3
+    version: 0.3.2
    repository: https://onyx-dot-app.github.io/python-sandbox/
    condition: codeInterpreter.enabled
--- a/deployment/helm/charts/onyx/templates/celery-worker-heavy-metrics-service.yaml
+++ b/deployment/helm/charts/onyx/templates/celery-worker-heavy-metrics-service.yaml
@@ -1,26 +0,0 @@
-{{- /* Metrics port must match the default in metrics_server.py (_DEFAULT_PORTS).
-       Do NOT use PROMETHEUS_METRICS_PORT env var in Helm — each worker needs its own port. */ -}}
-{{- if gt (int .Values.celery_worker_heavy.replicaCount) 0 }}
-apiVersion: v1
-kind: Service
-metadata:
-  name: {{ include "onyx.fullname" . }}-celery-worker-heavy-metrics
-  labels:
-    {{- include "onyx.labels" . | nindent 4 }}
-    {{- if .Values.celery_worker_heavy.deploymentLabels }}
-    {{- toYaml .Values.celery_worker_heavy.deploymentLabels | nindent 4 }}
-    {{- end }}
-    metrics: "true"
-spec:
-  type: ClusterIP
-  ports:
-    - port: 9094
-      targetPort: metrics
-      protocol: TCP
-      name: metrics
-  selector:
-    {{- include "onyx.selectorLabels" . | nindent 4 }}
-    {{- if .Values.celery_worker_heavy.deploymentLabels }}
-    {{- toYaml .Values.celery_worker_heavy.deploymentLabels | nindent 4 }}
-    {{- end }}
-{{- end }}
--- a/deployment/helm/charts/onyx/templates/celery-worker-heavy.yaml
+++ b/deployment/helm/charts/onyx/templates/celery-worker-heavy.yaml
@@ -70,10 +70,6 @@ spec:
              "-Q",
              "connector_pruning,connector_doc_permissions_sync,connector_external_group_sync,csv_generation,sandbox",
            ]
-          ports:
-            - name: metrics
-              containerPort: 9094
-              protocol: TCP
          resources:
            {{- toYaml .Values.celery_worker_heavy.resources | nindent 12 }}
          envFrom:
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -28,7 +28,7 @@ dependencies = [
    "kubernetes>=31.0.0",
 ]

-[dependency-groups]
+[project.optional-dependencies]
 # Main backend application dependencies
 backend = [
    "aiohttp==3.13.4",
@@ -148,7 +148,7 @@ dev = [
    "matplotlib==3.10.8",
    "mypy-extensions==1.0.0",
    "mypy==1.13.0",
-    "onyx-devtools==0.7.3",
+    "onyx-devtools==0.7.2",
    "openapi-generator-cli==7.17.0",
    "pandas-stubs~=2.3.3",
    "pre-commit==3.2.2",
@@ -195,9 +195,6 @@ model_server = [
    "sentry-sdk[fastapi,celery,starlette]==2.14.0",
 ]

-[tool.uv]
-default-groups = ["backend", "dev", "ee", "model_server"]
-
 [tool.mypy]
 plugins = "sqlalchemy.ext.mypy.plugin"
 mypy_path = "backend"
@@ -233,7 +230,7 @@ follow_imports = "skip"
 ignore_errors = true

 [tool.uv.workspace]
-members = ["tools/ods"]
+members = ["backend", "tools/ods"]

 [tool.basedpyright]
 include = ["backend"]
--- a/tools/ods/cmd/deploy.go
+++ b/tools/ods/cmd/deploy.go
@@ -1,19 +0,0 @@
-package cmd
-
-import (
-	"github.com/spf13/cobra"
-)
-
-// NewDeployCommand creates the parent `ods deploy` command. Subcommands hang
-// off it (e.g. `ods deploy edge`) and represent ad-hoc deployment workflows.
-func NewDeployCommand() *cobra.Command {
-	cmd := &cobra.Command{
-		Use:   "deploy",
-		Short: "Trigger ad-hoc deployments",
-		Long:  "Trigger ad-hoc deployments to Onyx-managed environments.",
-	}
-
-	cmd.AddCommand(NewDeployEdgeCommand())
-
-	return cmd
-}
--- a/tools/ods/cmd/deploy_edge.go
+++ b/tools/ods/cmd/deploy_edge.go
@@ -1,353 +0,0 @@
-package cmd
-
-import (
-	"encoding/json"
-	"fmt"
-	"os/exec"
-	"sort"
-	"time"
-
-	log "github.com/sirupsen/logrus"
-	"github.com/spf13/cobra"
-
-	"github.com/onyx-dot-app/onyx/tools/ods/internal/config"
-	"github.com/onyx-dot-app/onyx/tools/ods/internal/git"
-	"github.com/onyx-dot-app/onyx/tools/ods/internal/paths"
-	"github.com/onyx-dot-app/onyx/tools/ods/internal/prompt"
-)
-
-const (
-	onyxRepo               = "onyx-dot-app/onyx"
-	deploymentWorkflowFile = "deployment.yml"
-	edgeTagName            = "edge"
-
-	// Polling configuration. Build runs typically take 20-30 minutes; deploys
-	// are much shorter. The "discover" phase polls fast for a short window
-	// because the run usually appears within seconds of pushing the tag /
-	// dispatching the workflow.
-	runDiscoveryInterval = 5 * time.Second
-	runDiscoveryTimeout  = 2 * time.Minute
-	runProgressInterval  = 30 * time.Second
-	buildPollTimeout     = 60 * time.Minute
-	deployPollTimeout    = 30 * time.Minute
-)
-
-// DeployEdgeOptions holds options for the deploy edge command.
-type DeployEdgeOptions struct {
-	TargetRepo     string
-	TargetWorkflow string
-	DryRun         bool
-	Yes            bool
-	NoWaitDeploy   bool
-}
-
-// NewDeployEdgeCommand creates the `ods deploy edge` command.
-func NewDeployEdgeCommand() *cobra.Command {
-	opts := &DeployEdgeOptions{}
-
-	cmd := &cobra.Command{
-		Use:   "edge",
-		Short: "Build edge images off main and deploy to the configured target",
-		Long: `Build edge images off origin/main and dispatch the configured deploy workflow.
-
-This command will:
-  1. Force-push the 'edge' tag to origin/main, triggering the build
-  2. Wait for the build workflow to finish
-  3. Dispatch the configured deploy workflow with version_tag=edge
-  4. Wait for the deploy workflow to finish
-
-All GitHub operations run through the gh CLI, so authorization is enforced
-by your gh credentials and GitHub's repo/workflow permissions.
-
-On first run, you'll be prompted for the deploy target repo and workflow
-filename. These are saved to the ods config file (~/.config/onyx-dev/config.json
-on Linux/macOS) and reused on subsequent runs. Pass --target-repo or
--target-workflow to override the saved values.
-
-Example usage:
-
-    $ ods deploy edge`,
-		Args: cobra.NoArgs,
-		Run: func(cmd *cobra.Command, args []string) {
-			deployEdge(opts)
-		},
-	}
-
-	cmd.Flags().StringVar(&opts.TargetRepo, "target-repo", "", "GitHub repo (owner/name) hosting the deploy workflow; overrides saved config")
-	cmd.Flags().StringVar(&opts.TargetWorkflow, "target-workflow", "", "Filename of the deploy workflow within the target repo; overrides saved config")
-	cmd.Flags().BoolVar(&opts.DryRun, "dry-run", false, "Perform local operations only; skip pushing the tag and dispatching workflows")
-	cmd.Flags().BoolVar(&opts.Yes, "yes", false, "Skip the confirmation prompt")
-	cmd.Flags().BoolVar(&opts.NoWaitDeploy, "no-wait-deploy", false, "Do not wait for the deploy workflow to finish after dispatching it")
-
-	return cmd
-}
-
-func deployEdge(opts *DeployEdgeOptions) {
-	git.CheckGitHubCLI()
-
-	deployRepo, deployWorkflow := resolveDeployTarget(opts)
-
-	if opts.DryRun {
-		log.Warning("=== DRY RUN MODE: tag push and workflow dispatch will be skipped (read-only gh and git fetch still run) ===")
-	}
-
-	if !opts.Yes {
-		msg := "About to force-push tag 'edge' to origin/main and trigger an ad-hoc deploy. Continue? (Y/n): "
-		if !prompt.Confirm(msg) {
-			log.Info("Exiting...")
-			return
-		}
-	}
-
-	// Capture the most recent existing edge build run id BEFORE pushing, so we
-	// can reliably identify the new run we trigger and not pick up a stale one.
-	priorBuildRunID, err := latestWorkflowRunID(onyxRepo, deploymentWorkflowFile, "push", edgeTagName)
-	if err != nil {
-		log.Fatalf("Failed to query existing deployment runs: %v", err)
-	}
-	log.Debugf("Most recent prior edge build run id: %d", priorBuildRunID)
-
-	log.Info("Fetching origin/main...")
-	if err := git.RunCommand("fetch", "origin", "main"); err != nil {
-		log.Fatalf("Failed to fetch origin/main: %v", err)
-	}
-
-	if opts.DryRun {
-		log.Warnf("[DRY RUN] Would move local '%s' tag to origin/main", edgeTagName)
-		log.Warnf("[DRY RUN] Would force-push tag '%s' to origin", edgeTagName)
-		log.Warn("[DRY RUN] Would wait for build then dispatch the configured deploy workflow")
-		return
-	}
-
-	log.Infof("Moving local '%s' tag to origin/main...", edgeTagName)
-	if err := git.RunCommand("tag", "-f", edgeTagName, "origin/main"); err != nil {
-		log.Fatalf("Failed to move local tag: %v", err)
-	}
-
-	log.Infof("Force-pushing tag '%s' to origin...", edgeTagName)
-	if err := git.RunCommand("push", "-f", "origin", edgeTagName); err != nil {
-		log.Fatalf("Failed to push edge tag: %v", err)
-	}
-
-	// Find the new build run, then poll it to completion.
-	log.Info("Waiting for build workflow to start...")
-	buildRun, err := waitForNewRun(onyxRepo, deploymentWorkflowFile, "push", edgeTagName, priorBuildRunID)
-	if err != nil {
-		log.Fatalf("Failed to find triggered build run: %v", err)
-	}
-	log.Infof("Build run started: %s", buildRun.URL)
-
-	if err := waitForRunCompletion(onyxRepo, buildRun.DatabaseID, buildPollTimeout, "build"); err != nil {
-		log.Fatalf("Build did not complete successfully: %v", err)
-	}
-	log.Info("Build completed successfully.")
-
-	// Dispatch the deploy workflow.
-	priorDeployRunID, err := latestWorkflowRunID(deployRepo, deployWorkflow, "workflow_dispatch", "")
-	if err != nil {
-		log.Fatalf("Failed to query existing deploy runs: %v", err)
-	}
-	log.Debugf("Most recent prior deploy run id: %d", priorDeployRunID)
-
-	log.Info("Dispatching deploy workflow with version_tag=edge...")
-	if err := dispatchWorkflow(deployRepo, deployWorkflow, map[string]string{"version_tag": edgeTagName}); err != nil {
-		log.Fatalf("Failed to dispatch deploy workflow: %v", err)
-	}
-
-	deployRun, err := waitForNewRun(deployRepo, deployWorkflow, "workflow_dispatch", "", priorDeployRunID)
-	if err != nil {
-		log.Fatalf("Failed to find dispatched deploy run: %v", err)
-	}
-	log.Infof("Deploy run started: %s", deployRun.URL)
-	log.Info("A kickoff Slack message will appear in #monitor-deployments.")
-
-	if opts.NoWaitDeploy {
-		log.Info("--no-wait-deploy set; not waiting for deploy completion.")
-		return
-	}
-
-	if err := waitForRunCompletion(deployRepo, deployRun.DatabaseID, deployPollTimeout, "deploy"); err != nil {
-		log.Fatalf("Deploy did not complete successfully: %v", err)
-	}
-	log.Info("Deploy completed successfully.")
-}
-
-// resolveDeployTarget returns the deploy target repo and workflow to use,
-// preferring explicit flags, then saved config, then prompting the user on
-// first-time setup. Any newly-prompted values are persisted back to the
-// config file so subsequent runs are non-interactive.
-func resolveDeployTarget(opts *DeployEdgeOptions) (string, string) {
-	cfg, err := config.Load()
-	if err != nil {
-		log.Fatalf("Failed to load ods config: %v", err)
-	}
-
-	repo := opts.TargetRepo
-	if repo == "" {
-		repo = cfg.DeployEdge.TargetRepo
-	}
-	workflow := opts.TargetWorkflow
-	if workflow == "" {
-		workflow = cfg.DeployEdge.TargetWorkflow
-	}
-
-	prompted := false
-	if repo == "" {
-		log.Infof("First-time setup: ods will save your deploy target to %s", paths.ConfigFilePath())
-		repo = prompt.String("Deploy target repo (owner/name): ")
-		prompted = true
-	}
-	if workflow == "" {
-		workflow = prompt.String("Deploy workflow filename (e.g. some-workflow.yml): ")
-		prompted = true
-	}
-
-	if prompted {
-		cfg.DeployEdge.TargetRepo = repo
-		cfg.DeployEdge.TargetWorkflow = workflow
-		if err := config.Save(cfg); err != nil {
-			log.Fatalf("Failed to save ods config: %v", err)
-		}
-		log.Infof("Saved deploy target to %s", paths.ConfigFilePath())
-	}
-
-	return repo, workflow
-}
-
-// workflowRun is a partial representation of a `gh run list` JSON entry.
-type workflowRun struct {
-	DatabaseID int64  `json:"databaseId"`
-	Status     string `json:"status"`
-	Conclusion string `json:"conclusion"`
-	URL        string `json:"url"`
-	Event      string `json:"event"`
-	HeadBranch string `json:"headBranch"`
-}
-
-// latestWorkflowRunID returns the highest databaseId for runs of the given
-// workflow filtered by event (and optional branch). Returns 0 if no runs
-// exist yet, which is a valid state.
-func latestWorkflowRunID(repo, workflowFile, event, branch string) (int64, error) {
-	runs, err := listWorkflowRuns(repo, workflowFile, event, branch, 10)
-	if err != nil {
-		return 0, err
-	}
-	var maxID int64
-	for _, r := range runs {
-		if r.DatabaseID > maxID {
-			maxID = r.DatabaseID
-		}
-	}
-	return maxID, nil
-}
-
-func listWorkflowRuns(repo, workflowFile, event, branch string, limit int) ([]workflowRun, error) {
-	args := []string{
-		"run", "list",
-		"-R", repo,
-		"--workflow", workflowFile,
-		"--limit", fmt.Sprintf("%d", limit),
-		"--json", "databaseId,status,conclusion,url,event,headBranch",
-	}
-	if event != "" {
-		args = append(args, "--event", event)
-	}
-	if branch != "" {
-		args = append(args, "--branch", branch)
-	}
-	cmd := exec.Command("gh", args...)
-	output, err := cmd.Output()
-	if err != nil {
-		if exitErr, ok := err.(*exec.ExitError); ok {
-			return nil, fmt.Errorf("gh run list failed: %w: %s", err, string(exitErr.Stderr))
-		}
-		return nil, fmt.Errorf("gh run list failed: %w", err)
-	}
-	var runs []workflowRun
-	if err := json.Unmarshal(output, &runs); err != nil {
-		return nil, fmt.Errorf("failed to parse gh run list output: %w", err)
-	}
-	// Sort newest-first by databaseId for predictable iteration.
-	sort.Slice(runs, func(i, j int) bool { return runs[i].DatabaseID > runs[j].DatabaseID })
-	return runs, nil
-}
-
-// waitForNewRun polls until a workflow run with databaseId > priorRunID
-// appears, or the discovery timeout fires.
-func waitForNewRun(repo, workflowFile, event, branch string, priorRunID int64) (*workflowRun, error) {
-	deadline := time.Now().Add(runDiscoveryTimeout)
-	for {
-		runs, err := listWorkflowRuns(repo, workflowFile, event, branch, 5)
-		if err != nil {
-			return nil, err
-		}
-		for _, r := range runs {
-			if r.DatabaseID > priorRunID {
-				return &r, nil
-			}
-		}
-		if time.Now().After(deadline) {
-			return nil, fmt.Errorf("no new run appeared within %s", runDiscoveryTimeout)
-		}
-		time.Sleep(runDiscoveryInterval)
-	}
-}
-
-// waitForRunCompletion polls a specific run until it reaches a terminal
-// status. Returns an error if the run does not conclude with success or the
-// timeout fires.
-func waitForRunCompletion(repo string, runID int64, timeout time.Duration, label string) error {
-	deadline := time.Now().Add(timeout)
-	for {
-		run, err := getRun(repo, runID)
-		if err != nil {
-			return err
-		}
-		log.Infof("[%s] run %d status=%s conclusion=%s", label, runID, run.Status, run.Conclusion)
-		if run.Status == "completed" {
-			if run.Conclusion == "success" {
-				return nil
-			}
-			return fmt.Errorf("%s run %d concluded with status %q (see %s)", label, runID, run.Conclusion, run.URL)
-		}
-		if time.Now().After(deadline) {
-			return fmt.Errorf("%s run %d did not complete within %s (see %s)", label, runID, timeout, run.URL)
-		}
-		time.Sleep(runProgressInterval)
-	}
-}
-
-func getRun(repo string, runID int64) (*workflowRun, error) {
-	cmd := exec.Command(
-		"gh", "run", "view", fmt.Sprintf("%d", runID),
-		"-R", repo,
-		"--json", "databaseId,status,conclusion,url,event,headBranch",
-	)
-	output, err := cmd.Output()
-	if err != nil {
-		if exitErr, ok := err.(*exec.ExitError); ok {
-			return nil, fmt.Errorf("gh run view failed: %w: %s", err, string(exitErr.Stderr))
-		}
-		return nil, fmt.Errorf("gh run view failed: %w", err)
-	}
-	var run workflowRun
-	if err := json.Unmarshal(output, &run); err != nil {
-		return nil, fmt.Errorf("failed to parse gh run view output: %w", err)
-	}
-	return &run, nil
-}
-
-// dispatchWorkflow fires a workflow_dispatch event for the given workflow with
-// the supplied string inputs.
-func dispatchWorkflow(repo, workflowFile string, inputs map[string]string) error {
-	args := []string{"workflow", "run", workflowFile, "-R", repo}
-	for k, v := range inputs {
-		args = append(args, "-f", fmt.Sprintf("%s=%s", k, v))
-	}
-	cmd := exec.Command("gh", args...)
-	output, err := cmd.CombinedOutput()
-	if err != nil {
-		return fmt.Errorf("gh workflow run failed: %w: %s", err, string(output))
-	}
-	return nil
-}
--- a/tools/ods/cmd/root.go
+++ b/tools/ods/cmd/root.go
@@ -45,7 +45,6 @@ func NewRootCommand() *cobra.Command {
 	cmd.AddCommand(NewCheckLazyImportsCommand())
 	cmd.AddCommand(NewCherryPickCommand())
 	cmd.AddCommand(NewDBCommand())
-	cmd.AddCommand(NewDeployCommand())
 	cmd.AddCommand(NewOpenAPICommand())
 	cmd.AddCommand(NewComposeCommand())
 	cmd.AddCommand(NewLogsCommand())
--- a/tools/ods/internal/config/config.go
+++ b/tools/ods/internal/config/config.go
@@ -1,56 +0,0 @@
-package config
-
-import (
-	"encoding/json"
-	"errors"
-	"fmt"
-	"os"
-
-	"github.com/onyx-dot-app/onyx/tools/ods/internal/paths"
-)
-
-// DeployEdgeConfig holds the persisted settings for `ods deploy edge`.
-type DeployEdgeConfig struct {
-	TargetRepo     string `json:"target_repo,omitempty"`
-	TargetWorkflow string `json:"target_workflow,omitempty"`
-}
-
-// Config is the top-level on-disk schema for ~/.config/onyx-dev/config.json.
-// New per-command sections should be added as additional fields.
-type Config struct {
-	DeployEdge DeployEdgeConfig `json:"deploy_edge,omitempty"`
-}
-
-// Load reads the config file. Returns a zero-valued Config if the file does
-// not exist (a fresh first-run state, not an error).
-func Load() (*Config, error) {
-	path := paths.ConfigFilePath()
-	data, err := os.ReadFile(path)
-	if err != nil {
-		if errors.Is(err, os.ErrNotExist) {
-			return &Config{}, nil
-		}
-		return nil, fmt.Errorf("failed to read config file %s: %w", path, err)
-	}
-	var cfg Config
-	if err := json.Unmarshal(data, &cfg); err != nil {
-		return nil, fmt.Errorf("failed to parse config file %s: %w", path, err)
-	}
-	return &cfg, nil
-}
-
-// Save persists the config to disk, creating the parent directory if needed.
-func Save(cfg *Config) error {
-	if err := paths.EnsureConfigDir(); err != nil {
-		return fmt.Errorf("failed to create config directory: %w", err)
-	}
-	data, err := json.MarshalIndent(cfg, "", "  ")
-	if err != nil {
-		return fmt.Errorf("failed to marshal config: %w", err)
-	}
-	path := paths.ConfigFilePath()
-	if err := os.WriteFile(path, data, 0644); err != nil {
-		return fmt.Errorf("failed to write config file %s: %w", path, err)
-	}
-	return nil
-}
--- a/tools/ods/internal/paths/paths.go
+++ b/tools/ods/internal/paths/paths.go
@@ -47,43 +47,6 @@ func DataDir() string {
 	return filepath.Join(base, "onyx-dev")
 }

-// ConfigDir returns the per-user config directory for onyx-dev tools.
-// On Linux/macOS: ~/.config/onyx-dev/ (respects XDG_CONFIG_HOME)
-// On Windows:    %APPDATA%/onyx-dev/
-func ConfigDir() string {
-	var base string
-	if runtime.GOOS == "windows" {
-		base = os.Getenv("APPDATA")
-		if base == "" {
-			base = os.Getenv("USERPROFILE")
-			if base == "" {
-				log.Fatalf("Cannot determine config directory: APPDATA and USERPROFILE are not set")
-			}
-			base = filepath.Join(base, "AppData", "Roaming")
-		}
-	} else {
-		base = os.Getenv("XDG_CONFIG_HOME")
-		if base == "" {
-			home, err := os.UserHomeDir()
-			if err != nil || home == "" {
-				log.Fatalf("Cannot determine config directory: XDG_CONFIG_HOME not set and home directory unknown: %v", err)
-			}
-			base = filepath.Join(home, ".config")
-		}
-	}
-	return filepath.Join(base, "onyx-dev")
-}
-
-// ConfigFilePath returns the path to the ods config file.
-func ConfigFilePath() string {
-	return filepath.Join(ConfigDir(), "config.json")
-}
-
-// EnsureConfigDir creates the config directory if it doesn't exist.
-func EnsureConfigDir() error {
-	return os.MkdirAll(ConfigDir(), 0755)
-}
-
 // SnapshotsDir returns the directory for database snapshots.
 func SnapshotsDir() string {
 	return filepath.Join(DataDir(), "snapshots")
--- a/tools/ods/internal/prompt/prompt.go
+++ b/tools/ods/internal/prompt/prompt.go
@@ -12,23 +12,6 @@ import (
 // reader is the input reader, can be replaced for testing
 var reader = bufio.NewReader(os.Stdin)

-// String prompts the user for a free-form line of input. Re-prompts until a
-// non-empty value is entered.
-func String(prompt string) string {
-	for {
-		fmt.Print(prompt)
-		response, err := reader.ReadString('\n')
-		if err != nil {
-			log.Fatalf("Failed to read input: %v", err)
-		}
-		response = strings.TrimSpace(response)
-		if response != "" {
-			return response
-		}
-		fmt.Println("Value cannot be empty.")
-	}
-}
-
 // Confirm prompts the user with a yes/no question and returns true for yes, false for no.
 // It will keep prompting until a valid response is given.
 // Empty input (just pressing Enter) defaults to yes.
--- a/uv.lock
+++ b/uv.lock
@@ -14,6 +14,12 @@ resolution-markers = [
    "python_full_version < '3.12' and sys_platform != 'win32'",
 ]

+[manifest]
+members = [
+    "onyx",
+    "onyx-backend",
+]
+
 [[package]]
 name = "accelerate"
 version = "1.6.0"
@@ -4228,7 +4234,7 @@ dependencies = [
    { name = "voyageai" },
 ]

-[package.dev-dependencies]
+[package.optional-dependencies]
 backend = [
    { name = "aiohttp" },
    { name = "alembic" },
@@ -4382,191 +4388,195 @@ model-server = [

 [package.metadata]
 requires-dist = [
+    { name = "accelerate", marker = "extra == 'model-server'", specifier = "==1.6.0" },
    { name = "agent-client-protocol", specifier = ">=0.7.1" },
    { name = "aioboto3", specifier = "==15.1.0" },
+    { name = "aiohttp", marker = "extra == 'backend'", specifier = "==3.13.4" },
+    { name = "alembic", marker = "extra == 'backend'", specifier = "==1.10.4" },
+    { name = "asana", marker = "extra == 'backend'", specifier = "==5.0.8" },
+    { name = "asyncpg", marker = "extra == 'backend'", specifier = "==0.30.0" },
+    { name = "atlassian-python-api", marker = "extra == 'backend'", specifier = "==3.41.16" },
+    { name = "azure-cognitiveservices-speech", marker = "extra == 'backend'", specifier = "==1.38.0" },
+    { name = "beautifulsoup4", marker = "extra == 'backend'", specifier = "==4.12.3" },
+    { name = "black", marker = "extra == 'dev'", specifier = "==25.1.0" },
+    { name = "boto3", marker = "extra == 'backend'", specifier = "==1.39.11" },
+    { name = "boto3-stubs", extras = ["s3"], marker = "extra == 'backend'", specifier = "==1.39.11" },
+    { name = "braintrust", marker = "extra == 'backend'", specifier = "==0.3.9" },
    { name = "brotli", specifier = ">=1.2.0" },
+    { name = "celery", marker = "extra == 'backend'", specifier = "==5.5.1" },
+    { name = "celery-types", marker = "extra == 'dev'", specifier = "==0.19.0" },
+    { name = "chardet", marker = "extra == 'backend'", specifier = "==5.2.0" },
+    { name = "chonkie", marker = "extra == 'backend'", specifier = "==1.0.10" },
    { name = "claude-agent-sdk", specifier = ">=0.1.19" },
    { name = "cohere", specifier = "==5.6.1" },
+    { name = "dask", marker = "extra == 'backend'", specifier = "==2026.1.1" },
+    { name = "ddtrace", marker = "extra == 'backend'", specifier = "==3.10.0" },
    { name = "discord-py", specifier = "==2.4.0" },
+    { name = "discord-py", marker = "extra == 'backend'", specifier = "==2.4.0" },
+    { name = "distributed", marker = "extra == 'backend'", specifier = "==2026.1.1" },
+    { name = "dropbox", marker = "extra == 'backend'", specifier = "==12.0.2" },
+    { name = "einops", marker = "extra == 'model-server'", specifier = "==0.8.1" },
+    { name = "exa-py", marker = "extra == 'backend'", specifier = "==1.15.4" },
+    { name = "faker", marker = "extra == 'dev'", specifier = "==40.1.2" },
    { name = "fastapi", specifier = "==0.133.1" },
+    { name = "fastapi-limiter", marker = "extra == 'backend'", specifier = "==0.1.6" },
+    { name = "fastapi-users", marker = "extra == 'backend'", specifier = "==15.0.4" },
+    { name = "fastapi-users-db-sqlalchemy", marker = "extra == 'backend'", specifier = "==7.0.0" },
+    { name = "fastmcp", marker = "extra == 'backend'", specifier = "==3.2.0" },
+    { name = "filelock", marker = "extra == 'backend'", specifier = "==3.20.3" },
+    { name = "google-api-python-client", marker = "extra == 'backend'", specifier = "==2.86.0" },
+    { name = "google-auth-httplib2", marker = "extra == 'backend'", specifier = "==0.1.0" },
+    { name = "google-auth-oauthlib", marker = "extra == 'backend'", specifier = "==1.0.0" },
    { name = "google-genai", specifier = "==1.52.0" },
+    { name = "hatchling", marker = "extra == 'dev'", specifier = "==1.28.0" },
+    { name = "httpcore", marker = "extra == 'backend'", specifier = "==1.0.9" },
+    { name = "httpx", extras = ["http2"], marker = "extra == 'backend'", specifier = "==0.28.1" },
+    { name = "httpx-oauth", marker = "extra == 'backend'", specifier = "==0.15.1" },
+    { name = "hubspot-api-client", marker = "extra == 'backend'", specifier = "==11.1.0" },
+    { name = "huggingface-hub", marker = "extra == 'backend'", specifier = "==0.35.3" },
+    { name = "inflection", marker = "extra == 'backend'", specifier = "==0.5.1" },
+    { name = "ipykernel", marker = "extra == 'dev'", specifier = "==6.29.5" },
+    { name = "jira", marker = "extra == 'backend'", specifier = "==3.10.5" },
+    { name = "jsonref", marker = "extra == 'backend'", specifier = "==1.1.0" },
    { name = "kubernetes", specifier = ">=31.0.0" },
+    { name = "kubernetes", marker = "extra == 'backend'", specifier = "==31.0.0" },
+    { name = "langchain-core", marker = "extra == 'backend'", specifier = "==1.2.22" },
+    { name = "langfuse", marker = "extra == 'backend'", specifier = "==3.10.0" },
+    { name = "lazy-imports", marker = "extra == 'backend'", specifier = "==1.0.1" },
    { name = "litellm", specifier = "==1.81.6" },
+    { name = "lxml", marker = "extra == 'backend'", specifier = "==5.3.0" },
+    { name = "mako", marker = "extra == 'backend'", specifier = "==1.2.4" },
+    { name = "manygo", marker = "extra == 'dev'", specifier = "==0.2.0" },
+    { name = "markitdown", extras = ["pdf", "docx", "pptx", "xlsx", "xls"], marker = "extra == 'backend'", specifier = "==0.1.2" },
+    { name = "matplotlib", marker = "extra == 'dev'", specifier = "==3.10.8" },
+    { name = "mcp", extras = ["cli"], marker = "extra == 'backend'", specifier = "==1.26.0" },
+    { name = "mistune", marker = "extra == 'backend'", specifier = "==3.2.0" },
+    { name = "msal", marker = "extra == 'backend'", specifier = "==1.34.0" },
+    { name = "msoffcrypto-tool", marker = "extra == 'backend'", specifier = "==5.4.2" },
+    { name = "mypy", marker = "extra == 'dev'", specifier = "==1.13.0" },
+    { name = "mypy-extensions", marker = "extra == 'dev'", specifier = "==1.0.0" },
+    { name = "nest-asyncio", marker = "extra == 'backend'", specifier = "==1.6.0" },
+    { name = "numpy", marker = "extra == 'model-server'", specifier = "==2.4.1" },
+    { name = "oauthlib", marker = "extra == 'backend'", specifier = "==3.2.2" },
+    { name = "office365-rest-python-client", marker = "extra == 'backend'", specifier = "==2.6.2" },
+    { name = "onyx-devtools", marker = "extra == 'dev'", specifier = "==0.7.2" },
    { name = "openai", specifier = "==2.14.0" },
+    { name = "openapi-generator-cli", marker = "extra == 'dev'", specifier = "==7.17.0" },
+    { name = "openinference-instrumentation", marker = "extra == 'backend'", specifier = "==0.1.42" },
+    { name = "openpyxl", marker = "extra == 'backend'", specifier = "==3.0.10" },
+    { name = "opensearch-py", marker = "extra == 'backend'", specifier = "==3.0.0" },
+    { name = "opentelemetry-proto", marker = "extra == 'backend'", specifier = ">=1.39.0" },
+    { name = "pandas-stubs", marker = "extra == 'dev'", specifier = "~=2.3.3" },
+    { name = "passlib", marker = "extra == 'backend'", specifier = "==1.7.4" },
+    { name = "playwright", marker = "extra == 'backend'", specifier = "==1.55.0" },
+    { name = "posthog", marker = "extra == 'ee'", specifier = "==3.7.4" },
+    { name = "pre-commit", marker = "extra == 'dev'", specifier = "==3.2.2" },
    { name = "prometheus-client", specifier = ">=0.21.1" },
    { name = "prometheus-fastapi-instrumentator", specifier = "==7.1.0" },
+    { name = "psutil", marker = "extra == 'backend'", specifier = "==7.1.3" },
+    { name = "psycopg2-binary", marker = "extra == 'backend'", specifier = "==2.9.9" },
+    { name = "puremagic", marker = "extra == 'backend'", specifier = "==1.28" },
+    { name = "pyairtable", marker = "extra == 'backend'", specifier = "==3.0.1" },
+    { name = "pycryptodome", marker = "extra == 'backend'", specifier = "==3.19.1" },
    { name = "pydantic", specifier = "==2.11.7" },
+    { name = "pygithub", marker = "extra == 'backend'", specifier = "==2.5.0" },
+    { name = "pympler", marker = "extra == 'backend'", specifier = "==1.1" },
+    { name = "pypandoc-binary", marker = "extra == 'backend'", specifier = "==1.16.2" },
+    { name = "pypdf", marker = "extra == 'backend'", specifier = "==6.9.2" },
+    { name = "pytest", marker = "extra == 'dev'", specifier = "==8.3.5" },
+    { name = "pytest-alembic", marker = "extra == 'dev'", specifier = "==0.12.1" },
+    { name = "pytest-asyncio", marker = "extra == 'dev'", specifier = "==1.3.0" },
+    { name = "pytest-dotenv", marker = "extra == 'dev'", specifier = "==0.5.2" },
+    { name = "pytest-mock", marker = "extra == 'backend'", specifier = "==3.12.0" },
+    { name = "pytest-playwright", marker = "extra == 'backend'", specifier = "==0.7.0" },
+    { name = "pytest-repeat", marker = "extra == 'dev'", specifier = "==0.9.4" },
+    { name = "pytest-xdist", marker = "extra == 'dev'", specifier = "==3.8.0" },
+    { name = "python-dateutil", marker = "extra == 'backend'", specifier = "==2.8.2" },
+    { name = "python-docx", marker = "extra == 'backend'", specifier = "==1.1.2" },
+    { name = "python-dotenv", marker = "extra == 'backend'", specifier = "==1.1.1" },
+    { name = "python-gitlab", marker = "extra == 'backend'", specifier = "==5.6.0" },
+    { name = "python-multipart", marker = "extra == 'backend'", specifier = "==0.0.22" },
+    { name = "python-pptx", marker = "extra == 'backend'", specifier = "==0.6.23" },
+    { name = "python3-saml", marker = "extra == 'backend'", specifier = "==1.15.0" },
+    { name = "pywikibot", marker = "extra == 'backend'", specifier = "==9.0.0" },
+    { name = "rapidfuzz", marker = "extra == 'backend'", specifier = "==3.13.0" },
+    { name = "redis", marker = "extra == 'backend'", specifier = "==5.0.8" },
+    { name = "release-tag", marker = "extra == 'dev'", specifier = "==0.5.2" },
+    { name = "reorder-python-imports-black", marker = "extra == 'dev'", specifier = "==3.14.0" },
+    { name = "requests", marker = "extra == 'backend'", specifier = "==2.33.0" },
+    { name = "requests-oauthlib", marker = "extra == 'backend'", specifier = "==1.3.1" },
    { name = "retry", specifier = "==0.9.2" },
+    { name = "rfc3986", marker = "extra == 'backend'", specifier = "==1.5.0" },
+    { name = "ruff", marker = "extra == 'dev'", specifier = "==0.12.0" },
+    { name = "safetensors", marker = "extra == 'model-server'", specifier = "==0.5.3" },
+    { name = "sendgrid", marker = "extra == 'backend'", specifier = "==6.12.5" },
+    { name = "sentence-transformers", marker = "extra == 'model-server'", specifier = "==4.0.2" },
    { name = "sentry-sdk", specifier = "==2.14.0" },
+    { name = "sentry-sdk", extras = ["fastapi", "celery", "starlette"], marker = "extra == 'model-server'", specifier = "==2.14.0" },
+    { name = "shapely", marker = "extra == 'backend'", specifier = "==2.0.6" },
+    { name = "simple-salesforce", marker = "extra == 'backend'", specifier = "==1.12.6" },
+    { name = "slack-sdk", marker = "extra == 'backend'", specifier = "==3.20.2" },
+    { name = "sqlalchemy", extras = ["mypy"], marker = "extra == 'backend'", specifier = "==2.0.15" },
+    { name = "starlette", marker = "extra == 'backend'", specifier = "==0.49.3" },
+    { name = "stripe", marker = "extra == 'backend'", specifier = "==10.12.0" },
+    { name = "supervisor", marker = "extra == 'backend'", specifier = "==4.3.0" },
+    { name = "tiktoken", marker = "extra == 'backend'", specifier = "==0.7.0" },
+    { name = "timeago", marker = "extra == 'backend'", specifier = "==1.0.16" },
+    { name = "torch", marker = "extra == 'model-server'", specifier = "==2.9.1" },
+    { name = "trafilatura", marker = "extra == 'backend'", specifier = "==1.12.2" },
+    { name = "transformers", marker = "extra == 'model-server'", specifier = "==4.53.0" },
+    { name = "types-beautifulsoup4", marker = "extra == 'dev'", specifier = "==4.12.0.3" },
+    { name = "types-html5lib", marker = "extra == 'dev'", specifier = "==1.1.11.13" },
+    { name = "types-oauthlib", marker = "extra == 'dev'", specifier = "==3.2.0.9" },
+    { name = "types-openpyxl", marker = "extra == 'backend'", specifier = "==3.0.4.7" },
+    { name = "types-passlib", marker = "extra == 'dev'", specifier = "==1.7.7.20240106" },
+    { name = "types-pillow", marker = "extra == 'dev'", specifier = "==10.2.0.20240822" },
+    { name = "types-psutil", marker = "extra == 'dev'", specifier = "==7.1.3.20251125" },
+    { name = "types-psycopg2", marker = "extra == 'dev'", specifier = "==2.9.21.10" },
+    { name = "types-python-dateutil", marker = "extra == 'dev'", specifier = "==2.8.19.13" },
+    { name = "types-pytz", marker = "extra == 'dev'", specifier = "==2023.3.1.1" },
+    { name = "types-pyyaml", marker = "extra == 'dev'", specifier = "==6.0.12.11" },
+    { name = "types-regex", marker = "extra == 'dev'", specifier = "==2023.3.23.1" },
+    { name = "types-requests", marker = "extra == 'dev'", specifier = "==2.32.0.20250328" },
+    { name = "types-retry", marker = "extra == 'dev'", specifier = "==0.9.9.3" },
+    { name = "types-setuptools", marker = "extra == 'dev'", specifier = "==68.0.0.3" },
+    { name = "unstructured", marker = "extra == 'backend'", specifier = "==0.18.27" },
+    { name = "unstructured-client", marker = "extra == 'backend'", specifier = "==0.42.6" },
+    { name = "urllib3", marker = "extra == 'backend'", specifier = "==2.6.3" },
    { name = "uvicorn", specifier = "==0.35.0" },
    { name = "voyageai", specifier = "==0.2.3" },
+    { name = "xmlsec", marker = "extra == 'backend'", specifier = "==1.3.14" },
+    { name = "zizmor", marker = "extra == 'dev'", specifier = "==1.18.0" },
+    { name = "zulip", marker = "extra == 'backend'", specifier = "==0.8.2" },
+]
+provides-extras = ["backend", "dev", "ee", "model-server"]
+
+[[package]]
+name = "onyx-backend"
+version = "0.0.0"
+source = { virtual = "backend" }
+dependencies = [
+    { name = "onyx", extra = ["backend", "dev", "ee"] },
 ]

-[package.metadata.requires-dev]
-backend = [
-    { name = "aiohttp", specifier = "==3.13.4" },
-    { name = "alembic", specifier = "==1.10.4" },
-    { name = "asana", specifier = "==5.0.8" },
-    { name = "asyncpg", specifier = "==0.30.0" },
-    { name = "atlassian-python-api", specifier = "==3.41.16" },
-    { name = "azure-cognitiveservices-speech", specifier = "==1.38.0" },
-    { name = "beautifulsoup4", specifier = "==4.12.3" },
-    { name = "boto3", specifier = "==1.39.11" },
-    { name = "boto3-stubs", extras = ["s3"], specifier = "==1.39.11" },
-    { name = "braintrust", specifier = "==0.3.9" },
-    { name = "celery", specifier = "==5.5.1" },
-    { name = "chardet", specifier = "==5.2.0" },
-    { name = "chonkie", specifier = "==1.0.10" },
-    { name = "dask", specifier = "==2026.1.1" },
-    { name = "ddtrace", specifier = "==3.10.0" },
-    { name = "discord-py", specifier = "==2.4.0" },
-    { name = "distributed", specifier = "==2026.1.1" },
-    { name = "dropbox", specifier = "==12.0.2" },
-    { name = "exa-py", specifier = "==1.15.4" },
-    { name = "fastapi-limiter", specifier = "==0.1.6" },
-    { name = "fastapi-users", specifier = "==15.0.4" },
-    { name = "fastapi-users-db-sqlalchemy", specifier = "==7.0.0" },
-    { name = "fastmcp", specifier = "==3.2.0" },
-    { name = "filelock", specifier = "==3.20.3" },
-    { name = "google-api-python-client", specifier = "==2.86.0" },
-    { name = "google-auth-httplib2", specifier = "==0.1.0" },
-    { name = "google-auth-oauthlib", specifier = "==1.0.0" },
-    { name = "httpcore", specifier = "==1.0.9" },
-    { name = "httpx", extras = ["http2"], specifier = "==0.28.1" },
-    { name = "httpx-oauth", specifier = "==0.15.1" },
-    { name = "hubspot-api-client", specifier = "==11.1.0" },
-    { name = "huggingface-hub", specifier = "==0.35.3" },
-    { name = "inflection", specifier = "==0.5.1" },
-    { name = "jira", specifier = "==3.10.5" },
-    { name = "jsonref", specifier = "==1.1.0" },
-    { name = "kubernetes", specifier = "==31.0.0" },
-    { name = "langchain-core", specifier = "==1.2.22" },
-    { name = "langfuse", specifier = "==3.10.0" },
-    { name = "lazy-imports", specifier = "==1.0.1" },
-    { name = "lxml", specifier = "==5.3.0" },
-    { name = "mako", specifier = "==1.2.4" },
-    { name = "markitdown", extras = ["pdf", "docx", "pptx", "xlsx", "xls"], specifier = "==0.1.2" },
-    { name = "mcp", extras = ["cli"], specifier = "==1.26.0" },
-    { name = "mistune", specifier = "==3.2.0" },
-    { name = "msal", specifier = "==1.34.0" },
-    { name = "msoffcrypto-tool", specifier = "==5.4.2" },
-    { name = "nest-asyncio", specifier = "==1.6.0" },
-    { name = "oauthlib", specifier = "==3.2.2" },
-    { name = "office365-rest-python-client", specifier = "==2.6.2" },
-    { name = "openinference-instrumentation", specifier = "==0.1.42" },
-    { name = "openpyxl", specifier = "==3.0.10" },
-    { name = "opensearch-py", specifier = "==3.0.0" },
-    { name = "opentelemetry-proto", specifier = ">=1.39.0" },
-    { name = "passlib", specifier = "==1.7.4" },
-    { name = "playwright", specifier = "==1.55.0" },
-    { name = "psutil", specifier = "==7.1.3" },
-    { name = "psycopg2-binary", specifier = "==2.9.9" },
-    { name = "puremagic", specifier = "==1.28" },
-    { name = "pyairtable", specifier = "==3.0.1" },
-    { name = "pycryptodome", specifier = "==3.19.1" },
-    { name = "pygithub", specifier = "==2.5.0" },
-    { name = "pympler", specifier = "==1.1" },
-    { name = "pypandoc-binary", specifier = "==1.16.2" },
-    { name = "pypdf", specifier = "==6.9.2" },
-    { name = "pytest-mock", specifier = "==3.12.0" },
-    { name = "pytest-playwright", specifier = "==0.7.0" },
-    { name = "python-dateutil", specifier = "==2.8.2" },
-    { name = "python-docx", specifier = "==1.1.2" },
-    { name = "python-dotenv", specifier = "==1.1.1" },
-    { name = "python-gitlab", specifier = "==5.6.0" },
-    { name = "python-multipart", specifier = "==0.0.22" },
-    { name = "python-pptx", specifier = "==0.6.23" },
-    { name = "python3-saml", specifier = "==1.15.0" },
-    { name = "pywikibot", specifier = "==9.0.0" },
-    { name = "rapidfuzz", specifier = "==3.13.0" },
-    { name = "redis", specifier = "==5.0.8" },
-    { name = "requests", specifier = "==2.33.0" },
-    { name = "requests-oauthlib", specifier = "==1.3.1" },
-    { name = "rfc3986", specifier = "==1.5.0" },
-    { name = "sendgrid", specifier = "==6.12.5" },
-    { name = "shapely", specifier = "==2.0.6" },
-    { name = "simple-salesforce", specifier = "==1.12.6" },
-    { name = "slack-sdk", specifier = "==3.20.2" },
-    { name = "sqlalchemy", extras = ["mypy"], specifier = "==2.0.15" },
-    { name = "starlette", specifier = "==0.49.3" },
-    { name = "stripe", specifier = "==10.12.0" },
-    { name = "supervisor", specifier = "==4.3.0" },
-    { name = "tiktoken", specifier = "==0.7.0" },
-    { name = "timeago", specifier = "==1.0.16" },
-    { name = "trafilatura", specifier = "==1.12.2" },
-    { name = "types-openpyxl", specifier = "==3.0.4.7" },
-    { name = "unstructured", specifier = "==0.18.27" },
-    { name = "unstructured-client", specifier = "==0.42.6" },
-    { name = "urllib3", specifier = "==2.6.3" },
-    { name = "xmlsec", specifier = "==1.3.14" },
-    { name = "zulip", specifier = "==0.8.2" },
-]
-dev = [
-    { name = "black", specifier = "==25.1.0" },
-    { name = "celery-types", specifier = "==0.19.0" },
-    { name = "faker", specifier = "==40.1.2" },
-    { name = "hatchling", specifier = "==1.28.0" },
-    { name = "ipykernel", specifier = "==6.29.5" },
-    { name = "manygo", specifier = "==0.2.0" },
-    { name = "matplotlib", specifier = "==3.10.8" },
-    { name = "mypy", specifier = "==1.13.0" },
-    { name = "mypy-extensions", specifier = "==1.0.0" },
-    { name = "onyx-devtools", specifier = "==0.7.3" },
-    { name = "openapi-generator-cli", specifier = "==7.17.0" },
-    { name = "pandas-stubs", specifier = "~=2.3.3" },
-    { name = "pre-commit", specifier = "==3.2.2" },
-    { name = "pytest", specifier = "==8.3.5" },
-    { name = "pytest-alembic", specifier = "==0.12.1" },
-    { name = "pytest-asyncio", specifier = "==1.3.0" },
-    { name = "pytest-dotenv", specifier = "==0.5.2" },
-    { name = "pytest-repeat", specifier = "==0.9.4" },
-    { name = "pytest-xdist", specifier = "==3.8.0" },
-    { name = "release-tag", specifier = "==0.5.2" },
-    { name = "reorder-python-imports-black", specifier = "==3.14.0" },
-    { name = "ruff", specifier = "==0.12.0" },
-    { name = "types-beautifulsoup4", specifier = "==4.12.0.3" },
-    { name = "types-html5lib", specifier = "==1.1.11.13" },
-    { name = "types-oauthlib", specifier = "==3.2.0.9" },
-    { name = "types-passlib", specifier = "==1.7.7.20240106" },
-    { name = "types-pillow", specifier = "==10.2.0.20240822" },
-    { name = "types-psutil", specifier = "==7.1.3.20251125" },
-    { name = "types-psycopg2", specifier = "==2.9.21.10" },
-    { name = "types-python-dateutil", specifier = "==2.8.19.13" },
-    { name = "types-pytz", specifier = "==2023.3.1.1" },
-    { name = "types-pyyaml", specifier = "==6.0.12.11" },
-    { name = "types-regex", specifier = "==2023.3.23.1" },
-    { name = "types-requests", specifier = "==2.32.0.20250328" },
-    { name = "types-retry", specifier = "==0.9.9.3" },
-    { name = "types-setuptools", specifier = "==68.0.0.3" },
-    { name = "zizmor", specifier = "==1.18.0" },
-]
-ee = [{ name = "posthog", specifier = "==3.7.4" }]
-model-server = [
-    { name = "accelerate", specifier = "==1.6.0" },
-    { name = "einops", specifier = "==0.8.1" },
-    { name = "numpy", specifier = "==2.4.1" },
-    { name = "safetensors", specifier = "==0.5.3" },
-    { name = "sentence-transformers", specifier = "==4.0.2" },
-    { name = "sentry-sdk", extras = ["fastapi", "celery", "starlette"], specifier = "==2.14.0" },
-    { name = "torch", specifier = "==2.9.1" },
-    { name = "transformers", specifier = "==4.53.0" },
-]
+[package.metadata]
+requires-dist = [{ name = "onyx", extras = ["backend", "dev", "ee"], editable = "." }]

 [[package]]
 name = "onyx-devtools"
-version = "0.7.3"
+version = "0.7.2"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
    { name = "fastapi" },
    { name = "openapi-generator-cli" },
 ]
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/72/64/c75be8ab325896cc64bccd0e1e139a03ce305bf05598967922d380fc4694/onyx_devtools-0.7.3-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:675e2fdbd8d291fba4b8a6dfcf2bc94c56d22d11f395a9f0d0c3c0e5b39d7f9b", size = 4220613, upload-time = "2026-04-09T00:04:36.624Z" },
-    { url = "https://files.pythonhosted.org/packages/ae/1f/589ff6bd446c4498f5bcdfd2a315709e91fc15edf5440c91ff64cbf0800f/onyx_devtools-0.7.3-py3-none-macosx_11_0_arm64.whl", hash = "sha256:bf3993de8ba02d6c2f1ab12b5b9b965e005040b37502f97db8a7d88d9b0cde4b", size = 3897867, upload-time = "2026-04-09T00:04:40.781Z" },
-    { url = "https://files.pythonhosted.org/packages/10/c0/53c9173eefc13218707282c5b99753960d039684994c3b3caf90ce286094/onyx_devtools-0.7.3-py3-none-manylinux_2_17_aarch64.whl", hash = "sha256:6138a94084bed05c674ad210a0bc4006c43bc4384e8eb54d469233de85c72bd7", size = 3762408, upload-time = "2026-04-09T00:04:41.592Z" },
-    { url = "https://files.pythonhosted.org/packages/d2/37/69fadb65112854a596d200f704da94b837817d4dd0f46cb4482dc0309c94/onyx_devtools-0.7.3-py3-none-manylinux_2_17_x86_64.whl", hash = "sha256:90dac91b0cdc32eb8861f6e83545009a34c439fd3c41fc7dd499acd0105b660e", size = 4184427, upload-time = "2026-04-09T00:04:41.525Z" },
-    { url = "https://files.pythonhosted.org/packages/bd/45/91c829ccb45f1a15e7c9641eccc6dd154adb540e03c7dee2a8f28cea24d0/onyx_devtools-0.7.3-py3-none-win_amd64.whl", hash = "sha256:abc68d70bec06e349481beec4b212de28a1a8b7ed6ef3b41daf7093ee10b44f3", size = 4299935, upload-time = "2026-04-09T00:04:40.262Z" },
-    { url = "https://files.pythonhosted.org/packages/cc/30/c5adcb8e3b46b71d8d92c3f9ee0c1d0bc5e2adc9f46e93931f21b36a3ee4/onyx_devtools-0.7.3-py3-none-win_arm64.whl", hash = "sha256:9e4411cadc5e81fabc9ed991402e3b4b40f02800681299c277b2142e5af0dcee", size = 3840228, upload-time = "2026-04-09T00:04:39.708Z" },
+    { url = "https://files.pythonhosted.org/packages/22/b0/765ed49157470e8ccc8ab89e6a896ade50cde3aa2a494662ad4db92a48c4/onyx_devtools-0.7.2-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:553a2b5e61b29b7913c991c8d5aed78f930f0f81a0f42229c6a8de2b1e8ff57e", size = 4203859, upload-time = "2026-03-27T15:09:49.63Z" },
+    { url = "https://files.pythonhosted.org/packages/f7/9d/bba0a44a16d2fc27e5441aaf10727e10514e7a49bce70eca02bced566eb9/onyx_devtools-0.7.2-py3-none-macosx_11_0_arm64.whl", hash = "sha256:5cf0782dca8b3d861de9e18e65e990cfce5161cd559df44d8fabd3fefd54fdcd", size = 3879750, upload-time = "2026-03-27T15:09:42.413Z" },
+    { url = "https://files.pythonhosted.org/packages/4d/d8/c5725e8af14c74fe0aeed29e4746400bb3c0a078fd1240df729dc6432b84/onyx_devtools-0.7.2-py3-none-manylinux_2_17_aarch64.whl", hash = "sha256:9a0d67373e16b4fbb38a5290c0d9dfd4cfa837e5da0c165b32841b9d37f7455b", size = 3743529, upload-time = "2026-03-27T15:09:44.546Z" },
+    { url = "https://files.pythonhosted.org/packages/1a/82/b7c398a21dbc3e14fd7a29e49caa86b1bc0f8d7c75c051514785441ab779/onyx_devtools-0.7.2-py3-none-manylinux_2_17_x86_64.whl", hash = "sha256:794af14b2de575d0ae41b94551399eca8f8ba9b950c5db7acb7612767fd228f9", size = 4166562, upload-time = "2026-03-27T15:09:49.471Z" },
+    { url = "https://files.pythonhosted.org/packages/26/76/be129e2baafc91fe792d919b1f4d73fc943ba9c2b728a60f1fb98e0c115a/onyx_devtools-0.7.2-py3-none-win_amd64.whl", hash = "sha256:83b3eb84df58d865e4f714222a5fab3ea464836e2c8690569454a940bbb651ff", size = 4282270, upload-time = "2026-03-27T15:09:44.676Z" },
+    { url = "https://files.pythonhosted.org/packages/3b/72/29b8c8dbcf069c56475f00511f04c4aaa5ba3faba1dfc8276107d4b3ef7f/onyx_devtools-0.7.2-py3-none-win_arm64.whl", hash = "sha256:62f0836624ee6a5b31e64fd93162e7fce142ac8a4f959607e411824bc2b88174", size = 3823053, upload-time = "2026-03-27T15:09:43.546Z" },
 ]

 [[package]]
--- a/web/lib/opal/scripts/README.md
+++ b/web/lib/opal/scripts/README.md
@@ -14,23 +14,21 @@ All scripts in this directory should be run from the **opal package root** (`web
 web/lib/opal/
 ├── scripts/                          # SVG conversion tooling (this directory)
 │   ├── convert-svg.sh                # Converts SVGs into React components
-│   └── icon-template.js              # Shared SVGR template (used for icons, logos, and illustrations)
+│   └── icon-template.js              # Shared SVGR template (used for both icons and illustrations)
 ├── src/
 │   ├── icons/                        # Small, single-colour icons (stroke = currentColor)
-│   ├── logos/                        # Brand/vendor logos (original colours preserved)
 │   └── illustrations/                # Larger, multi-colour illustrations (colours preserved)
 └── package.json
 ```

-## Icons vs Logos vs Illustrations
+## Icons vs Illustrations

-| | Icons | Logos | Illustrations |
-|---|---|---|---|
-| **Import path** | `@opal/icons` | `@opal/logos` | `@opal/illustrations` |
-| **Location** | `src/icons/` | `src/logos/` | `src/illustrations/` |
-| **Colour** | Overridable via `currentColor` | Fixed — original brand colours preserved | Fixed — original SVG colours preserved |
-| **Script flag** | (none) | `--logo` | `--illustration` |
-| **Use case** | UI elements, actions, navigation | Provider logos, platform logos, brand marks | Empty states, error pages, placeholders |
+| | Icons | Illustrations |
+|---|---|---|
+| **Import path** | `@opal/icons` | `@opal/illustrations` |
+| **Location** | `src/icons/` | `src/illustrations/` |
+| **Colour** | Overridable via `currentColor` | Fixed — original SVG colours preserved |
+| **Script flag** | (none) | `--illustration` |

 ## Files in This Directory

@@ -51,19 +49,12 @@ Converts an SVG into a React component. Behaviour depends on the mode:
 - Adds `width={size}`, `height={size}`, and `stroke="currentColor"`
 - Result is colour-overridable via CSS `color` property

-**Logo mode** (`--logo`):
-- Strips only `width` and `height` attributes (all colours preserved)
-- Adds `width={size}` and `height={size}`
-- Does **not** add `stroke="currentColor"` — logos keep their original brand colours
-
 **Illustration mode** (`--illustration`):
 - Strips only `width` and `height` attributes (all colours preserved)
 - Adds `width={size}` and `height={size}`
 - Does **not** add `stroke="currentColor"` — illustrations keep their original colours

-Both `--logo` and `--illustration` produce the same output — the distinction is purely organizational (different directories, different barrel exports).
-
-All modes automatically delete the source SVG file after successful conversion.
+Both modes automatically delete the source SVG file after successful conversion.

 ## Adding New SVGs

@@ -79,18 +70,6 @@ Then add the export to `src/icons/index.ts`:
 export { default as SvgMyIcon } from "@opal/icons/my-icon";
 ```

-### Logos
-
-```sh
-# From web/lib/opal/
-./scripts/convert-svg.sh --logo src/logos/my-logo.svg
-```
-
-Then add the export to `src/logos/index.ts`:
-```ts
-export { default as SvgMyLogo } from "@opal/logos/my-logo";
-```
-
 ### Illustrations

 ```sh
@@ -112,7 +91,7 @@ If you prefer to run the SVGR command directly:
 bunx @svgr/cli <file>.svg --typescript --svgo-config '{"plugins":[{"name":"removeAttrs","params":{"attrs":["stroke","stroke-opacity","width","height"]}}]}' --template scripts/icon-template.js > <file>.tsx
 ```

-**For logos and illustrations** (preserves colours):
+**For illustrations** (preserves colours):
 ```sh
 bunx @svgr/cli <file>.svg --typescript --svgo-config '{"plugins":[{"name":"removeAttrs","params":{"attrs":["width","height"]}}]}' --template scripts/icon-template.js > <file>.tsx
 ```
--- a/web/lib/opal/scripts/convert-svg.sh
+++ b/web/lib/opal/scripts/convert-svg.sh
@@ -4,36 +4,30 @@
 #
 # By default, converts to a colour-overridable icon (stroke colours stripped, replaced with currentColor).
 # With --illustration, converts to a fixed-colour illustration (all original colours preserved).
-# With --logo, converts to a fixed-colour logo (all original colours preserved, same as illustration).
 #
 # Usage (from the opal package root — web/lib/opal/):
 #   ./scripts/convert-svg.sh src/icons/<filename.svg>
 #   ./scripts/convert-svg.sh --illustration src/illustrations/<filename.svg>
-#   ./scripts/convert-svg.sh --logo src/logos/<filename.svg>

-MODE="icon"
+ILLUSTRATION=false

 # Parse flags
 while [[ "$1" == --* ]]; do
  case "$1" in
    --illustration)
-      MODE="illustration"
-      shift
-      ;;
-    --logo)
-      MODE="logo"
+      ILLUSTRATION=true
      shift
      ;;
    *)
      echo "Unknown flag: $1" >&2
-      echo "Usage: ./scripts/convert-svg.sh [--illustration | --logo] <filename.svg>" >&2
+      echo "Usage: ./scripts/convert-svg.sh [--illustration] <filename.svg>" >&2
      exit 1
      ;;
  esac
 done

 if [ -z "$1" ]; then
-  echo "Usage: ./scripts/convert-svg.sh [--illustration | --logo] <filename.svg>" >&2
+  echo "Usage: ./scripts/convert-svg.sh [--illustration] <filename.svg>" >&2
  exit 1
 fi

@@ -55,12 +49,12 @@ fi
 BASE_NAME="${SVG_FILE%.svg}"

 # Build the SVGO config based on mode
-if [ "$MODE" = "icon" ]; then
+if [ "$ILLUSTRATION" = true ]; then
+  # Illustrations: only strip width and height (preserve all colours)
+  SVGO_CONFIG='{"plugins":[{"name":"removeAttrs","params":{"attrs":["width","height"]}}]}'
+else
  # Icons: strip stroke, stroke-opacity, width, and height
  SVGO_CONFIG='{"plugins":[{"name":"removeAttrs","params":{"attrs":["stroke","stroke-opacity","width","height"]}}]}'
-else
-  # Illustrations and logos: only strip width and height (preserve all colours)
-  SVGO_CONFIG='{"plugins":[{"name":"removeAttrs","params":{"attrs":["width","height"]}}]}'
 fi

 # Resolve the template path relative to this script (not the caller's CWD)
@@ -91,7 +85,7 @@ if [ $? -eq 0 ]; then
  fi

  # Icons additionally get stroke="currentColor"
-  if [ "$MODE" = "icon" ]; then
+  if [ "$ILLUSTRATION" = false ]; then
    perl -i -pe 's/\{\.\.\.props\}/stroke="currentColor" {...props}/g' "${BASE_NAME}.tsx"
    if [ $? -ne 0 ]; then
      echo "Error: Failed to add stroke attribute" >&2
@@ -112,7 +106,7 @@ if [ $? -eq 0 ]; then
  fi

  # For icons, also verify stroke="currentColor" was added
-  if [ "$MODE" = "icon" ]; then
+  if [ "$ILLUSTRATION" = false ]; then
    if ! grep -q 'stroke="currentColor"' "${BASE_NAME}.tsx"; then
      echo "Error: Post-processing did not add stroke=\"currentColor\"" >&2
      exit 1
--- a/web/lib/opal/src/core/interactive/stateful/components.tsx
+++ b/web/lib/opal/src/core/interactive/stateful/components.tsx
@@ -15,7 +15,6 @@ type InteractiveStatefulVariant =
  | "select-heavy"
  | "select-card"
  | "select-tinted"
-  | "select-input"
  | "select-filter"
  | "sidebar-heavy"
  | "sidebar-light";
@@ -36,7 +35,6 @@ interface InteractiveStatefulProps
   * - `"select-heavy"` — tinted selected background (for list rows, model pickers)
   * - `"select-card"` — like select-heavy but filled state has a visible background (for cards/larger surfaces)
   * - `"select-tinted"` — like select-heavy but with a tinted rest background
-   * - `"select-input"` — rests at neutral-00 (matches input bar), hover/open shows neutral-03 + border-01
   * - `"select-filter"` — like select-tinted for empty/filled; selected state uses inverted tint backgrounds and inverted text (for filter buttons)
   * - `"sidebar-heavy"` — sidebar navigation items: muted when unselected (text-03/text-02), bold when selected (text-04/text-03)
   * - `"sidebar-light"` — sidebar navigation items: uniformly muted across all states (text-02/text-02)
Author	SHA1	Message	Date
SubashMohan	acfa30f865	feat(permissions): update permission checks for creating personal access tokens and enhance UI with permission validation	2026-04-11 18:15:42 +05:30
SubashMohan	606ab55d73	feat(permissions): update permission identifiers for service accounts and bots management	2026-04-11 17:51:50 +05:30
SubashMohan	6223ba531b	feat(permissions): update permission checks for query history access	2026-04-11 17:45:41 +05:30
SubashMohan	6f6e64ad63	feat(permissions): enhance document set management with refined error handling and permission checks	2026-04-10 10:35:29 +05:30
SubashMohan	a05f09faa2	feat(permissions): update permission checks to use ADD_AGENTS for user actions and enhance agent creation button with permission validation	2026-04-09 17:56:13 +05:30
SubashMohan	5912f632a3	feat(permissions): add READ_DOCUMENT_SETS permission and update permission checks for document sets and personas	2026-04-09 17:15:32 +05:30
SubashMohan	3e5cfa66d1	refactor(permissions): remove admin role check from effective permissions function	2026-04-09 15:25:09 +05:30
SubashMohan	b2f5eb3ec7	feat(permissions): add READ_USER_GROUPS permission and update user group access checks	2026-04-09 15:19:57 +05:30
SubashMohan	0ab2b8065d	feat(permissions): enhance permission handling with effective permissions and admin checks	2026-04-09 13:16:22 +05:30
SubashMohan	4c304bf393	feat(permissions): add has_permission function and update permission checks to use MANAGE_LLMS	2026-04-09 11:56:27 +05:30
SubashMohan	cef5caa8b1	feat(icons): add SvgCreateAgent and SvgManageAgent components and update icon mapping	2026-04-08 18:16:47 +05:30
SubashMohan	f7b8650d5c	feat(permissions): implement permission registry endpoint and update related models	2026-04-08 17:58:00 +05:30
SubashMohan	df532aa87d	feat(user-group): implement bulk permission setting for user groups	2026-04-08 17:38:09 +05:30