Compare commits

573 Commits

Author SHA1 Message Date
Nik
48e7428069 chore(helm): remove broken code-interpreter dependency
The code-interpreter Helm chart repo at
https://onyx-dot-app.github.io/code-interpreter/ returns 404,
causing ct lint to fail in CI. Remove it from Chart.yaml
dependencies, Chart.lock, ct.yaml chart-repos, and the CI
workflow's helm repo add step.
2026-02-19 20:17:14 -08:00
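For context on this change: a Helm chart declares subchart dependencies in its Chart.yaml, and `ct lint` resolves each listed repository, so a dead repo URL fails CI. A minimal sketch of the kind of stanza being removed is below; only the dependency name and repository URL come from the commit message, and the version pin is a hypothetical placeholder.

```yaml
# Chart.yaml (sketch) — the dependency removed in 48e7428069
dependencies:
  - name: code-interpreter
    repository: https://onyx-dot-app.github.io/code-interpreter/  # returns 404, breaking `ct lint`
    version: 0.1.0  # hypothetical placeholder; the real pin lived in Chart.lock
```

Per the commit message, removing the stanza also means regenerating Chart.lock (for example with `helm dependency update`) and dropping the matching entries from ct.yaml's chart-repos and the CI workflow's `helm repo add` step.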
Evan Lohn
fc6a37850b feat: delta sync sharepoint (#8532)
Co-authored-by: CE11-Kishan <CE11-Kishan@users.noreply.github.com>
2026-02-20 03:26:54 +00:00
Raunak Bhagat
aa6fec3d58 Fix/pat UI (#8617) 2026-02-19 19:23:26 -08:00
Raunak Bhagat
efa6005e36 feat(opal): Add widthVariant to Interactive.Container (#8610) 2026-02-20 03:14:30 +00:00
Nikolas Garza
921bfc72f4 fix(helm): route /openapi.json to api_server in nginx config (#8612) 2026-02-19 19:05:14 -08:00
Justin Tahara
812603152d feat(web): FE Changes for Brave Web Search 3/3 (#8597) 2026-02-20 02:43:30 +00:00
Raunak Bhagat
6779d8fbd7 feat(opal): Add Content layout component (#8534)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 02:35:27 +00:00
Justin Tahara
2c9826e4a9 feat(web): Logical Hardening for Brave Web Search 2/3 (#8595) 2026-02-20 02:09:49 +00:00
Justin Tahara
5b54687077 feat(web): Initial Framework for Brave Web Search 1/3 (#8594) 2026-02-20 01:38:19 +00:00
Raunak Bhagat
0f7e2ee674 feat(opal): Add Tag component and resync colors with Figma (#8533)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 01:14:52 +00:00
Danelegend
ea466648d9 feat: file preview from llm (#8604) 2026-02-20 00:47:03 +00:00
Evan Lohn
a402911ee6 feat: sharepoint scalability 1 (#8531) 2026-02-20 00:41:48 +00:00
Wenxi
7ae9ba807d fix(manage-users): exclude slack users from /users list (#8602) 2026-02-19 23:41:25 +00:00
Nikolas Garza
1f79223c42 feat(ods): add --continue flag and cp alias to cherry-pick (#8601) 2026-02-19 22:31:06 +00:00
Danelegend
c0c2247d5a feat: Modal Header no icon (#8596) 2026-02-19 21:59:18 +00:00
Danelegend
2989ceda41 feat: add file formatting reminder (#8524) 2026-02-19 21:51:06 +00:00
Wenxi
c825f5eca6 feat: whitelist invite setting, allow users to register and invite, new account blocked page (#8527) 2026-02-19 20:16:50 +00:00
Jamison Lahman
a8965def79 chore(playwright): fix chat scrolling non-determinism (#8584) 2026-02-19 19:26:09 +00:00
Jamison Lahman
59e1ad51ba chore(fe): fix drop-down overflow in API Key modal (#8574) 2026-02-19 19:15:10 +00:00
Jamison Lahman
0e70a8f826 chore(fe): remove close button from image gen tooltip (#8585) 2026-02-19 18:31:02 +00:00
SubashMohan
0891737dfd fix: update SourceTag component to use variant prop for sizing (#8582) 2026-02-19 17:27:04 +00:00
victoria reese
5a20112670 fix: define fallback only on custom metrics (#8566) 2026-02-19 09:19:44 -08:00
SubashMohan
584f2e2638 fix(ui): Fix admin UI layout, sticky headers, LLM filtering, and project view issues (#8496) 2026-02-19 14:02:22 +00:00
Yuhong Sun
aa24b16ec1 chore: Opensearch tuning (#8518) 2026-02-19 05:43:45 +00:00
acaprau
50aa9d7df6 feat(opensearch): Display percentage progress in the migration page (#8575) 2026-02-19 05:33:29 +00:00
acaprau
bfda586054 chore(opensearch): Make the migration visit from 4 independent Vespa slices concurrently (#8570) 2026-02-19 04:26:59 +00:00
roshan
e04392fbb1 feat(craft): shareable webapp URLs with two-tier access control (#8510)
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 04:04:21 +00:00
Wenxi
e46c6c5175 fix: init all celery tasks for autodiscovery (#8539) 2026-02-19 03:45:40 +00:00
Nikolas Garza
f59792b4ac feat(metrics): add SQLAlchemy connection pool Prometheus metrics (#8543) 2026-02-19 02:32:42 +00:00
Justin Tahara
973b9456e9 fix(vertex ai): Sort Model Names (#8572) 2026-02-19 02:24:47 +00:00
Justin Tahara
aa8d126513 fix(ollama): Ollama Chat fixes (#8522) 2026-02-19 01:37:07 +00:00
Jamison Lahman
a6da5add49 chore(fe): header text content wraps when responsive (#8565) 2026-02-19 01:25:12 +00:00
Jamison Lahman
3356f90437 chore(fe): Button handles empty string as text, use in header (#8563) 2026-02-19 01:24:10 +00:00
Danelegend
27c254ecf9 fix: /llm/provider route returns all providers (#8545) 2026-02-19 00:13:27 +00:00
Jamison Lahman
09678b3c8e chore(fe): whitelabeling header nits (#8561) 2026-02-18 23:39:54 +00:00
Justin Tahara
ecdb962e24 chore(llm): Cleaning up arg parsing (#8555) 2026-02-18 23:06:19 +00:00
Justin Tahara
63b9b91565 chore(llm): Extract _close_reasoning (#8550) 2026-02-18 22:59:07 +00:00
Justin Tahara
14770e6e90 chore(llm): Extract citation flush helper (#8554) 2026-02-18 22:31:51 +00:00
Justin Tahara
14807d986a chore(llm): Using bool for has_reasoned (#8549) 2026-02-18 22:18:09 +00:00
Justin Tahara
290eb98020 chore(llm): Extract _make_placement (#8552) 2026-02-18 21:58:58 +00:00
Nikolas Garza
fe6fa3d034 feat(scim): add SCIM Group CRUD endpoints (#8456) 2026-02-18 21:52:56 +00:00
Nikolas Garza
2a60a02e0e feat(scim): add admin SCIM token management API (#8538) 2026-02-18 21:41:09 +00:00
Justin Tahara
3bcd666e90 chore(llm): Cleaning up _extract_tool_call_kickoffs (#8548) 2026-02-18 21:39:36 +00:00
Jamison Lahman
684013732c chore(fe): update human message size (#8547) 2026-02-18 21:27:02 +00:00
Nikolas Garza
367dcb8f8b feat(scim): add SCIM User CRUD endpoints (#8455) 2026-02-18 21:26:52 +00:00
Justin Tahara
59dfed0bc8 chore(llm): Cleaning up Docstring (#8546) 2026-02-18 21:25:32 +00:00
Jamison Lahman
7a719b54bb chore(playwright): worker-aware users for test isolation (#8544) 2026-02-18 21:13:07 +00:00
Jamison Lahman
25ef5ff010 chore(fe): update SourceTag tag size (#8540) 2026-02-18 20:49:11 +00:00
Danelegend
53f9f042a1 fix: model config not populating flow during sync (#8542) 2026-02-18 20:08:59 +00:00
Jamison Lahman
3469f0c979 chore(gha): rm nightly license scan workflow (#8541) 2026-02-18 19:55:23 +00:00
Jamison Lahman
f688efbcd6 chore(gha): update helm/chart-testing-action version (#8536) 2026-02-18 11:21:17 -08:00
Jamison Lahman
250658a8b2 chore(playwright): screenshots wait for animations (#8530) 2026-02-18 18:33:33 +00:00
SubashMohan
5150ffc3e0 feat(chat): new chat sharing UI (#8471) 2026-02-18 15:15:34 +00:00
Jamison Lahman
858c1dbe4a chore(playwright): chat rendering tests (#8526) 2026-02-18 05:41:54 +00:00
Evan Lohn
a8e7353227 fix: shared drive node names (#8520) 2026-02-18 05:34:00 +00:00
Nikolas Garza
343cda35cb feat(observability): add production Prometheus instrumentation module (#8503) 2026-02-18 05:09:32 +00:00
Wenxi
1cbe47d85e chore: increase firecrawl test timeout to 60s for e2e test (#8529) 2026-02-18 05:02:45 +00:00
Jamison Lahman
221658132a chore(playwright): skip visual regression report on cancelled (#8528)
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-17 20:56:22 -08:00
Jamison Lahman
fe8fb9eb75 fix(fe): center-align modals relative to chat container (#8517) 2026-02-18 04:27:38 +00:00
Wenxi
f7925584b8 fix: create release notifications on cloud (#8525) 2026-02-18 03:47:23 +00:00
Nikolas Garza
00b0e15ed7 feat(scim): add SCIM 2.0 service discovery endpoints (#8454) 2026-02-18 02:07:49 +00:00
Jamison Lahman
c2968e3bfe chore(playwright): update skill re: user isolation best-practices (#8521) 2026-02-17 17:52:03 -08:00
Justin Tahara
978f0a9d35 fix(ollama): Cleaning up DeepSeek (#8519) 2026-02-18 01:44:34 +00:00
Danelegend
410340fe37 chore: add linguist-language (#8515)
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-02-18 00:57:27 +00:00
Wenxi
a0545c7eb3 fix: open_url broken on non-normalized urls and enable web crawl tests (#8508) 2026-02-18 00:41:21 +00:00
Jamison Lahman
aa46a8bba2 chore(gha): only run zizmor when .github/ changes (#8516) 2026-02-17 16:25:13 -08:00
Jamison Lahman
cd5aaa0302 chore(playwright): prefer absolute imports (#8511) 2026-02-17 16:22:51 -08:00
roshan
db33efaeaa fix: Enable search UI in Chrome extension side panel (#8486)
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-17 23:59:05 +00:00
Justin Tahara
c28d37dff8 chore(playwright): Cleanup MCP Tests (#8512) 2026-02-17 23:57:22 +00:00
Jamison Lahman
696e72bcbb chore(playwright): organize tests into directories (#8509) 2026-02-17 15:37:35 -08:00
Nikolas Garza
cfdac8083a feat(scim): add SCIM bearer token authentication (#8423) 2026-02-17 23:28:00 +00:00
Jamison Lahman
781aab67fa chore(cursor): playwright SKILL.md (#8506) 2026-02-17 15:12:01 -08:00
Justin Tahara
b14d357d55 chore(mcp): Adding more Playwright Tests (#8452) 2026-02-17 22:55:35 +00:00
Nikolas Garza
60bc1ce8a1 feat(scim): add SCIM database CRUD operations (DAL) (#8424) 2026-02-17 22:43:54 +00:00
roshan
c89e82ee58 feat(craft): ACP session persistence across sandbox sleep/wake (#8466)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 22:32:49 +00:00
Evan Lohn
529ab8179f chore: undeprecate doc sets (#8505) 2026-02-17 21:56:49 +00:00
Nikolas Garza
19716874b2 fix(ee): small ux fixes for licensing (#8498) 2026-02-17 21:55:06 +00:00
Justin Tahara
1ac3b8515d chore(playwright): Cleanup for LLM Tests (#8504) 2026-02-17 21:38:24 +00:00
Justin Tahara
0d0c8580ca fix(helm): Log Level Passthrough for Celery (#8495) 2026-02-17 21:15:46 +00:00
roshan
96a38dcc06 fix(craft): ephemeral ACP clients to prevent multi-replica session corruption (#8465)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 21:10:21 +00:00
Jamison Lahman
6fe72e5524 chore(playwright): use user for llm runtime tests (#8502) 2026-02-17 20:21:28 +00:00
Jamison Lahman
1f4ee4d550 chore(playwright): hide toast elements in screenshots (#8501) 2026-02-17 19:42:54 +00:00
Danelegend
534db103fc fix: Edit messages w/ files (#8461) 2026-02-17 18:27:40 +00:00
Justin Tahara
f6f2e2d7c0 fix(desktop): Link clicking within App (#8493) 2026-02-17 18:17:16 +00:00
Raunak Bhagat
b2836c5a13 fix: Interactive.Base href rendering and container self-stretch (#8492)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 18:16:40 +00:00
Justin Tahara
47177c536d fix(vertex): Additional fix for Opus 4.6 (#8497) 2026-02-17 18:12:19 +00:00
victoria reese
3920a1554e Chore/mcp server host (#8472) 2026-02-17 07:19:29 -08:00
SubashMohan
8b12bf34ed fix(frontend): simplify onboarding logic and fix UI issues (#8470) 2026-02-17 08:42:30 +00:00
roshan
dd5b7edc6b fix(ui): remove data-state from Switch (#8394) 2026-02-17 05:22:38 +00:00
Danelegend
7977017aab fix: Code interpreter takes files (#8489)
Co-authored-by: dat <dat@purplefish.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Dat Bac Do <dat.b.do@gmail.com>
2026-02-16 18:53:20 -08:00
Evan Lohn
efd8388239 feat: slim confluence hiernodes (#8490) 2026-02-17 02:37:10 +00:00
Evan Lohn
ac6668309f fix: custom MCP auth Headers in API [mirror of #8345] (#8487)
Co-authored-by: Clément Rigo <rigoclement@mydnic.be>
2026-02-17 01:21:35 +00:00
Evan Lohn
a8bf63315c chore: hiernode passive deletes (#8484) 2026-02-17 00:30:18 +00:00
Evan Lohn
d905eced87 feat: pruning picks up hierarchynodes (#8481) 2026-02-16 23:56:20 +00:00
Evan Lohn
1e6f8fa6bf fix: allow public oauth clients via DCR (#8482) 2026-02-16 23:13:27 +00:00
Raunak Bhagat
faeac13693 feat: agent view only modal (#7989)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 18:23:44 +00:00
SubashMohan
ddb14ec762 fix(ui): fix a few common UI bugs (#8425) 2026-02-16 11:22:43 +00:00
Justin Tahara
f31f589860 chore(llm): Adding more FE Unit Tests (#8457) 2026-02-16 03:02:19 +00:00
Evan Lohn
63b9a869af fix: CheckpointedConnector pruning only processes first checkpoint step (mirror of #8464) (#8468)
Co-authored-by: Yves Grolet <yves@grolet.com>
2026-02-16 00:45:43 +00:00
Yuhong Sun
6aea36b573 chore: Context summarization update (#8467) 2026-02-15 23:39:47 +00:00
Wenxi
3d8e8d0846 refactor: connector config refresh elements/cleanup (#8428) 2026-02-15 20:12:51 +00:00
Yuhong Sun
dea5be2185 chore: License update (No change, just touchup) (#8460) 2026-02-14 02:44:38 +00:00
Wenxi
d083973d4f chore: disable auto craft animation with feature flag (#8459) 2026-02-14 02:29:37 +00:00
Wenxi
df956888bf fix: bake public recaptcha key in cloud image (#8458) 2026-02-14 02:12:43 +00:00
dependabot[bot]
7c6062e7d5 chore(deps): bump qs from 6.14.1 to 6.14.2 in /web (#8451)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-02-14 02:04:30 +00:00
Yuhong Sun
89d2759021 chore: Remove end-of-lifed backend routes (#8453) 2026-02-14 01:57:06 +00:00
Justin Tahara
d9feaf43a7 chore(playwright): Adding new LLM Runtime tests (#8447) 2026-02-14 01:38:23 +00:00
Nikolas Garza
5bfffefa2f feat(scim): add SCIM filter expression parser with unit tests (#8421) 2026-02-14 01:17:48 +00:00
Nikolas Garza
4d0b7e14d4 feat(scim): add SCIM PATCH operation handler with unit tests (#8422) 2026-02-14 01:12:46 +00:00
Jamison Lahman
36c55d9e59 chore(gha): de-duplicate integration test logic (#8450) 2026-02-14 00:31:31 +00:00
Wenxi
9f652108f9 fix: don't pass captcha token to db (#8449) 2026-02-14 00:20:36 +00:00
victoria reese
d4e4c6b40e feat: add setting to configure mcp host (#8439) 2026-02-13 23:49:18 +00:00
Jamison Lahman
9c8deb5d0c chore(playwright): mask non-deterministic email element (#8448) 2026-02-13 23:37:24 +00:00
Danelegend
58f57c43aa feat(contextual-llm): Populate and set w/ llm flow (#8398) 2026-02-13 23:32:26 +00:00
Evan Lohn
62106df753 fix: sharepoint cred refresh2 (#8445) 2026-02-13 23:05:15 +00:00
Jamison Lahman
45b3a5e945 chore(playwright): include option to hide element in screenshots (#8446) 2026-02-13 22:45:46 +00:00
Jamison Lahman
e19a6b6789 chore(playwright): create new user tests (#8429) 2026-02-13 22:17:18 +00:00
Jamison Lahman
2de7df4839 chore(playwright): login page screenshots (#8427) 2026-02-13 22:01:32 +00:00
victoria reese
bd054bbad9 fix: remove default idleReplicaCount (#8434) 2026-02-13 13:37:19 -08:00
Justin Tahara
313e709d41 fix(celery): Respecting Limits for Celery Heavy Tasks (#8407) 2026-02-13 21:27:04 +00:00
Nikolas Garza
aeb1d6edac feat(scim): add SCIM 2.0 Pydantic schemas (#8420) 2026-02-13 21:21:05 +00:00
Wenxi
49a35f8aaa fix: remove user file indexing from launch, add init imports for all celery tasks, bump sandbox memory limits (#8443) 2026-02-13 21:15:30 +00:00
Danelegend
049e8ef0e2 feat(llm): Populate env w/ custom config (#8328) 2026-02-13 21:11:49 +00:00
Jamison Lahman
3b61b495a3 chore(playwright): tag appearance_theme tests exclusive (#8441) 2026-02-13 21:07:57 +00:00
Wenxi
5c5c9f0e1d feat(airtable): index all and hierarchy for craft (#8414) 2026-02-13 21:03:53 +00:00
Nikolas Garza
f20d5c33b7 feat(scim): add SCIM database models and migration (#8419) 2026-02-13 20:54:56 +00:00
Jamison Lahman
e898407f7b chore(tests): skip yet another test_web_search_api test (#8442) 2026-02-13 12:50:04 -08:00
Jamison Lahman
f802ff09a7 chore(tests): skip additional web_search test (#8440) 2026-02-13 12:29:36 -08:00
Jamison Lahman
69ad712e09 chore(tests): temporarily disable exa tests (#8431) 2026-02-13 11:06:25 -08:00
Jamison Lahman
98b69c0f2c chore(playwright): welcome_page tests & per-element screenshots (#8426) 2026-02-13 10:07:27 -08:00
Raunak Bhagat
1e5c87896f refactor(web): migrate from usePopup/setPopup to global toast system (#8411)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 17:21:14 +00:00
Raunak Bhagat
b6cc97a8c3 fix(web): icon button and timeline header UI fixes (#8416) 2026-02-13 17:20:37 +00:00
Yuhong Sun
032fbf1058 chore: reminder prompt to be moveable (#8417) 2026-02-13 07:39:12 +00:00
SubashMohan
fc32a9f92a fix(memory): memory tool UI and prompt injection issues (#8377) 2026-02-13 04:29:51 +00:00
Jamison Lahman
9be13bbf63 chore(playwright): make screenshots deterministic (#8412) 2026-02-12 19:53:11 -08:00
Yuhong Sun
9e7176eb82 chore: Tiny intro message change (#8415) 2026-02-12 19:44:34 -08:00
Yuhong Sun
c7faf8ce52 chore: Project instructions would get ignored (#8409) 2026-02-13 02:51:13 +00:00
Jessica Singh
6230e36a63 chore(bulk invite): free trial limit (#8378) 2026-02-13 02:03:38 +00:00
Jamison Lahman
7595b54f6b chore(playwright): upload baselines with merge_group jobs (#8410) 2026-02-13 01:41:14 +00:00
Evan Lohn
dc1bb426ee fix: sharepoint cred refresh (#8406)
Co-authored-by: justin-tahara <justintahara@gmail.com>
2026-02-13 01:38:07 +00:00
acaprau
e9a0506183 chore(opensearch): Add profiling information (#8404)
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-13 01:18:16 +00:00
Nikolas Garza
4747c43889 feat(auth): enforce seat limits on all user creation paths (#8401) 2026-02-13 00:18:29 +00:00
Jamison Lahman
27e676c48f chore(devtools): ods screenshot-diff for visual regression testing (#8386)
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-02-13 00:04:22 +00:00
Justin Tahara
6749f63f09 fix(email): Making sure Email Links go to Default Mail Service (#8395) 2026-02-12 23:43:40 +00:00
Yuhong Sun
e404ffd443 fix: SearXNG works now (#8403) 2026-02-12 23:34:02 +00:00
Justin Tahara
c5b89b86c3 fix(ollama): Passing Context Window through (#8385) 2026-02-12 23:30:33 +00:00
Evan Lohn
84bb3867b2 chore: hide file reader (#8402) 2026-02-12 23:12:42 +00:00
Evan Lohn
92cc1d83b5 fix: flaky no vectordb test (#8400) 2026-02-12 23:12:08 +00:00
Justin Tahara
e92d4a342f fix(ollama): Fixing Content Skipping (#8092) 2026-02-12 22:59:56 +00:00
Wenxi
b4d596c957 fix: remove log error when authtype is not set (#8399) 2026-02-12 22:57:13 +00:00
Nikolas Garza
d76d32003b fix(billing): exclude inactive users from seat counts and allow users page when gated (#8397) 2026-02-12 22:51:24 +00:00
Wenxi
007d2d109f feat(craft): pdf preview and refresh output panel (#8392) 2026-02-12 22:41:11 +00:00
Yuhong Sun
08891b5242 fix: Reminders polluting the query expansion (#8391) 2026-02-12 22:30:35 +00:00
Justin Tahara
846672a843 chore(llm): Additional Model Selection Test (#8389) 2026-02-12 21:53:03 +00:00
roshan
0f362457be fix(craft): craft connector FE nits (#8387) 2026-02-12 21:39:33 +00:00
Wenxi
283e8f4d3f feat(craft): pptx generation, editing, preview (#8383)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-02-12 21:34:22 +00:00
Evan Lohn
fdf19d74bd refactor: github connector (#8384) 2026-02-12 20:33:27 +00:00
roshan
7c702f8932 feat(craft): local file connector (#8304) 2026-02-12 20:23:23 +00:00
Danelegend
3fb06f6e8e feat(search-settings): Add tests + contextual llm validation (#8376) 2026-02-12 20:12:27 +00:00
Jamison Lahman
9fcd999076 chore(devtools): Recommend @playwright/mcp in Cursor (#8380)
Co-authored-by: Evan Lohn <evan@danswer.ai>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-02-12 11:15:29 -08:00
Wenxi
c937da65c4 chore: make chatbackgrounds local assets for air-gapped envs (#8381) 2026-02-12 18:53:27 +00:00
Raunak Bhagat
abdbe89dd4 fix: Search submission buttons layouts (#8382) 2026-02-12 18:39:43 +00:00
Raunak Bhagat
54f9c67522 feat: Unified Search and Chat (#8106)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-12 17:53:55 +00:00
Raunak Bhagat
31bcdc69ca refactor(opal): migrate IconButton usages to opal Button (#8333)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 17:19:23 +00:00
Justin Tahara
b748e08029 chore(llm): Adding Tool Enforcement Tests (#8371) 2026-02-12 14:34:27 +00:00
SubashMohan
11b279ad31 feat(memory): enable memory tool to add or update the memory (#8331) 2026-02-12 09:09:00 +00:00
Yuhong Sun
782082f818 chore: Opensearch tuning (#8374)
Co-authored-by: acaprau <48705707+acaprau@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-02-12 09:03:07 +00:00
acaprau
c01b559bc6 feat(opensearch): Admin configuration 2 - Make the retrieval toggle actually do something (#8370) 2026-02-12 09:01:07 +00:00
acaprau
3101a53855 feat(opensearch): Admin configuration 1 - FE migration tab in the admin sidebar, gated by env var (#8365) 2026-02-12 08:32:36 +00:00
acaprau
ce6c210de1 fix(opensearch): Make chunk migration not stop on an exception; also ACL does not raise (#8375) 2026-02-12 08:29:19 +00:00
acaprau
15b372fea9 feat(opensearch): Admin configuration 0 - REST APIs for migration stuff (#8364) 2026-02-12 06:16:23 +00:00
Nikolas Garza
cf523cb467 feat(ee): gate access only when legacy EE flag is set and no license exists (#8368) 2026-02-12 03:36:27 +00:00
Raunak Bhagat
344625b7e0 fix(opal): add padding to Interactive.Container and smooth foldable transitions (#8367) 2026-02-12 02:05:34 +00:00
Justin Tahara
9bf8400cf8 chore(playwright): Setup LLM Provider (#8362) 2026-02-12 02:03:08 +00:00
Evan Lohn
09e86c2fda fix: no vector db tests (#8369) 2026-02-12 01:37:28 +00:00
Justin Tahara
204328d52a chore(llm): Backend Fallback Logic Tests (#8363) 2026-02-12 01:15:15 +00:00
Nikolas Garza
3ce58c8450 fix(ee): follow HTTP→HTTPS redirects in forward_to_control_plane (#8360) 2026-02-12 00:27:49 +00:00
Evan Lohn
67b5df255a feat: minimal deployment mode (#8293) 2026-02-11 23:56:50 +00:00
Raunak Bhagat
33fa29e19f refactor(opal): rename subvariant to prominence, add internal, remove static (#8348)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 23:52:44 +00:00
acaprau
787f25a7c8 chore(opensearch): Tuning - Reduce k from 1000 to 50 (#8359) 2026-02-11 23:40:55 +00:00
Justin Tahara
f10b994a27 fix(bedrock): Fixing toolConfig call (#8342) 2026-02-11 23:29:28 +00:00
Danelegend
d4089b1785 chore(search-settings): Remove unused kv search-setting key (#8356) 2026-02-11 23:13:14 +00:00
Jamison Lahman
e122959854 chore(devtools): upgrade ods: 0.5.2->0.5.3 (#8358) 2026-02-11 15:09:25 -08:00
Jamison Lahman
93afb154ee chore(devtools): update ods compose defaults (#8357) 2026-02-11 15:04:16 -08:00
Jamison Lahman
e9be078268 chore(devtools): upgrade ods: 0.5.1->0.5.2 (#8355) 2026-02-11 22:56:32 +00:00
Jamison Lahman
61502751e8 chore(devtools): address missed cubic review (#8353) 2026-02-11 14:40:58 -08:00
Jamison Lahman
cd26893b87 chore(devtools): ods compose defaults ee version (#8351) 2026-02-11 14:35:22 -08:00
Yuhong Sun
90dc6b16fa fix: Metadata file for larger zips (#8327) 2026-02-11 22:16:12 +00:00
Raunak Bhagat
34b48763f4 refactor(opal): update Container height variants, remove paddingVariant (#8350)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 21:53:39 +00:00
Jamison Lahman
094d7a2d02 chore(playwright): remove unnecessary global auth checks (#8341) 2026-02-11 21:45:12 +00:00
victoria reese
faa97e92e8 fix: idleReplicaCount should be optional for ScaledObjects (#8344) 2026-02-11 21:09:45 +00:00
Wenxi
358dc32fd2 fix: upgrade plan page nits (#8346) 2026-02-11 21:06:39 +00:00
Justin Tahara
f06465bfb2 chore(admin): Improve Playwright test speeds (#8326) 2026-02-11 20:12:15 +00:00
Raunak Bhagat
8a51b00050 feat(backend): add default_app_mode field to User table (#8291)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 19:58:53 +00:00
Justin Tahara
33de6dcd6a fix(anthropic): Model Selection in Multi-Tenant (#8308) 2026-02-11 19:34:49 +00:00
acaprau
fe52f4e6d3 chore(opensearch): Add migration queue to helm chart and launch json (#8336) 2026-02-11 19:16:48 +00:00
Jamison Lahman
51de334732 chore(playwright): remove chromatic (#8339) 2026-02-11 18:50:12 +00:00
dependabot[bot]
cb72f84209 chore(deps): bump pillow from 12.0.0 to 12.1.1 (#8338)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-02-11 18:38:42 +00:00
dependabot[bot]
8b24c08467 chore(deps): bump langchain-core from 0.3.81 to 1.2.11 in /backend/requirements (#8334)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-02-11 18:28:06 +00:00
Wenxi
0a1e043a97 fix(craft): load messages before restore session and feat: timeout restoration operations (#8303) 2026-02-11 18:09:10 +00:00
Raunak Bhagat
466668fed5 feat(opal): add foldable prop to Button + select-variant icon colour (#8300)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 16:12:53 +00:00
acaprau
41d105faa0 feat(opensearch): Improved migration task 1 - Completely replace old task logic with new (#8323) 2026-02-11 06:28:59 +00:00
SubashMohan
9e581f48e5 refactor(memory): Refactor memories to use ID-based persistence and new memories UI (#8294) 2026-02-11 06:25:53 +00:00
acaprau
48d8e0955a chore(opensearch): Improved migration task 0 - Schema migrations (#8321) 2026-02-11 04:33:51 +00:00
acaprau
a77780d67e chore(devtools): Add comment in AGENTS.md about the limitations of Celery timeouts with threads (#8257) 2026-02-11 03:38:27 +00:00
Justin Tahara
d13511500c chore(llm): Hardening Fallback Tool Call (#8325) 2026-02-11 02:46:01 +00:00
Wenxi
216d486323 fix: allow basic users to share agents (#8269) 2026-02-11 02:34:07 +00:00
dependabot[bot]
a57d399ba5 chore(deps): bump cryptography from 46.0.3 to 46.0.5 (#8319)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-02-11 02:28:20 +00:00
Nikolas Garza
07324ae0e4 fix(ee): copy license public key into Docker image (#8322) 2026-02-11 02:16:41 +00:00
Wenxi
c8ae07f7c2 feat(craft): narrow file sync to source, prevent concurrent syncs, and use --delete flag on incremental syncs (#8235) 2026-02-11 02:14:40 +00:00
Justin Tahara
f0fd19f110 chore(llm): Adding new Mock LLM Call test (#8290) 2026-02-11 02:07:21 +00:00
Wenxi
6a62406042 chore(craft): bump sandbox limits one last time TM (#8317) 2026-02-11 01:12:52 +00:00
Jamison Lahman
d0be7dd914 chore(deployment): only try to build desktop if semver-like tag (#8316) 2026-02-11 01:03:19 +00:00
Jamison Lahman
6a045db72b chore(devtools): deploy preview frontend builds in CI (#8315) 2026-02-11 00:58:59 +00:00
Wenxi
e5e9dbe2f0 fix: make /health check async (#8314) 2026-02-11 00:54:34 +00:00
Nikolas Garza
50e0a2cf90 feat(slack): add option to include bot messages during indexing (#8309) 2026-02-10 23:10:59 +00:00
Nikolas Garza
50538ce5ac chore(slack): add logging when bot messages are filtered during indexing (#8305) 2026-02-10 22:41:54 +00:00
Raunak Bhagat
6fab7103bf fix(opal): extract interactive container styles to CSS (#8307)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 21:40:40 +00:00
Justin Tahara
be6bd22c4c fix(embedding): Updating Masking Logic (#8301) 2026-02-10 20:01:20 +00:00
Jamison Lahman
47ca5a2733 chore(tests): use CE backend for model_server tests (#8296) 2026-02-10 18:51:08 +00:00
Justin Tahara
ee8ce366c7 fix(vertex): Updating masking workflow (#8299) 2026-02-10 18:36:29 +00:00
Jamison Lahman
a924b49405 chore(playwright): improve preflight checks and setup (#8283) 2026-02-10 08:09:54 -08:00
SubashMohan
2d2d998811 feat(memory): add user preferences and structured user context in system prompt (#8264) 2026-02-10 04:20:42 +00:00
SubashMohan
0925b5fbd4 fix(chatpage): Improve agent message layout, sidebar nesting, and icon fixes (#8224) 2026-02-10 04:06:31 +00:00
Wenxi
a02d8414ee chore(craft): update demo dataset and add sandbox image readme (#8059) 2026-02-10 03:37:00 +00:00
SubashMohan
c8abc4a115 fix(timeline): reduce agent message re-renders with referential stability in usePacedTurnGroups (#8265) 2026-02-10 03:18:53 +00:00
Nikolas Garza
cec37bff6a feat(ee): Enable license enforcement by default (#8270) 2026-02-10 02:48:55 +00:00
Nikolas Garza
06d5d3971b feat(chat): dynamic bottom spacer for fresh-chat push-up effect (#8285) 2026-02-10 02:04:01 +00:00
Justin Tahara
ed287a2fc0 chore(ollama): Sort model names (#8288) 2026-02-10 02:03:32 +00:00
Raunak Bhagat
60857d1e73 refactor(opal): select variant, transient/selected separation, OpenButton chevron fix (#8284)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 01:20:17 +00:00
Yuhong Sun
bb5c22104e chore: Better enforcement of masking (#7967)
Co-authored-by: justin-tahara <justintahara@gmail.com>
2026-02-10 00:41:11 +00:00
Jamison Lahman
03d919c918 chore(devtools): upgrade ods: 0.5.0->0.5.1 (#8279) 2026-02-09 23:31:54 +00:00
Justin Tahara
71d2ae563a fix(posthog): Chat metrics for Cloud (#8278) 2026-02-09 22:58:37 +00:00
Jamison Lahman
19f9c7357c chore(devtools): ods logs, ods pull, ods compose --force-recreate (#8277) 2026-02-09 22:51:01 +00:00
acaprau
f8fa5b243c chore(opensearch): Try to create OpenSearchTenantMigrationRecord earlier in check_for_documents_for_opensearch_migration_task (#8260) 2026-02-09 22:13:43 +00:00
dependabot[bot]
5f845c208f chore(deps-dev): bump pytest-xdist from 3.6.1 to 3.8.0 in /backend (#8120)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-02-09 22:06:38 +00:00
Raunak Bhagat
d8595f8de0 refactor(opal): add new Button component built on Interactive.Base (#8263)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 21:52:13 +00:00
dependabot[bot]
5b00d1ef9c chore(deps-dev): bump faker from 37.1.0 to 40.1.2 in /backend (#8126)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-02-09 21:46:06 +00:00
dependabot[bot]
41b6ed92a9 chore(deps): bump docker/login-action from 3.6.0 to 3.7.0 (#8275)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-09 21:26:07 +00:00
dependabot[bot]
07f35336ad chore(deps): bump @modelcontextprotocol/sdk from 1.25.3 to 1.26.0 in /backend/onyx/server/features/build/sandbox/kubernetes/docker/templates/outputs/web (#8166)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-09 13:20:12 -08:00
dependabot[bot]
4728bb87c7 chore(deps): bump @isaacs/brace-expansion from 5.0.0 to 5.0.1 in /backend/onyx/server/features/build/sandbox/kubernetes/docker/templates/outputs/web (#8139)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-09 13:20:04 -08:00
dependabot[bot]
adfa2f30af chore(deps): bump actions/cache from 4.3.0 to 5.0.3 (#8273)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-09 13:19:41 -08:00
dependabot[bot]
9dac4165fb chore(deps): bump actions/setup-python from 6.1.0 to 6.2.0 (#8274)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-09 13:18:26 -08:00
dependabot[bot]
7d2ede5efc chore(deps): bump protobuf from 6.33.4 to 6.33.5 in /backend/requirements (#8182)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-02-09 21:04:34 +00:00
dependabot[bot]
4592f6885f chore(deps): bump python-multipart from 0.0.21 to 0.0.22 in /backend/requirements (#7831)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-02-09 20:44:54 +00:00
Evan Lohn
9dc14fad79 chore: disable hiernodes when opensearch not available (#8271) 2026-02-09 20:32:47 +00:00
dependabot[bot]
ff6e471cfb chore(deps): bump actions/setup-node from 4.4.0 to 6.2.0 (#8122)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-09 20:31:35 +00:00
dependabot[bot]
09b9443405 chore(deps): bump bytes from 1.11.0 to 1.11.1 in /desktop/src-tauri (#8138)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-09 12:34:07 -08:00
dependabot[bot]
14cd6d08e8 chore(deps): bump webpack from 5.102.1 to 5.105.0 in /web (#8199)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-02-09 12:18:29 -08:00
dependabot[bot]
5ee16697ce chore(deps): bump time from 0.3.44 to 0.3.47 in /desktop/src-tauri (#8187)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-09 12:17:40 -08:00
dependabot[bot]
b794f7e10d chore(deps): bump actions/upload-artifact from 4.6.2 to 6.0.0 (#8121)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-09 12:14:25 -08:00
dependabot[bot]
bb3275bb75 chore(deps): bump actions/checkout from 6.0.1 to 6.0.2 (#8123)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-09 12:13:37 -08:00
roshan
7644e225a5 fix(chrome extension): Simplify NRFPage ChatInputBar layout to use normal flex flow (#8267)
Co-authored-by: Claude <noreply@anthropic.com>
2026-02-09 18:12:54 +00:00
roshan
811600b84a fix(craft): snapshot restore (#8194) 2026-02-09 18:00:07 +00:00
Jamison Lahman
40ce8615ff fix(login): window undefined on login (#8266) 2026-02-09 17:55:05 +00:00
Justin Tahara
0cee3f6960 chore(llm): Introduce Scaffolding for Integration Tests (#8251) 2026-02-09 17:26:15 +00:00
acaprau
8883e5608f chore(chat frontend): Round up in formatDurationSeconds so we don't see "Thought for 0s" (#8259) 2026-02-09 07:54:39 +00:00
acaprau
7c2f3ded44 fix(opensearch): Tighten up task timing (#8256) 2026-02-09 07:53:44 +00:00
Raunak Bhagat
aa094ce1f0 refactor(opal): interactive base variant types + foreground color system (#8255)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 07:26:05 +00:00
Evan Lohn
4b0c800db7 feat: postgres file store (#8246) 2026-02-09 04:28:42 +00:00
acaprau
8386742c10 fix(profiling): log_function_time should use time.monotonic not time.time (#8258) 2026-02-09 04:11:50 +00:00
Evan Lohn
f2e5e4f040 feat: jwt-based auth (#8244) 2026-02-09 00:29:56 +00:00
Yuhong Sun
c0498cf2fc fix: Deep Research Agent Cycle Count (#8254) 2026-02-08 22:27:33 +00:00
acaprau
954ee1706a chore(opensearch): Improve ordering of migration records that we query (#8248) 2026-02-08 19:51:52 +00:00
SubashMohan
0745765a56 refactor(chat): agent timeline layout and spacing changes (#8226) 2026-02-07 08:40:16 +00:00
Jessica Singh
10feb6ae77 chore(auth): anon fix (#8222) 2026-02-07 06:58:54 +00:00
Danelegend
f5b170af1e chore(provider config): llm provider config prefers LLMModelFlow (#8064) 2026-02-07 03:37:56 +00:00
Jessica Singh
2d2f252e95 fix(web search): strictly typed provider config (#8022) 2026-02-07 00:35:33 +00:00
Evan Lohn
a05f304960 fix: column overlap typing (#8247) 2026-02-07 00:34:34 +00:00
acaprau
7ce5120302 chore(opensearch): Make indexing use the client's new bulk index API (#8238) 2026-02-06 22:07:27 +00:00
Evan Lohn
2d8f864251 fix: metadata hardening (#8201) 2026-02-06 21:57:28 +00:00
acaprau
3b48c2104b fix(opensearch): Allow update to skip if a doc chunk is not found in OpenSearch, or if chunk count is not known (#8236) 2026-02-06 21:47:01 +00:00
Nikolas Garza
a9ec6a2434 fix(settings): default ee_features_enabled to False (#8237) 2026-02-06 20:59:49 +00:00
Nikolas Garza
e85575c6cc fix: make it clearer how to add channels to fed slack config form (#8227) 2026-02-06 18:35:34 +00:00
Danelegend
c966c81e8a fix(llm): LLM override can fail if admin (#8204) 2026-02-06 18:28:31 +00:00
Jamison Lahman
a0d6ebe66d chore(migrations): database migration runner (#8217) 2026-02-06 18:03:03 +00:00
Wenxi
d75b501a1f fix(craft): upload to s3 before marking docs as indexed in db (#8216) 2026-02-06 17:55:18 +00:00
Nikolas Garza
89dd44bee8 fix(db): null out document set and persona ownership on user deletion (#8219) 2026-02-06 17:08:22 +00:00
Justin Tahara
c5451ffe53 fix(ui): Inconsistent LLM Provider Logo (#8220) 2026-02-06 04:38:44 +00:00
Yuhong Sun
85da1d85ce fix: LiteLLM for OpenAI compatible models not using Responses route (#8215) 2026-02-06 03:58:34 +00:00
acaprau
00d90c5e27 chore(opensearch): Add timing and debug logging in the OpenSearch client; also expand log_function_time (#8209) 2026-02-06 03:31:44 +00:00
Nikolas Garza
ea7654e4b8 feat(ee): block Slack bot for suspended tenants and enforce seat limits (#8202) 2026-02-06 03:28:34 +00:00
Yuhong Sun
eb90775e42 feat: basic langfuse tracing + tracing consolidation (#8207) 2026-02-06 03:28:28 +00:00
Nikolas Garza
75865fcdfd feat: support PEM-style delimiters in license file uploads (#7559) 2026-02-06 02:45:11 +00:00
acaprau
d50dc8fa68 feat(opensearch): Support bulk indexing (#8203) 2026-02-06 02:40:50 +00:00
Justin Tahara
39b96973ec chore(openai): Add test for Chat Models (#8213) 2026-02-06 02:21:03 +00:00
Justin Tahara
a342c4d848 fix(openai): Set Auto Reasoning effort to Medium (#8211) 2026-02-06 01:40:49 +00:00
Yuhong Sun
7c084a35b6 fix: GPT -chat models (#8210) 2026-02-06 00:47:23 +00:00
Raunak Bhagat
946eba5ba5 feat(opal): expand InteractiveBase variant system (#8200)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 00:13:41 +00:00
Wenxi
ec4f85f4a4 chore: bump sandbox cpu and memory limits (#8208) 2026-02-05 15:32:39 -08:00
Jamison Lahman
d8fd6d398e chore(ruff): enable flake8's unused-arg rules (#8206) 2026-02-05 23:25:07 +00:00
Wenxi
ef85a14b6e fix: mt provisioning rollback and add tests (#8205) 2026-02-05 23:15:36 +00:00
Jamison Lahman
97b44b530e chore(devtools): CLAUDE.md.template -> AGENTS.md (#8197) 2026-02-05 22:46:37 +00:00
Justin Tahara
e05a34cad3 chore(chat): Cleaning Error Codes + Tests (#8186) 2026-02-05 21:13:30 +00:00
Yuhong Sun
d80a4270cb fix: LLM Read Timeout (#8193) 2026-02-05 20:51:37 +00:00
Justin Tahara
a26b4ff888 fix(agents): Removing Label Dependency (#8189) 2026-02-05 20:41:44 +00:00
Wenxi
185d2bb813 feat: recommend opus 4-6 (#8198) 2026-02-05 20:36:12 +00:00
Justin Tahara
d5b64e8472 chore(openai): Add Reasoning Specific Test (#8195) 2026-02-05 20:24:18 +00:00
Justin Tahara
378a216af3 fix(ci): Model Check update (#8196) 2026-02-05 20:17:06 +00:00
Jamison Lahman
2c002c48f7 chore(ruff): move config up a level (#8192) 2026-02-05 20:11:54 +00:00
Jamison Lahman
9c20549e58 chore(devtools): upgrade ods: 0.4.1->0.5.0 (#8190) 2026-02-05 20:08:03 +00:00
Evan Lohn
ffd30ae72a chore: bump default usage limits (#8188) 2026-02-05 19:43:46 +00:00
Wenxi
e18496dfa7 fix: don't run craft setup script unless it exists (#8191) 2026-02-05 19:39:19 +00:00
roshan
560a78a5d0 fix(craft): file upload (#8149) 2026-02-05 19:18:39 +00:00
Wenxi
10bc398746 refactor(craft): chad s5cmd > chud aws cli (mem overhead + speed) (#8170) 2026-02-05 18:58:32 +00:00
Jamison Lahman
9356f79461 chore(devtools): ods compose to start containers (#8185) 2026-02-05 18:44:31 +00:00
Raunak Bhagat
e246b53108 feat(opal): extract Hoverable into Interactive atom (#8173)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 18:32:49 +00:00
Justin Tahara
26533d58e2 fix(openai): Fix reasoning (#8183) 2026-02-05 18:27:13 +00:00
SubashMohan
a32f27f4c8 feat(CommandMenu): add comprehensive tests (#8159) 2026-02-05 17:57:47 +00:00
Yuhong Sun
413a96f138 chore: DR Prompt Tuning (#8180) 2026-02-05 07:02:07 +00:00
Yuhong Sun
73a6721886 chore: Slight tweaks of DR (#8179) 2026-02-04 21:07:27 -08:00
Evan Lohn
01872a7196 fix(salesforce): cleanup logic (#8175) 2026-02-05 02:50:08 +00:00
Evan Lohn
0ba1f715f2 feat(filesys): disabled sections (#8153) 2026-02-05 02:27:47 +00:00
Yuhong Sun
94d0dc0ffe chore: DR description for GA (#8178) 2026-02-05 01:22:04 +00:00
Yuhong Sun
039daa0027 chore: Sanitize LLM tool call args (#8177) 2026-02-05 01:14:05 +00:00
Yuhong Sun
62b1c55494 fix: Anthropic DR requires setting reasoning limit if we want to set output limit (#8168) 2026-02-05 00:39:58 +00:00
Nikolas Garza
1800d4b9d7 fix(db): add cascade delete to search_query user_id foreign key (#8176) 2026-02-05 00:34:28 +00:00
Justin Tahara
5ed2d78471 fix(ui): Additional LLM Config update (#8174) 2026-02-04 23:16:01 +00:00
Justin Tahara
ff28dc9c72 fix(ci): Allow for flexible beta tag (#8171) 2026-02-04 22:17:12 +00:00
Raunak Bhagat
e88a7ac868 fix: Fix expansion error inside of TextView (#8151) 2026-02-04 21:50:40 +00:00
Justin Tahara
79c1bbe666 fix(ci): Notification workflow for Slack (#8167) 2026-02-04 21:38:04 +00:00
Jessica Singh
b1168d4526 chore(chat compress): create readme (#8165) 2026-02-04 21:31:18 +00:00
Wenxi
21751b2cf2 fix(craft): bump aws sync concurrent requests 10-->200 (#8163) 2026-02-04 19:54:35 +00:00
Evan Lohn
cb33263ef0 feat(filesys): sorting attached knowledge (#8156) 2026-02-04 18:56:28 +00:00
Justin Tahara
9f9a68f2eb fix(ci): Fix Bedrock Test (#8161) 2026-02-04 17:45:24 +00:00
SubashMohan
9c09c07980 test(timeline): add unit tests for packet processor (#8135) 2026-02-04 09:41:01 +00:00
Yuhong Sun
9aaac7f1ad chore: firecrawl v2 (#8155) 2026-02-03 23:12:34 -08:00
SubashMohan
8b2071a3ae fix(timeline): consolidate header components and visual fixes (#8133) 2026-02-04 06:57:36 +00:00
Evan Lohn
733d55c948 feat(filesys): sharepoint v1 (#8130) 2026-02-04 05:15:26 +00:00
Evan Lohn
1498238c43 feat(filesys): slack connector (#8118) 2026-02-04 04:48:43 +00:00
Evan Lohn
f0657dc1a3 feat(filesys): jira hierarchy v1 (#8113) 2026-02-04 04:48:31 +00:00
SubashMohan
96e71c496b fix(ChatSearchCommandMenu): improve keyboard navigation and search UX (#8134) 2026-02-04 03:20:18 +00:00
Yuhong Sun
db4e1dc1a3 fix: Give more helpful message to LLM on bad tool calls (#8150) 2026-02-04 01:51:39 +00:00
acaprau
bce5f0889f chore(document index): Remove offset (#8148) 2026-02-04 00:55:53 +00:00
acaprau
fa2f4e781a feat(opensearch): Implement admin and random retrieval; fully deprecate update in the old interface; relax update restrictions (#8142) 2026-02-04 00:26:16 +00:00
Yuhong Sun
abdb683584 fix: Back off to basic auth (#8146) 2026-02-03 15:28:53 -08:00
Yuhong Sun
b7b4737b05 chore: Remove auth log (#8145) 2026-02-03 15:22:03 -08:00
Yuhong Sun
3f9b143429 feat: Track reasoning in Braintrust (#8143) 2026-02-03 15:07:26 -08:00
Wenxi
dbf08a3483 fix(craft): pod restoration race, recovery from unexpected state, and updating heartbeat on session creation (#8140) 2026-02-03 22:47:16 +00:00
Nikolas Garza
43e2e7c69c fix(auth): redirect to login page after email verification (#8137) 2026-02-03 22:35:29 +00:00
Yuhong Sun
1da20bc240 fix: DR time based wrap ups (#8141) 2026-02-03 14:39:58 -08:00
acaprau
58b376d7b7 feat(opensearch): Add tenant ID to the document chunk ID (#8129) 2026-02-03 19:41:19 +00:00
Wenxi
23e47a48e1 fix(craft): wrong usage limit string (#8136) 2026-02-03 19:24:16 +00:00
acaprau
cda5b00174 feat(opensearch): String filtering (#8110) 2026-02-03 17:54:54 +00:00
acaprau
6f4ababb11 chore(opensearch): Some small cleanup around update (#8119) 2026-02-03 17:52:06 +00:00
Nikolas Garza
e90656efbe feat(ee): updating billing api (#8073) 2026-02-03 08:26:44 -08:00
Jessica Singh
b3803808e0 feat(chat history): summarize older messages (#7810)
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-03 00:35:23 -08:00
SubashMohan
f5415bace6 feat(timeline): QA fixes around renderers (#8132) 2026-02-03 06:34:16 +00:00
Nikolas Garza
b255297365 feat(ee): fe - enable new billing UI (5/5) (#8072) 2026-02-03 05:11:37 +00:00
Yuhong Sun
5463d6aadc feat: LLM history now allows parallel tool call for a single message (#8131) 2026-02-03 04:05:17 +00:00
roshan
b547d487c1 fix(craft): CREEPA? AW MAN (#8115) 2026-02-03 02:47:05 +00:00
Nikolas Garza
18821b612b feat(ee): fe - update billing hooks and services (4/5) (#8071) 2026-02-03 02:12:35 +00:00
Yuhong Sun
2368cef307 fix: LiteLLM in threading context (#8103)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 01:58:08 +00:00
Nikolas Garza
668cc71be4 feat(ee): fe - add BillingDetailsView component (3/5) (#8070) 2026-02-03 01:39:27 +00:00
Justin Tahara
09f3ad8985 feat(ui): Selecting all Models toggle (#8125) 2026-02-03 01:28:09 +00:00
Nikolas Garza
38e88c7b5c feat(ee): fe - add CheckoutView component (2/5) (#8069) 2026-02-03 01:15:34 +00:00
Justin Tahara
cc7bfdbcde fix(ui): Model Selection in Place (#8128) 2026-02-03 00:46:46 +00:00
Nikolas Garza
0e3c511974 feat(ee): fe - add new billing page structure with PlansView (#8068) 2026-02-03 00:45:03 +00:00
Wenxi
9606461ba0 fix(craft): phantom pre-provisioned sandboxes and poll for fresh session on welcome page (#8124) 2026-02-03 00:26:45 +00:00
Yuhong Sun
d01fcbbf7a chore: LiteLLM bump version (#8114) 2026-02-03 00:14:39 +00:00
Evan Lohn
325a38e502 feat(filesys): selected info improvements (#8117) 2026-02-03 00:10:28 +00:00
Evan Lohn
3916556397 fix: perm sync group prefixing (#8077) 2026-02-02 23:44:17 +00:00
Danelegend
a7edcd6880 feat(provider): create flow mapping table (#8025) 2026-02-02 23:41:53 +00:00
Justin Tahara
f18f0ffd96 feat(helm): Add Probes for All (#8112) 2026-02-02 23:05:01 +00:00
Justin Tahara
06c060bb1f fix(ui): Search Connectors (#8116) 2026-02-02 23:04:53 +00:00
roshan
94ebe9e221 fix(craft): fix default dockerfile outputs_template_path and venv_template_path (#8102) 2026-02-02 14:28:08 -08:00
Justin Tahara
99c9c378cd chore(no-auth): Clean up Playwright (#8109) 2026-02-02 12:19:54 -08:00
victoria reese
df7ab6841a fix: resolve pod label duplication (#8098)
Co-authored-by: victoria-reese_wwg <victoria.reese@grainger.com>
2026-02-02 18:45:00 +00:00
Raunak Bhagat
2131c86c16 refactor: More app header cleanups (#8097)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 18:36:16 +00:00
acaprau
7d1b9e4356 chore(opensearch): Migration 2- Introduce external dependency tests (#8045)
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-02 17:35:49 +00:00
SubashMohan
38e92308ec feat(chat): new agent timeline blocks (#8101) 2026-02-02 14:53:20 +00:00
Danelegend
2444b59070 chore(provider): add more integration tests for provider flow (#8099) 2026-02-01 19:35:55 -08:00
Yuhong Sun
49771945e1 chore: DR to run more than 1 cycle typically (#8100) 2026-02-01 17:17:50 -08:00
Justin Tahara
15f0bc9c3d fix(ui): Agent Saving with other people files (#8095) 2026-02-01 22:38:45 +00:00
Justin Tahara
963b172a09 fix(ui): Cleanup Card Span (#8094) 2026-02-01 22:26:25 +00:00
Justin Tahara
dc2bf20a8d fix(ui): Ollama Model Selection (#8091) 2026-02-01 21:29:57 +00:00
Raunak Bhagat
d29f1efec0 refactor: hooks (#8089)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 20:26:40 +00:00
Justin Tahara
13d1c3d86a feat(desktop): Ensure that UI reflects Light/Dark Toggle (#7684) 2026-02-01 20:03:42 +00:00
Wenxi
adc6773f9f fix(craft): attempt to solve hanging with explicit k8s_stream timeout (#8066) 2026-02-01 17:51:25 +00:00
Raunak Bhagat
a819482749 refactor: Update Hoverable to be more adherent to the mocks (#8083) 2026-02-01 08:53:53 +00:00
Raunak Bhagat
f660f9f447 refactor: Update InputSelect implementation (#8076) 2026-02-01 08:53:43 +00:00
Yuhong Sun
26f9574364 chore: Web query sanitize (#8085) 2026-01-31 23:44:55 -08:00
Yuhong Sun
9fa17c7713 chore: remove long term log (#8084) 2026-01-31 23:42:47 -08:00
Yuhong Sun
1af484503e chore: ASCII in docs (#8082) 2026-01-31 23:16:38 -08:00
Yuhong Sun
55276be061 chore: ensure ascii false (#8081) 2026-01-31 23:11:53 -08:00
Yuhong Sun
4bb02459ae chore: DR tool tuning (#8080) 2026-01-31 22:57:58 -08:00
Yuhong Sun
7109aea897 chore: OpenURL sometimes gives too many tokens (#8079) 2026-01-31 22:23:33 -08:00
Yuhong Sun
8ce4cfc302 chore: Tune web search (#8078) 2026-01-31 21:29:12 -08:00
Yuhong Sun
0f75de9687 chore: hopefully help LLM not spam web queries (#8075) 2026-01-31 20:21:22 -08:00
acaprau
9782fcb0b9 feat(opensearch): Migration 1 - Introduce and implement migration tasks (#8014)
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-01 03:35:34 +00:00
Yuhong Sun
ec2a35b3a4 chore: DR edge case (#8074) 2026-01-31 19:26:07 -08:00
Raunak Bhagat
9815c2c8d9 refactor: Clean up app page rendering logic (#8060) 2026-02-01 03:15:21 +00:00
acaprau
8c3e3a6e02 chore(tests): Fix name for test_expire_oauth_token, loosen timing bounds a bit (#8067) 2026-02-01 03:04:49 +00:00
acaprau
726c6232a5 feat(opensearch): Migration 0 - Introduce db tables, alembic migration, db model utils (#8013) 2026-02-01 02:36:16 +00:00
Evan Lohn
f9d41ff1da feat(filesys): initial confluence hierarchy impl (#7932) 2026-02-01 02:28:55 +00:00
Evan Lohn
eb3eb83c95 chore: ban chat-tempmail (#8063) 2026-02-01 01:59:49 +00:00
Evan Lohn
e4b9ef176f fix: attaching user files to assistant (#8061) 2026-02-01 01:33:22 +00:00
trial2onyx
d18dd62641 fix(chat): reduce scroll container bottom margin (#8048)
Co-authored-by: Onyx Trialee 2 <onyxtrial2@Onyxs-MBP.attlocal.net>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-01 00:26:48 +00:00
Yuhong Sun
96224164ca chore: Some settings for DR for evals (#8058) 2026-01-31 15:44:39 -08:00
Wenxi
78cec7c9e9 refactor(craft): make usage limit overrides feature flags instead of env vars (#8056) 2026-01-31 23:35:14 +00:00
Wenxi
8fa7002826 chore(craft): bump sandbox image default value (#8055) 2026-01-31 23:10:53 +00:00
Nikolas Garza
921305f8ff feat(billing): add circuit breaker, license re-claim, and seats to checkout (#8005) 2026-01-31 22:18:27 +00:00
Raunak Bhagat
71148dd880 refactor: Consolidate duplicated AppHeader components into one (#8054)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 22:07:57 +00:00
Nikolas Garza
ac26ba6c2d chore: remove license cache invalidation from multi-tenant (#8052) 2026-01-31 19:47:34 +00:00
Raunak Bhagat
24584d4067 fix: Consolidate providers into one central location (#8032) 2026-01-31 13:46:10 +00:00
Wenxi
39d8d1db0c fix: optional dependency for /me (#8042) 2026-01-31 03:06:01 +00:00
trial2onyx
17824c5d92 refactor(chat): move loading indicator to content area (#8039)
Co-authored-by: Onyx Trialee 2 <onyxtrial2@Onyxs-MBP.attlocal.net>
2026-01-31 02:23:15 +00:00
roshan
eba89fa635 fix(craft): idle sandbox cleanup (#8041) 2026-01-31 02:20:12 +00:00
Nikolas Garza
53f4025a23 feat(components): add InputNumber with increment/decrement controls (#8003) 2026-01-31 01:17:38 +00:00
Wenxi
9159b159fa fix: troll discord assertion (#8038) 2026-01-31 00:46:48 +00:00
Jamison Lahman
d7a22b916b fix(fe): polish chat UI with custom background (#8016)
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-01-31 00:25:59 +00:00
roshan
97d90a82f8 fix(craft): files stuff (#8037) 2026-01-31 00:16:33 +00:00
Nikolas Garza
d9cf5afee8 fix(ee): use set(ex=) instead of setex() for license cache updates (#8004) 2026-01-30 16:16:40 -08:00
Wenxi
ce43dee20f fix: discord connector tests (#8036) 2026-01-30 23:32:09 +00:00
Justin Tahara
90ac23a564 fix(ui): Updating Dropdown Modal component (#8033) 2026-01-30 23:00:52 +00:00
Jamison Lahman
d9f97090d5 chore(gha): build desktop app in CI (#7996) 2026-01-30 22:54:28 +00:00
Raunak Bhagat
2661e27741 feat: Add new tag icon (#8029) 2026-01-30 22:33:10 +00:00
Wenxi
0481b61f8d refactor: craft onboarding ease (#8030) 2026-01-30 22:28:03 +00:00
roshan
6d12c9c430 fix(craft): clear env vars from all sandboxes in file_sync pods (#8028) 2026-01-30 22:05:57 +00:00
Justin Tahara
b81dd6f4a3 fix(desktop): Remove Global Shortcuts (#7914) 2026-01-30 21:19:55 +00:00
Justin Tahara
f9a648bb5f fix(asana): Workspace Team ID mismatch (#7674) 2026-01-30 20:52:21 +00:00
Raunak Bhagat
e9be9101e5 fix: Add explicit sizings to icons (#8018) 2026-01-30 20:48:14 +00:00
Danelegend
e670bd994b feat(persona): Add default_model_configuration_id column (#8020) 2026-01-30 20:44:03 +00:00
Chris Weaver
a48d74c7fd fix: onboarding model specification (#8019) 2026-01-30 19:57:11 +00:00
Evan Lohn
0e76ae3423 feat: notion connector hierarchynodes (#7931) 2026-01-30 19:28:34 +00:00
Evan Lohn
37bfa5833b fix: race conditions in drive hiernodes (#8017) 2026-01-30 18:30:05 +00:00
Wenxi
6c46fcd651 chore: dev env template defaults (#8015) 2026-01-30 18:05:36 +00:00
roshan
7700674b15 chore: launch.json web server uses .env.web (#7993) 2026-01-30 17:36:32 +00:00
Evan Lohn
4ac6ff633a feat(filesys): working filesys explorer (#7760) 2026-01-30 12:14:56 +00:00
Raunak Bhagat
efd198072e refactor: Update layout components and SettingsPage (#8008)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 11:29:22 +00:00
Evan Lohn
b207a165c7 feat(filesys): UI for selecting hierarchy in assistant creation part 1 (#7721) 2026-01-30 10:36:51 +00:00
Raunak Bhagat
c231d2ec67 refactor: Update hoverable (#8007)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 09:58:26 +00:00
Danelegend
d1a0c75a40 fix(llm): existing custom config not used (#8002) 2026-01-30 07:47:59 +00:00
Evan Lohn
3b737fe311 feat(filesys): filter on assistant info (#7852) 2026-01-30 06:51:00 +00:00
Evan Lohn
e7abbbdc7f feat(filesys): APIs for attaching hiernodes (#7698) 2026-01-30 06:02:25 +00:00
Raunak Bhagat
5d5080e9e1 feat: Add bottomSlot to modal API (#8000) 2026-01-30 04:56:33 +00:00
Jamison Lahman
83b7c5d088 chore(devserver): fix invalid customTheme require (#8001) 2026-01-30 04:53:03 +00:00
Danelegend
f08cdc603b fix(vertex): standardise vertex image config (#7988) 2026-01-30 04:50:54 +00:00
Raunak Bhagat
6932791dd5 refactor: Add a HoverableContainer (#7997) 2026-01-30 03:46:41 +00:00
acaprau
f334b365e0 hygiene(opensearch): Some cleanup (#7999) 2026-01-29 18:42:30 -08:00
Evan Lohn
af58ae5ad9 endpoint clean (#7998) 2026-01-29 18:40:45 -08:00
Raunak Bhagat
bcd8314dd7 refactor: Small tweaks to a few components (#7995) 2026-01-30 01:30:13 +00:00
Raunak Bhagat
cddb26ff19 feat: Add new star icon + rename icon file with invalid naming (#7992) 2026-01-30 01:29:47 +00:00
roshan
c8d38de37f fix(ce): documents sidebar spawns (#7994) 2026-01-30 00:55:07 +00:00
Jamison Lahman
f2e95ee8bb chore(deps): Bump mdast-util-to-hast from 13.2.0 to 13.2.1 in /web (#7991) 2026-01-30 00:50:24 +00:00
Jamison Lahman
94ee45ce64 chore(flags): rm unused NEXT_PUBLIC_ENABLE_CHROME_EXTENSION (#7983) 2026-01-30 00:35:22 +00:00
Jamison Lahman
f36d15d924 chore(flags): remove unused NEXT_PUBLIC_DEFAULT_SIDEBAR_OPEN (#7984) 2026-01-30 00:35:08 +00:00
Jamison Lahman
ec866debc0 chore(deps): Bump @sentry/nextjs from 10.23.0 to 10.27.0 in /web (#7990) 2026-01-30 00:19:29 +00:00
dependabot[bot]
08f80b4abf chore(deps): bump starlette from 0.47.2 to 0.49.3 in /backend/requirements (#5964)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-30 00:07:31 +00:00
Raunak Bhagat
e559a4925a refactor: Add expandable card layouts with smooth animations (#7981) 2026-01-29 15:37:45 -08:00
Justin Tahara
1f63a23238 fix(helm): Fixing PSQL Operator Labeling (#7985) 2026-01-29 23:13:20 +00:00
Evan Lohn
658c76dd0a fix: custom config (#7987) 2026-01-29 23:01:16 +00:00
Jamison Lahman
00828af63f chore(fe): update baseline-browser-mapping (#7986) 2026-01-29 22:55:22 +00:00
victoria reese
71c6e40d5e feat: enable optional host setting (#7979)
Co-authored-by: victoria-reese_wwg <victoria.reese@grainger.com>
2026-01-29 21:36:59 +00:00
Jessica Singh
f3ff4b57bd feat(auth): update default auth (#7443)
Co-authored-by: Dane Urban <danelegend13@gmail.com>
2026-01-29 12:57:24 -08:00
Jamison Lahman
bf1752552b chore(tests): add retries to azure embeddings daily test (#7978) 2026-01-29 20:42:10 +00:00
Raunak Bhagat
5a9f9e28dc refactor: Consolidate Label component (#7974) 2026-01-29 19:52:39 +00:00
Wenxi
655cfc4858 fix: input masking (#7977) 2026-01-29 18:10:29 +00:00
Wenxi
b26c2e27b2 fix: don't show intro anim with new tenant modal + usage (#7976) 2026-01-29 17:57:45 +00:00
Evan Lohn
305a667bf9 test(filesys): drive hierarchynodes (#7676) 2026-01-29 17:45:03 +00:00
Wenxi
6bc5b083d5 feat(craft): make last name optional in user info form (#7973)
Co-authored-by: Claude <noreply@anthropic.com>
2026-01-29 16:06:34 +00:00
Raunak Bhagat
31213d43b3 refactor: Edit SimpleCollapsible API and update stylings for Modal (#7971) 2026-01-29 00:51:57 -08:00
roshan
a9e79b45cc feat(craft): README (#7970)
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-01-28 22:59:12 -08:00
Evan Lohn
936ce0535d fix: llm provider upserts (#7969) 2026-01-29 06:33:42 +00:00
Raunak Bhagat
165710b5d6 fix: Edit styling (#7968) 2026-01-28 22:18:56 -08:00
roshan
c2ab9ca2a2 fix(craft): RESTORING WORKS (#7966) 2026-01-28 20:06:51 -08:00
roshan
3bcdeea560 fix(craft): PROMPT IMPROVEMENTS (#7961) 2026-01-28 19:16:58 -08:00
Yuhong Sun
31200a1b41 chore: Remove Reranking (#7946) 2026-01-29 01:26:26 +00:00
Nikolas Garza
a6261d57fd feat(ee): fe - add billing hooks and actions (#7858) 2026-01-29 01:19:44 +00:00
Wenxi
4c5e65e6dd fix(craft): auto set best model instead of checking for visibility (#7962) 2026-01-29 00:29:05 +00:00
Chris Weaver
e70115d359 fix: improve termination (#7964) 2026-01-28 16:19:36 -08:00
Raunak Bhagat
eec188f9d3 refactor: Make AgentCard use LineItemLayout for its information instead (#7958) 2026-01-29 00:10:18 +00:00
Chris Weaver
0504335a7b fix: local indexing for craft (#7959) 2026-01-28 16:12:25 -08:00
Wenxi
f5186b5e44 refactor: craft onboarding nit and connector docs (#7960) 2026-01-28 23:49:33 +00:00
Wenxi
8e3d4e1474 refactor(craft): fix pre-provisioning state management, fix demo data state management (#7955) 2026-01-28 22:59:21 +00:00
dependabot[bot]
474fb028b0 chore(deps): bump lodash-es from 4.17.21 to 4.17.23 in /web (#7652)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-28 22:51:58 +00:00
dependabot[bot]
d25e773b0e chore(deps): Bump mistune from 0.8.4 to 3.1.4 in /backend (#6407)
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-28 22:48:06 +00:00
dependabot[bot]
c5df9d8863 chore(deps): bump lodash from 4.17.21 to 4.17.23 in /web (#7670)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-28 22:32:52 +00:00
dependabot[bot]
28eabdc885 chore(deps): bump esbuild and vite in /widget (#7543)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-28 14:33:37 -08:00
dependabot[bot]
72f34e403c chore(deps): bump astral-sh/setup-uv from 7.1.5 to 7.2.0 (#7528)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-28 14:31:13 -08:00
dependabot[bot]
8037dd2420 chore(deps): bump actions/checkout from 6.0.1 to 6.0.2 (#7802)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-28 14:30:52 -08:00
Justin Tahara
d29a384da6 chore(braintrust): Removing indexing_pipeline logs (#7957) 2026-01-28 22:25:33 +00:00
Jamison Lahman
fe7e5d3c55 chore(deps): add pytest-repeat to dev (#7956) 2026-01-28 22:10:49 +00:00
dependabot[bot]
91185f80c4 chore(deps): bump j178/prek-action from 1.0.11 to 1.0.12 (#7529)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-28 14:19:33 -08:00
dependabot[bot]
1244df1176 chore(deps): bump next from 16.1.2 to 16.1.5 in /examples/widget (#7885)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-28 14:12:11 -08:00
dependabot[bot]
080e58d875 chore(deps): bump pypdf from 6.6.0 to 6.6.2 (#7834)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-28 14:11:47 -08:00
roshan
420f46ce48 chore(craft): more craft logging (#7954)
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-01-28 14:11:04 -08:00
dependabot[bot]
50835b4fd0 chore(deps): bump hono from 4.11.5 to 4.11.7 in /backend/onyx/server/features/build/sandbox/kubernetes/docker/templates/outputs/web (#7880)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-28 14:10:38 -08:00
dependabot[bot]
b08a3f2195 chore(deps): bump next from 16.1.4 to 16.1.5 in /backend/onyx/server/features/build/sandbox/kubernetes/docker/templates/outputs/web (#7887)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-28 14:10:25 -08:00
dependabot[bot]
dbf0c10632 chore(deps): bump next from 16.0.10 to 16.1.5 in /web (#7882)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-28 21:48:33 +00:00
Jamison Lahman
04433f8d44 chore(hygiene): remove linux kernel (#7953) 2026-01-28 21:31:22 +00:00
Raunak Bhagat
e426ca627f refactor: rename /chat route to /app (#7711)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 21:03:04 +00:00
roshan
6c9651eb97 feat(craft): onyx craft upsell upgrade modal when you run out of free messages (#7943) 2026-01-28 20:55:57 +00:00
roshan
02140eed98 fix(craft): hide session limit (#7947) 2026-01-28 20:55:47 +00:00
Jamison Lahman
93f316fa8a chore(devtools): upgrade ods: v0.4.0->v0.4.1 (#7952) 2026-01-28 20:39:03 +00:00
Wenxi
e02a60ddc7 fix: exceptions trace modal (#7951) 2026-01-28 20:25:45 +00:00
Raunak Bhagat
aa413e93d1 refactor: New sections/cards directory to host all feature-specific cards. (#7949) 2026-01-28 20:23:50 +00:00
roshan
2749e9dd6d fix(craft): install script for craft will force pull latest image for any craft-* image tags (#7950) 2026-01-28 20:08:42 +00:00
Jamison Lahman
decca26a71 chore(devtools): ods cherry-pick QOL (#7708) 2026-01-28 19:03:54 +00:00
Justin Tahara
1c490735b1 chore(api): Cleanup (#7945) 2026-01-28 18:51:31 +00:00
Yuhong Sun
87da107a03 fix: Cloud Embedding Keys (#7944) 2026-01-28 18:31:08 +00:00
Evan Lohn
f8b56098cc feat(filesys): hierarchynodes carry permission info (#7669) 2026-01-28 09:12:47 +00:00
Evan Lohn
a3a43173f7 feat(filesys): drive hierarchynodes (#7560) 2026-01-28 08:15:35 +00:00
Evan Lohn
aea924119d feat(filesys): hierarchyfetching task impl (#7557) 2026-01-28 06:40:41 +00:00
Chris Weaver
a79e581465 fix: attachment prompt tweak (#7929) 2026-01-27 22:44:43 -08:00
Chris Weaver
6a02ff9922 fix: kubernetes freezing (#7928) 2026-01-27 22:32:07 -08:00
Wenxi
71b8746a34 fix: z index for output panel (#7926) 2026-01-27 22:12:36 -08:00
Evan Lohn
7080b3d966 feat(filesys): creation of hierarchyfetching job (#7555) 2026-01-28 06:03:15 +00:00
Wenxi
adc3c86b16 feat(craft): allow closing LLM setup modal (#7925) 2026-01-28 05:58:09 +00:00
roshan
b110621b13 fix(craft): install script for craft-latest image (#7918) 2026-01-27 21:40:30 -08:00
Evan Lohn
a2dc752d14 feat(filesys): implement hierarchy injection into vector db chunks (#7548) 2026-01-28 05:29:15 +00:00
Wenxi
f7d47a6ca9 refactor: build/v1 to craft/v1 (#7924) 2026-01-28 05:07:50 +00:00
roshan
9cc71b71ee fix(craft): allow more lenient tag names (for versioning) (#7921) 2026-01-27 21:13:35 -08:00
Wenxi
f2bafd113a refactor: packet type processing and path sanitization (#7920) 2026-01-28 04:33:54 +00:00
roshan
bb00ebd4a8 fix(craft): block opencode.json read (#7846) 2026-01-28 04:29:07 +00:00
Evan Lohn
fda04aa6d2 feat(filesys): opensearch integration with hierarchy (#7429) 2026-01-28 04:04:30 +00:00
Yuhong Sun
285aae6f2f chore: README (#7919) 2026-01-27 19:45:13 -08:00
Yuhong Sun
b75b1019f3 chore: kg stuff in celery (#7908) 2026-01-28 03:36:31 +00:00
Evan Lohn
bbba32b989 feat(filesys): connect hierarchynode and assistant (#7428) 2026-01-28 03:28:47 +00:00
joachim-danswer
f06bf69956 fix(craft): new demo data & change of eng IC demo persona (#7917) 2026-01-28 03:10:54 +00:00
roshan
7d4fe480cc fix(craft): files directory works locally + kube (#7913) 2026-01-27 19:01:08 -08:00
Chris Weaver
7f5b512856 feat: craft ui improvements (#7916) 2026-01-28 02:52:39 +00:00
Wenxi
844a01f751 fix(craft): allow initializing non-visible models (#7915) 2026-01-28 02:49:51 +00:00
Evan Lohn
d64be385db feat(filesys): Connectors know about hierarchynodes (#7404) 2026-01-28 02:39:43 +00:00
roshan
d0518388d6 feat(craft): update github action for craft latest (#7910)
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-01-27 18:45:44 -08:00
Justin Tahara
a7f6d5f535 chore(tracing): Adding more explicit Tracing to our callsites (#7911) 2026-01-28 01:44:09 +00:00
Wenxi
059e2869e6 feat: md preview scrollbar (#7909) 2026-01-28 01:35:43 +00:00
Chris Weaver
04d90fd496 fix: improve session recovery (#7912) 2026-01-28 01:30:49 +00:00
Nikolas Garza
7cd29f4892 feat(ee): improve license enforcement middleware (#7853) 2026-01-28 01:26:02 +00:00
roshan
c2b86efebf fix(craft): delete session ui (#7847) 2026-01-27 17:30:35 -08:00
Nikolas Garza
bc5835967e feat(ee): Add unified billing API (#7857) 2026-01-27 17:02:08 -08:00
Evan Lohn
c2b11cae01 feat(filesys): data models and migration (#7402) 2026-01-28 00:03:52 +00:00
Chris Weaver
cf17ba6a1c fix: db connection closed for craft (#7905) 2026-01-27 15:46:46 -08:00
Jamison Lahman
b03634ecaa chore(mypy): fix mypy cache issues switching between HEAD and release (#7732) 2026-01-27 23:29:51 +00:00
Wenxi
9a7e92464f fix: demo data toggle race condition (#7902) 2026-01-27 23:06:17 +00:00
Wenxi
09b2a69c82 chore: remove pyproject config for pypandoc mypy (#7894) 2026-01-27 22:31:41 +00:00
Jamison Lahman
c5c027c168 fix: sidebar items are title case (#7893) 2026-01-27 22:05:06 +00:00
Wenxi
882163a4ea feat: md rendering, docx conversion and download, output panel refresh refactor for all artifacts (#7892) 2026-01-27 21:58:06 +00:00
roshan
de83a9a6f0 feat(craft): better output formats (#7889) 2026-01-27 21:48:08 +00:00
Jamison Lahman
f73ce0632f fix(citations): enable citation sidebar w/ web_search-only assistants (#7888) 2026-01-27 20:55:12 +00:00
Justin Tahara
0b10b11af3 fix(redis): Adding more TTLs (#7886) 2026-01-27 20:31:54 +00:00
roshan
d9e3b657d0 fix(craft): only include org_info/ when demo data enabled (#7845) 2026-01-27 19:48:48 +00:00
Justin Tahara
f6e9928dc1 fix(llm): Hide private models from Agent Creation (#7873) 2026-01-27 19:44:13 +00:00
Justin Tahara
ca3179ad8d chore(pr): Add Cherry-pick check (#7805) 2026-01-27 19:31:10 +00:00
Nikolas Garza
5529829ff5 feat(ee): update api to claim license via cloud proxy (#7840) 2026-01-27 18:46:39 +00:00
Chris Weaver
bdc7f6c100 chore: specify sandbox version (#7870) 2026-01-27 10:49:39 -08:00
Wenxi
90f8656afa fix: connector details back button should nav back (#7869) 2026-01-27 18:36:41 +00:00
Wenxi
3c7d35a6e8 fix: remove posthog debug logs and adjust gitignore (#7868) 2026-01-27 18:36:14 +00:00
Nikolas Garza
40d58a37e3 feat(ee): enforce seat limits on user operations (#7504) 2026-01-27 18:12:09 +00:00
Justin Tahara
be3ecd9640 fix(helm): Updating Ingress Templates (#7864) 2026-01-27 17:21:01 +00:00
Chris Weaver
a6da511490 fix: pass in correct region to allow IRSA usage (#7865) 2026-01-27 17:20:25 +00:00
roshan
c7577ebe58 fix(craft): only insert onyx user context when demo data not enabled (#7841) 2026-01-27 17:13:33 +00:00
SubashMohan
b87078a4f5 feat(chat): Search over chats and projects (#7788) 2026-01-27 16:57:00 +00:00
Yuhong Sun
8a408e7023 fix: Project Creation (#7851) 2026-01-27 05:27:19 +00:00
Nikolas Garza
4c7b73a355 feat(ee): add proxy endpoints for self-hosted billing operations (#7819) 2026-01-27 03:57:04 +00:00
Wenxi
8e9cb94d4f fix: processing mode enum (#7849) 2026-01-26 19:09:04 -08:00
Wenxi
a21af4b906 fix: type ignore unrelated mypy for onyx craft head (#7843) 2026-01-26 18:26:53 -08:00
Chris Weaver
7f0ce0531f feat: Onyx Craft (#7484)
Co-authored-by: Wenxi <wenxi@onyx.app>
Co-authored-by: joachim-danswer <joachim@danswer.ai>
Co-authored-by: rohoswagger <roshan@onyx.app>
2026-01-26 17:12:42 -08:00
acaprau
b631bfa656 feat(opensearch): Add separate index settings for AWS-managed OpenSearch; Add function for disabling index auto-creation (#7814) 2026-01-27 00:40:46 +00:00
Nikolas Garza
eca6b6bef2 feat(ee): add license public key file and improve signature verification (#7806) 2026-01-26 23:44:16 +00:00
Wenxi
51ef28305d fix: user count check (#7811) 2026-01-26 13:21:33 -08:00
Jamison Lahman
144030c5ca chore(vscode): add non-clean seeded db restore (#7795) 2026-01-26 08:55:19 -08:00
SubashMohan
a557d76041 feat(ui): add new icons and enhance FadeDiv, Modal, Tabs, ExpandableTextDisplay (#7563)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 10:26:09 +00:00
SubashMohan
605e808158 fix(layout): adjust footer margin and prevent page refresh on chatsession drop (#7759) 2026-01-26 04:45:40 +00:00
roshan
8fec88c90d chore(deployment): remove no auth option from setup script (#7784) 2026-01-26 04:42:45 +00:00
Yuhong Sun
e54969a693 fix: LiteLLM Azure models don't stream (#7761) 2026-01-25 07:46:51 +00:00
Raunak Bhagat
1da2b2f28f fix: Some new fixes that were discovered by AI reviewers during 2.9-hotfixing (#7757)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 04:44:30 +00:00
Nikolas Garza
eb7b91e08e fix(tests): use crawler-friendly search query in Exa integration test (#7746) 2026-01-24 20:58:02 +00:00
Yuhong Sun
3339000968 fix: Spacing issue on Feedback (#7747) 2026-01-24 12:59:00 -08:00
Nikolas Garza
d9db849e94 fix(chat): prevent streaming text from appearing in bursts after citations (#7745) 2026-01-24 11:48:34 -08:00
Yuhong Sun
046408359c fix: Azure OpenAI Tool Calls (#7727) 2026-01-24 01:47:03 +00:00
acaprau
4b8cca190f feat(opensearch): Implement complete retrieval filtering (#7691) 2026-01-23 23:27:42 +00:00
Justin Tahara
52a312a63b feat: onyx discord bot - supervisord and kube deployment (#7706) 2026-01-23 20:55:06 +00:00
1961 changed files with 194530 additions and 33605 deletions

.claude/skills Symbolic link

@@ -0,0 +1 @@
../.cursor/skills

.cursor/mcp.json Normal file

@@ -0,0 +1,16 @@
{
"mcpServers": {
"Playwright": {
"command": "npx",
"args": [
"@playwright/mcp"
]
},
"Linear": {
"url": "https://mcp.linear.app/mcp"
},
"Figma": {
"url": "https://mcp.figma.com/mcp"
}
}
}


@@ -0,0 +1,248 @@
---
name: playwright-e2e-tests
description: Write and maintain Playwright end-to-end tests for the Onyx application. Use when creating new E2E tests, debugging test failures, adding test coverage, or when the user mentions Playwright, E2E tests, or browser testing.
---
# Playwright E2E Tests
## Project Layout
- **Tests**: `web/tests/e2e/` — organized by feature (`auth/`, `admin/`, `chat/`, `assistants/`, `connectors/`, `mcp/`)
- **Config**: `web/playwright.config.ts`
- **Utilities**: `web/tests/e2e/utils/`
- **Constants**: `web/tests/e2e/constants.ts`
- **Global setup**: `web/tests/e2e/global-setup.ts`
- **Output**: `web/output/playwright/`
## Imports
Always use absolute imports with the `@tests/e2e/` prefix — never relative paths (`../`, `../../`). The alias is defined in `web/tsconfig.json` and resolves to `web/tests/`.
```typescript
import { loginAs } from "@tests/e2e/utils/auth";
import { OnyxApiClient } from "@tests/e2e/utils/onyxApiClient";
import { TEST_ADMIN_CREDENTIALS } from "@tests/e2e/constants";
```
All new files should be `.ts`, not `.js`.
## Running Tests
```bash
# Run a specific test file
npx playwright test web/tests/e2e/chat/default_assistant.spec.ts
# Run a specific project
npx playwright test --project admin
npx playwright test --project exclusive
```
## Test Projects
| Project | Description | Parallelism |
|---------|-------------|-------------|
| `admin` | Standard tests (excludes `@exclusive`) | Parallel |
| `exclusive` | Serial, slower tests (tagged `@exclusive`) | 1 worker |
All tests use `admin_auth.json` storage state by default (pre-authenticated admin session).
## Authentication
Global setup (`global-setup.ts`) runs automatically before all tests and handles:
- Server readiness check (polls health endpoint, 60s timeout)
- Provisioning test users: admin, admin2, and a **pool of worker users** (`worker0@example.com` through `worker7@example.com`) (idempotent)
- API login + saving storage states: `admin_auth.json`, `admin2_auth.json`, and `worker{N}_auth.json` for each worker user
- Setting display name to `"worker"` for each worker user
- Promoting admin2 to admin role
- Ensuring a public LLM provider exists
Both test projects set `storageState: "admin_auth.json"`, so **every test starts pre-authenticated as admin with no login code needed**.
When a test needs a different user, use API-based login — never drive the login UI:
```typescript
import { loginAs } from "@tests/e2e/utils/auth";
await page.context().clearCookies();
await loginAs(page, "admin2");
// Log in as the worker-specific user (preferred for test isolation):
import { loginAsWorkerUser } from "@tests/e2e/utils/auth";
await page.context().clearCookies();
await loginAsWorkerUser(page, testInfo.workerIndex);
```
## Test Structure
Tests start pre-authenticated as admin — navigate and test directly:
```typescript
import { test, expect } from "@playwright/test";
test.describe("Feature Name", () => {
test("should describe expected behavior clearly", async ({ page }) => {
await page.goto("/app");
await page.waitForLoadState("networkidle");
// Already authenticated as admin — go straight to testing
});
});
```
**User isolation** — tests that modify visible app state (creating assistants, sending chat messages, pinning items) should run as a **worker-specific user** and clean up resources in `afterAll`. Global setup provisions a pool of worker users (`worker0@example.com` through `worker7@example.com`). `loginAsWorkerUser` maps `testInfo.workerIndex` to a pool slot via modulo, so retry workers (which get incrementing indices beyond the pool size) safely reuse existing users. This ensures parallel workers never share user state, keeps usernames deterministic for screenshots, and avoids cross-contamination:
```typescript
import { test } from "@playwright/test";
import { loginAsWorkerUser } from "@tests/e2e/utils/auth";
test.beforeEach(async ({ page }, testInfo) => {
await page.context().clearCookies();
await loginAsWorkerUser(page, testInfo.workerIndex);
});
```
If the test requires admin privileges *and* modifies visible state, use `"admin2"` instead — it's a pre-provisioned admin account that keeps the primary `"admin"` clean for other parallel tests. Switch to `"admin"` only for privileged setup (creating providers, configuring tools), then back to the worker user for the actual test. See `chat/default_assistant.spec.ts` for a full example.
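A sketch of that flow, combining the helpers above — the setup steps in the middle are placeholders:
```typescript
import { test } from "@playwright/test";
import { loginAs, loginAsWorkerUser } from "@tests/e2e/utils/auth";

test("worker test with privileged setup", async ({ page }, testInfo) => {
  await page.context().clearCookies();
  await loginAs(page, "admin"); // privileged setup only (e.g., configure a tool)
  // ...create providers / configure tools here...
  await page.context().clearCookies();
  await loginAsWorkerUser(page, testInfo.workerIndex); // back to the isolated user
  // ...actual test steps run as the worker user...
});
```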
`loginAsRandomUser` exists for the rare case where the test requires a brand-new user (e.g. onboarding flows). Avoid it elsewhere — it produces non-deterministic usernames that complicate screenshots.
**API resource setup** — only when tests need to create backend resources (image gen configs, web search providers, MCP servers). Use `beforeAll`/`afterAll` with `OnyxApiClient` to create and clean up. See `chat/default_assistant.spec.ts` or `mcp/mcp_oauth_flow.spec.ts` for examples. This is uncommon (~4 of 37 test files).
## Key Utilities
### `OnyxApiClient` (`@tests/e2e/utils/onyxApiClient`)
Backend API client for test setup/teardown. Key methods:
- **Connectors**: `createFileConnector()`, `deleteCCPair()`, `pauseConnector()`
- **LLM Providers**: `ensurePublicProvider()`, `createRestrictedProvider()`, `setProviderAsDefault()`
- **Assistants**: `createAssistant()`, `deleteAssistant()`, `findAssistantByName()`
- **User Groups**: `createUserGroup()`, `deleteUserGroup()`, `setUserRole()`
- **Tools**: `createWebSearchProvider()`, `createImageGenerationConfig()`
- **Chat**: `createChatSession()`, `deleteChatSession()`
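A minimal setup/teardown sketch — only the method names come from the list above; the constructor arguments and return shapes are assumptions:
```typescript
import { test } from "@playwright/test";
import { OnyxApiClient } from "@tests/e2e/utils/onyxApiClient";

let client: OnyxApiClient;
let assistantId: number;

test.beforeAll(async () => {
  // Assumption: the client is constructible without arguments here.
  client = new OnyxApiClient();
  // Assumption: createAssistant returns an object carrying an id.
  const assistant = await client.createAssistant({ name: "E2E-CMD Assistant" });
  assistantId = assistant.id;
});

test.afterAll(async () => {
  // Clean up by ID so parallel workers stay isolated.
  await client.deleteAssistant(assistantId);
});
```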
### `chatActions` (`@tests/e2e/utils/chatActions`)
- `sendMessage(page, message)` — sends a message and waits for AI response
- `startNewChat(page)` — clicks new-chat button and waits for intro
- `verifyDefaultAssistantIsChosen(page)` — checks Onyx logo is visible
- `verifyAssistantIsChosen(page, name)` — checks assistant name display
- `switchModel(page, modelName)` — switches LLM model via popover
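A short usage sketch of these helpers (the prompt text is illustrative; the `onyx-ai-message` test id is the one used in the Assertions section below):
```typescript
import { test, expect } from "@playwright/test";
import { sendMessage, startNewChat } from "@tests/e2e/utils/chatActions";

test("replies to a simple prompt", async ({ page }) => {
  await page.goto("/app");
  await startNewChat(page);
  await sendMessage(page, "Hello!"); // waits for the AI response internally
  await expect(
    page.locator('[data-testid="onyx-ai-message"]')
  ).toHaveCount(1, { timeout: 30000 });
});
```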
### `visualRegression` (`@tests/e2e/utils/visualRegression`)
- `expectScreenshot(page, { name, mask?, hide?, fullPage? })`
- `expectElementScreenshot(locator, { name, mask?, hide? })`
- Controlled by `VISUAL_REGRESSION=true` env var
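A representative call, assuming the options listed above (the masked locator is illustrative):
```typescript
import { test } from "@playwright/test";
import { expectScreenshot } from "@tests/e2e/utils/visualRegression";

test("chat intro looks right", async ({ page }) => {
  await page.goto("/app");
  await page.waitForLoadState("networkidle");
  // Compares only when VISUAL_REGRESSION=true; otherwise a no-op.
  // Masking the assistant name keeps dynamic text out of the diff.
  await expectScreenshot(page, {
    name: "chat-intro",
    mask: [page.getByTestId("assistant-name-display")],
  });
});
```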
### `theme` (`@tests/e2e/utils/theme`)
- `THEMES` — `["light", "dark"] as const` array for iterating over both themes
- `setThemeBeforeNavigation(page, theme)` — sets `next-themes` theme via `localStorage` before navigation
When tests need light/dark screenshots, loop over `THEMES` at the `test.describe` level and call `setThemeBeforeNavigation` in `beforeEach` **before** any `page.goto()`. Include the theme in screenshot names. See `admin/admin_pages.spec.ts` or `chat/chat_message_rendering.spec.ts` for examples:
```typescript
import { THEMES, setThemeBeforeNavigation } from "@tests/e2e/utils/theme";
for (const theme of THEMES) {
test.describe(`Feature (${theme} mode)`, () => {
test.beforeEach(async ({ page }) => {
await setThemeBeforeNavigation(page, theme);
});
test("renders correctly", async ({ page }) => {
await page.goto("/app");
await expectScreenshot(page, { name: `feature-${theme}` });
});
});
}
```
### `tools` (`@tests/e2e/utils/tools`)
- `TOOL_IDS` — centralized `data-testid` selectors for tool options
- `openActionManagement(page)` — opens the tool management popover
## Locator Strategy
Use locators in this priority order:
1. **`data-testid` / `aria-label`** — preferred for Onyx components
```typescript
page.getByTestId("AppSidebar/new-session")
page.getByLabel("admin-page-title")
```
2. **Role-based** — for standard HTML elements
```typescript
page.getByRole("button", { name: "Create" })
page.getByRole("dialog")
```
3. **Text/Label** — for visible text content
```typescript
page.getByText("Custom Assistant")
page.getByLabel("Email")
```
4. **CSS selectors** — last resort, only when above won't work
```typescript
page.locator('input[name="name"]')
page.locator("#onyx-chat-input-textarea")
```
**Never use** `page.locator` with complex CSS/XPath when a built-in locator works.
## Assertions
Use web-first assertions — they auto-retry until the condition is met:
```typescript
// Visibility
await expect(page.getByTestId("onyx-logo")).toBeVisible({ timeout: 5000 });
// Text content
await expect(page.getByTestId("assistant-name-display")).toHaveText("My Assistant");
// Count
await expect(page.locator('[data-testid="onyx-ai-message"]')).toHaveCount(2, { timeout: 30000 });
// URL
await expect(page).toHaveURL(/chatId=/);
// Element state
await expect(toggle).toBeChecked();
await expect(button).toBeEnabled();
```
**Never use** `assert` statements or hardcoded `page.waitForTimeout()`.
## Waiting Strategy
```typescript
// Wait for load state after navigation
await page.goto("/app");
await page.waitForLoadState("networkidle");
// Wait for specific element
await page.getByTestId("chat-intro").waitFor({ state: "visible", timeout: 10000 });
// Wait for URL change
await page.waitForFunction(() => window.location.href.includes("chatId="), null, { timeout: 10000 });
// Wait for network response
await page.waitForResponse(resp => resp.url().includes("/api/chat") && resp.status() === 200);
```
## Best Practices
1. **Descriptive test names** — clearly state expected behavior: `"should display greeting message when opening new chat"`
2. **API-first setup** — use `OnyxApiClient` for backend state; reserve UI interactions for the behavior under test
3. **User isolation** — tests that modify visible app state (sidebar, chat history) should run as the worker-specific user via `loginAsWorkerUser(page, testInfo.workerIndex)` (not admin) and clean up resources in `afterAll`. Each parallel worker gets its own user, preventing cross-contamination. Reserve `loginAsRandomUser` for flows that require a brand-new user (e.g. onboarding)
4. **DRY helpers** — extract reusable logic into `utils/` with JSDoc comments
5. **No hardcoded waits** — use `waitFor`, `waitForLoadState`, or web-first assertions
6. **Parallel-safe** — no shared mutable state between tests. Prefer static, human-readable names (e.g. `"E2E-CMD Chat 1"`) and clean up resources by ID in `afterAll`. This keeps screenshots deterministic and avoids needing to mask/hide dynamic text. Only fall back to timestamps (`\`test-${Date.now()}\``) when resources cannot be reliably cleaned up or when name collisions across parallel workers would cause functional failures
7. **Error context** — catch and re-throw with useful debug info (page text, URL, etc.; see the sketch after this list)
8. **Tag slow tests** — mark serial/slow tests with `@exclusive` in the test title
9. **Visual regression** — use `expectScreenshot()` for UI consistency checks
10. **Minimal comments** — only comment to clarify non-obvious intent; never restate what the next line of code does
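A sketch of practice 7 — the selector and flow are illustrative, not from the codebase:
```typescript
import { test } from "@playwright/test";

test("create button works", async ({ page }) => {
  await page.goto("/app");
  try {
    await page.getByRole("button", { name: "Create" }).click();
  } catch (error) {
    // Re-throw with page context so CI failures are debuggable.
    const bodyText = await page.locator("body").innerText();
    throw new Error(
      `Create click failed on ${page.url()}\n` +
        `Visible text (truncated): ${bodyText.slice(0, 300)}\n` +
        `Original error: ${error}`
    );
  }
});
```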

.github/CODEOWNERS vendored

@@ -6,5 +6,5 @@
/web/STANDARDS.md @raunakab @Weves
# Agent context files
/CLAUDE.md.template @Weves
/AGENTS.md.template @Weves
/CLAUDE.md @Weves
/AGENTS.md @Weves


@@ -8,4 +8,5 @@
## Additional Options
- [ ] [Required] I have considered whether this PR needs to be cherry-picked to the latest beta branch.
- [ ] [Optional] Override Linear Check


@@ -26,12 +26,14 @@ jobs:
build-web: ${{ steps.check.outputs.build-web }}
build-web-cloud: ${{ steps.check.outputs.build-web-cloud }}
build-backend: ${{ steps.check.outputs.build-backend }}
build-backend-craft: ${{ steps.check.outputs.build-backend-craft }}
build-model-server: ${{ steps.check.outputs.build-model-server }}
is-cloud-tag: ${{ steps.check.outputs.is-cloud-tag }}
is-stable: ${{ steps.check.outputs.is-stable }}
is-beta: ${{ steps.check.outputs.is-beta }}
is-stable-standalone: ${{ steps.check.outputs.is-stable-standalone }}
is-beta-standalone: ${{ steps.check.outputs.is-beta-standalone }}
is-craft-latest: ${{ steps.check.outputs.is-craft-latest }}
is-test-run: ${{ steps.check.outputs.is-test-run }}
sanitized-tag: ${{ steps.check.outputs.sanitized-tag }}
short-sha: ${{ steps.check.outputs.short-sha }}
@@ -54,15 +56,20 @@ jobs:
IS_BETA=false
IS_STABLE_STANDALONE=false
IS_BETA_STANDALONE=false
IS_CRAFT_LATEST=false
IS_PROD_TAG=false
IS_TEST_RUN=false
BUILD_DESKTOP=false
BUILD_WEB=false
BUILD_WEB_CLOUD=false
BUILD_BACKEND=true
BUILD_BACKEND_CRAFT=false
BUILD_MODEL_SERVER=true
# Determine tag type based on pattern matching (do regex checks once)
if [[ "$TAG" == craft-* ]]; then
IS_CRAFT_LATEST=true
fi
if [[ "$TAG" == *cloud* ]]; then
IS_CLOUD=true
fi
@@ -75,7 +82,7 @@ jobs:
if [[ "$TAG" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
IS_STABLE=true
fi
if [[ "$TAG" =~ ^v[0-9]+\.[0-9]+\.[0-9]+-beta\.[0-9]+$ ]]; then
if [[ "$TAG" =~ ^v[0-9]+\.[0-9]+\.[0-9]+-beta(\.[0-9]+)?$ ]]; then
IS_BETA=true
fi
@@ -84,12 +91,18 @@ jobs:
BUILD_WEB_CLOUD=true
else
BUILD_WEB=true
# Skip desktop builds on beta tags and nightly runs
if [[ "$IS_BETA" != "true" ]] && [[ "$IS_NIGHTLY" != "true" ]]; then
# Only build desktop for semver tags (excluding beta)
if [[ "$IS_VERSION_TAG" == "true" ]] && [[ "$IS_BETA" != "true" ]]; then
BUILD_DESKTOP=true
fi
fi
# Craft-latest builds backend with Craft enabled
if [[ "$IS_CRAFT_LATEST" == "true" ]]; then
BUILD_BACKEND_CRAFT=true
BUILD_BACKEND=false
fi
# Standalone version checks (for backend/model-server - version excluding cloud tags)
if [[ "$IS_STABLE" == "true" ]] && [[ "$IS_CLOUD" != "true" ]]; then
IS_STABLE_STANDALONE=true
@@ -113,12 +126,14 @@ jobs:
echo "build-web=$BUILD_WEB"
echo "build-web-cloud=$BUILD_WEB_CLOUD"
echo "build-backend=$BUILD_BACKEND"
echo "build-backend-craft=$BUILD_BACKEND_CRAFT"
echo "build-model-server=$BUILD_MODEL_SERVER"
echo "is-cloud-tag=$IS_CLOUD"
echo "is-stable=$IS_STABLE"
echo "is-beta=$IS_BETA"
echo "is-stable-standalone=$IS_STABLE_STANDALONE"
echo "is-beta-standalone=$IS_BETA_STANDALONE"
echo "is-craft-latest=$IS_CRAFT_LATEST"
echo "is-test-run=$IS_TEST_RUN"
echo "sanitized-tag=$SANITIZED_TAG"
echo "short-sha=$SHORT_SHA"
@@ -130,13 +145,13 @@ jobs:
if: ${{ !startsWith(github.ref_name, 'nightly-latest') && github.event_name != 'workflow_dispatch' }}
steps:
- name: Checkout
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
fetch-depth: 0
- name: Setup uv
uses: astral-sh/setup-uv@ed21f2f24f8dd64503750218de024bcf64c7250a # ratchet:astral-sh/setup-uv@v7
uses: astral-sh/setup-uv@61cb8a9741eeb8a550a1b8544337180c0fc8476b # ratchet:astral-sh/setup-uv@v7
with:
version: "0.9.9"
# NOTE: This isn't caching much and zizmor suggests this could be poisoned, so disable.
@@ -155,27 +170,14 @@ jobs:
environment: release
steps:
- name: Checkout
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@61815dcd50bd041e203e49132bacad1fd04d2708
with:
role-to-assume: ${{ secrets.AWS_OIDC_ROLE_ARN }}
aws-region: us-east-2
- name: Get AWS Secrets
uses: aws-actions/aws-secretsmanager-get-secrets@a9a7eb4e2f2871d30dc5b892576fde60a2ecc802
with:
secret-ids: |
MONITOR_DEPLOYMENTS_WEBHOOK, deploy/monitor-deployments-webhook
parse-json-secrets: true
- name: Send Slack notification
uses: ./.github/actions/slack-notify
with:
webhook-url: ${{ env.MONITOR_DEPLOYMENTS_WEBHOOK }}
webhook-url: ${{ secrets.MONITOR_DEPLOYMENTS_WEBHOOK }}
failed-jobs: "• check-version-tag"
title: "🚨 Version Tag Check Failed"
ref-name: ${{ github.ref_name }}
@@ -204,7 +206,7 @@ jobs:
timeout-minutes: 90
environment: release
steps:
- uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6.0.1
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6.0.2
with:
# NOTE: persist-credentials is needed for tauri-action to create GitHub releases.
persist-credentials: true # zizmor: ignore[artipacked]
@@ -247,7 +249,7 @@ jobs:
xdg-utils
- name: setup node
uses: actions/setup-node@395ad3262231945c25e8478fd5baf05154b1d79f # ratchet:actions/setup-node@v6.1.0
uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # ratchet:actions/setup-node@v6.2.0
with:
node-version: 24
package-manager-cache: false
@@ -377,7 +379,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -407,7 +409,7 @@ jobs:
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
@@ -450,7 +452,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -480,7 +482,7 @@ jobs:
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
@@ -540,7 +542,7 @@ jobs:
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
@@ -588,7 +590,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -618,7 +620,7 @@ jobs:
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
@@ -638,6 +640,7 @@ jobs:
NEXT_PUBLIC_POSTHOG_HOST=${{ secrets.POSTHOG_HOST }}
NEXT_PUBLIC_SENTRY_DSN=${{ secrets.SENTRY_DSN }}
NEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY=${{ secrets.STRIPE_PUBLISHABLE_KEY }}
NEXT_PUBLIC_RECAPTCHA_SITE_KEY=${{ vars.NEXT_PUBLIC_RECAPTCHA_SITE_KEY }}
NEXT_PUBLIC_GTM_ENABLED=true
NEXT_PUBLIC_FORGOT_PASSWORD_ENABLED=true
NEXT_PUBLIC_INCLUDE_ERROR_POPUP_SUPPORT_LINK=true
@@ -669,7 +672,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -699,7 +702,7 @@ jobs:
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
@@ -719,6 +722,7 @@ jobs:
NEXT_PUBLIC_POSTHOG_HOST=${{ secrets.POSTHOG_HOST }}
NEXT_PUBLIC_SENTRY_DSN=${{ secrets.SENTRY_DSN }}
NEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY=${{ secrets.STRIPE_PUBLISHABLE_KEY }}
NEXT_PUBLIC_RECAPTCHA_SITE_KEY=${{ vars.NEXT_PUBLIC_RECAPTCHA_SITE_KEY }}
NEXT_PUBLIC_GTM_ENABLED=true
NEXT_PUBLIC_FORGOT_PASSWORD_ENABLED=true
NEXT_PUBLIC_INCLUDE_ERROR_POPUP_SUPPORT_LINK=true
@@ -767,7 +771,7 @@ jobs:
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
@@ -812,7 +816,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -842,7 +846,7 @@ jobs:
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
@@ -884,7 +888,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -914,7 +918,7 @@ jobs:
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
@@ -973,7 +977,7 @@ jobs:
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
@@ -1003,6 +1007,217 @@ jobs:
$(printf '%s\n' "${META_TAGS}" | xargs -I {} echo -t {}) \
$IMAGES
build-backend-craft-amd64:
needs: determine-builds
if: needs.determine-builds.outputs.build-backend-craft == 'true'
runs-on:
- runs-on
- runner=2cpu-linux-x64
- run-id=${{ github.run_id }}-backend-craft-amd64
- extras=ecr-cache
timeout-minutes: 90
environment: release
outputs:
digest: ${{ steps.build.outputs.digest }}
env:
REGISTRY_IMAGE: onyxdotapp/onyx-backend
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@61815dcd50bd041e203e49132bacad1fd04d2708
with:
role-to-assume: ${{ secrets.AWS_OIDC_ROLE_ARN }}
aws-region: us-east-2
- name: Get AWS Secrets
uses: aws-actions/aws-secretsmanager-get-secrets@a9a7eb4e2f2871d30dc5b892576fde60a2ecc802
with:
secret-ids: |
DOCKER_USERNAME, deploy/docker-username
DOCKER_TOKEN, deploy/docker-token
parse-json-secrets: true
- name: Docker meta
id: meta
uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # ratchet:docker/metadata-action@v5
with:
images: ${{ env.REGISTRY_IMAGE }}
flavor: |
latest=false
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
- name: Build and push AMD64
id: build
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # ratchet:docker/build-push-action@v6
with:
context: ./backend
file: ./backend/Dockerfile
platforms: linux/amd64
labels: ${{ steps.meta.outputs.labels }}
build-args: |
ONYX_VERSION=${{ github.ref_name }}
ENABLE_CRAFT=true
cache-from: |
type=registry,ref=${{ env.REGISTRY_IMAGE }}:latest
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:backend-craft-cache-amd64
cache-to: |
type=inline
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:backend-craft-cache-amd64,mode=max
outputs: type=image,name=${{ env.REGISTRY_IMAGE }},push-by-digest=true,name-canonical=true,push=true
no-cache: ${{ vars.DOCKER_NO_CACHE == 'true' }}
build-backend-craft-arm64:
needs: determine-builds
if: needs.determine-builds.outputs.build-backend-craft == 'true'
runs-on:
- runs-on
- runner=2cpu-linux-arm64
- run-id=${{ github.run_id }}-backend-craft-arm64
- extras=ecr-cache
timeout-minutes: 90
environment: release
outputs:
digest: ${{ steps.build.outputs.digest }}
env:
REGISTRY_IMAGE: onyxdotapp/onyx-backend
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@61815dcd50bd041e203e49132bacad1fd04d2708
with:
role-to-assume: ${{ secrets.AWS_OIDC_ROLE_ARN }}
aws-region: us-east-2
- name: Get AWS Secrets
uses: aws-actions/aws-secretsmanager-get-secrets@a9a7eb4e2f2871d30dc5b892576fde60a2ecc802
with:
secret-ids: |
DOCKER_USERNAME, deploy/docker-username
DOCKER_TOKEN, deploy/docker-token
parse-json-secrets: true
- name: Docker meta
id: meta
uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # ratchet:docker/metadata-action@v5
with:
images: ${{ env.REGISTRY_IMAGE }}
flavor: |
latest=false
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
- name: Build and push ARM64
id: build
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # ratchet:docker/build-push-action@v6
with:
context: ./backend
file: ./backend/Dockerfile
platforms: linux/arm64
labels: ${{ steps.meta.outputs.labels }}
build-args: |
ONYX_VERSION=${{ github.ref_name }}
ENABLE_CRAFT=true
cache-from: |
type=registry,ref=${{ env.REGISTRY_IMAGE }}:latest
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:backend-craft-cache-arm64
cache-to: |
type=inline
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:backend-craft-cache-arm64,mode=max
outputs: type=image,name=${{ env.REGISTRY_IMAGE }},push-by-digest=true,name-canonical=true,push=true
no-cache: ${{ vars.DOCKER_NO_CACHE == 'true' }}
merge-backend-craft:
needs:
- determine-builds
- build-backend-craft-amd64
- build-backend-craft-arm64
if: needs.determine-builds.outputs.build-backend-craft == 'true'
runs-on:
- runs-on
- runner=2cpu-linux-x64
- run-id=${{ github.run_id }}-merge-backend-craft
- extras=ecr-cache
timeout-minutes: 90
environment: release
env:
REGISTRY_IMAGE: onyxdotapp/onyx-backend
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@61815dcd50bd041e203e49132bacad1fd04d2708
with:
role-to-assume: ${{ secrets.AWS_OIDC_ROLE_ARN }}
aws-region: us-east-2
- name: Get AWS Secrets
uses: aws-actions/aws-secretsmanager-get-secrets@a9a7eb4e2f2871d30dc5b892576fde60a2ecc802
with:
secret-ids: |
DOCKER_USERNAME, deploy/docker-username
DOCKER_TOKEN, deploy/docker-token
parse-json-secrets: true
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
- name: Docker meta
id: meta
uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # ratchet:docker/metadata-action@v5
with:
images: ${{ env.REGISTRY_IMAGE }}
flavor: |
latest=false
tags: |
type=raw,value=craft-latest
# TODO: Consider aligning craft-latest tags with regular backend builds (e.g., latest, edge, beta)
# to keep tagging strategy consistent across all backend images
- name: Create and push manifest
env:
IMAGE_REPO: ${{ env.REGISTRY_IMAGE }}
AMD64_DIGEST: ${{ needs.build-backend-craft-amd64.outputs.digest }}
ARM64_DIGEST: ${{ needs.build-backend-craft-arm64.outputs.digest }}
META_TAGS: ${{ steps.meta.outputs.tags }}
run: |
IMAGES="${IMAGE_REPO}@${AMD64_DIGEST} ${IMAGE_REPO}@${ARM64_DIGEST}"
docker buildx imagetools create \
$(printf '%s\n' "${META_TAGS}" | xargs -I {} echo -t {}) \
$IMAGES
build-model-server-amd64:
needs: determine-builds
if: needs.determine-builds.outputs.build-model-server == 'true'
@@ -1022,7 +1237,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -1054,7 +1269,7 @@ jobs:
buildkitd-flags: ${{ vars.DOCKER_DEBUG == 'true' && '--debug' || '' }}
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
@@ -1101,7 +1316,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -1133,7 +1348,7 @@ jobs:
buildkitd-flags: ${{ vars.DOCKER_DEBUG == 'true' && '--debug' || '' }}
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
@@ -1196,7 +1411,7 @@ jobs:
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
@@ -1354,7 +1569,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -1466,33 +1681,23 @@ jobs:
- build-backend-amd64
- build-backend-arm64
- merge-backend
- build-backend-craft-amd64
- build-backend-craft-arm64
- merge-backend-craft
- build-model-server-amd64
- build-model-server-arm64
- merge-model-server
if: always() && (needs.build-desktop.result == 'failure' || needs.build-web-amd64.result == 'failure' || needs.build-web-arm64.result == 'failure' || needs.merge-web.result == 'failure' || needs.build-web-cloud-amd64.result == 'failure' || needs.build-web-cloud-arm64.result == 'failure' || needs.merge-web-cloud.result == 'failure' || needs.build-backend-amd64.result == 'failure' || needs.build-backend-arm64.result == 'failure' || needs.merge-backend.result == 'failure' || needs.build-model-server-amd64.result == 'failure' || needs.build-model-server-arm64.result == 'failure' || needs.merge-model-server.result == 'failure') && needs.determine-builds.outputs.is-test-run != 'true'
if: always() && (needs.build-desktop.result == 'failure' || needs.build-web-amd64.result == 'failure' || needs.build-web-arm64.result == 'failure' || needs.merge-web.result == 'failure' || needs.build-web-cloud-amd64.result == 'failure' || needs.build-web-cloud-arm64.result == 'failure' || needs.merge-web-cloud.result == 'failure' || needs.build-backend-amd64.result == 'failure' || needs.build-backend-arm64.result == 'failure' || needs.merge-backend.result == 'failure' || (needs.determine-builds.outputs.build-backend-craft == 'true' && (needs.build-backend-craft-amd64.result == 'failure' || needs.build-backend-craft-arm64.result == 'failure' || needs.merge-backend-craft.result == 'failure')) || needs.build-model-server-amd64.result == 'failure' || needs.build-model-server-arm64.result == 'failure' || needs.merge-model-server.result == 'failure') && needs.determine-builds.outputs.is-test-run != 'true'
# NOTE: Github-hosted runners have about 20s faster queue times and are preferred here.
runs-on: ubuntu-slim
timeout-minutes: 90
environment: release
steps:
- name: Checkout
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@61815dcd50bd041e203e49132bacad1fd04d2708
with:
role-to-assume: ${{ secrets.AWS_OIDC_ROLE_ARN }}
aws-region: us-east-2
- name: Get AWS Secrets
uses: aws-actions/aws-secretsmanager-get-secrets@a9a7eb4e2f2871d30dc5b892576fde60a2ecc802
with:
secret-ids: |
MONITOR_DEPLOYMENTS_WEBHOOK, deploy/monitor-deployments-webhook
parse-json-secrets: true
- name: Determine failed jobs
id: failed-jobs
shell: bash
@@ -1558,7 +1763,7 @@ jobs:
- name: Send Slack notification
uses: ./.github/actions/slack-notify
with:
webhook-url: ${{ env.MONITOR_DEPLOYMENTS_WEBHOOK }}
webhook-url: ${{ secrets.MONITOR_DEPLOYMENTS_WEBHOOK }}
failed-jobs: ${{ steps.failed-jobs.outputs.jobs }}
title: "🚨 Deployment Workflow Failed"
ref-name: ${{ github.ref_name }}


@@ -24,7 +24,7 @@ jobs:
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}


@@ -24,7 +24,7 @@ jobs:
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}


@@ -15,7 +15,7 @@ jobs:
timeout-minutes: 45
steps:
- name: Checkout
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
fetch-depth: 0
persist-credentials: false


@@ -1,151 +0,0 @@
# Scan for problematic software licenses
# trivy has their own rate limiting issues causing this action to flake
# we worked around it by hardcoding to different db repos in env
# can re-enable when they figure it out
# https://github.com/aquasecurity/trivy/discussions/7538
# https://github.com/aquasecurity/trivy-action/issues/389
name: 'Nightly - Scan licenses'
on:
# schedule:
# - cron: '0 14 * * *' # Runs every day at 6 AM PST / 7 AM PDT / 2 PM UTC
workflow_dispatch: # Allows manual triggering
permissions:
actions: read
contents: read
jobs:
scan-licenses:
# See https://runs-on.com/runners/linux/
runs-on: [runs-on,runner=2cpu-linux-x64,"run-id=${{ github.run_id }}-scan-licenses"]
timeout-minutes: 45
permissions:
actions: read
contents: read
security-events: write
steps:
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Set up Python
uses: actions/setup-python@83679a892e2d95755f2dac6acb0bfd1e9ac5d548 # ratchet:actions/setup-python@v6
with:
python-version: '3.11'
cache: 'pip'
cache-dependency-path: |
backend/requirements/default.txt
backend/requirements/dev.txt
backend/requirements/model_server.txt
- name: Get explicit and transitive dependencies
run: |
python -m pip install --upgrade pip
pip install --retries 5 --timeout 30 -r backend/requirements/default.txt
pip install --retries 5 --timeout 30 -r backend/requirements/dev.txt
pip install --retries 5 --timeout 30 -r backend/requirements/model_server.txt
pip freeze > requirements-all.txt
- name: Check python
id: license_check_report
uses: pilosus/action-pip-license-checker@e909b0226ff49d3235c99c4585bc617f49fff16a # ratchet:pilosus/action-pip-license-checker@v3
with:
requirements: 'requirements-all.txt'
fail: 'Copyleft'
exclude: '(?i)^(pylint|aio[-_]*).*'
- name: Print report
if: always()
env:
REPORT: ${{ steps.license_check_report.outputs.report }}
run: echo "$REPORT"
- name: Install npm dependencies
working-directory: ./web
run: npm ci
# be careful enabling the sarif and upload as it may spam the security tab
# with a huge amount of items. Work out the issues before enabling upload.
# - name: Run Trivy vulnerability scanner in repo mode
# if: always()
# uses: aquasecurity/trivy-action@b6643a29fecd7f34b3597bc6acb0a98b03d33ff8 # ratchet:aquasecurity/trivy-action@0.33.1
# with:
# scan-type: fs
# scan-ref: .
# scanners: license
# format: table
# severity: HIGH,CRITICAL
# # format: sarif
# # output: trivy-results.sarif
#
# # - name: Upload Trivy scan results to GitHub Security tab
# # uses: github/codeql-action/upload-sarif@v3
# # with:
# # sarif_file: trivy-results.sarif
scan-trivy:
# See https://runs-on.com/runners/linux/
runs-on: [runs-on,runner=2cpu-linux-x64,"run-id=${{ github.run_id }}-scan-trivy"]
timeout-minutes: 45
steps:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
# Backend
- name: Pull backend docker image
run: docker pull onyxdotapp/onyx-backend:latest
- name: Run Trivy vulnerability scanner on backend
uses: aquasecurity/trivy-action@b6643a29fecd7f34b3597bc6acb0a98b03d33ff8 # ratchet:aquasecurity/trivy-action@0.33.1
env:
TRIVY_DB_REPOSITORY: 'public.ecr.aws/aquasecurity/trivy-db:2'
TRIVY_JAVA_DB_REPOSITORY: 'public.ecr.aws/aquasecurity/trivy-java-db:1'
with:
image-ref: onyxdotapp/onyx-backend:latest
scanners: license
severity: HIGH,CRITICAL
vuln-type: library
exit-code: 0 # Set to 1 if we want a failed scan to fail the workflow
# Web server
- name: Pull web server docker image
run: docker pull onyxdotapp/onyx-web-server:latest
- name: Run Trivy vulnerability scanner on web server
uses: aquasecurity/trivy-action@b6643a29fecd7f34b3597bc6acb0a98b03d33ff8 # ratchet:aquasecurity/trivy-action@0.33.1
env:
TRIVY_DB_REPOSITORY: 'public.ecr.aws/aquasecurity/trivy-db:2'
TRIVY_JAVA_DB_REPOSITORY: 'public.ecr.aws/aquasecurity/trivy-java-db:1'
with:
image-ref: onyxdotapp/onyx-web-server:latest
scanners: license
severity: HIGH,CRITICAL
vuln-type: library
exit-code: 0
# Model server
- name: Pull model server docker image
run: docker pull onyxdotapp/onyx-model-server:latest
- name: Run Trivy vulnerability scanner
uses: aquasecurity/trivy-action@b6643a29fecd7f34b3597bc6acb0a98b03d33ff8 # ratchet:aquasecurity/trivy-action@0.33.1
env:
TRIVY_DB_REPOSITORY: 'public.ecr.aws/aquasecurity/trivy-db:2'
TRIVY_JAVA_DB_REPOSITORY: 'public.ecr.aws/aquasecurity/trivy-java-db:1'
with:
image-ref: onyxdotapp/onyx-model-server:latest
scanners: license
severity: HIGH,CRITICAL
vuln-type: library
exit-code: 0


@@ -0,0 +1,28 @@
name: Require beta cherry-pick consideration
concurrency:
group: Require-Beta-Cherrypick-Consideration-${{ github.workflow }}-${{ github.head_ref || github.event.workflow_run.head_branch || github.run_id }}
cancel-in-progress: true
on:
pull_request:
types: [opened, edited, reopened, synchronize]
permissions:
contents: read
jobs:
beta-cherrypick-check:
runs-on: ubuntu-latest
timeout-minutes: 45
steps:
- name: Check PR body for beta cherry-pick consideration
env:
PR_BODY: ${{ github.event.pull_request.body }}
run: |
if echo "$PR_BODY" | grep -qiE "\\[x\\][[:space:]]*\\[Required\\][[:space:]]*I have considered whether this PR needs to be cherry[- ]picked to the latest beta branch"; then
echo "Cherry-pick consideration box is checked. Check passed."
exit 0
fi
echo "::error::Please check the 'I have considered whether this PR needs to be cherry-picked to the latest beta branch' box in the PR description."
exit 1
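The gate is a plain text match on the PR body; the doubled backslashes in the workflow collapse to single ones by the time grep sees the pattern. A quick local sanity check with a hypothetical body string:

    PR_BODY='- [x] [Required] I have considered whether this PR needs to be cherry-picked to the latest beta branch'
    if echo "$PR_BODY" | grep -qiE "\[x\][[:space:]]*\[Required\][[:space:]]*I have considered whether this PR needs to be cherry[- ]picked to the latest beta branch"; then
      echo "check passes"
    else
      echo "check fails"
    fi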

View File

@@ -27,7 +27,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -40,13 +40,16 @@ jobs:
- name: Generate OpenAPI schema and Python client
shell: bash
+    # TODO(Nik): https://linear.app/onyx-app/issue/ENG-1/update-test-infra-to-use-test-license
+    env:
+    LICENSE_ENFORCEMENT_ENABLED: "false"
run: |
ods openapi all
# needed for pulling external images otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
-    uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
+    uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
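The hash bumps in this and the following hunks are ratchet-style pins: the uses: line carries an immutable commit SHA while the trailing comment records the human-readable tag. One way to check that a pin still corresponds to its advertised tag (shown for actions/checkout; swap in the repo and tag under review):

    # List the commits behind the v6 tags and look for the pinned SHA.
    git ls-remote https://github.com/actions/checkout 'refs/tags/v6*'
    # de0fac2e4500dabe0009e67214ff5f5447ce83dd is expected to appear among the results.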

.github/workflows/pr-desktop-build.yml vendored Normal file
View File

@@ -0,0 +1,114 @@
name: Build Desktop App
concurrency:
group: Build-Desktop-App-${{ github.workflow }}-${{ github.head_ref || github.event.workflow_run.head_branch || github.run_id }}
cancel-in-progress: true
on:
merge_group:
pull_request:
branches:
- main
- "release/**"
paths:
- "desktop/**"
- ".github/workflows/pr-desktop-build.yml"
push:
tags:
- "v*.*.*"
permissions:
contents: read
jobs:
build-desktop:
name: Build Desktop (${{ matrix.platform }})
runs-on: ${{ matrix.os }}
timeout-minutes: 60
strategy:
fail-fast: false
matrix:
include:
- platform: linux
os: ubuntu-latest
target: x86_64-unknown-linux-gnu
args: "--bundles deb,rpm"
# TODO: Fix and enable the macOS build.
#- platform: macos
# os: macos-latest
# target: universal-apple-darwin
# args: "--target universal-apple-darwin"
# TODO: Fix and enable the Windows build.
#- platform: windows
# os: windows-latest
# target: x86_64-pc-windows-msvc
# args: ""
steps:
- name: Checkout code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd
with:
persist-credentials: false
- name: Setup node
uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238
with:
node-version: 24
cache: "npm" # zizmor: ignore[cache-poisoning]
cache-dependency-path: ./desktop/package-lock.json
- name: Setup Rust
uses: dtolnay/rust-toolchain@4be9e76fd7c4901c61fb841f559994984270fce7
with:
toolchain: stable
targets: ${{ matrix.target }}
- name: Cache Cargo registry and build
uses: actions/cache@cdf6c1fa76f9f475f3d7449005a359c84ca0f306 # zizmor: ignore[cache-poisoning]
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
desktop/src-tauri/target/
key: ${{ runner.os }}-cargo-${{ hashFiles('desktop/src-tauri/Cargo.lock') }}
restore-keys: |
${{ runner.os }}-cargo-
- name: Install Linux dependencies
if: matrix.platform == 'linux'
run: |
sudo apt-get update
sudo apt-get install -y \
build-essential \
libglib2.0-dev \
libgirepository1.0-dev \
libgtk-3-dev \
libjavascriptcoregtk-4.1-dev \
libwebkit2gtk-4.1-dev \
libayatana-appindicator3-dev \
gobject-introspection \
pkg-config \
curl \
xdg-utils
- name: Install npm dependencies
working-directory: ./desktop
run: npm ci
- name: Build desktop app
working-directory: ./desktop
run: npx tauri build ${{ matrix.args }}
env:
TAURI_SIGNING_PRIVATE_KEY: ""
TAURI_SIGNING_PRIVATE_KEY_PASSWORD: ""
- name: Upload build artifacts
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
with:
name: desktop-build-${{ matrix.platform }}-${{ github.run_id }}
path: |
desktop/src-tauri/target/release/bundle/
retention-days: 7
if-no-files-found: ignore
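A rough sketch of reproducing the Linux leg of this build locally, assuming the apt packages listed above are installed and a stable Rust toolchain is on PATH:

    cd desktop
    npm ci
    # Empty signing variables mirror the CI step; real releases would set them.
    TAURI_SIGNING_PRIVATE_KEY="" \
    TAURI_SIGNING_PRIVATE_KEY_PASSWORD="" \
    npx tauri build --bundles deb,rpm
    # Bundles land under desktop/src-tauri/target/release/bundle/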

View File

@@ -57,7 +57,7 @@ jobs:
test-dirs: ${{ steps.set-matrix.outputs.test-dirs }}
steps:
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -91,7 +91,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -110,7 +110,7 @@ jobs:
# otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
-    uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
+    uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
@@ -118,6 +118,7 @@ jobs:
- name: Create .env file for Docker Compose
run: |
cat <<EOF > deployment/docker_compose/.env
+    COMPOSE_PROFILES=s3-filestore
CODE_INTERPRETER_BETA_ENABLED=true
DISABLE_TELEMETRY=true
EOF

View File

@@ -30,7 +30,7 @@ jobs:
# fetch-depth 0 is required for helm/chart-testing-action
steps:
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
fetch-depth: 0
persist-credentials: false
@@ -41,8 +41,7 @@ jobs:
version: v3.19.0
- name: Set up chart-testing
-    # NOTE: This is Jamison's patch from https://github.com/helm/chart-testing-action/pull/194
-    uses: helm/chart-testing-action@8958a6ac472cbd8ee9a8fbb6f1acbc1b0e966e44 # zizmor: ignore[impostor-commit]
+    uses: helm/chart-testing-action@b5eebdd9998021f29756c53432f48dab66394810
with:
uv_version: "0.9.9"
@@ -92,7 +91,6 @@ jobs:
helm repo add cloudnative-pg https://cloudnative-pg.github.io/charts
helm repo add ot-container-kit https://ot-container-kit.github.io/helm-charts
helm repo add minio https://charts.min.io/
-    helm repo add code-interpreter https://onyx-dot-app.github.io/code-interpreter/
helm repo update
- name: Install Redis operator
@@ -197,7 +195,6 @@ jobs:
--set=auth.opensearch.enabled=true \
--set=slackbot.enabled=false \
--set=postgresql.enabled=true \
-    --set=postgresql.nameOverride=cloudnative-pg \
--set=postgresql.cluster.storage.storageClass=standard \
--set=redis.enabled=true \
--set=redis.storageSpec.volumeClaimTemplate.spec.storageClassName=standard \

View File

@@ -46,9 +46,10 @@ jobs:
timeout-minutes: 45
outputs:
test-dirs: ${{ steps.set-matrix.outputs.test-dirs }}
+    editions: ${{ steps.set-editions.outputs.editions }}
steps:
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -56,7 +57,7 @@ jobs:
id: set-matrix
run: |
# Find all leaf-level directories in both test directories
-    tests_dirs=$(find backend/tests/integration/tests -mindepth 1 -maxdepth 1 -type d ! -name "__pycache__" ! -name "mcp" -exec basename {} \; | sort)
+    tests_dirs=$(find backend/tests/integration/tests -mindepth 1 -maxdepth 1 -type d ! -name "__pycache__" ! -name "mcp" ! -name "no_vectordb" -exec basename {} \; | sort)
connector_dirs=$(find backend/tests/integration/connector_job_tests -mindepth 1 -maxdepth 1 -type d ! -name "__pycache__" -exec basename {} \; | sort)
# Create JSON array with directory info
@@ -72,6 +73,16 @@ jobs:
all_dirs="[${all_dirs%,}]"
echo "test-dirs=$all_dirs" >> $GITHUB_OUTPUT
+    - name: Determine editions to test
+    id: set-editions
+    run: |
+    # On PRs, only run EE tests. On merge_group and tags, run both EE and MIT.
+    if [ "${{ github.event_name }}" = "pull_request" ]; then
+    echo 'editions=["ee"]' >> $GITHUB_OUTPUT
+    else
+    echo 'editions=["ee","mit"]' >> $GITHUB_OUTPUT
+    fi
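    # Hedged sketch of what the step above emits per trigger; combined with
    # fromJson in the matrix, each test directory runs once per listed edition,
    # so merge_group and tag builds run every directory twice (ee and mit).
    for event in pull_request merge_group push; do
      if [ "$event" = "pull_request" ]; then editions='["ee"]'; else editions='["ee","mit"]'; fi
      echo "$event -> editions=$editions"
    done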
build-backend-image:
runs-on:
[
@@ -84,7 +95,7 @@ jobs:
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -109,7 +120,7 @@ jobs:
# otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
-    uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
+    uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
@@ -144,7 +155,7 @@ jobs:
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -169,7 +180,7 @@ jobs:
# otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
-    uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
+    uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
@@ -203,7 +214,7 @@ jobs:
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -214,7 +225,7 @@ jobs:
# otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
-    uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
+    uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
@@ -267,7 +278,7 @@ jobs:
runs-on:
- runs-on
- runner=4cpu-linux-arm64
-    - ${{ format('run-id={0}-integration-tests-job-{1}', github.run_id, strategy['job-index']) }}
+    - ${{ format('run-id={0}-integration-tests-{1}-job-{2}', github.run_id, matrix.edition, strategy['job-index']) }}
- extras=ecr-cache
timeout-minutes: 45
@@ -275,11 +286,12 @@ jobs:
fail-fast: false
matrix:
test-dir: ${{ fromJson(needs.discover-test-dirs.outputs.test-dirs) }}
+    edition: ${{ fromJson(needs.discover-test-dirs.outputs.editions) }}
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -287,7 +299,7 @@ jobs:
# otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
-    uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
+    uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
@@ -298,9 +310,11 @@ jobs:
env:
ECR_CACHE: ${{ env.RUNS_ON_ECR_CACHE }}
RUN_ID: ${{ github.run_id }}
+    EDITION: ${{ matrix.edition }}
run: |
+    # Base config shared by both editions
cat <<EOF > deployment/docker_compose/.env
-    ENABLE_PAID_ENTERPRISE_EDITION_FEATURES=true
+    COMPOSE_PROFILES=s3-filestore
AUTH_TYPE=basic
POSTGRES_POOL_PRE_PING=true
POSTGRES_USE_NULL_POOL=true
@@ -309,11 +323,20 @@ jobs:
ONYX_BACKEND_IMAGE=${ECR_CACHE}:integration-test-backend-test-${RUN_ID}
ONYX_MODEL_SERVER_IMAGE=${ECR_CACHE}:integration-test-model-server-test-${RUN_ID}
INTEGRATION_TESTS_MODE=true
-    CHECK_TTL_MANAGEMENT_TASK_FREQUENCY_IN_HOURS=0.001
-    AUTO_LLM_UPDATE_INTERVAL_SECONDS=10
MCP_SERVER_ENABLED=true
+    AUTO_LLM_UPDATE_INTERVAL_SECONDS=10
EOF
+    # EE-only config
+    if [ "$EDITION" = "ee" ]; then
+    cat <<EOF >> deployment/docker_compose/.env
+    ENABLE_PAID_ENTERPRISE_EDITION_FEATURES=true
+    # TODO(Nik): https://linear.app/onyx-app/issue/ENG-1/update-test-infra-to-use-test-license
+    LICENSE_ENFORCEMENT_ENABLED=false
+    CHECK_TTL_MANAGEMENT_TASK_FREQUENCY_IN_HOURS=0.001
+    USE_LIGHTWEIGHT_BACKGROUND_WORKER=false
+    EOF
+    fi
- name: Start Docker containers
run: |
@@ -376,14 +399,14 @@ jobs:
docker compose -f docker-compose.mock-it-services.yml \
-p mock-it-services-stack up -d
-    - name: Run Integration Tests for ${{ matrix.test-dir.name }}
+    - name: Run Integration Tests (${{ matrix.edition }}) for ${{ matrix.test-dir.name }}
uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # ratchet:nick-fields/retry@v3
with:
timeout_minutes: 20
max_attempts: 3
retry_wait_seconds: 10
command: |
echo "Running integration tests for ${{ matrix.test-dir.path }}..."
echo "Running ${{ matrix.edition }} integration tests for ${{ matrix.test-dir.path }}..."
docker run --rm --network onyx_default \
--name test-runner \
-e POSTGRES_HOST=relational_db \
@@ -441,10 +464,143 @@ jobs:
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
with:
-    name: docker-all-logs-${{ matrix.test-dir.name }}
+    name: docker-all-logs-${{ matrix.edition }}-${{ matrix.test-dir.name }}
path: ${{ github.workspace }}/docker-compose.log
# ------------------------------------------------------------
no-vectordb-tests:
needs: [build-backend-image, build-integration-image]
runs-on:
[
runs-on,
runner=4cpu-linux-arm64,
"run-id=${{ github.run_id }}-no-vectordb-tests",
"extras=ecr-cache",
]
timeout-minutes: 45
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Login to Docker Hub
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
- name: Create .env file for no-vectordb Docker Compose
env:
ECR_CACHE: ${{ env.RUNS_ON_ECR_CACHE }}
RUN_ID: ${{ github.run_id }}
run: |
cat <<EOF > deployment/docker_compose/.env
COMPOSE_PROFILES=s3-filestore
ENABLE_PAID_ENTERPRISE_EDITION_FEATURES=true
LICENSE_ENFORCEMENT_ENABLED=false
AUTH_TYPE=basic
POSTGRES_POOL_PRE_PING=true
POSTGRES_USE_NULL_POOL=true
REQUIRE_EMAIL_VERIFICATION=false
DISABLE_TELEMETRY=true
DISABLE_VECTOR_DB=true
ONYX_BACKEND_IMAGE=${ECR_CACHE}:integration-test-backend-test-${RUN_ID}
INTEGRATION_TESTS_MODE=true
USE_LIGHTWEIGHT_BACKGROUND_WORKER=true
EOF
# Start only the services needed for no-vectordb mode (no Vespa, no model servers)
- name: Start Docker containers (no-vectordb)
run: |
cd deployment/docker_compose
docker compose -f docker-compose.yml -f docker-compose.no-vectordb.yml -f docker-compose.dev.yml up \
relational_db \
cache \
minio \
api_server \
background \
-d
id: start_docker_no_vectordb
- name: Wait for services to be ready
run: |
echo "Starting wait-for-service script (no-vectordb)..."
start_time=$(date +%s)
timeout=300
while true; do
current_time=$(date +%s)
elapsed_time=$((current_time - start_time))
if [ $elapsed_time -ge $timeout ]; then
echo "Timeout reached. Service did not become ready in $timeout seconds."
exit 1
fi
response=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/health || echo "curl_error")
if [ "$response" = "200" ]; then
echo "API server is ready!"
break
elif [ "$response" = "curl_error" ]; then
echo "Curl encountered an error; retrying..."
else
echo "Service not ready yet (HTTP $response). Retrying in 5 seconds..."
fi
sleep 5
done
- name: Run No-VectorDB Integration Tests
uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # ratchet:nick-fields/retry@v3
with:
timeout_minutes: 20
max_attempts: 3
retry_wait_seconds: 10
command: |
echo "Running no-vectordb integration tests..."
docker run --rm --network onyx_default \
--name test-runner \
-e POSTGRES_HOST=relational_db \
-e POSTGRES_USER=postgres \
-e POSTGRES_PASSWORD=password \
-e POSTGRES_DB=postgres \
-e DB_READONLY_USER=db_readonly_user \
-e DB_READONLY_PASSWORD=password \
-e POSTGRES_POOL_PRE_PING=true \
-e POSTGRES_USE_NULL_POOL=true \
-e REDIS_HOST=cache \
-e API_SERVER_HOST=api_server \
-e OPENAI_API_KEY=${OPENAI_API_KEY} \
-e TEST_WEB_HOSTNAME=test-runner \
${{ env.RUNS_ON_ECR_CACHE }}:integration-test-${{ github.run_id }} \
/app/tests/integration/tests/no_vectordb
- name: Dump API server logs (no-vectordb)
if: always()
run: |
cd deployment/docker_compose
docker compose -f docker-compose.yml -f docker-compose.no-vectordb.yml -f docker-compose.dev.yml \
logs --no-color api_server > $GITHUB_WORKSPACE/api_server_no_vectordb.log || true
- name: Dump all-container logs (no-vectordb)
if: always()
run: |
cd deployment/docker_compose
docker compose -f docker-compose.yml -f docker-compose.no-vectordb.yml -f docker-compose.dev.yml \
logs --no-color > $GITHUB_WORKSPACE/docker-compose-no-vectordb.log || true
- name: Upload logs (no-vectordb)
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
with:
name: docker-all-logs-no-vectordb
path: ${{ github.workspace }}/docker-compose-no-vectordb.log
- name: Stop Docker containers (no-vectordb)
if: always()
run: |
cd deployment/docker_compose
docker compose -f docker-compose.yml -f docker-compose.no-vectordb.yml -f docker-compose.dev.yml down -v
multitenant-tests:
needs:
[build-backend-image, build-model-server-image, build-integration-image]
@@ -460,12 +616,12 @@ jobs:
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Login to Docker Hub
-    uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
+    uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
@@ -477,6 +633,7 @@ jobs:
run: |
cd deployment/docker_compose
ENABLE_PAID_ENTERPRISE_EDITION_FEATURES=true \
+    LICENSE_ENFORCEMENT_ENABLED=false \
MULTI_TENANT=true \
AUTH_TYPE=cloud \
REQUIRE_EMAIL_VERIFICATION=false \
@@ -583,7 +740,7 @@ jobs:
# NOTE: Github-hosted runners have about 20s faster queue times and are preferred here.
runs-on: ubuntu-slim
timeout-minutes: 45
-    needs: [integration-tests, multitenant-tests]
+    needs: [integration-tests, no-vectordb-tests, multitenant-tests]
if: ${{ always() }}
steps:
- name: Check job status

View File

@@ -23,12 +23,12 @@ jobs:
timeout-minutes: 45
steps:
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Setup node
-    uses: actions/setup-node@395ad3262231945c25e8478fd5baf05154b1d79f # ratchet:actions/setup-node@v4
+    uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # ratchet:actions/setup-node@v4
with:
node-version: 22
cache: "npm"

View File

@@ -1,442 +0,0 @@
name: Run MIT Integration Tests v2
concurrency:
group: Run-MIT-Integration-Tests-${{ github.workflow }}-${{ github.head_ref || github.event.workflow_run.head_branch || github.run_id }}
cancel-in-progress: true
on:
merge_group:
types: [checks_requested]
push:
tags:
- "v*.*.*"
permissions:
contents: read
env:
# Test Environment Variables
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
CONFLUENCE_TEST_SPACE_URL: ${{ vars.CONFLUENCE_TEST_SPACE_URL }}
CONFLUENCE_USER_NAME: ${{ vars.CONFLUENCE_USER_NAME }}
CONFLUENCE_ACCESS_TOKEN: ${{ secrets.CONFLUENCE_ACCESS_TOKEN }}
CONFLUENCE_ACCESS_TOKEN_SCOPED: ${{ secrets.CONFLUENCE_ACCESS_TOKEN_SCOPED }}
JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }}
JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }}
JIRA_API_TOKEN: ${{ secrets.JIRA_API_TOKEN }}
JIRA_API_TOKEN_SCOPED: ${{ secrets.JIRA_API_TOKEN_SCOPED }}
PERM_SYNC_SHAREPOINT_CLIENT_ID: ${{ secrets.PERM_SYNC_SHAREPOINT_CLIENT_ID }}
PERM_SYNC_SHAREPOINT_PRIVATE_KEY: ${{ secrets.PERM_SYNC_SHAREPOINT_PRIVATE_KEY }}
PERM_SYNC_SHAREPOINT_CERTIFICATE_PASSWORD: ${{ secrets.PERM_SYNC_SHAREPOINT_CERTIFICATE_PASSWORD }}
PERM_SYNC_SHAREPOINT_DIRECTORY_ID: ${{ secrets.PERM_SYNC_SHAREPOINT_DIRECTORY_ID }}
jobs:
discover-test-dirs:
# NOTE: Github-hosted runners have about 20s faster queue times and are preferred here.
runs-on: ubuntu-slim
timeout-minutes: 45
outputs:
test-dirs: ${{ steps.set-matrix.outputs.test-dirs }}
steps:
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Discover test directories
id: set-matrix
run: |
# Find all leaf-level directories in both test directories
tests_dirs=$(find backend/tests/integration/tests -mindepth 1 -maxdepth 1 -type d ! -name "__pycache__" ! -name "mcp" -exec basename {} \; | sort)
connector_dirs=$(find backend/tests/integration/connector_job_tests -mindepth 1 -maxdepth 1 -type d ! -name "__pycache__" -exec basename {} \; | sort)
# Create JSON array with directory info
all_dirs=""
for dir in $tests_dirs; do
all_dirs="$all_dirs{\"path\":\"tests/$dir\",\"name\":\"tests-$dir\"},"
done
for dir in $connector_dirs; do
all_dirs="$all_dirs{\"path\":\"connector_job_tests/$dir\",\"name\":\"connector-$dir\"},"
done
# Remove trailing comma and wrap in array
all_dirs="[${all_dirs%,}]"
echo "test-dirs=$all_dirs" >> $GITHUB_OUTPUT
build-backend-image:
runs-on:
[
runs-on,
runner=1cpu-linux-arm64,
"run-id=${{ github.run_id }}-build-backend-image",
"extras=ecr-cache",
]
timeout-minutes: 45
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Format branch name for cache
id: format-branch
env:
PR_NUMBER: ${{ github.event.pull_request.number }}
REF_NAME: ${{ github.ref_name }}
run: |
if [ -n "${PR_NUMBER}" ]; then
CACHE_SUFFIX="${PR_NUMBER}"
else
# shellcheck disable=SC2001
CACHE_SUFFIX=$(echo "${REF_NAME}" | sed 's/[^A-Za-z0-9._-]/-/g')
fi
echo "cache-suffix=${CACHE_SUFFIX}" >> $GITHUB_OUTPUT
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
# needed for pulling Vespa, Redis, Postgres, and Minio images
# otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
- name: Build and push Backend Docker image
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # ratchet:docker/build-push-action@v6
with:
context: ./backend
file: ./backend/Dockerfile
push: true
tags: ${{ env.RUNS_ON_ECR_CACHE }}:integration-test-backend-test-${{ github.run_id }}
cache-from: |
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:backend-cache-${{ github.event.pull_request.head.sha || github.sha }}
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:backend-cache-${{ steps.format-branch.outputs.cache-suffix }}
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:backend-cache
type=registry,ref=onyxdotapp/onyx-backend:latest
cache-to: |
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:backend-cache-${{ github.event.pull_request.head.sha || github.sha }},mode=max
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:backend-cache-${{ steps.format-branch.outputs.cache-suffix }},mode=max
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:backend-cache,mode=max
no-cache: ${{ vars.DOCKER_NO_CACHE == 'true' }}
build-model-server-image:
runs-on:
[
runs-on,
runner=1cpu-linux-arm64,
"run-id=${{ github.run_id }}-build-model-server-image",
"extras=ecr-cache",
]
timeout-minutes: 45
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Format branch name for cache
id: format-branch
env:
PR_NUMBER: ${{ github.event.pull_request.number }}
REF_NAME: ${{ github.ref_name }}
run: |
if [ -n "${PR_NUMBER}" ]; then
CACHE_SUFFIX="${PR_NUMBER}"
else
# shellcheck disable=SC2001
CACHE_SUFFIX=$(echo "${REF_NAME}" | sed 's/[^A-Za-z0-9._-]/-/g')
fi
echo "cache-suffix=${CACHE_SUFFIX}" >> $GITHUB_OUTPUT
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
# needed for pulling Vespa, Redis, Postgres, and Minio images
# otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
- name: Build and push Model Server Docker image
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # ratchet:docker/build-push-action@v6
with:
context: ./backend
file: ./backend/Dockerfile.model_server
push: true
tags: ${{ env.RUNS_ON_ECR_CACHE }}:integration-test-model-server-test-${{ github.run_id }}
cache-from: |
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:model-server-cache-${{ github.event.pull_request.head.sha || github.sha }}
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:model-server-cache-${{ steps.format-branch.outputs.cache-suffix }}
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:model-server-cache
type=registry,ref=onyxdotapp/onyx-model-server:latest
cache-to: |
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:model-server-cache-${{ github.event.pull_request.head.sha || github.sha }},mode=max
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:model-server-cache-${{ steps.format-branch.outputs.cache-suffix }},mode=max
type=registry,ref=${{ env.RUNS_ON_ECR_CACHE }}:model-server-cache,mode=max
build-integration-image:
runs-on:
[
runs-on,
runner=2cpu-linux-arm64,
"run-id=${{ github.run_id }}-build-integration-image",
"extras=ecr-cache",
]
timeout-minutes: 45
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Format branch name for cache
id: format-branch
env:
PR_NUMBER: ${{ github.event.pull_request.number }}
REF_NAME: ${{ github.ref_name }}
run: |
if [ -n "${PR_NUMBER}" ]; then
CACHE_SUFFIX="${PR_NUMBER}"
else
# shellcheck disable=SC2001
CACHE_SUFFIX=$(echo "${REF_NAME}" | sed 's/[^A-Za-z0-9._-]/-/g')
fi
echo "cache-suffix=${CACHE_SUFFIX}" >> $GITHUB_OUTPUT
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
# needed for pulling openapitools/openapi-generator-cli
# otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
- name: Build and push integration test image with Docker Bake
env:
INTEGRATION_REPOSITORY: ${{ env.RUNS_ON_ECR_CACHE }}
TAG: integration-test-${{ github.run_id }}
CACHE_SUFFIX: ${{ steps.format-branch.outputs.cache-suffix }}
HEAD_SHA: ${{ github.event.pull_request.head.sha || github.sha }}
run: |
docker buildx bake --push \
--set backend.cache-from=type=registry,ref=${RUNS_ON_ECR_CACHE}:backend-cache-${HEAD_SHA} \
--set backend.cache-from=type=registry,ref=${RUNS_ON_ECR_CACHE}:backend-cache-${CACHE_SUFFIX} \
--set backend.cache-from=type=registry,ref=${RUNS_ON_ECR_CACHE}:backend-cache \
--set backend.cache-from=type=registry,ref=onyxdotapp/onyx-backend:latest \
--set backend.cache-to=type=registry,ref=${RUNS_ON_ECR_CACHE}:backend-cache-${HEAD_SHA},mode=max \
--set backend.cache-to=type=registry,ref=${RUNS_ON_ECR_CACHE}:backend-cache-${CACHE_SUFFIX},mode=max \
--set backend.cache-to=type=registry,ref=${RUNS_ON_ECR_CACHE}:backend-cache,mode=max \
--set integration.cache-from=type=registry,ref=${RUNS_ON_ECR_CACHE}:integration-cache-${HEAD_SHA} \
--set integration.cache-from=type=registry,ref=${RUNS_ON_ECR_CACHE}:integration-cache-${CACHE_SUFFIX} \
--set integration.cache-from=type=registry,ref=${RUNS_ON_ECR_CACHE}:integration-cache \
--set integration.cache-to=type=registry,ref=${RUNS_ON_ECR_CACHE}:integration-cache-${HEAD_SHA},mode=max \
--set integration.cache-to=type=registry,ref=${RUNS_ON_ECR_CACHE}:integration-cache-${CACHE_SUFFIX},mode=max \
--set integration.cache-to=type=registry,ref=${RUNS_ON_ECR_CACHE}:integration-cache,mode=max \
integration
integration-tests-mit:
needs:
[
discover-test-dirs,
build-backend-image,
build-model-server-image,
build-integration-image,
]
runs-on:
- runs-on
- runner=4cpu-linux-arm64
- ${{ format('run-id={0}-integration-tests-mit-job-{1}', github.run_id, strategy['job-index']) }}
- extras=ecr-cache
timeout-minutes: 45
strategy:
fail-fast: false
matrix:
test-dir: ${{ fromJson(needs.discover-test-dirs.outputs.test-dirs) }}
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
with:
persist-credentials: false
# needed for pulling Vespa, Redis, Postgres, and Minio images
# otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
# NOTE: Use pre-ping/null pool to reduce flakiness due to dropped connections
# NOTE: don't need web server for integration tests
- name: Create .env file for Docker Compose
env:
ECR_CACHE: ${{ env.RUNS_ON_ECR_CACHE }}
RUN_ID: ${{ github.run_id }}
run: |
cat <<EOF > deployment/docker_compose/.env
AUTH_TYPE=basic
POSTGRES_POOL_PRE_PING=true
POSTGRES_USE_NULL_POOL=true
REQUIRE_EMAIL_VERIFICATION=false
DISABLE_TELEMETRY=true
ONYX_BACKEND_IMAGE=${ECR_CACHE}:integration-test-backend-test-${RUN_ID}
ONYX_MODEL_SERVER_IMAGE=${ECR_CACHE}:integration-test-model-server-test-${RUN_ID}
INTEGRATION_TESTS_MODE=true
MCP_SERVER_ENABLED=true
AUTO_LLM_UPDATE_INTERVAL_SECONDS=10
EOF
- name: Start Docker containers
run: |
cd deployment/docker_compose
docker compose -f docker-compose.yml -f docker-compose.dev.yml up \
relational_db \
index \
cache \
minio \
api_server \
inference_model_server \
indexing_model_server \
background \
-d
id: start_docker
- name: Wait for services to be ready
run: |
echo "Starting wait-for-service script..."
wait_for_service() {
local url=$1
local label=$2
local timeout=${3:-300} # default 5 minutes
local start_time
start_time=$(date +%s)
while true; do
local current_time
current_time=$(date +%s)
local elapsed_time=$((current_time - start_time))
if [ $elapsed_time -ge $timeout ]; then
echo "Timeout reached. ${label} did not become ready in $timeout seconds."
exit 1
fi
local response
response=$(curl -s -o /dev/null -w "%{http_code}" "$url" || echo "curl_error")
if [ "$response" = "200" ]; then
echo "${label} is ready!"
break
elif [ "$response" = "curl_error" ]; then
echo "Curl encountered an error while checking ${label}. Retrying in 5 seconds..."
else
echo "${label} not ready yet (HTTP status $response). Retrying in 5 seconds..."
fi
sleep 5
done
}
wait_for_service "http://localhost:8080/health" "API server"
echo "Finished waiting for services."
- name: Start Mock Services
run: |
cd backend/tests/integration/mock_services
docker compose -f docker-compose.mock-it-services.yml \
-p mock-it-services-stack up -d
# NOTE: Use pre-ping/null to reduce flakiness due to dropped connections
- name: Run Integration Tests for ${{ matrix.test-dir.name }}
uses: nick-fields/retry@ce71cc2ab81d554ebbe88c79ab5975992d79ba08 # ratchet:nick-fields/retry@v3
with:
timeout_minutes: 20
max_attempts: 3
retry_wait_seconds: 10
command: |
echo "Running integration tests for ${{ matrix.test-dir.path }}..."
docker run --rm --network onyx_default \
--name test-runner \
-e POSTGRES_HOST=relational_db \
-e POSTGRES_USER=postgres \
-e POSTGRES_PASSWORD=password \
-e POSTGRES_DB=postgres \
-e DB_READONLY_USER=db_readonly_user \
-e DB_READONLY_PASSWORD=password \
-e POSTGRES_POOL_PRE_PING=true \
-e POSTGRES_USE_NULL_POOL=true \
-e VESPA_HOST=index \
-e REDIS_HOST=cache \
-e API_SERVER_HOST=api_server \
-e OPENAI_API_KEY=${OPENAI_API_KEY} \
-e EXA_API_KEY=${EXA_API_KEY} \
-e SLACK_BOT_TOKEN=${SLACK_BOT_TOKEN} \
-e CONFLUENCE_TEST_SPACE_URL=${CONFLUENCE_TEST_SPACE_URL} \
-e CONFLUENCE_USER_NAME=${CONFLUENCE_USER_NAME} \
-e CONFLUENCE_ACCESS_TOKEN=${CONFLUENCE_ACCESS_TOKEN} \
-e CONFLUENCE_ACCESS_TOKEN_SCOPED=${CONFLUENCE_ACCESS_TOKEN_SCOPED} \
-e JIRA_BASE_URL=${JIRA_BASE_URL} \
-e JIRA_USER_EMAIL=${JIRA_USER_EMAIL} \
-e JIRA_API_TOKEN=${JIRA_API_TOKEN} \
-e JIRA_API_TOKEN_SCOPED=${JIRA_API_TOKEN_SCOPED} \
-e PERM_SYNC_SHAREPOINT_CLIENT_ID=${PERM_SYNC_SHAREPOINT_CLIENT_ID} \
-e PERM_SYNC_SHAREPOINT_PRIVATE_KEY="${PERM_SYNC_SHAREPOINT_PRIVATE_KEY}" \
-e PERM_SYNC_SHAREPOINT_CERTIFICATE_PASSWORD=${PERM_SYNC_SHAREPOINT_CERTIFICATE_PASSWORD} \
-e PERM_SYNC_SHAREPOINT_DIRECTORY_ID=${PERM_SYNC_SHAREPOINT_DIRECTORY_ID} \
-e TEST_WEB_HOSTNAME=test-runner \
-e MOCK_CONNECTOR_SERVER_HOST=mock_connector_server \
-e MOCK_CONNECTOR_SERVER_PORT=8001 \
${{ env.RUNS_ON_ECR_CACHE }}:integration-test-${{ github.run_id }} \
/app/tests/integration/${{ matrix.test-dir.path }}
# ------------------------------------------------------------
# Always gather logs BEFORE "down":
- name: Dump API server logs
if: always()
run: |
cd deployment/docker_compose
docker compose logs --no-color api_server > $GITHUB_WORKSPACE/api_server.log || true
- name: Dump all-container logs (optional)
if: always()
run: |
cd deployment/docker_compose
docker compose logs --no-color > $GITHUB_WORKSPACE/docker-compose.log || true
- name: Upload logs
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
with:
name: docker-all-logs-${{ matrix.test-dir.name }}
path: ${{ github.workspace }}/docker-compose.log
# ------------------------------------------------------------
required:
# NOTE: Github-hosted runners have about 20s faster queue times and are preferred here.
runs-on: ubuntu-slim
timeout-minutes: 45
needs: [integration-tests-mit]
if: ${{ always() }}
steps:
- name: Check job status
if: ${{ contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') || contains(needs.*.result, 'skipped') }}
run: exit 1
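The cache-suffix step in this deleted file (and in its surviving counterparts) sanitizes branch names into registry-safe tag fragments. For example:

    # Any character outside [A-Za-z0-9._-] becomes a dash.
    BRANCH='feature/foo bar#1'
    echo "$BRANCH" | sed 's/[^A-Za-z0-9._-]/-/g'   # -> feature-foo-bar-1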

View File

@@ -22,6 +22,9 @@ env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
GEN_AI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
+    FIRECRAWL_API_KEY: ${{ secrets.FIRECRAWL_API_KEY }}
+    GOOGLE_PSE_API_KEY: ${{ secrets.GOOGLE_PSE_API_KEY }}
+    GOOGLE_PSE_SEARCH_ENGINE_ID: ${{ secrets.GOOGLE_PSE_SEARCH_ENGINE_ID }}
# for federated slack tests
SLACK_CLIENT_ID: ${{ secrets.SLACK_CLIENT_ID }}
@@ -52,6 +55,9 @@ env:
MCP_SERVER_PUBLIC_HOST: host.docker.internal
MCP_SERVER_PUBLIC_URL: http://host.docker.internal:8004/mcp
+    # Visual regression S3 bucket (shared across all jobs)
+    PLAYWRIGHT_S3_BUCKET: onyx-playwright-artifacts
jobs:
build-web-image:
runs-on:
@@ -66,7 +72,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -90,7 +96,7 @@ jobs:
# needed for pulling external images otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
-    uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
+    uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
@@ -127,7 +133,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -151,7 +157,7 @@ jobs:
# needed for pulling external images otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
-    uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
+    uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
@@ -188,7 +194,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -212,7 +218,7 @@ jobs:
# needed for pulling external images otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
-    uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
+    uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
@@ -239,6 +245,9 @@ jobs:
playwright-tests:
needs: [build-web-image, build-backend-image, build-model-server-image]
name: Playwright Tests (${{ matrix.project }})
+    permissions:
+    id-token: write # Required for OIDC-based AWS credential exchange (S3 access)
+    contents: read
runs-on:
- runs-on
- runner=8cpu-linux-arm64
@@ -249,17 +258,17 @@ jobs:
strategy:
fail-fast: false
matrix:
-    project: [admin, no-auth, exclusive]
+    project: [admin, exclusive]
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Setup node
-    uses: actions/setup-node@395ad3262231945c25e8478fd5baf05154b1d79f # ratchet:actions/setup-node@v4
+    uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # ratchet:actions/setup-node@v4
with:
node-version: 22
cache: "npm"
@@ -289,8 +298,12 @@ jobs:
RUN_ID: ${{ github.run_id }}
run: |
cat <<EOF > deployment/docker_compose/.env
+    COMPOSE_PROFILES=s3-filestore
ENABLE_PAID_ENTERPRISE_EDITION_FEATURES=true
+    # TODO(Nik): https://linear.app/onyx-app/issue/ENG-1/update-test-infra-to-use-test-license
+    LICENSE_ENFORCEMENT_ENABLED=false
AUTH_TYPE=basic
+    INTEGRATION_TESTS_MODE=true
GEN_AI_API_KEY=${OPENAI_API_KEY_VALUE}
EXA_API_KEY=${EXA_API_KEY_VALUE}
REQUIRE_EMAIL_VERIFICATION=false
@@ -299,15 +312,12 @@ jobs:
ONYX_MODEL_SERVER_IMAGE=${ECR_CACHE}:playwright-test-model-server-${RUN_ID}
ONYX_WEB_SERVER_IMAGE=${ECR_CACHE}:playwright-test-web-${RUN_ID}
EOF
if [ "${{ matrix.project }}" = "no-auth" ]; then
echo "PLAYWRIGHT_FORCE_EMPTY_LLM_PROVIDERS=true" >> deployment/docker_compose/.env
fi
# needed for pulling Vespa, Redis, Postgres, and Minio images
# otherwise, we hit the "Unauthenticated users" limit
# https://docs.docker.com/docker-hub/usage/
- name: Login to Docker Hub
-    uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef # ratchet:docker/login-action@v3
+    uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}
@@ -428,11 +438,6 @@ jobs:
env:
PROJECT: ${{ matrix.project }}
run: |
-    # Create test-results directory to ensure it exists for artifact upload
-    mkdir -p test-results
-    if [ "${PROJECT}" = "no-auth" ]; then
-    export PLAYWRIGHT_FORCE_EMPTY_LLM_PROVIDERS=true
-    fi
npx playwright test --project ${PROJECT}
- uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
@@ -440,9 +445,134 @@ jobs:
with:
# Includes test results and trace.zip files
name: playwright-test-results-${{ matrix.project }}-${{ github.run_id }}
-    path: ./web/test-results/
+    path: ./web/output/playwright/
retention-days: 30
- name: Upload screenshots
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
if: always()
with:
name: playwright-screenshots-${{ matrix.project }}-${{ github.run_id }}
path: ./web/output/screenshots/
retention-days: 30
# --- Visual Regression Diff ---
- name: Configure AWS credentials
if: always()
uses: aws-actions/configure-aws-credentials@61815dcd50bd041e203e49132bacad1fd04d2708
with:
role-to-assume: ${{ secrets.AWS_OIDC_ROLE_ARN }}
aws-region: us-east-2
- name: Install the latest version of uv
if: always()
uses: astral-sh/setup-uv@61cb8a9741eeb8a550a1b8544337180c0fc8476b # ratchet:astral-sh/setup-uv@v7
with:
enable-cache: false
version: "0.9.9"
- name: Determine baseline revision
if: always()
id: baseline-rev
env:
EVENT_NAME: ${{ github.event_name }}
BASE_REF: ${{ github.event.pull_request.base.ref }}
MERGE_GROUP_BASE_REF: ${{ github.event.merge_group.base_ref }}
GH_REF: ${{ github.ref }}
REF_NAME: ${{ github.ref_name }}
run: |
if [ "${EVENT_NAME}" = "pull_request" ]; then
# PRs compare against the base branch (e.g. main, release/2.5)
echo "rev=${BASE_REF}" >> "$GITHUB_OUTPUT"
elif [ "${EVENT_NAME}" = "merge_group" ]; then
# Merge queue compares against the target branch (e.g. refs/heads/main -> main)
echo "rev=${MERGE_GROUP_BASE_REF#refs/heads/}" >> "$GITHUB_OUTPUT"
elif [[ "${GH_REF}" == refs/tags/* ]]; then
# Tag builds compare against the tag name
echo "rev=${REF_NAME}" >> "$GITHUB_OUTPUT"
else
# Push builds (main, release/*) compare against the branch name
echo "rev=${REF_NAME}" >> "$GITHUB_OUTPUT"
fi
- name: Generate screenshot diff report
if: always()
env:
PROJECT: ${{ matrix.project }}
PLAYWRIGHT_S3_BUCKET: ${{ env.PLAYWRIGHT_S3_BUCKET }}
BASELINE_REV: ${{ steps.baseline-rev.outputs.rev }}
run: |
uv run --no-sync --with onyx-devtools ods screenshot-diff compare \
--project "${PROJECT}" \
--rev "${BASELINE_REV}"
- name: Upload visual diff report to S3
if: always()
env:
PROJECT: ${{ matrix.project }}
PR_NUMBER: ${{ github.event.pull_request.number }}
RUN_ID: ${{ github.run_id }}
run: |
SUMMARY_FILE="web/output/screenshot-diff/${PROJECT}/summary.json"
if [ ! -f "${SUMMARY_FILE}" ]; then
echo "No summary file found — skipping S3 upload."
exit 0
fi
HAS_DIFF=$(jq -r '.has_differences' "${SUMMARY_FILE}")
if [ "${HAS_DIFF}" != "true" ]; then
echo "No visual differences for ${PROJECT} — skipping S3 upload."
exit 0
fi
aws s3 sync "web/output/screenshot-diff/${PROJECT}/" \
"s3://${PLAYWRIGHT_S3_BUCKET}/reports/pr-${PR_NUMBER}/${RUN_ID}/${PROJECT}/"
- name: Upload visual diff summary
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
if: always()
with:
name: screenshot-diff-summary-${{ matrix.project }}
path: ./web/output/screenshot-diff/${{ matrix.project }}/summary.json
if-no-files-found: ignore
retention-days: 5
- name: Upload visual diff report artifact
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
if: always()
with:
name: screenshot-diff-report-${{ matrix.project }}-${{ github.run_id }}
path: ./web/output/screenshot-diff/${{ matrix.project }}/
if-no-files-found: ignore
retention-days: 30
- name: Update S3 baselines
if: >-
success() && (
github.ref == 'refs/heads/main' ||
startsWith(github.ref, 'refs/heads/release/') ||
startsWith(github.ref, 'refs/tags/v') ||
(
github.event_name == 'merge_group' && (
github.event.merge_group.base_ref == 'refs/heads/main' ||
startsWith(github.event.merge_group.base_ref, 'refs/heads/release/')
)
)
)
env:
PROJECT: ${{ matrix.project }}
PLAYWRIGHT_S3_BUCKET: ${{ env.PLAYWRIGHT_S3_BUCKET }}
BASELINE_REV: ${{ steps.baseline-rev.outputs.rev }}
run: |
if [ -d "web/output/screenshots/" ] && [ "$(ls -A web/output/screenshots/)" ]; then
uv run --no-sync --with onyx-devtools ods screenshot-diff upload-baselines \
--project "${PROJECT}" \
--rev "${BASELINE_REV}" \
--delete
else
echo "No screenshots to upload for ${PROJECT} — skipping baseline update."
fi
# save before stopping the containers so the logs can be captured
- name: Save Docker logs
if: success() || failure()
@@ -460,6 +590,98 @@ jobs:
name: docker-logs-${{ matrix.project }}-${{ github.run_id }}
path: ${{ github.workspace }}/docker-compose.log
# Post a single combined visual regression comment after all matrix jobs finish
visual-regression-comment:
needs: [playwright-tests]
if: >-
always() &&
github.event_name == 'pull_request' &&
needs.playwright-tests.result != 'cancelled'
runs-on: ubuntu-slim
timeout-minutes: 5
permissions:
pull-requests: write
steps:
- name: Download visual diff summaries
uses: actions/download-artifact@95815c38cf2ff2164869cbab79da8d1f422bc89e # ratchet:actions/download-artifact@v4
with:
pattern: screenshot-diff-summary-*
path: summaries/
- name: Post combined PR comment
env:
GH_TOKEN: ${{ github.token }}
PR_NUMBER: ${{ github.event.pull_request.number }}
RUN_ID: ${{ github.run_id }}
REPO: ${{ github.repository }}
S3_BUCKET: ${{ env.PLAYWRIGHT_S3_BUCKET }}
run: |
MARKER="<!-- visual-regression-report -->"
# Build the markdown table from all summary files
TABLE_HEADER="| Project | Changed | Added | Removed | Unchanged | Report |"
TABLE_DIVIDER="|---------|---------|-------|---------|-----------|--------|"
TABLE_ROWS=""
HAS_ANY_SUMMARY=false
for SUMMARY_DIR in summaries/screenshot-diff-summary-*/; do
SUMMARY_FILE="${SUMMARY_DIR}summary.json"
if [ ! -f "${SUMMARY_FILE}" ]; then
continue
fi
HAS_ANY_SUMMARY=true
PROJECT=$(jq -r '.project' "${SUMMARY_FILE}")
CHANGED=$(jq -r '.changed' "${SUMMARY_FILE}")
ADDED=$(jq -r '.added' "${SUMMARY_FILE}")
REMOVED=$(jq -r '.removed' "${SUMMARY_FILE}")
UNCHANGED=$(jq -r '.unchanged' "${SUMMARY_FILE}")
TOTAL=$(jq -r '.total' "${SUMMARY_FILE}")
HAS_DIFF=$(jq -r '.has_differences' "${SUMMARY_FILE}")
if [ "${TOTAL}" = "0" ]; then
REPORT_LINK="_No screenshots_"
elif [ "${HAS_DIFF}" = "true" ]; then
REPORT_URL="https://${S3_BUCKET}.s3.us-east-2.amazonaws.com/reports/pr-${PR_NUMBER}/${RUN_ID}/${PROJECT}/index.html"
REPORT_LINK="[View Report](${REPORT_URL})"
else
REPORT_LINK="✅ No changes"
fi
TABLE_ROWS="${TABLE_ROWS}| \`${PROJECT}\` | ${CHANGED} | ${ADDED} | ${REMOVED} | ${UNCHANGED} | ${REPORT_LINK} |\n"
done
if [ "${HAS_ANY_SUMMARY}" = "false" ]; then
echo "No visual diff summaries found — skipping PR comment."
exit 0
fi
BODY=$(printf '%s\n' \
"${MARKER}" \
"### 🖼️ Visual Regression Report" \
"" \
"${TABLE_HEADER}" \
"${TABLE_DIVIDER}" \
"$(printf '%b' "${TABLE_ROWS}")")
# Upsert: find existing comment with the marker, or create a new one
EXISTING_COMMENT_ID=$(gh api \
"repos/${REPO}/issues/${PR_NUMBER}/comments" \
--jq ".[] | select(.body | startswith(\"${MARKER}\")) | .id" \
2>/dev/null | head -1)
if [ -n "${EXISTING_COMMENT_ID}" ]; then
gh api \
--method PATCH \
"repos/${REPO}/issues/comments/${EXISTING_COMMENT_ID}" \
-f body="${BODY}"
else
gh api \
--method POST \
"repos/${REPO}/issues/${PR_NUMBER}/comments" \
-f body="${BODY}"
fi
playwright-required:
# NOTE: Github-hosted runners have about 20s faster queue times and are preferred here.
runs-on: ubuntu-slim
@@ -470,48 +692,3 @@ jobs:
- name: Check job status
if: ${{ contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') || contains(needs.*.result, 'skipped') }}
run: exit 1
# NOTE: Chromatic UI diff testing is currently disabled.
# We are using Playwright for local and CI testing without visual regression checks.
# Chromatic may be reintroduced in the future for UI diff testing if needed.
# chromatic-tests:
# name: Chromatic Tests
# needs: playwright-tests
# runs-on:
# [
# runs-on,
# runner=32cpu-linux-x64,
# disk=large,
# "run-id=${{ github.run_id }}",
# ]
# steps:
# - name: Checkout code
# uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
# with:
# fetch-depth: 0
# - name: Setup node
# uses: actions/setup-node@395ad3262231945c25e8478fd5baf05154b1d79f # ratchet:actions/setup-node@v4
# with:
# node-version: 22
# - name: Install node dependencies
# working-directory: ./web
# run: npm ci
# - name: Download Playwright test results
# uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # ratchet:actions/download-artifact@v4
# with:
# name: test-results
# path: ./web/test-results
# - name: Run Chromatic
# uses: chromaui/action@latest
# with:
# playwright: true
# projectToken: ${{ secrets.CHROMATIC_PROJECT_TOKEN }}
# workingDir: ./web
# env:
# CHROMATIC_ARCHIVE_LOCATION: ./test-results

View File

@@ -27,7 +27,7 @@ jobs:
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -42,6 +42,9 @@ jobs:
- name: Generate OpenAPI schema and Python client
shell: bash
+    # TODO(Nik): https://linear.app/onyx-app/issue/ENG-1/update-test-infra-to-use-test-license
+    env:
+    LICENSE_ENFORCEMENT_ENABLED: "false"
run: |
ods openapi all
@@ -50,8 +53,9 @@ jobs:
uses: runs-on/cache@50350ad4242587b6c8c2baa2e740b1bc11285ff4 # ratchet:runs-on/cache@v4
with:
path: backend/.mypy_cache
-    key: mypy-${{ runner.os }}-${{ hashFiles('**/*.py', '**/*.pyi', 'backend/pyproject.toml') }}
+    key: mypy-${{ runner.os }}-${{ github.base_ref || github.event.merge_group.base_ref || 'main' }}-${{ hashFiles('**/*.py', '**/*.pyi', 'backend/pyproject.toml') }}
restore-keys: |
+    mypy-${{ runner.os }}-${{ github.base_ref || github.event.merge_group.base_ref || 'main' }}-
mypy-${{ runner.os }}-
- name: Run MyPy

View File

@@ -65,7 +65,7 @@ env:
ZENDESK_TOKEN: ${{ secrets.ZENDESK_TOKEN }}
# Salesforce
-    SF_USERNAME: ${{ secrets.SF_USERNAME }}
+    SF_USERNAME: ${{ vars.SF_USERNAME }}
SF_PASSWORD: ${{ secrets.SF_PASSWORD }}
SF_SECURITY_TOKEN: ${{ secrets.SF_SECURITY_TOKEN }}
@@ -110,6 +110,9 @@ env:
# Slack
SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
+    # Discord
+    DISCORD_CONNECTOR_BOT_TOKEN: ${{ secrets.DISCORD_CONNECTOR_BOT_TOKEN }}
# Teams
TEAMS_APPLICATION_ID: ${{ secrets.TEAMS_APPLICATION_ID }}
TEAMS_DIRECTORY_ID: ${{ secrets.TEAMS_DIRECTORY_ID }}
@@ -139,7 +142,7 @@ jobs:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false

View File

@@ -38,7 +38,7 @@ jobs:
steps:
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
@@ -64,7 +64,7 @@ jobs:
echo "cache-suffix=${CACHE_SUFFIX}" >> $GITHUB_OUTPUT
- name: Login to Docker Hub
-    uses: docker/login-action@5e57cd118135c172c3672efd75eb46360885c0ef
+    uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_TOKEN }}

View File

@@ -27,12 +27,14 @@ jobs:
PYTHONPATH: ./backend
REDIS_CLOUD_PYTEST_PASSWORD: ${{ secrets.REDIS_CLOUD_PYTEST_PASSWORD }}
DISABLE_TELEMETRY: "true"
+    # TODO(Nik): https://linear.app/onyx-app/issue/ENG-1/update-test-infra-to-use-test-license
+    LICENSE_ENFORCEMENT_ENABLED: "false"
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout code
-    uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false

View File

@@ -20,17 +20,17 @@ jobs:
runs-on: ubuntu-latest
timeout-minutes: 45
steps:
-    - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
+    - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
fetch-depth: 0
persist-credentials: false
-    - uses: actions/setup-python@83679a892e2d95755f2dac6acb0bfd1e9ac5d548 # ratchet:actions/setup-python@v6
+    - uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # ratchet:actions/setup-python@v6
with:
python-version: "3.11"
- name: Setup Terraform
uses: hashicorp/setup-terraform@b9cd54a3c349d3f38e8881555d616ced269862dd # ratchet:hashicorp/setup-terraform@v3
- name: Setup node
-    uses: actions/setup-node@395ad3262231945c25e8478fd5baf05154b1d79f # ratchet:actions/setup-node@v6
+    uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # ratchet:actions/setup-node@v6
with: # zizmor: ignore[cache-poisoning]
node-version: 22
cache: "npm"
@@ -38,7 +38,7 @@ jobs:
- name: Install node dependencies
working-directory: ./web
run: npm ci
-    - uses: j178/prek-action@91fd7d7cf70ae1dee9f4f44e7dfa5d1073fe6623 # ratchet:j178/prek-action@v1
+    - uses: j178/prek-action@9d6a3097e0c1865ecce00cfb89fe80f2ee91b547 # ratchet:j178/prek-action@v1
with:
prek-version: '0.2.21'
extra-args: ${{ github.event_name == 'pull_request' && format('--from-ref {0} --to-ref {1}', github.event.pull_request.base.sha, github.event.pull_request.head.sha) || github.event_name == 'merge_group' && format('--from-ref {0} --to-ref {1}', github.event.merge_group.base_sha, github.event.merge_group.head_sha) || github.ref_name == 'main' && '--all-files' || '' }}

.github/workflows/preview.yml vendored Normal file
View File

@@ -0,0 +1,73 @@
name: Preview Deployment
env:
VERCEL_ORG_ID: ${{ secrets.VERCEL_ORG_ID }}
VERCEL_PROJECT_ID: ${{ secrets.VERCEL_PROJECT_ID }}
VERCEL_CLI: vercel@50.14.1
on:
push:
branches-ignore:
- main
paths:
- "web/**"
permissions:
contents: read
pull-requests: write
jobs:
Deploy-Preview:
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd
with:
persist-credentials: false
- name: Setup node
uses: actions/setup-node@6044e13b5dc448c55e2357c09f80417699197238 # ratchet:actions/setup-node@v4
with:
node-version: 22
cache: "npm"
cache-dependency-path: ./web/package-lock.json
- name: Pull Vercel Environment Information
run: npx --yes ${{ env.VERCEL_CLI }} pull --yes --environment=preview --token=${{ secrets.VERCEL_TOKEN }}
- name: Build Project Artifacts
run: npx --yes ${{ env.VERCEL_CLI }} build --token=${{ secrets.VERCEL_TOKEN }}
- name: Deploy Project Artifacts to Vercel
id: deploy
run: |
DEPLOYMENT_URL=$(npx --yes ${{ env.VERCEL_CLI }} deploy --prebuilt --token=${{ secrets.VERCEL_TOKEN }})
echo "url=$DEPLOYMENT_URL" >> "$GITHUB_OUTPUT"
- name: Update PR comment with deployment URL
if: always() && steps.deploy.outputs.url
env:
GH_TOKEN: ${{ github.token }}
DEPLOYMENT_URL: ${{ steps.deploy.outputs.url }}
run: |
# Find the PR for this branch
PR_NUMBER=$(gh pr list --head "$GITHUB_REF_NAME" --json number --jq '.[0].number')
if [ -z "$PR_NUMBER" ]; then
echo "No open PR found for branch $GITHUB_REF_NAME, skipping comment."
exit 0
fi
COMMENT_MARKER="<!-- preview-deployment -->"
COMMENT_BODY="$COMMENT_MARKER
**Preview Deployment**
| Status | Preview | Commit | Updated |
| --- | --- | --- | --- |
| ✅ | $DEPLOYMENT_URL | \`${GITHUB_SHA::7}\` | $(date -u '+%Y-%m-%d %H:%M:%S UTC') |"
# Find existing comment by marker
EXISTING_COMMENT_ID=$(gh api "repos/$GITHUB_REPOSITORY/issues/$PR_NUMBER/comments" \
--jq ".[] | select(.body | startswith(\"$COMMENT_MARKER\")) | .id" | head -1)
if [ -n "$EXISTING_COMMENT_ID" ]; then
gh api "repos/$GITHUB_REPOSITORY/issues/comments/$EXISTING_COMMENT_ID" \
--method PATCH --field body="$COMMENT_BODY"
else
gh pr comment "$PR_NUMBER" --body "$COMMENT_BODY"
fi
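
For reference, the marker-based comment upsert above translates to roughly the following Python. The REST endpoints are the standard GitHub API; the helper name and token handling are illustrative assumptions, not code from this PR.

```python
import os

import requests

API = "https://api.github.com"
MARKER = "<!-- preview-deployment -->"


def upsert_preview_comment(repo: str, pr_number: int, body: str) -> None:
    headers = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}
    comments = requests.get(
        f"{API}/repos/{repo}/issues/{pr_number}/comments", headers=headers
    ).json()
    # Reuse the existing marker comment if present, otherwise create a new one.
    existing = next((c for c in comments if c["body"].startswith(MARKER)), None)
    if existing:
        requests.patch(
            f"{API}/repos/{repo}/issues/comments/{existing['id']}",
            headers=headers,
            json={"body": body},
        )
    else:
        requests.post(
            f"{API}/repos/{repo}/issues/{pr_number}/comments",
            headers=headers,
            json={"body": body},
        )
```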

View File

@@ -24,11 +24,11 @@ jobs:
- { goos: "darwin", goarch: "arm64" }
- { goos: "", goarch: "" }
steps:
- uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
fetch-depth: 0
- uses: astral-sh/setup-uv@ed21f2f24f8dd64503750218de024bcf64c7250a # ratchet:astral-sh/setup-uv@v7
- uses: astral-sh/setup-uv@61cb8a9741eeb8a550a1b8544337180c0fc8476b # ratchet:astral-sh/setup-uv@v7
with:
enable-cache: false
version: "0.9.9"

290
.github/workflows/sandbox-deployment.yml vendored Normal file
View File

@@ -0,0 +1,290 @@
name: Build and Push Sandbox Image on Tag
on:
push:
tags:
- "experimental-cc4a.*"
# Restrictive defaults; jobs declare what they need.
permissions: {}
jobs:
check-sandbox-changes:
runs-on: ubuntu-slim
timeout-minutes: 10
permissions:
contents: read
outputs:
sandbox-changed: ${{ steps.check.outputs.sandbox-changed }}
new-version: ${{ steps.version.outputs.new-version }}
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
fetch-depth: 0
- name: Check for sandbox-relevant file changes
id: check
run: |
# Get the previous tag to diff against
CURRENT_TAG="${GITHUB_REF_NAME}"
PREVIOUS_TAG=$(git tag --sort=-creatordate | grep '^experimental-cc4a\.' | grep -v "^${CURRENT_TAG}$" | head -n 1)
if [ -z "$PREVIOUS_TAG" ]; then
echo "No previous experimental-cc4a tag found, building unconditionally"
echo "sandbox-changed=true" >> "$GITHUB_OUTPUT"
exit 0
fi
echo "Comparing ${PREVIOUS_TAG}..${CURRENT_TAG}"
# Check if any sandbox-relevant files changed
SANDBOX_PATHS=(
"backend/onyx/server/features/build/sandbox/"
)
CHANGED=false
for path in "${SANDBOX_PATHS[@]}"; do
if git diff --name-only "${PREVIOUS_TAG}..${CURRENT_TAG}" -- "$path" | grep -q .; then
echo "Changes detected in: $path"
CHANGED=true
break
fi
done
echo "sandbox-changed=$CHANGED" >> "$GITHUB_OUTPUT"
- name: Determine new sandbox version
id: version
if: steps.check.outputs.sandbox-changed == 'true'
run: |
# Query Docker Hub for the latest versioned tag
LATEST_TAG=$(curl -s "https://hub.docker.com/v2/repositories/onyxdotapp/sandbox/tags?page_size=100" \
| jq -r '.results[].name' \
| grep -E '^v[0-9]+\.[0-9]+\.[0-9]+$' \
| sort -V \
| tail -n 1)
if [ -z "$LATEST_TAG" ]; then
echo "No existing version tags found on Docker Hub, starting at 0.1.1"
NEW_VERSION="0.1.1"
else
CURRENT_VERSION="${LATEST_TAG#v}"
echo "Latest version on Docker Hub: $CURRENT_VERSION"
# Increment patch version
MAJOR=$(echo "$CURRENT_VERSION" | cut -d. -f1)
MINOR=$(echo "$CURRENT_VERSION" | cut -d. -f2)
PATCH=$(echo "$CURRENT_VERSION" | cut -d. -f3)
NEW_PATCH=$((PATCH + 1))
NEW_VERSION="${MAJOR}.${MINOR}.${NEW_PATCH}"
fi
echo "New version: $NEW_VERSION"
echo "new-version=$NEW_VERSION" >> "$GITHUB_OUTPUT"
build-sandbox-amd64:
needs: check-sandbox-changes
if: needs.check-sandbox-changes.outputs.sandbox-changed == 'true'
runs-on:
- runs-on
- runner=4cpu-linux-x64
- run-id=${{ github.run_id }}-sandbox-amd64
- extras=ecr-cache
timeout-minutes: 90
environment: release
permissions:
contents: read
id-token: write
outputs:
digest: ${{ steps.build.outputs.digest }}
env:
REGISTRY_IMAGE: onyxdotapp/sandbox
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@61815dcd50bd041e203e49132bacad1fd04d2708
with:
role-to-assume: ${{ secrets.AWS_OIDC_ROLE_ARN }}
aws-region: us-east-2
- name: Get AWS Secrets
uses: aws-actions/aws-secretsmanager-get-secrets@a9a7eb4e2f2871d30dc5b892576fde60a2ecc802
with:
secret-ids: |
DOCKER_USERNAME, deploy/docker-username
DOCKER_TOKEN, deploy/docker-token
parse-json-secrets: true
- name: Docker meta
id: meta
uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # ratchet:docker/metadata-action@v5
with:
images: ${{ env.REGISTRY_IMAGE }}
flavor: |
latest=false
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
- name: Build and push AMD64
id: build
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # ratchet:docker/build-push-action@v6
with:
context: ./backend/onyx/server/features/build/sandbox/kubernetes/docker
file: ./backend/onyx/server/features/build/sandbox/kubernetes/docker/Dockerfile
platforms: linux/amd64
labels: ${{ steps.meta.outputs.labels }}
cache-from: |
type=registry,ref=${{ env.REGISTRY_IMAGE }}:latest
cache-to: |
type=inline
outputs: type=image,name=${{ env.REGISTRY_IMAGE }},push-by-digest=true,name-canonical=true,push=true
build-sandbox-arm64:
needs: check-sandbox-changes
if: needs.check-sandbox-changes.outputs.sandbox-changed == 'true'
runs-on:
- runs-on
- runner=4cpu-linux-arm64
- run-id=${{ github.run_id }}-sandbox-arm64
- extras=ecr-cache
timeout-minutes: 90
environment: release
permissions:
contents: read
id-token: write
outputs:
digest: ${{ steps.build.outputs.digest }}
env:
REGISTRY_IMAGE: onyxdotapp/sandbox
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
persist-credentials: false
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@61815dcd50bd041e203e49132bacad1fd04d2708
with:
role-to-assume: ${{ secrets.AWS_OIDC_ROLE_ARN }}
aws-region: us-east-2
- name: Get AWS Secrets
uses: aws-actions/aws-secretsmanager-get-secrets@a9a7eb4e2f2871d30dc5b892576fde60a2ecc802
with:
secret-ids: |
DOCKER_USERNAME, deploy/docker-username
DOCKER_TOKEN, deploy/docker-token
parse-json-secrets: true
- name: Docker meta
id: meta
uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # ratchet:docker/metadata-action@v5
with:
images: ${{ env.REGISTRY_IMAGE }}
flavor: |
latest=false
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
- name: Build and push ARM64
id: build
uses: docker/build-push-action@263435318d21b8e681c14492fe198d362a7d2c83 # ratchet:docker/build-push-action@v6
with:
context: ./backend/onyx/server/features/build/sandbox/kubernetes/docker
file: ./backend/onyx/server/features/build/sandbox/kubernetes/docker/Dockerfile
platforms: linux/arm64
labels: ${{ steps.meta.outputs.labels }}
cache-from: |
type=registry,ref=${{ env.REGISTRY_IMAGE }}:latest
cache-to: |
type=inline
outputs: type=image,name=${{ env.REGISTRY_IMAGE }},push-by-digest=true,name-canonical=true,push=true
merge-sandbox:
needs:
- check-sandbox-changes
- build-sandbox-amd64
- build-sandbox-arm64
runs-on:
- runs-on
- runner=2cpu-linux-x64
- run-id=${{ github.run_id }}-merge-sandbox
- extras=ecr-cache
timeout-minutes: 30
environment: release
permissions:
id-token: write
env:
REGISTRY_IMAGE: onyxdotapp/sandbox
steps:
- uses: runs-on/action@cd2b598b0515d39d78c38a02d529db87d2196d1e # ratchet:runs-on/action@v2
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@61815dcd50bd041e203e49132bacad1fd04d2708
with:
role-to-assume: ${{ secrets.AWS_OIDC_ROLE_ARN }}
aws-region: us-east-2
- name: Get AWS Secrets
uses: aws-actions/aws-secretsmanager-get-secrets@a9a7eb4e2f2871d30dc5b892576fde60a2ecc802
with:
secret-ids: |
DOCKER_USERNAME, deploy/docker-username
DOCKER_TOKEN, deploy/docker-token
parse-json-secrets: true
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # ratchet:docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # ratchet:docker/login-action@v3
with:
username: ${{ env.DOCKER_USERNAME }}
password: ${{ env.DOCKER_TOKEN }}
- name: Docker meta
id: meta
uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # ratchet:docker/metadata-action@v5
with:
images: ${{ env.REGISTRY_IMAGE }}
flavor: |
latest=false
tags: |
type=raw,value=v${{ needs.check-sandbox-changes.outputs.new-version }}
type=raw,value=latest
- name: Create and push manifest
env:
IMAGE_REPO: ${{ env.REGISTRY_IMAGE }}
AMD64_DIGEST: ${{ needs.build-sandbox-amd64.outputs.digest }}
ARM64_DIGEST: ${{ needs.build-sandbox-arm64.outputs.digest }}
META_TAGS: ${{ steps.meta.outputs.tags }}
run: |
IMAGES="${IMAGE_REPO}@${AMD64_DIGEST} ${IMAGE_REPO}@${ARM64_DIGEST}"
docker buildx imagetools create \
$(printf '%s\n' "${META_TAGS}" | xargs -I {} echo -t {}) \
$IMAGES
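
The "Determine new sandbox version" step earlier in this workflow is fairly dense shell; a rough Python sketch of the same patch-bump logic follows (the function name is illustrative, not part of this PR).

```python
import re

import requests


def next_sandbox_version(repo: str = "onyxdotapp/sandbox") -> str:
    resp = requests.get(
        f"https://hub.docker.com/v2/repositories/{repo}/tags?page_size=100"
    )
    tags = [r["name"] for r in resp.json().get("results", [])]
    # Keep only vMAJOR.MINOR.PATCH tags and sort them numerically.
    versions = sorted(
        tuple(int(part) for part in tag[1:].split("."))
        for tag in tags
        if re.fullmatch(r"v\d+\.\d+\.\d+", tag)
    )
    if not versions:
        return "0.1.1"  # same fallback as the workflow
    major, minor, patch = versions[-1]
    return f"{major}.{minor}.{patch + 1}"
```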

View File

@@ -14,7 +14,7 @@ jobs:
contents: read
steps:
- name: Checkout main Onyx repo
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
fetch-depth: 0
persist-credentials: false

View File

@@ -18,7 +18,7 @@ jobs:
# see https://github.com/orgs/community/discussions/27028#discussioncomment-3254367 for the workaround we
# implement here, which needs an actual user's deploy key
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6
with:
ssh-key: "${{ secrets.DEPLOY_KEY }}"
persist-credentials: true

View File

@@ -5,6 +5,8 @@ on:
branches: ["main"]
pull_request:
branches: ["**"]
paths:
- ".github/**"
permissions: {}
@@ -17,33 +19,22 @@ jobs:
security-events: write # needed for SARIF uploads
steps:
- name: Checkout repository
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6.0.1
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # ratchet:actions/checkout@v6.0.2
with:
persist-credentials: false
- name: Detect changes
id: filter
uses: dorny/paths-filter@de90cc6fb38fc0963ad72b210f1f284cd68cea36 # ratchet:dorny/paths-filter@v3
with:
filters: |
zizmor:
- '.github/**'
- name: Install the latest version of uv
if: steps.filter.outputs.zizmor == 'true' || github.ref_name == 'main'
uses: astral-sh/setup-uv@ed21f2f24f8dd64503750218de024bcf64c7250a # ratchet:astral-sh/setup-uv@v7
uses: astral-sh/setup-uv@61cb8a9741eeb8a550a1b8544337180c0fc8476b # ratchet:astral-sh/setup-uv@v7
with:
enable-cache: false
version: "0.9.9"
- name: Run zizmor
if: steps.filter.outputs.zizmor == 'true' || github.ref_name == 'main'
run: uv run --no-sync --with zizmor zizmor --format=sarif . > results.sarif
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Upload SARIF file
if: steps.filter.outputs.zizmor == 'true' || github.ref_name == 'main'
uses: github/codeql-action/upload-sarif@ba454b8ab46733eb6145342877cd148270bb77ab # ratchet:github/codeql-action/upload-sarif@codeql-bundle-v2.23.5
with:
sarif_file: results.sarif

9
.gitignore vendored
View File

@@ -1,10 +1,13 @@
# editors
.vscode
.vscode/*
!/.vscode/env_template.txt
!/.vscode/env.web_template.txt
!/.vscode/launch.json
!/.vscode/tasks.template.jsonc
.zed
.cursor
!/.cursor/mcp.json
!/.cursor/skills/
# macos
.DS_store
@@ -39,10 +42,6 @@ settings.json
/backend/tests/regression/answer_quality/search_test_config.yaml
*.egg-info
# Claude
AGENTS.md
CLAUDE.md
# Local .terraform directories
**/.terraform/*

View File

@@ -66,7 +66,8 @@ repos:
- id: uv-run
name: Check lazy imports
args: ["--active", "--with=onyx-devtools", "ods", "check-lazy-imports"]
files: ^backend/(?!\.venv/).*\.py$
pass_filenames: true
files: ^backend/(?!\.venv/|scripts/).*\.py$
# NOTE: This takes ~6s on a single, large module which is prohibitively slow.
# - id: uv-run
# name: mypy

16
.vscode/env.web_template.txt vendored Normal file
View File

@@ -0,0 +1,16 @@
# Copy this file to .env.web in the .vscode folder.
# Fill in the <REPLACE THIS> values as needed
# Web Server specific environment variables
# Minimal set needed for Next.js dev server
# Auth
AUTH_TYPE=basic
DEV_MODE=true
# Enable the full set of Danswer Enterprise Edition features.
# NOTE: DO NOT ENABLE THIS UNLESS YOU HAVE A PAID ENTERPRISE LICENSE (or if you
# are using this for local testing/development).
ENABLE_PAID_ENTERPRISE_EDITION_FEATURES=false
# Enable Onyx Craft
ENABLE_CRAFT=true

View File

@@ -6,13 +6,13 @@
# processes.
# For local dev, user authentication is often not needed.
AUTH_TYPE=disabled
AUTH_TYPE=basic
DEV_MODE=true
# Always keep these on for Dev.
# Logs model prompts, reasoning, and answer to stdout.
LOG_ONYX_MODEL_INTERACTIONS=True
LOG_ONYX_MODEL_INTERACTIONS=False
# More verbose logging
LOG_LEVEL=debug
@@ -35,7 +35,6 @@ GEN_AI_API_KEY=<REPLACE THIS>
OPENAI_API_KEY=<REPLACE THIS>
# If answer quality isn't important for dev, use gpt-4o-mini since it's cheaper.
GEN_AI_MODEL_VERSION=gpt-4o
FAST_GEN_AI_MODEL_VERSION=gpt-4o
# Python stuff

69
.vscode/launch.json vendored
View File

@@ -25,6 +25,7 @@
"Celery heavy",
"Celery docfetching",
"Celery docprocessing",
"Celery user_file_processing",
"Celery beat"
],
"presentation": {
@@ -86,7 +87,7 @@
"request": "launch",
"cwd": "${workspaceRoot}/web",
"runtimeExecutable": "npm",
"envFile": "${workspaceFolder}/.vscode/.env",
"envFile": "${workspaceFolder}/.vscode/.env.web",
"runtimeArgs": ["run", "dev"],
"presentation": {
"group": "2"
@@ -121,7 +122,6 @@
"cwd": "${workspaceFolder}/backend",
"envFile": "${workspaceFolder}/.vscode/.env",
"env": {
"LOG_ONYX_MODEL_INTERACTIONS": "True",
"LOG_LEVEL": "DEBUG",
"PYTHONUNBUFFERED": "1"
},
@@ -149,6 +149,24 @@
},
"consoleTitle": "Slack Bot Console"
},
{
"name": "Discord Bot",
"consoleName": "Discord Bot",
"type": "debugpy",
"request": "launch",
"program": "onyx/onyxbot/discord/client.py",
"cwd": "${workspaceFolder}/backend",
"envFile": "${workspaceFolder}/.vscode/.env",
"env": {
"LOG_LEVEL": "DEBUG",
"PYTHONUNBUFFERED": "1",
"PYTHONPATH": "."
},
"presentation": {
"group": "2"
},
"consoleTitle": "Discord Bot Console"
},
{
"name": "MCP Server",
"consoleName": "MCP Server",
@@ -228,7 +246,7 @@
"--loglevel=INFO",
"--hostname=light@%n",
"-Q",
"vespa_metadata_sync,connector_deletion,doc_permissions_upsert,index_attempt_cleanup"
"vespa_metadata_sync,connector_deletion,doc_permissions_upsert,index_attempt_cleanup,opensearch_migration"
],
"presentation": {
"group": "2"
@@ -257,7 +275,7 @@
"--loglevel=INFO",
"--hostname=background@%n",
"-Q",
"vespa_metadata_sync,connector_deletion,doc_permissions_upsert,checkpoint_cleanup,index_attempt_cleanup,docprocessing,connector_doc_fetching,user_files_indexing,connector_pruning,connector_doc_permissions_sync,connector_external_group_sync,csv_generation,kg_processing,monitoring,user_file_processing,user_file_project_sync,user_file_delete"
"vespa_metadata_sync,connector_deletion,doc_permissions_upsert,checkpoint_cleanup,index_attempt_cleanup,docprocessing,connector_doc_fetching,connector_pruning,connector_doc_permissions_sync,connector_external_group_sync,csv_generation,kg_processing,monitoring,user_file_processing,user_file_project_sync,user_file_delete,opensearch_migration"
],
"presentation": {
"group": "2"
@@ -397,12 +415,11 @@
"onyx.background.celery.versioned_apps.docfetching",
"worker",
"--pool=threads",
"--concurrency=1",
"--prefetch-multiplier=1",
"--loglevel=INFO",
"--hostname=docfetching@%n",
"-Q",
"connector_doc_fetching,user_files_indexing"
"connector_doc_fetching"
],
"presentation": {
"group": "2"
@@ -428,7 +445,6 @@
"onyx.background.celery.versioned_apps.docprocessing",
"worker",
"--pool=threads",
"--concurrency=6",
"--prefetch-multiplier=1",
"--loglevel=INFO",
"--hostname=docprocessing@%n",
@@ -556,7 +572,6 @@
"cwd": "${workspaceFolder}/backend",
"envFile": "${workspaceFolder}/.vscode/.env",
"env": {
"LOG_ONYX_MODEL_INTERACTIONS": "True",
"LOG_LEVEL": "DEBUG",
"PYTHONUNBUFFERED": "1",
"PYTHONPATH": "."
@@ -577,6 +592,23 @@
"group": "3"
}
},
{
"name": "Build Sandbox Templates",
"type": "debugpy",
"request": "launch",
"module": "onyx.server.features.build.sandbox.build_templates",
"cwd": "${workspaceFolder}/backend",
"envFile": "${workspaceFolder}/.vscode/.env",
"env": {
"PYTHONUNBUFFERED": "1",
"PYTHONPATH": "."
},
"console": "integratedTerminal",
"presentation": {
"group": "3"
},
"consoleTitle": "Build Sandbox Templates"
},
{
// Dummy entry used to label the group
"name": "--- Database ---",
@@ -587,6 +619,27 @@
"order": 0
}
},
{
"name": "Restore seeded database dump",
"type": "node",
"request": "launch",
"runtimeExecutable": "uv",
"runtimeArgs": [
"run",
"--with",
"onyx-devtools",
"ods",
"db",
"restore",
"--fetch-seeded",
"--yes"
],
"cwd": "${workspaceFolder}",
"console": "integratedTerminal",
"presentation": {
"group": "4"
}
},
{
"name": "Clean restore seeded database dump (destructive)",
"type": "node",

View File

@@ -1,26 +1,25 @@
# CLAUDE.md
# PROJECT KNOWLEDGE BASE
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
This file provides guidance to AI agents when working with code in this repository.
## KEY NOTES
- If you run into any missing python dependency errors, try running your command with `source .venv/bin/activate` \
to activate the python venv.
- To make tests work, check the `.env` file at the root of the project to find an OpenAI key.
- If using `playwright` to explore the frontend, you can usually log in with username `a@example.com` and password
`a`. The app can be accessed at `http://localhost:3000`.
- You should assume that all Onyx services are running. To verify, you can check the `backend/log` directory to
make sure we see logs coming out from the relevant service.
- To connect to the Postgres database, use: `docker exec -it onyx-relational_db-1 psql -U postgres -c "<SQL>"`
- When making calls to the backend, always go through the frontend. E.g. make a call to `http://localhost:3000/api/persona` not `http://localhost:8080/api/persona`
- Put ALL db operations under the `backend/onyx/db` / `backend/ee/onyx/db` directories. Don't run queries
outside of those directories.
## Project Overview
**Onyx** (formerly Danswer) is an open-source Gen-AI and Enterprise Search platform that connects to company documents, apps, and people. It features a modular architecture with both Community Edition (MIT licensed) and Enterprise Edition offerings.
### Background Workers (Celery)
Onyx uses Celery for asynchronous task processing with multiple specialized workers:
@@ -92,6 +91,7 @@ Onyx uses Celery for asynchronous task processing with multiple specialized work
Onyx supports two deployment modes for background workers, controlled by the `USE_LIGHTWEIGHT_BACKGROUND_WORKER` environment variable:
**Lightweight Mode** (default, `USE_LIGHTWEIGHT_BACKGROUND_WORKER=true`):
- Runs a single consolidated `background` worker that handles all background tasks:
- Light worker tasks (Vespa operations, permissions sync, deletion)
- Document processing (indexing pipeline)
@@ -105,12 +105,14 @@ Onyx supports two deployment modes for background workers, controlled by the `US
- Default concurrency: 20 threads (increased to handle combined workload)
**Standard Mode** (`USE_LIGHTWEIGHT_BACKGROUND_WORKER=false`):
- Runs separate specialized workers as documented above (light, docprocessing, docfetching, heavy, kg_processing, monitoring, user_file_processing)
- Better isolation and scalability
- Can scale individual workers independently based on workload
- Suitable for production deployments with higher load
The deployment mode affects:
- **Backend**: Worker processes spawned by supervisord or dev scripts
- **Helm**: Which Kubernetes deployments are created
- **Dev Environment**: Which workers `dev_run_background_jobs.py` spawns
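
As a hypothetical illustration of what the toggle implies for a dev runner (queue names abridged from the launch configs above; the real spawning logic lives in `dev_run_background_jobs.py`):

```python
import os

# Queue groups abridged for illustration; see the launch configs for full lists.
LIGHT_QUEUES = ["vespa_metadata_sync", "connector_deletion", "doc_permissions_upsert"]
HEAVY_QUEUES = ["docprocessing", "connector_doc_fetching", "kg_processing", "monitoring"]


def worker_queue_groups() -> list[list[str]]:
    lightweight = os.environ.get("USE_LIGHTWEIGHT_BACKGROUND_WORKER", "true") == "true"
    if lightweight:
        # Lightweight mode: one consolidated worker consumes every queue.
        return [LIGHT_QUEUES + HEAVY_QUEUES]
    # Standard mode: specialized workers, one group each, scaled independently.
    return [LIGHT_QUEUES, HEAVY_QUEUES]
```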
@@ -119,18 +121,18 @@ The deployment mode affects:
- **Thread-based Workers**: All workers use thread pools (not processes) for stability
- **Tenant Awareness**: Multi-tenant support with per-tenant task isolation. There is a
middleware layer that automatically finds the appropriate tenant ID when sending tasks
via Celery Beat.
- **Task Prioritization**: High, Medium, Low priority queues
- **Monitoring**: Built-in heartbeat and liveness checking
- **Failure Handling**: Automatic retry and failure recovery mechanisms
- **Redis Coordination**: Inter-process communication via Redis
- **PostgreSQL State**: Task state and metadata stored in PostgreSQL
#### Important Notes
**Defining Tasks**:
- Always use `@shared_task` rather than `@celery_app`
- Put tasks under `background/celery/tasks/` or `ee/background/celery/tasks`
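A minimal sketch of the preferred task shape (module placement and names are illustrative):

```python
# e.g. backend/onyx/background/celery/tasks/example.py (illustrative path)
from celery import shared_task


@shared_task
def sync_document_metadata(document_id: str) -> None:
    # @shared_task binds to whichever Celery app imports the module,
    # instead of hard-wiring a specific celery_app instance.
    ...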
@@ -142,7 +144,12 @@ function.
If you make any updates to a celery worker and you want to test these changes, you will need
to ask me to restart the celery worker. There is no auto-restart on code-change mechanism.
**Task Time Limits**:
Since all tasks are executed in thread pools, the time limit features of Celery are silently
disabled and won't work. Timeout logic must be implemented within the task itself.
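For example, a task can enforce its own wall-clock budget; this sketch uses illustrative names and is not from the codebase:

```python
import time

from celery import shared_task

SOFT_DEADLINE_SEC = 300  # illustrative per-task budget


@shared_task
def prune_stale_documents(document_ids: list[str]) -> None:
    start = time.monotonic()
    for document_id in document_ids:
        # Celery time limits are no-ops under thread pools, so check the
        # deadline manually and exit cleanly; a later run resumes the rest.
        if time.monotonic() - start > SOFT_DEADLINE_SEC:
            return
        ...  # process document_id
```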
### Code Quality
```bash
# Install and run pre-commit hooks
pre-commit install
@@ -154,6 +161,7 @@ NOTE: Always make sure everything is strictly typed (both in Python and Typescri
## Architecture Overview
### Technology Stack
- **Backend**: Python 3.11, FastAPI, SQLAlchemy, Alembic, Celery
- **Frontend**: Next.js 15+, React 18, TypeScript, Tailwind CSS
- **Database**: PostgreSQL with Redis caching
@@ -435,6 +443,7 @@ function ContactForm() {
**Reason:** Our custom color system uses CSS variables that automatically handle dark mode and maintain design consistency across the app. Standard Tailwind colors bypass this system.
**Available color categories:**
- **Text:** `text-01` through `text-05`, `text-inverted-XX`
- **Backgrounds:** `background-neutral-XX`, `background-tint-XX` (and inverted variants)
- **Borders:** `border-01` through `border-05`, `border-inverted-XX`
@@ -467,6 +476,7 @@ function ContactForm() {
## Database & Migrations
### Running Migrations
```bash
# Standard migrations
alembic upgrade head
@@ -476,6 +486,7 @@ alembic -n schema_private upgrade head
```
### Creating Migrations
```bash
# Create migration
alembic revision -m "description"
@@ -488,13 +499,14 @@ Write the migration manually and place it in the file that alembic creates when
## Testing Strategy
First, you must activate the virtual environment with `source .venv/bin/activate`.
There are 4 main types of tests within Onyx:
### Unit Tests
These should not assume any Onyx/external services are available to be called.
Interactions with the outside world should be mocked using `unittest.mock`. Generally, only
write these for complex, isolated modules e.g. `citation_processing.py`.
To run them:
@@ -504,13 +516,14 @@ pytest -xv backend/tests/unit
```
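A hypothetical example in this style, with the module path and functions assumed for illustration:

```python
from unittest import mock

# Hypothetical module path; real tests target modules like citation_processing.py.
import onyx.chat.citation_processing as cp


def test_extract_citations_without_network() -> None:
    # Patch the outside-world call so the test runs with no services available.
    with mock.patch.object(cp, "fetch_source_text", return_value="") as fetch:
        assert cp.extract_citations("") == []
        fetch.assert_called_once()
```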
### External Dependency Unit Tests
These tests assume that all external dependencies of Onyx are available and callable (e.g. Postgres, Redis,
MinIO/S3, Vespa are running + OpenAI can be called + any request to the internet is fine + etc.).
However, the actual Onyx containers are not running, and these tests call the function under test directly.
We can also mock components/calls at will.
The goal with these tests is to minimize mocking while giving some flexibility to mock things that are flaky,
need strictly controlled behavior, or need to have their internal behavior validated (e.g. verify a function is called
with certain args, something that would be impossible with proper integration tests).
@@ -523,15 +536,16 @@ python -m dotenv -f .vscode/.env run -- pytest backend/tests/external_dependency
```
### Integration Tests
Standard integration tests. Every test in `backend/tests/integration` runs against a real Onyx deployment. We cannot
mock anything in these tests. Prefer writing integration tests (or External Dependency Unit Tests if mocking/internal
verification is necessary) over any other type of test.
Tests are parallelized at a directory level.
When writing integration tests, make sure to check the root `conftest.py` for useful fixtures + the `backend/tests/integration/common_utils` directory for utilities. Prefer calling the appropriate Manager
class in the utils (if one exists) over directly calling the APIs with a library like `requests`. Prefer using fixtures rather than
calling the utilities directly (e.g. do NOT create admin users with
`admin_user = UserManager.create(name="admin_user")`, instead use the `admin_user` fixture).
A great example of this type of test is `backend/tests/integration/dev_apis/test_simple_chat_api.py`.
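A sketch of that shape, with the Manager class and method assumed for illustration:

```python
# Manager class, method, and fixture wiring assumed for illustration only.
from tests.integration.common_utils.managers.chat import ChatSessionManager


def test_admin_can_create_chat_session(admin_user) -> None:
    # Prefer the Manager helper + fixture over raw requests calls.
    chat_session = ChatSessionManager.create(user_performing_action=admin_user)
    assert chat_session.id is not None
```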
@@ -543,8 +557,9 @@ python -m dotenv -f .vscode/.env run -- pytest backend/tests/integration
```
### Playwright (E2E) Tests
These tests are an even more complete version of the Integration Tests mentioned above. Has all services of Onyx
running, _including_ the Web Server.
Use these tests for anything that requires significant frontend <-> backend coordination.
@@ -556,13 +571,11 @@ To run them:
npx playwright test <TEST_NAME>
```
## Logs
When (1) writing integration tests or (2) doing live tests (e.g. curl / playwright) you can get access
to logs via the `backend/log/<service_name>_debug.log` file. All Onyx services (api_server, web_server, celery_X)
will be tailing their logs to this file.
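
For instance, a small helper along these lines can surface the tail of a service log (path layout taken from the paragraph above; the helper name is illustrative):

```python
from pathlib import Path


def tail_service_log(service_name: str, lines: int = 50) -> list[str]:
    # All Onyx services tail their logs to backend/log/<service_name>_debug.log.
    log_path = Path("backend/log") / f"{service_name}_debug.log"
    return log_path.read_text().splitlines()[-lines:]


# e.g. tail_service_log("api_server") after driving a request via curl/playwright
```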
## Security Considerations
@@ -581,6 +594,7 @@ will be tailing their logs to this file.
- Custom prompts and agent actions
## Creating a Plan
When creating a plan in the `plans` directory, make sure to include at least these elements:
**Issues to Address**
@@ -593,10 +607,10 @@ Things you come across in your research that are important to the implementation
How you are going to make the changes happen. High level approach.
**Tests**
What unit (use rarely), external dependency unit, integration, and playwright tests you plan to write to
verify the correct behavior. Don't overtest. Usually, a given change only needs one type of test.
Do NOT include these: _Timeline_, _Rollback plan_
This is a minimal list - feel free to include more. Do NOT write code as part of your plan.
Keep it high level. You can reference certain files or functions though.

View File

@@ -1,599 +0,0 @@
# AGENTS.md
This file provides guidance to AI agents when working with code in this repository.
## KEY NOTES
- If you run into any missing python dependency errors, try running your command with `source .venv/bin/activate` \
to assume the python venv.
- To make tests work, check the `.env` file at the root of the project to find an OpenAI key.
- If using `playwright` to explore the frontend, you can usually log in with username `a@example.com` and password
`a`. The app can be accessed at `http://localhost:3000`.
- You should assume that all Onyx services are running. To verify, you can check the `backend/log` directory to
make sure we see logs coming out from the relevant service.
- To connect to the Postgres database, use: `docker exec -it onyx-relational_db-1 psql -U postgres -c "<SQL>"`
- When making calls to the backend, always go through the frontend. E.g. make a call to `http://localhost:3000/api/persona` not `http://localhost:8080/api/persona`
- Put ALL db operations under the `backend/onyx/db` / `backend/ee/onyx/db` directories. Don't run queries
outside of those directories.
## Project Overview
**Onyx** (formerly Danswer) is an open-source Gen-AI and Enterprise Search platform that connects to company documents, apps, and people. It features a modular architecture with both Community Edition (MIT licensed) and Enterprise Edition offerings.
### Background Workers (Celery)
Onyx uses Celery for asynchronous task processing with multiple specialized workers:
#### Worker Types
1. **Primary Worker** (`celery_app.py`)
- Coordinates core background tasks and system-wide operations
- Handles connector management, document sync, pruning, and periodic checks
- Runs with 4 threads concurrency
- Tasks: connector deletion, vespa sync, pruning, LLM model updates, user file sync
2. **Docfetching Worker** (`docfetching`)
- Fetches documents from external data sources (connectors)
- Spawns docprocessing tasks for each document batch
- Implements watchdog monitoring for stuck connectors
- Configurable concurrency (default from env)
3. **Docprocessing Worker** (`docprocessing`)
- Processes fetched documents through the indexing pipeline:
- Upserts documents to PostgreSQL
- Chunks documents and adds contextual information
- Embeds chunks via model server
- Writes chunks to Vespa vector database
- Updates document metadata
- Configurable concurrency (default from env)
4. **Light Worker** (`light`)
- Handles lightweight, fast operations
- Tasks: vespa operations, document permissions sync, external group sync
- Higher concurrency for quick tasks
5. **Heavy Worker** (`heavy`)
- Handles resource-intensive operations
- Primary task: document pruning operations
- Runs with 4 threads concurrency
6. **KG Processing Worker** (`kg_processing`)
- Handles Knowledge Graph processing and clustering
- Builds relationships between documents
- Runs clustering algorithms
- Configurable concurrency
7. **Monitoring Worker** (`monitoring`)
- System health monitoring and metrics collection
- Monitors Celery queues, process memory, and system status
- Single thread (monitoring doesn't need parallelism)
- Cloud-specific monitoring tasks
8. **User File Processing Worker** (`user_file_processing`)
- Processes user-uploaded files
- Handles user file indexing and project synchronization
- Configurable concurrency
9. **Beat Worker** (`beat`)
- Celery's scheduler for periodic tasks
- Uses DynamicTenantScheduler for multi-tenant support
- Schedules tasks like:
- Indexing checks (every 15 seconds)
- Connector deletion checks (every 20 seconds)
- Vespa sync checks (every 20 seconds)
- Pruning checks (every 20 seconds)
- KG processing (every 60 seconds)
- Monitoring tasks (every 5 minutes)
- Cleanup tasks (hourly)
#### Worker Deployment Modes
Onyx supports two deployment modes for background workers, controlled by the `USE_LIGHTWEIGHT_BACKGROUND_WORKER` environment variable:
**Lightweight Mode** (default, `USE_LIGHTWEIGHT_BACKGROUND_WORKER=true`):
- Runs a single consolidated `background` worker that handles all background tasks:
- Pruning operations (from `heavy` worker)
- Knowledge graph processing (from `kg_processing` worker)
- Monitoring tasks (from `monitoring` worker)
- User file processing (from `user_file_processing` worker)
- Lower resource footprint (single worker process)
- Suitable for smaller deployments or development environments
- Default concurrency: 6 threads
**Standard Mode** (`USE_LIGHTWEIGHT_BACKGROUND_WORKER=false`):
- Runs separate specialized workers as documented above (heavy, kg_processing, monitoring, user_file_processing)
- Better isolation and scalability
- Can scale individual workers independently based on workload
- Suitable for production deployments with higher load
The deployment mode affects:
- **Backend**: Worker processes spawned by supervisord or dev scripts
- **Helm**: Which Kubernetes deployments are created
- **Dev Environment**: Which workers `dev_run_background_jobs.py` spawns
#### Key Features
- **Thread-based Workers**: All workers use thread pools (not processes) for stability
- **Tenant Awareness**: Multi-tenant support with per-tenant task isolation. There is a
middleware layer that automatically finds the appropriate tenant ID when sending tasks
via Celery Beat.
- **Task Prioritization**: High, Medium, Low priority queues
- **Monitoring**: Built-in heartbeat and liveness checking
- **Failure Handling**: Automatic retry and failure recovery mechanisms
- **Redis Coordination**: Inter-process communication via Redis
- **PostgreSQL State**: Task state and metadata stored in PostgreSQL
#### Important Notes
**Defining Tasks**:
- Always use `@shared_task` rather than `@celery_app`
- Put tasks under `background/celery/tasks/` or `ee/background/celery/tasks`
**Defining APIs**:
When creating new FastAPI APIs, do NOT use the `response_model` field. Instead, just type the
function.
**Testing Updates**:
If you make any updates to a celery worker and you want to test these changes, you will need
to ask me to restart the celery worker. There is no auto-restart on code-change mechanism.
### Code Quality
```bash
# Install and run pre-commit hooks
pre-commit install
pre-commit run --all-files
```
NOTE: Always make sure everything is strictly typed (both in Python and Typescript).
## Architecture Overview
### Technology Stack
- **Backend**: Python 3.11, FastAPI, SQLAlchemy, Alembic, Celery
- **Frontend**: Next.js 15+, React 18, TypeScript, Tailwind CSS
- **Database**: PostgreSQL with Redis caching
- **Search**: Vespa vector database
- **Auth**: OAuth2, SAML, multi-provider support
- **AI/ML**: LangChain, LiteLLM, multiple embedding models
### Directory Structure
```
backend/
├── onyx/
│ ├── auth/ # Authentication & authorization
│ ├── chat/ # Chat functionality & LLM interactions
│ ├── connectors/ # Data source connectors
│ ├── db/ # Database models & operations
│ ├── document_index/ # Vespa integration
│ ├── federated_connectors/ # External search connectors
│ ├── llm/ # LLM provider integrations
│ └── server/ # API endpoints & routers
├── ee/ # Enterprise Edition features
├── alembic/ # Database migrations
└── tests/ # Test suites
web/
├── src/app/ # Next.js app router pages
├── src/components/ # Reusable React components
└── src/lib/ # Utilities & business logic
```
## Frontend Standards
### 1. Import Standards
**Always use absolute imports with the `@` prefix.**
**Reason:** Moving files around becomes easier since you don't also have to update those import statements. This makes modifications to the codebase much nicer.
```typescript
// ✅ Good
import { Button } from "@/components/ui/button";
import { useAuth } from "@/hooks/useAuth";
import { Text } from "@/refresh-components/texts/Text";
// ❌ Bad
import { Button } from "../../../components/ui/button";
import { useAuth } from "./hooks/useAuth";
```
### 2. React Component Functions
**Prefer regular functions over arrow functions for React components.**
**Reason:** Functions just become easier to read.
```typescript
// ✅ Good
function UserProfile({ userId }: UserProfileProps) {
return <div>User Profile</div>
}
// ❌ Bad
const UserProfile = ({ userId }: UserProfileProps) => {
return <div>User Profile</div>
}
```
### 3. Props Interface Extraction
**Extract prop types into their own interface definitions.**
**Reason:** Functions just become easier to read.
```typescript
// ✅ Good
interface UserCardProps {
user: User
showActions?: boolean
onEdit?: (userId: string) => void
}
function UserCard({ user, showActions = false, onEdit }: UserCardProps) {
return <div>User Card</div>
}
// ❌ Bad
function UserCard({
user,
showActions = false,
onEdit
}: {
user: User
showActions?: boolean
onEdit?: (userId: string) => void
}) {
return <div>User Card</div>
}
```
### 4. Spacing Guidelines
**Prefer padding over margins for spacing.**
**Reason:** We want to consolidate usage to paddings instead of margins.
```typescript
// ✅ Good
<div className="p-4 space-y-2">
<div className="p-2">Content</div>
</div>
// ❌ Bad
<div className="m-4 space-y-2">
<div className="m-2">Content</div>
</div>
```
### 5. Tailwind Dark Mode
**Strictly forbid using the `dark:` modifier in Tailwind classes, except for logo icon handling.**
**Reason:** The `colors.css` file already, VERY CAREFULLY, defines what the exact opposite colour of each light-mode colour is. Overriding this behaviour is VERY bad and will lead to horrible UI breakages.
**Exception:** The `createLogoIcon` helper in `web/src/components/icons/icons.tsx` uses `dark:` modifiers (`dark:invert`, `dark:hidden`, `dark:block`) to handle third-party logo icons that cannot automatically adapt through `colors.css`. This is the ONLY acceptable use of dark mode modifiers.
```typescript
// ✅ Good - Standard components use `web/tailwind-themes/tailwind.config.js` / `web/src/app/css/colors.css`
<div className="bg-background-neutral-03 text-text-02">
Content
</div>
// ✅ Good - Logo icons with dark mode handling via createLogoIcon
export const GithubIcon = createLogoIcon(githubLightIcon, {
monochromatic: true, // Will apply dark:invert internally
});
export const GitbookIcon = createLogoIcon(gitbookLightIcon, {
darkSrc: gitbookDarkIcon, // Will use dark:hidden/dark:block internally
});
// ❌ Bad - Manual dark mode overrides
<div className="bg-white dark:bg-black text-black dark:text-white">
Content
</div>
```
### 6. Class Name Utilities
**Use the `cn` utility instead of raw string formatting for classNames.**
**Reason:** `cn`s are easier to read. They also allow for more complex types (i.e., string-arrays) to get formatted properly (it flattens each element in that string array down). As a result, it can allow things such as conditionals (i.e., `myCondition && "some-tailwind-class"`, which evaluates to `false` when `myCondition` is `false`) to get filtered out.
```typescript
import { cn } from '@/lib/utils'
// ✅ Good
<div className={cn(
'base-class',
isActive && 'active-class',
className
)}>
Content
</div>
// ❌ Bad
<div className={`base-class ${isActive ? 'active-class' : ''} ${className}`}>
Content
</div>
```
### 7. Custom Hooks Organization
**Follow a "hook-per-file" layout. Each hook should live in its own file within `web/src/hooks`.**
**Reason:** This is just a layout preference. Keeps code clean.
```typescript
// web/src/hooks/useUserData.ts
export function useUserData(userId: string) {
// hook implementation
}
// web/src/hooks/useLocalStorage.ts
export function useLocalStorage<T>(key: string, initialValue: T) {
// hook implementation
}
```
### 8. Icon Usage
**ONLY use icons from the `web/src/icons` directory. Do NOT use icons from `react-icons`, `lucide`, or other external libraries.**
**Reason:** We have a very carefully curated selection of icons that match our Onyx guidelines. We do NOT want to muddy those up with different aesthetic stylings.
```typescript
// ✅ Good
import SvgX from "@/icons/x";
import SvgMoreHorizontal from "@/icons/more-horizontal";
// ❌ Bad
import { User } from "lucide-react";
import { FiSearch } from "react-icons/fi";
```
**Missing Icons**: If an icon is needed but doesn't exist in the `web/src/icons` directory, import it from Figma using the Figma MCP tool and add it to the icons directory.
If you need help with this step, reach out to `raunak@onyx.app`.
### 9. Text Rendering
**Prefer using the `refresh-components/texts/Text` component for all text rendering. Avoid "naked" text nodes.**
**Reason:** The `Text` component is fully compliant with the stylings provided in Figma. It provides easy utilities to specify the text-colour and font-size in the form of flags. Super duper easy.
```typescript
// ✅ Good
import { Text } from '@/refresh-components/texts/Text'
function UserCard({ name }: { name: string }) {
return (
<Text
{/* The `text03` flag makes the text it renders to be coloured the 3rd-scale grey */}
text03
{/* The `mainAction` flag makes the text it renders to be "main-action" font + line-height + weightage, as described in the Figma */}
mainAction
>
{name}
</Text>
)
}
// ❌ Bad
function UserCard({ name }: { name: string }) {
return (
<div>
<h2>{name}</h2>
<p>User details</p>
</div>
)
}
```
### 10. Component Usage
**Heavily avoid raw HTML input components. Always use components from the `web/src/refresh-components` or `web/lib/opal/src` directory.**
**Reason:** We've put in a lot of effort to unify the components that are rendered in the Onyx app. Using raw components breaks the entire UI of the application, and leaves it in a muddier state than before.
```typescript
// ✅ Good
import Button from '@/refresh-components/buttons/Button'
import InputTypeIn from '@/refresh-components/inputs/InputTypeIn'
import SvgPlusCircle from '@/icons/plus-circle'
function ContactForm() {
return (
<form>
<InputTypeIn placeholder="Search..." />
<Button type="submit" leftIcon={SvgPlusCircle}>Submit</Button>
</form>
)
}
// ❌ Bad
function ContactForm() {
return (
<form>
<input placeholder="Name" />
<textarea placeholder="Message" />
<button type="submit">Submit</button>
</form>
)
}
```
### 11. Colors
**Always use custom overrides for colors and borders rather than built in Tailwind CSS colors. These overrides live in `web/tailwind-themes/tailwind.config.js`.**
**Reason:** Our custom color system uses CSS variables that automatically handle dark mode and maintain design consistency across the app. Standard Tailwind colors bypass this system.
**Available color categories:**
- **Text:** `text-01` through `text-05`, `text-inverted-XX`
- **Backgrounds:** `background-neutral-XX`, `background-tint-XX` (and inverted variants)
- **Borders:** `border-01` through `border-05`, `border-inverted-XX`
- **Actions:** `action-link-XX`, `action-danger-XX`
- **Status:** `status-info-XX`, `status-success-XX`, `status-warning-XX`, `status-error-XX`
- **Theme:** `theme-primary-XX`, `theme-red-XX`, `theme-blue-XX`, etc.
```typescript
// ✅ Good - Use custom Onyx color classes
<div className="bg-background-neutral-01 border border-border-02" />
<div className="bg-background-tint-02 border border-border-01" />
<div className="bg-status-success-01" />
<div className="bg-action-link-01" />
<div className="bg-theme-primary-05" />
// ❌ Bad - Do NOT use standard Tailwind colors
<div className="bg-gray-100 border border-gray-300 text-gray-600" />
<div className="bg-white border border-slate-200" />
<div className="bg-green-100 text-green-700" />
<div className="bg-blue-100 text-blue-600" />
<div className="bg-indigo-500" />
```
### 12. Data Fetching
**Prefer using `useSWR` for data fetching. Data should generally be fetched on the client side. Components that need data should display a loader / placeholder while waiting for that data. Prefer loading data within the component that needs it rather than at the top level and passing it down.**
**Reason:** Client side fetching allows us to load the skeleton of the page without waiting for data to load, leading to a snappier UX. Loading data where needed reduces dependencies between a component and its parent component(s).
## Database & Migrations
### Running Migrations
```bash
# Standard migrations
alembic upgrade head
# Multi-tenant (Enterprise)
alembic -n schema_private upgrade head
```
### Creating Migrations
```bash
# Create migration
alembic revision -m "description"
# Multi-tenant migration
alembic -n schema_private revision -m "description"
```
Write the migration manually and place it in the file that alembic creates when running the above command.
## Testing Strategy
There are 4 main types of tests within Onyx:
### Unit Tests
These should not assume any Onyx/external services are available to be called.
Interactions with the outside world should be mocked using `unittest.mock`. Generally, only
write these for complex, isolated modules e.g. `citation_processing.py`.
To run them:
```bash
python -m dotenv -f .vscode/.env run -- pytest -xv backend/tests/unit
```
### External Dependency Unit Tests
These tests assume that all external dependencies of Onyx are available and callable (e.g. Postgres, Redis,
MinIO/S3, Vespa are running + OpenAI can be called + any request to the internet is fine + etc.).
However, the actual Onyx containers are not running, and these tests call the function under test directly.
We can also mock components/calls at will.
The goal with these tests is to minimize mocking while giving some flexibility to mock things that are flaky,
need strictly controlled behavior, or need to have their internal behavior validated (e.g. verify a function is called
with certain args, something that would be impossible with proper integration tests).
A great example of this type of test is `backend/tests/external_dependency_unit/connectors/confluence/test_confluence_group_sync.py`.
To run them:
```bash
python -m dotenv -f .vscode/.env run -- pytest backend/tests/external_dependency_unit
```
### Integration Tests
Standard integration tests. Every test in `backend/tests/integration` runs against a real Onyx deployment. We cannot
mock anything in these tests. Prefer writing integration tests (or External Dependency Unit Tests if mocking/internal
verification is necessary) over any other type of test.
Tests are parallelized at a directory level.
When writing integration tests, make sure to check the root `conftest.py` for useful fixtures + the `backend/tests/integration/common_utils` directory for utilities. Prefer calling the appropriate Manager
class in the utils (if one exists) over directly calling the APIs with a library like `requests`. Prefer using fixtures rather than
calling the utilities directly (e.g. do NOT create admin users with
`admin_user = UserManager.create(name="admin_user")`, instead use the `admin_user` fixture).
A great example of this type of test is `backend/tests/integration/dev_apis/test_simple_chat_api.py`.
To run them:
```bash
python -m dotenv -f .vscode/.env run -- pytest backend/tests/integration
```
### Playwright (E2E) Tests
These tests are an even more complete version of the Integration Tests mentioned above. Has all services of Onyx
running, *including* the Web Server.
Use these tests for anything that requires significant frontend <-> backend coordination.
Tests are located at `web/tests/e2e`. Tests are written in TypeScript.
To run them:
```bash
npx playwright test <TEST_NAME>
```
## Logs
When (1) writing integration tests or (2) doing live tests (e.g. curl / playwright) you can get access
to logs via the `backend/log/<service_name>_debug.log` file. All Onyx services (api_server, web_server, celery_X)
will be tailing their logs to this file.
## Security Considerations
- Never commit API keys or secrets to repository
- Use encrypted credential storage for connector credentials
- Follow RBAC patterns for new features
- Implement proper input validation with Pydantic models
- Use parameterized queries to prevent SQL injection
## AI/LLM Integration
- Multiple LLM providers supported via LiteLLM
- Configurable models per feature (chat, search, embeddings)
- Streaming support for real-time responses
- Token management and rate limiting
- Custom prompts and agent actions
## Creating a Plan
When creating a plan in the `plans` directory, make sure to include at least these elements:
**Issues to Address**
What the change is meant to do.
**Important Notes**
Things you come across in your research that are important to the implementation.
**Implementation strategy**
How you are going to make the changes happen. High level approach.
**Tests**
What unit (use rarely), external dependency unit, integration, and playwright tests you plan to write to
verify the correct behavior. Don't overtest. Usually, a given change only needs one type of test.
Do NOT include these: *Timeline*, *Rollback plan*
This is a minimal list - feel free to include more. Do NOT write code as part of your plan.
Keep it high level. You can reference certain files or functions though.
Before writing your plan, make sure to do research. Explore the relevant sections in the codebase.

1
CLAUDE.md Symbolic link
View File

@@ -0,0 +1 @@
AGENTS.md

View File

@@ -2,7 +2,10 @@ Copyright (c) 2023-present DanswerAI, Inc.
Portions of this software are licensed as follows:
- All content that resides under "ee" directories of this repository, if that directory exists, is licensed under the license defined in "backend/ee/LICENSE". Specifically all content under "backend/ee" and "web/src/app/ee" is licensed under the license defined in "backend/ee/LICENSE".
- All content that resides under "ee" directories of this repository is licensed under the Onyx Enterprise License. Each ee directory contains an identical copy of this license at its root:
- backend/ee/LICENSE
- web/src/app/ee/LICENSE
- web/src/ee/LICENSE
- All third party components incorporated into the Onyx Software are licensed under the original license provided by the owner of the applicable component.
- Content outside of the above mentioned directories or restrictions above is available under the "MIT Expat" license as defined below.

View File

@@ -16,3 +16,8 @@ dist/
.coverage
htmlcov/
model_server/legacy/
# Craft: demo_data directory should be unzipped at container startup, not copied
**/demo_data/
# Craft: templates/outputs/venv is created at container startup
**/templates/outputs/venv

View File

@@ -7,6 +7,10 @@ have a contract or agreement with DanswerAI, you are not permitted to use the Enterprise
Edition features outside of personal development or testing purposes. Please reach out to \
founders@onyx.app for more information. Please visit https://github.com/onyx-dot-app/onyx"
# Build argument for Craft support (disabled by default)
# Use --build-arg ENABLE_CRAFT=true to include Node.js and opencode CLI
ARG ENABLE_CRAFT=false
# DO_NOT_TRACK is used to disable telemetry for Unstructured
ENV DANSWER_RUNNING_IN_DOCKER="true" \
DO_NOT_TRACK="true" \
@@ -46,7 +50,23 @@ RUN apt-get update && \
rm -rf /var/lib/apt/lists/* && \
apt-get clean
# Conditionally install Node.js 20 for Craft (required for Next.js)
# Only installed when ENABLE_CRAFT=true
RUN if [ "$ENABLE_CRAFT" = "true" ]; then \
echo "Installing Node.js 20 for Craft support..." && \
curl -fsSL https://deb.nodesource.com/setup_20.x | bash - && \
apt-get install -y nodejs && \
rm -rf /var/lib/apt/lists/*; \
fi
# Conditionally install opencode CLI for Craft agent functionality
# Only installed when ENABLE_CRAFT=true
# TODO: download a specific, versioned release of the opencode CLI
RUN if [ "$ENABLE_CRAFT" = "true" ]; then \
echo "Installing opencode CLI for Craft support..." && \
curl -fsSL https://opencode.ai/install | bash; \
fi
ENV PATH="/root/.opencode/bin:${PATH}"
# Install Python dependencies
# Remove py, which is pulled in by retry; py is not needed and has a CVE
@@ -89,6 +109,12 @@ RUN uv pip install --system --no-cache-dir --upgrade \
RUN python -c "from tokenizers import Tokenizer; \
Tokenizer.from_pretrained('nomic-ai/nomic-embed-text-v1')"
# Pre-downloading NLTK for setups with limited egress
RUN python -c "import nltk; \
nltk.download('stopwords', quiet=True); \
nltk.download('punkt_tab', quiet=True);"
# nltk.download('wordnet', quiet=True); introduce this back if lemmatization is needed
# Pre-downloading tiktoken for setups with limited egress
RUN python -c "import tiktoken; \
tiktoken.get_encoding('cl100k_base')"
@@ -108,12 +134,26 @@ COPY --chown=onyx:onyx ./alembic_tenants /app/alembic_tenants
COPY --chown=onyx:onyx ./alembic.ini /app/alembic.ini
COPY supervisord.conf /usr/etc/supervisord.conf
COPY --chown=onyx:onyx ./static /app/static
COPY --chown=onyx:onyx ./keys /app/keys
# Escape hatch scripts
COPY --chown=onyx:onyx ./scripts/debugging /app/scripts/debugging
COPY --chown=onyx:onyx ./scripts/force_delete_connector_by_id.py /app/scripts/force_delete_connector_by_id.py
COPY --chown=onyx:onyx ./scripts/supervisord_entrypoint.sh /app/scripts/supervisord_entrypoint.sh
RUN chmod +x /app/scripts/supervisord_entrypoint.sh
COPY --chown=onyx:onyx ./scripts/setup_craft_templates.sh /app/scripts/setup_craft_templates.sh
RUN chmod +x /app/scripts/supervisord_entrypoint.sh /app/scripts/setup_craft_templates.sh
# Run Craft template setup at build time when ENABLE_CRAFT=true
# This pre-bakes demo data, Python venv, and npm dependencies into the image
RUN if [ "$ENABLE_CRAFT" = "true" ]; then \
echo "Running Craft template setup at build time..." && \
ENABLE_CRAFT=true /app/scripts/setup_craft_templates.sh; \
fi
# Set Craft template paths to the in-image locations
# These match the paths where setup_craft_templates.sh creates the templates
ENV OUTPUTS_TEMPLATE_PATH=/app/onyx/server/features/build/sandbox/kubernetes/docker/templates/outputs
ENV VENV_TEMPLATE_PATH=/app/onyx/server/features/build/sandbox/kubernetes/docker/templates/venv
# Put logo in assets
COPY --chown=onyx:onyx ./assets /app/assets

View File

@@ -48,6 +48,7 @@ WORKDIR /app
# Utils used by model server
COPY ./onyx/utils/logger.py /app/onyx/utils/logger.py
COPY ./onyx/utils/middleware.py /app/onyx/utils/middleware.py
COPY ./onyx/utils/tenant.py /app/onyx/utils/tenant.py
# Place to fetch version information
COPY ./onyx/__init__.py /app/onyx/__init__.py

View File

@@ -57,7 +57,7 @@ if USE_IAM_AUTH:
def include_object(
object: SchemaItem,
object: SchemaItem, # noqa: ARG001
name: str | None,
type_: Literal[
"schema",
@@ -67,8 +67,8 @@ def include_object(
"unique_constraint",
"foreign_key_constraint",
],
reflected: bool,
compare_to: SchemaItem | None,
reflected: bool, # noqa: ARG001
compare_to: SchemaItem | None, # noqa: ARG001
) -> bool:
if type_ == "table" and name in EXCLUDE_TABLES:
return False
@@ -244,7 +244,7 @@ def do_run_migrations(
def provide_iam_token_for_alembic(
dialect: Any, conn_rec: Any, cargs: Any, cparams: Any
dialect: Any, conn_rec: Any, cargs: Any, cparams: Any # noqa: ARG001
) -> None:
if USE_IAM_AUTH:
# Database connection settings
@@ -474,7 +474,7 @@ def run_migrations_online() -> None:
if connectable is not None:
# pytest-alembic is providing an engine - use it directly
logger.info("run_migrations_online starting (pytest-alembic mode).")
logger.debug("run_migrations_online starting (pytest-alembic mode).")
# For pytest-alembic, we use the default schema (public)
schema_name = context.config.attributes.get(

View File

@@ -0,0 +1,343 @@
#!/usr/bin/env python3
"""Parallel Alembic Migration Runner
Upgrades tenant schemas to head in batched, parallel alembic subprocesses.
Each subprocess handles a batch of schemas (via ``-x schemas=a,b,c``),
reducing per-process overhead compared to one-schema-per-process.
Usage examples::
# defaults: 6 workers, 50 schemas/batch
python alembic/run_multitenant_migrations.py
# custom settings
python alembic/run_multitenant_migrations.py -j 8 -b 100
"""
from __future__ import annotations
import argparse
import subprocess
import sys
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import List, NamedTuple
from alembic.config import Config
from alembic.script import ScriptDirectory
from sqlalchemy import text
from onyx.db.engine.sql_engine import is_valid_schema_name
from onyx.db.engine.sql_engine import SqlEngine
from onyx.db.engine.tenant_utils import get_all_tenant_ids
from shared_configs.configs import TENANT_ID_PREFIX
# ---------------------------------------------------------------------------
# Data types
# ---------------------------------------------------------------------------
class Args(NamedTuple):
jobs: int
batch_size: int
class BatchResult(NamedTuple):
schemas: list[str]
success: bool
output: str
elapsed_sec: float
# ---------------------------------------------------------------------------
# Core functions
# ---------------------------------------------------------------------------
def run_alembic_for_batch(schemas: list[str]) -> BatchResult:
"""Run ``alembic upgrade head`` for a batch of schemas in one subprocess.
If the batch fails, it is automatically retried with ``-x continue=true``
so that the remaining schemas in the batch still get migrated. The retry
output (which contains alembic's per-schema error messages) is returned
for diagnosis.
"""
csv = ",".join(schemas)
base_cmd = ["alembic", "-x", f"schemas={csv}"]
start = time.monotonic()
result = subprocess.run(
[*base_cmd, "upgrade", "head"],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
text=True,
)
if result.returncode == 0:
elapsed = time.monotonic() - start
return BatchResult(schemas, True, result.stdout or "", elapsed)
# At least one schema failed. Print the initial error output, then
# re-run with continue=true so the remaining schemas still get migrated.
if result.stdout:
print(f"Initial error output:\n{result.stdout}", file=sys.stderr, flush=True)
print(
f"Batch failed (exit {result.returncode}), retrying with 'continue=true'...",
file=sys.stderr,
flush=True,
)
retry = subprocess.run(
[*base_cmd, "-x", "continue=true", "upgrade", "head"],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
text=True,
)
elapsed = time.monotonic() - start
return BatchResult(schemas, False, retry.stdout or "", elapsed)
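
For context, the ``-x schemas=...`` and ``-x continue=true`` options only work if the project's alembic env.py consumes them. A minimal sketch of that side of the contract, using alembic's real get_x_argument API but a hypothetical per-schema helper:

from alembic import context

x_args = context.get_x_argument(as_dictionary=True)
schemas = [s for s in x_args.get("schemas", "").split(",") if s]
continue_on_error = x_args.get("continue", "false").lower() == "true"

for schema in schemas:
    try:
        upgrade_single_schema(schema)  # hypothetical helper in env.py
    except Exception:
        if not continue_on_error:
            raise
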
def get_head_revision() -> str | None:
"""Get the head revision from the alembic script directory."""
alembic_cfg = Config("alembic.ini")
script = ScriptDirectory.from_config(alembic_cfg)
return script.get_current_head()
def get_schemas_needing_migration(
tenant_schemas: List[str], head_rev: str
) -> List[str]:
"""Return only schemas whose current alembic version is not at head."""
if not tenant_schemas:
return []
engine = SqlEngine.get_engine()
with engine.connect() as conn:
# Find which schemas actually have an alembic_version table
rows = conn.execute(
text(
"SELECT table_schema FROM information_schema.tables "
"WHERE table_name = 'alembic_version' "
"AND table_schema = ANY(:schemas)"
),
{"schemas": tenant_schemas},
)
schemas_with_table = set(row[0] for row in rows)
# Schemas without the table definitely need migration
needs_migration = [s for s in tenant_schemas if s not in schemas_with_table]
if not schemas_with_table:
return needs_migration
# Validate schema names before interpolating into SQL
for schema in schemas_with_table:
if not is_valid_schema_name(schema):
raise ValueError(f"Invalid schema name: {schema}")
        # Single query to get every schema's current revision at once.
        # Tag each SELECT with an integer index instead of echoing the
        # schema name back as a string literal; only the validated schema
        # identifiers above are interpolated, so no literal quoting is needed.
schema_list = list(schemas_with_table)
union_parts = [
f'SELECT {i} AS idx, version_num FROM "{schema}".alembic_version'
for i, schema in enumerate(schema_list)
]
rows = conn.execute(text(" UNION ALL ".join(union_parts)))
version_by_schema = {schema_list[row[0]]: row[1] for row in rows}
needs_migration.extend(
s for s in schemas_with_table if version_by_schema.get(s) != head_rev
)
return needs_migration
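
As a worked example of the query this builds, two schemas collapse into a single round trip:

schema_list = ["tenant_a", "tenant_b"]
print(" UNION ALL ".join(
    f'SELECT {i} AS idx, version_num FROM "{s}".alembic_version'
    for i, s in enumerate(schema_list)
))
# Output (one line, wrapped here):
# SELECT 0 AS idx, version_num FROM "tenant_a".alembic_version
#   UNION ALL SELECT 1 AS idx, version_num FROM "tenant_b".alembic_version
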
def run_migrations_parallel(
schemas: list[str],
max_workers: int,
batch_size: int,
) -> bool:
"""Chunk *schemas* into batches and run them in parallel.
A background monitor thread prints a status line every 60 s listing
which batches are still in-flight, making it easy to spot hung tenants.
"""
batches = [schemas[i : i + batch_size] for i in range(0, len(schemas), batch_size)]
total_batches = len(batches)
print(
f"{len(schemas)} schemas in {total_batches} batch(es) "
f"with {max_workers} workers (batch size: {batch_size})...",
flush=True,
)
all_success = True
# Thread-safe tracking of in-flight batches for the monitor thread.
in_flight: dict[int, list[str]] = {}
prev_in_flight: set[int] = set()
lock = threading.Lock()
stop_event = threading.Event()
def _monitor() -> None:
"""Print a status line every 60 s listing batches still in-flight.
Only prints batches that were also present in the previous tick,
making it easy to spot batches that are stuck.
"""
nonlocal prev_in_flight
while not stop_event.wait(60):
with lock:
if not in_flight:
prev_in_flight = set()
continue
current = set(in_flight)
stuck = current & prev_in_flight
prev_in_flight = current
if not stuck:
continue
schemas = [s for idx in sorted(stuck) for s in in_flight[idx]]
print(
f"⏳ batch(es) still running since last check "
f"({', '.join(str(i + 1) for i in sorted(stuck))}): "
+ ", ".join(schemas),
flush=True,
)
monitor_thread = threading.Thread(target=_monitor, daemon=True)
monitor_thread.start()
try:
with ThreadPoolExecutor(max_workers=max_workers) as executor:
def _run(batch_idx: int, batch: list[str]) -> BatchResult:
with lock:
in_flight[batch_idx] = batch
print(
f"Batch {batch_idx + 1}/{total_batches} started "
f"({len(batch)} schemas): {', '.join(batch)}",
flush=True,
)
result = run_alembic_for_batch(batch)
with lock:
in_flight.pop(batch_idx, None)
return result
future_to_idx = {
executor.submit(_run, i, b): i for i, b in enumerate(batches)
}
for future in as_completed(future_to_idx):
batch_idx = future_to_idx[future]
try:
result = future.result()
status = "" if result.success else ""
print(
f"Batch {batch_idx + 1}/{total_batches} "
f"{status} {len(result.schemas)} schemas "
f"in {result.elapsed_sec:.1f}s",
flush=True,
)
if not result.success:
# Print last 20 lines of retry output for diagnosis
tail = result.output.strip().splitlines()[-20:]
for line in tail:
print(f" {line}", flush=True)
all_success = False
except Exception as e:
print(
f"Batch {batch_idx + 1}/{total_batches} " f"✗ exception: {e}",
flush=True,
)
all_success = False
finally:
stop_event.set()
monitor_thread.join(timeout=2)
return all_success
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def parse_args() -> Args:
parser = argparse.ArgumentParser(
description="Run alembic migrations for all tenant schemas in parallel"
)
parser.add_argument(
"-j",
"--jobs",
type=int,
default=6,
metavar="N",
help="Number of parallel alembic processes (default: 6)",
)
parser.add_argument(
"-b",
"--batch-size",
type=int,
default=50,
metavar="N",
help="Schemas per alembic process (default: 50)",
)
args = parser.parse_args()
if args.jobs < 1:
parser.error("--jobs must be >= 1")
if args.batch_size < 1:
parser.error("--batch-size must be >= 1")
return Args(jobs=args.jobs, batch_size=args.batch_size)
def main() -> int:
args = parse_args()
head_rev = get_head_revision()
if head_rev is None:
print("Could not determine head revision.", file=sys.stderr)
return 1
with SqlEngine.scoped_engine(pool_size=5, max_overflow=2):
tenant_ids = get_all_tenant_ids()
tenant_schemas = [tid for tid in tenant_ids if tid.startswith(TENANT_ID_PREFIX)]
if not tenant_schemas:
print(
"No tenant schemas found. Is MULTI_TENANT=true set?",
file=sys.stderr,
)
return 1
schemas_to_migrate = get_schemas_needing_migration(tenant_schemas, head_rev)
if not schemas_to_migrate:
print(
f"All {len(tenant_schemas)} tenants are already at head "
f"revision ({head_rev})."
)
return 0
print(
f"{len(schemas_to_migrate)}/{len(tenant_schemas)} tenants need "
f"migration (head: {head_rev})."
)
success = run_migrations_parallel(
schemas_to_migrate,
max_workers=args.jobs,
batch_size=args.batch_size,
)
print(f"\n{'All migrations successful' if success else 'Some migrations failed'}")
return 0 if success else 1
if __name__ == "__main__":
raise SystemExit(main())

View File

@@ -0,0 +1,112 @@
"""Populate flow mapping data
Revision ID: 01f8e6d95a33
Revises: d5c86e2c6dc6
Create Date: 2026-01-31 17:37:10.485558
"""
from alembic import op
# revision identifiers, used by Alembic.
revision = "01f8e6d95a33"
down_revision = "d5c86e2c6dc6"
branch_labels = None
depends_on = None
def upgrade() -> None:
# Add each model config to the conversation flow, setting the global default if it exists
# Exclude models that are part of ImageGenerationConfig
op.execute(
"""
INSERT INTO llm_model_flow (llm_model_flow_type, is_default, model_configuration_id)
SELECT
'CHAT' AS llm_model_flow_type,
COALESCE(
(lp.is_default_provider IS TRUE AND lp.default_model_name = mc.name),
FALSE
) AS is_default,
mc.id AS model_configuration_id
FROM model_configuration mc
LEFT JOIN llm_provider lp
ON lp.id = mc.llm_provider_id
WHERE NOT EXISTS (
SELECT 1 FROM image_generation_config igc
WHERE igc.model_configuration_id = mc.id
);
"""
)
# Add models with supports_image_input to the vision flow
op.execute(
"""
INSERT INTO llm_model_flow (llm_model_flow_type, is_default, model_configuration_id)
SELECT
'VISION' AS llm_model_flow_type,
COALESCE(
(lp.is_default_vision_provider IS TRUE AND lp.default_vision_model = mc.name),
FALSE
) AS is_default,
mc.id AS model_configuration_id
FROM model_configuration mc
LEFT JOIN llm_provider lp
ON lp.id = mc.llm_provider_id
WHERE mc.supports_image_input IS TRUE;
"""
)
def downgrade() -> None:
# Populate vision defaults from model_flow
op.execute(
"""
UPDATE llm_provider AS lp
SET
is_default_vision_provider = TRUE,
default_vision_model = mc.name
FROM llm_model_flow mf
JOIN model_configuration mc ON mc.id = mf.model_configuration_id
WHERE mf.llm_model_flow_type = 'VISION'
AND mf.is_default = TRUE
AND mc.llm_provider_id = lp.id;
"""
)
# Populate conversation defaults from model_flow
op.execute(
"""
UPDATE llm_provider AS lp
SET
is_default_provider = TRUE,
default_model_name = mc.name
FROM llm_model_flow mf
JOIN model_configuration mc ON mc.id = mf.model_configuration_id
WHERE mf.llm_model_flow_type = 'CHAT'
AND mf.is_default = TRUE
AND mc.llm_provider_id = lp.id;
"""
)
# For providers that have conversation flow mappings but aren't the default,
# we still need a default_model_name (it was NOT NULL originally)
# Pick the first visible model or any model for that provider
op.execute(
"""
UPDATE llm_provider AS lp
SET default_model_name = (
SELECT mc.name
FROM model_configuration mc
JOIN llm_model_flow mf ON mf.model_configuration_id = mc.id
WHERE mc.llm_provider_id = lp.id
AND mf.llm_model_flow_type = 'CHAT'
ORDER BY mc.is_visible DESC, mc.id ASC
LIMIT 1
)
WHERE lp.default_model_name IS NULL;
"""
)
# Delete all model_flow entries (reverse the inserts from upgrade)
op.execute("DELETE FROM llm_model_flow;")

View File

@@ -0,0 +1,33 @@
"""add default_app_mode to user
Revision ID: 114a638452db
Revises: feead2911109
Create Date: 2026-02-09 18:57:08.274640
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "114a638452db"
down_revision = "feead2911109"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.add_column(
"user",
sa.Column(
"default_app_mode",
sa.String(),
nullable=False,
server_default="CHAT",
),
)
def downgrade() -> None:
op.drop_column("user", "default_app_mode")

View File

@@ -11,7 +11,6 @@ import sqlalchemy as sa
from urllib.parse import urlparse, urlunparse
from httpx import HTTPStatusError
import httpx
-from onyx.document_index.factory import get_default_document_index
from onyx.db.search_settings import SearchSettings
from onyx.document_index.vespa.shared_utils.utils import get_vespa_http_client
from onyx.document_index.vespa.shared_utils.utils import (
@@ -519,15 +518,11 @@ def delete_document_from_db(current_doc_id: str, index_name: str) -> None:
def upgrade() -> None:
if SKIP_CANON_DRIVE_IDS:
return
-    current_search_settings, future_search_settings = active_search_settings()
-    document_index = get_default_document_index(
-        current_search_settings,
-        future_search_settings,
-    )
+    current_search_settings, _ = active_search_settings()
# Get the index name
-    if hasattr(document_index, "index_name"):
-        index_name = document_index.index_name
+    if hasattr(current_search_settings, "index_name"):
+        index_name = current_search_settings.index_name
else:
# Default index name if we can't get it from the document_index
index_name = "danswer_index"

View File

@@ -0,0 +1,27 @@
"""add_user_preferences
Revision ID: 175ea04c7087
Revises: d56ffa94ca32
Create Date: 2026-02-04 18:16:24.830873
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "175ea04c7087"
down_revision = "d56ffa94ca32"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.add_column(
"user",
sa.Column("user_preferences", sa.Text(), nullable=True),
)
def downgrade() -> None:
op.drop_column("user", "user_preferences")

View File

@@ -0,0 +1,71 @@
"""Migrate to contextual rag model
Revision ID: 19c0ccb01687
Revises: 9c54986124c6
Create Date: 2026-02-12 11:21:41.798037
"""
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision = "19c0ccb01687"
down_revision = "9c54986124c6"
branch_labels = None
depends_on = None
def upgrade() -> None:
# Widen the column to fit 'CONTEXTUAL_RAG' (15 chars); was varchar(10)
# when the table was created with only CHAT/VISION values.
op.alter_column(
"llm_model_flow",
"llm_model_flow_type",
type_=sa.String(length=20),
existing_type=sa.String(length=10),
existing_nullable=False,
)
# For every search_settings row that has contextual rag configured,
# create an llm_model_flow entry. is_default is TRUE if the row
# belongs to the PRESENT search settings, FALSE otherwise.
op.execute(
"""
INSERT INTO llm_model_flow (llm_model_flow_type, model_configuration_id, is_default)
SELECT DISTINCT
'CONTEXTUAL_RAG',
mc.id,
(ss.status = 'PRESENT')
FROM search_settings ss
JOIN llm_provider lp
ON lp.name = ss.contextual_rag_llm_provider
JOIN model_configuration mc
ON mc.llm_provider_id = lp.id
AND mc.name = ss.contextual_rag_llm_name
WHERE ss.enable_contextual_rag = TRUE
AND ss.contextual_rag_llm_name IS NOT NULL
AND ss.contextual_rag_llm_provider IS NOT NULL
ON CONFLICT (llm_model_flow_type, model_configuration_id)
DO UPDATE SET is_default = EXCLUDED.is_default
WHERE EXCLUDED.is_default = TRUE
"""
)
def downgrade() -> None:
op.execute(
"""
DELETE FROM llm_model_flow
WHERE llm_model_flow_type = 'CONTEXTUAL_RAG'
"""
)
op.alter_column(
"llm_model_flow",
"llm_model_flow_type",
type_=sa.String(length=10),
existing_type=sa.String(length=20),
existing_nullable=False,
)

View File

@@ -10,8 +10,6 @@ from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql
-from onyx.configs.chat_configs import NUM_POSTPROCESSED_RESULTS
# revision identifiers, used by Alembic.
revision = "1f60f60c3401"
down_revision = "f17bf3b0d9f1"
@@ -66,7 +64,7 @@ def upgrade() -> None:
"num_rerank",
sa.Integer(),
nullable=False,
-            server_default=str(NUM_POSTPROCESSED_RESULTS),
+            server_default=str(20),
),
)

View File

@@ -0,0 +1,351 @@
"""single onyx craft migration
Consolidates all buildmode/onyx craft tables into a single migration.
Tables created:
- build_session: User build sessions with status tracking
- sandbox: User-owned containerized environments (one per user)
- artifact: Build output files (web apps, documents, images)
- snapshot: Sandbox filesystem snapshots
- build_message: Conversation messages for build sessions
Existing table modified:
- connector_credential_pair: Added processing_mode column
Revision ID: 2020d417ec84
Revises: 41fa44bef321
Create Date: 2026-01-26 14:43:54.641405
"""
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql
# revision identifiers, used by Alembic.
revision = "2020d417ec84"
down_revision = "41fa44bef321"
branch_labels = None
depends_on = None
def upgrade() -> None:
# ==========================================================================
# ENUMS
# ==========================================================================
# Build session status enum
build_session_status_enum = sa.Enum(
"active",
"idle",
name="buildsessionstatus",
native_enum=False,
)
# Sandbox status enum
sandbox_status_enum = sa.Enum(
"provisioning",
"running",
"idle",
"sleeping",
"terminated",
"failed",
name="sandboxstatus",
native_enum=False,
)
# Artifact type enum
artifact_type_enum = sa.Enum(
"web_app",
"pptx",
"docx",
"markdown",
"excel",
"image",
name="artifacttype",
native_enum=False,
)
# ==========================================================================
# BUILD_SESSION TABLE
# ==========================================================================
op.create_table(
"build_session",
sa.Column("id", postgresql.UUID(as_uuid=True), primary_key=True),
sa.Column(
"user_id",
postgresql.UUID(as_uuid=True),
sa.ForeignKey("user.id", ondelete="CASCADE"),
nullable=True,
),
sa.Column("name", sa.String(), nullable=True),
sa.Column(
"status",
build_session_status_enum,
nullable=False,
server_default="active",
),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.Column(
"last_activity_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.Column("nextjs_port", sa.Integer(), nullable=True),
sa.PrimaryKeyConstraint("id"),
)
op.create_index(
"ix_build_session_user_created",
"build_session",
["user_id", sa.text("created_at DESC")],
unique=False,
)
op.create_index(
"ix_build_session_status",
"build_session",
["status"],
unique=False,
)
# ==========================================================================
# SANDBOX TABLE (user-owned, one per user)
# ==========================================================================
op.create_table(
"sandbox",
sa.Column("id", postgresql.UUID(as_uuid=True), primary_key=True),
sa.Column(
"user_id",
postgresql.UUID(as_uuid=True),
sa.ForeignKey("user.id", ondelete="CASCADE"),
nullable=False,
),
sa.Column("container_id", sa.String(), nullable=True),
sa.Column(
"status",
sandbox_status_enum,
nullable=False,
server_default="provisioning",
),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.Column("last_heartbeat", sa.DateTime(timezone=True), nullable=True),
sa.PrimaryKeyConstraint("id"),
sa.UniqueConstraint("user_id", name="sandbox_user_id_key"),
)
op.create_index(
"ix_sandbox_status",
"sandbox",
["status"],
unique=False,
)
op.create_index(
"ix_sandbox_container_id",
"sandbox",
["container_id"],
unique=False,
)
# ==========================================================================
# ARTIFACT TABLE
# ==========================================================================
op.create_table(
"artifact",
sa.Column("id", postgresql.UUID(as_uuid=True), primary_key=True),
sa.Column(
"session_id",
postgresql.UUID(as_uuid=True),
sa.ForeignKey("build_session.id", ondelete="CASCADE"),
nullable=False,
),
sa.Column("type", artifact_type_enum, nullable=False),
sa.Column("path", sa.String(), nullable=False),
sa.Column("name", sa.String(), nullable=False),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.Column(
"updated_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.PrimaryKeyConstraint("id"),
)
op.create_index(
"ix_artifact_session_created",
"artifact",
["session_id", sa.text("created_at DESC")],
unique=False,
)
op.create_index(
"ix_artifact_type",
"artifact",
["type"],
unique=False,
)
# ==========================================================================
# SNAPSHOT TABLE
# ==========================================================================
op.create_table(
"snapshot",
sa.Column("id", postgresql.UUID(as_uuid=True), primary_key=True),
sa.Column(
"session_id",
postgresql.UUID(as_uuid=True),
sa.ForeignKey("build_session.id", ondelete="CASCADE"),
nullable=False,
),
sa.Column("storage_path", sa.String(), nullable=False),
sa.Column("size_bytes", sa.BigInteger(), nullable=False, server_default="0"),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.PrimaryKeyConstraint("id"),
)
op.create_index(
"ix_snapshot_session_created",
"snapshot",
["session_id", sa.text("created_at DESC")],
unique=False,
)
# ==========================================================================
# BUILD_MESSAGE TABLE
# ==========================================================================
op.create_table(
"build_message",
sa.Column("id", postgresql.UUID(as_uuid=True), primary_key=True),
sa.Column(
"session_id",
postgresql.UUID(as_uuid=True),
sa.ForeignKey("build_session.id", ondelete="CASCADE"),
nullable=False,
),
sa.Column(
"turn_index",
sa.Integer(),
nullable=False,
),
sa.Column(
"type",
sa.Enum(
"SYSTEM",
"USER",
"ASSISTANT",
"DANSWER",
name="messagetype",
create_type=False,
native_enum=False,
),
nullable=False,
),
sa.Column(
"message_metadata",
postgresql.JSONB(),
nullable=False,
),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.PrimaryKeyConstraint("id"),
)
op.create_index(
"ix_build_message_session_turn",
"build_message",
["session_id", "turn_index", sa.text("created_at ASC")],
unique=False,
)
# ==========================================================================
# CONNECTOR_CREDENTIAL_PAIR MODIFICATION
# ==========================================================================
op.add_column(
"connector_credential_pair",
sa.Column(
"processing_mode",
sa.String(),
nullable=False,
server_default="regular",
),
)
def downgrade() -> None:
# ==========================================================================
# CONNECTOR_CREDENTIAL_PAIR MODIFICATION
# ==========================================================================
op.drop_column("connector_credential_pair", "processing_mode")
# ==========================================================================
# BUILD_MESSAGE TABLE
# ==========================================================================
op.drop_index("ix_build_message_session_turn", table_name="build_message")
op.drop_table("build_message")
# ==========================================================================
# SNAPSHOT TABLE
# ==========================================================================
op.drop_index("ix_snapshot_session_created", table_name="snapshot")
op.drop_table("snapshot")
# ==========================================================================
# ARTIFACT TABLE
# ==========================================================================
op.drop_index("ix_artifact_type", table_name="artifact")
op.drop_index("ix_artifact_session_created", table_name="artifact")
op.drop_table("artifact")
sa.Enum(name="artifacttype").drop(op.get_bind(), checkfirst=True)
# ==========================================================================
# SANDBOX TABLE
# ==========================================================================
op.drop_index("ix_sandbox_container_id", table_name="sandbox")
op.drop_index("ix_sandbox_status", table_name="sandbox")
op.drop_table("sandbox")
sa.Enum(name="sandboxstatus").drop(op.get_bind(), checkfirst=True)
# ==========================================================================
# BUILD_SESSION TABLE
# ==========================================================================
op.drop_index("ix_build_session_status", table_name="build_session")
op.drop_index("ix_build_session_user_created", table_name="build_session")
op.drop_table("build_session")
sa.Enum(name="buildsessionstatus").drop(op.get_bind(), checkfirst=True)

View File

@@ -0,0 +1,32 @@
"""add approx_chunk_count_in_vespa to opensearch tenant migration
Revision ID: 631fd2504136
Revises: c7f2e1b4a9d3
Create Date: 2026-02-18 21:07:52.831215
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "631fd2504136"
down_revision = "c7f2e1b4a9d3"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.add_column(
"opensearch_tenant_migration_record",
sa.Column(
"approx_chunk_count_in_vespa",
sa.Integer(),
nullable=True,
),
)
def downgrade() -> None:
op.drop_column("opensearch_tenant_migration_record", "approx_chunk_count_in_vespa")

View File

@@ -0,0 +1,45 @@
"""make processing mode default all caps
Revision ID: 72aa7de2e5cf
Revises: 2020d417ec84
Create Date: 2026-01-26 18:58:47.705253
This migration fixes the ProcessingMode enum value mismatch:
- SQLAlchemy's Enum with native_enum=False uses enum member NAMES as valid values
- The original migration stored lowercase VALUES ('regular', 'file_system')
- This converts existing data to uppercase NAMES ('REGULAR', 'FILE_SYSTEM')
- Also drops any spurious native PostgreSQL enum type that may have been auto-created
"""
from alembic import op
# revision identifiers, used by Alembic.
revision = "72aa7de2e5cf"
down_revision = "2020d417ec84"
branch_labels = None
depends_on = None
def upgrade() -> None:
# Convert existing lowercase values to uppercase to match enum member names
op.execute(
"UPDATE connector_credential_pair SET processing_mode = 'REGULAR' "
"WHERE processing_mode = 'regular'"
)
op.execute(
"UPDATE connector_credential_pair SET processing_mode = 'FILE_SYSTEM' "
"WHERE processing_mode = 'file_system'"
)
# Update the server default to use uppercase
op.alter_column(
"connector_credential_pair",
"processing_mode",
server_default="REGULAR",
)
def downgrade() -> None:
# State prior to this was broken, so we don't want to revert back to it
pass
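
A small standalone demonstration of the mismatch this migration fixes; the ProcessingMode class here is an assumption about the app-side enum, but the NAME-storage behavior is standard SQLAlchemy:

import enum
import sqlalchemy as sa

class ProcessingMode(enum.Enum):  # assumed shape of the app-side enum
    REGULAR = "regular"
    FILE_SYSTEM = "file_system"

col = sa.Enum(ProcessingMode, native_enum=False)
# With native_enum=False, SQLAlchemy persists and validates member NAMES,
# so the CHECK constraint accepts 'REGULAR' but not the value 'regular'.
print(col.enums)  # ['REGULAR', 'FILE_SYSTEM']
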

View File

@@ -0,0 +1,58 @@
"""remove reranking from search_settings
Revision ID: 78ebc66946a0
Revises: 849b21c732f8
Create Date: 2026-01-28
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "78ebc66946a0"
down_revision = "849b21c732f8"
branch_labels: None = None
depends_on: None = None
def upgrade() -> None:
op.drop_column("search_settings", "disable_rerank_for_streaming")
op.drop_column("search_settings", "rerank_model_name")
op.drop_column("search_settings", "rerank_provider_type")
op.drop_column("search_settings", "rerank_api_key")
op.drop_column("search_settings", "rerank_api_url")
op.drop_column("search_settings", "num_rerank")
def downgrade() -> None:
op.add_column(
"search_settings",
sa.Column(
"disable_rerank_for_streaming",
sa.Boolean(),
nullable=False,
server_default="false",
),
)
op.add_column(
"search_settings", sa.Column("rerank_model_name", sa.String(), nullable=True)
)
op.add_column(
"search_settings", sa.Column("rerank_provider_type", sa.String(), nullable=True)
)
op.add_column(
"search_settings", sa.Column("rerank_api_key", sa.String(), nullable=True)
)
op.add_column(
"search_settings", sa.Column("rerank_api_url", sa.String(), nullable=True)
)
op.add_column(
"search_settings",
sa.Column(
"num_rerank",
sa.Integer(),
nullable=False,
server_default=str(20),
),
)

View File

@@ -0,0 +1,349 @@
"""hierarchy_nodes_v1
Revision ID: 81c22b1e2e78
Revises: 72aa7de2e5cf
Create Date: 2026-01-13 18:10:01.021451
"""
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql
from onyx.configs.constants import DocumentSource
# revision identifiers, used by Alembic.
revision = "81c22b1e2e78"
down_revision = "72aa7de2e5cf"
branch_labels = None
depends_on = None
# Human-readable display names for each source
SOURCE_DISPLAY_NAMES: dict[str, str] = {
"ingestion_api": "Ingestion API",
"slack": "Slack",
"web": "Web",
"google_drive": "Google Drive",
"gmail": "Gmail",
"requesttracker": "Request Tracker",
"github": "GitHub",
"gitbook": "GitBook",
"gitlab": "GitLab",
"guru": "Guru",
"bookstack": "BookStack",
"outline": "Outline",
"confluence": "Confluence",
"jira": "Jira",
"slab": "Slab",
"productboard": "Productboard",
"file": "File",
"coda": "Coda",
"notion": "Notion",
"zulip": "Zulip",
"linear": "Linear",
"hubspot": "HubSpot",
"document360": "Document360",
"gong": "Gong",
"google_sites": "Google Sites",
"zendesk": "Zendesk",
"loopio": "Loopio",
"dropbox": "Dropbox",
"sharepoint": "SharePoint",
"teams": "Teams",
"salesforce": "Salesforce",
"discourse": "Discourse",
"axero": "Axero",
"clickup": "ClickUp",
"mediawiki": "MediaWiki",
"wikipedia": "Wikipedia",
"asana": "Asana",
"s3": "S3",
"r2": "R2",
"google_cloud_storage": "Google Cloud Storage",
"oci_storage": "OCI Storage",
"xenforo": "XenForo",
"not_applicable": "Not Applicable",
"discord": "Discord",
"freshdesk": "Freshdesk",
"fireflies": "Fireflies",
"egnyte": "Egnyte",
"airtable": "Airtable",
"highspot": "Highspot",
"drupal_wiki": "Drupal Wiki",
"imap": "IMAP",
"bitbucket": "Bitbucket",
"testrail": "TestRail",
"mock_connector": "Mock Connector",
"user_file": "User File",
}
def upgrade() -> None:
# 1. Create hierarchy_node table
op.create_table(
"hierarchy_node",
sa.Column("id", sa.Integer(), nullable=False),
sa.Column("raw_node_id", sa.String(), nullable=False),
sa.Column("display_name", sa.String(), nullable=False),
sa.Column("link", sa.String(), nullable=True),
sa.Column("source", sa.String(), nullable=False),
sa.Column("node_type", sa.String(), nullable=False),
sa.Column("document_id", sa.String(), nullable=True),
sa.Column("parent_id", sa.Integer(), nullable=True),
# Permission fields - same pattern as Document table
sa.Column(
"external_user_emails",
postgresql.ARRAY(sa.String()),
nullable=True,
),
sa.Column(
"external_user_group_ids",
postgresql.ARRAY(sa.String()),
nullable=True,
),
sa.Column("is_public", sa.Boolean(), nullable=False, server_default="false"),
sa.PrimaryKeyConstraint("id"),
# When document is deleted, just unlink (node can exist without document)
sa.ForeignKeyConstraint(["document_id"], ["document.id"], ondelete="SET NULL"),
# When parent node is deleted, orphan children (cleanup via pruning)
sa.ForeignKeyConstraint(
["parent_id"], ["hierarchy_node.id"], ondelete="SET NULL"
),
sa.UniqueConstraint(
"raw_node_id", "source", name="uq_hierarchy_node_raw_id_source"
),
)
op.create_index("ix_hierarchy_node_parent_id", "hierarchy_node", ["parent_id"])
op.create_index(
"ix_hierarchy_node_source_type", "hierarchy_node", ["source", "node_type"]
)
# Add partial unique index to ensure only one SOURCE-type node per source
# This prevents duplicate source root nodes from being created
# NOTE: node_type stores enum NAME ('SOURCE'), not value ('source')
op.execute(
sa.text(
"""
CREATE UNIQUE INDEX uq_hierarchy_node_one_source_per_type
ON hierarchy_node (source)
WHERE node_type = 'SOURCE'
"""
)
)
# 2. Create hierarchy_fetch_attempt table
op.create_table(
"hierarchy_fetch_attempt",
sa.Column("id", postgresql.UUID(as_uuid=True), nullable=False),
sa.Column("connector_credential_pair_id", sa.Integer(), nullable=False),
sa.Column("status", sa.String(), nullable=False),
sa.Column("nodes_fetched", sa.Integer(), nullable=True, server_default="0"),
sa.Column("nodes_updated", sa.Integer(), nullable=True, server_default="0"),
sa.Column("error_msg", sa.Text(), nullable=True),
sa.Column("full_exception_trace", sa.Text(), nullable=True),
sa.Column(
"time_created",
sa.DateTime(timezone=True),
server_default=sa.func.now(),
nullable=False,
),
sa.Column("time_started", sa.DateTime(timezone=True), nullable=True),
sa.Column(
"time_updated",
sa.DateTime(timezone=True),
server_default=sa.func.now(),
nullable=False,
),
sa.PrimaryKeyConstraint("id"),
sa.ForeignKeyConstraint(
["connector_credential_pair_id"],
["connector_credential_pair.id"],
ondelete="CASCADE",
),
)
op.create_index(
"ix_hierarchy_fetch_attempt_status", "hierarchy_fetch_attempt", ["status"]
)
op.create_index(
"ix_hierarchy_fetch_attempt_time_created",
"hierarchy_fetch_attempt",
["time_created"],
)
op.create_index(
"ix_hierarchy_fetch_attempt_cc_pair",
"hierarchy_fetch_attempt",
["connector_credential_pair_id"],
)
# 3. Insert SOURCE-type hierarchy nodes for each DocumentSource
# We insert these so every existing document can have a parent hierarchy node
# NOTE: SQLAlchemy's Enum with native_enum=False stores the enum NAME (e.g., 'GOOGLE_DRIVE'),
# not the VALUE (e.g., 'google_drive'). We must use .name for source and node_type columns.
# SOURCE nodes are always public since they're just categorical roots.
for source in DocumentSource:
source_name = (
source.name
) # e.g., 'GOOGLE_DRIVE' - what SQLAlchemy stores/expects
source_value = source.value # e.g., 'google_drive' - the raw_node_id
display_name = SOURCE_DISPLAY_NAMES.get(
source_value, source_value.replace("_", " ").title()
)
op.execute(
sa.text(
"""
INSERT INTO hierarchy_node (raw_node_id, display_name, source, node_type, parent_id, is_public)
VALUES (:raw_node_id, :display_name, :source, 'SOURCE', NULL, true)
ON CONFLICT (raw_node_id, source) DO NOTHING
"""
).bindparams(
raw_node_id=source_value, # Use .value for raw_node_id (human-readable identifier)
display_name=display_name,
source=source_name, # Use .name for source column (SQLAlchemy enum storage)
)
)
# 4. Add parent_hierarchy_node_id column to document table
op.add_column(
"document",
sa.Column("parent_hierarchy_node_id", sa.Integer(), nullable=True),
)
# When hierarchy node is deleted, just unlink the document (SET NULL)
op.create_foreign_key(
"fk_document_parent_hierarchy_node",
"document",
"hierarchy_node",
["parent_hierarchy_node_id"],
["id"],
ondelete="SET NULL",
)
op.create_index(
"ix_document_parent_hierarchy_node_id",
"document",
["parent_hierarchy_node_id"],
)
# 5. Set all existing documents' parent_hierarchy_node_id to their source's SOURCE node
# For documents with multiple connectors, we pick one source deterministically (MIN connector_id)
# NOTE: Both connector.source and hierarchy_node.source store enum NAMEs (e.g., 'GOOGLE_DRIVE')
# because SQLAlchemy Enum(native_enum=False) uses the enum name for storage.
op.execute(
sa.text(
"""
UPDATE document d
SET parent_hierarchy_node_id = hn.id
FROM (
-- Get the source for each document (pick MIN connector_id for determinism)
SELECT DISTINCT ON (dbcc.id)
dbcc.id as doc_id,
c.source as source
FROM document_by_connector_credential_pair dbcc
JOIN connector c ON dbcc.connector_id = c.id
ORDER BY dbcc.id, dbcc.connector_id
) doc_source
JOIN hierarchy_node hn ON hn.source = doc_source.source AND hn.node_type = 'SOURCE'
WHERE d.id = doc_source.doc_id
"""
)
)
# Create the persona__hierarchy_node association table
op.create_table(
"persona__hierarchy_node",
sa.Column("persona_id", sa.Integer(), nullable=False),
sa.Column("hierarchy_node_id", sa.Integer(), nullable=False),
sa.ForeignKeyConstraint(
["persona_id"],
["persona.id"],
ondelete="CASCADE",
),
sa.ForeignKeyConstraint(
["hierarchy_node_id"],
["hierarchy_node.id"],
ondelete="CASCADE",
),
sa.PrimaryKeyConstraint("persona_id", "hierarchy_node_id"),
)
# Add index for efficient lookups
op.create_index(
"ix_persona__hierarchy_node_hierarchy_node_id",
"persona__hierarchy_node",
["hierarchy_node_id"],
)
# Create the persona__document association table for attaching individual
# documents directly to assistants
op.create_table(
"persona__document",
sa.Column("persona_id", sa.Integer(), nullable=False),
sa.Column("document_id", sa.String(), nullable=False),
sa.ForeignKeyConstraint(
["persona_id"],
["persona.id"],
ondelete="CASCADE",
),
sa.ForeignKeyConstraint(
["document_id"],
["document.id"],
ondelete="CASCADE",
),
sa.PrimaryKeyConstraint("persona_id", "document_id"),
)
# Add index for efficient lookups by document_id
op.create_index(
"ix_persona__document_document_id",
"persona__document",
["document_id"],
)
# 6. Add last_time_hierarchy_fetch column to connector_credential_pair table
op.add_column(
"connector_credential_pair",
sa.Column(
"last_time_hierarchy_fetch", sa.DateTime(timezone=True), nullable=True
),
)
def downgrade() -> None:
# Remove last_time_hierarchy_fetch from connector_credential_pair
op.drop_column("connector_credential_pair", "last_time_hierarchy_fetch")
# Drop persona__document table
op.drop_index("ix_persona__document_document_id", table_name="persona__document")
op.drop_table("persona__document")
# Drop persona__hierarchy_node table
op.drop_index(
"ix_persona__hierarchy_node_hierarchy_node_id",
table_name="persona__hierarchy_node",
)
op.drop_table("persona__hierarchy_node")
# Remove parent_hierarchy_node_id from document
op.drop_index("ix_document_parent_hierarchy_node_id", table_name="document")
op.drop_constraint(
"fk_document_parent_hierarchy_node", "document", type_="foreignkey"
)
op.drop_column("document", "parent_hierarchy_node_id")
# Drop hierarchy_fetch_attempt table
op.drop_index(
"ix_hierarchy_fetch_attempt_cc_pair", table_name="hierarchy_fetch_attempt"
)
op.drop_index(
"ix_hierarchy_fetch_attempt_time_created", table_name="hierarchy_fetch_attempt"
)
op.drop_index(
"ix_hierarchy_fetch_attempt_status", table_name="hierarchy_fetch_attempt"
)
op.drop_table("hierarchy_fetch_attempt")
# Drop hierarchy_node table
op.drop_index("uq_hierarchy_node_one_source_per_type", table_name="hierarchy_node")
op.drop_index("ix_hierarchy_node_source_type", table_name="hierarchy_node")
op.drop_index("ix_hierarchy_node_parent_id", table_name="hierarchy_node")
op.drop_table("hierarchy_node")

View File

@@ -0,0 +1,32 @@
"""add demo_data_enabled to build_session
Revision ID: 849b21c732f8
Revises: 81c22b1e2e78
Create Date: 2026-01-28 10:00:00.000000
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "849b21c732f8"
down_revision = "81c22b1e2e78"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.add_column(
"build_session",
sa.Column(
"demo_data_enabled",
sa.Boolean(),
nullable=False,
server_default=sa.text("true"),
),
)
def downgrade() -> None:
op.drop_column("build_session", "demo_data_enabled")

View File

@@ -0,0 +1,36 @@
"""add_chat_compression_fields
Revision ID: 90b409d06e50
Revises: f220515df7b4
Create Date: 2026-01-26 09:13:09.635427
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "90b409d06e50"
down_revision = "f220515df7b4"
branch_labels = None
depends_on = None
def upgrade() -> None:
# Add last_summarized_message_id to chat_message
# This field marks a message as a summary and indicates the last message it covers.
# Summaries are branch-aware via their parent_message_id pointing to the branch.
op.add_column(
"chat_message",
sa.Column(
"last_summarized_message_id",
sa.Integer(),
sa.ForeignKey("chat_message.id", ondelete="SET NULL"),
nullable=True,
),
)
def downgrade() -> None:
op.drop_column("chat_message", "last_summarized_message_id")

View File

@@ -16,7 +16,6 @@ from typing import Generator
from alembic import op
import sqlalchemy as sa
-from onyx.document_index.factory import get_default_document_index
from onyx.document_index.vespa_constants import DOCUMENT_ID_ENDPOINT
from onyx.db.search_settings import SearchSettings
from onyx.configs.app_configs import AUTH_TYPE
@@ -126,14 +125,11 @@ def remove_old_tags() -> None:
the document got reindexed, the old tag would not be removed.
This function removes those old tags by comparing them against the tags in vespa.
"""
-    current_search_settings, future_search_settings = active_search_settings()
-    document_index = get_default_document_index(
-        current_search_settings, future_search_settings
-    )
+    current_search_settings, _ = active_search_settings()
# Get the index name
-    if hasattr(document_index, "index_name"):
-        index_name = document_index.index_name
+    if hasattr(current_search_settings, "index_name"):
+        index_name = current_search_settings.index_name
else:
# Default index name if we can't get it from the document_index
index_name = "danswer_index"

View File

@@ -0,0 +1,43 @@
"""add chunk error and vespa count columns to opensearch tenant migration
Revision ID: 93c15d6a6fbb
Revises: d3fd499c829c
Create Date: 2026-02-11 23:07:34.576725
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "93c15d6a6fbb"
down_revision = "d3fd499c829c"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.add_column(
"opensearch_tenant_migration_record",
sa.Column(
"total_chunks_errored",
sa.Integer(),
nullable=False,
server_default="0",
),
)
op.add_column(
"opensearch_tenant_migration_record",
sa.Column(
"total_chunks_in_vespa",
sa.Integer(),
nullable=False,
server_default="0",
),
)
def downgrade() -> None:
op.drop_column("opensearch_tenant_migration_record", "total_chunks_in_vespa")
op.drop_column("opensearch_tenant_migration_record", "total_chunks_errored")

View File

@@ -0,0 +1,124 @@
"""add_scim_tables
Revision ID: 9c54986124c6
Revises: b51c6844d1df
Create Date: 2026-02-12 20:29:47.448614
"""
from alembic import op
import fastapi_users_db_sqlalchemy
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "9c54986124c6"
down_revision = "b51c6844d1df"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.create_table(
"scim_token",
sa.Column("id", sa.Integer(), nullable=False),
sa.Column("name", sa.String(), nullable=False),
sa.Column("hashed_token", sa.String(length=64), nullable=False),
sa.Column("token_display", sa.String(), nullable=False),
sa.Column(
"created_by_id",
fastapi_users_db_sqlalchemy.generics.GUID(),
nullable=False,
),
sa.Column(
"is_active",
sa.Boolean(),
server_default=sa.text("true"),
nullable=False,
),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.Column("last_used_at", sa.DateTime(timezone=True), nullable=True),
sa.ForeignKeyConstraint(["created_by_id"], ["user.id"], ondelete="CASCADE"),
sa.PrimaryKeyConstraint("id"),
sa.UniqueConstraint("hashed_token"),
)
op.create_table(
"scim_group_mapping",
sa.Column("id", sa.Integer(), nullable=False),
sa.Column("external_id", sa.String(), nullable=False),
sa.Column("user_group_id", sa.Integer(), nullable=False),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.Column(
"updated_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
onupdate=sa.text("now()"),
nullable=False,
),
sa.ForeignKeyConstraint(
["user_group_id"], ["user_group.id"], ondelete="CASCADE"
),
sa.PrimaryKeyConstraint("id"),
sa.UniqueConstraint("user_group_id"),
)
op.create_index(
op.f("ix_scim_group_mapping_external_id"),
"scim_group_mapping",
["external_id"],
unique=True,
)
op.create_table(
"scim_user_mapping",
sa.Column("id", sa.Integer(), nullable=False),
sa.Column("external_id", sa.String(), nullable=False),
sa.Column(
"user_id",
fastapi_users_db_sqlalchemy.generics.GUID(),
nullable=False,
),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
nullable=False,
),
sa.Column(
"updated_at",
sa.DateTime(timezone=True),
server_default=sa.text("now()"),
onupdate=sa.text("now()"),
nullable=False,
),
sa.ForeignKeyConstraint(["user_id"], ["user.id"], ondelete="CASCADE"),
sa.PrimaryKeyConstraint("id"),
sa.UniqueConstraint("user_id"),
)
op.create_index(
op.f("ix_scim_user_mapping_external_id"),
"scim_user_mapping",
["external_id"],
unique=True,
)
def downgrade() -> None:
op.drop_index(
op.f("ix_scim_user_mapping_external_id"),
table_name="scim_user_mapping",
)
op.drop_table("scim_user_mapping")
op.drop_index(
op.f("ix_scim_group_mapping_external_id"),
table_name="scim_group_mapping",
)
op.drop_table("scim_group_mapping")
op.drop_table("scim_token")

View File

@@ -0,0 +1,27 @@
"""add processing_duration_seconds to chat_message
Revision ID: 9d1543a37106
Revises: cbc03e08d0f3
Create Date: 2026-01-21 11:42:18.546188
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "9d1543a37106"
down_revision = "cbc03e08d0f3"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.add_column(
"chat_message",
sa.Column("processing_duration_seconds", sa.Float(), nullable=True),
)
def downgrade() -> None:
op.drop_column("chat_message", "processing_duration_seconds")

View File

@@ -0,0 +1,81 @@
"""seed_memory_tool and add enable_memory_tool to user
Revision ID: b51c6844d1df
Revises: 93c15d6a6fbb
Create Date: 2026-02-11 00:00:00.000000
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "b51c6844d1df"
down_revision = "93c15d6a6fbb"
branch_labels = None
depends_on = None
MEMORY_TOOL = {
"name": "MemoryTool",
"display_name": "Add Memory",
"description": "Save memories about the user for future conversations.",
"in_code_tool_id": "MemoryTool",
"enabled": True,
}
def upgrade() -> None:
conn = op.get_bind()
existing = conn.execute(
sa.text(
"SELECT in_code_tool_id FROM tool WHERE in_code_tool_id = :in_code_tool_id"
),
{"in_code_tool_id": MEMORY_TOOL["in_code_tool_id"]},
).fetchone()
if existing:
conn.execute(
sa.text(
"""
UPDATE tool
SET name = :name,
display_name = :display_name,
description = :description
WHERE in_code_tool_id = :in_code_tool_id
"""
),
MEMORY_TOOL,
)
else:
conn.execute(
sa.text(
"""
INSERT INTO tool (name, display_name, description, in_code_tool_id, enabled)
VALUES (:name, :display_name, :description, :in_code_tool_id, :enabled)
"""
),
MEMORY_TOOL,
)
op.add_column(
"user",
sa.Column(
"enable_memory_tool",
sa.Boolean(),
nullable=False,
server_default=sa.true(),
),
)
def downgrade() -> None:
op.drop_column("user", "enable_memory_tool")
conn = op.get_bind()
conn.execute(
sa.text("DELETE FROM tool WHERE in_code_tool_id = :in_code_tool_id"),
{"in_code_tool_id": MEMORY_TOOL["in_code_tool_id"]},
)

View File

@@ -0,0 +1,40 @@
"""Persona new default model configuration id column
Revision ID: be87a654d5af
Revises: e7f8a9b0c1d2
Create Date: 2026-01-30 11:14:17.306275
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "be87a654d5af"
down_revision = "e7f8a9b0c1d2"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.add_column(
"persona",
sa.Column("default_model_configuration_id", sa.Integer(), nullable=True),
)
op.create_foreign_key(
"fk_persona_default_model_configuration_id",
"persona",
"model_configuration",
["default_model_configuration_id"],
["id"],
ondelete="SET NULL",
)
def downgrade() -> None:
op.drop_constraint(
"fk_persona_default_model_configuration_id", "persona", type_="foreignkey"
)
op.drop_column("persona", "default_model_configuration_id")

View File

@@ -0,0 +1,31 @@
"""add sharing_scope to build_session
Revision ID: c7f2e1b4a9d3
Revises: 19c0ccb01687
Create Date: 2026-02-17 12:00:00.000000
"""
from alembic import op
import sqlalchemy as sa
revision = "c7f2e1b4a9d3"
down_revision = "19c0ccb01687"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.add_column(
"build_session",
sa.Column(
"sharing_scope",
sa.String(),
nullable=False,
server_default="private",
),
)
def downgrade() -> None:
op.drop_column("build_session", "sharing_scope")

View File

@@ -0,0 +1,128 @@
"""add_opensearch_migration_tables
Revision ID: cbc03e08d0f3
Revises: be87a654d5af
Create Date: 2026-01-31 17:00:45.176604
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "cbc03e08d0f3"
down_revision = "be87a654d5af"
branch_labels = None
depends_on = None
def upgrade() -> None:
# 1. Create opensearch_document_migration_record table.
op.create_table(
"opensearch_document_migration_record",
sa.Column("document_id", sa.String(), nullable=False),
sa.Column("status", sa.String(), nullable=False, server_default="pending"),
sa.Column("error_message", sa.Text(), nullable=True),
sa.Column("attempts_count", sa.Integer(), nullable=False, server_default="0"),
sa.Column("last_attempt_at", sa.DateTime(timezone=True), nullable=True),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
server_default=sa.func.now(),
nullable=False,
),
sa.PrimaryKeyConstraint("document_id"),
sa.ForeignKeyConstraint(
["document_id"],
["document.id"],
ondelete="CASCADE",
),
)
# 2. Create indices.
op.create_index(
"ix_opensearch_document_migration_record_status",
"opensearch_document_migration_record",
["status"],
)
op.create_index(
"ix_opensearch_document_migration_record_attempts_count",
"opensearch_document_migration_record",
["attempts_count"],
)
op.create_index(
"ix_opensearch_document_migration_record_created_at",
"opensearch_document_migration_record",
["created_at"],
)
# 3. Create opensearch_tenant_migration_record table (singleton).
op.create_table(
"opensearch_tenant_migration_record",
sa.Column("id", sa.Integer(), nullable=False),
sa.Column(
"document_migration_record_table_population_status",
sa.String(),
nullable=False,
server_default="pending",
),
sa.Column(
"num_times_observed_no_additional_docs_to_populate_migration_table",
sa.Integer(),
nullable=False,
server_default="0",
),
sa.Column(
"overall_document_migration_status",
sa.String(),
nullable=False,
server_default="pending",
),
sa.Column(
"num_times_observed_no_additional_docs_to_migrate",
sa.Integer(),
nullable=False,
server_default="0",
),
sa.Column(
"last_updated_at",
sa.DateTime(timezone=True),
server_default=sa.func.now(),
nullable=False,
),
sa.PrimaryKeyConstraint("id"),
)
# 4. Create unique index on constant to enforce singleton pattern.
op.execute(
sa.text(
"""
CREATE UNIQUE INDEX idx_opensearch_tenant_migration_singleton
ON opensearch_tenant_migration_record ((true))
"""
)
)
def downgrade() -> None:
# Drop opensearch_tenant_migration_record.
op.drop_index(
"idx_opensearch_tenant_migration_singleton",
table_name="opensearch_tenant_migration_record",
)
op.drop_table("opensearch_tenant_migration_record")
# Drop opensearch_document_migration_record.
op.drop_index(
"ix_opensearch_document_migration_record_created_at",
table_name="opensearch_document_migration_record",
)
op.drop_index(
"ix_opensearch_document_migration_record_attempts_count",
table_name="opensearch_document_migration_record",
)
op.drop_index(
"ix_opensearch_document_migration_record_status",
table_name="opensearch_document_migration_record",
)
op.drop_table("opensearch_document_migration_record")

View File

@@ -0,0 +1,102 @@
"""add_file_reader_tool
Revision ID: d3fd499c829c
Revises: 114a638452db
Create Date: 2026-02-07 19:28:22.452337
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "d3fd499c829c"
down_revision = "114a638452db"
branch_labels = None
depends_on = None
FILE_READER_TOOL = {
"name": "read_file",
"display_name": "File Reader",
"description": (
"Read sections of user-uploaded files by character offset. "
"Useful for inspecting large files that cannot fit entirely in context."
),
"in_code_tool_id": "FileReaderTool",
"enabled": True,
}
def upgrade() -> None:
conn = op.get_bind()
# Check if tool already exists
existing = conn.execute(
sa.text("SELECT id FROM tool WHERE in_code_tool_id = :in_code_tool_id"),
{"in_code_tool_id": FILE_READER_TOOL["in_code_tool_id"]},
).fetchone()
if existing:
# Update existing tool
conn.execute(
sa.text(
"""
UPDATE tool
SET name = :name,
display_name = :display_name,
description = :description
WHERE in_code_tool_id = :in_code_tool_id
"""
),
FILE_READER_TOOL,
)
tool_id = existing[0]
else:
# Insert new tool
result = conn.execute(
sa.text(
"""
INSERT INTO tool (name, display_name, description, in_code_tool_id, enabled)
VALUES (:name, :display_name, :description, :in_code_tool_id, :enabled)
RETURNING id
"""
),
FILE_READER_TOOL,
)
tool_id = result.scalar_one()
# Attach to the default persona (id=0) if not already attached
conn.execute(
sa.text(
"""
INSERT INTO persona__tool (persona_id, tool_id)
VALUES (0, :tool_id)
ON CONFLICT DO NOTHING
"""
),
{"tool_id": tool_id},
)
def downgrade() -> None:
conn = op.get_bind()
in_code_tool_id = FILE_READER_TOOL["in_code_tool_id"]
# Remove persona associations first (FK constraint)
conn.execute(
sa.text(
"""
DELETE FROM persona__tool
WHERE tool_id IN (
SELECT id FROM tool WHERE in_code_tool_id = :in_code_tool_id
)
"""
),
{"in_code_tool_id": in_code_tool_id},
)
conn.execute(
sa.text("DELETE FROM tool WHERE in_code_tool_id = :in_code_tool_id"),
{"in_code_tool_id": in_code_tool_id},
)

View File

@@ -0,0 +1,35 @@
"""add_file_content
Revision ID: d56ffa94ca32
Revises: 01f8e6d95a33
Create Date: 2026-02-06 15:29:34.192960
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "d56ffa94ca32"
down_revision = "01f8e6d95a33"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.create_table(
"file_content",
sa.Column(
"file_id",
sa.String(),
sa.ForeignKey("file_record.file_id", ondelete="CASCADE"),
primary_key=True,
),
sa.Column("lobj_oid", sa.BigInteger(), nullable=False),
sa.Column("file_size", sa.BigInteger(), nullable=False, server_default="0"),
)
def downgrade() -> None:
op.drop_table("file_content")

View File

@@ -0,0 +1,35 @@
"""add_cascade_delete_to_search_query_user_id
Revision ID: d5c86e2c6dc6
Revises: 90b409d06e50
Create Date: 2026-02-04 16:05:04.749804
"""
from alembic import op
# revision identifiers, used by Alembic.
revision = "d5c86e2c6dc6"
down_revision = "90b409d06e50"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.drop_constraint("search_query_user_id_fkey", "search_query", type_="foreignkey")
op.create_foreign_key(
"search_query_user_id_fkey",
"search_query",
"user",
["user_id"],
["id"],
ondelete="CASCADE",
)
def downgrade() -> None:
op.drop_constraint("search_query_user_id_fkey", "search_query", type_="foreignkey")
op.create_foreign_key(
"search_query_user_id_fkey", "search_query", "user", ["user_id"], ["id"]
)

View File

@@ -0,0 +1,125 @@
"""create_anonymous_user
This migration creates a permanent anonymous user in the database.
When anonymous access is enabled, unauthenticated requests will use this user
instead of returning user_id=NULL.
Revision ID: e7f8a9b0c1d2
Revises: f7ca3e2f45d9
Create Date: 2026-01-15 14:00:00.000000
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "e7f8a9b0c1d2"
down_revision = "f7ca3e2f45d9"
branch_labels = None
depends_on = None
# Must match constants in onyx/configs/constants.py file
ANONYMOUS_USER_UUID = "00000000-0000-0000-0000-000000000002"
ANONYMOUS_USER_EMAIL = "anonymous@onyx.app"
# Tables with user_id foreign key that may need migration
TABLES_WITH_USER_ID = [
"chat_session",
"credential",
"document_set",
"persona",
"tool",
"notification",
"inputprompt",
]
def upgrade() -> None:
"""
Create the anonymous user for anonymous access feature.
Also migrates any remaining user_id=NULL records to the anonymous user.
"""
connection = op.get_bind()
# Create the anonymous user (using ON CONFLICT to be idempotent)
connection.execute(
sa.text(
"""
INSERT INTO "user" (id, email, hashed_password, is_active, is_superuser, is_verified, role)
VALUES (:id, :email, :hashed_password, :is_active, :is_superuser, :is_verified, :role)
ON CONFLICT (id) DO NOTHING
"""
),
{
"id": ANONYMOUS_USER_UUID,
"email": ANONYMOUS_USER_EMAIL,
"hashed_password": "", # Empty password - user cannot log in directly
"is_active": True, # Active so it can be used for anonymous access
"is_superuser": False,
"is_verified": True, # Verified since no email verification needed
"role": "LIMITED", # Anonymous users have limited role to restrict access
},
)
# Migrate any remaining user_id=NULL records to anonymous user
for table in TABLES_WITH_USER_ID:
try:
# Exclude public credential (id=0) which must remain user_id=NULL
# Exclude builtin tools (in_code_tool_id IS NOT NULL) which must remain user_id=NULL
# Exclude builtin personas (builtin_persona=True) which must remain user_id=NULL
# Exclude system input prompts (is_public=True with user_id=NULL) which must remain user_id=NULL
if table == "credential":
condition = "user_id IS NULL AND id != 0"
elif table == "tool":
condition = "user_id IS NULL AND in_code_tool_id IS NULL"
elif table == "persona":
condition = "user_id IS NULL AND builtin_persona = false"
elif table == "inputprompt":
condition = "user_id IS NULL AND is_public = false"
else:
condition = "user_id IS NULL"
result = connection.execute(
sa.text(
f"""
UPDATE "{table}"
SET user_id = :user_id
WHERE {condition}
"""
),
{"user_id": ANONYMOUS_USER_UUID},
)
if result.rowcount > 0:
print(f"Updated {result.rowcount} rows in {table} to anonymous user")
except Exception as e:
print(f"Skipping {table}: {e}")
def downgrade() -> None:
"""
Set anonymous user's records back to NULL and delete the anonymous user.
"""
connection = op.get_bind()
# Set records back to NULL
for table in TABLES_WITH_USER_ID:
try:
connection.execute(
sa.text(
f"""
UPDATE "{table}"
SET user_id = NULL
WHERE user_id = :user_id
"""
),
{"user_id": ANONYMOUS_USER_UUID},
)
except Exception:
pass
# Delete the anonymous user
connection.execute(
sa.text('DELETE FROM "user" WHERE id = :user_id'),
{"user_id": ANONYMOUS_USER_UUID},
)

View File

@@ -0,0 +1,57 @@
"""Add flow mapping table
Revision ID: f220515df7b4
Revises: 9d1543a37106
Create Date: 2026-01-30 12:21:24.955922
"""
from onyx.db.enums import LLMModelFlowType
from alembic import op
import sqlalchemy as sa

# revision identifiers, used by Alembic.
revision = "f220515df7b4"
down_revision = "9d1543a37106"
branch_labels = None
depends_on = None


def upgrade() -> None:
    op.create_table(
        "llm_model_flow",
        sa.Column("id", sa.Integer(), nullable=False),
        sa.Column(
            "llm_model_flow_type",
            sa.Enum(LLMModelFlowType, name="llmmodelflowtype", native_enum=False),
            nullable=False,
        ),
        sa.Column(
            "is_default", sa.Boolean(), nullable=False, server_default=sa.text("false")
        ),
        sa.Column("model_configuration_id", sa.Integer(), nullable=False),
        sa.PrimaryKeyConstraint("id"),
        sa.ForeignKeyConstraint(
            ["model_configuration_id"], ["model_configuration.id"], ondelete="CASCADE"
        ),
        sa.UniqueConstraint(
            "llm_model_flow_type",
            "model_configuration_id",
            name="uq_model_config_per_llm_model_flow_type",
        ),
    )

    # Partial unique index so that there is at most one default for each flow type
    op.create_index(
        "ix_one_default_per_llm_model_flow",
        "llm_model_flow",
        ["llm_model_flow_type"],
        unique=True,
        postgresql_where=sa.text("is_default IS TRUE"),
    )


def downgrade() -> None:
    # Drop the llm_model_flow table (index is dropped automatically with table)
    op.drop_table("llm_model_flow")
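
For intuition, here is a hypothetical helper (not part of this diff) showing how application code might swap the default model for a flow type without violating ix_one_default_per_llm_model_flow; the table and column names come from the migration above, everything else is assumed:

from sqlalchemy import text
from sqlalchemy.orm import Session


def set_default_model_for_flow(
    db_session: Session, flow_type: str, model_configuration_id: int
) -> None:
    # Clear any existing default first; the partial unique index allows at
    # most one row per flow type with is_default = TRUE
    db_session.execute(
        text(
            "UPDATE llm_model_flow SET is_default = false "
            "WHERE llm_model_flow_type = :flow AND is_default IS TRUE"
        ),
        {"flow": flow_type},
    )
    db_session.execute(
        text(
            "UPDATE llm_model_flow SET is_default = true "
            "WHERE llm_model_flow_type = :flow "
            "AND model_configuration_id = :model_config_id"
        ),
        {"flow": flow_type, "model_config_id": model_configuration_id},
    )
    db_session.commit()

Both statements run in the same transaction, so a concurrent reader sees either the old default or the new one, never two.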

View File

@@ -0,0 +1,281 @@
"""migrate_no_auth_data_to_placeholder
This migration handles the transition from AUTH_TYPE=disabled to requiring
authentication. It creates a placeholder user and assigns all data that was
created without a user (user_id=NULL) to this placeholder.
A database trigger is installed that automatically transfers all data from
the placeholder user to the first real user who registers, then drops itself.
Revision ID: f7ca3e2f45d9
Revises: 78ebc66946a0
Create Date: 2026-01-15 12:49:53.802741
"""
import os
from alembic import op
import sqlalchemy as sa
from shared_configs.configs import MULTI_TENANT
# revision identifiers, used by Alembic.
revision = "f7ca3e2f45d9"
down_revision = "78ebc66946a0"
branch_labels = None
depends_on = None
# Must match constants in onyx/configs/constants.py file
NO_AUTH_PLACEHOLDER_USER_UUID = "00000000-0000-0000-0000-000000000001"
NO_AUTH_PLACEHOLDER_USER_EMAIL = "no-auth-placeholder@onyx.app"

# Trigger and function names
TRIGGER_NAME = "trg_migrate_no_auth_data"
FUNCTION_NAME = "migrate_no_auth_data_to_user"

# Trigger function that migrates data from placeholder to first real user
MIGRATE_NO_AUTH_TRIGGER_FUNCTION = f"""
CREATE OR REPLACE FUNCTION {FUNCTION_NAME}()
RETURNS TRIGGER AS $$
DECLARE
    placeholder_uuid UUID := '00000000-0000-0000-0000-000000000001'::uuid;
    anonymous_uuid UUID := '00000000-0000-0000-0000-000000000002'::uuid;
    placeholder_row RECORD;
    schema_name TEXT;
BEGIN
    -- Skip if this is the placeholder user being inserted
    IF NEW.id = placeholder_uuid THEN
        RETURN NULL;
    END IF;

    -- Skip if this is the anonymous user being inserted (not a real user)
    IF NEW.id = anonymous_uuid THEN
        RETURN NULL;
    END IF;

    -- Skip if the new user is not active
    IF NEW.is_active = FALSE THEN
        RETURN NULL;
    END IF;

    -- Get current schema for self-cleanup
    schema_name := current_schema();

    -- Try to lock the placeholder user row with FOR UPDATE SKIP LOCKED
    -- This ensures only one concurrent transaction can proceed with migration
    -- SKIP LOCKED means if another transaction has the lock, we skip (don't wait)
    SELECT id INTO placeholder_row
    FROM "user"
    WHERE id = placeholder_uuid
    FOR UPDATE SKIP LOCKED;

    IF NOT FOUND THEN
        -- Either placeholder doesn't exist or another transaction has it locked
        -- Either way, drop the trigger and return without making admin
        EXECUTE format('DROP TRIGGER IF EXISTS {TRIGGER_NAME} ON %I."user"', schema_name);
        EXECUTE format('DROP FUNCTION IF EXISTS %I.{FUNCTION_NAME}()', schema_name);
        RETURN NULL;
    END IF;

    -- We have exclusive lock on placeholder - proceed with migration
    -- The INSERT has already completed (AFTER INSERT), so NEW.id exists in the table

    -- Migrate chat_session
    UPDATE "chat_session" SET user_id = NEW.id WHERE user_id = placeholder_uuid;

    -- Migrate credential (exclude public credential id=0)
    UPDATE "credential" SET user_id = NEW.id WHERE user_id = placeholder_uuid AND id != 0;

    -- Migrate document_set
    UPDATE "document_set" SET user_id = NEW.id WHERE user_id = placeholder_uuid;

    -- Migrate persona (exclude builtin personas)
    UPDATE "persona" SET user_id = NEW.id WHERE user_id = placeholder_uuid AND builtin_persona = FALSE;

    -- Migrate tool (exclude builtin tools)
    UPDATE "tool" SET user_id = NEW.id WHERE user_id = placeholder_uuid AND in_code_tool_id IS NULL;

    -- Migrate notification
    UPDATE "notification" SET user_id = NEW.id WHERE user_id = placeholder_uuid;

    -- Migrate inputprompt (exclude system/public prompts)
    UPDATE "inputprompt" SET user_id = NEW.id WHERE user_id = placeholder_uuid AND is_public = FALSE;

    -- Make the new user an admin (they had admin access in no-auth mode)
    -- In AFTER INSERT trigger, we must UPDATE the row since it already exists
    UPDATE "user" SET role = 'ADMIN' WHERE id = NEW.id;

    -- Delete the placeholder user (we hold the lock so this is safe)
    DELETE FROM "user" WHERE id = placeholder_uuid;

    -- Drop the trigger and function (self-cleanup)
    EXECUTE format('DROP TRIGGER IF EXISTS {TRIGGER_NAME} ON %I."user"', schema_name);
    EXECUTE format('DROP FUNCTION IF EXISTS %I.{FUNCTION_NAME}()', schema_name);

    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
"""

MIGRATE_NO_AUTH_TRIGGER = f"""
CREATE TRIGGER {TRIGGER_NAME}
AFTER INSERT ON "user"
FOR EACH ROW
EXECUTE FUNCTION {FUNCTION_NAME}();
"""
def upgrade() -> None:
    """
    Create a placeholder user and assign all NULL user_id records to it.
    Install a trigger that migrates data to the first real user and self-destructs.
    Only runs if AUTH_TYPE is currently disabled/none.
    Skipped in multi-tenant mode - each tenant starts fresh with no legacy data.
    """
    # Skip in multi-tenant mode - this migration handles single-tenant
    # AUTH_TYPE=disabled -> auth transitions only
    if MULTI_TENANT:
        return

    # Only run if AUTH_TYPE is currently disabled/none
    # If they've already switched to auth-enabled, NULL data is stale anyway
    auth_type = (os.environ.get("AUTH_TYPE") or "").lower()
    if auth_type not in ("disabled", "none", ""):
        print(f"AUTH_TYPE is '{auth_type}', not disabled. Skipping migration.")
        return

    connection = op.get_bind()

    # Check if there are any NULL user_id records that need migration
    tables_to_check = [
        "chat_session",
        "credential",
        "document_set",
        "persona",
        "tool",
        "notification",
        "inputprompt",
    ]

    has_null_records = False
    for table in tables_to_check:
        try:
            result = connection.execute(
                sa.text(f'SELECT 1 FROM "{table}" WHERE user_id IS NULL LIMIT 1')
            )
            if result.fetchone():
                has_null_records = True
                break
        except Exception:
            # Table might not exist
            pass

    if not has_null_records:
        return

    # Create the placeholder user
    connection.execute(
        sa.text(
            """
            INSERT INTO "user" (id, email, hashed_password, is_active, is_superuser, is_verified, role)
            VALUES (:id, :email, :hashed_password, :is_active, :is_superuser, :is_verified, :role)
            """
        ),
        {
            "id": NO_AUTH_PLACEHOLDER_USER_UUID,
            "email": NO_AUTH_PLACEHOLDER_USER_EMAIL,
            "hashed_password": "",  # Empty password - user cannot log in
            "is_active": False,  # Inactive - user cannot log in
            "is_superuser": False,
            "is_verified": False,
            "role": "BASIC",
        },
    )

    # Assign NULL user_id records to the placeholder user
    for table in tables_to_check:
        try:
            # Base condition for all tables
            condition = "user_id IS NULL"

            # Exclude public credential (id=0) which must remain user_id=NULL
            if table == "credential":
                condition += " AND id != 0"
            # Exclude builtin tools (in_code_tool_id IS NOT NULL) which must remain user_id=NULL
            elif table == "tool":
                condition += " AND in_code_tool_id IS NULL"
            # Exclude builtin personas which must remain user_id=NULL
            elif table == "persona":
                condition += " AND builtin_persona = FALSE"
            # Exclude system/public input prompts which must remain user_id=NULL
            elif table == "inputprompt":
                condition += " AND is_public = FALSE"

            result = connection.execute(
                sa.text(
                    f"""
                    UPDATE "{table}"
                    SET user_id = :user_id
                    WHERE {condition}
                    """
                ),
                {"user_id": NO_AUTH_PLACEHOLDER_USER_UUID},
            )
            if result.rowcount > 0:
                print(f"Updated {result.rowcount} rows in {table}")
        except Exception as e:
            print(f"Skipping {table}: {e}")

    # Install the trigger function and trigger for automatic migration on first user registration
    connection.execute(sa.text(MIGRATE_NO_AUTH_TRIGGER_FUNCTION))
    connection.execute(sa.text(MIGRATE_NO_AUTH_TRIGGER))
    print("Installed trigger for automatic data migration on first user registration")


def downgrade() -> None:
    """
    Drop trigger and function, set placeholder user's records back to NULL,
    and delete the placeholder user.
    """
    # Skip in multi-tenant mode for consistency with upgrade
    if MULTI_TENANT:
        return

    connection = op.get_bind()

    # Drop trigger and function if they exist (they may have already self-destructed)
    connection.execute(sa.text(f'DROP TRIGGER IF EXISTS {TRIGGER_NAME} ON "user"'))
    connection.execute(sa.text(f"DROP FUNCTION IF EXISTS {FUNCTION_NAME}()"))

    tables_to_update = [
        "chat_session",
        "credential",
        "document_set",
        "persona",
        "tool",
        "notification",
        "inputprompt",
    ]

    # Set records back to NULL
    for table in tables_to_update:
        try:
            connection.execute(
                sa.text(
                    f"""
                    UPDATE "{table}"
                    SET user_id = NULL
                    WHERE user_id = :user_id
                    """
                ),
                {"user_id": NO_AUTH_PLACEHOLDER_USER_UUID},
            )
        except Exception:
            pass

    # Delete the placeholder user
    connection.execute(
        sa.text('DELETE FROM "user" WHERE id = :user_id'),
        {"user_id": NO_AUTH_PLACEHOLDER_USER_UUID},
    )
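
A hypothetical verification sketch (not part of this diff) of the trigger's end state after the first real user registers, assuming a SQLAlchemy session; the UUID and trigger name mirror the constants above:

from sqlalchemy import text
from sqlalchemy.orm import Session

PLACEHOLDER_UUID = "00000000-0000-0000-0000-000000000001"


def assert_no_auth_migration_ran(db_session: Session, first_user_id: str) -> None:
    # The placeholder user should have been deleted by the trigger
    placeholder = db_session.execute(
        text('SELECT 1 FROM "user" WHERE id = :pid'), {"pid": PLACEHOLDER_UUID}
    ).fetchone()
    assert placeholder is None

    # The first registered user should have been promoted to admin
    role = db_session.execute(
        text('SELECT role FROM "user" WHERE id = :uid'), {"uid": first_user_id}
    ).scalar_one()
    assert role == "ADMIN"

    # The trigger should have dropped itself after running
    trigger = db_session.execute(
        text("SELECT 1 FROM pg_trigger WHERE tgname = 'trg_migrate_no_auth_data'")
    ).fetchone()
    assert trigger is None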

View File

@@ -0,0 +1,69 @@
"""add_opensearch_tenant_migration_columns
Revision ID: feead2911109
Revises: d56ffa94ca32
Create Date: 2026-02-10 17:46:34.029937
"""
from alembic import op
import sqlalchemy as sa

# revision identifiers, used by Alembic.
revision = "feead2911109"
down_revision = "175ea04c7087"
branch_labels = None
depends_on = None


def upgrade() -> None:
    op.add_column(
        "opensearch_tenant_migration_record",
        sa.Column("vespa_visit_continuation_token", sa.Text(), nullable=True),
    )
    op.add_column(
        "opensearch_tenant_migration_record",
        sa.Column(
            "total_chunks_migrated",
            sa.Integer(),
            nullable=False,
            server_default="0",
        ),
    )
    op.add_column(
        "opensearch_tenant_migration_record",
        sa.Column(
            "created_at",
            sa.DateTime(timezone=True),
            nullable=False,
            server_default=sa.func.now(),
        ),
    )
    op.add_column(
        "opensearch_tenant_migration_record",
        sa.Column(
            "migration_completed_at",
            sa.DateTime(timezone=True),
            nullable=True,
        ),
    )
    op.add_column(
        "opensearch_tenant_migration_record",
        sa.Column(
            "enable_opensearch_retrieval",
            sa.Boolean(),
            nullable=False,
            server_default="false",
        ),
    )


def downgrade() -> None:
    op.drop_column("opensearch_tenant_migration_record", "enable_opensearch_retrieval")
    op.drop_column("opensearch_tenant_migration_record", "migration_completed_at")
    op.drop_column("opensearch_tenant_migration_record", "created_at")
    op.drop_column("opensearch_tenant_migration_record", "total_chunks_migrated")
    op.drop_column(
        "opensearch_tenant_migration_record", "vespa_visit_continuation_token"
    )
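
A sketch of how a resumable copy loop might use these columns, under stated assumptions: fetch_page and index_batch are hypothetical stand-ins for the Vespa visit and OpenSearch indexing calls, which this diff does not show:

from typing import Callable

from sqlalchemy import text
from sqlalchemy.orm import Session


def run_migration_step(
    db_session: Session,
    record_id: int,
    fetch_page: Callable[[str | None], tuple[list[dict], str | None]],
    index_batch: Callable[[list[dict]], None],
) -> None:
    token, migrated = db_session.execute(
        text(
            "SELECT vespa_visit_continuation_token, total_chunks_migrated "
            "FROM opensearch_tenant_migration_record WHERE id = :id"
        ),
        {"id": record_id},
    ).one()

    # Resume the Vespa visit from the saved continuation token
    chunks, next_token = fetch_page(token)
    index_batch(chunks)

    # Persist progress so a crashed or restarted worker resumes instead of restarting
    db_session.execute(
        text(
            "UPDATE opensearch_tenant_migration_record "
            "SET vespa_visit_continuation_token = :tok, "
            "total_chunks_migrated = :count "
            "WHERE id = :id"
        ),
        {"tok": next_token, "count": migrated + len(chunks), "id": record_id},
    )
    if next_token is None:
        # No more pages: stamp completion
        db_session.execute(
            text(
                "UPDATE opensearch_tenant_migration_record "
                "SET migration_completed_at = now() WHERE id = :id"
            ),
            {"id": record_id},
        )
    db_session.commit()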

View File

@@ -39,7 +39,7 @@ EXCLUDE_TABLES = {"kombu_queue", "kombu_message"}
 def include_object(
-    object: SchemaItem,
+    object: SchemaItem,  # noqa: ARG001
     name: str | None,
     type_: Literal[
         "schema",
@@ -49,8 +49,8 @@ def include_object(
         "unique_constraint",
         "foreign_key_constraint",
     ],
-    reflected: bool,
-    compare_to: SchemaItem | None,
+    reflected: bool,  # noqa: ARG001
+    compare_to: SchemaItem | None,  # noqa: ARG001
 ) -> bool:
     if type_ == "table" and name in EXCLUDE_TABLES:
         return False

View File

@@ -1,20 +1,20 @@
-The DanswerAI Enterprise license (the Enterprise License)
+The Onyx Enterprise License (the "Enterprise License")
 Copyright (c) 2023-present DanswerAI, Inc.
 With regard to the Onyx Software:
 This software and associated documentation files (the "Software") may only be
 used in production, if you (and any entity that you represent) have agreed to,
-and are in compliance with, the DanswerAI Subscription Terms of Service, available
-at https://onyx.app/terms (the Enterprise Terms), or other
+and are in compliance with, the Onyx Subscription Terms of Service, available
+at https://www.onyx.app/legal/self-host (the "Enterprise Terms"), or other
 agreement governing the use of the Software, as agreed by you and DanswerAI,
-and otherwise have a valid Onyx Enterprise license for the
+and otherwise have a valid Onyx Enterprise License for the
 correct number of user seats. Subject to the foregoing sentence, you are free to
 modify this Software and publish patches to the Software. You agree that DanswerAI
 and/or its licensors (as applicable) retain all right, title and interest in and
 to all such modifications and/or patches, and all such modifications and/or
 patches may only be used, copied, modified, displayed, distributed, or otherwise
-exploited with a valid Onyx Enterprise license for the correct
+exploited with a valid Onyx Enterprise License for the correct
 number of user seats. Notwithstanding the foregoing, you may copy and modify
 the Software for development and testing purposes, without requiring a
 subscription. You agree that DanswerAI and/or its licensors (as applicable) retain

View File

@@ -116,7 +116,7 @@ def _get_access_for_documents(
     return access_map
-def _get_acl_for_user(user: User | None, db_session: Session) -> set[str]:
+def _get_acl_for_user(user: User, db_session: Session) -> set[str]:
     """Returns a list of ACL entries that the user has access to. This is meant to be
     used downstream to filter out documents that the user does not have access to. The
     user should have access to a document if at least one entry in the document's ACL
@@ -124,13 +124,16 @@ def _get_acl_for_user(user: User | None, db_session: Session) -> set[str]:
     NOTE: is imported in onyx.access.access by `fetch_versioned_implementation`
     DO NOT REMOVE."""
-    db_user_groups = fetch_user_groups_for_user(db_session, user.id) if user else []
+    is_anonymous = user.is_anonymous
+    db_user_groups = (
+        [] if is_anonymous else fetch_user_groups_for_user(db_session, user.id)
+    )
     prefixed_user_groups = [
         prefix_user_group(db_user_group.name) for db_user_group in db_user_groups
     ]
     db_external_groups = (
-        fetch_external_groups_for_user(db_session, user.id) if user else []
+        [] if is_anonymous else fetch_external_groups_for_user(db_session, user.id)
     )
     prefixed_external_groups = [
         prefix_external_group(db_external_group.external_user_group_id)

View File

@@ -0,0 +1,11 @@
from sqlalchemy.orm import Session

from ee.onyx.db.external_perm import fetch_external_groups_for_user
from onyx.db.models import User


def _get_user_external_group_ids(db_session: Session, user: User) -> list[str]:
    if not user:
        return []

    external_groups = fetch_external_groups_for_user(db_session, user.id)
    return [external_group.external_user_group_id for external_group in external_groups]

View File

@@ -33,8 +33,8 @@ def get_default_admin_user_emails_() -> list[str]:
 async def current_cloud_superuser(
     request: Request,
-    user: User | None = Depends(current_admin_user),
-) -> User | None:
+    user: User = Depends(current_admin_user),
+) -> User:
     api_key = request.headers.get("Authorization", "").replace("Bearer ", "")
     if api_key != SUPER_CLOUD_API_KEY:
         raise HTTPException(status_code=401, detail="Invalid API key")

View File

@@ -1,12 +1,15 @@
+from onyx.background.celery.apps import app_base
 from onyx.background.celery.apps.background import celery_app

 celery_app.autodiscover_tasks(
-    [
-        "ee.onyx.background.celery.tasks.doc_permission_syncing",
-        "ee.onyx.background.celery.tasks.external_group_syncing",
-        "ee.onyx.background.celery.tasks.cleanup",
-        "ee.onyx.background.celery.tasks.tenant_provisioning",
-        "ee.onyx.background.celery.tasks.query_history",
-    ]
+    app_base.filter_task_modules(
+        [
+            "ee.onyx.background.celery.tasks.doc_permission_syncing",
+            "ee.onyx.background.celery.tasks.external_group_syncing",
+            "ee.onyx.background.celery.tasks.cleanup",
+            "ee.onyx.background.celery.tasks.tenant_provisioning",
+            "ee.onyx.background.celery.tasks.query_history",
+        ]
+    )
 )
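
The diff does not show app_base.filter_task_modules itself; as a rough, assumed sketch of what such a helper could do, it might drop task modules that are not importable in the current deployment so autodiscovery does not fail on optional packages:

import importlib.util


def filter_task_modules(module_paths: list[str]) -> list[str]:
    # Keep only modules that resolve in this environment; find_spec returns
    # None (or raises ModuleNotFoundError for a missing parent package)
    # when a module is unavailable
    available = []
    for path in module_paths:
        try:
            spec = importlib.util.find_spec(path)
        except ModuleNotFoundError:
            spec = None
        if spec is not None:
            available.append(path)
    return available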

View File

@@ -1,11 +1,14 @@
+from onyx.background.celery.apps import app_base
 from onyx.background.celery.apps.heavy import celery_app

 celery_app.autodiscover_tasks(
-    [
-        "ee.onyx.background.celery.tasks.doc_permission_syncing",
-        "ee.onyx.background.celery.tasks.external_group_syncing",
-        "ee.onyx.background.celery.tasks.cleanup",
-        "ee.onyx.background.celery.tasks.query_history",
-    ]
+    app_base.filter_task_modules(
+        [
+            "ee.onyx.background.celery.tasks.doc_permission_syncing",
+            "ee.onyx.background.celery.tasks.external_group_syncing",
+            "ee.onyx.background.celery.tasks.cleanup",
+            "ee.onyx.background.celery.tasks.query_history",
+        ]
+    )
 )

View File

@@ -1,8 +1,11 @@
+from onyx.background.celery.apps import app_base
 from onyx.background.celery.apps.light import celery_app

 celery_app.autodiscover_tasks(
-    [
-        "ee.onyx.background.celery.tasks.doc_permission_syncing",
-        "ee.onyx.background.celery.tasks.external_group_syncing",
-    ]
+    app_base.filter_task_modules(
+        [
+            "ee.onyx.background.celery.tasks.doc_permission_syncing",
+            "ee.onyx.background.celery.tasks.external_group_syncing",
+        ]
+    )
 )

View File

@@ -1,7 +1,10 @@
+from onyx.background.celery.apps import app_base
 from onyx.background.celery.apps.monitoring import celery_app

 celery_app.autodiscover_tasks(
-    [
-        "ee.onyx.background.celery.tasks.tenant_provisioning",
-    ]
+    app_base.filter_task_modules(
+        [
+            "ee.onyx.background.celery.tasks.tenant_provisioning",
+        ]
+    )
 )

View File

@@ -1,12 +1,15 @@
+from onyx.background.celery.apps import app_base
 from onyx.background.celery.apps.primary import celery_app

 celery_app.autodiscover_tasks(
-    [
-        "ee.onyx.background.celery.tasks.doc_permission_syncing",
-        "ee.onyx.background.celery.tasks.external_group_syncing",
-        "ee.onyx.background.celery.tasks.cloud",
-        "ee.onyx.background.celery.tasks.ttl_management",
-        "ee.onyx.background.celery.tasks.usage_reporting",
-    ]
+    app_base.filter_task_modules(
+        [
+            "ee.onyx.background.celery.tasks.doc_permission_syncing",
+            "ee.onyx.background.celery.tasks.external_group_syncing",
+            "ee.onyx.background.celery.tasks.cloud",
+            "ee.onyx.background.celery.tasks.ttl_management",
+            "ee.onyx.background.celery.tasks.usage_reporting",
+        ]
+    )
 )

View File

@@ -25,6 +25,7 @@ from ee.onyx.db.connector_credential_pair import get_all_auto_sync_cc_pairs
 from ee.onyx.db.document import upsert_document_external_perms
 from ee.onyx.external_permissions.sync_params import get_source_perm_sync_config
 from onyx.access.models import DocExternalAccess
+from onyx.access.models import ElementExternalAccess
 from onyx.background.celery.apps.app_base import task_logger
 from onyx.background.celery.celery_redis import celery_find_task
 from onyx.background.celery.celery_redis import celery_get_queue_length
@@ -55,6 +56,9 @@ from onyx.db.enums import AccessType
 from onyx.db.enums import ConnectorCredentialPairStatus
 from onyx.db.enums import SyncStatus
 from onyx.db.enums import SyncType
+from onyx.db.hierarchy import (
+    update_hierarchy_node_permissions as db_update_hierarchy_node_permissions,
+)
 from onyx.db.models import ConnectorCredentialPair
 from onyx.db.permission_sync_attempt import complete_doc_permission_sync_attempt
 from onyx.db.permission_sync_attempt import create_doc_permission_sync_attempt
@@ -532,7 +536,9 @@ def connector_permission_sync_generator_task(
         )
         redis_connector.permissions.set_fence(new_payload)
-        callback = PermissionSyncCallback(redis_connector, lock, r)
+        callback = PermissionSyncCallback(
+            redis_connector, lock, r, timeout_seconds=JOB_TIMEOUT
+        )
         # pass in the capability to fetch all existing docs for the cc_pair
         # this can be used to determine documents that are "missing" and thus
@@ -572,6 +578,13 @@ def connector_permission_sync_generator_task(
         tasks_generated = 0
         docs_with_errors = 0
         for doc_external_access in document_external_accesses:
+            if callback.should_stop():
+                raise RuntimeError(
+                    f"Permission sync task timed out or stop signal detected: "
+                    f"cc_pair={cc_pair_id} "
+                    f"tasks_generated={tasks_generated}"
+                )
             result = redis_connector.permissions.update_db(
                 lock=lock,
                 new_permissions=[doc_external_access],
@@ -637,18 +650,25 @@
     ),
     stop=stop_after_delay(DOCUMENT_PERMISSIONS_UPDATE_STOP_AFTER),
 )
-def document_update_permissions(
+def element_update_permissions(
     tenant_id: str,
-    permissions: DocExternalAccess,
+    permissions: ElementExternalAccess,
     source_type_str: str,
     connector_id: int,
     credential_id: int,
 ) -> bool:
+    """Update permissions for a document or hierarchy node."""
     start = time.monotonic()
-    doc_id = permissions.doc_id
     external_access = permissions.external_access
+    # Determine element type and identifier for logging
+    if isinstance(permissions, DocExternalAccess):
+        element_id = permissions.doc_id
+        element_type = "doc"
+    else:
+        element_id = permissions.raw_node_id
+        element_type = "node"
     try:
         with get_session_with_tenant(tenant_id=tenant_id) as db_session:
             # Add the users to the DB if they don't exist
@@ -657,39 +677,57 @@ def document_update_permissions(
                 emails=list(external_access.external_user_emails),
                 continue_on_error=True,
             )
-            # Then upsert the document's external permissions
-            created_new_doc = upsert_document_external_perms(
-                db_session=db_session,
-                doc_id=doc_id,
-                external_access=external_access,
-                source_type=DocumentSource(source_type_str),
-            )
-            if created_new_doc:
-                # If a new document was created, we associate it with the cc_pair
-                upsert_document_by_connector_credential_pair(
-                    db_session=db_session,
-                    connector_id=connector_id,
-                    credential_id=credential_id,
-                    document_ids=[doc_id],
-                )
+            if isinstance(permissions, DocExternalAccess):
+                # Document permission update
+                created_new_doc = upsert_document_external_perms(
+                    db_session=db_session,
+                    doc_id=permissions.doc_id,
+                    external_access=external_access,
+                    source_type=DocumentSource(source_type_str),
+                )
+                if created_new_doc:
+                    # If a new document was created, we associate it with the cc_pair
+                    upsert_document_by_connector_credential_pair(
+                        db_session=db_session,
+                        connector_id=connector_id,
+                        credential_id=credential_id,
+                        document_ids=[permissions.doc_id],
+                    )
+            else:
+                # Hierarchy node permission update
+                db_update_hierarchy_node_permissions(
+                    db_session=db_session,
+                    raw_node_id=permissions.raw_node_id,
+                    source=DocumentSource(permissions.source),
+                    is_public=external_access.is_public,
+                    external_user_emails=(
+                        list(external_access.external_user_emails)
+                        if external_access.external_user_emails
+                        else None
+                    ),
+                    external_user_group_ids=(
+                        list(external_access.external_user_group_ids)
+                        if external_access.external_user_group_ids
+                        else None
+                    ),
+                )
         elapsed = time.monotonic() - start
         task_logger.info(
             f"connector_id={connector_id} "
-            f"doc={doc_id} "
+            f"{element_type}={element_id} "
            f"action=update_permissions "
             f"elapsed={elapsed:.2f}"
         )
     except Exception as e:
         task_logger.exception(
-            f"document_update_permissions exceptioned: "
-            f"connector_id={connector_id} doc_id={doc_id}"
+            f"element_update_permissions exceptioned: {element_type}={element_id}, {connector_id=} {credential_id=}"
         )
         raise e
     finally:
         task_logger.info(
-            f"document_update_permissions completed: connector_id={connector_id} doc={doc_id}"
+            f"element_update_permissions completed: {element_type}={element_id}, {connector_id=} {credential_id=}"
         )
     return True
@@ -903,6 +941,7 @@ class PermissionSyncCallback(IndexingHeartbeatInterface):
         redis_connector: RedisConnector,
         redis_lock: RedisLock,
         redis_client: Redis,
+        timeout_seconds: int | None = None,
     ):
         super().__init__()
         self.redis_connector: RedisConnector = redis_connector
@@ -915,14 +954,29 @@
         self.last_tag: str = "PermissionSyncCallback.__init__"
         self.last_lock_reacquire: datetime = datetime.now(timezone.utc)
         self.last_lock_monotonic = time.monotonic()
+        self.start_monotonic = time.monotonic()
+        self.timeout_seconds = timeout_seconds

     def should_stop(self) -> bool:
         if self.redis_connector.stop.fenced:
             return True
+        # Check if the task has exceeded its timeout
+        # NOTE: Celery's soft_time_limit does not work with thread pools,
+        # so we must enforce timeouts internally.
+        if self.timeout_seconds is not None:
+            elapsed = time.monotonic() - self.start_monotonic
+            if elapsed > self.timeout_seconds:
+                logger.warning(
+                    f"PermissionSyncCallback - task timeout exceeded: "
+                    f"elapsed={elapsed:.0f}s timeout={self.timeout_seconds}s "
+                    f"cc_pair={self.redis_connector.cc_pair_id}"
+                )
+                return True
         return False

-    def progress(self, tag: str, amount: int) -> None:
+    def progress(self, tag: str, amount: int) -> None:  # noqa: ARG002
         try:
             self.redis_connector.permissions.set_active()
@@ -953,7 +1007,7 @@ class PermissionSyncCallback(IndexingHeartbeatInterface):
 def monitor_ccpair_permissions_taskset(
-    tenant_id: str, key_bytes: bytes, r: Redis, db_session: Session
+    tenant_id: str, key_bytes: bytes, r: Redis, db_session: Session  # noqa: ARG001
 ) -> None:
     fence_key = key_bytes.decode("utf-8")
     cc_pair_id_str = RedisConnector.get_id_from_fence_key(fence_key)
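
The timeout handling added above generalizes to a small pattern: because Celery's soft time limits are not enforced for thread-based worker pools, long-running loops poll a monotonic deadline themselves. A minimal standalone sketch (illustrative names only, not the code from this diff):

import time


class Deadline:
    def __init__(self, timeout_seconds: float) -> None:
        self.start = time.monotonic()
        self.timeout_seconds = timeout_seconds

    def expired(self) -> bool:
        return time.monotonic() - self.start > self.timeout_seconds


def process_items(items: list[str], timeout_seconds: float = 60.0) -> None:
    deadline = Deadline(timeout_seconds)
    for item in items:
        # Poll the deadline once per unit of work, mirroring the
        # callback.should_stop() check in the permission sync loop above
        if deadline.expired():
            raise RuntimeError(f"timed out after {timeout_seconds}s")
        # ... do the work for `item` here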

View File

@@ -259,7 +259,7 @@ def check_for_external_group_sync(self: Task, *, tenant_id: str) -> bool | None:
 def try_creating_external_group_sync_task(
     app: Celery,
     cc_pair_id: int,
-    r: Redis,
+    r: Redis,  # noqa: ARG001
     tenant_id: str,
 ) -> str | None:
     """Returns an int if syncing is needed. The int represents the number of sync tasks generated.
@@ -344,7 +344,7 @@ def try_creating_external_group_sync_task(
     bind=True,
 )
 def connector_external_group_sync_generator_task(
-    self: Task,
+    self: Task,  # noqa: ARG001
     cc_pair_id: int,
     tenant_id: str,
 ) -> None:
@@ -466,6 +466,7 @@ def connector_external_group_sync_generator_task(
 def _perform_external_group_sync(
     cc_pair_id: int,
     tenant_id: str,
+    timeout_seconds: int = JOB_TIMEOUT,
 ) -> None:
     # Create attempt record at the start
     with get_session_with_current_tenant() as db_session:
@@ -518,9 +519,23 @@ def _perform_external_group_sync(
     seen_users: set[str] = set()  # Track unique users across all groups
     total_groups_processed = 0
     total_group_memberships_synced = 0
+    start_time = time.monotonic()
     try:
         external_user_group_generator = ext_group_sync_func(tenant_id, cc_pair)
         for external_user_group in external_user_group_generator:
+            # Check if the task has exceeded its timeout
+            # NOTE: Celery's soft_time_limit does not work with thread pools,
+            # so we must enforce timeouts internally.
+            elapsed = time.monotonic() - start_time
+            if elapsed > timeout_seconds:
+                raise RuntimeError(
+                    f"External group sync task timed out: "
+                    f"cc_pair={cc_pair_id} "
+                    f"elapsed={elapsed:.0f}s "
+                    f"timeout={timeout_seconds}s "
+                    f"groups_processed={total_groups_processed}"
+                )
             external_user_group_batch.append(external_user_group)

             # Track progress
@@ -590,8 +605,8 @@ def _perform_external_group_sync(
 def validate_external_group_sync_fences(
     tenant_id: str,
-    celery_app: Celery,
-    r: Redis,
+    celery_app: Celery,  # noqa: ARG001
+    r: Redis,  # noqa: ARG001
     r_replica: Redis,
     r_celery: Redis,
     lock_beat: RedisLock,

View File

@@ -40,7 +40,7 @@ def export_query_history_task(
     end: datetime,
     start_time: datetime,
     # Need to include the tenant_id since the TenantAwareTask needs this
-    tenant_id: str,
+    tenant_id: str,  # noqa: ARG001
 ) -> None:
     if not self.request.id:
         raise RuntimeError("No task id defined for this task; cannot identify it")

View File

@@ -43,7 +43,7 @@ _TENANT_PROVISIONING_TIME_LIMIT = 60 * 10  # 10 minutes
     trail=False,
     bind=True,
 )
-def check_available_tenants(self: Task) -> None:
+def check_available_tenants(self: Task) -> None:  # noqa: ARG001
     """
     Check if we have enough pre-provisioned tenants available.
     If not, trigger the pre-provisioning of new tenants.

View File

@@ -21,9 +21,9 @@ logger = setup_logger()
     trail=False,
 )
 def generate_usage_report_task(
-    self: Task,
+    self: Task,  # noqa: ARG001
     *,
-    tenant_id: str,
+    tenant_id: str,  # noqa: ARG001
     user_id: str | None = None,
     period_from: str | None = None,
     period_to: str | None = None,

View File

@@ -7,7 +7,7 @@ QUERY_HISTORY_TASK_NAME_PREFIX = OnyxCeleryTask.EXPORT_QUERY_HISTORY_TASK
 def name_chat_ttl_task(
-    retention_limit_days: float, tenant_id: str | None = None
+    retention_limit_days: float, tenant_id: str | None = None  # noqa: ARG001
 ) -> str:
     return f"chat_ttl_{retention_limit_days}_days"

View File

@@ -122,6 +122,9 @@ SUPER_CLOUD_API_KEY = os.environ.get("SUPER_CLOUD_API_KEY", "api_key")
 # when the capture is called. These defaults prevent Posthog issues from breaking the Onyx app
 POSTHOG_API_KEY = os.environ.get("POSTHOG_API_KEY") or "FooBar"
 POSTHOG_HOST = os.environ.get("POSTHOG_HOST") or "https://us.i.posthog.com"
+POSTHOG_DEBUG_LOGS_ENABLED = (
+    os.environ.get("POSTHOG_DEBUG_LOGS_ENABLED", "").lower() == "true"
+)
 MARKETING_POSTHOG_API_KEY = os.environ.get("MARKETING_POSTHOG_API_KEY")
@@ -131,5 +134,11 @@ GATED_TENANTS_KEY = "gated_tenants"
 # License enforcement - when True, blocks API access for gated/expired licenses
 LICENSE_ENFORCEMENT_ENABLED = (
-    os.environ.get("LICENSE_ENFORCEMENT_ENABLED", "").lower() == "true"
+    os.environ.get("LICENSE_ENFORCEMENT_ENABLED", "true").lower() == "true"
 )
+
+# Cloud data plane URL - self-hosted instances call this to reach cloud proxy endpoints
+# Used when MULTI_TENANT=false (self-hosted mode)
+CLOUD_DATA_PLANE_URL = os.environ.get(
+    "CLOUD_DATA_PLANE_URL", "https://cloud.onyx.app/api"
+)
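
Note the semantic flip in the LICENSE_ENFORCEMENT_ENABLED default: with a fallback of "" the flag was opt-in, with "true" it is opt-out, so an unset environment variable now enables enforcement. A tiny illustration of the parsing (demo variable names only):

import os


def env_flag(name: str, default: str) -> bool:
    return os.environ.get(name, default).lower() == "true"


# Old behavior: unset env var -> False (enforcement off unless enabled)
# New behavior: unset env var -> True (enforcement on unless disabled)
print(env_flag("DEMO_FLAG_OLD", ""), env_flag("DEMO_FLAG_NEW", "true"))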

View File

@@ -0,0 +1,73 @@
"""Constants for license enforcement.
This file is the single source of truth for:
1. Paths that bypass license enforcement (always accessible)
2. Paths that require an EE license (EE-only features)
Import these constants in both production code and tests to ensure consistency.
"""
# Paths that are ALWAYS accessible, even when license is expired/gated.
# These enable users to:
# /auth - Log in/out (users can't fix billing if locked out of auth)
# /license - Fetch, upload, or check license status
# /health - Health checks for load balancers/orchestrators
# /me - Basic user info needed for UI rendering
# /settings, /enterprise-settings - View app status and branding
# /billing - Unified billing API
# /proxy - Self-hosted proxy endpoints (have own license-based auth)
# /tenants/billing-* - Legacy billing endpoints (backwards compatibility)
# /manage/users, /users - User management (needed for seat limit resolution)
# /notifications - Needed for UI to load properly
LICENSE_ENFORCEMENT_ALLOWED_PREFIXES: frozenset[str] = frozenset(
{
"/auth",
"/license",
"/health",
"/me",
"/settings",
"/enterprise-settings",
# Billing endpoints (unified API for both MT and self-hosted)
"/billing",
"/admin/billing",
# Proxy endpoints for self-hosted billing (no tenant context)
"/proxy",
# Legacy tenant billing endpoints (kept for backwards compatibility)
"/tenants/billing-information",
"/tenants/create-customer-portal-session",
"/tenants/create-subscription-session",
# User management - needed to remove users when seat limit exceeded
"/manage/users",
"/manage/admin/users",
"/manage/admin/valid-domains",
"/manage/admin/deactivate-user",
"/manage/admin/delete-user",
"/users",
# Notifications - needed for UI to load properly
"/notifications",
}
)
# EE-only paths that require a valid license.
# Users without a license (community edition) cannot access these.
# These are blocked even when user has never subscribed (no license).
EE_ONLY_PATH_PREFIXES: frozenset[str] = frozenset(
{
# User groups and access control
"/manage/admin/user-group",
# Analytics and reporting
"/analytics",
# Query history (admin chat session endpoints)
"/admin/chat-sessions",
"/admin/chat-session-history",
"/admin/query-history",
# Usage reporting/export
"/admin/usage-report",
# Standard answers (canned responses)
"/manage/admin/standard-answer",
# Token rate limits
"/admin/token-rate-limits",
# Evals
"/evals",
}
)
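
A hypothetical sketch (not the shipped middleware, which this diff does not include) of how these prefix sets might gate a request; has_valid_license and is_gated are assumed inputs, and the sets are abbreviated here for self-containment:

LICENSE_ENFORCEMENT_ALLOWED_PREFIXES = frozenset({"/auth", "/license", "/health"})
EE_ONLY_PATH_PREFIXES = frozenset({"/analytics", "/evals"})


def is_request_allowed(path: str, has_valid_license: bool, is_gated: bool) -> bool:
    # Always-allowed paths keep auth, billing, and health reachable even
    # when the license is expired or gated
    if any(path.startswith(prefix) for prefix in LICENSE_ENFORCEMENT_ALLOWED_PREFIXES):
        return True
    # EE-only features require a valid license regardless of gating
    if any(path.startswith(prefix) for prefix in EE_ONLY_PATH_PREFIXES):
        return has_valid_license
    # Everything else is blocked only while the deployment is gated
    return not is_gated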

View File

@@ -334,11 +334,9 @@ def fetch_assistant_unique_users_total(
 # Users can view assistant stats if they created the persona,
 # or if they are an admin
 def user_can_view_assistant_stats(
-    db_session: Session, user: User | None, assistant_id: int
+    db_session: Session, user: User, assistant_id: int
 ) -> bool:
-    # If user is None and auth is disabled, assume the user is an admin
-    if user is None or user.role == UserRole.ADMIN:
+    if user.role == UserRole.ADMIN:
         return True

     # Check if the user created the persona

View File

@@ -54,7 +54,7 @@ def delete_document_set_privacy__no_commit(
 def fetch_document_sets(
     user_id: UUID | None,
     db_session: Session,
-    include_outdated: bool = True,  # Parameter only for versioned implementation, unused
+    include_outdated: bool = True,  # Parameter only for versioned implementation, unused  # noqa: ARG001
 ) -> list[tuple[DocumentSet, list[ConnectorCredentialPair]]]:
     assert user_id is not None

Some files were not shown because too many files have changed in this diff.