Compare commits

...

281 Commits

Author SHA1 Message Date
Justin Tahara
f1c30974f5 fix(celery): Guardrail for User File Processing (#8633) 2026-03-01 09:22:43 -08:00
Jamison Lahman
81bf07fb15 chore(devtools): upgrade ods: v0.6.1->v0.6.2 (#8773) 2026-02-26 16:20:13 -08:00
Jamison Lahman
b565bf8291 chore(mypy): fix mypy cache issues switching between HEAD and release (#7732) 2026-01-27 15:52:40 -08:00
Jamison Lahman
b4da99cbdd fix(citations): enable citation sidebar w/ web_search-only assistants (#7888) 2026-01-27 13:36:44 -08:00
Justin Tahara
f910feea0f fix(llm): Hide private models from Agent Creation (#7873) 2026-01-27 12:20:56 -08:00
Justin Tahara
e3af8c6c8a feat(desktop): Domain Configuration (#7655) 2026-01-26 16:42:58 -08:00
Justin Tahara
d6e46ed792 feat(desktop): Properly Sign Mac App (#7608) 2026-01-26 16:42:47 -08:00
Jamison Lahman
4ce1f4ecdd chore(desktop): make artifact filename version-agnostic (#7679) 2026-01-26 16:24:06 -08:00
Jamison Lahman
a4678884d7 chore(deployments): fix region (#7640) 2026-01-26 16:24:06 -08:00
Jamison Lahman
c861ba68f1 chore(deployments): fetch secrets from AWS (#7584) 2026-01-26 16:24:06 -08:00
Raunak Bhagat
b1d0e0bb0b Fix actions-steps collapsing/opening issue 2026-01-25 12:49:32 -08:00
Raunak Bhagat
0d78bf52e3 Stop header from collapsing over and over again 2026-01-25 12:49:32 -08:00
Yuhong Sun
bd743282e6 fix: LiteLLM Azure models don't stream (#7761) 2026-01-25 12:47:48 -08:00
Raunak Bhagat
d44d1d92b3 2.9 fixes (#7756) 2026-01-24 17:36:20 -08:00
Raunak Bhagat
4cedcfee59 Fix notifications popover some more 2026-01-24 17:30:45 -08:00
Raunak Bhagat
90a721a76e Fix line-items 2026-01-24 17:30:45 -08:00
Raunak Bhagat
3ccd99e931 Fix notifications 2026-01-24 17:30:45 -08:00
Raunak Bhagat
9076bf603f Fix actions popover 2026-01-24 17:30:45 -08:00
Nikolas Garza
8c6e0a70c3 fix(chat): prevent streaming text from appearing in bursts after citations (#7745) 2026-01-24 16:58:12 -08:00
Yuhong Sun
bebe9555d4 fix: Azure OpenAI Tool Calls (#7727) 2026-01-24 16:55:27 -08:00
Nikolas Garza
c530722c9f fix(tests): use crawler-friendly search query in Exa integration test (#7746) 2026-01-24 16:53:40 -08:00
Jamison Lahman
68380b4ddb chore(fe): align assistant icon with chat bar (#7537) 2026-01-24 16:34:57 -08:00
Jamison Lahman
b3380746ab fix(fe): chat header is sticky and transparent (#7487) 2026-01-24 16:34:57 -08:00
Nikolas Garza
56be114c87 fix(fe): show scroll-down button when user scrolls up during streaming (#7562) 2026-01-24 16:34:57 -08:00
Nikolas Garza
54f467da5c fix: improve scroll behavior (#7364) 2026-01-24 16:34:57 -08:00
Nikolas Garza
8726b112fe fix(slack): Extract person names and filter garbage in query expansion (#7632) 2026-01-23 22:59:23 -08:00
Raunak Bhagat
92181d07b2 fix: Fix scrollability issues for modals (#7718) 2026-01-23 22:05:53 -08:00
Raunak Bhagat
3a73f7fab2 fix: Fix layout issues with AgentEditorPage (#7730) 2026-01-23 20:29:21 -08:00
Raunak Bhagat
7dabaca7cd fix: Add back agent sharing (#7731) 2026-01-23 19:13:36 -08:00
Raunak Bhagat
dec4748825 Close modal on success only 2026-01-23 17:39:52 -08:00
Raunak Bhagat
072836cd86 Cherry-pick agent-deletion 2026-01-23 17:39:52 -08:00
Evan Lohn
2705b5fb0e Revert "fix: modal header in index attempt errors (#7601)"
This reverts commit f945ab6b05.
2026-01-23 15:02:41 -08:00
Evan Lohn
37dcde4226 fix: prevent updates from overwriting perm syncing (#7384) 2026-01-23 14:52:44 -08:00
Evan Lohn
a765b5f622 fix(mcp): per-user auth (#7400) 2026-01-23 14:51:56 -08:00
Evan Lohn
5e093368d1 fix: bedrock non-anthropic prompt caching (#7435) 2026-01-23 14:50:13 -08:00
Evan Lohn
f945ab6b05 fix: modal header in index attempt errors (#7601) 2026-01-23 14:48:29 -08:00
Justin Tahara
11b7a22404 fix(ui): Coda Logo (#7656) 2026-01-23 14:45:29 -08:00
Justin Tahara
8e34f944cc fix(ui): First Connector Result (#7657) 2026-01-23 14:45:18 -08:00
Jamison Lahman
32606dc752 revert: "feat: Enable triple click on content in the chat" (#7393) to release v2.9 (#7710) 2026-01-23 14:21:22 -08:00
Jamison Lahman
1f6c4b40bf fix(fe): inline code text wraps (#7574) to release v2.9 (#7707) 2026-01-23 13:40:28 -08:00
Nikolas Garza
1943f1c745 feat(billing): add annual pricing support to subscription checkout (#7506) 2026-01-23 10:40:16 -08:00
Jamison Lahman
82460729a6 fix(db): ensure migrations are atomic (#7474) to release v2.9 (#7648) 2026-01-21 14:58:04 -08:00
Wenxi
c445e6a8c0 fix: delete old notifications first in migration (#7454) 2026-01-20 08:31:00 -08:00
SubashMohan
8d30a03d7f fix(chat): prevent adding chat sessions to recents that belong to a project (#7377) 2026-01-13 17:57:29 +00:00
Raunak Bhagat
277428f579 refactor: consolidate tabs components into single Tabs.tsx (#7370) 2026-01-13 03:51:48 +00:00
acaprau
9f8c0d4237 feat(opensearch): Even more feature parity, more strict tenant ID checks, OpenSearch client test improvements (#7372)
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-01-13 03:39:02 +00:00
Jessica Singh
9ccbb6a04b feat(web search): exa crawler (#7326) 2026-01-13 01:42:16 +00:00
Danelegend
58a943f782 fix(tools): Tool name should align with what llm knows (#7352) 2026-01-13 01:04:20 +00:00
roshan
9021c607f2 chore(dr): finer grained tracing for clarification step, research plan step, and orchestration step (#7374)
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-01-12 23:58:27 +00:00
Jamison Lahman
c03b0d80fd chore(deps): remove requires-python < 3.13 (#7367) 2026-01-12 23:21:02 +00:00
acaprau
fcf0b316a4 feat(opensearch): More feature parity (#7286) 2026-01-12 23:01:55 +00:00
Jamison Lahman
157f672b4b chore(deps): upgrade numpy, unstructured, unstructured-client (#7369) 2026-01-12 22:58:11 +00:00
dependabot[bot]
51b9484b96 chore(deps): bump actions/upload-artifact from 5.0.0 to 6.0.0 (#6964)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-12 21:53:48 +00:00
Danelegend
0c8f55c049 fix(tools): persist enabled tools in ui (#7347) 2026-01-12 21:47:29 +00:00
dependabot[bot]
c7be2571d1 chore(deps): bump tauri-apps/tauri-action from 0.6.0 to 0.6.1 (#7371)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-12 13:48:46 -08:00
dependabot[bot]
4948b6cca9 chore(deps): bump actions/stale from 10.1.0 to 10.1.1 (#6965)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-12 13:12:24 -08:00
Jamison Lahman
638ea5f316 chore(deps): fix uv-lock hook (#7368) 2026-01-12 12:52:17 -08:00
dependabot[bot]
6e3268ca75 chore(deps): bump pypdf from 6.1.3 to 6.6.0 (#7319)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-12 20:36:47 +00:00
Wenxi
d8921df60c fix: onboarding modal styling (#7363) 2026-01-12 20:29:23 +00:00
Yuhong Sun
693d9f5f69 fix: Editing First Message (#7366) 2026-01-12 19:45:01 +00:00
Jamison Lahman
02e17871cc chore(devtools): recommend starting dev dockers with --wait (#7365) 2026-01-12 19:13:00 +00:00
Wenxi
209cfd00b0 fix: only show latest release notification for nightly versions (#7362) 2026-01-12 11:10:28 -08:00
Jessica Singh
cd36baa484 fix(web search): removing site: operator from exa query (#7248) 2026-01-12 18:22:18 +00:00
Raunak Bhagat
c78fe275af refactor: Popover cleanup (#7356) 2026-01-12 12:08:30 +00:00
Raunak Bhagat
c935c4808f fix: More actions cards fixes (#7358) 2026-01-12 03:27:42 -08:00
Raunak Bhagat
4ebcfef541 fix: Fix actions cards (#7357) 2026-01-12 10:57:22 +00:00
SubashMohan
e320ef9d9c Fix/agent creation files (#7346) 2026-01-12 07:00:47 +00:00
Nikolas Garza
9e02438af5 chore: standardize password/secret inputs and update per design docs (#7316) 2026-01-12 06:26:09 +00:00
Danelegend
177e097ddb fix(chat): newly created chats being marked as failed (#7310)
Co-authored-by: Dane Urban <durban@Danes-MacBook-Pro.local>
2026-01-12 02:02:49 +00:00
Wenxi
9ecd47ec31 feat: in app notifications for changelog (#7253) 2026-01-12 01:09:04 +00:00
Nikolas Garza
83f3d29b10 fix: stop federated OAuth modal from appearing permanently after skips (#7351) 2026-01-11 22:20:13 +00:00
Yuhong Sun
12e668cc0f feat: Deep Research Replay (#7340) 2026-01-11 22:17:09 +00:00
SubashMohan
afe8376d5e feat: Exclude image generation providers from LLM fetch in API calls (#7348) 2026-01-11 21:13:25 +00:00
Wenxi
082ef3e096 fix: always start onboarding at first step and track by user (#7315) 2026-01-11 21:03:17 +00:00
Nikolas Garza
cb2951a1c0 perf: switch BeautifulSoup parser from html.parser to lxml for web crawler (#7350) 2026-01-11 20:46:35 +00:00
Corey Auger
eda5598af5 fix: update docs link (#7349)
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-01-11 12:44:48 -08:00
Justin Tahara
0bbb4b6988 fix(ui): Action Strikethrough when not configured (#7273) 2026-01-11 11:21:17 +00:00
Jamison Lahman
4768aadb20 refactor(fe): WelcomeMessage nits (#7344) 2026-01-10 22:01:48 -08:00
Jamison Lahman
e05e85e782 fix(fe): "Pick a date range" button wrapping (#7343) 2026-01-10 21:22:20 -08:00
Jamison Lahman
6408f61307 fix(fe): avoid internal table scroll on query history page (#7342) 2026-01-10 20:39:17 -08:00
Jamison Lahman
5a5cd51e4f fix(fe): SidebarTabs are Links (#7341) 2026-01-10 20:01:31 -08:00
Danelegend
7c047c47a0 fix(chat): Chat in-progress messages (#7318)
Co-authored-by: Dane Urban <durban@Danes-MacBook-Pro.local>
2026-01-11 00:29:39 +00:00
Evan Lohn
22138bbb33 fix: vertex prompt caching (#7339)
Co-authored-by: Weves <chrisweaver101@gmail.com>
2026-01-11 00:23:39 +00:00
Chris Weaver
7cff1064a8 chore: reenable auto update test (#7146) 2026-01-10 16:00:48 -08:00
Wenxi
deeb6fdcd2 fix: anonymous users cookie and admin panel config (#7321) 2026-01-10 15:12:27 -08:00
Chris Weaver
3e7f4e0aa5 fix: auto-sync (#7337) 2026-01-10 13:43:40 -08:00
Raunak Bhagat
ac73671e35 refactor: Components updates (#7308) 2026-01-10 06:30:39 +00:00
Raunak Bhagat
3c20d132e0 feat: Modal updates (#7306) 2026-01-10 05:13:09 +00:00
Yuhong Sun
0e3e7eb4a2 feat: Create new chat session button after msg send (#7332)
Co-authored-by: Raunak Bhagat <r@rabh.io>
2026-01-10 04:56:54 +00:00
Yuhong Sun
c85aebe8ab Tables (#7333) 2026-01-09 20:40:15 -08:00
Yuhong Sun
a47e6a3146 feat: Enable triple click on content in the chat (#7331)
Co-authored-by: Raunak Bhagat <r@rabh.io>
2026-01-09 20:37:36 -08:00
Jamison Lahman
1e61737e03 fix(fe): Tags have consistent height on hover (#7328) 2026-01-09 20:20:36 -08:00
Wenxi
c7fc1cd5ae chore: allow tenant cleanup script to skip control plane if tenant not found (#7290) 2026-01-10 00:17:26 +00:00
roshan
e2b60bf67c feat(posthog): track message origin analytics in posthog (#7313) 2026-01-10 00:11:17 +00:00
Danelegend
f4d4d14286 fix(chat): post llm loop callback (#7309)
Co-authored-by: Dane Urban <durban@Danes-MacBook-Pro.local>
2026-01-09 23:53:22 +00:00
Yuhong Sun
1c24bc6ea2 Opensearch README (#7327) 2026-01-09 15:53:22 -08:00
Yuhong Sun
cacbd18dcd feat: Opensearch README (#7325) 2026-01-09 15:28:08 -08:00
Nikolas Garza
8527b83b15 fix(sidebar): Allow unpinning all agents and fix icon flicker (#7241) 2026-01-09 14:20:46 -08:00
Nikolas Garza
33e37a1846 fix: make autocomplete opt in (#7317) 2026-01-09 20:04:22 +00:00
Jamison Lahman
d454d8a878 fix(chat): wide tables can be scrolled (#7311) 2026-01-09 19:07:40 +00:00
roshan
00ad65a6a8 feat: chrome extension (#6704) 2026-01-09 18:45:23 +00:00
Nikolas Garza
dac60d403c fix(chat): show "User has stopped generation" indicator when user cancels (#7312) 2026-01-09 18:14:35 +00:00
Evan Lohn
6256b2854d chore: bump indexing usage (#7307) 2026-01-09 17:46:27 +00:00
Danelegend
8acb8e191d fix(chat): use url when name unknown (#7278)
Co-authored-by: Dane Urban <durban@Danes-MacBook-Pro.local>
2026-01-09 17:16:20 +00:00
Evan Lohn
8c4cbddc43 fix: minor perm sync improvements (#7296) 2026-01-09 05:46:23 +00:00
Yuhong Sun
f6cd006bd6 chore: Refactor tool exceptions (#7280) 2026-01-09 04:01:12 +00:00
Jamison Lahman
0033934319 chore(perf): remove isEqual memoization check (#7304)
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-01-09 03:20:37 +00:00
Raunak Bhagat
ff87b79d14 fix: Section layout component fix (#7305) 2026-01-08 19:25:33 -08:00
Raunak Bhagat
ebf18af7c9 refactor: UI components cleanup (#7301)
Co-authored-by: Nikolas Garza <90273783+nmgarza5@users.noreply.github.com>
2026-01-09 03:09:20 +00:00
Raunak Bhagat
cf67ae962c feat: Add a new GeneralLayouts file and update layout components (#7297)
Co-authored-by: Nikolas Garza <90273783+nmgarza5@users.noreply.github.com>
2026-01-09 02:50:21 +00:00
dependabot[bot]
7a9a132739 chore(deps): bump werkzeug from 3.1.4 to 3.1.5 in /backend/requirements (#7300)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-09 00:08:17 +00:00
dependabot[bot]
33bad8c37b chore(deps): bump authlib from 1.6.5 to 1.6.6 in /backend/requirements (#7299)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-08 23:28:19 +00:00
Raunak Bhagat
9241ff7a75 refactor: migrate hooks to /hooks directory and update imports (#7295) 2026-01-08 14:57:06 -08:00
Chris Weaver
0a25bc30ec fix: auto-pause (#7289) 2026-01-08 14:45:30 -08:00
Raunak Bhagat
e359732f4c feat: add SvgEmpty icon and alphabetize icon exports (#7294) 2026-01-08 21:40:55 +00:00
Evan Lohn
be47866a4d chore: logging confluence perm sync errors better (#7291) 2026-01-08 20:24:03 +00:00
Wenxi
8a20540559 fix: use tag constraint name instead of index elements (#7288) 2026-01-08 18:52:12 +00:00
Jamison Lahman
e6e1f2860a chore(fe): remove items-center from onboarding cards (#7285) 2026-01-08 18:28:36 +00:00
Evan Lohn
fc3f433df7 fix: usage limits for indexing (#7287) 2026-01-08 18:26:52 +00:00
Evan Lohn
016caf453b fix: indexing and usage bugs (#7279) 2026-01-08 17:08:20 +00:00
Jamison Lahman
a9de25053f refactor(fe): remove "container" divs (#7271) 2026-01-08 07:23:51 +00:00
SubashMohan
8ef8dfdeb7 Cleanup/userfile indexing (#7221) 2026-01-08 05:07:19 +00:00
Danelegend
0643b626d9 fix(files): Display protected file errors (#7265)
Co-authored-by: Dane Urban <durban@Danes-MacBook-Pro.local>
2026-01-08 00:31:26 +00:00
Yuhong Sun
64a0eb52e0 chore: limit Deep Research to sequential calls only (#7275) 2026-01-08 00:03:09 +00:00
Evan Lohn
b82ffc82cf chore: upgrade client libs (#7249) 2026-01-07 23:59:57 +00:00
Danelegend
b3014b9911 fix(ui): deep research flag in chat edit (#7276)
Co-authored-by: Dane Urban <durban@Danes-MacBook-Pro.local>
2026-01-07 23:52:52 +00:00
Yuhong Sun
439707c395 chore: exa prompt fix (#7274) 2026-01-07 23:36:27 +00:00
dependabot[bot]
65351aa8bd chore(deps): bump marshmallow from 3.26.1 to 3.26.2 in /backend/requirements (#6970)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-07 23:33:59 +00:00
Wenxi
b44ee07eaf feat: improved backend driven notifications and new notification display (#7246) 2026-01-07 22:57:49 +00:00
Justin Tahara
065d391c08 fix(web crawler): Fixing decoding bytes issue (#7270) 2026-01-07 22:32:33 +00:00
dependabot[bot]
14fe3b375f chore(deps): bump urllib3 from 2.6.2 to 2.6.3 in /backend/requirements (#7272)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-07 21:47:53 +00:00
dependabot[bot]
bb1b96dded chore(deps): bump preact from 10.27.2 to 10.28.2 in /web (#7267)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-07 21:17:10 +00:00
Evan Lohn
9f949ae2d9 fix: custom llm provider prompt caching type safety (#7269) 2026-01-07 20:41:53 +00:00
acaprau
975c0e8009 feat(opensearch): Some low hanging fruit for Vespa <-> OpenSearch data parity (#7252) 2026-01-07 20:36:12 +00:00
Jamison Lahman
3dfb38c460 fix(fe): Failed indexing colors support dark theme (#7264) 2026-01-07 11:52:46 -08:00
Jamison Lahman
a1512a0485 fix(fe): fix InputComboBox shrinking when disabled (#7266) 2026-01-07 19:43:39 +00:00
roshan
8ea3bacd38 feat(evals): weekly eval runs (#7236) 2026-01-07 19:39:13 +00:00
Jamison Lahman
6b560b8162 fix(fe): admin containers apply bottom padding (#7263) 2026-01-07 18:34:53 +00:00
Jamison Lahman
3b750939ed fix(fe): move Text horizontal padding to pseudo-element (#7226)
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-01-07 18:14:33 +00:00
Yuhong Sun
bd4cb17a48 chore: agent pin behavior (#7261) 2026-01-07 18:11:33 +00:00
SubashMohan
485cd9a311 feat(projects): enhance FileCard component with className prop to fix width issue (#7259) 2026-01-07 18:04:59 +00:00
SubashMohan
2108c72353 feat(chat): add custom copy behavior for HumanMessage component (#7257) 2026-01-07 18:04:55 +00:00
Danelegend
98f43fb6ab fix(files): propagate file error from backend (#7245)
Co-authored-by: Dane Urban <durban@Danes-MacBook-Pro.local>
2026-01-07 17:43:15 +00:00
Danelegend
e112ebb371 chore: add msoffcrypto-tool (#7247)
Co-authored-by: Dane Urban <durban@Danes-MacBook-Pro.local>
2026-01-07 17:38:09 +00:00
Jamison Lahman
f88cbcfe27 revert: "chore(deployments): prefer release environment (#6997)" (#7260) 2026-01-07 07:06:56 -08:00
Wenxi
0df0b10d3a feat: add public tag for api reference docs (#7227) 2026-01-07 06:09:36 +00:00
Jamison Lahman
ed0d12452a chore(deployments): dont treat ad-hoc releases as dry-runs (#7256) 2026-01-06 21:57:51 -08:00
Wenxi
dc7cb80594 fix: don't pass tool_choice for mistral provider (#7255) 2026-01-07 05:42:59 +00:00
Yuhong Sun
4312b24945 feat: Fix last cycle LLM did not return an answer (#7254) 2026-01-07 05:41:44 +00:00
Justin Tahara
afd920bb33 fix(users): Multi-tenant signup (#7237) 2026-01-06 18:38:05 -08:00
Jamison Lahman
d009b12aa7 chore(gha): paths-filter depends on actions/checkout (#7244) 2026-01-06 17:11:45 -08:00
Jamison Lahman
596b3d9f3e chore(gha): skip all of zizmor when applicable (#7243) 2026-01-06 17:08:50 -08:00
Jamison Lahman
1981c912b7 chore(gha): conditionally run zizmor (#7240) 2026-01-06 16:18:33 -08:00
Jamison Lahman
68b1bb8448 chore(gha): pin uv version w/ chart-testing-action (#7239) 2026-01-06 16:03:37 -08:00
Jamison Lahman
4676b5017f chore(whitespace): format pr-helm-chart-testing.yml (#7238) 2026-01-06 16:01:02 -08:00
Danelegend
eb7b6a5ce1 fix(chat): enable exclusion of failed chat sessions from api (#7233)
Co-authored-by: Dane Urban <durban@Danes-MacBook-Pro.local>
2026-01-06 23:04:35 +00:00
Justin Tahara
87d6df2621 fix(user): Block Malicious Accounts (#7235) 2026-01-06 14:52:44 -08:00
Danelegend
13b4108b53 fix: serper api key errors when adding (#7217)
Co-authored-by: Dane Urban <durban@Danes-MacBook-Pro.local>
2026-01-06 22:42:07 +00:00
acaprau
13e806b625 feat(opensearch): Add OpenSearch document index interface (#7143) 2026-01-06 22:35:47 +00:00
Nikolas Garza
f4f7839d84 fix: sidebar button shifting on hover (#7234) 2026-01-06 21:54:39 +00:00
Jamison Lahman
2dbf1c3b1f chore(devtools): ods with no args outputs help (#7230) 2026-01-06 21:14:26 +00:00
dependabot[bot]
288d4147c3 chore(deps): bump pynacl from 1.6.1 to 1.6.2 in /backend/requirements (#7228)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-06 20:56:17 +00:00
dependabot[bot]
fee27b2274 chore(deps): bump aiohttp from 3.13.2 to 3.13.3 in /backend/requirements (#7216)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-06 20:53:40 +00:00
Justin Tahara
340e938627 fix(chat): Math Formatting (#7229) 2026-01-06 20:37:34 +00:00
roshan
6faa47e0f7 fix: json serialize tool calls and other types in prompt cache (#7225) 2026-01-06 19:16:19 +00:00
Evan Lohn
ba6801f5af chore: add logs to tenant overrides (#7215) 2026-01-06 18:46:38 +00:00
SubashMohan
d7447eb8af fix(projects): projct folder button not expandable (#7223) 2026-01-06 18:22:39 +00:00
SubashMohan
196f890a68 feat(image-generation): Add Azure OpenAI GPT image models (#7224) 2026-01-06 17:37:09 +00:00
Justin Tahara
3ac96572c3 fix(open_url): Parse PDF files with Open URL Tool (#7219) 2026-01-06 17:32:35 +00:00
SubashMohan
3d8ae22b3a seeds(config): image gen from llm providers (#7198) 2026-01-06 16:17:40 +00:00
SubashMohan
233d06ec0e feat(api): Enhance API key handling and masking in image generation (#7220) 2026-01-06 14:17:03 +05:30
Justin Tahara
9ff82ac740 fix(chat): Thinking in Regen Chat (#7213) 2026-01-06 04:13:11 +00:00
Justin Tahara
b15f01fd78 fix(ui): Image Gen Tooltip for Agent Workflow (#7211) 2026-01-06 02:57:08 +00:00
Nikolas Garza
6480cf6738 fix(fe): chat input box spacing and sizing fixes (#7204) 2026-01-06 01:38:08 +00:00
Justin Tahara
c521a4397a chore(llm): Remove Claude Opus 3 (#7214) 2026-01-06 01:34:58 +00:00
Evan Lohn
41a8d86df3 feat: prompt cache 3 (#6605) 2026-01-06 00:39:39 +00:00
roshan
735cf926e4 feat(evals): multi-turn evals (#7210) 2026-01-05 23:33:19 +00:00
Justin Tahara
035e73655f fix(ui): Update coloring for Doc Set Tooltip (#7208) 2026-01-05 22:12:42 +00:00
roshan
f317420f58 feat(evals): set log level for eval runs to warning (#7209) 2026-01-05 22:11:45 +00:00
Justin Tahara
d50a84f2e4 fix(ui): Remove Open URL Filter for Agents (#7205) 2026-01-05 21:16:30 +00:00
Justin Tahara
9b441e3686 feat(braintrust): Cost Tracking (#7201) 2026-01-05 20:44:53 +00:00
Justin Tahara
c4c1e16f19 fix(braintrust): Implement actual TTFA Metric (#7169) 2026-01-05 20:31:47 +00:00
Evan Lohn
9044e0f5fa feat: per-tenant usage limits (#7197) 2026-01-05 19:01:00 +00:00
Jamison Lahman
a180e1337b chore(fe): replace js isHovered with css hover effects (#7200) 2026-01-05 09:01:55 -08:00
Evan Lohn
6ca72291bc fix: llm usage tracking for dr (#7196) 2026-01-05 01:24:20 +00:00
Evan Lohn
c23046f7c0 chore: bump limits on cloud LLM usage (#7195) 2026-01-04 21:38:13 +00:00
Evan Lohn
d5f66ac146 feat: cloud usage limits (#7192) 2026-01-04 06:51:12 +00:00
Yuhong Sun
241fc8f877 feat: Deep Research Internal Search Tuning (#7193) 2026-01-03 22:54:23 -08:00
Jamison Lahman
f1ea41b519 chore(whitespace): ignore refactor rev (#7191) 2026-01-02 23:52:48 -08:00
Jamison Lahman
ed3f72bc75 refactor(whitespace): rm react fragment (#7190) 2026-01-02 23:49:39 -08:00
Jamison Lahman
2247e3cf8e chore(fe): rm unnecessary spacer from chat ui (#7189) 2026-01-02 23:42:54 -08:00
Jamison Lahman
47c49d86e8 chore(fe): improve human chat responsiveness (#7187) 2026-01-02 23:26:52 -08:00
Yuhong Sun
8c11330d46 feat: Easy send message nonstreaming (#7186) 2026-01-02 19:46:54 -08:00
Chris Weaver
22ac22c17d feat: improve display for models that are no longer present (#7184) 2026-01-03 02:39:06 +00:00
Yuhong Sun
c0a6a0fb4a feat: nonstreaming send chat message api (#7181) 2026-01-03 02:33:17 +00:00
Chris Weaver
7f31a39dc2 fix: regenerate models stuck in perma loading state (#7182)
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-03 02:18:34 +00:00
Yuhong Sun
f1f61690e3 chore: spacing (#7183) 2026-01-02 17:57:55 -08:00
Jamison Lahman
8c3e17bbe5 revert: "chore(pre-commit): run uv-sync in active venv" (#7178) 2026-01-03 01:16:01 +00:00
Yuhong Sun
a1ab3678a0 chore: Plugin issue (#7179) 2026-01-02 16:43:51 -08:00
Yuhong Sun
2d79ed7bb4 New send message api (#7167) 2026-01-02 23:57:54 +00:00
Justin Tahara
f472fd763e fix(braintrust): Span Attributes Association (#7174) 2026-01-02 15:20:10 -08:00
Jamison Lahman
e47b2fccb4 chore(playwright): fix Exa configure tests (#7176) 2026-01-02 15:10:54 -08:00
acaprau
17a6fc4ebf chore(opensearch): Add external dep tests for OpenSearchClient (#7155) 2026-01-02 22:28:46 +00:00
acaprau
391c8c5cf7 feat(opensearch): Add OpenSearch client (#7137)
flakey connector tests are failing for reasons unrelated to this pr. all other tests pass.
2026-01-02 14:11:14 -08:00
Jamison Lahman
d0e3ee1055 chore(deployments): prefer release environment (#6997) 2026-01-02 22:00:33 +00:00
Jamison Lahman
dc760cf580 chore(playwright): prefer baseURL (#7171) 2026-01-02 13:30:10 -08:00
Justin Tahara
d49931fce1 fix(braintrust): Fix Tenant ID to Token Association (#7173) 2026-01-02 13:10:34 -08:00
Jamison Lahman
41d1d265a0 chore(docker): .dockerignore /tests/ (#7172) 2026-01-02 20:19:52 +00:00
Chris Weaver
45a2207662 chore: cleanup old LLM provider update mechanism (#7170) 2026-01-02 20:14:27 +00:00
Justin Tahara
725ed6a523 fix(braintrust): Updating naming for metric (#7168) 2026-01-02 20:06:43 +00:00
acaprau
2452671420 feat(opensearch): Add OpenSearch queries (#7133) 2026-01-02 19:05:43 +00:00
Jamison Lahman
a4a767f146 fix(ollama): rm unsupported tool_choice option (#7156) 2026-01-02 18:55:57 +00:00
Wenxi
8304fbd14c fix: don't pass selected tab to connector specific config (#7165) 2026-01-02 18:19:33 +00:00
Jamison Lahman
7db7d4c965 chore(docker): publish inference_model_server port 9000 in dev (#7166) 2026-01-02 10:04:45 -08:00
SubashMohan
2cc2b5aee9 feat(image-generation): e2e tests (#7164) 2026-01-02 19:13:59 +05:30
SubashMohan
0c35ffe468 feat(config): Image generation frontend (#7019) 2026-01-02 11:36:57 +00:00
SubashMohan
adece3f812 Tests/theme (#7163)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 16:14:13 +05:30
Jamison Lahman
b44349e67d chore(blame): introduce .git-blame-ignore-revs to ignore refactors (#7162) 2026-01-01 22:23:34 -08:00
Jamison Lahman
3134e5f840 refactor(whitespace): rm temporary react fragments (#7161) 2026-01-01 22:10:31 -08:00
dependabot[bot]
5b8223b6af chore(deps): bump qs from 6.14.0 to 6.14.1 in /web (#7147)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
2026-01-02 05:05:00 +00:00
Jamison Lahman
30ab85f5a0 chore(fe): follow up styling fixes to #7129 (#7160) 2026-01-01 19:58:43 -08:00
Jamison Lahman
daa343c30b perf(chat): memoize chat messages (#7157)
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2026-01-01 19:10:18 -08:00
devin-ai-integration[bot]
c67936a4c1 fix: non-thinking responses not displaying until page refresh (#7123)
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: roshan@onyx.app <rohod04@gmail.com>
Co-authored-by: Wenxi <wenxi@onyx.app>
Co-authored-by: Chris <chris@onyx.app>
Co-authored-by: Jamison Lahman <jamison@lahman.dev>
Co-authored-by: Nikolas Garza <90273783+nmgarza5@users.noreply.github.com>
Co-authored-by: Yuhong Sun <yuhongsun96@gmail.com>
Co-authored-by: Raunak Bhagat <r@rabh.io>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: SubashMohan <subashmohan75@gmail.com>
Co-authored-by: Justin Tahara <105671973+justin-tahara@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: roshan <38771624+rohoswagger@users.noreply.github.com>
2026-01-01 21:15:55 +00:00
Jamison Lahman
4578c268ed perf(chat): consildate chat UI layout style (#7129) 2026-01-01 13:10:47 -08:00
roshan
7658917fe8 feat: running evals locally (#7145) 2026-01-01 18:39:08 +00:00
roshan
fd4695d5bd feat: add tool call validation to eval cli (#7144)
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2026-01-01 15:46:05 +00:00
devin-ai-integration[bot]
a25362a709 fix: check stop signal during active streaming (#7151)
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: roshan@onyx.app <rohod04@gmail.com>
2026-01-01 15:33:03 +00:00
SubashMohan
1eb4962861 refactor: White-labelling (#6938) 2026-01-01 09:55:58 +00:00
Nikolas Garza
aa1c956608 fix: Duplicate model provider sections for unenriched LLM models (#7148) 2026-01-01 03:03:40 +00:00
Chris Weaver
19e5c47f85 fix: when onboarding flow shows up (#7154) 2025-12-31 18:29:36 -08:00
Chris Weaver
872a2ed58a feat: add new models to cloud (#7149) 2026-01-01 01:50:26 +00:00
Jessica Singh
42047a4dce feat(tools): extend open_url to handle indexed content urls (#6822) 2026-01-01 01:31:28 +00:00
Chris Weaver
a3a9847d76 fix: onboarding display (#7153) 2025-12-31 17:19:00 -08:00
Yuhong Sun
3ade17c380 chore: fix linter issues (#7122) 2025-12-31 16:48:33 -08:00
Chris Weaver
9150ba1905 fix: skip failing tests (#7152) 2026-01-01 00:08:46 +00:00
Justin Tahara
cb14e84750 feat(connectors): Add Deletion Popup (#7054) 2025-12-31 22:12:57 +00:00
Chris Weaver
c916517342 feat: add auto LLM model updates from GitHub config (#6830)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2025-12-31 14:02:08 -08:00
Justin Tahara
45b902c950 fix(desktop): Disable reload on Mac (#7141) 2025-12-31 21:06:02 +00:00
Nikolas Garza
981b43e47b fix: prevent Slack federated search query multiplication (#7125) 2025-12-31 20:41:50 +00:00
Yuhong Sun
b5c45cbce0 Chat Flow Readme (#7142) 2025-12-31 11:15:48 -08:00
Yuhong Sun
451f10343e Update README.md (#7140) 2025-12-31 10:11:31 -08:00
SubashMohan
ceeed2a562 Feat/image config backend (#6961) 2025-12-31 11:39:32 +00:00
SubashMohan
bcc7a7f264 refactor(modals): All modals use new Modal component (#6729) 2025-12-31 07:54:08 +00:00
SubashMohan
972ef34b92 Fix/input combobox dropdown (#7015) 2025-12-31 13:01:03 +05:30
Raunak Bhagat
9d11d1f218 feat: Refreshed agent creation page (#6241)
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2025-12-31 05:09:07 +00:00
Chris Weaver
4db68853cd fix: openai provider identification on the admin panel (#7135) 2025-12-31 02:14:46 +00:00
Wenxi
b08fafc66b fix: make litellm testing script prettier (#7136) 2025-12-30 18:08:25 -08:00
Wenxi
1e61bf401e fix: lazy load tracing providers to avoid spamming logs when not configured (#7134) 2025-12-31 02:03:33 +00:00
Chris Weaver
0541c2989d fix: downgrade (#7132) 2025-12-31 01:45:41 +00:00
Yuhong Sun
743b996698 fix: Remove Default Reminder (#7131) 2025-12-31 00:55:16 +00:00
Chris Weaver
16e77aebfc refactor: onboarding forms (#7105) 2025-12-30 16:56:13 -08:00
Yuhong Sun
944f4a2464 fix: reenable force search parameter (#7130) 2025-12-31 00:27:17 +00:00
Nikolas Garza
67db7c0346 fix: suppress Jest act() warning spam in test output (#7127) 2025-12-30 22:32:15 +00:00
Jamison Lahman
8e47cd4e4f chore(fe): baseline align inline code spans (#7128) 2025-12-30 22:21:59 +00:00
devin-ai-integration[bot]
e8a4fca0a3 fix: persist onboarding flow until user explicitly finishes (#7111)
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Chris <chris@onyx.app>
2025-12-30 21:42:04 +00:00
Wenxi
6d783ca691 fix: gemini default location global (#7124) 2025-12-30 21:15:57 +00:00
Yuhong Sun
283317bd65 chore: prompts (#7108) 2025-12-30 12:22:21 -08:00
acaprau
2afbc74224 feat: Add OpenSearch schema (#7118) 2025-12-30 19:55:34 +00:00
acaprau
5b273de8be chore: Add script to restart OpenSearch container (#7110) 2025-12-30 19:48:30 +00:00
roshan
a0a24147b5 fix: stop-generation for deep research (#7050)
Co-authored-by: Raunak Bhagat <r@rabh.io>
Co-authored-by: acaprau <48705707+acaprau@users.noreply.github.com>
Co-authored-by: Justin Tahara <105671973+justin-tahara@users.noreply.github.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2025-12-30 19:17:28 +00:00
roshan
fd31da3159 chore: clean up stop signal redis fence (#7119) 2025-12-30 18:55:21 +00:00
Yuhong Sun
cd76ac876b fix: MIT integration tests (#7121) 2025-12-30 10:51:36 -08:00
Jamison Lahman
8f205172eb chore(gha): ensure uv cache is pruned before upload (#7120) 2025-12-30 10:50:08 -08:00
roshan
be70fa21e3 fix: stop-generation for non-deep research (#7045)
Co-authored-by: Raunak Bhagat <r@rabh.io>
Co-authored-by: acaprau <48705707+acaprau@users.noreply.github.com>
Co-authored-by: Justin Tahara <105671973+justin-tahara@users.noreply.github.com>
2025-12-30 18:41:20 +00:00
roshan
0687bddb6f fix: popover max height setting (#7093)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
2025-12-30 18:40:54 +00:00
roshan
73091118e3 fix: rendering parallel research agents cleanly (#7078) 2025-12-30 18:40:45 +00:00
Wenxi
bf8590a637 feat: add z indices for confirmation modal (#7114) 2025-12-30 18:40:16 +00:00
Chris Weaver
8a6d597496 perf: update web/STANDARDS.md + add standards to CLAUDE.md / AGENTS.md (#7039)
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2025-12-30 09:36:58 -08:00
Jamison Lahman
f0bc538f60 chore(fe): fix some Text that should be spans (#7112) 2025-12-30 08:06:15 -08:00
Jamison Lahman
0b6d9347bb fix(ux): Share Chat modal uses CopyIconButton (#7116) 2025-12-30 08:05:02 -08:00
Raunak Bhagat
415538f9f8 refactor: Improve form field components (#7104) 2025-12-29 23:26:56 -08:00
Jamison Lahman
969261f314 chore(desktop): disable nightly builds (#7115) 2025-12-29 22:42:39 -08:00
Jamison Lahman
eaa4d5d434 chore(desktop): remove duplicate startup log, onyx-desktop (#7113) 2025-12-29 19:58:25 -08:00
acaprau
19e6900d96 chore: Add opensearch-py 3.0.0 (#7103) 2025-12-30 03:50:22 +00:00
Jamison Lahman
f3535b94a0 chore(docker): add healthchecks (#7089)
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2025-12-29 19:29:16 -08:00
Jamison Lahman
383aa222ba chore(fe): refresh chat Stack Trace button (#7092) 2025-12-29 18:29:58 -08:00
Yuhong Sun
f32b21400f chore: Fix Tests (#7107) 2025-12-29 17:24:40 -08:00
Jamison Lahman
5d5e71900e chore(fe): Text default span follow up (#7106) 2025-12-29 17:22:09 -08:00
Yuhong Sun
06ce7484b3 chore: docker compose no MCP server (#7100) 2025-12-29 16:40:15 -08:00
Jamison Lahman
700db01b33 chore(fe): make Text component default to span (#7096) 2025-12-29 16:30:09 -08:00
acaprau
521e9f108f fix: The update method for the new Vespa interface should correctly handle None chunk_count (#7098) 2025-12-30 00:23:37 +00:00
813 changed files with 49290 additions and 19143 deletions

.git-blame-ignore-revs (new file, 8 lines added)

@@ -0,0 +1,8 @@
# Exclude these commits from git blame (e.g. mass reformatting).
# These are ignored by GitHub automatically.
# To enable this locally, run:
#
# git config blame.ignoreRevsFile .git-blame-ignore-revs
3134e5f840c12c8f32613ce520101a047c89dcc2 # refactor(whitespace): rm temporary react fragments (#7161)
ed3f72bc75f3e3a9ae9e4d8cd38278f9c97e78b4 # refactor(whitespace): rm react fragment #7190
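
For local use, a minimal sketch of wiring this file into git blame (assuming a standard clone at the repository root; the blamed path below is only a placeholder):

# One-time, per clone: tell git blame to skip the commits listed in the file
git config blame.ignoreRevsFile .git-blame-ignore-revs

# Or pass the file ad hoc for a single invocation
git blame --ignore-revs-file .git-blame-ignore-revs <path/to/file>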

.github/CODEOWNERS (vendored, 7 lines added)

@@ -1,3 +1,10 @@
* @onyx-dot-app/onyx-core-team
# Helm charts Owners
/helm/ @justin-tahara
# Web standards updates
/web/STANDARDS.md @raunakab @Weves
# Agent context files
/CLAUDE.md.template @Weves
/AGENTS.md.template @Weves

View File

@@ -7,14 +7,6 @@ inputs:
runs:
using: "composite"
steps:
- name: Setup uv
uses: astral-sh/setup-uv@ed21f2f24f8dd64503750218de024bcf64c7250a # ratchet:astral-sh/setup-uv@v7
with:
version: "0.9.9"
# TODO: Enable caching once there is a uv.lock file checked in.
# with:
# enable-cache: true
- name: Compute requirements hash
id: req-hash
shell: bash
@@ -30,6 +22,8 @@ runs:
done <<< "$REQUIREMENTS"
echo "hash=$(echo "$hash" | sha256sum | cut -d' ' -f1)" >> "$GITHUB_OUTPUT"
# NOTE: This comes before Setup uv since clean-ups run in reverse chronological order
# such that Setup uv's prune-cache is able to prune the cache before we upload.
- name: Cache uv cache directory
uses: runs-on/cache@50350ad4242587b6c8c2baa2e740b1bc11285ff4 # ratchet:runs-on/cache@v4
with:
@@ -38,6 +32,14 @@ runs:
restore-keys: |
${{ runner.os }}-uv-
- name: Setup uv
uses: astral-sh/setup-uv@ed21f2f24f8dd64503750218de024bcf64c7250a # ratchet:astral-sh/setup-uv@v7
with:
version: "0.9.9"
# TODO: Enable caching once there is a uv.lock file checked in.
# with:
# enable-cache: true
- name: Setup Python
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # ratchet:actions/setup-python@v5
with:

File diff suppressed because it is too large.

View File

@@ -13,7 +13,7 @@ jobs:
runs-on: ubuntu-latest
timeout-minutes: 45
steps:
- uses: actions/stale@5f858e3efba33a5ca4407a664cc011ad407f2008 # ratchet:actions/stale@v10
- uses: actions/stale@997185467fa4f803885201cee163a9f38240193d # ratchet:actions/stale@v10
with:
stale-issue-message: 'This issue is stale because it has been open 75 days with no activity. Remove stale label or comment or this will be closed in 15 days.'
stale-pr-message: 'This PR is stale because it has been open 75 days with no activity. Remove stale label or comment or this will be closed in 15 days.'

View File

@@ -38,6 +38,8 @@ env:
# LLMs
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
VERTEX_CREDENTIALS: ${{ secrets.VERTEX_CREDENTIALS }}
VERTEX_LOCATION: ${{ vars.VERTEX_LOCATION }}
# Code Interpreter
# TODO: debug why this is failing and enable
@@ -170,7 +172,7 @@ jobs:
- name: Upload Docker logs
if: failure()
uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # ratchet:actions/upload-artifact@v5
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
with:
name: docker-logs-${{ matrix.test-dir }}
path: docker-logs/

View File

@@ -6,11 +6,11 @@ concurrency:
on:
merge_group:
pull_request:
branches: [ main ]
branches: [main]
push:
tags:
- "v*.*.*"
workflow_dispatch: # Allows manual triggering
workflow_dispatch: # Allows manual triggering
permissions:
contents: read
@@ -18,225 +18,233 @@ permissions:
jobs:
helm-chart-check:
# See https://runs-on.com/runners/linux/
runs-on: [runs-on,runner=8cpu-linux-x64,hdd=256,"run-id=${{ github.run_id }}-helm-chart-check"]
runs-on:
[
runs-on,
runner=8cpu-linux-x64,
hdd=256,
"run-id=${{ github.run_id }}-helm-chart-check",
]
timeout-minutes: 45
# fetch-depth 0 is required for helm/chart-testing-action
steps:
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
with:
fetch-depth: 0
persist-credentials: false
- name: Checkout code
uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # ratchet:actions/checkout@v6
with:
fetch-depth: 0
persist-credentials: false
- name: Set up Helm
uses: azure/setup-helm@1a275c3b69536ee54be43f2070a358922e12c8d4 # ratchet:azure/setup-helm@v4.3.1
with:
version: v3.19.0
- name: Set up Helm
uses: azure/setup-helm@1a275c3b69536ee54be43f2070a358922e12c8d4 # ratchet:azure/setup-helm@v4.3.1
with:
version: v3.19.0
- name: Set up chart-testing
uses: helm/chart-testing-action@6ec842c01de15ebb84c8627d2744a0c2f2755c9f # ratchet:helm/chart-testing-action@v2.8.0
- name: Set up chart-testing
# NOTE: This is Jamison's patch from https://github.com/helm/chart-testing-action/pull/194
uses: helm/chart-testing-action@8958a6ac472cbd8ee9a8fbb6f1acbc1b0e966e44 # zizmor: ignore[impostor-commit]
with:
uv_version: "0.9.9"
# even though we specify chart-dirs in ct.yaml, it isn't used by ct for the list-changed command...
- name: Run chart-testing (list-changed)
id: list-changed
env:
DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
run: |
echo "default_branch: ${DEFAULT_BRANCH}"
changed=$(ct list-changed --remote origin --target-branch ${DEFAULT_BRANCH} --chart-dirs deployment/helm/charts)
echo "list-changed output: $changed"
if [[ -n "$changed" ]]; then
echo "changed=true" >> "$GITHUB_OUTPUT"
fi
# uncomment to force run chart-testing
# - name: Force run chart-testing (list-changed)
# id: list-changed
# run: echo "changed=true" >> $GITHUB_OUTPUT
# lint all charts if any changes were detected
- name: Run chart-testing (lint)
if: steps.list-changed.outputs.changed == 'true'
run: ct lint --config ct.yaml --all
# the following would lint only changed charts, but linting isn't expensive
# run: ct lint --config ct.yaml --target-branch ${{ github.event.repository.default_branch }}
- name: Create kind cluster
if: steps.list-changed.outputs.changed == 'true'
uses: helm/kind-action@92086f6be054225fa813e0a4b13787fc9088faab # ratchet:helm/kind-action@v1.13.0
- name: Pre-install cluster status check
if: steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Pre-install Cluster Status ==="
kubectl get nodes -o wide
kubectl get pods --all-namespaces
kubectl get storageclass
- name: Add Helm repositories and update
if: steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Adding Helm repositories ==="
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo add vespa https://onyx-dot-app.github.io/vespa-helm-charts
helm repo add cloudnative-pg https://cloudnative-pg.github.io/charts
helm repo add ot-container-kit https://ot-container-kit.github.io/helm-charts
helm repo add minio https://charts.min.io/
helm repo add code-interpreter https://onyx-dot-app.github.io/code-interpreter/
helm repo update
- name: Install Redis operator
if: steps.list-changed.outputs.changed == 'true'
shell: bash
run: |
echo "=== Installing redis-operator CRDs ==="
helm upgrade --install redis-operator ot-container-kit/redis-operator \
--namespace redis-operator --create-namespace --wait --timeout 300s
- name: Pre-pull required images
if: steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Pre-pulling required images to avoid timeout ==="
KIND_CLUSTER=$(kubectl config current-context | sed 's/kind-//')
echo "Kind cluster: $KIND_CLUSTER"
IMAGES=(
"ghcr.io/cloudnative-pg/cloudnative-pg:1.27.0"
"quay.io/opstree/redis:v7.0.15"
"docker.io/onyxdotapp/onyx-web-server:latest"
)
for image in "${IMAGES[@]}"; do
echo "Pre-pulling $image"
if docker pull "$image"; then
kind load docker-image "$image" --name "$KIND_CLUSTER" || echo "Failed to load $image into kind"
else
echo "Failed to pull $image"
# even though we specify chart-dirs in ct.yaml, it isn't used by ct for the list-changed command...
- name: Run chart-testing (list-changed)
id: list-changed
env:
DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
run: |
echo "default_branch: ${DEFAULT_BRANCH}"
changed=$(ct list-changed --remote origin --target-branch ${DEFAULT_BRANCH} --chart-dirs deployment/helm/charts)
echo "list-changed output: $changed"
if [[ -n "$changed" ]]; then
echo "changed=true" >> "$GITHUB_OUTPUT"
fi
done
echo "=== Images loaded into Kind cluster ==="
docker exec "$KIND_CLUSTER"-control-plane crictl images | grep -E "(cloudnative-pg|redis|onyx)" || echo "Some images may still be loading..."
# uncomment to force run chart-testing
# - name: Force run chart-testing (list-changed)
# id: list-changed
# run: echo "changed=true" >> $GITHUB_OUTPUT
# lint all charts if any changes were detected
- name: Run chart-testing (lint)
if: steps.list-changed.outputs.changed == 'true'
run: ct lint --config ct.yaml --all
# the following would lint only changed charts, but linting isn't expensive
# run: ct lint --config ct.yaml --target-branch ${{ github.event.repository.default_branch }}
- name: Validate chart dependencies
if: steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Validating chart dependencies ==="
cd deployment/helm/charts/onyx
helm dependency update
helm lint .
- name: Create kind cluster
if: steps.list-changed.outputs.changed == 'true'
uses: helm/kind-action@92086f6be054225fa813e0a4b13787fc9088faab # ratchet:helm/kind-action@v1.13.0
- name: Run chart-testing (install) with enhanced monitoring
timeout-minutes: 25
if: steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Starting chart installation with monitoring ==="
- name: Pre-install cluster status check
if: steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Pre-install Cluster Status ==="
kubectl get nodes -o wide
kubectl get pods --all-namespaces
kubectl get storageclass
# Function to monitor cluster state
monitor_cluster() {
while true; do
echo "=== Cluster Status Check at $(date) ==="
# Only show non-running pods to reduce noise
NON_RUNNING_PODS=$(kubectl get pods --all-namespaces --field-selector=status.phase!=Running,status.phase!=Succeeded --no-headers 2>/dev/null | wc -l)
if [ "$NON_RUNNING_PODS" -gt 0 ]; then
echo "Non-running pods:"
kubectl get pods --all-namespaces --field-selector=status.phase!=Running,status.phase!=Succeeded
- name: Add Helm repositories and update
if: steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Adding Helm repositories ==="
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo add vespa https://onyx-dot-app.github.io/vespa-helm-charts
helm repo add cloudnative-pg https://cloudnative-pg.github.io/charts
helm repo add ot-container-kit https://ot-container-kit.github.io/helm-charts
helm repo add minio https://charts.min.io/
helm repo add code-interpreter https://onyx-dot-app.github.io/code-interpreter/
helm repo update
- name: Install Redis operator
if: steps.list-changed.outputs.changed == 'true'
shell: bash
run: |
echo "=== Installing redis-operator CRDs ==="
helm upgrade --install redis-operator ot-container-kit/redis-operator \
--namespace redis-operator --create-namespace --wait --timeout 300s
- name: Pre-pull required images
if: steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Pre-pulling required images to avoid timeout ==="
KIND_CLUSTER=$(kubectl config current-context | sed 's/kind-//')
echo "Kind cluster: $KIND_CLUSTER"
IMAGES=(
"ghcr.io/cloudnative-pg/cloudnative-pg:1.27.0"
"quay.io/opstree/redis:v7.0.15"
"docker.io/onyxdotapp/onyx-web-server:latest"
)
for image in "${IMAGES[@]}"; do
echo "Pre-pulling $image"
if docker pull "$image"; then
kind load docker-image "$image" --name "$KIND_CLUSTER" || echo "Failed to load $image into kind"
else
echo "All pods running successfully"
echo "Failed to pull $image"
fi
# Only show recent events if there are issues
RECENT_EVENTS=$(kubectl get events --sort-by=.lastTimestamp --all-namespaces --field-selector=type!=Normal 2>/dev/null | tail -5)
if [ -n "$RECENT_EVENTS" ]; then
echo "Recent warnings/errors:"
echo "$RECENT_EVENTS"
fi
sleep 60
done
}
# Start monitoring in background
monitor_cluster &
MONITOR_PID=$!
echo "=== Images loaded into Kind cluster ==="
docker exec "$KIND_CLUSTER"-control-plane crictl images | grep -E "(cloudnative-pg|redis|onyx)" || echo "Some images may still be loading..."
# Set up cleanup
cleanup() {
echo "=== Cleaning up monitoring process ==="
kill $MONITOR_PID 2>/dev/null || true
- name: Validate chart dependencies
if: steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Validating chart dependencies ==="
cd deployment/helm/charts/onyx
helm dependency update
helm lint .
- name: Run chart-testing (install) with enhanced monitoring
timeout-minutes: 25
if: steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Starting chart installation with monitoring ==="
# Function to monitor cluster state
monitor_cluster() {
while true; do
echo "=== Cluster Status Check at $(date) ==="
# Only show non-running pods to reduce noise
NON_RUNNING_PODS=$(kubectl get pods --all-namespaces --field-selector=status.phase!=Running,status.phase!=Succeeded --no-headers 2>/dev/null | wc -l)
if [ "$NON_RUNNING_PODS" -gt 0 ]; then
echo "Non-running pods:"
kubectl get pods --all-namespaces --field-selector=status.phase!=Running,status.phase!=Succeeded
else
echo "All pods running successfully"
fi
# Only show recent events if there are issues
RECENT_EVENTS=$(kubectl get events --sort-by=.lastTimestamp --all-namespaces --field-selector=type!=Normal 2>/dev/null | tail -5)
if [ -n "$RECENT_EVENTS" ]; then
echo "Recent warnings/errors:"
echo "$RECENT_EVENTS"
fi
sleep 60
done
}
# Start monitoring in background
monitor_cluster &
MONITOR_PID=$!
# Set up cleanup
cleanup() {
echo "=== Cleaning up monitoring process ==="
kill $MONITOR_PID 2>/dev/null || true
echo "=== Final cluster state ==="
kubectl get pods --all-namespaces
kubectl get events --all-namespaces --sort-by=.lastTimestamp | tail -20
}
# Trap cleanup on exit
trap cleanup EXIT
# Run the actual installation with detailed logging
echo "=== Starting ct install ==="
set +e
ct install --all \
--helm-extra-set-args="\
--set=nginx.enabled=false \
--set=minio.enabled=false \
--set=vespa.enabled=false \
--set=slackbot.enabled=false \
--set=postgresql.enabled=true \
--set=postgresql.nameOverride=cloudnative-pg \
--set=postgresql.cluster.storage.storageClass=standard \
--set=redis.enabled=true \
--set=redis.storageSpec.volumeClaimTemplate.spec.storageClassName=standard \
--set=webserver.replicaCount=1 \
--set=api.replicaCount=0 \
--set=inferenceCapability.replicaCount=0 \
--set=indexCapability.replicaCount=0 \
--set=celery_beat.replicaCount=0 \
--set=celery_worker_heavy.replicaCount=0 \
--set=celery_worker_docfetching.replicaCount=0 \
--set=celery_worker_docprocessing.replicaCount=0 \
--set=celery_worker_light.replicaCount=0 \
--set=celery_worker_monitoring.replicaCount=0 \
--set=celery_worker_primary.replicaCount=0 \
--set=celery_worker_user_file_processing.replicaCount=0 \
--set=celery_worker_user_files_indexing.replicaCount=0" \
--helm-extra-args="--timeout 900s --debug" \
--debug --config ct.yaml
CT_EXIT=$?
set -e
if [[ $CT_EXIT -ne 0 ]]; then
echo "ct install failed with exit code $CT_EXIT"
exit $CT_EXIT
else
echo "=== Installation completed successfully ==="
fi
kubectl get pods --all-namespaces
- name: Post-install verification
if: steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Post-install verification ==="
kubectl get pods --all-namespaces
kubectl get services --all-namespaces
# Only show issues if they exist
kubectl describe pods --all-namespaces | grep -A 5 -B 2 "Failed\|Error\|Warning" || echo "No pod issues found"
- name: Cleanup on failure
if: failure() && steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Cleanup on failure ==="
echo "=== Final cluster state ==="
kubectl get pods --all-namespaces
kubectl get events --all-namespaces --sort-by=.lastTimestamp | tail -20
}
kubectl get events --all-namespaces --sort-by=.lastTimestamp | tail -10
# Trap cleanup on exit
trap cleanup EXIT
echo "=== Pod descriptions for debugging ==="
kubectl describe pods --all-namespaces | grep -A 10 -B 3 "Failed\|Error\|Warning\|Pending" || echo "No problematic pods found"
# Run the actual installation with detailed logging
echo "=== Starting ct install ==="
set +e
ct install --all \
--helm-extra-set-args="\
--set=nginx.enabled=false \
--set=minio.enabled=false \
--set=vespa.enabled=false \
--set=slackbot.enabled=false \
--set=postgresql.enabled=true \
--set=postgresql.nameOverride=cloudnative-pg \
--set=postgresql.cluster.storage.storageClass=standard \
--set=redis.enabled=true \
--set=redis.storageSpec.volumeClaimTemplate.spec.storageClassName=standard \
--set=webserver.replicaCount=1 \
--set=api.replicaCount=0 \
--set=inferenceCapability.replicaCount=0 \
--set=indexCapability.replicaCount=0 \
--set=celery_beat.replicaCount=0 \
--set=celery_worker_heavy.replicaCount=0 \
--set=celery_worker_docfetching.replicaCount=0 \
--set=celery_worker_docprocessing.replicaCount=0 \
--set=celery_worker_light.replicaCount=0 \
--set=celery_worker_monitoring.replicaCount=0 \
--set=celery_worker_primary.replicaCount=0 \
--set=celery_worker_user_file_processing.replicaCount=0 \
--set=celery_worker_user_files_indexing.replicaCount=0" \
--helm-extra-args="--timeout 900s --debug" \
--debug --config ct.yaml
CT_EXIT=$?
set -e
echo "=== Recent logs for debugging ==="
kubectl logs --all-namespaces --tail=50 | grep -i "error\|timeout\|failed\|pull" || echo "No error logs found"
if [[ $CT_EXIT -ne 0 ]]; then
echo "ct install failed with exit code $CT_EXIT"
exit $CT_EXIT
else
echo "=== Installation completed successfully ==="
fi
kubectl get pods --all-namespaces
- name: Post-install verification
if: steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Post-install verification ==="
kubectl get pods --all-namespaces
kubectl get services --all-namespaces
# Only show issues if they exist
kubectl describe pods --all-namespaces | grep -A 5 -B 2 "Failed\|Error\|Warning" || echo "No pod issues found"
- name: Cleanup on failure
if: failure() && steps.list-changed.outputs.changed == 'true'
run: |
echo "=== Cleanup on failure ==="
echo "=== Final cluster state ==="
kubectl get pods --all-namespaces
kubectl get events --all-namespaces --sort-by=.lastTimestamp | tail -10
echo "=== Pod descriptions for debugging ==="
kubectl describe pods --all-namespaces | grep -A 10 -B 3 "Failed\|Error\|Warning\|Pending" || echo "No problematic pods found"
echo "=== Recent logs for debugging ==="
kubectl logs --all-namespaces --tail=50 | grep -i "error\|timeout\|failed\|pull" || echo "No error logs found"
echo "=== Helm releases ==="
helm list --all-namespaces
# the following would install only changed charts, but we only have one chart so
# don't worry about that for now
# run: ct install --target-branch ${{ github.event.repository.default_branch }}
echo "=== Helm releases ==="
helm list --all-namespaces
# the following would install only changed charts, but we only have one chart so
# don't worry about that for now
# run: ct install --target-branch ${{ github.event.repository.default_branch }}

View File

@@ -56,7 +56,7 @@ jobs:
id: set-matrix
run: |
# Find all leaf-level directories in both test directories
tests_dirs=$(find backend/tests/integration/tests -mindepth 1 -maxdepth 1 -type d ! -name "__pycache__" -exec basename {} \; | sort)
tests_dirs=$(find backend/tests/integration/tests -mindepth 1 -maxdepth 1 -type d ! -name "__pycache__" ! -name "mcp" -exec basename {} \; | sort)
connector_dirs=$(find backend/tests/integration/connector_job_tests -mindepth 1 -maxdepth 1 -type d ! -name "__pycache__" -exec basename {} \; | sort)
# Create JSON array with directory info
@@ -310,7 +310,9 @@ jobs:
ONYX_MODEL_SERVER_IMAGE=${ECR_CACHE}:integration-test-model-server-test-${RUN_ID}
INTEGRATION_TESTS_MODE=true
CHECK_TTL_MANAGEMENT_TASK_FREQUENCY_IN_HOURS=0.001
AUTO_LLM_UPDATE_INTERVAL_SECONDS=10
MCP_SERVER_ENABLED=true
USE_LIGHTWEIGHT_BACKGROUND_WORKER=false
EOF
- name: Start Docker containers
@@ -324,7 +326,6 @@ jobs:
api_server \
inference_model_server \
indexing_model_server \
mcp_server \
background \
-d
id: start_docker
@@ -367,12 +368,6 @@ jobs:
}
wait_for_service "http://localhost:8080/health" "API server"
test_dir="${{ matrix.test-dir.path }}"
if [ "$test_dir" = "tests/mcp" ]; then
wait_for_service "http://localhost:8090/health" "MCP server"
else
echo "Skipping MCP server wait for non-MCP suite: $test_dir"
fi
echo "Finished waiting for services."
- name: Start Mock Services
@@ -402,8 +397,6 @@ jobs:
-e VESPA_HOST=index \
-e REDIS_HOST=cache \
-e API_SERVER_HOST=api_server \
-e MCP_SERVER_HOST=mcp_server \
-e MCP_SERVER_PORT=8090 \
-e OPENAI_API_KEY=${OPENAI_API_KEY} \
-e EXA_API_KEY=${EXA_API_KEY} \
-e SLACK_BOT_TOKEN=${SLACK_BOT_TOKEN} \
@@ -446,7 +439,7 @@ jobs:
- name: Upload logs
if: always()
uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # ratchet:actions/upload-artifact@v4
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
with:
name: docker-all-logs-${{ matrix.test-dir.name }}
path: ${{ github.workspace }}/docker-compose.log
@@ -488,10 +481,10 @@ jobs:
AUTH_TYPE=cloud \
REQUIRE_EMAIL_VERIFICATION=false \
DISABLE_TELEMETRY=true \
OPENAI_DEFAULT_API_KEY=${OPENAI_API_KEY} \
ONYX_BACKEND_IMAGE=${ECR_CACHE}:integration-test-backend-test-${RUN_ID} \
ONYX_MODEL_SERVER_IMAGE=${ECR_CACHE}:integration-test-model-server-test-${RUN_ID} \
DEV_MODE=true \
MCP_SERVER_ENABLED=true \
docker compose -f docker-compose.multitenant-dev.yml up \
relational_db \
index \
@@ -500,7 +493,6 @@ jobs:
api_server \
inference_model_server \
indexing_model_server \
mcp_server \
background \
-d
id: start_docker_multi_tenant
@@ -549,8 +541,6 @@ jobs:
-e VESPA_HOST=index \
-e REDIS_HOST=cache \
-e API_SERVER_HOST=api_server \
-e MCP_SERVER_HOST=mcp_server \
-e MCP_SERVER_PORT=8090 \
-e OPENAI_API_KEY=${OPENAI_API_KEY} \
-e EXA_API_KEY=${EXA_API_KEY} \
-e SLACK_BOT_TOKEN=${SLACK_BOT_TOKEN} \
@@ -578,7 +568,7 @@ jobs:
- name: Upload logs (multi-tenant)
if: always()
uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # ratchet:actions/upload-artifact@v4
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
with:
name: docker-all-logs-multitenant
path: ${{ github.workspace }}/docker-compose-multitenant.log

View File

@@ -44,7 +44,7 @@ jobs:
- name: Upload coverage reports
if: always()
uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # ratchet:actions/upload-artifact@v4
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
with:
name: jest-coverage-${{ github.run_id }}
path: ./web/coverage

View File

@@ -48,7 +48,7 @@ jobs:
id: set-matrix
run: |
# Find all leaf-level directories in both test directories
tests_dirs=$(find backend/tests/integration/tests -mindepth 1 -maxdepth 1 -type d ! -name "__pycache__" -exec basename {} \; | sort)
tests_dirs=$(find backend/tests/integration/tests -mindepth 1 -maxdepth 1 -type d ! -name "__pycache__" ! -name "mcp" -exec basename {} \; | sort)
connector_dirs=$(find backend/tests/integration/connector_job_tests -mindepth 1 -maxdepth 1 -type d ! -name "__pycache__" -exec basename {} \; | sort)
# Create JSON array with directory info
@@ -301,6 +301,7 @@ jobs:
ONYX_MODEL_SERVER_IMAGE=${ECR_CACHE}:integration-test-model-server-test-${RUN_ID}
INTEGRATION_TESTS_MODE=true
MCP_SERVER_ENABLED=true
AUTO_LLM_UPDATE_INTERVAL_SECONDS=10
EOF
- name: Start Docker containers
@@ -314,7 +315,6 @@ jobs:
api_server \
inference_model_server \
indexing_model_server \
mcp_server \
background \
-d
id: start_docker
@@ -357,12 +357,6 @@ jobs:
}
wait_for_service "http://localhost:8080/health" "API server"
test_dir="${{ matrix.test-dir.path }}"
if [ "$test_dir" = "tests/mcp" ]; then
wait_for_service "http://localhost:8090/health" "MCP server"
else
echo "Skipping MCP server wait for non-MCP suite: $test_dir"
fi
echo "Finished waiting for services."
- name: Start Mock Services
@@ -393,8 +387,6 @@ jobs:
-e VESPA_HOST=index \
-e REDIS_HOST=cache \
-e API_SERVER_HOST=api_server \
-e MCP_SERVER_HOST=mcp_server \
-e MCP_SERVER_PORT=8090 \
-e OPENAI_API_KEY=${OPENAI_API_KEY} \
-e EXA_API_KEY=${EXA_API_KEY} \
-e SLACK_BOT_TOKEN=${SLACK_BOT_TOKEN} \
@@ -432,7 +424,7 @@ jobs:
- name: Upload logs
if: always()
uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # ratchet:actions/upload-artifact@v4
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
with:
name: docker-all-logs-${{ matrix.test-dir.name }}
path: ${{ github.workspace }}/docker-compose.log

View File

@@ -435,7 +435,7 @@ jobs:
fi
npx playwright test --project ${PROJECT}
- uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # ratchet:actions/upload-artifact@v4
- uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
if: always()
with:
# Includes test results and trace.zip files
@@ -455,7 +455,7 @@ jobs:
- name: Upload logs
if: success() || failure()
uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # ratchet:actions/upload-artifact@v4
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
with:
name: docker-logs-${{ matrix.project }}-${{ github.run_id }}
path: ${{ github.workspace }}/docker-compose.log

View File

@@ -50,8 +50,9 @@ jobs:
uses: runs-on/cache@50350ad4242587b6c8c2baa2e740b1bc11285ff4 # ratchet:runs-on/cache@v4
with:
path: backend/.mypy_cache
key: mypy-${{ runner.os }}-${{ hashFiles('**/*.py', '**/*.pyi', 'backend/pyproject.toml') }}
key: mypy-${{ runner.os }}-${{ github.base_ref || github.event.merge_group.base_ref || 'main' }}-${{ hashFiles('**/*.py', '**/*.pyi', 'backend/pyproject.toml') }}
restore-keys: |
mypy-${{ runner.os }}-${{ github.base_ref || github.event.merge_group.base_ref || 'main' }}-
mypy-${{ runner.os }}-
- name: Run MyPy

View File

@@ -144,7 +144,7 @@ jobs:
- name: Upload logs
if: always()
uses: actions/upload-artifact@330a01c490aca151604b8cf639adc76d48f6c5d4 # ratchet:actions/upload-artifact@v4
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
with:
name: docker-all-logs
path: ${{ github.workspace }}/docker-compose.log

View File

@@ -21,18 +21,29 @@ jobs:
with:
persist-credentials: false
- name: Detect changes
id: filter
uses: dorny/paths-filter@de90cc6fb38fc0963ad72b210f1f284cd68cea36 # ratchet:dorny/paths-filter@v3
with:
filters: |
zizmor:
- '.github/**'
- name: Install the latest version of uv
if: steps.filter.outputs.zizmor == 'true' || github.ref_name == 'main'
uses: astral-sh/setup-uv@ed21f2f24f8dd64503750218de024bcf64c7250a # ratchet:astral-sh/setup-uv@v7
with:
enable-cache: false
version: "0.9.9"
- name: Run zizmor
if: steps.filter.outputs.zizmor == 'true' || github.ref_name == 'main'
run: uv run --no-sync --with zizmor zizmor --format=sarif . > results.sarif
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Upload SARIF file
if: steps.filter.outputs.zizmor == 'true' || github.ref_name == 'main'
uses: github/codeql-action/upload-sarif@ba454b8ab46733eb6145342877cd148270bb77ab # ratchet:github/codeql-action/upload-sarif@codeql-bundle-v2.23.5
with:
sarif_file: results.sarif

.gitignore
View File

@@ -21,6 +21,7 @@ backend/tests/regression/search_quality/*.json
backend/onyx/evals/data/
backend/onyx/evals/one_off/*.json
*.log
*.csv
# secret files
.env

View File

@@ -9,9 +9,8 @@ repos:
rev: d30b4298e4fb63ce8609e29acdbcf4c9018a483c
hooks:
- id: uv-sync
args: ["--active", "--locked", "--all-extras"]
args: ["--locked", "--all-extras"]
- id: uv-lock
files: ^pyproject\.toml$
- id: uv-export
name: uv-export default.txt
args:

View File

@@ -1,36 +1,45 @@
# Copy this file to .env in the .vscode folder
# Fill in the <REPLACE THIS> values as needed, it is recommended to set the GEN_AI_API_KEY value to avoid having to set up an LLM in the UI
# Also check out onyx/backend/scripts/restart_containers.sh for a script to restart the containers which Onyx relies on outside of VSCode/Cursor processes
# Copy this file to .env in the .vscode folder.
# Fill in the <REPLACE THIS> values as needed; it is recommended to set the
# GEN_AI_API_KEY value to avoid having to set up an LLM in the UI.
# Also check out onyx/backend/scripts/restart_containers.sh for a script to
# restart the containers which Onyx relies on outside of VSCode/Cursor
# processes.
# For local dev, often user Authentication is not needed
# For local dev, often user Authentication is not needed.
AUTH_TYPE=disabled
# Always keep these on for Dev
# Logs model prompts, reasoning, and answer to stdout
# Always keep these on for Dev.
# Logs model prompts, reasoning, and answer to stdout.
LOG_ONYX_MODEL_INTERACTIONS=True
# More verbose logging
LOG_LEVEL=debug
# This passes top N results to LLM an additional time for reranking prior to answer generation
# This step is quite heavy on token usage so we disable it for dev generally
# This passes top N results to LLM an additional time for reranking prior to
# answer generation.
# This step is quite heavy on token usage so we disable it for dev generally.
DISABLE_LLM_DOC_RELEVANCE=False
# Useful if you want to toggle auth on/off (google_oauth/OIDC specifically)
# Useful if you want to toggle auth on/off (google_oauth/OIDC specifically).
OAUTH_CLIENT_ID=<REPLACE THIS>
OAUTH_CLIENT_SECRET=<REPLACE THIS>
OPENID_CONFIG_URL=<REPLACE THIS>
SAML_CONF_DIR=/<ABSOLUTE PATH TO ONYX>/onyx/backend/ee/onyx/configs/saml_config
# Generally not useful for dev, we don't generally want to set up an SMTP server for dev
# Generally not useful for dev, we don't generally want to set up an SMTP server
# for dev.
REQUIRE_EMAIL_VERIFICATION=False
# Set these so if you wipe the DB, you don't end up having to go through the UI every time
# Set these so if you wipe the DB, you don't end up having to go through the UI
# every time.
GEN_AI_API_KEY=<REPLACE THIS>
OPENAI_API_KEY=<REPLACE THIS>
# If answer quality isn't important for dev, use gpt-4o-mini since it's cheaper
# If answer quality isn't important for dev, use gpt-4o-mini since it's cheaper.
GEN_AI_MODEL_VERSION=gpt-4o
FAST_GEN_AI_MODEL_VERSION=gpt-4o
@@ -40,26 +49,36 @@ PYTHONPATH=../backend
PYTHONUNBUFFERED=1
# Enable the full set of Danswer Enterprise Edition features
# NOTE: DO NOT ENABLE THIS UNLESS YOU HAVE A PAID ENTERPRISE LICENSE (or if you are using this for local testing/development)
# Enable the full set of Danswer Enterprise Edition features.
# NOTE: DO NOT ENABLE THIS UNLESS YOU HAVE A PAID ENTERPRISE LICENSE (or if you
# are using this for local testing/development).
ENABLE_PAID_ENTERPRISE_EDITION_FEATURES=False
# S3 File Store Configuration (MinIO for local development)
S3_ENDPOINT_URL=http://localhost:9004
S3_FILE_STORE_BUCKET_NAME=onyx-file-store-bucket
S3_AWS_ACCESS_KEY_ID=minioadmin
S3_AWS_SECRET_ACCESS_KEY=minioadmin
# Show extra/uncommon connectors
# Show extra/uncommon connectors.
SHOW_EXTRA_CONNECTORS=True
# Local langsmith tracing
LANGSMITH_TRACING="true"
LANGSMITH_ENDPOINT="https://api.smith.langchain.com"
LANGSMITH_API_KEY=<REPLACE_THIS>
LANGSMITH_PROJECT=<REPLACE_THIS>
# Local Confluence OAuth testing
# OAUTH_CONFLUENCE_CLOUD_CLIENT_ID=<REPLACE_THIS>
# OAUTH_CONFLUENCE_CLOUD_CLIENT_SECRET=<REPLACE_THIS>
# NEXT_PUBLIC_TEST_ENV=True
# NEXT_PUBLIC_TEST_ENV=True
# OpenSearch
# Arbitrary password is fine for local development.
OPENSEARCH_INITIAL_ADMIN_PASSWORD=<REPLACE THIS>

View File

@@ -512,6 +512,21 @@
"group": "3"
}
},
{
"name": "Clear and Restart OpenSearch Container",
// Generic debugger type, required arg but has no bearing on bash.
"type": "node",
"request": "launch",
"runtimeExecutable": "bash",
"runtimeArgs": [
"${workspaceFolder}/backend/scripts/restart_opensearch_container.sh"
],
"cwd": "${workspaceFolder}",
"console": "integratedTerminal",
"presentation": {
"group": "3"
}
},
{
"name": "Eval CLI",
"type": "debugpy",

View File

@@ -1,13 +1,13 @@
# AGENTS.md
This file provides guidance to Codex when working with code in this repository.
This file provides guidance to AI agents when working with code in this repository.
## KEY NOTES
- If you run into any missing python dependency errors, try running your command with `source backend/.venv/bin/activate` \
- If you run into any missing python dependency errors, try running your command with `source .venv/bin/activate` \
to assume the python venv.
- To make tests work, check the `.env` file at the root of the project to find an OpenAI key.
- If using `playwright` to explore the frontend, you can usually log in with username `a@test.com` and password
- If using `playwright` to explore the frontend, you can usually log in with username `a@example.com` and password
`a`. The app can be accessed at `http://localhost:3000`.
- You should assume that all Onyx services are running. To verify, you can check the `backend/log` directory to
make sure we see logs coming out from the relevant service.
@@ -181,6 +181,286 @@ web/
└── src/lib/ # Utilities & business logic
```
## Frontend Standards
### 1. Import Standards
**Always use absolute imports with the `@` prefix.**
**Reason:** Moving files around becomes easier since you don't also have to update those import statements. This makes modifications to the codebase much nicer.
```typescript
// ✅ Good
import { Button } from "@/components/ui/button";
import { useAuth } from "@/hooks/useAuth";
import { Text } from "@/refresh-components/texts/Text";
// ❌ Bad
import { Button } from "../../../components/ui/button";
import { useAuth } from "./hooks/useAuth";
```
### 2. React Component Functions
**Prefer regular functions over arrow functions for React components.**
**Reason:** Functions just become easier to read.
```typescript
// ✅ Good
function UserProfile({ userId }: UserProfileProps) {
return <div>User Profile</div>
}
// ❌ Bad
const UserProfile = ({ userId }: UserProfileProps) => {
return <div>User Profile</div>
}
```
### 3. Props Interface Extraction
**Extract prop types into their own interface definitions.**
**Reason:** Functions just become easier to read.
```typescript
// ✅ Good
interface UserCardProps {
user: User
showActions?: boolean
onEdit?: (userId: string) => void
}
function UserCard({ user, showActions = false, onEdit }: UserCardProps) {
return <div>User Card</div>
}
// ❌ Bad
function UserCard({
user,
showActions = false,
onEdit
}: {
user: User
showActions?: boolean
onEdit?: (userId: string) => void
}) {
return <div>User Card</div>
}
```
### 4. Spacing Guidelines
**Prefer padding over margins for spacing.**
**Reason:** We want to consolidate usage to paddings instead of margins.
```typescript
// ✅ Good
<div className="p-4 space-y-2">
<div className="p-2">Content</div>
</div>
// ❌ Bad
<div className="m-4 space-y-2">
<div className="m-2">Content</div>
</div>
```
### 5. Tailwind Dark Mode
**Strictly forbid using the `dark:` modifier in Tailwind classes, except for logo icon handling.**
**Reason:** The `colors.css` file already, VERY CAREFULLY, defines what the exact opposite colour of each light-mode colour is. Overriding this behaviour is VERY bad and will lead to horrible UI breakages.
**Exception:** The `createLogoIcon` helper in `web/src/components/icons/icons.tsx` uses `dark:` modifiers (`dark:invert`, `dark:hidden`, `dark:block`) to handle third-party logo icons that cannot automatically adapt through `colors.css`. This is the ONLY acceptable use of dark mode modifiers.
```typescript
// ✅ Good - Standard components use `web/tailwind-themes/tailwind.config.js` / `web/src/app/css/colors.css`
<div className="bg-background-neutral-03 text-text-02">
Content
</div>
// ✅ Good - Logo icons with dark mode handling via createLogoIcon
export const GithubIcon = createLogoIcon(githubLightIcon, {
monochromatic: true, // Will apply dark:invert internally
});
export const GitbookIcon = createLogoIcon(gitbookLightIcon, {
darkSrc: gitbookDarkIcon, // Will use dark:hidden/dark:block internally
});
// ❌ Bad - Manual dark mode overrides
<div className="bg-white dark:bg-black text-black dark:text-white">
Content
</div>
```
### 6. Class Name Utilities
**Use the `cn` utility instead of raw string formatting for classNames.**
**Reason:** `cn` is easier to read than raw string interpolation. It also handles more complex inputs (e.g., arrays of class names, which it flattens) and filters out falsy values, so conditionals like `myCondition && "some-tailwind-class"` are simply dropped when `myCondition` is `false`.
```typescript
import { cn } from '@/lib/utils'
// ✅ Good
<div className={cn(
'base-class',
isActive && 'active-class',
className
)}>
Content
</div>
// ❌ Bad
<div className={`base-class ${isActive ? 'active-class' : ''} ${className}`}>
Content
</div>
```
### 7. Custom Hooks Organization
**Follow a "hook-per-file" layout. Each hook should live in its own file within `web/src/hooks`.**
**Reason:** This is just a layout preference. Keeps code clean.
```typescript
// web/src/hooks/useUserData.ts
export function useUserData(userId: string) {
// hook implementation
}
// web/src/hooks/useLocalStorage.ts
export function useLocalStorage<T>(key: string, initialValue: T) {
// hook implementation
}
```
### 8. Icon Usage
**ONLY use icons from the `web/src/icons` directory. Do NOT use icons from `react-icons`, `lucide`, or other external libraries.**
**Reason:** We have a very carefully curated selection of icons that match our Onyx guidelines. We do NOT want to muddy those up with different aesthetic stylings.
```typescript
// ✅ Good
import SvgX from "@/icons/x";
import SvgMoreHorizontal from "@/icons/more-horizontal";
// ❌ Bad
import { User } from "lucide-react";
import { FiSearch } from "react-icons/fi";
```
**Missing Icons**: If an icon is needed but doesn't exist in the `web/src/icons` directory, import it from Figma using the Figma MCP tool and add it to the icons directory.
If you need help with this step, reach out to `raunak@onyx.app`.
### 9. Text Rendering
**Prefer using the `refresh-components/texts/Text` component for all text rendering. Avoid "naked" text nodes.**
**Reason:** The `Text` component is fully compliant with the stylings provided in Figma. It provides easy utilities to specify the text-colour and font-size in the form of flags. Super duper easy.
```typescript
// ✅ Good
import { Text } from '@/refresh-components/texts/Text'
function UserCard({ name }: { name: string }) {
// The `text03` flag colours the rendered text in the 3rd-scale grey.
// The `mainAction` flag applies the "main-action" font, line-height, and weight described in Figma.
return (
<Text text03 mainAction>
{name}
</Text>
)
}
// ❌ Bad
function UserCard({ name }: { name: string }) {
return (
<div>
<h2>{name}</h2>
<p>User details</p>
</div>
)
}
```
### 10. Component Usage
**Heavily avoid raw HTML input components. Always use components from the `web/src/refresh-components` or `web/lib/opal/src` directory.**
**Reason:** We've put in a lot of effort to unify the components that are rendered in the Onyx app. Using raw components breaks the entire UI of the application, and leaves it in a muddier state than before.
```typescript
// ✅ Good
import Button from '@/refresh-components/buttons/Button'
import InputTypeIn from '@/refresh-components/inputs/InputTypeIn'
import SvgPlusCircle from '@/icons/plus-circle'
function ContactForm() {
return (
<form>
<InputTypeIn placeholder="Search..." />
<Button type="submit" leftIcon={SvgPlusCircle}>Submit</Button>
</form>
)
}
// ❌ Bad
function ContactForm() {
return (
<form>
<input placeholder="Name" />
<textarea placeholder="Message" />
<button type="submit">Submit</button>
</form>
)
}
```
### 11. Colors
**Always use custom overrides for colors and borders rather than built in Tailwind CSS colors. These overrides live in `web/tailwind-themes/tailwind.config.js`.**
**Reason:** Our custom color system uses CSS variables that automatically handle dark mode and maintain design consistency across the app. Standard Tailwind colors bypass this system.
**Available color categories:**
- **Text:** `text-01` through `text-05`, `text-inverted-XX`
- **Backgrounds:** `background-neutral-XX`, `background-tint-XX` (and inverted variants)
- **Borders:** `border-01` through `border-05`, `border-inverted-XX`
- **Actions:** `action-link-XX`, `action-danger-XX`
- **Status:** `status-info-XX`, `status-success-XX`, `status-warning-XX`, `status-error-XX`
- **Theme:** `theme-primary-XX`, `theme-red-XX`, `theme-blue-XX`, etc.
```typescript
// ✅ Good - Use custom Onyx color classes
<div className="bg-background-neutral-01 border border-border-02" />
<div className="bg-background-tint-02 border border-border-01" />
<div className="bg-status-success-01" />
<div className="bg-action-link-01" />
<div className="bg-theme-primary-05" />
// ❌ Bad - Do NOT use standard Tailwind colors
<div className="bg-gray-100 border border-gray-300 text-gray-600" />
<div className="bg-white border border-slate-200" />
<div className="bg-green-100 text-green-700" />
<div className="bg-blue-100 text-blue-600" />
<div className="bg-indigo-500" />
```
### 12. Data Fetching
**Prefer using `useSWR` for data fetching. Data should generally be fetched on the client side. Components that need data should display a loader / placeholder while waiting for that data. Prefer loading data within the component that needs it rather than at the top level and passing it down.**
**Reason:** Client side fetching allows us to load the skeleton of the page without waiting for data to load, leading to a snappier UX. Loading data where needed reduces dependencies between a component and its parent component(s).
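A minimal sketch of this pattern (the `/api/agents` endpoint, the `fetcher` helper, and the `Agent` type are illustrative assumptions, not actual Onyx APIs):
```typescript
import useSWR from "swr";
import { Text } from "@/refresh-components/texts/Text";

// Hypothetical response shape and fetcher -- swap in the real endpoint and shared fetcher.
interface Agent {
  id: number;
  name: string;
}
const fetcher = (url: string) => fetch(url).then((res) => res.json());

function AgentList() {
  // Data is fetched inside the component that needs it, not passed down from a parent.
  const { data, error, isLoading } = useSWR<Agent[]>("/api/agents", fetcher);

  // Render a placeholder while the request is in flight so the page skeleton appears immediately.
  if (isLoading) return <Text text03>Loading agents...</Text>;
  if (error || !data) return <Text text03>Failed to load agents.</Text>;

  return (
    <div className="p-4">
      {data.map((agent) => (
        <Text key={agent.id} mainAction>
          {agent.name}
        </Text>
      ))}
    </div>
  );
}
```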
## Database & Migrations
### Running Migrations
@@ -295,14 +575,6 @@ will be tailing their logs to this file.
- Token management and rate limiting
- Custom prompts and agent actions
## UI/UX Patterns
- Tailwind CSS with design system in `web/src/components/ui/`
- Radix UI and Headless UI for accessible components
- SWR for data fetching and caching
- Form validation with react-hook-form
- Error handling with popup notifications
## Creating a Plan
When creating a plan in the `plans` directory, make sure to include at least these elements:

View File

@@ -7,7 +7,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
- If you run into any missing python dependency errors, try running your command with `source .venv/bin/activate` \
to assume the python venv.
- To make tests work, check the `.env` file at the root of the project to find an OpenAI key.
- If using `playwright` to explore the frontend, you can usually log in with username `a@test.com` and password
- If using `playwright` to explore the frontend, you can usually log in with username `a@example.com` and password
`a`. The app can be accessed at `http://localhost:3000`.
- You should assume that all Onyx services are running. To verify, you can check the `backend/log` directory to
make sure we see logs coming out from the relevant service.
@@ -184,6 +184,286 @@ web/
└── src/lib/ # Utilities & business logic
```
## Frontend Standards
### 1. Import Standards
**Always use absolute imports with the `@` prefix.**
**Reason:** Moving files around becomes easier since you don't also have to update those import statements. This makes modifications to the codebase much nicer.
```typescript
// ✅ Good
import { Button } from "@/components/ui/button";
import { useAuth } from "@/hooks/useAuth";
import { Text } from "@/refresh-components/texts/Text";
// ❌ Bad
import { Button } from "../../../components/ui/button";
import { useAuth } from "./hooks/useAuth";
```
### 2. React Component Functions
**Prefer regular functions over arrow functions for React components.**
**Reason:** Functions just become easier to read.
```typescript
// ✅ Good
function UserProfile({ userId }: UserProfileProps) {
return <div>User Profile</div>
}
// ❌ Bad
const UserProfile = ({ userId }: UserProfileProps) => {
return <div>User Profile</div>
}
```
### 3. Props Interface Extraction
**Extract prop types into their own interface definitions.**
**Reason:** Functions just become easier to read.
```typescript
// ✅ Good
interface UserCardProps {
user: User
showActions?: boolean
onEdit?: (userId: string) => void
}
function UserCard({ user, showActions = false, onEdit }: UserCardProps) {
return <div>User Card</div>
}
// ❌ Bad
function UserCard({
user,
showActions = false,
onEdit
}: {
user: User
showActions?: boolean
onEdit?: (userId: string) => void
}) {
return <div>User Card</div>
}
```
### 4. Spacing Guidelines
**Prefer padding over margins for spacing.**
**Reason:** We want to consolidate usage to paddings instead of margins.
```typescript
// ✅ Good
<div className="p-4 space-y-2">
<div className="p-2">Content</div>
</div>
// ❌ Bad
<div className="m-4 space-y-2">
<div className="m-2">Content</div>
</div>
```
### 5. Tailwind Dark Mode
**Strictly forbid using the `dark:` modifier in Tailwind classes, except for logo icon handling.**
**Reason:** The `colors.css` file already, VERY CAREFULLY, defines what the exact opposite colour of each light-mode colour is. Overriding this behaviour is VERY bad and will lead to horrible UI breakages.
**Exception:** The `createLogoIcon` helper in `web/src/components/icons/icons.tsx` uses `dark:` modifiers (`dark:invert`, `dark:hidden`, `dark:block`) to handle third-party logo icons that cannot automatically adapt through `colors.css`. This is the ONLY acceptable use of dark mode modifiers.
```typescript
// ✅ Good - Standard components use `tailwind-themes/tailwind.config.js` / `src/app/css/colors.css`
<div className="bg-background-neutral-03 text-text-02">
Content
</div>
// ✅ Good - Logo icons with dark mode handling via createLogoIcon
export const GithubIcon = createLogoIcon(githubLightIcon, {
monochromatic: true, // Will apply dark:invert internally
});
export const GitbookIcon = createLogoIcon(gitbookLightIcon, {
darkSrc: gitbookDarkIcon, // Will use dark:hidden/dark:block internally
});
// ❌ Bad - Manual dark mode overrides
<div className="bg-white dark:bg-black text-black dark:text-white">
Content
</div>
```
### 6. Class Name Utilities
**Use the `cn` utility instead of raw string formatting for classNames.**
**Reason:** `cn` is easier to read than raw string interpolation. It also handles more complex inputs (e.g., arrays of class names, which it flattens) and filters out falsy values, so conditionals like `myCondition && "some-tailwind-class"` are simply dropped when `myCondition` is `false`.
```typescript
import { cn } from '@/lib/utils'
// ✅ Good
<div className={cn(
'base-class',
isActive && 'active-class',
className
)}>
Content
</div>
// ❌ Bad
<div className={`base-class ${isActive ? 'active-class' : ''} ${className}`}>
Content
</div>
```
### 7. Custom Hooks Organization
**Follow a "hook-per-file" layout. Each hook should live in its own file within `web/src/hooks`.**
**Reason:** This is just a layout preference. Keeps code clean.
```typescript
// web/src/hooks/useUserData.ts
export function useUserData(userId: string) {
// hook implementation
}
// web/src/hooks/useLocalStorage.ts
export function useLocalStorage<T>(key: string, initialValue: T) {
// hook implementation
}
```
### 8. Icon Usage
**ONLY use icons from the `web/src/icons` directory. Do NOT use icons from `react-icons`, `lucide`, or other external libraries.**
**Reason:** We have a very carefully curated selection of icons that match our Onyx guidelines. We do NOT want to muddy those up with different aesthetic stylings.
```typescript
// ✅ Good
import SvgX from "@/icons/x";
import SvgMoreHorizontal from "@/icons/more-horizontal";
// ❌ Bad
import { User } from "lucide-react";
import { FiSearch } from "react-icons/fi";
```
**Missing Icons**: If an icon is needed but doesn't exist in the `web/src/icons` directory, import it from Figma using the Figma MCP tool and add it to the icons directory.
If you need help with this step, reach out to `raunak@onyx.app`.
### 9. Text Rendering
**Prefer using the `refresh-components/texts/Text` component for all text rendering. Avoid "naked" text nodes.**
**Reason:** The `Text` component is fully compliant with the stylings provided in Figma. It provides easy utilities to specify the text-colour and font-size in the form of flags. Super duper easy.
```typescript
// ✅ Good
import { Text } from '@/refresh-components/texts/Text'
function UserCard({ name }: { name: string }) {
// The `text03` flag colours the rendered text in the 3rd-scale grey.
// The `mainAction` flag applies the "main-action" font, line-height, and weight described in Figma.
return (
<Text text03 mainAction>
{name}
</Text>
)
}
// ❌ Bad
function UserCard({ name }: { name: string }) {
return (
<div>
<h2>{name}</h2>
<p>User details</p>
</div>
)
}
```
### 10. Component Usage
**Heavily avoid raw HTML input components. Always use components from the `web/src/refresh-components` or `web/lib/opal/src` directory.**
**Reason:** We've put in a lot of effort to unify the components that are rendered in the Onyx app. Using raw components breaks the entire UI of the application, and leaves it in a muddier state than before.
```typescript
// ✅ Good
import Button from '@/refresh-components/buttons/Button'
import InputTypeIn from '@/refresh-components/inputs/InputTypeIn'
import SvgPlusCircle from '@/icons/plus-circle'
function ContactForm() {
return (
<form>
<InputTypeIn placeholder="Search..." />
<Button type="submit" leftIcon={SvgPlusCircle}>Submit</Button>
</form>
)
}
// ❌ Bad
function ContactForm() {
return (
<form>
<input placeholder="Name" />
<textarea placeholder="Message" />
<button type="submit">Submit</button>
</form>
)
}
```
### 11. Colors
**Always use custom overrides for colors and borders rather than built in Tailwind CSS colors. These overrides live in `web/tailwind-themes/tailwind.config.js`.**
**Reason:** Our custom color system uses CSS variables that automatically handle dark mode and maintain design consistency across the app. Standard Tailwind colors bypass this system.
**Available color categories:**
- **Text:** `text-01` through `text-05`, `text-inverted-XX`
- **Backgrounds:** `background-neutral-XX`, `background-tint-XX` (and inverted variants)
- **Borders:** `border-01` through `border-05`, `border-inverted-XX`
- **Actions:** `action-link-XX`, `action-danger-XX`
- **Status:** `status-info-XX`, `status-success-XX`, `status-warning-XX`, `status-error-XX`
- **Theme:** `theme-primary-XX`, `theme-red-XX`, `theme-blue-XX`, etc.
```typescript
// ✅ Good - Use custom Onyx color classes
<div className="bg-background-neutral-01 border border-border-02" />
<div className="bg-background-tint-02 border border-border-01" />
<div className="bg-status-success-01" />
<div className="bg-action-link-01" />
<div className="bg-theme-primary-05" />
// ❌ Bad - Do NOT use standard Tailwind colors
<div className="bg-gray-100 border border-gray-300 text-gray-600" />
<div className="bg-white border border-slate-200" />
<div className="bg-green-100 text-green-700" />
<div className="bg-blue-100 text-blue-600" />
<div className="bg-indigo-500" />
```
### 12. Data Fetching
**Prefer using `useSWR` for data fetching. Data should generally be fetched on the client side. Components that need data should display a loader / placeholder while waiting for that data. Prefer loading data within the component that needs it rather than at the top level and passing it down.**
**Reason:** Client side fetching allows us to load the skeleton of the page without waiting for data to load, leading to a snappier UX. Loading data where needed reduces dependencies between a component and its parent component(s).
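As a sketch of the same idea packaged as a reusable hook (the `/api/projects` endpoint, the `fetcher` helper, and the `Project` type are assumptions for illustration only):
```typescript
// web/src/hooks/useProjects.ts -- hypothetical hook, one hook per file
import useSWR from "swr";

export interface Project {
  id: number;
  name: string;
}

// Simple JSON fetcher; swap in the app's shared fetcher if one exists.
const fetcher = (url: string) => fetch(url).then((res) => res.json());

export function useProjects() {
  const { data, error, isLoading } = useSWR<Project[]>("/api/projects", fetcher);
  // Callers get a stable shape: an empty list while loading, plus status flags.
  return { projects: data ?? [], isLoading, isError: Boolean(error) };
}
```
A consuming component would call `useProjects()` directly and render a placeholder while `isLoading` is true, rather than receiving `projects` from a parent component.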
## Database & Migrations
### Running Migrations
@@ -300,14 +580,6 @@ will be tailing their logs to this file.
- Token management and rate limiting
- Custom prompts and agent actions
## UI/UX Patterns
- Tailwind CSS with design system in `web/src/components/ui/`
- Radix UI and Headless UI for accessible components
- SWR for data fetching and caching
- Form validation with react-hook-form
- Error handling with popup notifications
## Creating a Plan
When creating a plan in the `plans` directory, make sure to include at least these elements:

View File

@@ -225,7 +225,6 @@ def do_run_migrations(
) -> None:
if create_schema:
connection.execute(text(f'CREATE SCHEMA IF NOT EXISTS "{schema_name}"'))
connection.execute(text("COMMIT"))
connection.execute(text(f'SET search_path TO "{schema_name}"'))
@@ -309,6 +308,7 @@ async def run_async_migrations() -> None:
schema_name=schema,
create_schema=create_schema,
)
await connection.commit()
except Exception as e:
logger.error(f"Error migrating schema {schema}: {e}")
if not continue_on_error:
@@ -346,6 +346,7 @@ async def run_async_migrations() -> None:
schema_name=schema,
create_schema=create_schema,
)
await connection.commit()
except Exception as e:
logger.error(f"Error migrating schema {schema}: {e}")
if not continue_on_error:

View File

@@ -0,0 +1,46 @@
"""usage_limits
Revision ID: 2b90f3af54b8
Revises: 9a0296d7421e
Create Date: 2026-01-03 16:55:30.449692
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "2b90f3af54b8"
down_revision = "9a0296d7421e"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.create_table(
"tenant_usage",
sa.Column("id", sa.Integer(), nullable=False),
sa.Column(
"window_start", sa.DateTime(timezone=True), nullable=False, index=True
),
sa.Column("llm_cost_cents", sa.Float(), nullable=False, server_default="0.0"),
sa.Column("chunks_indexed", sa.Integer(), nullable=False, server_default="0"),
sa.Column("api_calls", sa.Integer(), nullable=False, server_default="0"),
sa.Column(
"non_streaming_api_calls", sa.Integer(), nullable=False, server_default="0"
),
sa.Column(
"updated_at",
sa.DateTime(timezone=True),
server_default=sa.func.now(),
nullable=True,
),
sa.PrimaryKeyConstraint("id"),
sa.UniqueConstraint("window_start", name="uq_tenant_usage_window"),
)
def downgrade() -> None:
op.drop_index("ix_tenant_usage_window_start", table_name="tenant_usage")
op.drop_table("tenant_usage")

View File

@@ -11,7 +11,7 @@ from pydantic import BaseModel, ConfigDict
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql
from onyx.llm.llm_provider_options import (
from onyx.llm.well_known_providers.llm_provider_options import (
fetch_model_names_for_provider_as_set,
fetch_visible_model_names_for_provider_as_set,
)

View File

@@ -85,103 +85,122 @@ class UserRow(NamedTuple):
def upgrade() -> None:
conn = op.get_bind()
# Start transaction
conn.execute(sa.text("BEGIN"))
# Step 1: Create or update the unified assistant (ID 0)
search_assistant = conn.execute(
sa.text("SELECT * FROM persona WHERE id = 0")
).fetchone()
try:
# Step 1: Create or update the unified assistant (ID 0)
search_assistant = conn.execute(
sa.text("SELECT * FROM persona WHERE id = 0")
).fetchone()
if search_assistant:
# Update existing Search assistant to be the unified assistant
conn.execute(
sa.text(
"""
UPDATE persona
SET name = :name,
description = :description,
system_prompt = :system_prompt,
num_chunks = :num_chunks,
is_default_persona = true,
is_visible = true,
deleted = false,
display_priority = :display_priority,
llm_filter_extraction = :llm_filter_extraction,
llm_relevance_filter = :llm_relevance_filter,
recency_bias = :recency_bias,
chunks_above = :chunks_above,
chunks_below = :chunks_below,
datetime_aware = :datetime_aware,
starter_messages = null
WHERE id = 0
"""
),
INSERT_DICT,
)
else:
# Create new unified assistant with ID 0
conn.execute(
sa.text(
"""
INSERT INTO persona (
id, name, description, system_prompt, num_chunks,
is_default_persona, is_visible, deleted, display_priority,
llm_filter_extraction, llm_relevance_filter, recency_bias,
chunks_above, chunks_below, datetime_aware, starter_messages,
builtin_persona
) VALUES (
0, :name, :description, :system_prompt, :num_chunks,
true, true, false, :display_priority, :llm_filter_extraction,
:llm_relevance_filter, :recency_bias, :chunks_above, :chunks_below,
:datetime_aware, null, true
)
"""
),
INSERT_DICT,
)
# Step 2: Mark ALL builtin assistants as deleted (except the unified assistant ID 0)
if search_assistant:
# Update existing Search assistant to be the unified assistant
conn.execute(
sa.text(
"""
UPDATE persona
SET deleted = true, is_visible = false, is_default_persona = false
WHERE builtin_persona = true AND id != 0
SET name = :name,
description = :description,
system_prompt = :system_prompt,
num_chunks = :num_chunks,
is_default_persona = true,
is_visible = true,
deleted = false,
display_priority = :display_priority,
llm_filter_extraction = :llm_filter_extraction,
llm_relevance_filter = :llm_relevance_filter,
recency_bias = :recency_bias,
chunks_above = :chunks_above,
chunks_below = :chunks_below,
datetime_aware = :datetime_aware,
starter_messages = null
WHERE id = 0
"""
)
),
INSERT_DICT,
)
else:
# Create new unified assistant with ID 0
conn.execute(
sa.text(
"""
INSERT INTO persona (
id, name, description, system_prompt, num_chunks,
is_default_persona, is_visible, deleted, display_priority,
llm_filter_extraction, llm_relevance_filter, recency_bias,
chunks_above, chunks_below, datetime_aware, starter_messages,
builtin_persona
) VALUES (
0, :name, :description, :system_prompt, :num_chunks,
true, true, false, :display_priority, :llm_filter_extraction,
:llm_relevance_filter, :recency_bias, :chunks_above, :chunks_below,
:datetime_aware, null, true
)
"""
),
INSERT_DICT,
)
# Step 3: Add all built-in tools to the unified assistant
# First, get the tool IDs for SearchTool, ImageGenerationTool, and WebSearchTool
search_tool = conn.execute(
sa.text("SELECT id FROM tool WHERE in_code_tool_id = 'SearchTool'")
).fetchone()
# Step 2: Mark ALL builtin assistants as deleted (except the unified assistant ID 0)
conn.execute(
sa.text(
"""
UPDATE persona
SET deleted = true, is_visible = false, is_default_persona = false
WHERE builtin_persona = true AND id != 0
"""
)
)
if not search_tool:
raise ValueError(
"SearchTool not found in database. Ensure tools migration has run first."
)
# Step 3: Add all built-in tools to the unified assistant
# First, get the tool IDs for SearchTool, ImageGenerationTool, and WebSearchTool
search_tool = conn.execute(
sa.text("SELECT id FROM tool WHERE in_code_tool_id = 'SearchTool'")
).fetchone()
image_gen_tool = conn.execute(
sa.text("SELECT id FROM tool WHERE in_code_tool_id = 'ImageGenerationTool'")
).fetchone()
if not search_tool:
raise ValueError(
"SearchTool not found in database. Ensure tools migration has run first."
)
if not image_gen_tool:
raise ValueError(
"ImageGenerationTool not found in database. Ensure tools migration has run first."
)
image_gen_tool = conn.execute(
sa.text("SELECT id FROM tool WHERE in_code_tool_id = 'ImageGenerationTool'")
).fetchone()
# WebSearchTool is optional - may not be configured
web_search_tool = conn.execute(
sa.text("SELECT id FROM tool WHERE in_code_tool_id = 'WebSearchTool'")
).fetchone()
if not image_gen_tool:
raise ValueError(
"ImageGenerationTool not found in database. Ensure tools migration has run first."
)
# Clear existing tool associations for persona 0
conn.execute(sa.text("DELETE FROM persona__tool WHERE persona_id = 0"))
# WebSearchTool is optional - may not be configured
web_search_tool = conn.execute(
sa.text("SELECT id FROM tool WHERE in_code_tool_id = 'WebSearchTool'")
).fetchone()
# Add tools to the unified assistant
# Clear existing tool associations for persona 0
conn.execute(sa.text("DELETE FROM persona__tool WHERE persona_id = 0"))
# Add tools to the unified assistant
conn.execute(
sa.text(
"""
INSERT INTO persona__tool (persona_id, tool_id)
VALUES (0, :tool_id)
ON CONFLICT DO NOTHING
"""
),
{"tool_id": search_tool[0]},
)
conn.execute(
sa.text(
"""
INSERT INTO persona__tool (persona_id, tool_id)
VALUES (0, :tool_id)
ON CONFLICT DO NOTHING
"""
),
{"tool_id": image_gen_tool[0]},
)
if web_search_tool:
conn.execute(
sa.text(
"""
@@ -190,191 +209,148 @@ def upgrade() -> None:
ON CONFLICT DO NOTHING
"""
),
{"tool_id": search_tool[0]},
{"tool_id": web_search_tool[0]},
)
conn.execute(
sa.text(
"""
INSERT INTO persona__tool (persona_id, tool_id)
VALUES (0, :tool_id)
ON CONFLICT DO NOTHING
# Step 4: Migrate existing chat sessions from all builtin assistants to unified assistant
conn.execute(
sa.text(
"""
),
{"tool_id": image_gen_tool[0]},
UPDATE chat_session
SET persona_id = 0
WHERE persona_id IN (
SELECT id FROM persona WHERE builtin_persona = true AND id != 0
)
"""
)
)
if web_search_tool:
# Step 5: Migrate user preferences - remove references to all builtin assistants
# First, get all builtin assistant IDs (except 0)
builtin_assistants_result = conn.execute(
sa.text(
"""
SELECT id FROM persona
WHERE builtin_persona = true AND id != 0
"""
)
).fetchall()
builtin_assistant_ids = [row[0] for row in builtin_assistants_result]
# Get all users with preferences
users_result = conn.execute(
sa.text(
"""
SELECT id, chosen_assistants, visible_assistants,
hidden_assistants, pinned_assistants
FROM "user"
"""
)
).fetchall()
for user_row in users_result:
user = UserRow(*user_row)
user_id: UUID = user.id
updates: dict[str, Any] = {}
# Remove all builtin assistants from chosen_assistants
if user.chosen_assistants:
new_chosen: list[int] = [
assistant_id
for assistant_id in user.chosen_assistants
if assistant_id not in builtin_assistant_ids
]
if new_chosen != user.chosen_assistants:
updates["chosen_assistants"] = json.dumps(new_chosen)
# Remove all builtin assistants from visible_assistants
if user.visible_assistants:
new_visible: list[int] = [
assistant_id
for assistant_id in user.visible_assistants
if assistant_id not in builtin_assistant_ids
]
if new_visible != user.visible_assistants:
updates["visible_assistants"] = json.dumps(new_visible)
# Add all builtin assistants to hidden_assistants
if user.hidden_assistants:
new_hidden: list[int] = list(user.hidden_assistants)
for old_id in builtin_assistant_ids:
if old_id not in new_hidden:
new_hidden.append(old_id)
if new_hidden != user.hidden_assistants:
updates["hidden_assistants"] = json.dumps(new_hidden)
else:
updates["hidden_assistants"] = json.dumps(builtin_assistant_ids)
# Remove all builtin assistants from pinned_assistants
if user.pinned_assistants:
new_pinned: list[int] = [
assistant_id
for assistant_id in user.pinned_assistants
if assistant_id not in builtin_assistant_ids
]
if new_pinned != user.pinned_assistants:
updates["pinned_assistants"] = json.dumps(new_pinned)
# Apply updates if any
if updates:
set_clause = ", ".join([f"{k} = :{k}" for k in updates.keys()])
updates["user_id"] = str(user_id) # Convert UUID to string for SQL
conn.execute(
sa.text(
"""
INSERT INTO persona__tool (persona_id, tool_id)
VALUES (0, :tool_id)
ON CONFLICT DO NOTHING
"""
),
{"tool_id": web_search_tool[0]},
sa.text(f'UPDATE "user" SET {set_clause} WHERE id = :user_id'),
updates,
)
# Step 4: Migrate existing chat sessions from all builtin assistants to unified assistant
conn.execute(
sa.text(
"""
UPDATE chat_session
SET persona_id = 0
WHERE persona_id IN (
SELECT id FROM persona WHERE builtin_persona = true AND id != 0
)
"""
)
)
# Step 5: Migrate user preferences - remove references to all builtin assistants
# First, get all builtin assistant IDs (except 0)
builtin_assistants_result = conn.execute(
sa.text(
"""
SELECT id FROM persona
WHERE builtin_persona = true AND id != 0
"""
)
).fetchall()
builtin_assistant_ids = [row[0] for row in builtin_assistants_result]
# Get all users with preferences
users_result = conn.execute(
sa.text(
"""
SELECT id, chosen_assistants, visible_assistants,
hidden_assistants, pinned_assistants
FROM "user"
"""
)
).fetchall()
for user_row in users_result:
user = UserRow(*user_row)
user_id: UUID = user.id
updates: dict[str, Any] = {}
# Remove all builtin assistants from chosen_assistants
if user.chosen_assistants:
new_chosen: list[int] = [
assistant_id
for assistant_id in user.chosen_assistants
if assistant_id not in builtin_assistant_ids
]
if new_chosen != user.chosen_assistants:
updates["chosen_assistants"] = json.dumps(new_chosen)
# Remove all builtin assistants from visible_assistants
if user.visible_assistants:
new_visible: list[int] = [
assistant_id
for assistant_id in user.visible_assistants
if assistant_id not in builtin_assistant_ids
]
if new_visible != user.visible_assistants:
updates["visible_assistants"] = json.dumps(new_visible)
# Add all builtin assistants to hidden_assistants
if user.hidden_assistants:
new_hidden: list[int] = list(user.hidden_assistants)
for old_id in builtin_assistant_ids:
if old_id not in new_hidden:
new_hidden.append(old_id)
if new_hidden != user.hidden_assistants:
updates["hidden_assistants"] = json.dumps(new_hidden)
else:
updates["hidden_assistants"] = json.dumps(builtin_assistant_ids)
# Remove all builtin assistants from pinned_assistants
if user.pinned_assistants:
new_pinned: list[int] = [
assistant_id
for assistant_id in user.pinned_assistants
if assistant_id not in builtin_assistant_ids
]
if new_pinned != user.pinned_assistants:
updates["pinned_assistants"] = json.dumps(new_pinned)
# Apply updates if any
if updates:
set_clause = ", ".join([f"{k} = :{k}" for k in updates.keys()])
updates["user_id"] = str(user_id) # Convert UUID to string for SQL
conn.execute(
sa.text(f'UPDATE "user" SET {set_clause} WHERE id = :user_id'),
updates,
)
# Commit transaction
conn.execute(sa.text("COMMIT"))
except Exception as e:
# Rollback on error
conn.execute(sa.text("ROLLBACK"))
raise e
def downgrade() -> None:
conn = op.get_bind()
# Start transaction
conn.execute(sa.text("BEGIN"))
try:
# Only restore General (ID -1) and Art (ID -3) assistants
# Step 1: Keep Search assistant (ID 0) as default but restore original state
conn.execute(
sa.text(
"""
UPDATE persona
SET is_default_persona = true,
is_visible = true,
deleted = false
WHERE id = 0
# Only restore General (ID -1) and Art (ID -3) assistants
# Step 1: Keep Search assistant (ID 0) as default but restore original state
conn.execute(
sa.text(
"""
)
UPDATE persona
SET is_default_persona = true,
is_visible = true,
deleted = false
WHERE id = 0
"""
)
)
# Step 2: Restore General assistant (ID -1)
conn.execute(
sa.text(
"""
UPDATE persona
SET deleted = false,
is_visible = true,
is_default_persona = true
WHERE id = :general_assistant_id
# Step 2: Restore General assistant (ID -1)
conn.execute(
sa.text(
"""
),
{"general_assistant_id": GENERAL_ASSISTANT_ID},
)
UPDATE persona
SET deleted = false,
is_visible = true,
is_default_persona = true
WHERE id = :general_assistant_id
"""
),
{"general_assistant_id": GENERAL_ASSISTANT_ID},
)
# Step 3: Restore Art assistant (ID -3)
conn.execute(
sa.text(
"""
UPDATE persona
SET deleted = false,
is_visible = true,
is_default_persona = true
WHERE id = :art_assistant_id
# Step 3: Restore Art assistant (ID -3)
conn.execute(
sa.text(
"""
),
{"art_assistant_id": ART_ASSISTANT_ID},
)
UPDATE persona
SET deleted = false,
is_visible = true,
is_default_persona = true
WHERE id = :art_assistant_id
"""
),
{"art_assistant_id": ART_ASSISTANT_ID},
)
# Note: We don't restore the original tool associations, names, or descriptions
# as those would require more complex logic to determine original state.
# We also cannot restore original chat session persona_ids as we don't
# have the original mappings.
# Other builtin assistants remain deleted as per the requirement.
# Commit transaction
conn.execute(sa.text("COMMIT"))
except Exception as e:
# Rollback on error
conn.execute(sa.text("ROLLBACK"))
raise e
# Note: We don't restore the original tool associations, names, or descriptions
# as those would require more complex logic to determine original state.
# We also cannot restore original chat session persona_ids as we don't
# have the original mappings.
# Other builtin assistants remain deleted as per the requirement.

View File

@@ -0,0 +1,35 @@
"""backend driven notification details
Revision ID: 5c3dca366b35
Revises: 9087b548dd69
Create Date: 2026-01-06 16:03:11.413724
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "5c3dca366b35"
down_revision = "9087b548dd69"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.add_column(
"notification",
sa.Column(
"title", sa.String(), nullable=False, server_default="New Notification"
),
)
op.add_column(
"notification",
sa.Column("description", sa.String(), nullable=True, server_default=""),
)
def downgrade() -> None:
op.drop_column("notification", "title")
op.drop_column("notification", "description")

View File

@@ -0,0 +1,75 @@
"""nullify_default_task_prompt
Revision ID: 699221885109
Revises: 7e490836d179
Create Date: 2025-12-30 10:00:00.000000
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "699221885109"
down_revision = "7e490836d179"
branch_labels = None
depends_on = None
DEFAULT_PERSONA_ID = 0
def upgrade() -> None:
# Make task_prompt column nullable
# Note: The model had nullable=True but the DB column was NOT NULL until this point
op.alter_column(
"persona",
"task_prompt",
nullable=True,
)
# Set task_prompt to NULL for the default persona
conn = op.get_bind()
conn.execute(
sa.text(
"""
UPDATE persona
SET task_prompt = NULL
WHERE id = :persona_id
"""
),
{"persona_id": DEFAULT_PERSONA_ID},
)
def downgrade() -> None:
# Restore task_prompt to empty string for the default persona
conn = op.get_bind()
conn.execute(
sa.text(
"""
UPDATE persona
SET task_prompt = ''
WHERE id = :persona_id AND task_prompt IS NULL
"""
),
{"persona_id": DEFAULT_PERSONA_ID},
)
# Set any remaining NULL task_prompts to empty string before making non-nullable
conn.execute(
sa.text(
"""
UPDATE persona
SET task_prompt = ''
WHERE task_prompt IS NULL
"""
)
)
# Revert task_prompt column to not nullable
op.alter_column(
"persona",
"task_prompt",
nullable=False,
)

View File

@@ -0,0 +1,54 @@
"""add image generation config table
Revision ID: 7206234e012a
Revises: 699221885109
Create Date: 2025-12-21 00:00:00.000000
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "7206234e012a"
down_revision = "699221885109"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.create_table(
"image_generation_config",
sa.Column("image_provider_id", sa.String(), primary_key=True),
sa.Column("model_configuration_id", sa.Integer(), nullable=False),
sa.Column("is_default", sa.Boolean(), nullable=False),
sa.ForeignKeyConstraint(
["model_configuration_id"],
["model_configuration.id"],
ondelete="CASCADE",
),
)
op.create_index(
"ix_image_generation_config_is_default",
"image_generation_config",
["is_default"],
unique=False,
)
op.create_index(
"ix_image_generation_config_model_configuration_id",
"image_generation_config",
["model_configuration_id"],
unique=False,
)
def downgrade() -> None:
op.drop_index(
"ix_image_generation_config_model_configuration_id",
table_name="image_generation_config",
)
op.drop_index(
"ix_image_generation_config_is_default", table_name="image_generation_config"
)
op.drop_table("image_generation_config")

View File

@@ -10,7 +10,7 @@ from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql
from onyx.llm.llm_provider_options import (
from onyx.llm.well_known_providers.llm_provider_options import (
fetch_model_names_for_provider_as_set,
fetch_visible_model_names_for_provider_as_set,
)

View File

@@ -0,0 +1,80 @@
"""nullify_default_system_prompt
Revision ID: 7e490836d179
Revises: c1d2e3f4a5b6
Create Date: 2025-12-29 16:54:36.635574
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "7e490836d179"
down_revision = "c1d2e3f4a5b6"
branch_labels = None
depends_on = None
# This is the default system prompt from the previous migration (87c52ec39f84)
# ruff: noqa: E501, W605 start
PREVIOUS_DEFAULT_SYSTEM_PROMPT = """
You are a highly capable, thoughtful, and precise assistant. Your goal is to deeply understand the user's intent, ask clarifying questions when needed, think step-by-step through complex problems, provide clear and accurate answers, and proactively anticipate helpful follow-up information. Always prioritize being truthful, nuanced, insightful, and efficient.
The current date is [[CURRENT_DATETIME]].[[CITATION_GUIDANCE]]
# Response Style
You use different text styles, bolding, emojis (sparingly), block quotes, and other formatting to make your responses more readable and engaging.
You use proper Markdown and LaTeX to format your responses for math, scientific, and chemical formulas, symbols, etc.: '$$\\n[expression]\\n$$' for standalone cases and '\\( [expression] \\)' when inline.
For code you prefer to use Markdown and specify the language.
You can use horizontal rules (---) to separate sections of your responses.
You can use Markdown tables to format your responses for data, lists, and other structured information.
""".lstrip()
# ruff: noqa: E501, W605 end
def upgrade() -> None:
# Make system_prompt column nullable (model already has nullable=True but DB doesn't)
op.alter_column(
"persona",
"system_prompt",
nullable=True,
)
# Set system_prompt to NULL where it matches the previous default
conn = op.get_bind()
conn.execute(
sa.text(
"""
UPDATE persona
SET system_prompt = NULL
WHERE system_prompt = :previous_default
"""
),
{"previous_default": PREVIOUS_DEFAULT_SYSTEM_PROMPT},
)
def downgrade() -> None:
# Restore the default system prompt for personas that have NULL
# Note: This may restore the prompt to personas that originally had NULL
# before this migration, but there's no way to distinguish them
conn = op.get_bind()
conn.execute(
sa.text(
"""
UPDATE persona
SET system_prompt = :previous_default
WHERE system_prompt IS NULL
"""
),
{"previous_default": PREVIOUS_DEFAULT_SYSTEM_PROMPT},
)
# Revert system_prompt column to not nullable
op.alter_column(
"persona",
"system_prompt",
nullable=False,
)

View File

@@ -0,0 +1,49 @@
"""notifications constraint, sort index, and cleanup old notifications
Revision ID: 8405ca81cc83
Revises: a3c1a7904cd0
Create Date: 2026-01-07 16:43:44.855156
"""
from alembic import op
# revision identifiers, used by Alembic.
revision = "8405ca81cc83"
down_revision = "a3c1a7904cd0"
branch_labels = None
depends_on = None
def upgrade() -> None:
# Create unique index for notification deduplication.
# This enables atomic ON CONFLICT DO NOTHING inserts in batch_create_notifications.
#
# Uses COALESCE to handle NULL additional_data (NULLs are normally distinct
# in unique constraints, but we want NULL == NULL for deduplication).
# The '{}' represents an empty JSONB object as the NULL replacement.
# Clean up legacy notifications first
op.execute("DELETE FROM notification WHERE title = 'New Notification'")
op.execute(
"""
CREATE UNIQUE INDEX IF NOT EXISTS ix_notification_user_type_data
ON notification (user_id, notif_type, COALESCE(additional_data, '{}'::jsonb))
"""
)
# Create index for efficient notification sorting by user
# Covers: WHERE user_id = ? ORDER BY dismissed, first_shown DESC
op.execute(
"""
CREATE INDEX IF NOT EXISTS ix_notification_user_sort
ON notification (user_id, dismissed, first_shown DESC)
"""
)
def downgrade() -> None:
op.execute("DROP INDEX IF EXISTS ix_notification_user_type_data")
op.execute("DROP INDEX IF EXISTS ix_notification_user_sort")

View File

@@ -0,0 +1,136 @@
"""seed_default_image_gen_config
Revision ID: 9087b548dd69
Revises: 2b90f3af54b8
Create Date: 2026-01-05 00:00:00.000000
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "9087b548dd69"
down_revision = "2b90f3af54b8"
branch_labels = None
depends_on = None
# Constants for default image generation config
# Source: web/src/app/admin/configuration/image-generation/constants.ts
IMAGE_PROVIDER_ID = "openai_gpt_image_1"
MODEL_NAME = "gpt-image-1"
PROVIDER_NAME = "openai"
def upgrade() -> None:
conn = op.get_bind()
# Check if image_generation_config table already has records
existing_configs = (
conn.execute(sa.text("SELECT COUNT(*) FROM image_generation_config")).scalar()
or 0
)
if existing_configs > 0:
# Skip if configs already exist - user may have configured manually
return
# Find the first OpenAI LLM provider
openai_provider = conn.execute(
sa.text(
"""
SELECT id, api_key
FROM llm_provider
WHERE provider = :provider
ORDER BY id
LIMIT 1
"""
),
{"provider": PROVIDER_NAME},
).fetchone()
if not openai_provider:
# No OpenAI provider found - nothing to do
return
source_provider_id, api_key = openai_provider
# Create new LLM provider for image generation (clone only api_key)
result = conn.execute(
sa.text(
"""
INSERT INTO llm_provider (
name, provider, api_key, api_base, api_version,
deployment_name, default_model_name, is_public,
is_default_provider, is_default_vision_provider, is_auto_mode
)
VALUES (
:name, :provider, :api_key, NULL, NULL,
NULL, :default_model_name, :is_public,
NULL, NULL, :is_auto_mode
)
RETURNING id
"""
),
{
"name": f"Image Gen - {IMAGE_PROVIDER_ID}",
"provider": PROVIDER_NAME,
"api_key": api_key,
"default_model_name": MODEL_NAME,
"is_public": True,
"is_auto_mode": False,
},
)
new_provider_id = result.scalar()
# Create model configuration
result = conn.execute(
sa.text(
"""
INSERT INTO model_configuration (
llm_provider_id, name, is_visible, max_input_tokens,
supports_image_input, display_name
)
VALUES (
:llm_provider_id, :name, :is_visible, :max_input_tokens,
:supports_image_input, :display_name
)
RETURNING id
"""
),
{
"llm_provider_id": new_provider_id,
"name": MODEL_NAME,
"is_visible": True,
"max_input_tokens": None,
"supports_image_input": False,
"display_name": None,
},
)
model_config_id = result.scalar()
# Create image generation config
conn.execute(
sa.text(
"""
INSERT INTO image_generation_config (
image_provider_id, model_configuration_id, is_default
)
VALUES (
:image_provider_id, :model_configuration_id, :is_default
)
"""
),
{
"image_provider_id": IMAGE_PROVIDER_ID,
"model_configuration_id": model_config_id,
"is_default": True,
},
)
def downgrade() -> None:
# We don't remove the config on downgrade since it's safe to keep around
# If we upgrade again, it will be a no-op due to the existing records check
pass

View File

@@ -0,0 +1,33 @@
"""add_is_auto_mode_to_llm_provider
Revision ID: 9a0296d7421e
Revises: 7206234e012a
Create Date: 2025-12-17 18:14:29.620981
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "9a0296d7421e"
down_revision = "7206234e012a"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.add_column(
"llm_provider",
sa.Column(
"is_auto_mode",
sa.Boolean(),
nullable=False,
server_default="false",
),
)
def downgrade() -> None:
op.drop_column("llm_provider", "is_auto_mode")

View File

@@ -234,6 +234,8 @@ def downgrade() -> None:
if "instructions" in columns:
op.drop_column("user_project", "instructions")
op.execute("ALTER TABLE user_project RENAME TO user_folder")
# Update NULL descriptions to empty string before setting NOT NULL constraint
op.execute("UPDATE user_folder SET description = '' WHERE description IS NULL")
op.alter_column("user_folder", "description", nullable=False)
logger.info("Renamed user_project back to user_folder")

View File

@@ -42,20 +42,13 @@ TOOL_DESCRIPTIONS = {
def upgrade() -> None:
conn = op.get_bind()
conn.execute(sa.text("BEGIN"))
try:
for tool_id, description in TOOL_DESCRIPTIONS.items():
conn.execute(
sa.text(
"UPDATE tool SET description = :description WHERE in_code_tool_id = :tool_id"
),
{"description": description, "tool_id": tool_id},
)
conn.execute(sa.text("COMMIT"))
except Exception as e:
conn.execute(sa.text("ROLLBACK"))
raise e
for tool_id, description in TOOL_DESCRIPTIONS.items():
conn.execute(
sa.text(
"UPDATE tool SET description = :description WHERE in_code_tool_id = :tool_id"
),
{"description": description, "tool_id": tool_id},
)
def downgrade() -> None:

View File

@@ -0,0 +1,39 @@
"""remove userfile related deprecated fields
Revision ID: a3c1a7904cd0
Revises: 5c3dca366b35
Create Date: 2026-01-06 13:00:30.634396
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "a3c1a7904cd0"
down_revision = "5c3dca366b35"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.drop_column("user_file", "document_id")
op.drop_column("user_file", "document_id_migrated")
op.drop_column("connector_credential_pair", "is_user_file")
def downgrade() -> None:
op.add_column(
"connector_credential_pair",
sa.Column("is_user_file", sa.Boolean(), nullable=False, server_default="false"),
)
op.add_column(
"user_file",
sa.Column("document_id", sa.String(), nullable=True),
)
op.add_column(
"user_file",
sa.Column(
"document_id_migrated", sa.Boolean(), nullable=False, server_default="true"
),
)

View File

@@ -7,7 +7,6 @@ Create Date: 2025-12-18 16:00:00.000000
"""
from alembic import op
from onyx.deep_research.dr_mock_tools import RESEARCH_AGENT_DB_NAME
import sqlalchemy as sa
@@ -19,7 +18,7 @@ depends_on = None
DEEP_RESEARCH_TOOL = {
"name": RESEARCH_AGENT_DB_NAME,
"name": "ResearchAgent",
"display_name": "Research Agent",
"description": "The Research Agent is a sub-agent that conducts research on a specific topic.",
"in_code_tool_id": "ResearchAgent",

View File

@@ -70,80 +70,66 @@ BUILT_IN_TOOLS = [
def upgrade() -> None:
conn = op.get_bind()
# Start transaction
conn.execute(sa.text("BEGIN"))
# Get existing tools to check what already exists
existing_tools = conn.execute(
sa.text("SELECT in_code_tool_id FROM tool WHERE in_code_tool_id IS NOT NULL")
).fetchall()
existing_tool_ids = {row[0] for row in existing_tools}
try:
# Get existing tools to check what already exists
existing_tools = conn.execute(
sa.text(
"SELECT in_code_tool_id FROM tool WHERE in_code_tool_id IS NOT NULL"
# Insert or update built-in tools
for tool in BUILT_IN_TOOLS:
in_code_id = tool["in_code_tool_id"]
# Handle historical rename: InternetSearchTool -> WebSearchTool
if (
in_code_id == "WebSearchTool"
and "WebSearchTool" not in existing_tool_ids
and "InternetSearchTool" in existing_tool_ids
):
# Rename the existing InternetSearchTool row in place and update fields
conn.execute(
sa.text(
"""
UPDATE tool
SET name = :name,
display_name = :display_name,
description = :description,
in_code_tool_id = :in_code_tool_id
WHERE in_code_tool_id = 'InternetSearchTool'
"""
),
tool,
)
).fetchall()
existing_tool_ids = {row[0] for row in existing_tools}
# Keep the local view of existing ids in sync to avoid duplicate insert
existing_tool_ids.discard("InternetSearchTool")
existing_tool_ids.add("WebSearchTool")
continue
# Insert or update built-in tools
for tool in BUILT_IN_TOOLS:
in_code_id = tool["in_code_tool_id"]
# Handle historical rename: InternetSearchTool -> WebSearchTool
if (
in_code_id == "WebSearchTool"
and "WebSearchTool" not in existing_tool_ids
and "InternetSearchTool" in existing_tool_ids
):
# Rename the existing InternetSearchTool row in place and update fields
conn.execute(
sa.text(
"""
UPDATE tool
SET name = :name,
display_name = :display_name,
description = :description,
in_code_tool_id = :in_code_tool_id
WHERE in_code_tool_id = 'InternetSearchTool'
"""
),
tool,
)
# Keep the local view of existing ids in sync to avoid duplicate insert
existing_tool_ids.discard("InternetSearchTool")
existing_tool_ids.add("WebSearchTool")
continue
if in_code_id in existing_tool_ids:
# Update existing tool
conn.execute(
sa.text(
"""
UPDATE tool
SET name = :name,
display_name = :display_name,
description = :description
WHERE in_code_tool_id = :in_code_tool_id
"""
),
tool,
)
else:
# Insert new tool
conn.execute(
sa.text(
"""
INSERT INTO tool (name, display_name, description, in_code_tool_id)
VALUES (:name, :display_name, :description, :in_code_tool_id)
"""
),
tool,
)
# Commit transaction
conn.execute(sa.text("COMMIT"))
except Exception as e:
# Rollback on error
conn.execute(sa.text("ROLLBACK"))
raise e
if in_code_id in existing_tool_ids:
# Update existing tool
conn.execute(
sa.text(
"""
UPDATE tool
SET name = :name,
display_name = :display_name,
description = :description
WHERE in_code_tool_id = :in_code_tool_id
"""
),
tool,
)
else:
# Insert new tool
conn.execute(
sa.text(
"""
INSERT INTO tool (name, display_name, description, in_code_tool_id)
VALUES (:name, :display_name, :description, :in_code_tool_id)
"""
),
tool,
)
def downgrade() -> None:

View File

@@ -0,0 +1,64 @@
"""sync_exa_api_key_to_content_provider
Revision ID: d1b637d7050a
Revises: d25168c2beee
Create Date: 2026-01-09 15:54:15.646249
"""
from alembic import op
from sqlalchemy import text
# revision identifiers, used by Alembic.
revision = "d1b637d7050a"
down_revision = "d25168c2beee"
branch_labels = None
depends_on = None
def upgrade() -> None:
# Exa uses a shared API key between search and content providers.
# For existing Exa search providers with API keys, create the corresponding
# content provider if it doesn't exist yet.
connection = op.get_bind()
# Check if Exa search provider exists with an API key
result = connection.execute(
text(
"""
SELECT api_key FROM internet_search_provider
WHERE provider_type = 'exa' AND api_key IS NOT NULL
LIMIT 1
"""
)
)
row = result.fetchone()
if row:
api_key = row[0]
# Create Exa content provider with the shared key
connection.execute(
text(
"""
INSERT INTO internet_content_provider
(name, provider_type, api_key, is_active)
VALUES ('Exa', 'exa', :api_key, false)
ON CONFLICT (name) DO NOTHING
"""
),
{"api_key": api_key},
)
def downgrade() -> None:
# Remove the Exa content provider that was created by this migration
connection = op.get_bind()
connection.execute(
text(
"""
DELETE FROM internet_content_provider
WHERE provider_type = 'exa'
"""
)
)

View File

@@ -0,0 +1,86 @@
"""tool_name_consistency
Revision ID: d25168c2beee
Revises: 8405ca81cc83
Create Date: 2026-01-11 17:54:40.135777
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = "d25168c2beee"
down_revision = "8405ca81cc83"
branch_labels = None
depends_on = None
# Currently the seeded tools have the in_code_tool_id == name
CURRENT_TOOL_NAME_MAPPING = [
"SearchTool",
"WebSearchTool",
"ImageGenerationTool",
"PythonTool",
"OpenURLTool",
"KnowledgeGraphTool",
"ResearchAgent",
]
# Mapping of in_code_tool_id -> name
# These are the expected names that we want in the database
EXPECTED_TOOL_NAME_MAPPING = {
"SearchTool": "internal_search",
"WebSearchTool": "web_search",
"ImageGenerationTool": "generate_image",
"PythonTool": "python",
"OpenURLTool": "open_url",
"KnowledgeGraphTool": "run_kg_search",
"ResearchAgent": "research_agent",
}
def upgrade() -> None:
conn = op.get_bind()
# Mapping of in_code_tool_id to the NAME constant from each tool class
# These match the .name property of each tool implementation
tool_name_mapping = EXPECTED_TOOL_NAME_MAPPING
# Update the name column for each tool based on its in_code_tool_id
for in_code_tool_id, expected_name in tool_name_mapping.items():
conn.execute(
sa.text(
"""
UPDATE tool
SET name = :expected_name
WHERE in_code_tool_id = :in_code_tool_id
"""
),
{
"expected_name": expected_name,
"in_code_tool_id": in_code_tool_id,
},
)
def downgrade() -> None:
conn = op.get_bind()
# Reverse the migration by setting name back to in_code_tool_id
# This matches the original pattern where name was the class name
for in_code_tool_id in CURRENT_TOOL_NAME_MAPPING:
conn.execute(
sa.text(
"""
UPDATE tool
SET name = :current_name
WHERE in_code_tool_id = :in_code_tool_id
"""
),
{
"current_name": in_code_tool_id,
"in_code_tool_id": in_code_tool_id,
},
)

View File

@@ -109,11 +109,6 @@ CHECK_TTL_MANAGEMENT_TASK_FREQUENCY_IN_HOURS = float(
STRIPE_SECRET_KEY = os.environ.get("STRIPE_SECRET_KEY")
STRIPE_PRICE_ID = os.environ.get("STRIPE_PRICE")
OPENAI_DEFAULT_API_KEY = os.environ.get("OPENAI_DEFAULT_API_KEY")
ANTHROPIC_DEFAULT_API_KEY = os.environ.get("ANTHROPIC_DEFAULT_API_KEY")
COHERE_DEFAULT_API_KEY = os.environ.get("COHERE_DEFAULT_API_KEY")
# JWT Public Key URL
JWT_PUBLIC_KEY_URL: str | None = os.getenv("JWT_PUBLIC_KEY_URL", None)

View File

@@ -3,30 +3,42 @@ from uuid import UUID
from sqlalchemy.orm import Session
from onyx.configs.constants import NotificationType
from onyx.db.models import Persona
from onyx.db.models import Persona__User
from onyx.db.models import Persona__UserGroup
from onyx.db.notification import create_notification
from onyx.server.features.persona.models import PersonaSharedNotificationData
def make_persona_private(
def update_persona_access(
persona_id: int,
creator_user_id: UUID | None,
user_ids: list[UUID] | None,
group_ids: list[int] | None,
db_session: Session,
is_public: bool | None = None,
user_ids: list[UUID] | None = None,
group_ids: list[int] | None = None,
) -> None:
"""NOTE(rkuo): This function batches all updates into a single commit. If we don't
dedupe the inputs, the commit will exception."""
"""Updates the access settings for a persona including public status, user shares,
and group shares.
db_session.query(Persona__User).filter(
Persona__User.persona_id == persona_id
).delete(synchronize_session="fetch")
db_session.query(Persona__UserGroup).filter(
Persona__UserGroup.persona_id == persona_id
).delete(synchronize_session="fetch")
NOTE: This function batches all updates. If we don't dedupe the inputs,
the commit will raise an exception.
NOTE: Callers are responsible for committing."""
if is_public is not None:
persona = db_session.query(Persona).filter(Persona.id == persona_id).first()
if persona:
persona.is_public = is_public
# NOTE: For user-ids and group-ids, `None` means "leave unchanged", `[]` means "clear all shares",
# and a non-empty list means "replace with these shares".
if user_ids is not None:
db_session.query(Persona__User).filter(
Persona__User.persona_id == persona_id
).delete(synchronize_session="fetch")
if user_ids:
user_ids_set = set(user_ids)
for user_id in user_ids_set:
db_session.add(Persona__User(persona_id=persona_id, user_id=user_id))
@@ -34,17 +46,20 @@ def make_persona_private(
create_notification(
user_id=user_id,
notif_type=NotificationType.PERSONA_SHARED,
title="A new agent was shared with you!",
db_session=db_session,
additional_data=PersonaSharedNotificationData(
persona_id=persona_id,
).model_dump(),
)
if group_ids:
if group_ids is not None:
db_session.query(Persona__UserGroup).filter(
Persona__UserGroup.persona_id == persona_id
).delete(synchronize_session="fetch")
group_ids_set = set(group_ids)
for group_id in group_ids_set:
db_session.add(
Persona__UserGroup(persona_id=persona_id, user_group_id=group_id)
)
db_session.commit()
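
An illustrative usage sketch of the new semantics (not part of the diff): None leaves a share list unchanged, an empty list clears it, a non-empty list replaces it, and the caller owns the commit. The keyword names follow the signature shown above; the module path is not visible in this hunk, so the import is omitted.

# Illustrative sketch, not part of the diff. Assumes `db_session` is an open
# SQLAlchemy Session, persona 42 exists, and group_a_id/group_b_id are
# hypothetical group ids.
update_persona_access(
    persona_id=42,
    creator_user_id=None,
    db_session=db_session,
    is_public=True,  # flip the public flag; user/group shares stay untouched (both None)
)
update_persona_access(
    persona_id=42,
    creator_user_id=None,
    db_session=db_session,
    user_ids=[],  # [] clears all per-user shares
    group_ids=[group_a_id, group_b_id],  # replace group shares with exactly these
)
db_session.commit()  # the function itself no longer commits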

View File

@@ -21,8 +21,9 @@ from onyx.auth.users import current_admin_user
from onyx.auth.users import current_user
from onyx.db.engine.sql_engine import get_session
from onyx.db.models import User
from onyx.server.utils import PUBLIC_API_TAGS
router = APIRouter(prefix="/analytics")
router = APIRouter(prefix="/analytics", tags=PUBLIC_API_TAGS)
_DEFAULT_LOOKBACK_DAYS = 30

View File

@@ -1,3 +1,4 @@
from enum import Enum
from typing import Any
from typing import List
@@ -23,6 +24,12 @@ class NavigationItem(BaseModel):
return instance
class LogoDisplayStyle(str, Enum):
LOGO_AND_NAME = "logo_and_name"
LOGO_ONLY = "logo_only"
NAME_ONLY = "name_only"
class EnterpriseSettings(BaseModel):
"""General settings that only apply to the Enterprise Edition of Onyx
@@ -31,6 +38,7 @@ class EnterpriseSettings(BaseModel):
application_name: str | None = None
use_custom_logo: bool = False
use_custom_logotype: bool = False
logo_display_style: LogoDisplayStyle | None = None
# custom navigation
custom_nav_items: List[NavigationItem] = Field(default_factory=list)
@@ -42,6 +50,9 @@ class EnterpriseSettings(BaseModel):
custom_popup_header: str | None = None
custom_popup_content: str | None = None
enable_consent_screen: bool | None = None
consent_screen_prompt: str | None = None
show_first_visit_notice: bool | None = None
custom_greeting_message: str | None = None
def check_validity(self) -> None:
return

View File

@@ -23,6 +23,7 @@ from onyx.db.models import User
from onyx.llm.factory import get_llm_for_persona
from onyx.natural_language_processing.utils import get_tokenizer
from onyx.server.query_and_chat.models import CreateChatMessageRequest
from onyx.server.query_and_chat.models import MessageOrigin
from onyx.utils.logger import setup_logger
logger = setup_logger()
@@ -100,13 +101,13 @@ def handle_simplified_chat_message(
chunks_below=0,
full_doc=chat_message_req.full_doc,
structured_response_format=chat_message_req.structured_response_format,
origin=MessageOrigin.API,
)
packets = stream_chat_message_objects(
new_msg_req=full_chat_msg_info,
user=user,
db_session=db_session,
enforce_chat_session_id_for_search_docs=False,
)
return gather_stream(packets)
@@ -204,13 +205,13 @@ def handle_send_message_simple_with_history(
chunks_below=0,
full_doc=req.full_doc,
structured_response_format=req.structured_response_format,
origin=MessageOrigin.API,
)
packets = stream_chat_message_objects(
new_msg_req=full_chat_msg_info,
user=user,
db_session=db_session,
enforce_chat_session_id_for_search_docs=False,
)
return gather_stream(packets)

View File

@@ -48,6 +48,7 @@ from onyx.file_store.file_store import get_default_file_store
from onyx.server.documents.models import PaginatedReturn
from onyx.server.query_and_chat.models import ChatSessionDetails
from onyx.server.query_and_chat.models import ChatSessionsResponse
from onyx.server.utils import PUBLIC_API_TAGS
from onyx.utils.threadpool_concurrency import parallel_yield
from shared_configs.contextvars import get_current_tenant_id
@@ -294,7 +295,7 @@ def list_all_query_history_exports(
)
@router.post("/admin/query-history/start-export")
@router.post("/admin/query-history/start-export", tags=PUBLIC_API_TAGS)
def start_query_history_export(
_: User | None = Depends(current_admin_user),
db_session: Session = Depends(get_session),
@@ -340,7 +341,7 @@ def start_query_history_export(
return {"request_id": task_id}
@router.get("/admin/query-history/export-status")
@router.get("/admin/query-history/export-status", tags=PUBLIC_API_TAGS)
def get_query_history_export_status(
request_id: str,
_: User | None = Depends(current_admin_user),
@@ -374,7 +375,7 @@ def get_query_history_export_status(
return {"status": TaskStatus.SUCCESS}
@router.get("/admin/query-history/download")
@router.get("/admin/query-history/download", tags=PUBLIC_API_TAGS)
def download_query_history_csv(
request_id: str,
_: User | None = Depends(current_admin_user),

View File

@@ -0,0 +1,92 @@
"""Tenant-specific usage limit overrides from the control plane (EE version)."""
import requests
from ee.onyx.server.tenants.access import generate_data_plane_token
from onyx.configs.app_configs import CONTROL_PLANE_API_BASE_URL
from onyx.server.tenant_usage_limits import TenantUsageLimitOverrides
from onyx.utils.logger import setup_logger
logger = setup_logger()
# In-memory storage for tenant overrides (populated at startup)
_tenant_usage_limit_overrides: dict[str, TenantUsageLimitOverrides] | None = None
def fetch_usage_limit_overrides() -> dict[str, TenantUsageLimitOverrides]:
"""
Fetch tenant-specific usage limit overrides from the control plane.
Returns:
Dictionary mapping tenant_id to their specific limit overrides.
Returns empty dict on any error (falls back to defaults).
"""
try:
token = generate_data_plane_token()
headers = {
"Authorization": f"Bearer {token}",
"Content-Type": "application/json",
}
url = f"{CONTROL_PLANE_API_BASE_URL}/usage-limit-overrides"
response = requests.get(url, headers=headers, timeout=30)
response.raise_for_status()
tenant_overrides = response.json()
# Parse each tenant's overrides
result: dict[str, TenantUsageLimitOverrides] = {}
for override_data in tenant_overrides:
tenant_id = override_data["tenant_id"]
try:
result[tenant_id] = TenantUsageLimitOverrides(**override_data)
except Exception as e:
logger.warning(
f"Failed to parse usage limit overrides for tenant {tenant_id}: {e}"
)
return result
except requests.exceptions.RequestException as e:
logger.warning(f"Failed to fetch usage limit overrides from control plane: {e}")
return {}
except Exception as e:
logger.error(f"Error parsing usage limit overrides: {e}")
return {}
def load_usage_limit_overrides() -> dict[str, TenantUsageLimitOverrides]:
"""
Load tenant usage limit overrides from the control plane.
Called at server startup to populate the in-memory cache.
"""
global _tenant_usage_limit_overrides
logger.info("Loading tenant usage limit overrides from control plane...")
overrides = fetch_usage_limit_overrides()
_tenant_usage_limit_overrides = overrides
if overrides:
logger.info(f"Loaded usage limit overrides for {len(overrides)} tenants")
else:
logger.info("No tenant-specific usage limit overrides found")
return overrides
def get_tenant_usage_limit_overrides(
tenant_id: str,
) -> TenantUsageLimitOverrides | None:
"""
Get the usage limit overrides for a specific tenant.
Args:
tenant_id: The tenant ID to look up
Returns:
TenantUsageLimitOverrides if the tenant has overrides, None otherwise.
"""
global _tenant_usage_limit_overrides
if _tenant_usage_limit_overrides is None:
_tenant_usage_limit_overrides = load_usage_limit_overrides()
return _tenant_usage_limit_overrides.get(tenant_id)
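
A brief usage sketch (illustrative, not part of the diff): the overrides are loaded once at startup and then consulted per tenant. DEFAULT_LIMITS is a hypothetical stand-in for whatever default limit values the caller applies when no override exists.

# Illustrative sketch, not part of the diff; uses only the functions defined above.
load_usage_limit_overrides()  # typically invoked once at server startup

def effective_limits_for(tenant_id: str):
    overrides = get_tenant_usage_limit_overrides(tenant_id)
    # Fall back to the application defaults when the tenant has no override.
    return overrides if overrides is not None else DEFAULT_LIMITS  # hypothetical defaults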

View File

@@ -1,9 +1,9 @@
from typing import cast
from typing import Literal
import requests
import stripe
from ee.onyx.configs.app_configs import STRIPE_PRICE_ID
from ee.onyx.configs.app_configs import STRIPE_SECRET_KEY
from ee.onyx.server.tenants.access import generate_data_plane_token
from ee.onyx.server.tenants.models import BillingInformation
@@ -16,15 +16,21 @@ stripe.api_key = STRIPE_SECRET_KEY
logger = setup_logger()
def fetch_stripe_checkout_session(tenant_id: str) -> str:
def fetch_stripe_checkout_session(
tenant_id: str,
billing_period: Literal["monthly", "annual"] = "monthly",
) -> str:
token = generate_data_plane_token()
headers = {
"Authorization": f"Bearer {token}",
"Content-Type": "application/json",
}
url = f"{CONTROL_PLANE_API_BASE_URL}/create-checkout-session"
params = {"tenant_id": tenant_id}
response = requests.post(url, headers=headers, params=params)
payload = {
"tenant_id": tenant_id,
"billing_period": billing_period,
}
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
return response.json()["sessionId"]
@@ -72,22 +78,24 @@ def fetch_billing_information(
def register_tenant_users(tenant_id: str, number_of_users: int) -> stripe.Subscription:
"""
Send a request to the control service to register the number of users for a tenant.
Update the number of seats for a tenant's subscription.
Preserves the existing price (monthly, annual, or grandfathered).
"""
if not STRIPE_PRICE_ID:
raise Exception("STRIPE_PRICE_ID is not set")
response = fetch_tenant_stripe_information(tenant_id)
stripe_subscription_id = cast(str, response.get("stripe_subscription_id"))
subscription = stripe.Subscription.retrieve(stripe_subscription_id)
subscription_item = subscription["items"]["data"][0]
# Use existing price to preserve the customer's current plan
current_price_id = subscription_item.price.id
updated_subscription = stripe.Subscription.modify(
stripe_subscription_id,
items=[
{
"id": subscription["items"]["data"][0].id,
"price": STRIPE_PRICE_ID,
"id": subscription_item.id,
"price": current_price_id,
"quantity": number_of_users,
}
],

View File

@@ -10,6 +10,7 @@ from ee.onyx.server.tenants.billing import fetch_billing_information
from ee.onyx.server.tenants.billing import fetch_stripe_checkout_session
from ee.onyx.server.tenants.billing import fetch_tenant_stripe_information
from ee.onyx.server.tenants.models import BillingInformation
from ee.onyx.server.tenants.models import CreateSubscriptionSessionRequest
from ee.onyx.server.tenants.models import ProductGatingFullSyncRequest
from ee.onyx.server.tenants.models import ProductGatingRequest
from ee.onyx.server.tenants.models import ProductGatingResponse
@@ -104,15 +105,18 @@ async def create_customer_portal_session(
@router.post("/create-subscription-session")
async def create_subscription_session(
request: CreateSubscriptionSessionRequest | None = None,
_: User = Depends(current_admin_user),
) -> SubscriptionSessionResponse:
try:
tenant_id = CURRENT_TENANT_ID_CONTEXTVAR.get()
if not tenant_id:
raise HTTPException(status_code=400, detail="Tenant ID not found")
session_id = fetch_stripe_checkout_session(tenant_id)
billing_period = request.billing_period if request else "monthly"
session_id = fetch_stripe_checkout_session(tenant_id, billing_period)
return SubscriptionSessionResponse(sessionId=session_id)
except Exception as e:
logger.exception("Failed to create resubscription session")
logger.exception("Failed to create subscription session")
raise HTTPException(status_code=500, detail=str(e))

View File

@@ -1,4 +1,5 @@
from datetime import datetime
from typing import Literal
from pydantic import BaseModel
@@ -73,6 +74,12 @@ class SubscriptionSessionResponse(BaseModel):
sessionId: str
class CreateSubscriptionSessionRequest(BaseModel):
"""Request to create a subscription checkout session."""
billing_period: Literal["monthly", "annual"] = "monthly"
class TenantByDomainResponse(BaseModel):
tenant_id: str
number_of_users: int
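
An illustrative client-side sketch (not part of the diff) of requesting an annual checkout session. The host, mount prefix, and auth header are assumptions; only the /create-subscription-session route and the billing_period field are visible in this hunk, and billing_period defaults to "monthly" when the body is omitted.

# Illustrative sketch, not part of the diff.
import requests

resp = requests.post(
    "https://onyx.example.com/create-subscription-session",  # actual mount prefix not shown here
    json={"billing_period": "annual"},
    headers={"Authorization": "Bearer <admin session token>"},  # admin auth is required by the route
    timeout=30,
)
resp.raise_for_status()
checkout_session_id = resp.json()["sessionId"]  # matches SubscriptionSessionResponse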

View File

@@ -1,5 +1,4 @@
import asyncio
import logging
import uuid
import aiohttp # Async HTTP client
@@ -10,10 +9,7 @@ from fastapi import Request
from sqlalchemy import select
from sqlalchemy.orm import Session
from ee.onyx.configs.app_configs import ANTHROPIC_DEFAULT_API_KEY
from ee.onyx.configs.app_configs import COHERE_DEFAULT_API_KEY
from ee.onyx.configs.app_configs import HUBSPOT_TRACKING_URL
from ee.onyx.configs.app_configs import OPENAI_DEFAULT_API_KEY
from ee.onyx.server.tenants.access import generate_data_plane_token
from ee.onyx.server.tenants.models import TenantByDomainResponse
from ee.onyx.server.tenants.models import TenantCreationPayload
@@ -25,11 +21,18 @@ from ee.onyx.server.tenants.user_mapping import add_users_to_tenant
from ee.onyx.server.tenants.user_mapping import get_tenant_id_for_email
from ee.onyx.server.tenants.user_mapping import user_owns_a_tenant
from onyx.auth.users import exceptions
from onyx.configs.app_configs import ANTHROPIC_DEFAULT_API_KEY
from onyx.configs.app_configs import COHERE_DEFAULT_API_KEY
from onyx.configs.app_configs import CONTROL_PLANE_API_BASE_URL
from onyx.configs.app_configs import DEV_MODE
from onyx.configs.app_configs import OPENAI_DEFAULT_API_KEY
from onyx.configs.app_configs import OPENROUTER_DEFAULT_API_KEY
from onyx.configs.app_configs import VERTEXAI_DEFAULT_CREDENTIALS
from onyx.configs.app_configs import VERTEXAI_DEFAULT_LOCATION
from onyx.configs.constants import MilestoneRecordType
from onyx.db.engine.sql_engine import get_session_with_shared_schema
from onyx.db.engine.sql_engine import get_session_with_tenant
from onyx.db.image_generation import create_default_image_gen_config_from_api_key
from onyx.db.llm import update_default_provider
from onyx.db.llm import upsert_cloud_embedding_provider
from onyx.db.llm import upsert_llm_provider
@@ -37,13 +40,24 @@ from onyx.db.models import AvailableTenant
from onyx.db.models import IndexModelStatus
from onyx.db.models import SearchSettings
from onyx.db.models import UserTenantMapping
from onyx.llm.constants import LlmProviderNames
from onyx.llm.llm_provider_options import get_anthropic_model_names
from onyx.llm.llm_provider_options import get_openai_model_names
from onyx.llm.well_known_providers.auto_update_models import LLMRecommendations
from onyx.llm.well_known_providers.constants import ANTHROPIC_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import OPENAI_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import OPENROUTER_PROVIDER_NAME
from onyx.llm.well_known_providers.constants import VERTEX_CREDENTIALS_FILE_KWARG
from onyx.llm.well_known_providers.constants import VERTEX_LOCATION_KWARG
from onyx.llm.well_known_providers.constants import VERTEXAI_PROVIDER_NAME
from onyx.llm.well_known_providers.llm_provider_options import (
get_recommendations,
)
from onyx.llm.well_known_providers.llm_provider_options import (
model_configurations_for_provider,
)
from onyx.server.manage.embedding.models import CloudEmbeddingProviderCreationRequest
from onyx.server.manage.llm.models import LLMProviderUpsertRequest
from onyx.server.manage.llm.models import ModelConfigurationUpsertRequest
from onyx.setup import setup_onyx
from onyx.utils.logger import setup_logger
from onyx.utils.telemetry import mt_cloud_telemetry
from shared_configs.configs import MULTI_TENANT
from shared_configs.configs import POSTGRES_DEFAULT_SCHEMA
@@ -52,7 +66,7 @@ from shared_configs.contextvars import CURRENT_TENANT_ID_CONTEXTVAR
from shared_configs.enums import EmbeddingProvider
logger = logging.getLogger(__name__)
logger = setup_logger()
async def get_or_provision_tenant(
@@ -261,59 +275,173 @@ async def rollback_tenant_provisioning(tenant_id: str) -> None:
logger.info(f"Tenant rollback completed successfully for tenant {tenant_id}")
def configure_default_api_keys(db_session: Session) -> None:
if ANTHROPIC_DEFAULT_API_KEY:
anthropic_provider = LLMProviderUpsertRequest(
name="Anthropic",
provider=LlmProviderNames.ANTHROPIC,
api_key=ANTHROPIC_DEFAULT_API_KEY,
default_model_name="claude-3-7-sonnet-20250219",
model_configurations=[
ModelConfigurationUpsertRequest(
name=name,
is_visible=False,
max_input_tokens=None,
)
for name in get_anthropic_model_names()
],
api_key_changed=True,
)
try:
full_provider = upsert_llm_provider(anthropic_provider, db_session)
update_default_provider(full_provider.id, db_session)
except Exception as e:
logger.error(f"Failed to configure Anthropic provider: {e}")
else:
logger.error(
"ANTHROPIC_DEFAULT_API_KEY not set, skipping Anthropic provider configuration"
def _build_model_configuration_upsert_requests(
provider_name: str,
recommendations: LLMRecommendations,
) -> list[ModelConfigurationUpsertRequest]:
model_configurations = model_configurations_for_provider(
provider_name, recommendations
)
return [
ModelConfigurationUpsertRequest(
name=model_configuration.name,
is_visible=model_configuration.is_visible,
max_input_tokens=model_configuration.max_input_tokens,
supports_image_input=model_configuration.supports_image_input,
)
for model_configuration in model_configurations
]
def configure_default_api_keys(db_session: Session) -> None:
"""Configure default LLM providers using recommended-models.json for model selection."""
# Load recommendations from JSON config
recommendations = get_recommendations()
has_set_default_provider = False
def _upsert(request: LLMProviderUpsertRequest) -> None:
nonlocal has_set_default_provider
try:
provider = upsert_llm_provider(request, db_session)
if not has_set_default_provider:
update_default_provider(provider.id, db_session)
has_set_default_provider = True
except Exception as e:
logger.error(f"Failed to configure {request.provider} provider: {e}")
# Configure OpenAI provider
if OPENAI_DEFAULT_API_KEY:
default_model = recommendations.get_default_model(OPENAI_PROVIDER_NAME)
if default_model is None:
logger.error(
f"No default model found for {OPENAI_PROVIDER_NAME} in recommendations"
)
default_model_name = default_model.name if default_model else "gpt-5.2"
openai_provider = LLMProviderUpsertRequest(
name="OpenAI",
provider=LlmProviderNames.OPENAI,
provider=OPENAI_PROVIDER_NAME,
api_key=OPENAI_DEFAULT_API_KEY,
default_model_name="gpt-4o",
model_configurations=[
ModelConfigurationUpsertRequest(
name=model_name,
is_visible=False,
max_input_tokens=None,
)
for model_name in get_openai_model_names()
],
default_model_name=default_model_name,
model_configurations=_build_model_configuration_upsert_requests(
OPENAI_PROVIDER_NAME, recommendations
),
api_key_changed=True,
is_auto_mode=True,
)
_upsert(openai_provider)
# Create default image generation config using the OpenAI API key
try:
full_provider = upsert_llm_provider(openai_provider, db_session)
update_default_provider(full_provider.id, db_session)
create_default_image_gen_config_from_api_key(
db_session, OPENAI_DEFAULT_API_KEY
)
except Exception as e:
logger.error(f"Failed to configure OpenAI provider: {e}")
logger.error(f"Failed to create default image gen config: {e}")
else:
logger.error(
logger.info(
"OPENAI_DEFAULT_API_KEY not set, skipping OpenAI provider configuration"
)
# Configure Anthropic provider
if ANTHROPIC_DEFAULT_API_KEY:
default_model = recommendations.get_default_model(ANTHROPIC_PROVIDER_NAME)
if default_model is None:
logger.error(
f"No default model found for {ANTHROPIC_PROVIDER_NAME} in recommendations"
)
default_model_name = (
default_model.name if default_model else "claude-sonnet-4-5"
)
anthropic_provider = LLMProviderUpsertRequest(
name="Anthropic",
provider=ANTHROPIC_PROVIDER_NAME,
api_key=ANTHROPIC_DEFAULT_API_KEY,
default_model_name=default_model_name,
model_configurations=_build_model_configuration_upsert_requests(
ANTHROPIC_PROVIDER_NAME, recommendations
),
api_key_changed=True,
is_auto_mode=True,
)
_upsert(anthropic_provider)
else:
logger.info(
"ANTHROPIC_DEFAULT_API_KEY not set, skipping Anthropic provider configuration"
)
# Configure Vertex AI provider
if VERTEXAI_DEFAULT_CREDENTIALS:
default_model = recommendations.get_default_model(VERTEXAI_PROVIDER_NAME)
if default_model is None:
logger.error(
f"No default model found for {VERTEXAI_PROVIDER_NAME} in recommendations"
)
default_model_name = default_model.name if default_model else "gemini-2.5-pro"
# Vertex AI uses custom_config for credentials and location
custom_config = {
VERTEX_CREDENTIALS_FILE_KWARG: VERTEXAI_DEFAULT_CREDENTIALS,
VERTEX_LOCATION_KWARG: VERTEXAI_DEFAULT_LOCATION,
}
vertexai_provider = LLMProviderUpsertRequest(
name="Google Vertex AI",
provider=VERTEXAI_PROVIDER_NAME,
custom_config=custom_config,
default_model_name=default_model_name,
model_configurations=_build_model_configuration_upsert_requests(
VERTEXAI_PROVIDER_NAME, recommendations
),
api_key_changed=True,
is_auto_mode=True,
)
_upsert(vertexai_provider)
else:
logger.info(
"VERTEXAI_DEFAULT_CREDENTIALS not set, skipping Vertex AI provider configuration"
)
# Configure OpenRouter provider
if OPENROUTER_DEFAULT_API_KEY:
default_model = recommendations.get_default_model(OPENROUTER_PROVIDER_NAME)
if default_model is None:
logger.error(
f"No default model found for {OPENROUTER_PROVIDER_NAME} in recommendations"
)
default_model_name = default_model.name if default_model else "z-ai/glm-4.7"
# For OpenRouter, we use the visible models from recommendations as model_configurations
# since OpenRouter models are dynamic (fetched from their API)
visible_models = recommendations.get_visible_models(OPENROUTER_PROVIDER_NAME)
model_configurations = [
ModelConfigurationUpsertRequest(
name=model.name,
is_visible=True,
max_input_tokens=None,
display_name=model.display_name,
)
for model in visible_models
]
openrouter_provider = LLMProviderUpsertRequest(
name="OpenRouter",
provider=OPENROUTER_PROVIDER_NAME,
api_key=OPENROUTER_DEFAULT_API_KEY,
default_model_name=default_model_name,
model_configurations=model_configurations,
api_key_changed=True,
is_auto_mode=True,
)
_upsert(openrouter_provider)
else:
logger.info(
"OPENROUTER_DEFAULT_API_KEY not set, skipping OpenRouter provider configuration"
)
# Configure Cohere embedding provider
if COHERE_DEFAULT_API_KEY:
cloud_embedding_provider = CloudEmbeddingProviderCreationRequest(
provider_type=EmbeddingProvider.COHERE,

View File

@@ -16,8 +16,9 @@ from onyx.db.token_limit import insert_user_token_rate_limit
from onyx.server.query_and_chat.token_limit import any_rate_limit_exists
from onyx.server.token_rate_limits.models import TokenRateLimitArgs
from onyx.server.token_rate_limits.models import TokenRateLimitDisplay
from onyx.server.utils import PUBLIC_API_TAGS
router = APIRouter(prefix="/admin/token-rate-limits")
router = APIRouter(prefix="/admin/token-rate-limits", tags=PUBLIC_API_TAGS)
"""

View File

@@ -0,0 +1,38 @@
"""EE Usage limits - trial detection via billing information."""
from ee.onyx.server.tenants.billing import fetch_billing_information
from ee.onyx.server.tenants.models import BillingInformation
from ee.onyx.server.tenants.models import SubscriptionStatusResponse
from onyx.utils.logger import setup_logger
from shared_configs.configs import MULTI_TENANT
logger = setup_logger()
def is_tenant_on_trial(tenant_id: str) -> bool:
"""
Determine if a tenant is currently on a trial subscription.
In multi-tenant mode, we fetch billing information from the control plane
to determine if the tenant has an active trial.
"""
if not MULTI_TENANT:
return False
try:
billing_info = fetch_billing_information(tenant_id)
# If not subscribed at all, check if we have trial information
if isinstance(billing_info, SubscriptionStatusResponse):
# No subscription means they're likely on trial (new tenant)
return True
if isinstance(billing_info, BillingInformation):
return billing_info.status == "trialing"
return False
except Exception as e:
logger.warning(f"Failed to fetch billing info for trial check: {e}")
# Default to trial limits on error (more restrictive = safer)
return True
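
A short illustrative sketch (not part of the diff) of how a caller might branch on trial status. TRIAL_LIMITS and PAID_LIMITS are hypothetical placeholders, not names from this codebase.

# Illustrative sketch, not part of the diff.
def limits_for_tenant(tenant_id: str):
    # On control-plane errors is_tenant_on_trial() returns True, so callers
    # fall back to the more restrictive trial limits.
    return TRIAL_LIMITS if is_tenant_on_trial(tenant_id) else PAID_LIMITS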

View File

@@ -21,11 +21,12 @@ from onyx.auth.users import current_curator_or_admin_user
from onyx.db.engine.sql_engine import get_session
from onyx.db.models import User
from onyx.db.models import UserRole
from onyx.server.utils import PUBLIC_API_TAGS
from onyx.utils.logger import setup_logger
logger = setup_logger()
router = APIRouter(prefix="/manage")
router = APIRouter(prefix="/manage", tags=PUBLIC_API_TAGS)
@router.get("/admin/user-group")

View File

@@ -105,6 +105,8 @@ class DocExternalAccess:
)
# TODO(andrei): First refactor this into a pydantic model, then get rid of
# duplicate fields.
@dataclass(frozen=True, init=False)
class DocumentAccess(ExternalAccess):
# User emails for Onyx users, None indicates admin

View File

@@ -0,0 +1,107 @@
"""Captcha verification for user registration."""
import httpx
from pydantic import BaseModel
from pydantic import Field
from onyx.configs.app_configs import CAPTCHA_ENABLED
from onyx.configs.app_configs import RECAPTCHA_SCORE_THRESHOLD
from onyx.configs.app_configs import RECAPTCHA_SECRET_KEY
from onyx.utils.logger import setup_logger
logger = setup_logger()
RECAPTCHA_VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"
class CaptchaVerificationError(Exception):
"""Raised when captcha verification fails."""
class RecaptchaResponse(BaseModel):
"""Response from Google reCAPTCHA verification API."""
success: bool
score: float | None = None # Only present for reCAPTCHA v3
action: str | None = None
challenge_ts: str | None = None
hostname: str | None = None
error_codes: list[str] | None = Field(default=None, alias="error-codes")
def is_captcha_enabled() -> bool:
"""Check if captcha verification is enabled."""
return CAPTCHA_ENABLED and bool(RECAPTCHA_SECRET_KEY)
async def verify_captcha_token(
token: str,
expected_action: str = "signup",
) -> None:
"""
Verify a reCAPTCHA token with Google's API.
Args:
token: The reCAPTCHA response token from the client
expected_action: Expected action name for v3 verification
Raises:
CaptchaVerificationError: If verification fails
"""
if not is_captcha_enabled():
return
if not token:
raise CaptchaVerificationError("Captcha token is required")
try:
async with httpx.AsyncClient() as client:
response = await client.post(
RECAPTCHA_VERIFY_URL,
data={
"secret": RECAPTCHA_SECRET_KEY,
"response": token,
},
timeout=10.0,
)
response.raise_for_status()
data = response.json()
result = RecaptchaResponse(**data)
if not result.success:
error_codes = result.error_codes or ["unknown-error"]
logger.warning(f"Captcha verification failed: {error_codes}")
raise CaptchaVerificationError(
f"Captcha verification failed: {', '.join(error_codes)}"
)
# For reCAPTCHA v3, also check the score
if result.score is not None:
if result.score < RECAPTCHA_SCORE_THRESHOLD:
logger.warning(
f"Captcha score too low: {result.score} < {RECAPTCHA_SCORE_THRESHOLD}"
)
raise CaptchaVerificationError(
"Captcha verification failed: suspicious activity detected"
)
# Optionally verify the action matches
if result.action and result.action != expected_action:
logger.warning(
f"Captcha action mismatch: {result.action} != {expected_action}"
)
raise CaptchaVerificationError(
"Captcha verification failed: action mismatch"
)
logger.debug(
f"Captcha verification passed: score={result.score}, "
f"action={result.action}"
)
except httpx.HTTPError as e:
logger.error(f"Captcha API request failed: {e}")
# In case of API errors, we might want to allow registration
# to prevent blocking legitimate users. This is a policy decision.
raise CaptchaVerificationError("Captcha verification service unavailable")

View File

@@ -0,0 +1,192 @@
"""
Utility to validate and block disposable/temporary email addresses.
This module fetches a list of known disposable email domains from a remote source
and caches them for performance. It's used during user registration to prevent
abuse from temporary email services.
"""
import threading
import time
from typing import Set
import httpx
from onyx.configs.app_configs import DISPOSABLE_EMAIL_DOMAINS_URL
from onyx.utils.logger import setup_logger
logger = setup_logger()
class DisposableEmailValidator:
"""
Thread-safe singleton validator for disposable email domains.
Fetches and caches the list of disposable domains, with periodic refresh.
"""
_instance: "DisposableEmailValidator | None" = None
_lock = threading.Lock()
def __new__(cls) -> "DisposableEmailValidator":
if cls._instance is None:
with cls._lock:
if cls._instance is None:
cls._instance = super().__new__(cls)
return cls._instance
def __init__(self) -> None:
# Check if already initialized using a try/except to avoid type issues
try:
if self._initialized:
return
except AttributeError:
pass
self._domains: Set[str] = set()
self._last_fetch_time: float = 0
self._fetch_lock = threading.Lock()
# Cache for 1 hour
self._cache_duration = 3600
# Hardcoded fallback list of common disposable domains
# This ensures we block at least these even if the remote fetch fails
self._fallback_domains = {
"trashlify.com",
"10minutemail.com",
"guerrillamail.com",
"mailinator.com",
"tempmail.com",
"throwaway.email",
"yopmail.com",
"temp-mail.org",
"getnada.com",
"maildrop.cc",
}
# Set initialized flag last to prevent race conditions
self._initialized: bool = True
def _should_refresh(self) -> bool:
"""Check if the cached domains should be refreshed."""
return (time.time() - self._last_fetch_time) > self._cache_duration
def _fetch_domains(self) -> Set[str]:
"""
Fetch disposable email domains from the configured URL.
Returns:
Set of domain strings (lowercased)
"""
if not DISPOSABLE_EMAIL_DOMAINS_URL:
logger.debug("DISPOSABLE_EMAIL_DOMAINS_URL not configured")
return self._fallback_domains.copy()
try:
logger.info(
f"Fetching disposable email domains from {DISPOSABLE_EMAIL_DOMAINS_URL}"
)
with httpx.Client(timeout=10.0) as client:
response = client.get(DISPOSABLE_EMAIL_DOMAINS_URL)
response.raise_for_status()
domains_list = response.json()
if not isinstance(domains_list, list):
logger.error(
f"Expected list from disposable domains URL, got {type(domains_list)}"
)
return self._fallback_domains.copy()
# Convert all to lowercase and create set
domains = {domain.lower().strip() for domain in domains_list if domain}
# Always include fallback domains
domains.update(self._fallback_domains)
logger.info(
f"Successfully fetched {len(domains)} disposable email domains"
)
return domains
except httpx.HTTPError as e:
logger.warning(f"Failed to fetch disposable domains (HTTP error): {e}")
except Exception as e:
logger.warning(f"Failed to fetch disposable domains: {e}")
# On error, return fallback domains
return self._fallback_domains.copy()
def get_domains(self) -> Set[str]:
"""
Get the cached set of disposable email domains.
Refreshes the cache if needed.
Returns:
Set of disposable domain strings (lowercased)
"""
# Fast path: return cached domains if still fresh
if self._domains and not self._should_refresh():
return self._domains.copy()
# Slow path: need to refresh
with self._fetch_lock:
# Double-check after acquiring lock
if self._domains and not self._should_refresh():
return self._domains.copy()
self._domains = self._fetch_domains()
self._last_fetch_time = time.time()
return self._domains.copy()
def is_disposable(self, email: str) -> bool:
"""
Check if an email address uses a disposable domain.
Args:
email: The email address to check
Returns:
True if the email domain is disposable, False otherwise
"""
if not email or "@" not in email:
return False
parts = email.split("@")
if len(parts) != 2 or not parts[0]: # Must have user@domain with non-empty user
return False
domain = parts[1].lower().strip()
if not domain: # Domain part must not be empty
return False
disposable_domains = self.get_domains()
return domain in disposable_domains
# Global singleton instance
_validator = DisposableEmailValidator()
def is_disposable_email(email: str) -> bool:
"""
Check if an email address uses a disposable/temporary domain.
This is a convenience function that uses the global validator instance.
Args:
email: The email address to check
Returns:
True if the email uses a disposable domain, False otherwise
"""
return _validator.is_disposable(email)
def refresh_disposable_domains() -> None:
"""
Force a refresh of the disposable domains list.
This can be called manually if you want to update the list
without waiting for the cache to expire.
"""
_validator._last_fetch_time = 0
_validator.get_domains()
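
A quick illustrative usage sketch (not part of the diff): mailinator.com is in the hardcoded fallback set, so it is rejected even when DISPOSABLE_EMAIL_DOMAINS_URL is unset; the second address is simply assumed not to appear on any fetched list.

# Illustrative sketch, not part of the diff; uses only the helpers defined above.
assert is_disposable_email("someone@mailinator.com")         # in the fallback set
assert not is_disposable_email("someone@acme-corp.example")  # assumed non-disposable
assert not is_disposable_email("not-an-email")               # malformed input is never blocked

# Force a refresh of the domain list without waiting for the 1-hour cache to expire.
refresh_disposable_domains()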

View File

@@ -40,6 +40,8 @@ class UserRead(schemas.BaseUser[uuid.UUID]):
class UserCreate(schemas.BaseUserCreate):
role: UserRole = UserRole.BASIC
tenant_id: str | None = None
# Captcha token for cloud signup protection (optional, only used when captcha is enabled)
captcha_token: str | None = None
class UserUpdateWithRole(schemas.BaseUserUpdate):

View File

@@ -60,6 +60,7 @@ from sqlalchemy.exc import IntegrityError
from sqlalchemy.ext.asyncio import AsyncSession
from onyx.auth.api_key import get_hashed_api_key_from_request
from onyx.auth.disposable_email_validator import is_disposable_email
from onyx.auth.email_utils import send_forgot_password_email
from onyx.auth.email_utils import send_user_verification_email
from onyx.auth.invited_users import get_invited_users
@@ -248,13 +249,23 @@ def verify_email_in_whitelist(email: str, tenant_id: str) -> None:
def verify_email_domain(email: str) -> None:
if email.count("@") != 1:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Email is not valid",
)
domain = email.split("@")[-1].lower()
# Check if email uses a disposable/temporary domain
if is_disposable_email(email):
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Disposable email addresses are not allowed. Please use a permanent email address.",
)
# Check domain whitelist if configured
if VALID_EMAIL_DOMAINS:
if email.count("@") != 1:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Email is not valid",
)
domain = email.split("@")[-1].lower()
if domain not in VALID_EMAIL_DOMAINS:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
@@ -292,11 +303,57 @@ class UserManager(UUIDIDMixin, BaseUserManager[User, uuid.UUID]):
safe: bool = False,
request: Optional[Request] = None,
) -> User:
# Verify captcha if enabled (for cloud signup protection)
from onyx.auth.captcha import CaptchaVerificationError
from onyx.auth.captcha import is_captcha_enabled
from onyx.auth.captcha import verify_captcha_token
if is_captcha_enabled() and request is not None:
# Get captcha token from request body or headers
captcha_token = None
if hasattr(user_create, "captcha_token"):
captcha_token = getattr(user_create, "captcha_token", None)
# Also check headers as a fallback
if not captcha_token:
captcha_token = request.headers.get("X-Captcha-Token")
try:
await verify_captcha_token(
captcha_token or "", expected_action="signup"
)
except CaptchaVerificationError as e:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail={"reason": str(e)},
)
# We verify the password here to make sure it's valid before we proceed
await self.validate_password(
user_create.password, cast(schemas.UC, user_create)
)
# Check for disposable emails BEFORE provisioning tenant
# This prevents creating tenants for throwaway email addresses
try:
verify_email_domain(user_create.email)
except HTTPException as e:
# Log blocked disposable email attempts
if (
e.status_code == status.HTTP_400_BAD_REQUEST
and "Disposable email" in str(e.detail)
):
domain = (
user_create.email.split("@")[-1]
if "@" in user_create.email
else "unknown"
)
logger.warning(
f"Blocked disposable email registration attempt: {domain}",
extra={"email_domain": domain},
)
raise
user_count: int | None = None
referral_source = (
request.cookies.get("referral_source", None)
@@ -318,8 +375,17 @@ class UserManager(UUIDIDMixin, BaseUserManager[User, uuid.UUID]):
token = CURRENT_TENANT_ID_CONTEXTVAR.set(tenant_id)
try:
async with get_async_session_context_manager(tenant_id) as db_session:
verify_email_is_invited(user_create.email)
verify_email_domain(user_create.email)
# Check invite list based on deployment mode
if MULTI_TENANT:
# Multi-tenant: Only require invite for existing tenants
# New tenant creation (first user) doesn't require an invite
user_count = await get_user_count()
if user_count > 0:
# Tenant already has users - require invite for new users
verify_email_is_invited(user_create.email)
else:
# Single-tenant: Check invite list (skips if SAML/OIDC or no list configured)
verify_email_is_invited(user_create.email)
if MULTI_TENANT:
tenant_user_db = SQLAlchemyUserAdminDB[User, uuid.UUID](
db_session, User, OAuthAccount

View File

@@ -26,6 +26,7 @@ from onyx.background.celery.celery_utils import celery_is_worker_primary
from onyx.background.celery.celery_utils import make_probe_path
from onyx.background.celery.tasks.vespa.document_sync import DOCUMENT_SYNC_PREFIX
from onyx.background.celery.tasks.vespa.document_sync import DOCUMENT_SYNC_TASKSET_KEY
from onyx.configs.app_configs import ENABLE_OPENSEARCH_FOR_ONYX
from onyx.configs.constants import ONYX_CLOUD_CELERY_TASK_PREFIX
from onyx.configs.constants import OnyxRedisLocks
from onyx.db.engine.sql_engine import get_sqlalchemy_engine
@@ -515,6 +516,9 @@ def wait_for_vespa_or_shutdown(sender: Any, **kwargs: Any) -> None:
"""Waits for Vespa to become ready subject to a timeout.
Raises WorkerShutdown if the timeout is reached."""
if ENABLE_OPENSEARCH_FOR_ONYX:
return
if not wait_for_vespa_with_timeout():
msg = "Vespa: Readiness probe did not succeed within the timeout. Exiting..."
logger.error(msg)

View File

@@ -124,6 +124,7 @@ celery_app.autodiscover_tasks(
"onyx.background.celery.tasks.kg_processing",
"onyx.background.celery.tasks.monitoring",
"onyx.background.celery.tasks.user_file_processing",
"onyx.background.celery.tasks.llm_model_update",
# Light worker tasks
"onyx.background.celery.tasks.shared",
"onyx.background.celery.tasks.vespa",

View File

@@ -98,8 +98,5 @@ for bootstep in base_bootsteps:
celery_app.autodiscover_tasks(
[
"onyx.background.celery.tasks.docfetching",
# Ensure the user files indexing worker registers the doc_id migration task
# TODO(subash): remove this once the doc_id migration is complete
"onyx.background.celery.tasks.user_file_processing",
]
)

View File

@@ -2,8 +2,12 @@ import copy
from datetime import timedelta
from typing import Any
from celery.schedules import crontab
from onyx.configs.app_configs import AUTO_LLM_CONFIG_URL
from onyx.configs.app_configs import AUTO_LLM_UPDATE_INTERVAL_SECONDS
from onyx.configs.app_configs import ENTERPRISE_EDITION_ENABLED
from onyx.configs.app_configs import LLM_MODEL_UPDATE_API_URL
from onyx.configs.app_configs import SCHEDULED_EVAL_DATASET_NAMES
from onyx.configs.constants import ONYX_CLOUD_CELERY_TASK_PREFIX
from onyx.configs.constants import OnyxCeleryPriority
from onyx.configs.constants import OnyxCeleryQueues
@@ -53,16 +57,6 @@ beat_task_templates: list[dict] = [
"expires": BEAT_EXPIRES_DEFAULT,
},
},
{
"name": "user-file-docid-migration",
"task": OnyxCeleryTask.USER_FILE_DOCID_MIGRATION,
"schedule": timedelta(minutes=10),
"options": {
"priority": OnyxCeleryPriority.HIGH,
"expires": BEAT_EXPIRES_DEFAULT,
"queue": OnyxCeleryQueues.USER_FILES_INDEXING,
},
},
{
"name": "check-for-kg-processing",
"task": OnyxCeleryTask.CHECK_KG_PROCESSING,
@@ -171,13 +165,32 @@ if ENTERPRISE_EDITION_ENABLED:
]
)
# Only add the LLM model update task if the API URL is configured
if LLM_MODEL_UPDATE_API_URL:
# Add the Auto LLM update task if the config URL is set (has a default)
if AUTO_LLM_CONFIG_URL:
beat_task_templates.append(
{
"name": "check-for-llm-model-update",
"task": OnyxCeleryTask.CHECK_FOR_LLM_MODEL_UPDATE,
"schedule": timedelta(hours=1), # Check every hour
"name": "check-for-auto-llm-update",
"task": OnyxCeleryTask.CHECK_FOR_AUTO_LLM_UPDATE,
"schedule": timedelta(seconds=AUTO_LLM_UPDATE_INTERVAL_SECONDS),
"options": {
"priority": OnyxCeleryPriority.LOW,
"expires": BEAT_EXPIRES_DEFAULT,
},
}
)
# Add scheduled eval task if datasets are configured
if SCHEDULED_EVAL_DATASET_NAMES:
beat_task_templates.append(
{
"name": "scheduled-eval-pipeline",
"task": OnyxCeleryTask.SCHEDULED_EVAL_TASK,
# run every Sunday at midnight UTC
"schedule": crontab(
hour=0,
minute=0,
day_of_week=0,
),
"options": {
"priority": OnyxCeleryPriority.LOW,
"expires": BEAT_EXPIRES_DEFAULT,

View File

@@ -72,15 +72,6 @@ def try_creating_docfetching_task(
# Another indexing attempt is already running
return None
# Determine which queue to use based on whether this is a user file
# TODO: at the moment the indexing pipeline is
# shared between user files and connectors
queue = (
OnyxCeleryQueues.USER_FILES_INDEXING
if cc_pair.is_user_file
else OnyxCeleryQueues.CONNECTOR_DOC_FETCHING
)
# Use higher priority for first-time indexing to ensure new connectors
# get processed before re-indexing of existing connectors
has_successful_attempt = cc_pair.last_successful_index_time is not None
@@ -99,7 +90,7 @@ def try_creating_docfetching_task(
search_settings_id=search_settings.id,
tenant_id=tenant_id,
),
queue=queue,
queue=OnyxCeleryQueues.CONNECTOR_DOC_FETCHING,
task_id=custom_task_id,
priority=priority,
)

View File

@@ -12,6 +12,7 @@ from celery import Celery
from celery import shared_task
from celery import Task
from celery.exceptions import SoftTimeLimitExceeded
from fastapi import HTTPException
from pydantic import BaseModel
from redis import Redis
from redis.lock import Lock as RedisLock
@@ -40,9 +41,11 @@ from onyx.background.indexing.checkpointing_utils import (
)
from onyx.background.indexing.index_attempt_utils import cleanup_index_attempts
from onyx.background.indexing.index_attempt_utils import get_old_index_attempts
from onyx.configs.app_configs import AUTH_TYPE
from onyx.configs.app_configs import MANAGED_VESPA
from onyx.configs.app_configs import VESPA_CLOUD_CERT_PATH
from onyx.configs.app_configs import VESPA_CLOUD_KEY_PATH
from onyx.configs.constants import AuthType
from onyx.configs.constants import CELERY_GENERIC_BEAT_LOCK_TIMEOUT
from onyx.configs.constants import CELERY_INDEXING_LOCK_TIMEOUT
from onyx.configs.constants import MilestoneRecordType
@@ -59,11 +62,9 @@ from onyx.db.connector import mark_ccpair_with_indexing_trigger
from onyx.db.connector_credential_pair import (
fetch_indexable_standard_connector_credential_pair_ids,
)
from onyx.db.connector_credential_pair import (
fetch_indexable_user_file_connector_credential_pair_ids,
)
from onyx.db.connector_credential_pair import get_connector_credential_pair_from_id
from onyx.db.connector_credential_pair import set_cc_pair_repeated_error_state
from onyx.db.connector_credential_pair import update_connector_credential_pair_from_id
from onyx.db.engine.sql_engine import get_session_with_current_tenant
from onyx.db.engine.time_utils import get_db_current_time
from onyx.db.enums import ConnectorCredentialPairStatus
@@ -112,6 +113,7 @@ from onyx.utils.telemetry import RecordType
from shared_configs.configs import INDEXING_MODEL_SERVER_HOST
from shared_configs.configs import INDEXING_MODEL_SERVER_PORT
from shared_configs.configs import MULTI_TENANT
from shared_configs.configs import USAGE_LIMITS_ENABLED
from shared_configs.contextvars import CURRENT_TENANT_ID_CONTEXTVAR
from shared_configs.contextvars import INDEX_ATTEMPT_INFO_CONTEXTVAR
@@ -538,12 +540,7 @@ def check_indexing_completion(
]:
# User file connectors must be paused on success
# NOTE: _run_indexing doesn't update connectors if the index attempt is the future embedding model
# TODO: figure out why this doesn't pause connectors during swap
cc_pair.status = (
ConnectorCredentialPairStatus.PAUSED
if cc_pair.is_user_file
else ConnectorCredentialPairStatus.ACTIVE
)
cc_pair.status = ConnectorCredentialPairStatus.ACTIVE
db_session.commit()
mt_cloud_telemetry(
@@ -809,13 +806,8 @@ def check_for_indexing(self: Task, *, tenant_id: str) -> int | None:
db_session, active_cc_pairs_only=True
)
)
user_file_cc_pair_ids = (
fetch_indexable_user_file_connector_credential_pair_ids(
db_session, search_settings_id=current_search_settings.id
)
)
primary_cc_pair_ids = standard_cc_pair_ids + user_file_cc_pair_ids
primary_cc_pair_ids = standard_cc_pair_ids
# Get CC pairs for secondary search settings
secondary_cc_pair_ids: list[int] = []
@@ -831,30 +823,47 @@ def check_for_indexing(self: Task, *, tenant_id: str) -> int | None:
db_session, active_cc_pairs_only=not include_paused
)
)
user_file_cc_pair_ids = (
fetch_indexable_user_file_connector_credential_pair_ids(
db_session, search_settings_id=secondary_search_settings.id
)
or []
)
secondary_cc_pair_ids = standard_cc_pair_ids + user_file_cc_pair_ids
secondary_cc_pair_ids = standard_cc_pair_ids
# Flag CC pairs in repeated error state for primary/current search settings
with get_session_with_current_tenant() as db_session:
for cc_pair_id in primary_cc_pair_ids:
lock_beat.reacquire()
if is_in_repeated_error_state(
cc_pair_id=cc_pair_id,
search_settings_id=current_search_settings.id,
cc_pair = get_connector_credential_pair_from_id(
db_session=db_session,
cc_pair_id=cc_pair_id,
)
# if already in repeated error state, don't do anything
# this is important so that we don't keep pausing the connector
# immediately upon a user un-pausing it to manually re-trigger and
# recover.
if (
cc_pair
and not cc_pair.in_repeated_error_state
and is_in_repeated_error_state(
cc_pair=cc_pair,
search_settings_id=current_search_settings.id,
db_session=db_session,
)
):
set_cc_pair_repeated_error_state(
db_session=db_session,
cc_pair_id=cc_pair_id,
in_repeated_error_state=True,
)
# When entering repeated error state, also pause the connector
# to prevent continued indexing retry attempts burning through embedding credits.
# NOTE: only for Cloud, since most self-hosted users use self-hosted embedding
# models. Also, they are more prone to repeated failures -> eventual success.
if AUTH_TYPE == AuthType.CLOUD:
update_connector_credential_pair_from_id(
db_session=db_session,
cc_pair_id=cc_pair.id,
status=ConnectorCredentialPairStatus.PAUSED,
)
# NOTE: At this point, we haven't done heavy checks on whether or not the CC pairs should actually be indexed
# Heavy check, should_index(), is called in _kickoff_indexing_tasks
@@ -1279,6 +1288,26 @@ def docprocessing_task(
INDEX_ATTEMPT_INFO_CONTEXTVAR.reset(token)
def _check_chunk_usage_limit(tenant_id: str) -> None:
"""Check if chunk indexing usage limit has been exceeded.
Raises UsageLimitExceededError if the limit is exceeded.
"""
if not USAGE_LIMITS_ENABLED:
return
from onyx.db.usage import UsageType
from onyx.server.usage_limits import check_usage_and_raise
with get_session_with_current_tenant() as db_session:
check_usage_and_raise(
db_session=db_session,
usage_type=UsageType.CHUNKS_INDEXED,
tenant_id=tenant_id,
pending_amount=0, # Just check current usage
)
def _docprocessing_task(
index_attempt_id: int,
cc_pair_id: int,
@@ -1290,6 +1319,25 @@ def _docprocessing_task(
if tenant_id:
CURRENT_TENANT_ID_CONTEXTVAR.set(tenant_id)
# Check if chunk indexing usage limit has been exceeded before processing
if USAGE_LIMITS_ENABLED:
try:
_check_chunk_usage_limit(tenant_id)
except HTTPException as e:
# Log the error and fail the indexing attempt
task_logger.error(
f"Chunk indexing usage limit exceeded for tenant {tenant_id}: {e}"
)
with get_session_with_current_tenant() as db_session:
from onyx.db.index_attempt import mark_attempt_failed
mark_attempt_failed(
index_attempt_id=index_attempt_id,
db_session=db_session,
failure_reason=str(e),
)
raise
task_logger.info(
f"Processing document batch: "
f"attempt={index_attempt_id} "
@@ -1434,6 +1482,23 @@ def _docprocessing_task(
adapter=adapter,
)
# Track chunk indexing usage for cloud usage limits
if USAGE_LIMITS_ENABLED and index_pipeline_result.total_chunks > 0:
try:
from onyx.db.usage import increment_usage
from onyx.db.usage import UsageType
with get_session_with_current_tenant() as usage_db_session:
increment_usage(
db_session=usage_db_session,
usage_type=UsageType.CHUNKS_INDEXED,
amount=index_pipeline_result.total_chunks,
)
usage_db_session.commit()
except Exception as e:
# Log but don't fail indexing if usage tracking fails
task_logger.warning(f"Failed to track chunk indexing usage: {e}")
# Update batch completion and document counts atomically using database coordination
with get_session_with_current_tenant() as db_session, cross_batch_db_lock:

View File

@@ -10,7 +10,6 @@ from sqlalchemy.orm import Session
from onyx.configs.app_configs import DISABLE_INDEX_UPDATE_ON_SWAP
from onyx.configs.constants import CELERY_GENERIC_BEAT_LOCK_TIMEOUT
from onyx.configs.constants import DocumentSource
from onyx.db.connector_credential_pair import get_connector_credential_pair_from_id
from onyx.db.engine.time_utils import get_db_current_time
from onyx.db.enums import ConnectorCredentialPairStatus
from onyx.db.enums import IndexingStatus
@@ -126,18 +125,9 @@ class IndexingCallback(IndexingHeartbeatInterface):
def is_in_repeated_error_state(
cc_pair_id: int, search_settings_id: int, db_session: Session
cc_pair: ConnectorCredentialPair, search_settings_id: int, db_session: Session
) -> bool:
"""Checks if the cc pair / search setting combination is in a repeated error state."""
cc_pair = get_connector_credential_pair_from_id(
db_session=db_session,
cc_pair_id=cc_pair_id,
)
if not cc_pair:
raise RuntimeError(
f"is_in_repeated_error_state - could not find cc_pair with id={cc_pair_id}"
)
# if the connector doesn't have a refresh_freq, a single failed attempt is enough
number_of_failed_attempts_in_a_row_needed = (
NUM_REPEAT_ERRORS_BEFORE_REPEATED_ERROR_STATE
@@ -146,7 +136,7 @@ def is_in_repeated_error_state(
)
most_recent_index_attempts = get_recent_attempts_for_cc_pair(
cc_pair_id=cc_pair_id,
cc_pair_id=cc_pair.id,
search_settings_id=search_settings_id,
limit=number_of_failed_attempts_in_a_row_needed,
db_session=db_session,
@@ -180,7 +170,7 @@ def should_index(
db_session=db_session,
)
all_recent_errored = is_in_repeated_error_state(
cc_pair_id=cc_pair.id,
cc_pair=cc_pair,
search_settings_id=search_settings_instance.id,
db_session=db_session,
)

View File

@@ -1,9 +1,15 @@
from datetime import datetime
from datetime import timezone
from typing import Any
from celery import shared_task
from celery import Task
from onyx.configs.app_configs import BRAINTRUST_API_KEY
from onyx.configs.app_configs import JOB_TIMEOUT
from onyx.configs.app_configs import SCHEDULED_EVAL_DATASET_NAMES
from onyx.configs.app_configs import SCHEDULED_EVAL_PERMISSIONS_EMAIL
from onyx.configs.app_configs import SCHEDULED_EVAL_PROJECT
from onyx.configs.constants import OnyxCeleryTask
from onyx.evals.eval import run_eval
from onyx.evals.models import EvalConfigurationOptions
@@ -33,3 +39,109 @@ def eval_run_task(
except Exception:
logger.error("Failed to run eval task")
raise
@shared_task(
name=OnyxCeleryTask.SCHEDULED_EVAL_TASK,
ignore_result=True,
soft_time_limit=JOB_TIMEOUT * 5, # Allow more time for multiple datasets
bind=True,
trail=False,
)
def scheduled_eval_task(self: Task, **kwargs: Any) -> None:
"""
Scheduled task to run evaluations on configured datasets.
Runs weekly on Sunday at midnight UTC.
Configure via environment variables (with defaults):
- SCHEDULED_EVAL_DATASET_NAMES: Comma-separated list of Braintrust dataset names
- SCHEDULED_EVAL_PERMISSIONS_EMAIL: Email for search permissions (default: roshan@onyx.app)
- SCHEDULED_EVAL_PROJECT: Braintrust project name
"""
if not BRAINTRUST_API_KEY:
logger.error("BRAINTRUST_API_KEY is not configured, cannot run scheduled evals")
return
if not SCHEDULED_EVAL_PROJECT:
logger.error(
"SCHEDULED_EVAL_PROJECT is not configured, cannot run scheduled evals"
)
return
if not SCHEDULED_EVAL_DATASET_NAMES:
logger.info("No scheduled eval datasets configured, skipping")
return
if not SCHEDULED_EVAL_PERMISSIONS_EMAIL:
logger.error("SCHEDULED_EVAL_PERMISSIONS_EMAIL not configured")
return
project_name = SCHEDULED_EVAL_PROJECT
dataset_names = SCHEDULED_EVAL_DATASET_NAMES
permissions_email = SCHEDULED_EVAL_PERMISSIONS_EMAIL
# Create a timestamp for the scheduled run
run_timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%d")
logger.info(
f"Starting scheduled eval pipeline for project '{project_name}' "
f"with {len(dataset_names)} dataset(s): {dataset_names}"
)
pipeline_start = datetime.now(timezone.utc)
results: list[dict[str, Any]] = []
for dataset_name in dataset_names:
start_time = datetime.now(timezone.utc)
error_message: str | None = None
success = False
# Create informative experiment name for scheduled runs
experiment_name = f"{dataset_name} - {run_timestamp}"
try:
logger.info(
f"Running scheduled eval for dataset: {dataset_name} "
f"(project: {project_name})"
)
configuration = EvalConfigurationOptions(
search_permissions_email=permissions_email,
dataset_name=dataset_name,
no_send_logs=False,
braintrust_project=project_name,
experiment_name=experiment_name,
)
result = run_eval(
configuration=configuration,
remote_dataset_name=dataset_name,
)
success = result.success
logger.info(f"Completed eval for {dataset_name}: success={success}")
except Exception as e:
logger.exception(f"Failed to run scheduled eval for {dataset_name}")
error_message = str(e)
success = False
end_time = datetime.now(timezone.utc)
results.append(
{
"dataset_name": dataset_name,
"success": success,
"start_time": start_time,
"end_time": end_time,
"error_message": error_message,
}
)
pipeline_end = datetime.now(timezone.utc)
total_duration = (pipeline_end - pipeline_start).total_seconds()
passed_count = sum(1 for r in results if r["success"])
logger.info(
f"Scheduled eval pipeline completed: {passed_count}/{len(results)} passed "
f"in {total_duration:.1f}s"
)

View File

@@ -1,135 +1,45 @@
from typing import Any
import requests
from celery import shared_task
from celery import Task
from onyx.background.celery.apps.app_base import task_logger
from onyx.configs.app_configs import JOB_TIMEOUT
from onyx.configs.app_configs import LLM_MODEL_UPDATE_API_URL
from onyx.configs.app_configs import AUTO_LLM_CONFIG_URL
from onyx.configs.constants import OnyxCeleryTask
from onyx.db.engine.sql_engine import get_session_with_current_tenant
from onyx.db.models import LLMProvider
from onyx.db.models import ModelConfiguration
def _process_model_list_response(model_list_json: Any) -> list[str]:
# Handle case where response is wrapped in a "data" field
if isinstance(model_list_json, dict):
if "data" in model_list_json:
model_list_json = model_list_json["data"]
elif "models" in model_list_json:
model_list_json = model_list_json["models"]
else:
raise ValueError(
"Invalid response from API - expected dict with 'data' or "
f"'models' field, got {type(model_list_json)}"
)
if not isinstance(model_list_json, list):
raise ValueError(
f"Invalid response from API - expected list, got {type(model_list_json)}"
)
# Handle both string list and object list cases
model_names: list[str] = []
for item in model_list_json:
if isinstance(item, str):
model_names.append(item)
elif isinstance(item, dict):
if "model_name" in item:
model_names.append(item["model_name"])
elif "id" in item:
model_names.append(item["id"])
else:
raise ValueError(
f"Invalid item in model list - expected dict with model_name or id, got {type(item)}"
)
else:
raise ValueError(
f"Invalid item in model list - expected string or dict, got {type(item)}"
)
return model_names
from onyx.llm.well_known_providers.auto_update_service import (
sync_llm_models_from_github,
)
@shared_task(
name=OnyxCeleryTask.CHECK_FOR_LLM_MODEL_UPDATE,
name=OnyxCeleryTask.CHECK_FOR_AUTO_LLM_UPDATE,
ignore_result=True,
soft_time_limit=JOB_TIMEOUT,
soft_time_limit=300, # 5 minute timeout
trail=False,
bind=True,
)
def check_for_llm_model_update(self: Task, *, tenant_id: str) -> bool | None:
if not LLM_MODEL_UPDATE_API_URL:
raise ValueError("LLM model update API URL not configured")
def check_for_auto_llm_updates(self: Task, *, tenant_id: str) -> bool | None:
"""Periodic task to fetch LLM model updates from GitHub
and sync them to providers in Auto mode.
# First fetch the models from the API
try:
response = requests.get(LLM_MODEL_UPDATE_API_URL)
response.raise_for_status()
available_models = _process_model_list_response(response.json())
task_logger.info(f"Found available models: {available_models}")
except Exception:
task_logger.exception("Failed to fetch models from API.")
This task checks the GitHub-hosted config file and updates all
providers that have is_auto_mode=True.
"""
if not AUTO_LLM_CONFIG_URL:
task_logger.debug("AUTO_LLM_CONFIG_URL not configured, skipping")
return None
# Then update the database with the fetched models
with get_session_with_current_tenant() as db_session:
# Get the default LLM provider
default_provider = (
db_session.query(LLMProvider)
.filter(LLMProvider.is_default_provider.is_(True))
.first()
)
try:
# Sync to database
with get_session_with_current_tenant() as db_session:
results = sync_llm_models_from_github(db_session)
if not default_provider:
task_logger.warning("No default LLM provider found")
return None
if results:
task_logger.info(f"Auto mode sync results: {results}")
else:
task_logger.debug("No model updates applied")
# log change if any
old_models = set(
model_configuration.name
for model_configuration in default_provider.model_configurations
)
new_models = set(available_models)
added_models = new_models - old_models
removed_models = old_models - new_models
if added_models:
task_logger.info(f"Adding models: {sorted(added_models)}")
if removed_models:
task_logger.info(f"Removing models: {sorted(removed_models)}")
# Update the provider's model list
# Remove models that are no longer available
db_session.query(ModelConfiguration).filter(
ModelConfiguration.llm_provider_id == default_provider.id,
ModelConfiguration.name.notin_(available_models),
).delete(synchronize_session=False)
# Add new models
for available_model_name in available_models:
db_session.merge(
ModelConfiguration(
llm_provider_id=default_provider.id,
name=available_model_name,
is_visible=False,
max_input_tokens=None,
)
)
# if the default model is no longer available, set it to the first model in the list
if default_provider.default_model_name not in available_models:
task_logger.info(
f"Default model {default_provider.default_model_name} not "
f"available, setting to first model in list: {available_models[0]}"
)
default_provider.default_model_name = available_models[0]
db_session.commit()
if added_models or removed_models:
task_logger.info("Updated model list for default provider.")
except Exception:
task_logger.exception("Error in auto LLM update task")
raise
return True

View File

@@ -886,9 +886,7 @@ def monitor_celery_queues_helper(
OnyxCeleryQueues.CONNECTOR_DOC_FETCHING, r_celery
)
n_docprocessing = celery_get_queue_length(OnyxCeleryQueues.DOCPROCESSING, r_celery)
n_user_files_indexing = celery_get_queue_length(
OnyxCeleryQueues.USER_FILES_INDEXING, r_celery
)
n_user_file_processing = celery_get_queue_length(
OnyxCeleryQueues.USER_FILE_PROCESSING, r_celery
)
@@ -924,7 +922,6 @@ def monitor_celery_queues_helper(
f"docfetching_prefetched={len(n_docfetching_prefetched)} "
f"docprocessing={n_docprocessing} "
f"docprocessing_prefetched={len(n_docprocessing_prefetched)} "
f"user_files_indexing={n_user_files_indexing} "
f"user_file_processing={n_user_file_processing} "
f"user_file_project_sync={n_user_file_project_sync} "
f"user_file_delete={n_user_file_delete} "

View File

@@ -1,6 +1,5 @@
import datetime
import time
from collections.abc import Sequence
from typing import Any
from uuid import UUID
@@ -13,39 +12,33 @@ from retry import retry
from sqlalchemy import select
from onyx.background.celery.apps.app_base import task_logger
from onyx.background.celery.celery_redis import celery_get_queue_length
from onyx.background.celery.celery_utils import httpx_init_vespa_pool
from onyx.background.celery.tasks.shared.RetryDocumentIndex import RetryDocumentIndex
from onyx.configs.app_configs import MANAGED_VESPA
from onyx.configs.app_configs import VESPA_CLOUD_CERT_PATH
from onyx.configs.app_configs import VESPA_CLOUD_KEY_PATH
from onyx.configs.constants import CELERY_GENERIC_BEAT_LOCK_TIMEOUT
from onyx.configs.constants import CELERY_USER_FILE_DOCID_MIGRATION_LOCK_TIMEOUT
from onyx.configs.constants import CELERY_USER_FILE_PROCESSING_LOCK_TIMEOUT
from onyx.configs.constants import CELERY_USER_FILE_PROCESSING_TASK_EXPIRES
from onyx.configs.constants import CELERY_USER_FILE_PROJECT_SYNC_LOCK_TIMEOUT
from onyx.configs.constants import DocumentSource
from onyx.configs.constants import FileOrigin
from onyx.configs.constants import OnyxCeleryPriority
from onyx.configs.constants import OnyxCeleryQueues
from onyx.configs.constants import OnyxCeleryTask
from onyx.configs.constants import OnyxRedisLocks
from onyx.configs.constants import USER_FILE_PROCESSING_MAX_QUEUE_DEPTH
from onyx.connectors.file.connector import LocalFileConnector
from onyx.connectors.models import Document
from onyx.db.engine.sql_engine import get_session_with_current_tenant
from onyx.db.enums import UserFileStatus
from onyx.db.models import FileRecord
from onyx.db.models import SearchDoc
from onyx.db.models import UserFile
from onyx.db.search_settings import get_active_search_settings
from onyx.db.search_settings import get_active_search_settings_list
from onyx.document_index.factory import get_default_document_index
from onyx.document_index.interfaces import VespaDocumentFields
from onyx.document_index.interfaces import VespaDocumentUserFields
from onyx.document_index.vespa.shared_utils.utils import (
replace_invalid_doc_id_characters,
)
from onyx.document_index.vespa_constants import DOCUMENT_ID_ENDPOINT
from onyx.file_store.file_store import get_default_file_store
from onyx.file_store.file_store import S3BackedFileStore
from onyx.file_store.utils import user_file_id_to_plaintext_file_name
from onyx.httpx.httpx_pool import HttpxPool
from onyx.indexing.adapters.user_file_indexing_adapter import UserFileIndexingAdapter
@@ -63,6 +56,17 @@ def _user_file_lock_key(user_file_id: str | UUID) -> str:
return f"{OnyxRedisLocks.USER_FILE_PROCESSING_LOCK_PREFIX}:{user_file_id}"
def _user_file_queued_key(user_file_id: str | UUID) -> str:
"""Key that exists while a process_single_user_file task is sitting in the queue.
The beat generator sets this with a TTL equal to CELERY_USER_FILE_PROCESSING_TASK_EXPIRES
before enqueuing and the worker deletes it as its first action. This prevents
the beat from adding duplicate tasks for files that already have a live task
in flight.
"""
return f"{OnyxRedisLocks.USER_FILE_QUEUED_PREFIX}:{user_file_id}"
def _user_file_project_sync_lock_key(user_file_id: str | UUID) -> str:
return f"{OnyxRedisLocks.USER_FILE_PROJECT_SYNC_LOCK_PREFIX}:{user_file_id}"
@@ -126,7 +130,24 @@ def _get_document_chunk_count(
def check_user_file_processing(self: Task, *, tenant_id: str) -> None:
"""Scan for user files with PROCESSING status and enqueue per-file tasks.
Uses direct Redis locks to avoid overlapping runs.
Three mechanisms prevent queue runaway:
1. **Queue depth backpressure**: if the broker queue already has more than
USER_FILE_PROCESSING_MAX_QUEUE_DEPTH items we skip this beat cycle
entirely. Workers are clearly behind; adding more tasks would only make
the backlog worse.
2. **Per-file queued guard**: before enqueuing a task we set a short-lived
Redis key (TTL = CELERY_USER_FILE_PROCESSING_TASK_EXPIRES). If that key
already exists the file already has a live task in the queue, so we skip
it. The worker deletes the key the moment it picks up the task so the
next beat cycle can re-enqueue if the file is still PROCESSING.
3. **Task expiry**: every enqueued task carries an `expires` value equal to
CELERY_USER_FILE_PROCESSING_TASK_EXPIRES. If a task is still sitting in
the queue after that deadline, Celery discards it without touching the DB.
This is a belt-and-suspenders defence: even if the guard key is lost (e.g.
Redis restart), stale tasks evict themselves rather than piling up forever.
"""
task_logger.info("check_user_file_processing - Starting")
@@ -141,7 +162,21 @@ def check_user_file_processing(self: Task, *, tenant_id: str) -> None:
return None
enqueued = 0
skipped_guard = 0
try:
# --- Protection 1: queue depth backpressure ---
r_celery = self.app.broker_connection().channel().client # type: ignore
queue_len = celery_get_queue_length(
OnyxCeleryQueues.USER_FILE_PROCESSING, r_celery
)
if queue_len > USER_FILE_PROCESSING_MAX_QUEUE_DEPTH:
task_logger.warning(
f"check_user_file_processing - Queue depth {queue_len} exceeds "
f"{USER_FILE_PROCESSING_MAX_QUEUE_DEPTH}, skipping enqueue for "
f"tenant={tenant_id}"
)
return None
with get_session_with_current_tenant() as db_session:
user_file_ids = (
db_session.execute(
@@ -154,12 +189,35 @@ def check_user_file_processing(self: Task, *, tenant_id: str) -> None:
)
for user_file_id in user_file_ids:
self.app.send_task(
OnyxCeleryTask.PROCESS_SINGLE_USER_FILE,
kwargs={"user_file_id": str(user_file_id), "tenant_id": tenant_id},
queue=OnyxCeleryQueues.USER_FILE_PROCESSING,
priority=OnyxCeleryPriority.HIGH,
# --- Protection 2: per-file queued guard ---
queued_key = _user_file_queued_key(user_file_id)
guard_set = redis_client.set(
queued_key,
1,
ex=CELERY_USER_FILE_PROCESSING_TASK_EXPIRES,
nx=True,
)
if not guard_set:
skipped_guard += 1
continue
# --- Protection 3: task expiry ---
# If task submission fails, clear the guard immediately so the
# next beat cycle can retry enqueuing this file.
try:
self.app.send_task(
OnyxCeleryTask.PROCESS_SINGLE_USER_FILE,
kwargs={
"user_file_id": str(user_file_id),
"tenant_id": tenant_id,
},
queue=OnyxCeleryQueues.USER_FILE_PROCESSING,
priority=OnyxCeleryPriority.HIGH,
expires=CELERY_USER_FILE_PROCESSING_TASK_EXPIRES,
)
except Exception:
redis_client.delete(queued_key)
raise
enqueued += 1
finally:
@@ -167,7 +225,8 @@ def check_user_file_processing(self: Task, *, tenant_id: str) -> None:
lock.release()
task_logger.info(
f"check_user_file_processing - Enqueued {enqueued} tasks for tenant={tenant_id}"
f"check_user_file_processing - Enqueued {enqueued} skipped_guard={skipped_guard} "
f"tasks for tenant={tenant_id}"
)
return None
@@ -182,6 +241,12 @@ def process_single_user_file(self: Task, *, user_file_id: str, tenant_id: str) -
start = time.monotonic()
redis_client = get_redis_client(tenant_id=tenant_id)
# Clear the "queued" guard set by the beat generator so that the next beat
# cycle can re-enqueue this file if it is still in PROCESSING state after
# this task completes or fails.
redis_client.delete(_user_file_queued_key(user_file_id))
file_lock: RedisLock = redis_client.lock(
_user_file_lock_key(user_file_id),
timeout=CELERY_USER_FILE_PROCESSING_LOCK_TIMEOUT,
@@ -618,315 +683,3 @@ def process_single_user_file_project_sync(
file_lock.release()
return None
def _normalize_legacy_user_file_doc_id(old_id: str) -> str:
# Convert USER_FILE_CONNECTOR__<uuid> -> FILE_CONNECTOR__<uuid> for legacy values
user_prefix = "USER_FILE_CONNECTOR__"
file_prefix = "FILE_CONNECTOR__"
if old_id.startswith(user_prefix):
remainder = old_id[len(user_prefix) :]
return file_prefix + remainder
return old_id
def update_legacy_plaintext_file_records() -> None:
"""Migrate legacy plaintext cache objects from int-based keys to UUID-based
keys. Copies each S3 object to its expected UUID key and updates DB.
Examples:
- Old key: bucket/schema/plaintext_<int>
- New key: bucket/schema/plaintext_<uuid>
"""
task_logger.info("update_legacy_plaintext_file_records - Starting")
with get_session_with_current_tenant() as db_session:
store = get_default_file_store()
if not isinstance(store, S3BackedFileStore):
task_logger.info(
"update_legacy_plaintext_file_records - Skipping non-S3 store"
)
return
s3_client = store._get_s3_client()
bucket_name = store._get_bucket_name()
# Select PLAINTEXT_CACHE records whose object_key ends with 'plaintext_' + non-hyphen chars
# Example: 'some/path/plaintext_abc123' matches; '.../plaintext_foo-bar' does not
plaintext_records: Sequence[FileRecord] = (
db_session.execute(
sa.select(FileRecord).where(
FileRecord.file_origin == FileOrigin.PLAINTEXT_CACHE,
FileRecord.object_key.op("~")(r"plaintext_[^-]+$"),
)
)
.scalars()
.all()
)
task_logger.info(
f"update_legacy_plaintext_file_records - Found {len(plaintext_records)} plaintext records to update"
)
normalized = 0
for fr in plaintext_records:
try:
expected_key = store._get_s3_key(fr.file_id)
if fr.object_key == expected_key:
continue
if fr.bucket_name is None:
task_logger.warning(f"id={fr.file_id} - Bucket name is None")
continue
if fr.object_key is None:
task_logger.warning(f"id={fr.file_id} - Object key is None")
continue
# Copy old object to new key
copy_source = f"{fr.bucket_name}/{fr.object_key}"
s3_client.copy_object(
CopySource=copy_source,
Bucket=bucket_name,
Key=expected_key,
MetadataDirective="COPY",
)
# Delete old object (best-effort)
try:
s3_client.delete_object(Bucket=fr.bucket_name, Key=fr.object_key)
except Exception:
pass
# Update DB record with new key
fr.object_key = expected_key
db_session.add(fr)
normalized += 1
except Exception as e:
task_logger.warning(f"id={fr.file_id} - {e.__class__.__name__}")
if normalized:
db_session.commit()
task_logger.info(
f"user_file_docid_migration_task normalized {normalized} plaintext objects"
)
@shared_task(
name=OnyxCeleryTask.USER_FILE_DOCID_MIGRATION,
ignore_result=True,
bind=True,
)
def user_file_docid_migration_task(self: Task, *, tenant_id: str) -> bool:
task_logger.info(
f"user_file_docid_migration_task - Starting for tenant={tenant_id}"
)
redis_client = get_redis_client(tenant_id=tenant_id)
lock: RedisLock = redis_client.lock(
OnyxRedisLocks.USER_FILE_DOCID_MIGRATION_LOCK,
timeout=CELERY_USER_FILE_DOCID_MIGRATION_LOCK_TIMEOUT,
)
if not lock.acquire(blocking=False):
task_logger.info(
f"user_file_docid_migration_task - Lock held, skipping tenant={tenant_id}"
)
return False
updated_count = 0
try:
update_legacy_plaintext_file_records()
# Track lock renewal
last_lock_time = time.monotonic()
with get_session_with_current_tenant() as db_session:
# 20 is the documented default for httpx max_keepalive_connections
if MANAGED_VESPA:
httpx_init_vespa_pool(
20, ssl_cert=VESPA_CLOUD_CERT_PATH, ssl_key=VESPA_CLOUD_KEY_PATH
)
else:
httpx_init_vespa_pool(20)
active_settings = get_active_search_settings(db_session)
document_index = get_default_document_index(
search_settings=active_settings.primary,
secondary_search_settings=active_settings.secondary,
httpx_client=HttpxPool.get("vespa"),
)
retry_index = RetryDocumentIndex(document_index)
# Select user files with a legacy doc id that have not been migrated
user_files = (
db_session.execute(
sa.select(UserFile).where(
sa.and_(
UserFile.document_id.is_not(None),
UserFile.document_id_migrated.is_(False),
)
)
)
.scalars()
.all()
)
task_logger.info(
f"user_file_docid_migration_task - Found {len(user_files)} user files to migrate"
)
# Query all SearchDocs that need updating
search_docs = (
db_session.execute(
sa.select(SearchDoc).where(
SearchDoc.document_id.like("%FILE_CONNECTOR__%")
)
)
.scalars()
.all()
)
task_logger.info(
f"user_file_docid_migration_task - Found {len(search_docs)} search docs to update"
)
# Build a map of normalized doc IDs to SearchDocs
search_doc_map: dict[str, list[SearchDoc]] = {}
for sd in search_docs:
doc_id = sd.document_id
if search_doc_map.get(doc_id) is None:
search_doc_map[doc_id] = []
search_doc_map[doc_id].append(sd)
task_logger.debug(
f"user_file_docid_migration_task - Built search doc map with {len(search_doc_map)} entries"
)
ids_preview = list(search_doc_map.keys())[:5]
task_logger.debug(
f"user_file_docid_migration_task - First few search_doc_map ids: {ids_preview if ids_preview else 'No ids found'}"
)
task_logger.debug(
f"user_file_docid_migration_task - search_doc_map total items: "
f"{sum(len(docs) for docs in search_doc_map.values())}"
)
for user_file in user_files:
# Periodically renew the Redis lock to prevent expiry mid-run
current_time = time.monotonic()
if current_time - last_lock_time >= (
CELERY_USER_FILE_DOCID_MIGRATION_LOCK_TIMEOUT / 4
):
renewed = False
try:
# extend lock ttl to full timeout window
lock.extend(CELERY_USER_FILE_DOCID_MIGRATION_LOCK_TIMEOUT)
renewed = True
except Exception:
# if extend fails, best-effort reacquire as a fallback
try:
lock.reacquire()
renewed = True
except Exception:
renewed = False
last_lock_time = current_time
if not renewed or not lock.owned():
task_logger.error(
"user_file_docid_migration_task - Lost lock ownership or failed to renew; aborting for safety"
)
return False
try:
clean_old_doc_id = replace_invalid_doc_id_characters(
user_file.document_id
)
normalized_doc_id = _normalize_legacy_user_file_doc_id(
clean_old_doc_id
)
user_project_ids = [project.id for project in user_file.projects]
task_logger.info(
f"user_file_docid_migration_task - Migrating user file {user_file.id} with doc_id {normalized_doc_id}"
)
index_name = active_settings.primary.index_name
# First find the chunks count using direct Vespa query
selection = f"{index_name}.document_id=='{normalized_doc_id}'"
# Count all chunks for this document
chunk_count = _get_document_chunk_count(
index_name=index_name,
selection=selection,
)
task_logger.info(
f"Found {chunk_count} chunks for document {normalized_doc_id}"
)
# Now update Vespa chunks with the found chunk count using retry_index
# WARNING: In the future this will error; we no longer want
# to support changing document ID.
# TODO(andrei): Delete soon.
retry_index.update_single(
doc_id=str(normalized_doc_id),
tenant_id=tenant_id,
chunk_count=chunk_count,
fields=VespaDocumentFields(document_id=str(user_file.id)),
user_fields=VespaDocumentUserFields(
user_projects=user_project_ids
),
)
user_file.chunk_count = chunk_count
# Update the SearchDocs
actual_doc_id = str(user_file.document_id)
normalized_actual_doc_id = _normalize_legacy_user_file_doc_id(
actual_doc_id
)
if (
normalized_doc_id in search_doc_map
or normalized_actual_doc_id in search_doc_map
):
to_update = (
search_doc_map[normalized_doc_id]
if normalized_doc_id in search_doc_map
else search_doc_map[normalized_actual_doc_id]
)
task_logger.debug(
f"user_file_docid_migration_task - Updating {len(to_update)} search docs for user file {user_file.id}"
)
for search_doc in to_update:
search_doc.document_id = str(user_file.id)
db_session.add(search_doc)
user_file.document_id_migrated = True
db_session.add(user_file)
db_session.commit()
updated_count += 1
except Exception as per_file_exc:
# Rollback the current transaction and continue with the next file
db_session.rollback()
task_logger.exception(
f"user_file_docid_migration_task - Error migrating user file {user_file.id} - "
f"{per_file_exc.__class__.__name__}"
)
task_logger.info(
f"user_file_docid_migration_task - Updated {updated_count} user files"
)
task_logger.info(
f"user_file_docid_migration_task - Completed for tenant={tenant_id} (updated={updated_count})"
)
return True
except Exception as e:
task_logger.exception(
f"user_file_docid_migration_task - Error during execution for tenant={tenant_id} "
f"(updated={updated_count}) exception={e.__class__.__name__}"
)
return False
finally:
if lock.owned():
lock.release()

View File

@@ -63,7 +63,7 @@ To ensure the LLM follows certain specific instructions, instructions are added
tool is used, a citation reminder is always added. Otherwise, by default there is no reminder. If the user configures reminders, those are added to the
final message. If a search related tool just ran and the user has reminders, both appear in a single message.
If a search related tool is called at any point during the turn, the reminder will remain at the end until the turn is over and the agent as responded.
If a search related tool is called at any point during the turn, the reminder will remain at the end until the turn is over and the agent has responded.
## Tool Calls
@@ -145,9 +145,83 @@ attention despite having global access.
In a similar concept, LLM instructions in the system prompt are structured specifically so that there are coherent sections for the LLM to attend to. This is
fairly surprising actually but if there is a line of instructions effectively saying "If you try to use some tools and find that you need more information or
need to call additional tools, you are encouraged to do this", having this in the Tool section of the System prompt makes all the LLMs follow it well but if it's
even just a paragraph away like near the beginning of the prompt, it is often often ignored. The difference is as drastic as a 30% follow rate to a 90% follow
even just a paragraph away like near the beginning of the prompt, it is often ignored. The difference is as drastic as a 30% follow rate to a 90% follow
rate even just moving the same statement a few sentences.
## Other related pointers
- How messages, files, images are stored can be found in backend/onyx/db/models.py, there is also a README.md under that directory that may be helpful.
---
# Overview of LLM flow architecture
**Concepts:**
Turn: User sends a message and AI does some set of things and responds
Step/Cycle: a single LLM inference given some context and some tools
## 1. Top Level (process_message function):
This function can be thought of as the set-up and validation layer. It ensures that the database is in a valid state, reads the
messages in the session and sets up all the necessary items to run the chat loop and state containers. The major things it does
are:
- Validates the request
- Builds the chat history for the session
- Fetches any additional context such as files and images
- Prepares all of the tools for the LLM
- Creates the state container objects for use in the loop
### Wrapper (run_chat_loop_with_state_containers function):
This wrapper is used to run the LLM flow in a background thread and monitor the emitter for stop signals. This means the top
level is as isolated from the LLM flow as possible and can continue to yield packets as soon as they are available from the lower
levels. This also means that if the lower levels fail, the top level will still guarantee a reasonable response to the user.
All of the saving and database operations are abstracted away from the lower levels.
### Emitter
The emitter is designed to be an object queue so that lower levels do not need to yield objects all the way back to the top.
This way the functions can be better designed (not everything as a generator) and more easily tested. The wrapper around the
LLM flow (run_chat_loop_with_state_containers) is used to monitor the emitter and handle packets as soon as they are available
from the lower levels. Both the emitter and the state container are mutable state objects used only to accumulate state.
There should be no logic dependent on the states of these objects, especially in the lower levels. The emitter should only take
packets and should not be used for other things.
### State Container
The state container is used to accumulate state during the LLM flow. Similar to the emitter, it should not be used for logic,
only for accumulating state. It is used to gather all of the necessary information for saving the chat turn into the database.
So it will accumulate answer tokens, reasoning tokens, tool calls, citation info, etc. This is used at the end of the flow once
the lower level is completed whether on its own or stopped by the user. At that point, all of the state is read and stored into
the database. Any of the underlying layers may add to the state container; this is fine.
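
A minimal, self-contained sketch of this emitter/state-container split (the classes below are simplified stand-ins, not the real ones, which carry more fields such as reasoning tokens and citation info):

```python
# Minimal sketch of the emitter + state-container pattern (illustrative only).
import queue
from dataclasses import dataclass, field


@dataclass
class Packet:
    turn_index: int
    obj: object


class Emitter:
    """Packet bus: lower layers put packets, the wrapper drains them."""

    def __init__(self) -> None:
        self.bus: "queue.Queue[Packet]" = queue.Queue()

    def emit(self, packet: Packet) -> None:
        self.bus.put(packet)


@dataclass
class ChatStateContainer:
    """Accumulates state only; no control flow should depend on it."""

    answer_tokens: list[str] = field(default_factory=list)
    tool_calls: list[dict] = field(default_factory=list)


if __name__ == "__main__":
    emitter = Emitter()
    state = ChatStateContainer()
    for token in ("Hello", " ", "world"):
        state.answer_tokens.append(token)              # accumulated for the final DB save
        emitter.emit(Packet(turn_index=0, obj=token))  # streamed to the top level immediately
    print("".join(state.answer_tokens))
```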
### Stopping Generation
A stop signal is checked every 300ms by the wrapper around the LLM flow. The signal itself
is stored in Redis and is set by the user calling the stop endpoint. The wrapper ensures that no matter what the lower level is
doing at the time, the thread can be killed by the top level. It does not require a cooperative cancellation from the lower level
and in fact the lower level does not know about the stop signal at all.
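
A simplified, runnable sketch of this polling behaviour (the real wrapper, which appears later in this diff, also tracks turn indices and propagates exceptions; the Redis-backed stop signal is replaced here with an in-process flag):

```python
# Simplified stop-signal polling sketch; in production the signal lives in Redis
# and is set by the stop endpoint.
import queue
import threading
import time

stop_flag = threading.Event()


def is_connected() -> bool:
    return not stop_flag.is_set()


def drain_packets(bus: "queue.Queue[str]"):
    last_check = time.monotonic()
    while True:
        try:
            pkt = bus.get(timeout=0.3)  # 300ms poll doubles as the cancel check
        except queue.Empty:
            if not is_connected():
                yield "STOP(user_cancelled)"
                return
            continue
        yield pkt
        # Also check periodically while packets are flowing.
        if time.monotonic() - last_check >= 0.3:
            if not is_connected():
                yield "STOP(user_cancelled)"
                return
            last_check = time.monotonic()


if __name__ == "__main__":
    bus: "queue.Queue[str]" = queue.Queue()
    for tok in ("a", "b", "c"):
        bus.put(tok)
    threading.Timer(0.5, stop_flag.set).start()  # simulate the user hitting stop
    for out in drain_packets(bus):
        print(out)
```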
## 2. LLM Loop (run_llm_loop function)
This function handles the logic of the Turn. It's essentially a while loop where context is added and modified (according to what
is outlined in the first half of this doc). Its main functionality is:
- Translate and truncate the context for the LLM inference
- Add context modifiers like reminders, updates to the system prompts, etc.
- Run tool calls and gather results
- Build some of the objects stored in the state container.
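
A schematic, self-contained sketch of that loop (the real run_llm_loop, visible further down in this diff, also handles forced tools, reminders, citations, and prompt caching; run_step and run_tool below are stand-in stubs):

```python
# Schematic Turn loop; run_step and run_tool are illustrative stubs.
MAX_LLM_CYCLES = 3


def run_step(history: list[str], tools_allowed: bool) -> tuple[str | None, str | None]:
    """Stub for a single LLM inference: returns (answer, tool_call)."""
    if tools_allowed and not any(h.startswith("tool:") for h in history):
        return None, "search('onyx architecture')"
    return "Final answer based on tool results.", None


def run_tool(tool_call: str) -> str:
    return f"tool:{tool_call} -> result"


def run_turn(user_message: str) -> str:
    history = [f"user:{user_message}"]
    for cycle in range(MAX_LLM_CYCLES):
        out_of_cycles = cycle == MAX_LLM_CYCLES - 1
        # On the last cycle no tools are allowed; the model must answer.
        answer, tool_call = run_step(history, tools_allowed=not out_of_cycles)
        if answer is not None:
            return answer
        history.append(run_tool(tool_call))  # feed the tool result into the next step
    raise RuntimeError("unreachable: the last cycle forbids tool calls")


if __name__ == "__main__":
    print(run_turn("How does the chat loop work?"))
```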
## 3. LLM Step (run_llm_step function)
This function is a single inference of the LLM. It's a wrapper around the LLM stream function which handles packet translations
so that the Emitter can emit individual tokens as soon as they arrive. It also keeps track of the different sections since they
do not all come at once (reasoning, answers, tool calls are all built up token by token). This layer also tracks the different
tool calls and returns them to the LLM Loop to execute.
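
A small illustrative sketch of that per-step bookkeeping: deltas arrive token by token and are routed into reasoning, answer, or tool-call buffers while being emitted as they come (the Delta fields mirror those referenced elsewhere in this diff, but the loop itself is simplified):

```python
# Simplified delta routing for a single LLM step (illustrative only).
from dataclasses import dataclass


@dataclass
class Delta:
    content: str | None = None
    reasoning_content: str | None = None
    tool_call: str | None = None


def run_step(deltas: list[Delta]) -> dict[str, object]:
    reasoning: list[str] = []
    answer: list[str] = []
    tool_calls: list[str] = []
    for delta in deltas:
        if delta.reasoning_content:
            reasoning.append(delta.reasoning_content)  # emitted as a reasoning packet
        if delta.content:
            answer.append(delta.content)               # emitted as an answer packet
        if delta.tool_call:
            tool_calls.append(delta.tool_call)         # returned to the loop to execute
    return {
        "reasoning": "".join(reasoning),
        "answer": "".join(answer),
        "tool_calls": tool_calls,
    }


if __name__ == "__main__":
    stream = [Delta(reasoning_content="think "), Delta(content="Hi"), Delta(content="!")]
    print(run_step(stream))
```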
## Things to know
- Packets are labeled with a "turn_index" field as part of the Placement of the packet. This is not the same as the backend
concept of a turn. The turn_index for the frontend indicates which block this packet belongs to. So while a reasoning + tool call
comes from the same LLM inference (same backend LLM step), they are 2 turns to the frontend because that's how it's rendered.
- There are 3 representations of "message". The first is the database model ChatMessage, which should be translated away and
not used deep in the flow. The second is ChatMessageSimple, which is the data model that should be used throughout the code
as much as possible. If modifications/additions are needed, it should be to this object. This is the rich representation of a
message for the code. Finally there is the LanguageModelInput representation of a message. This one is for the LLM interface
layer and is as stripped down as possible so that the LLM interface can be clean and easy to maintain/extend.
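
A compact sketch of where each representation lives and how one flows into the next (the class bodies are simplified stand-ins for the real models, and the tokenizer is a placeholder):

```python
# Simplified stand-ins for the three message representations (illustrative only).
from dataclasses import dataclass


@dataclass
class ChatMessage:          # database model: translated away early, not used deep in the flow
    id: int
    message: str
    message_type: str


@dataclass
class ChatMessageSimple:    # rich in-code representation used throughout the flow
    message: str
    message_type: str
    token_count: int
    should_cache: bool = False


def to_simple(db_msg: ChatMessage) -> ChatMessageSimple:
    return ChatMessageSimple(
        message=db_msg.message,
        message_type=db_msg.message_type,
        token_count=len(db_msg.message.split()),  # placeholder tokenizer
    )


def to_llm_input(msgs: list[ChatMessageSimple]) -> list[dict]:
    # LanguageModelInput: stripped-down role/content payload for the LLM interface layer
    return [{"role": m.message_type, "content": m.message} for m in msgs]


if __name__ == "__main__":
    db_msg = ChatMessage(id=1, message="Hello there", message_type="user")
    print(to_llm_input([to_simple(db_msg)]))
```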

View File

@@ -0,0 +1,57 @@
from uuid import UUID
from redis.client import Redis
# Redis key prefixes for chat message processing
PREFIX = "chatprocessing"
FENCE_PREFIX = f"{PREFIX}_fence"
FENCE_TTL = 30 * 60 # 30 minutes
def _get_fence_key(chat_session_id: UUID) -> str:
"""
Generate the Redis key for a chat session processing a message.
Args:
chat_session_id: The UUID of the chat session
Returns:
The fence key string (tenant_id is automatically added by the Redis client)
"""
return f"{FENCE_PREFIX}_{chat_session_id}"
def set_processing_status(
chat_session_id: UUID, redis_client: Redis, value: bool
) -> None:
"""
Set or clear the fence for a chat session processing a message.
If the key exists, we are processing a message. If the key does not exist, we are not processing a message.
Args:
chat_session_id: The UUID of the chat session
redis_client: The Redis client to use
value: True to set the fence, False to clear it
"""
fence_key = _get_fence_key(chat_session_id)
if value:
redis_client.set(fence_key, 0, ex=FENCE_TTL)
else:
redis_client.delete(fence_key)
def is_chat_session_processing(chat_session_id: UUID, redis_client: Redis) -> bool:
"""
Check if the chat session is processing a message.
Args:
chat_session_id: The UUID of the chat session
redis_client: The Redis client to use
Returns:
True if the chat session is processing a message, False otherwise
"""
fence_key = _get_fence_key(chat_session_id)
return bool(redis_client.exists(fence_key))
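# Usage sketch (illustrative only, not part of this module): a caller such as
# the chat message endpoint might guard processing with the fence like this.
# Only the two fence helpers above are real; the surrounding code is hypothetical.
def _example_guarded_processing(chat_session_id: UUID, redis_client: Redis) -> None:
    if is_chat_session_processing(chat_session_id, redis_client):
        raise RuntimeError("Chat session is already processing a message")
    set_processing_status(chat_session_id, redis_client, True)
    try:
        pass  # run the actual message processing here
    finally:
        # Always clear the fence, even if processing raised.
        set_processing_status(chat_session_id, redis_client, False)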

View File

@@ -1,4 +1,5 @@
import threading
import time
from collections.abc import Callable
from collections.abc import Generator
from queue import Empty
@@ -38,6 +39,7 @@ class ChatStateContainer:
self.citation_to_doc: CitationMapping = {}
# True if this turn is a clarification question (deep research flow)
self.is_clarification: bool = False
# Note: LLM cost tracking is now handled in multi_llm.py
def add_tool_call(self, tool_call: ToolCallInfo) -> None:
"""Add a tool call to the accumulated state."""
@@ -92,6 +94,7 @@ class ChatStateContainer:
def run_chat_loop_with_state_containers(
func: Callable[..., None],
completion_callback: Callable[[ChatStateContainer], None],
is_connected: Callable[[], bool],
emitter: Emitter,
state_container: ChatStateContainer,
@@ -144,6 +147,9 @@ def run_chat_loop_with_state_containers(
thread = run_in_background(run_with_exception_capture)
pkt: Packet | None = None
last_turn_index = 0 # Track the highest turn_index seen for stop packet
last_cancel_check = time.monotonic()
cancel_check_interval = 0.3 # Check for cancellation every 300ms
try:
while True:
# Poll queue with 300ms timeout for natural stop signal checking
@@ -152,20 +158,51 @@ def run_chat_loop_with_state_containers(
pkt = emitter.bus.get(timeout=0.3)
except Empty:
if not is_connected():
# Stop signal detected, kill the thread
# Stop signal detected
yield Packet(
placement=Placement(turn_index=last_turn_index + 1),
obj=OverallStop(type="stop", stop_reason="user_cancelled"),
)
break
last_cancel_check = time.monotonic()
continue
if pkt is not None:
if pkt.obj == OverallStop(type="stop"):
# Track the highest turn_index for the stop packet
if pkt.placement and pkt.placement.turn_index > last_turn_index:
last_turn_index = pkt.placement.turn_index
if isinstance(pkt.obj, OverallStop):
yield pkt
break
elif isinstance(pkt.obj, PacketException):
raise pkt.obj.exception
else:
yield pkt
# Check for cancellation periodically even when packets are flowing
# This ensures stop signal is checked during active streaming
current_time = time.monotonic()
if current_time - last_cancel_check >= cancel_check_interval:
if not is_connected():
# Stop signal detected during streaming
yield Packet(
placement=Placement(turn_index=last_turn_index + 1),
obj=OverallStop(type="stop", stop_reason="user_cancelled"),
)
break
last_cancel_check = current_time
finally:
# Wait for thread to complete on normal exit to propagate exceptions and ensure cleanup.
# Skip waiting if user disconnected to exit quickly.
if is_connected():
wait_on_background(thread)
try:
completion_callback(state_container)
except Exception as e:
emitter.emit(
Packet(
placement=Placement(turn_index=last_turn_index + 1),
obj=PacketException(type="error", exception=e),
)
)

View File

@@ -26,6 +26,7 @@ from onyx.context.search.models import RerankingDetails
from onyx.context.search.models import RetrievalDetails
from onyx.db.chat import create_chat_session
from onyx.db.chat import get_chat_messages_by_session
from onyx.db.chat import get_or_create_root_message
from onyx.db.kg_config import get_kg_config_settings
from onyx.db.kg_config import is_kg_config_settings_enabled_valid
from onyx.db.llm import fetch_existing_doc_sets
@@ -37,6 +38,7 @@ from onyx.db.models import SearchDoc as DbSearchDoc
from onyx.db.models import Tool
from onyx.db.models import User
from onyx.db.models import UserFile
from onyx.db.projects import check_project_ownership
from onyx.db.search_settings import get_current_search_settings
from onyx.file_processing.extract_file_text import extract_file_text
from onyx.file_store.file_store import get_default_file_store
@@ -51,7 +53,9 @@ from onyx.natural_language_processing.utils import BaseTokenizer
from onyx.prompts.chat_prompts import ADDITIONAL_CONTEXT_PROMPT
from onyx.prompts.chat_prompts import TOOL_CALL_RESPONSE_CROSS_MESSAGE
from onyx.prompts.tool_prompts import TOOL_CALL_FAILURE_PROMPT
from onyx.server.query_and_chat.models import ChatSessionCreationRequest
from onyx.server.query_and_chat.models import CreateChatMessageRequest
from onyx.server.query_and_chat.models import MessageOrigin
from onyx.server.query_and_chat.streaming_models import CitationInfo
from onyx.tools.models import ToolCallKickoff
from onyx.tools.tool_implementations.custom.custom_tool import (
@@ -61,9 +65,45 @@ from onyx.utils.logger import setup_logger
from onyx.utils.threadpool_concurrency import run_functions_tuples_in_parallel
from onyx.utils.timing import log_function_time
logger = setup_logger()
def create_chat_session_from_request(
chat_session_request: ChatSessionCreationRequest,
user_id: UUID | None,
db_session: Session,
) -> ChatSession:
"""Create a chat session from a ChatSessionCreationRequest.
Includes project ownership validation when project_id is provided.
Args:
chat_session_request: The request containing persona_id, description, and project_id
user_id: The ID of the user creating the session (can be None for anonymous)
db_session: The database session
Returns:
The newly created ChatSession
Raises:
ValueError: If user lacks access to the specified project
Exception: If the persona is invalid
"""
project_id = chat_session_request.project_id
if project_id:
if not check_project_ownership(project_id, user_id, db_session):
raise ValueError("User does not have access to project")
return create_chat_session(
db_session=db_session,
description=chat_session_request.description or "",
user_id=user_id,
persona_id=chat_session_request.persona_id,
project_id=chat_session_request.project_id,
)
def prepare_chat_message_request(
message_text: str,
user: User | None,
@@ -77,6 +117,8 @@ def prepare_chat_message_request(
skip_gen_ai_answer_generation: bool = False,
llm_override: LLMOverride | None = None,
allowed_tool_ids: list[int] | None = None,
forced_tool_ids: list[int] | None = None,
origin: MessageOrigin | None = None,
) -> CreateChatMessageRequest:
# Typically used for one shot flows like SlackBot or non-chat API endpoint use cases
new_chat_session = create_chat_session(
@@ -103,6 +145,8 @@ def prepare_chat_message_request(
skip_gen_ai_answer_generation=skip_gen_ai_answer_generation,
llm_override=llm_override,
allowed_tool_ids=allowed_tool_ids,
forced_tool_ids=forced_tool_ids,
origin=origin or MessageOrigin.UNKNOWN,
)
@@ -164,13 +208,15 @@ def create_chat_history_chain(
)
if not all_chat_messages:
raise RuntimeError("No messages in Chat Session")
root_message = all_chat_messages[0]
if root_message.parent_message is not None:
raise RuntimeError(
"Invalid root message, unable to fetch valid chat message sequence"
root_message = get_or_create_root_message(
chat_session_id=chat_session_id, db_session=db_session
)
else:
root_message = all_chat_messages[0]
if root_message.parent_message is not None:
raise RuntimeError(
"Invalid root message, unable to fetch valid chat message sequence"
)
current_message: ChatMessage | None = root_message
previous_message: ChatMessage | None = None
@@ -201,9 +247,6 @@ def create_chat_history_chain(
previous_message = current_message
if not mainline_messages:
raise RuntimeError("Could not trace chat message history")
return mainline_messages

View File

@@ -99,7 +99,7 @@ def _build_project_file_citation_mapping(
def construct_message_history(
system_prompt: ChatMessageSimple,
system_prompt: ChatMessageSimple | None,
custom_agent_prompt: ChatMessageSimple | None,
simple_chat_history: list[ChatMessageSimple],
reminder_message: ChatMessageSimple | None,
@@ -114,7 +114,7 @@ def construct_message_history(
)
history_token_budget = available_tokens
history_token_budget -= system_prompt.token_count
history_token_budget -= system_prompt.token_count if system_prompt else 0
history_token_budget -= (
custom_agent_prompt.token_count if custom_agent_prompt else 0
)
@@ -125,9 +125,12 @@ def construct_message_history(
if history_token_budget < 0:
raise ValueError("Not enough tokens available to construct message history")
if system_prompt:
system_prompt.should_cache = True
# If no history, build minimal context
if not simple_chat_history:
result = [system_prompt]
result = [system_prompt] if system_prompt else []
if custom_agent_prompt:
result.append(custom_agent_prompt)
if project_files and project_files.project_file_texts:
@@ -199,6 +202,7 @@ def construct_message_history(
for msg in reversed(history_before_last_user):
if current_token_count + msg.token_count <= remaining_budget:
msg.should_cache = True
truncated_history_before.insert(0, msg)
current_token_count += msg.token_count
else:
@@ -218,7 +222,7 @@ def construct_message_history(
# Build the final message list according to README ordering:
# [system], [history_before_last_user], [custom_agent], [project_files],
# [last_user_message], [messages_after_last_user], [reminder]
result = [system_prompt]
result = [system_prompt] if system_prompt else []
# 1. Add truncated history before last user message
result.extend(truncated_history_before)
@@ -342,8 +346,13 @@ def run_llm_loop(
has_called_search_tool: bool = False
citation_mapping: dict[int, str] = {} # Maps citation_num -> document_id/URL
default_base_system_prompt: str = get_default_base_system_prompt(db_session)
system_prompt = None
custom_agent_prompt_msg = None
reasoning_cycles = 0
for llm_cycle_count in range(MAX_LLM_CYCLES):
out_of_cycles = llm_cycle_count == MAX_LLM_CYCLES - 1
if forced_tool_id:
# Needs to be just the single one because the "required" currently doesn't have a specified tool, just a binary
final_tools = [tool for tool in tools if tool.id == forced_tool_id]
@@ -351,7 +360,7 @@ def run_llm_loop(
raise ValueError(f"Tool {forced_tool_id} not found in tools")
tool_choice = ToolChoiceOptions.REQUIRED
forced_tool_id = None
elif llm_cycle_count == MAX_LLM_CYCLES - 1 or ran_image_gen:
elif out_of_cycles or ran_image_gen:
# Last cycle, no tools allowed, just answer!
tool_choice = ToolChoiceOptions.NONE
final_tools = []
@@ -370,35 +379,47 @@ def run_llm_loop(
)
custom_agent_prompt_msg = None
else:
# System message and custom agent message are both included.
open_ai_formatting_enabled = model_needs_formatting_reenabled(
llm.config.model_name
)
system_prompt_str = build_system_prompt(
base_system_prompt=get_default_base_system_prompt(db_session),
datetime_aware=persona.datetime_aware if persona else True,
memories=memories,
tools=tools,
should_cite_documents=should_cite_documents
or always_cite_documents,
open_ai_formatting_enabled=open_ai_formatting_enabled,
)
system_prompt = ChatMessageSimple(
message=system_prompt_str,
token_count=token_counter(system_prompt_str),
message_type=MessageType.SYSTEM,
)
custom_agent_prompt_msg = (
ChatMessageSimple(
message=custom_agent_prompt,
token_count=token_counter(custom_agent_prompt),
message_type=MessageType.USER,
# If it's an empty string, we assume the user does not want to include it as an empty System message
if default_base_system_prompt:
open_ai_formatting_enabled = model_needs_formatting_reenabled(
llm.config.model_name
)
if custom_agent_prompt
else None
)
system_prompt_str = build_system_prompt(
base_system_prompt=default_base_system_prompt,
datetime_aware=persona.datetime_aware if persona else True,
memories=memories,
tools=tools,
should_cite_documents=should_cite_documents
or always_cite_documents,
open_ai_formatting_enabled=open_ai_formatting_enabled,
)
system_prompt = ChatMessageSimple(
message=system_prompt_str,
token_count=token_counter(system_prompt_str),
message_type=MessageType.SYSTEM,
)
custom_agent_prompt_msg = (
ChatMessageSimple(
message=custom_agent_prompt,
token_count=token_counter(custom_agent_prompt),
message_type=MessageType.USER,
)
if custom_agent_prompt
else None
)
else:
# If there is a custom agent prompt, it replaces the system prompt when the default system prompt is empty
system_prompt = (
ChatMessageSimple(
message=custom_agent_prompt,
token_count=token_counter(custom_agent_prompt),
message_type=MessageType.SYSTEM,
)
if custom_agent_prompt
else None
)
custom_agent_prompt_msg = None
reminder_message_text: str | None
if ran_image_gen:
@@ -406,7 +427,7 @@ def run_llm_loop(
# This is to prevent it generating things like:
# [Cute Cat](attachment://a_cute_cat_sitting_playfully.png)
reminder_message_text = IMAGE_GEN_REMINDER
elif just_ran_web_search:
elif just_ran_web_search and not out_of_cycles:
reminder_message_text = OPEN_URL_REMINDER
else:
# This is the default case, the LLM at this point may answer so it is important
@@ -417,6 +438,7 @@ def run_llm_loop(
),
include_citation_reminder=should_cite_documents
or always_cite_documents,
is_last_cycle=out_of_cycles,
)
reminder_msg = (
@@ -483,7 +505,7 @@ def run_llm_loop(
# in-flight citations
# It can be cleaned up but not super trivial or worthwhile right now
just_ran_web_search = False
tool_responses, citation_mapping = run_tool_calls(
parallel_tool_call_results = run_tool_calls(
tool_calls=tool_calls,
tools=final_tools,
message_history=truncated_message_history,
@@ -491,8 +513,11 @@ def run_llm_loop(
user_info=None, # TODO, this is part of memories right now, might want to separate it out
citation_mapping=citation_mapping,
next_citation_num=citation_processor.get_next_citation_number(),
max_concurrent_tools=None,
skip_search_query_expansion=has_called_search_tool,
)
tool_responses = parallel_tool_call_results.tool_responses
citation_mapping = parallel_tool_call_results.updated_citation_mapping
# Failure case, give something reasonable to the LLM to try again
if tool_calls and not tool_responses:

View File

@@ -1,4 +1,5 @@
import json
import time
from collections.abc import Callable
from collections.abc import Generator
from collections.abc import Mapping
@@ -17,6 +18,7 @@ from onyx.context.search.models import SearchDoc
from onyx.file_store.models import ChatFileType
from onyx.llm.interfaces import LanguageModelInput
from onyx.llm.interfaces import LLM
from onyx.llm.interfaces import LLMConfig
from onyx.llm.interfaces import LLMUserIdentity
from onyx.llm.interfaces import ToolChoiceOptions
from onyx.llm.model_response import Delta
@@ -31,6 +33,7 @@ from onyx.llm.models import TextContentPart
from onyx.llm.models import ToolCall
from onyx.llm.models import ToolMessage
from onyx.llm.models import UserMessage
from onyx.llm.prompt_cache.processor import process_with_prompt_cache
from onyx.server.query_and_chat.placement import Placement
from onyx.server.query_and_chat.streaming_models import AgentResponseDelta
from onyx.server.query_and_chat.streaming_models import AgentResponseStart
@@ -46,7 +49,6 @@ from onyx.tracing.framework.create import generation_span
from onyx.utils.b64 import get_image_type_from_bytes
from onyx.utils.logger import setup_logger
logger = setup_logger()
@@ -277,6 +279,7 @@ def _extract_tool_call_kickoffs(
def translate_history_to_llm_format(
history: list[ChatMessageSimple],
llm_config: LLMConfig,
) -> LanguageModelInput:
"""Convert a list of ChatMessageSimple to LanguageModelInput format.
@@ -284,8 +287,23 @@ def translate_history_to_llm_format(
handling different message types and image files for multimodal support.
"""
messages: list[ChatCompletionMessage] = []
last_cacheable_msg_idx = -1
all_previous_msgs_cacheable = True
for idx, msg in enumerate(history):
# if the message is being added to the history
if msg.message_type in [
MessageType.SYSTEM,
MessageType.USER,
MessageType.ASSISTANT,
MessageType.TOOL_CALL_RESPONSE,
]:
all_previous_msgs_cacheable = (
all_previous_msgs_cacheable and msg.should_cache
)
if all_previous_msgs_cacheable:
last_cacheable_msg_idx = idx
for msg in history:
if msg.message_type == MessageType.SYSTEM:
system_msg = SystemMessage(
role="system",
@@ -395,7 +413,7 @@ def translate_history_to_llm_format(
assistant_msg_with_tool = AssistantMessage(
role="assistant",
content=None, # The tool call is parsed, doesn't need to be duplicated in the content
tool_calls=tool_calls if tool_calls else None,
tool_calls=tool_calls or None,
)
messages.append(assistant_msg_with_tool)
@@ -417,6 +435,18 @@ def translate_history_to_llm_format(
f"Unknown message type {msg.message_type} in history. Skipping message."
)
# prompt caching: rely on should_cache in ChatMessageSimple to
# pick the split point for the cacheable prefix and suffix
if last_cacheable_msg_idx != -1:
processed_messages, _ = process_with_prompt_cache(
llm_config=llm_config,
cacheable_prefix=messages[: last_cacheable_msg_idx + 1],
suffix=messages[last_cacheable_msg_idx + 1 :],
continuation=False,
)
assert isinstance(processed_messages, list) # for mypy
messages = processed_messages
return messages
@@ -429,6 +459,10 @@ def _increment_turns(
return turn_index, sub_turn_index + 1
def _delta_has_action(delta: Delta) -> bool:
return bool(delta.content or delta.reasoning_content or delta.tool_calls)
def run_llm_step_pkt_generator(
history: list[ChatMessageSimple],
tool_definitions: list[dict],
@@ -499,7 +533,7 @@ def run_llm_step_pkt_generator(
tab_index = placement.tab_index
sub_turn_index = placement.sub_turn_index
llm_msg_history = translate_history_to_llm_format(history)
llm_msg_history = translate_history_to_llm_format(history, llm.config)
has_reasoned = 0
# Uncomment the line below to log the entire message history to the console
@@ -526,6 +560,8 @@ def run_llm_step_pkt_generator(
span_generation.span_data.input = cast(
Sequence[Mapping[str, Any]], llm_msg_history
)
stream_start_time = time.monotonic()
first_action_recorded = False
for packet in llm.stream(
prompt=llm_msg_history,
tools=tool_definitions,
@@ -543,7 +579,13 @@ def run_llm_step_pkt_generator(
"cache_read_input_tokens": usage.cache_read_input_tokens,
"cache_creation_input_tokens": usage.cache_creation_input_tokens,
}
# Note: LLM cost tracking is now handled in multi_llm.py
delta = packet.choice.delta
if not first_action_recorded and _delta_has_action(delta):
span_generation.span_data.time_to_first_action_seconds = (
time.monotonic() - stream_start_time
)
first_action_recorded = True
if custom_token_processor:
# The custom token processor can modify the deltas for specific custom logic
@@ -702,6 +744,15 @@ def run_llm_step_pkt_generator(
# Flush custom token processor to get any final tool calls
if custom_token_processor:
flush_delta, processor_state = custom_token_processor(None, processor_state)
if (
not first_action_recorded
and flush_delta is not None
and _delta_has_action(flush_delta)
):
span_generation.span_data.time_to_first_action_seconds = (
time.monotonic() - stream_start_time
)
first_action_recorded = True
if flush_delta and flush_delta.tool_calls:
for tool_call_delta in flush_delta.tool_calls:
_update_tool_call_with_delta(id_to_tool_call_map, tool_call_delta)

View File

@@ -3,6 +3,7 @@ from collections.abc import Iterator
from datetime import datetime
from enum import Enum
from typing import Any
from uuid import UUID
from pydantic import BaseModel
from pydantic import Field
@@ -16,7 +17,9 @@ from onyx.context.search.models import SearchDoc
from onyx.file_store.models import FileDescriptor
from onyx.file_store.models import InMemoryChatFile
from onyx.server.query_and_chat.streaming_models import CitationInfo
from onyx.server.query_and_chat.streaming_models import GeneratedImage
from onyx.server.query_and_chat.streaming_models import Packet
from onyx.tools.models import SearchToolUsage
from onyx.tools.models import ToolCallKickoff
from onyx.tools.tool_implementations.custom.base_tool_types import ToolResultType
@@ -132,6 +135,13 @@ class ToolConfig(BaseModel):
id: int
class ProjectSearchConfig(BaseModel):
"""Configuration for search tool availability in project context."""
search_usage: SearchToolUsage
disable_forced_tool: bool
class PromptOverrideConfig(BaseModel):
name: str
description: str = ""
@@ -171,6 +181,10 @@ AnswerQuestionPossibleReturn = (
)
class CreateChatSessionID(BaseModel):
chat_session_id: UUID
AnswerQuestionStreamReturn = Iterator[AnswerQuestionPossibleReturn]
@@ -181,12 +195,14 @@ class LLMMetricsContainer(BaseModel):
StreamProcessor = Callable[[Iterator[str]], AnswerQuestionStreamReturn]
AnswerStreamPart = (
Packet
| StreamStopInfo
| MessageResponseIDInfo
| StreamingError
| UserKnowledgeFilePacket
| CreateChatSessionID
)
AnswerStream = Iterator[AnswerStreamPart]
@@ -204,6 +220,37 @@ class ChatBasicResponse(BaseModel):
citation_info: list[CitationInfo]
class ToolCallResponse(BaseModel):
"""Tool call with full details for non-streaming response."""
tool_name: str
tool_arguments: dict[str, Any]
tool_result: str
search_docs: list[SearchDoc] | None = None
generated_images: list[GeneratedImage] | None = None
# Reasoning that led to the tool call
pre_reasoning: str | None = None
class ChatFullResponse(BaseModel):
"""Complete non-streaming response with all available data."""
# Core response fields
answer: str
answer_citationless: str
pre_answer_reasoning: str | None = None
tool_calls: list[ToolCallResponse] = []
# Documents & citations
top_documents: list[SearchDoc]
citation_info: list[CitationInfo]
# Metadata
message_id: int
chat_session_id: UUID | None = None
error_msg: str | None = None
class ChatLoadedFile(InMemoryChatFile):
content_text: str | None
token_count: int
@@ -217,6 +264,12 @@ class ChatMessageSimple(BaseModel):
image_files: list[ChatLoadedFile] | None = None
# Only for TOOL_CALL_RESPONSE type messages
tool_call_id: str | None = None
# The last message for which this is True, where it is also True for all
# previous messages (counting from the start of the history),
# marks the end of the cacheable prefix used for prompt caching
should_cache: bool = False
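A small illustrative sketch of how a caller might locate the end of the cacheable prefix from these flags; the helper is hypothetical and operates on the flags alone:

# Hypothetical helper: the cacheable prefix is the longest run of leading True flags.
def cacheable_prefix_length(should_cache_flags: list[bool]) -> int:
    length = 0
    for flag in should_cache_flags:
        if not flag:
            break
        length += 1
    return length

# e.g. [True, True, False, True] -> prefix of length 2; the trailing True is not cached
assert cacheable_prefix_length([True, True, False, True]) == 2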
class ProjectFileMetadata(BaseModel):
@@ -234,6 +287,8 @@ class ExtractedProjectFiles(BaseModel):
total_token_count: int
# Metadata for project files to enable citations
project_file_metadata: list[ProjectFileMetadata]
# None if not a project
project_uncapped_token_count: int | None
class LlmStepResult(BaseModel):

View File

@@ -1,15 +1,22 @@
"""
IMPORTANT: familiarize yourself with the design concepts prior to contributing to this file.
An overview can be found in the README.md file in this directory.
"""
import re
import traceback
from collections.abc import Callable
from collections.abc import Iterator
from uuid import UUID
from redis.client import Redis
from sqlalchemy.orm import Session
from onyx.chat.chat_processing_checker import set_processing_status
from onyx.chat.chat_state import ChatStateContainer
from onyx.chat.chat_state import run_chat_loop_with_state_containers
from onyx.chat.chat_utils import convert_chat_history
from onyx.chat.chat_utils import create_chat_history_chain
from onyx.chat.chat_utils import create_chat_session_from_request
from onyx.chat.chat_utils import get_custom_agent_prompt
from onyx.chat.chat_utils import is_last_assistant_message_clarification
from onyx.chat.chat_utils import load_all_chat_files
@@ -17,37 +24,38 @@ from onyx.chat.emitter import get_default_emitter
from onyx.chat.llm_loop import run_llm_loop
from onyx.chat.models import AnswerStream
from onyx.chat.models import ChatBasicResponse
from onyx.chat.models import ChatFullResponse
from onyx.chat.models import ChatLoadedFile
from onyx.chat.models import CreateChatSessionID
from onyx.chat.models import ExtractedProjectFiles
from onyx.chat.models import MessageResponseIDInfo
from onyx.chat.models import ProjectFileMetadata
from onyx.chat.models import ProjectSearchConfig
from onyx.chat.models import StreamingError
from onyx.chat.models import ToolCallResponse
from onyx.chat.prompt_utils import calculate_reserved_tokens
from onyx.chat.save_chat import save_chat_turn
from onyx.chat.stop_signal_checker import is_connected as check_stop_signal
from onyx.chat.stop_signal_checker import reset_cancel_status
from onyx.configs.chat_configs import CHAT_TARGET_CHUNK_PERCENTAGE
from onyx.configs.chat_configs import MAX_CHUNKS_FED_TO_CHAT
from onyx.configs.constants import DEFAULT_PERSONA_ID
from onyx.configs.constants import MessageType
from onyx.configs.constants import MilestoneRecordType
from onyx.context.search.enums import OptionalSearchSetting
from onyx.context.search.models import CitationDocInfo
from onyx.context.search.models import SearchDoc
from onyx.db.chat import create_new_chat_message
from onyx.db.chat import get_chat_message
from onyx.db.chat import get_chat_session_by_id
from onyx.db.chat import get_or_create_root_message
from onyx.db.chat import reserve_message_id
from onyx.db.engine.sql_engine import get_session_with_current_tenant
from onyx.db.memory import get_memories
from onyx.db.models import ChatMessage
from onyx.db.models import ChatSession
from onyx.db.models import User
from onyx.db.projects import get_project_token_count
from onyx.db.projects import get_user_files_from_project
from onyx.db.tools import get_tools
from onyx.deep_research.dr_loop import run_deep_research_llm_loop
from onyx.file_store.models import ChatFileType
from onyx.file_store.models import FileDescriptor
from onyx.file_store.utils import load_in_memory_chat_files
from onyx.file_store.utils import verify_user_files
from onyx.llm.factory import get_llm_for_persona
@@ -57,37 +65,34 @@ from onyx.llm.interfaces import LLMUserIdentity
from onyx.llm.utils import litellm_exception_to_error_msg
from onyx.onyxbot.slack.models import SlackContext
from onyx.redis.redis_pool import get_redis_client
from onyx.server.query_and_chat.models import AUTO_PLACE_AFTER_LATEST_MESSAGE
from onyx.server.query_and_chat.models import CreateChatMessageRequest
from onyx.server.query_and_chat.models import SendMessageRequest
from onyx.server.query_and_chat.streaming_models import AgentResponseDelta
from onyx.server.query_and_chat.streaming_models import AgentResponseStart
from onyx.server.query_and_chat.streaming_models import CitationInfo
from onyx.server.query_and_chat.streaming_models import Packet
from onyx.server.utils import get_json_line
from onyx.server.usage_limits import check_llm_cost_limit_for_provider
from onyx.tools.constants import SEARCH_TOOL_ID
from onyx.tools.interface import Tool
from onyx.tools.models import SearchToolUsage
from onyx.tools.tool_constructor import construct_tools
from onyx.tools.tool_constructor import CustomToolConfig
from onyx.tools.tool_constructor import SearchToolConfig
from onyx.tools.tool_constructor import SearchToolUsage
from onyx.utils.logger import setup_logger
from onyx.utils.long_term_log import LongTermLogger
from onyx.utils.telemetry import mt_cloud_telemetry
from onyx.utils.timing import log_function_time
from onyx.utils.timing import log_generator_function_time
from onyx.utils.variable_functionality import (
fetch_versioned_implementation_with_fallback,
)
from onyx.utils.variable_functionality import noop_fallback
from shared_configs.contextvars import get_current_tenant_id
logger = setup_logger()
ERROR_TYPE_CANCELLED = "cancelled"
class ToolCallException(Exception):
"""Exception raised for errors during tool calls."""
def __init__(self, message: str, tool_name: str | None = None):
super().__init__(message)
self.tool_name = tool_name
def _extract_project_file_texts_and_images(
project_id: int | None,
user_id: UUID | None,
@@ -126,6 +131,7 @@ def _extract_project_file_texts_and_images(
project_as_filter=False,
total_token_count=0,
project_file_metadata=[],
project_uncapped_token_count=None,
)
max_actual_tokens = (
@@ -211,159 +217,122 @@ def _extract_project_file_texts_and_images(
project_as_filter=project_as_filter,
total_token_count=total_token_count,
project_file_metadata=project_file_metadata,
project_uncapped_token_count=project_tokens,
)
def _get_project_search_availability(
project_id: int | None,
persona_id: int | None,
has_project_file_texts: bool,
forced_tool_ids: list[int] | None,
loaded_project_files: bool,
project_has_files: bool,
forced_tool_id: int | None,
search_tool_id: int | None,
) -> SearchToolUsage:
) -> ProjectSearchConfig:
"""Determine search tool availability based on project context.
Args:
project_id: The project ID if the user is in a project
persona_id: The persona ID to check if it's the default persona
has_project_file_texts: Whether project files are loaded in context
forced_tool_ids: List of forced tool IDs (may be mutated to remove search tool)
search_tool_id: The search tool ID to check against
Search is disabled when ALL of the following are true:
- User is in a project
- Using the default persona (not a custom agent)
- Project files are already loaded in context
Returns:
SearchToolUsage setting indicating how search should be used
When search is disabled and the user tried to force the search tool,
that forcing is also disabled.
Returns AUTO (follow persona config) in all other cases.
"""
# There are cases where the internal search tool should be disabled
# If the user is in a project, it should not use other sources / generic search
# If they are in a project but using a custom agent, it should use the agent setup
# (which means it can use search)
# However if in a project and there are more files than can fit in the context,
# it should use the search tool with the project filter on
# If no files are uploaded, search should remain enabled
search_usage_forcing_setting = SearchToolUsage.AUTO
if project_id:
if bool(persona_id is DEFAULT_PERSONA_ID and has_project_file_texts):
search_usage_forcing_setting = SearchToolUsage.DISABLED
# Remove search tool from forced_tool_ids if it's present
if forced_tool_ids and search_tool_id and search_tool_id in forced_tool_ids:
forced_tool_ids[:] = [
tool_id for tool_id in forced_tool_ids if tool_id != search_tool_id
]
elif forced_tool_ids and search_tool_id and search_tool_id in forced_tool_ids:
search_usage_forcing_setting = SearchToolUsage.ENABLED
return search_usage_forcing_setting
def _initialize_chat_session(
message_text: str,
files: list[FileDescriptor],
token_counter: Callable[[str], int],
parent_id: int | None,
user_id: UUID | None,
chat_session_id: UUID,
db_session: Session,
use_existing_user_message: bool = False,
) -> ChatMessage:
root_message = get_or_create_root_message(
chat_session_id=chat_session_id, db_session=db_session
)
if parent_id is None:
parent_message = root_message
else:
parent_message = get_chat_message(
chat_message_id=parent_id,
user_id=user_id,
db_session=db_session,
# Not in a project, this should have no impact on search tool availability
if not project_id:
return ProjectSearchConfig(
search_usage=SearchToolUsage.AUTO, disable_forced_tool=False
)
# For seeding, the parent message points to the message that is supposed to be the last
# user message.
if use_existing_user_message:
if parent_message.parent_message is None:
raise RuntimeError("No parent message found for seeding")
if parent_message.message_type != MessageType.USER:
raise RuntimeError(
"Parent message is not a user message, needed for seeded flow."
)
message_text = parent_message.message
token_count = parent_message.token_count
parent_message = parent_message.parent_message
else:
token_count = token_counter(message_text)
# Custom persona in project - let persona config decide
# Even if there are no files in the project, it's still guided by the persona config.
if persona_id != DEFAULT_PERSONA_ID:
return ProjectSearchConfig(
search_usage=SearchToolUsage.AUTO, disable_forced_tool=False
)
# Flushed for ID but not committed yet
user_message = create_new_chat_message(
chat_session_id=chat_session_id,
parent_message=parent_message,
message=message_text,
token_count=token_count,
message_type=MessageType.USER,
files=files,
db_session=db_session,
commit=False,
# If in a project with the default persona, and the files have already been loaded into the context or
# there are no files in the project, disable search as there is nothing to search for.
if loaded_project_files or not project_has_files:
user_forced_search = (
forced_tool_id is not None
and search_tool_id is not None
and forced_tool_id == search_tool_id
)
return ProjectSearchConfig(
search_usage=SearchToolUsage.DISABLED,
disable_forced_tool=user_forced_search,
)
# Default persona in a project with files that have not yet been loaded into the context.
return ProjectSearchConfig(
search_usage=SearchToolUsage.ENABLED, disable_forced_tool=False
)
return user_message
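For readability, a condensed standalone restatement of the project-search decision above; the parameter names and string return values are simplifications, not the repository's API:

# Hypothetical restatement of _get_project_search_availability, for reading only.
def project_search_decision(
    in_project: bool,
    is_default_persona: bool,
    files_loaded_in_context: bool,
    project_has_files: bool,
    user_forced_search: bool,
) -> tuple[str, bool]:
    # Returns (search_usage, disable_forced_tool).
    if not in_project or not is_default_persona:
        return "AUTO", False  # outside a project, or a custom agent: persona config decides
    if files_loaded_in_context or not project_has_files:
        return "DISABLED", user_forced_search  # nothing left to search; drop a forced search
    return "ENABLED", False  # project files exist but did not fit: search with project filter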
def stream_chat_message_objects(
new_msg_req: CreateChatMessageRequest,
def handle_stream_message_objects(
new_msg_req: SendMessageRequest,
user: User | None,
db_session: Session,
# Needed to translate persona num_chunks to tokens to the LLM
default_num_chunks: float = MAX_CHUNKS_FED_TO_CHAT,
# For flow with search, don't include as many chunks as possible since we need to leave space
# for the chat history, for smaller models, we likely won't get MAX_CHUNKS_FED_TO_CHAT chunks
max_document_percentage: float = CHAT_TARGET_CHUNK_PERCENTAGE,
# if specified, uses the last user message and does not create a new user message based
# on the `new_msg_req.message`. Currently, requires a state where the last message is a
litellm_additional_headers: dict[str, str] | None = None,
custom_tool_additional_headers: dict[str, str] | None = None,
is_connected: Callable[[], bool] | None = None,
enforce_chat_session_id_for_search_docs: bool = True,
bypass_acl: bool = False,
# Additional context that should be included in the chat history, for example:
# Slack threads where the conversation cannot be represented by a chain of User/Assistant
# messages.
# messages. Both of the below are used for Slack
# NOTE: is not stored in the database, only passed in to the LLM as context
additional_context: str | None = None,
# Slack context for federated Slack search
slack_context: SlackContext | None = None,
# Optional external state container for non-streaming access to accumulated state
external_state_container: ChatStateContainer | None = None,
) -> AnswerStream:
tenant_id = get_current_tenant_id()
use_existing_user_message = new_msg_req.use_existing_user_message
llm: LLM | None = None
chat_session: ChatSession | None = None
redis_client: Redis | None = None
user_id = user.id if user is not None else None
llm_user_identifier = (
user.email
if user is not None and getattr(user, "email", None)
else (str(user_id) if user_id else "anonymous_user")
)
try:
user_id = user.id if user is not None else None
llm_user_identifier = (
user.email
if user is not None and getattr(user, "email", None)
else (str(user_id) if user_id else "anonymous_user")
)
if not new_msg_req.chat_session_id:
if not new_msg_req.chat_session_info:
raise RuntimeError(
"Must specify a chat session id or chat session info"
)
chat_session = create_chat_session_from_request(
chat_session_request=new_msg_req.chat_session_info,
user_id=user_id,
db_session=db_session,
)
yield CreateChatSessionID(chat_session_id=chat_session.id)
else:
chat_session = get_chat_session_by_id(
chat_session_id=new_msg_req.chat_session_id,
user_id=user_id,
db_session=db_session,
)
chat_session = get_chat_session_by_id(
chat_session_id=new_msg_req.chat_session_id,
user_id=user_id,
db_session=db_session,
)
persona = chat_session.persona
message_text = new_msg_req.message
chat_session_id = new_msg_req.chat_session_id
user_identity = LLMUserIdentity(
user_id=llm_user_identifier, session_id=str(chat_session_id)
user_id=llm_user_identifier, session_id=str(chat_session.id)
)
parent_id = new_msg_req.parent_message_id
reference_doc_ids = new_msg_req.search_doc_ids
retrieval_options = new_msg_req.retrieval_options
new_msg_req.alternate_assistant_id
user_selected_filters = retrieval_options.filters if retrieval_options else None
# permanent "log" store, used primarily for debugging
long_term_logger = LongTermLogger(
metadata={"user_id": str(user_id), "chat_session_id": str(chat_session_id)}
metadata={"user_id": str(user_id), "chat_session_id": str(chat_session.id)}
)
# Milestone tracking, most devs using the API don't need to understand this
@@ -373,10 +342,23 @@ def stream_chat_message_objects(
event=MilestoneRecordType.MULTIPLE_ASSISTANTS,
)
if reference_doc_ids is None and retrieval_options is None:
raise RuntimeError(
"Must specify a set of documents for chat or specify search options"
)
# Track user message in PostHog for analytics
fetch_versioned_implementation_with_fallback(
module="onyx.utils.telemetry",
attribute="event_telemetry",
fallback=noop_fallback,
)(
distinct_id=user.email if user else tenant_id,
event="user_message_sent",
properties={
"origin": new_msg_req.origin.value,
"has_files": len(new_msg_req.file_descriptors) > 0,
"has_project": chat_session.project_id is not None,
"has_persona": persona is not None and persona.id != DEFAULT_PERSONA_ID,
"deep_research": new_msg_req.deep_research,
"tenant_id": tenant_id,
},
)
llm = get_llm_for_persona(
persona=persona,
@@ -387,6 +369,14 @@ def stream_chat_message_objects(
)
token_counter = get_llm_token_counter(llm)
# Check LLM cost limits before using the LLM (only for Onyx-managed keys)
check_llm_cost_limit_for_provider(
db_session=db_session,
tenant_id=tenant_id,
llm_provider_api_key=llm.config.api_key,
)
# Verify that the user specified files actually belong to the user
verify_user_files(
user_files=new_msg_req.file_descriptors,
@@ -395,35 +385,61 @@ def stream_chat_message_objects(
project_id=chat_session.project_id,
)
# Makes sure that the chat session has the right message nodes
# and that the latest user message is created (not yet committed)
user_message = _initialize_chat_session(
message_text=message_text,
files=new_msg_req.file_descriptors,
token_counter=token_counter,
parent_id=parent_id,
user_id=user_id,
chat_session_id=chat_session_id,
db_session=db_session,
use_existing_user_message=use_existing_user_message,
)
# re-create linear history of messages
chat_history = create_chat_history_chain(
chat_session_id=chat_session_id, db_session=db_session
chat_session_id=chat_session.id, db_session=db_session
)
last_chat_message = chat_history[-1]
# Determine the parent message based on the request:
# - -1: auto-place after latest message in chain
# - None: regeneration from root (first message)
# - positive int: place after that specific parent message
root_message = get_or_create_root_message(
chat_session_id=chat_session.id, db_session=db_session
)
if last_chat_message.id != user_message.id:
db_session.rollback()
raise RuntimeError(
"The new message was not on the mainline. "
"Chat message history tree is not correctly built."
if new_msg_req.parent_message_id == AUTO_PLACE_AFTER_LATEST_MESSAGE:
# Auto-place after the latest message in the chain
parent_message = chat_history[-1] if chat_history else root_message
elif (
new_msg_req.parent_message_id is None
or new_msg_req.parent_message_id == root_message.id
):
# None = regeneration from root
parent_message = root_message
# Truncate history since we're starting from root
chat_history = []
else:
# Specific parent message ID provided, find parent in chat_history
parent_message = None
for i in range(len(chat_history) - 1, -1, -1):
if chat_history[i].id == new_msg_req.parent_message_id:
parent_message = chat_history[i]
# Truncate history to only include messages up to and including parent
chat_history = chat_history[: i + 1]
break
if parent_message is None:
raise ValueError(
"The new message sent is not on the latest mainline of messages"
)
# At this point we can save the user message as it's validated and final
db_session.commit()
# If the parent message is a user message, it's a regeneration and we use the existing user message.
if parent_message.message_type == MessageType.USER:
user_message = parent_message
else:
user_message = create_new_chat_message(
chat_session_id=chat_session.id,
parent_message=parent_message,
message=message_text,
token_count=token_counter(message_text),
message_type=MessageType.USER,
files=new_msg_req.file_descriptors,
db_session=db_session,
commit=True,
)
chat_history.append(user_message)
memories = get_memories(user, db_session)
@@ -433,7 +449,7 @@ def stream_chat_message_objects(
db_session=db_session,
persona_system_prompt=custom_agent_prompt or "",
token_counter=token_counter,
files=last_chat_message.files,
files=new_msg_req.file_descriptors,
memories=memories,
)
@@ -455,15 +471,20 @@ def stream_chat_message_objects(
None,
)
# This may also mutate the new_msg_req.forced_tool_ids
# This logic is specifically for projects
search_usage_forcing_setting = _get_project_search_availability(
# Determine if search should be disabled for this project context
forced_tool_id = new_msg_req.forced_tool_id
project_search_config = _get_project_search_availability(
project_id=chat_session.project_id,
persona_id=persona.id,
has_project_file_texts=bool(extracted_project_files.project_file_texts),
forced_tool_ids=new_msg_req.forced_tool_ids,
loaded_project_files=bool(extracted_project_files.project_file_texts),
project_has_files=bool(
extracted_project_files.project_uncapped_token_count
),
forced_tool_id=new_msg_req.forced_tool_id,
search_tool_id=search_tool_id,
)
if project_search_config.disable_forced_tool:
forced_tool_id = None
emitter = get_default_emitter()
@@ -475,7 +496,7 @@ def stream_chat_message_objects(
user=user,
llm=llm,
search_tool_config=SearchToolConfig(
user_selected_filters=user_selected_filters,
user_selected_filters=new_msg_req.internal_search_filters,
project_id=(
chat_session.project_id
if extracted_project_files.project_as_filter
@@ -485,17 +506,20 @@ def stream_chat_message_objects(
slack_context=slack_context,
),
custom_tool_config=CustomToolConfig(
chat_session_id=chat_session_id,
chat_session_id=chat_session.id,
message_id=user_message.id if user_message else None,
additional_headers=custom_tool_additional_headers,
),
allowed_tool_ids=new_msg_req.allowed_tool_ids,
search_usage_forcing_setting=search_usage_forcing_setting,
search_usage_forcing_setting=project_search_config.search_usage,
)
tools: list[Tool] = []
for tool_list in tool_dict.values():
tools.extend(tool_list)
if forced_tool_id and forced_tool_id not in [tool.id for tool in tools]:
raise ValueError(f"Forced tool {forced_tool_id} not found in tools")
# TODO Once summarization is done, we don't need to load all the files from the beginning anymore.
# load all files needed for this chat chain in memory
files = load_all_chat_files(chat_history, db_session)
@@ -505,7 +529,7 @@ def stream_chat_message_objects(
# Reserve a message id for the assistant response for frontend to track packets
assistant_response = reserve_message_id(
db_session=db_session,
chat_session_id=chat_session_id,
chat_session_id=chat_session.id,
parent_message=user_message.id,
message_type=MessageType.ASSISTANT,
)
@@ -529,15 +553,33 @@ def stream_chat_message_objects(
redis_client = get_redis_client()
reset_cancel_status(
chat_session_id,
chat_session.id,
redis_client,
)
def check_is_connected() -> bool:
return check_stop_signal(chat_session_id, redis_client)
return check_stop_signal(chat_session.id, redis_client)
# Create state container for accumulating partial results
state_container = ChatStateContainer()
set_processing_status(
chat_session_id=chat_session.id,
redis_client=redis_client,
value=True,
)
# Use external state container if provided, otherwise create internal one
# External container allows non-streaming callers to access accumulated state
state_container = external_state_container or ChatStateContainer()
def llm_loop_completion_callback(
state_container: ChatStateContainer,
) -> None:
llm_loop_completion_handle(
state_container=state_container,
db_session=db_session,
chat_session_id=str(chat_session.id),
is_connected=check_is_connected,
assistant_message=assistant_response,
)
# Run the LLM loop with explicit wrapper for stop signal handling
# The wrapper runs run_llm_loop in a background thread and polls every 300ms
@@ -554,6 +596,7 @@ def stream_chat_message_objects(
yield from run_chat_loop_with_state_containers(
run_deep_research_llm_loop,
llm_loop_completion_callback,
is_connected=check_is_connected,
emitter=emitter,
state_container=state_container,
@@ -565,11 +608,12 @@ def stream_chat_message_objects(
db_session=db_session,
skip_clarification=skip_clarification,
user_identity=user_identity,
chat_session_id=str(chat_session_id),
chat_session_id=str(chat_session.id),
)
else:
yield from run_chat_loop_with_state_containers(
run_llm_loop,
llm_loop_completion_callback,
is_connected=check_is_connected, # Not passed through to run_llm_loop
emitter=emitter,
state_container=state_container,
@@ -582,60 +626,11 @@ def stream_chat_message_objects(
llm=llm,
token_counter=token_counter,
db_session=db_session,
forced_tool_id=(
new_msg_req.forced_tool_ids[0]
if new_msg_req.forced_tool_ids
else None
),
forced_tool_id=forced_tool_id,
user_identity=user_identity,
chat_session_id=str(chat_session_id),
chat_session_id=str(chat_session.id),
)
# Determine if stopped by user
completed_normally = check_is_connected()
if not completed_normally:
logger.debug(f"Chat session {chat_session_id} stopped by user")
# Build final answer based on completion status
if completed_normally:
if state_container.answer_tokens is None:
raise RuntimeError(
"LLM run completed normally but did not return an answer."
)
final_answer = state_container.answer_tokens
else:
# Stopped by user - append stop message
if state_container.answer_tokens:
final_answer = (
state_container.answer_tokens
+ " ... The generation was stopped by the user here."
)
else:
final_answer = "The generation was stopped by the user."
# Build citation_docs_info from accumulated citations in state container
citation_docs_info: list[CitationDocInfo] = []
seen_citation_nums: set[int] = set()
for citation_num, search_doc in state_container.citation_to_doc.items():
if citation_num not in seen_citation_nums:
seen_citation_nums.add(citation_num)
citation_docs_info.append(
CitationDocInfo(
search_doc=search_doc,
citation_number=citation_num,
)
)
save_chat_turn(
message_text=final_answer,
reasoning_tokens=state_container.reasoning_tokens,
citation_docs_info=citation_docs_info,
tool_calls=state_container.tool_calls,
db_session=db_session,
assistant_message=assistant_response,
is_clarification=state_container.is_clarification,
)
except ValueError as e:
logger.exception("Failed to process chat message.")
@@ -653,15 +648,7 @@ def stream_chat_message_objects(
error_msg = str(e)
stack_trace = traceback.format_exc()
if isinstance(e, ToolCallException):
yield StreamingError(
error=error_msg,
stack_trace=stack_trace,
error_code="TOOL_CALL_FAILED",
is_retryable=True,
details={"tool_name": e.tool_name} if e.tool_name else None,
)
elif llm:
if llm:
client_error_msg, error_code, is_retryable = litellm_exception_to_error_msg(
e, llm
)
@@ -693,26 +680,127 @@ def stream_chat_message_objects(
)
db_session.rollback()
return
finally:
try:
if redis_client is not None and chat_session is not None:
set_processing_status(
chat_session_id=chat_session.id,
redis_client=redis_client,
value=False,
)
except Exception:
logger.exception("Error in setting processing status")
@log_generator_function_time()
def stream_chat_message(
def llm_loop_completion_handle(
state_container: ChatStateContainer,
is_connected: Callable[[], bool],
db_session: Session,
chat_session_id: str,
assistant_message: ChatMessage,
) -> None:
# Determine if stopped by user
completed_normally = is_connected()
# Build final answer based on completion status
if completed_normally:
if state_container.answer_tokens is None:
raise RuntimeError(
"LLM run completed normally but did not return an answer."
)
final_answer = state_container.answer_tokens
else:
# Stopped by user - append stop message
logger.debug(f"Chat session {chat_session_id} stopped by user")
if state_container.answer_tokens:
final_answer = (
state_container.answer_tokens
+ " ... \n\nGeneration was stopped by the user."
)
else:
final_answer = "The generation was stopped by the user."
# Build citation_docs_info from accumulated citations in state container
citation_docs_info: list[CitationDocInfo] = []
seen_citation_nums: set[int] = set()
for citation_num, search_doc in state_container.citation_to_doc.items():
if citation_num not in seen_citation_nums:
seen_citation_nums.add(citation_num)
citation_docs_info.append(
CitationDocInfo(
search_doc=search_doc,
citation_number=citation_num,
)
)
save_chat_turn(
message_text=final_answer,
reasoning_tokens=state_container.reasoning_tokens,
citation_docs_info=citation_docs_info,
tool_calls=state_container.tool_calls,
db_session=db_session,
assistant_message=assistant_message,
is_clarification=state_container.is_clarification,
)
def stream_chat_message_objects(
new_msg_req: CreateChatMessageRequest,
user: User | None,
db_session: Session,
# if specified, uses the last user message and does not create a new user message based
# on the `new_msg_req.message`. Currently, requires a state where the last message is a
litellm_additional_headers: dict[str, str] | None = None,
custom_tool_additional_headers: dict[str, str] | None = None,
) -> Iterator[str]:
with get_session_with_current_tenant() as db_session:
objects = stream_chat_message_objects(
new_msg_req=new_msg_req,
user=user,
db_session=db_session,
litellm_additional_headers=litellm_additional_headers,
custom_tool_additional_headers=custom_tool_additional_headers,
bypass_acl: bool = False,
# Additional context that should be included in the chat history, for example:
# Slack threads where the conversation cannot be represented by a chain of User/Assistant
# messages. Both of the below are used for Slack
# NOTE: is not stored in the database, only passed in to the LLM as context
additional_context: str | None = None,
# Slack context for federated Slack search
slack_context: SlackContext | None = None,
) -> AnswerStream:
forced_tool_id = (
new_msg_req.forced_tool_ids[0] if new_msg_req.forced_tool_ids else None
)
if (
new_msg_req.retrieval_options
and new_msg_req.retrieval_options.run_search == OptionalSearchSetting.ALWAYS
):
all_tools = get_tools(db_session)
search_tool_id = next(
(tool.id for tool in all_tools if tool.in_code_tool_id == SEARCH_TOOL_ID),
None,
)
for obj in objects:
yield get_json_line(obj.model_dump())
forced_tool_id = search_tool_id
translated_new_msg_req = SendMessageRequest(
message=new_msg_req.message,
llm_override=new_msg_req.llm_override,
allowed_tool_ids=new_msg_req.allowed_tool_ids,
forced_tool_id=forced_tool_id,
file_descriptors=new_msg_req.file_descriptors,
internal_search_filters=(
new_msg_req.retrieval_options.filters
if new_msg_req.retrieval_options
else None
),
deep_research=new_msg_req.deep_research,
parent_message_id=new_msg_req.parent_message_id,
chat_session_id=new_msg_req.chat_session_id,
origin=new_msg_req.origin,
)
return handle_stream_message_objects(
new_msg_req=translated_new_msg_req,
user=user,
db_session=db_session,
litellm_additional_headers=litellm_additional_headers,
custom_tool_additional_headers=custom_tool_additional_headers,
bypass_acl=bypass_acl,
additional_context=additional_context,
slack_context=slack_context,
)
def remove_answer_citations(answer: str) -> str:
@@ -767,3 +855,83 @@ def gather_stream(
error_msg=error_msg,
top_documents=top_documents,
)
@log_function_time()
def gather_stream_full(
packets: AnswerStream,
state_container: ChatStateContainer,
) -> ChatFullResponse:
"""
Aggregate streaming packets and state container into a complete ChatFullResponse.
This function consumes all packets from the stream and combines them with
the accumulated state from the ChatStateContainer to build a complete response
including answer, reasoning, citations, and tool calls.
Args:
packets: The stream of packets from handle_stream_message_objects
state_container: The state container that accumulates tool calls, reasoning, etc.
Returns:
ChatFullResponse with all available data
"""
answer: str | None = None
citations: list[CitationInfo] = []
error_msg: str | None = None
message_id: int | None = None
top_documents: list[SearchDoc] = []
chat_session_id: UUID | None = None
for packet in packets:
if isinstance(packet, Packet):
if isinstance(packet.obj, AgentResponseStart):
if packet.obj.final_documents:
top_documents = packet.obj.final_documents
elif isinstance(packet.obj, AgentResponseDelta):
if answer is None:
answer = ""
if packet.obj.content:
answer += packet.obj.content
elif isinstance(packet.obj, CitationInfo):
citations.append(packet.obj)
elif isinstance(packet, StreamingError):
error_msg = packet.error
elif isinstance(packet, MessageResponseIDInfo):
message_id = packet.reserved_assistant_message_id
elif isinstance(packet, CreateChatSessionID):
chat_session_id = packet.chat_session_id
if message_id is None:
raise ValueError("Message ID is required")
# Use state_container for complete answer (handles edge cases gracefully)
final_answer = state_container.get_answer_tokens() or answer or ""
# Get reasoning from state container (None when model doesn't produce reasoning)
reasoning = state_container.get_reasoning_tokens()
# Convert ToolCallInfo list to ToolCallResponse list
tool_call_responses = [
ToolCallResponse(
tool_name=tc.tool_name,
tool_arguments=tc.tool_call_arguments,
tool_result=tc.tool_call_response,
search_docs=tc.search_docs,
generated_images=tc.generated_images,
pre_reasoning=tc.reasoning_tokens,
)
for tc in state_container.get_tool_calls()
]
return ChatFullResponse(
answer=final_answer,
answer_citationless=remove_answer_citations(final_answer),
pre_answer_reasoning=reasoning,
tool_calls=tool_call_responses,
top_documents=top_documents,
citation_info=citations,
message_id=message_id,
chat_session_id=chat_session_id,
error_msg=error_msg,
)
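A hedged usage sketch tying the pieces above together: pass an external ChatStateContainer into handle_stream_message_objects, then hand the same container plus the packet stream to gather_stream_full for a non-streaming response. The request and session setup are elided and illustrative:

# Hypothetical non-streaming call path (request/user/db setup elided):
state_container = ChatStateContainer()
packets = handle_stream_message_objects(
    new_msg_req=send_message_request,  # a SendMessageRequest built elsewhere
    user=current_user,
    db_session=db_session,
    external_state_container=state_container,
)
full_response = gather_stream_full(packets, state_container)
print(full_response.answer)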

View File

@@ -10,6 +10,7 @@ from onyx.file_store.models import FileDescriptor
from onyx.prompts.chat_prompts import CITATION_REMINDER
from onyx.prompts.chat_prompts import CODE_BLOCK_MARKDOWN
from onyx.prompts.chat_prompts import DEFAULT_SYSTEM_PROMPT
from onyx.prompts.chat_prompts import LAST_CYCLE_CITATION_REMINDER
from onyx.prompts.chat_prompts import REQUIRE_CITATION_GUIDANCE
from onyx.prompts.chat_prompts import USER_INFO_HEADER
from onyx.prompts.prompt_utils import get_company_context
@@ -22,6 +23,7 @@ from onyx.prompts.tool_prompts import PYTHON_TOOL_GUIDANCE
from onyx.prompts.tool_prompts import TOOL_DESCRIPTION_SEARCH_GUIDANCE
from onyx.prompts.tool_prompts import TOOL_SECTION_HEADER
from onyx.prompts.tool_prompts import WEB_SEARCH_GUIDANCE
from onyx.prompts.tool_prompts import WEB_SEARCH_SITE_DISABLED_GUIDANCE
from onyx.tools.interface import Tool
from onyx.tools.tool_implementations.images.image_generation_tool import (
ImageGenerationTool,
@@ -37,7 +39,7 @@ def get_default_base_system_prompt(db_session: Session) -> str:
default_persona = get_default_behavior_persona(db_session)
return (
default_persona.system_prompt
if default_persona and default_persona.system_prompt
if default_persona and default_persona.system_prompt is not None
else DEFAULT_SYSTEM_PROMPT
)
@@ -115,8 +117,11 @@ def calculate_reserved_tokens(
def build_reminder_message(
reminder_text: str | None,
include_citation_reminder: bool,
is_last_cycle: bool,
) -> str | None:
reminder = reminder_text.strip() if reminder_text else ""
if is_last_cycle:
reminder += "\n\n" + LAST_CYCLE_CITATION_REMINDER
if include_citation_reminder:
reminder += "\n\n" + CITATION_REMINDER
reminder = reminder.strip()
@@ -169,7 +174,9 @@ def build_system_prompt(
TOOL_SECTION_HEADER
+ TOOL_DESCRIPTION_SEARCH_GUIDANCE
+ INTERNAL_SEARCH_GUIDANCE
+ WEB_SEARCH_GUIDANCE
+ WEB_SEARCH_GUIDANCE.format(
site_colon_disabled=WEB_SEARCH_SITE_DISABLED_GUIDANCE
)
+ OPEN_URLS_GUIDANCE
+ GENERATE_IMAGE_GUIDANCE
+ PYTHON_TOOL_GUIDANCE
@@ -195,7 +202,16 @@ def build_system_prompt(
system_prompt += INTERNAL_SEARCH_GUIDANCE
if has_web_search or include_all_guidance:
system_prompt += WEB_SEARCH_GUIDANCE
site_disabled_guidance = ""
if has_web_search:
web_search_tool = next(
(t for t in tools if isinstance(t, WebSearchTool)), None
)
if web_search_tool and not web_search_tool.supports_site_filter:
site_disabled_guidance = WEB_SEARCH_SITE_DISABLED_GUIDANCE
system_prompt += WEB_SEARCH_GUIDANCE.format(
site_colon_disabled=site_disabled_guidance
)
if has_open_urls or include_all_guidance:
system_prompt += OPEN_URLS_GUIDANCE

View File

@@ -117,22 +117,30 @@ def _create_and_link_tool_calls(
tool_call_map[tool_call_obj.tool_call_id] = tool_call_obj.id
# Update parent_tool_call_id for all tool calls
# Filter out orphaned children (whose parents don't exist) - this can happen
# when generation is stopped mid-execution and parent tool calls were cancelled
valid_tool_calls: list[ToolCall] = []
for tool_call_obj in tool_call_objects:
tool_call_info = tool_call_info_map[tool_call_obj.tool_call_id]
if tool_call_info.parent_tool_call_id is not None:
parent_id = tool_call_map.get(tool_call_info.parent_tool_call_id)
if parent_id is not None:
tool_call_obj.parent_tool_call_id = parent_id
valid_tool_calls.append(tool_call_obj)
else:
# This would cause chat sessions to fail if this function is miscalled with
# tool calls that have bad parent pointers but this falls under "fail loudly"
raise ValueError(
f"Parent tool call with tool_call_id '{tool_call_info.parent_tool_call_id}' "
f"not found for tool call '{tool_call_obj.tool_call_id}'"
# Parent doesn't exist (likely cancelled) - skip this orphaned child
logger.warning(
f"Skipping tool call '{tool_call_obj.tool_call_id}' with missing parent "
f"'{tool_call_info.parent_tool_call_id}' (likely cancelled during execution)"
)
# Remove from DB session to prevent saving
db_session.delete(tool_call_obj)
else:
# Top-level tool call (no parent)
valid_tool_calls.append(tool_call_obj)
# Link SearchDocs to ToolCalls
for tool_call_obj in tool_call_objects:
# Link SearchDocs only to valid ToolCalls
for tool_call_obj in valid_tool_calls:
search_doc_ids = tool_call_to_search_doc_ids.get(tool_call_obj.tool_call_id, [])
if search_doc_ids:
add_search_docs_to_tool_call(

View File

@@ -2,12 +2,23 @@ from uuid import UUID
from redis.client import Redis
from shared_configs.contextvars import get_current_tenant_id
# Redis key prefixes for chat session stop signals
PREFIX = "chatsessionstop"
FENCE_PREFIX = f"{PREFIX}_fence"
FENCE_TTL = 24 * 60 * 60 # 24 hours - defensive TTL to prevent memory leaks
FENCE_TTL = 10 * 60 # 10 minutes - defensive TTL to prevent memory leaks
def _get_fence_key(chat_session_id: UUID) -> str:
"""
Generate the Redis key for a chat session stop signal fence.
Args:
chat_session_id: The UUID of the chat session
Returns:
The fence key string (tenant_id is automatically added by the Redis client)
"""
return f"{FENCE_PREFIX}_{chat_session_id}"
def set_fence(chat_session_id: UUID, redis_client: Redis, value: bool) -> None:
@@ -16,11 +27,10 @@ def set_fence(chat_session_id: UUID, redis_client: Redis, value: bool) -> None:
Args:
chat_session_id: The UUID of the chat session
redis_client: Redis client to use
redis_client: Redis client to use (tenant-aware client that auto-prefixes keys)
value: True to set the fence (stop signal), False to clear it
"""
tenant_id = get_current_tenant_id()
fence_key = f"{FENCE_PREFIX}_{tenant_id}_{chat_session_id}"
fence_key = _get_fence_key(chat_session_id)
if not value:
redis_client.delete(fence_key)
return
@@ -34,13 +44,12 @@ def is_connected(chat_session_id: UUID, redis_client: Redis) -> bool:
Args:
chat_session_id: The UUID of the chat session to check
redis_client: Redis client to use for checking the stop signal
redis_client: Redis client to use for checking the stop signal (tenant-aware client that auto-prefixes keys)
Returns:
True if the session should continue, False if it should stop
"""
tenant_id = get_current_tenant_id()
fence_key = f"{FENCE_PREFIX}_{tenant_id}_{chat_session_id}"
fence_key = _get_fence_key(chat_session_id)
return not bool(redis_client.exists(fence_key))
@@ -50,8 +59,7 @@ def reset_cancel_status(chat_session_id: UUID, redis_client: Redis) -> None:
Args:
chat_session_id: The UUID of the chat session
redis_client: Redis client to use
redis_client: Redis client to use (tenant-aware client that auto-prefixes keys)
"""
tenant_id = get_current_tenant_id()
fence_key = f"{FENCE_PREFIX}_{tenant_id}_{chat_session_id}"
fence_key = _get_fence_key(chat_session_id)
redis_client.delete(fence_key)
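A short usage sketch of the stop-signal flow, assuming set_fence lives in the same module as the helpers above and that the Redis client is the tenant-aware wrapper described in the docstrings; the client construction is illustrative only:

# Hypothetical usage of the stop-signal fence:
import uuid
from redis import Redis
from onyx.chat.stop_signal_checker import is_connected, reset_cancel_status, set_fence

redis_client = Redis(host="localhost", port=6379)  # tenant-aware wrapper in the real app
session_id = uuid.uuid4()

set_fence(session_id, redis_client, value=True)          # user pressed "stop"
assert is_connected(session_id, redis_client) is False   # generation loop should exit
reset_cancel_status(session_id, redis_client)            # clear before the next message
assert is_connected(session_id, redis_client) is True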

View File

@@ -120,6 +120,14 @@ VALID_EMAIL_DOMAINS = (
if _VALID_EMAIL_DOMAINS_STR
else []
)
# Disposable email blocking - blocks temporary/throwaway email addresses
# Set to empty string to disable disposable email blocking
DISPOSABLE_EMAIL_DOMAINS_URL = os.environ.get(
"DISPOSABLE_EMAIL_DOMAINS_URL",
"https://disposable.github.io/disposable-email-domains/domains.json",
)
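A minimal sketch of how such a domain list could be applied at signup; the fetch-and-check helpers below are assumptions, not the project's actual validation code:

# Hypothetical check against the configured disposable-domain list.
import json
import urllib.request

def load_disposable_domains(url: str) -> set[str]:
    if not url:
        return set()  # blocking disabled
    with urllib.request.urlopen(url, timeout=10) as resp:
        return {domain.lower() for domain in json.load(resp)}

def is_disposable_email(email: str, domains: set[str]) -> bool:
    return email.rsplit("@", 1)[-1].lower() in domains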
# OAuth Login Flow
# Used for both Google OAuth2 and OIDC flows
OAUTH_CLIENT_ID = (
@@ -186,6 +194,16 @@ TRACK_EXTERNAL_IDP_EXPIRY = (
# DB Configs
#####
DOCUMENT_INDEX_NAME = "danswer_index"
OPENSEARCH_HOST = os.environ.get("OPENSEARCH_HOST") or "localhost"
OPENSEARCH_REST_API_PORT = int(os.environ.get("OPENSEARCH_REST_API_PORT") or 9200)
OPENSEARCH_ADMIN_USERNAME = os.environ.get("OPENSEARCH_ADMIN_USERNAME", "admin")
OPENSEARCH_ADMIN_PASSWORD = os.environ.get("OPENSEARCH_ADMIN_PASSWORD", "")
ENABLE_OPENSEARCH_FOR_ONYX = (
os.environ.get("ENABLE_OPENSEARCH_FOR_ONYX", "").lower() == "true"
)
VESPA_HOST = os.environ.get("VESPA_HOST") or "localhost"
# NOTE: this is used if and only if the vespa config server is accessible via a
# different host than the main vespa application
@@ -550,6 +568,7 @@ JIRA_CONNECTOR_LABELS_TO_SKIP = [
JIRA_CONNECTOR_MAX_TICKET_SIZE = int(
os.environ.get("JIRA_CONNECTOR_MAX_TICKET_SIZE", 100 * 1024)
)
JIRA_SLIM_PAGE_SIZE = int(os.environ.get("JIRA_SLIM_PAGE_SIZE", 500))
GONG_CONNECTOR_START_TIME = os.environ.get("GONG_CONNECTOR_START_TIME")
@@ -661,10 +680,6 @@ INDEXING_EMBEDDING_MODEL_NUM_THREADS = int(
os.environ.get("INDEXING_EMBEDDING_MODEL_NUM_THREADS") or 8
)
# Maximum number of user file connector credential pairs to index in a single batch
# Setting this number too high may overload the indexing process
USER_FILE_INDEXING_LIMIT = int(os.environ.get("USER_FILE_INDEXING_LIMIT") or 100)
# Maximum file size in a document to be indexed
MAX_DOCUMENT_CHARS = int(os.environ.get("MAX_DOCUMENT_CHARS") or 5_000_000)
MAX_FILE_SIZE_BYTES = int(
@@ -736,7 +751,27 @@ BRAINTRUST_PROJECT = os.environ.get("BRAINTRUST_PROJECT", "Onyx")
# Braintrust API key - if provided, Braintrust tracing will be enabled
BRAINTRUST_API_KEY = os.environ.get("BRAINTRUST_API_KEY") or ""
# Maximum concurrency for Braintrust evaluations
BRAINTRUST_MAX_CONCURRENCY = int(os.environ.get("BRAINTRUST_MAX_CONCURRENCY") or 5)
# None means unlimited concurrency, otherwise specify a number
_braintrust_concurrency = os.environ.get("BRAINTRUST_MAX_CONCURRENCY")
BRAINTRUST_MAX_CONCURRENCY = (
int(_braintrust_concurrency) if _braintrust_concurrency else None
)
#####
# Scheduled Evals Configuration
#####
# Comma-separated list of Braintrust dataset names to run on schedule
SCHEDULED_EVAL_DATASET_NAMES = [
name.strip()
for name in os.environ.get("SCHEDULED_EVAL_DATASET_NAMES", "").split(",")
if name.strip()
]
# Email address to use for search permissions during scheduled evals
SCHEDULED_EVAL_PERMISSIONS_EMAIL = os.environ.get(
"SCHEDULED_EVAL_PERMISSIONS_EMAIL", "roshan@onyx.app"
)
# Braintrust project name to use for scheduled evals
SCHEDULED_EVAL_PROJECT = os.environ.get("SCHEDULED_EVAL_PROJECT", "st-dev")
#####
# Langfuse Configuration
@@ -772,8 +807,16 @@ try:
except json.JSONDecodeError:
pass
# LLM Model Update API endpoint
LLM_MODEL_UPDATE_API_URL = os.environ.get("LLM_MODEL_UPDATE_API_URL")
# Auto LLM Configuration - fetches model configs from GitHub for providers in Auto mode
AUTO_LLM_CONFIG_URL = os.environ.get(
"AUTO_LLM_CONFIG_URL",
"https://raw.githubusercontent.com/onyx-dot-app/onyx/main/backend/onyx/llm/well_known_providers/recommended-models.json",
)
# How often to check for auto LLM model updates (in seconds)
AUTO_LLM_UPDATE_INTERVAL_SECONDS = int(
os.environ.get("AUTO_LLM_UPDATE_INTERVAL_SECONDS", 1800) # 30 minutes
)
#####
# Enterprise Edition Configs
@@ -786,6 +829,11 @@ ENTERPRISE_EDITION_ENABLED = (
os.environ.get("ENABLE_PAID_ENTERPRISE_EDITION_FEATURES", "").lower() == "true"
)
#####
# Image Generation Configuration (DEPRECATED)
# These environment variables will be deprecated soon.
# To configure image generation, please visit the Image Generation page in the Admin Panel.
#####
# Azure Image Configurations
AZURE_IMAGE_API_VERSION = os.environ.get("AZURE_IMAGE_API_VERSION") or os.environ.get(
"AZURE_DALLE_API_VERSION"
@@ -870,6 +918,19 @@ DEV_MODE = os.environ.get("DEV_MODE", "").lower() == "true"
INTEGRATION_TESTS_MODE = os.environ.get("INTEGRATION_TESTS_MODE", "").lower() == "true"
#####
# Captcha Configuration (for cloud signup protection)
#####
# Enable captcha verification for new user registration
CAPTCHA_ENABLED = os.environ.get("CAPTCHA_ENABLED", "").lower() == "true"
# Google reCAPTCHA secret key (server-side validation)
RECAPTCHA_SECRET_KEY = os.environ.get("RECAPTCHA_SECRET_KEY", "")
# Minimum score threshold for reCAPTCHA v3 (0.0-1.0, higher = more likely human)
# 0.5 is the recommended default
RECAPTCHA_SCORE_THRESHOLD = float(os.environ.get("RECAPTCHA_SCORE_THRESHOLD", "0.5"))
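A sketch of server-side verification using the settings above; the siteverify endpoint is Google's published URL, while the helper itself is illustrative:

# Hypothetical server-side check of a reCAPTCHA v3 token against the threshold.
import json
import urllib.parse
import urllib.request

def verify_captcha(token: str, secret_key: str, threshold: float) -> bool:
    data = urllib.parse.urlencode({"secret": secret_key, "response": token}).encode()
    with urllib.request.urlopen(
        "https://www.google.com/recaptcha/api/siteverify", data=data, timeout=10
    ) as resp:
        result = json.load(resp)
    # v3 returns success plus a 0.0-1.0 score; reject anything below the threshold.
    return bool(result.get("success")) and result.get("score", 0.0) >= threshold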
MOCK_CONNECTOR_FILE_PATH = os.environ.get("MOCK_CONNECTOR_FILE_PATH")
# Set to true to mock LLM responses for testing purposes
@@ -923,3 +984,21 @@ S3_GENERATE_LOCAL_CHECKSUM = (
# Forcing Vespa Language
# English: en, German:de, etc. See: https://docs.vespa.ai/en/linguistics.html
VESPA_LANGUAGE_OVERRIDE = os.environ.get("VESPA_LANGUAGE_OVERRIDE")
#####
# Default LLM API Keys (for cloud deployments)
# These are Onyx-managed API keys provided to tenants by default
#####
OPENAI_DEFAULT_API_KEY = os.environ.get("OPENAI_DEFAULT_API_KEY")
ANTHROPIC_DEFAULT_API_KEY = os.environ.get("ANTHROPIC_DEFAULT_API_KEY")
COHERE_DEFAULT_API_KEY = os.environ.get("COHERE_DEFAULT_API_KEY")
VERTEXAI_DEFAULT_CREDENTIALS = os.environ.get("VERTEXAI_DEFAULT_CREDENTIALS")
VERTEXAI_DEFAULT_LOCATION = os.environ.get("VERTEXAI_DEFAULT_LOCATION", "global")
OPENROUTER_DEFAULT_API_KEY = os.environ.get("OPENROUTER_DEFAULT_API_KEY")
INSTANCE_TYPE = (
"managed"
if os.environ.get("IS_MANAGED_INSTANCE", "").lower() == "true"
else "cloud" if AUTH_TYPE == AuthType.CLOUD else "self_hosted"
)

View File

@@ -11,9 +11,6 @@ NUM_POSTPROCESSED_RESULTS = 20
# May be less depending on model
MAX_CHUNKS_FED_TO_CHAT = int(os.environ.get("MAX_CHUNKS_FED_TO_CHAT") or 25)
# For Chat, need to keep enough space for history and other prompt pieces
# ~3k input, half for docs, half for chat history + prompts
CHAT_TARGET_CHUNK_PERCENTAGE = 512 * 3 / 3072
# Maximum percentage of the context window to fill with selected sections
SELECTED_SECTIONS_MAX_WINDOW_PERCENTAGE = 0.8

View File

@@ -7,6 +7,7 @@ from enum import Enum
ONYX_DEFAULT_APPLICATION_NAME = "Onyx"
ONYX_DISCORD_URL = "https://discord.gg/4NA5SbzrWb"
ONYX_UTM_SOURCE = "onyx_app"
SLACK_USER_TOKEN_PREFIX = "xoxp-"
SLACK_BOT_TOKEN_PREFIX = "xoxb-"
ONYX_EMAILABLE_LOGO_MAX_DIM = 512
@@ -146,11 +147,19 @@ CELERY_PERMISSIONS_SYNC_LOCK_TIMEOUT = 3600 # 1 hour (in seconds)
CELERY_EXTERNAL_GROUP_SYNC_LOCK_TIMEOUT = 300 # 5 min
# Doc ID migration can be long-running; use a longer TTL and renew periodically
CELERY_USER_FILE_DOCID_MIGRATION_LOCK_TIMEOUT = 10 * 60 # 10 minutes (in seconds)
CELERY_USER_FILE_PROCESSING_LOCK_TIMEOUT = 30 * 60 # 30 minutes (in seconds)
# How long a queued user-file task is valid before workers discard it.
# Should be longer than the beat interval (20 s) but short enough to prevent
# indefinite queue growth. Workers drop tasks older than this without touching
# the DB, so a shorter value = faster drain of stale duplicates.
CELERY_USER_FILE_PROCESSING_TASK_EXPIRES = 60 # 1 minute (in seconds)
# Maximum number of tasks allowed in the user-file-processing queue before the
# beat generator stops adding more. Prevents unbounded queue growth when workers
# fall behind.
USER_FILE_PROCESSING_MAX_QUEUE_DEPTH = 500
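A sketch of how the beat-side guardrail described above might be applied when enqueuing user-file tasks; the task name and queue-depth argument are assumptions, only the two constants come from this file:

# Hypothetical beat-side guardrail: stop enqueuing when the queue is already deep,
# and let queued tasks expire so stale duplicates drain quickly.
from onyx.configs.constants import (
    CELERY_USER_FILE_PROCESSING_TASK_EXPIRES,
    USER_FILE_PROCESSING_MAX_QUEUE_DEPTH,
)

def maybe_enqueue_user_file_task(celery_app, file_id: str, queue_depth: int) -> bool:
    if queue_depth >= USER_FILE_PROCESSING_MAX_QUEUE_DEPTH:
        return False  # workers are behind; the next beat tick will retry
    celery_app.send_task(
        "process_user_file",  # hypothetical task name
        kwargs={"file_id": file_id},
        expires=CELERY_USER_FILE_PROCESSING_TASK_EXPIRES,  # dropped if stale when picked up
    )
    return True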
CELERY_USER_FILE_PROJECT_SYNC_LOCK_TIMEOUT = 5 * 60 # 5 minutes (in seconds)
DANSWER_REDIS_FUNCTION_LOCK_PREFIX = "da_function_lock:"
@@ -237,6 +246,8 @@ class NotificationType(str, Enum):
REINDEX = "reindex"
PERSONA_SHARED = "persona_shared"
TRIAL_ENDS_TWO_DAYS = "two_day_trial_ending" # 2 days left in trial
RELEASE_NOTES = "release_notes"
ASSISTANT_FILES_READY = "assistant_files_ready"
class BlobType(str, Enum):
@@ -365,9 +376,6 @@ class OnyxCeleryQueues:
CONNECTOR_EXTERNAL_GROUP_SYNC = "connector_external_group_sync"
CSV_GENERATION = "csv_generation"
# Indexing queue
USER_FILES_INDEXING = "user_files_indexing"
# User file processing queue
USER_FILE_PROCESSING = "user_file_processing"
USER_FILE_PROJECT_SYNC = "user_file_project_sync"
@@ -422,11 +430,16 @@ class OnyxRedisLocks:
# User file processing
USER_FILE_PROCESSING_BEAT_LOCK = "da_lock:check_user_file_processing_beat"
USER_FILE_PROCESSING_LOCK_PREFIX = "da_lock:user_file_processing"
# Short-lived key set when a task is enqueued; cleared when the worker picks it up.
# Prevents the beat from re-enqueuing the same file while a task is already queued.
USER_FILE_QUEUED_PREFIX = "da_lock:user_file_queued"
USER_FILE_PROJECT_SYNC_BEAT_LOCK = "da_lock:check_user_file_project_sync_beat"
USER_FILE_PROJECT_SYNC_LOCK_PREFIX = "da_lock:user_file_project_sync"
USER_FILE_DELETE_BEAT_LOCK = "da_lock:check_user_file_delete_beat"
USER_FILE_DELETE_LOCK_PREFIX = "da_lock:user_file_delete"
USER_FILE_DOCID_MIGRATION_LOCK = "da_lock:user_file_docid_migration"
# Release notes
RELEASE_NOTES_FETCH_LOCK = "da_lock:release_notes_fetch"
class OnyxRedisSignals:
@@ -492,7 +505,7 @@ class OnyxCeleryTask:
CHECK_FOR_PRUNING = "check_for_pruning"
CHECK_FOR_DOC_PERMISSIONS_SYNC = "check_for_doc_permissions_sync"
CHECK_FOR_EXTERNAL_GROUP_SYNC = "check_for_external_group_sync"
CHECK_FOR_LLM_MODEL_UPDATE = "check_for_llm_model_update"
CHECK_FOR_AUTO_LLM_UPDATE = "check_for_auto_llm_update"
# User file processing
CHECK_FOR_USER_FILE_PROCESSING = "check_for_user_file_processing"
@@ -533,7 +546,6 @@ class OnyxCeleryTask:
CONNECTOR_PRUNING_GENERATOR_TASK = "connector_pruning_generator_task"
DOCUMENT_BY_CC_PAIR_CLEANUP_TASK = "document_by_cc_pair_cleanup_task"
VESPA_METADATA_SYNC_TASK = "vespa_metadata_sync_task"
USER_FILE_DOCID_MIGRATION = "user_file_docid_migration"
# chat retention
CHECK_TTL_MANAGEMENT_TASK = "check_ttl_management_task"
@@ -542,6 +554,7 @@ class OnyxCeleryTask:
GENERATE_USAGE_REPORT_TASK = "generate_usage_report_task"
EVAL_RUN_TASK = "eval_run_task"
SCHEDULED_EVAL_TASK = "scheduled_eval_task"
EXPORT_QUERY_HISTORY_TASK = "export_query_history_task"
EXPORT_QUERY_HISTORY_CLEANUP_TASK = "export_query_history_cleanup_task"
@@ -562,9 +575,9 @@ REDIS_SOCKET_KEEPALIVE_OPTIONS[socket.TCP_KEEPINTVL] = 15
REDIS_SOCKET_KEEPALIVE_OPTIONS[socket.TCP_KEEPCNT] = 3
if platform.system() == "Darwin":
REDIS_SOCKET_KEEPALIVE_OPTIONS[socket.TCP_KEEPALIVE] = 60 # type: ignore
REDIS_SOCKET_KEEPALIVE_OPTIONS[socket.TCP_KEEPALIVE] = 60 # type: ignore[attr-defined,unused-ignore]
else:
REDIS_SOCKET_KEEPALIVE_OPTIONS[socket.TCP_KEEPIDLE] = 60
REDIS_SOCKET_KEEPALIVE_OPTIONS[socket.TCP_KEEPIDLE] = 60 # type: ignore[attr-defined,unused-ignore]
class OnyxCallTypes(str, Enum):

View File

@@ -128,3 +128,17 @@ if _LITELLM_EXTRA_BODY_RAW:
LITELLM_EXTRA_BODY = json.loads(_LITELLM_EXTRA_BODY_RAW)
except Exception:
pass
#####
# Prompt Caching Configs
#####
# Enable prompt caching framework
ENABLE_PROMPT_CACHING = (
os.environ.get("ENABLE_PROMPT_CACHING", "true").lower() != "false"
)
# Cache TTL multiplier - store caches slightly longer than provider TTL
# This allows for some clock skew and ensures we don't lose cache metadata prematurely
PROMPT_CACHE_REDIS_TTL_MULTIPLIER = float(
os.environ.get("PROMPT_CACHE_REDIS_TTL_MULTIPLIER") or 1.2
)
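A small sketch of how the multiplier above is typically applied when storing cache metadata; the helper name is hypothetical:

# Hypothetical application of the multiplier: keep Redis metadata a bit longer than the provider TTL.
def redis_ttl_for_prompt_cache(provider_ttl_seconds: int, multiplier: float = 1.2) -> int:
    # e.g. a 300 s provider cache is tracked for 360 s in Redis with the default 1.2x
    return int(provider_ttl_seconds * multiplier)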

View File

@@ -93,7 +93,7 @@ if __name__ == "__main__":
#### Docs Changes
Create the new connector page (with guiding images!) explaining how to get the connector credentials and how to set up the
connector in Onyx. Then create a Pull Request in https://github.com/onyx-dot-app/onyx-docs.
connector in Onyx. Then create a Pull Request in [https://github.com/onyx-dot-app/documentation](https://github.com/onyx-dot-app/documentation).
### Before opening PR

View File

@@ -901,13 +901,16 @@ class OnyxConfluence:
space_key: str,
) -> list[dict[str, Any]]:
"""
This is a confluence server specific method that can be used to
This is a confluence server/data center specific method that can be used to
fetch the permissions of a space.
This provides better logging than calling the get_space_permissions method
because it returns a jsonrpc response.
TODO: Make this call these endpoints for newer confluence versions:
- /rest/api/space/{spaceKey}/permissions
- /rest/api/space/{spaceKey}/permissions/anonymous
NOTE: This uses the JSON-RPC API which is the ONLY way to get space permissions
on Confluence Server/Data Center. The REST API equivalent (expand=permissions)
is Cloud-only and not available on Data Center as of version 8.9.x.
If this fails with 401 Unauthorized, the customer needs to enable JSON-RPC:
Confluence Admin -> General Configuration -> Further Configuration
-> Enable "Remote API (XML-RPC & SOAP)"
"""
url = "rpc/json-rpc/confluenceservice-v2"
data = {
@@ -916,7 +919,18 @@ class OnyxConfluence:
"id": 7,
"params": [space_key],
}
response = self.post(url, data=data)
try:
response = self.post(url, data=data)
except HTTPError as e:
if e.response is not None and e.response.status_code == 401:
raise HTTPError(
"Unauthorized (401) when calling JSON-RPC API for space permissions. "
"This is likely because the Remote API is disabled. "
"To fix: Confluence Admin -> General Configuration -> Further Configuration "
"-> Enable 'Remote API (XML-RPC & SOAP)'",
response=e.response,
) from e
raise
logger.debug(f"jsonrpc response: {response}")
if not response.get("result"):
logger.warning(
@@ -961,14 +975,20 @@ def get_user_email_from_username__server(
try:
response = confluence_client.get_mobile_parameters(user_name)
email = response.get("email")
except Exception:
logger.warning(f"failed to get confluence email for {user_name}")
except HTTPError as e:
status_code = e.response.status_code if e.response is not None else "N/A"
logger.warning(
f"Failed to get confluence email for {user_name}: "
f"HTTP {status_code} - {e}"
)
# For now, we'll just return None and log a warning. This means
# we will keep retrying to get the email every group sync.
email = None
# We may want to just return a string that indicates failure so we don't
# keep retrying
# email = f"FAILED TO GET CONFLUENCE EMAIL FOR {user_name}"
except Exception as e:
logger.warning(
f"Failed to get confluence email for {user_name}: {type(e).__name__} - {e}"
)
email = None
_USER_EMAIL_CACHE[user_name] = email
return _USER_EMAIL_CACHE[user_name]

View File

@@ -97,10 +97,17 @@ def basic_expert_info_representation(info: BasicExpertInfo) -> str | None:
def get_experts_stores_representations(
experts: list[BasicExpertInfo] | None,
) -> list[str] | None:
"""Gets string representations of experts supplied.
If an expert cannot be represented as a string, it is omitted from the
result.
"""
if not experts:
return None
reps = [basic_expert_info_representation(owner) for owner in experts]
reps: list[str | None] = [
basic_expert_info_representation(owner) for owner in experts
]
return [owner for owner in reps if owner is not None]

View File

@@ -58,51 +58,59 @@ class DrupalWikiConnector(
CheckpointedConnector[DrupalWikiCheckpoint],
SlimConnector,
):
# Deprecated parameters that may exist in old connector configurations
_DEPRECATED_PARAMS = {"drupal_wiki_scope", "include_all_spaces"}
def __init__(
self,
base_url: str,
spaces: list[str] | None = None,
pages: list[str] | None = None,
include_all_spaces: bool = False,
batch_size: int = INDEX_BATCH_SIZE,
continue_on_failure: bool = CONTINUE_ON_CONNECTOR_FAILURE,
drupal_wiki_scope: str | None = None,
include_attachments: bool = False,
allow_images: bool = False,
**kwargs: Any,
) -> None:
"""
Initialize the Drupal Wiki connector.
Args:
base_url: The base URL of the Drupal Wiki instance (e.g., https://help.drupal-wiki.com)
spaces: List of space IDs to index. If None and include_all_spaces is False, no spaces will be indexed.
pages: List of page IDs to index. If provided, only these specific pages will be indexed.
include_all_spaces: If True, all spaces will be indexed regardless of the spaces parameter.
spaces: List of space IDs to index. If None and pages is also None, all spaces will be indexed.
pages: List of page IDs to index. If provided, these specific pages will be indexed.
batch_size: Number of documents to process in a batch.
continue_on_failure: If True, continue indexing even if some documents fail.
drupal_wiki_scope: The selected tab value from the frontend. If "all_spaces", all spaces will be indexed.
include_attachments: If True, enable processing of page attachments including images and documents.
allow_images: If True, enable processing of image attachments.
"""
#########################################################
# TODO: Remove this after 02/01/2026 and remove **kwargs from the function signature
# Check for deprecated parameters from old connector configurations
# If attempting to update without deleting the connector:
# Remove the deprecated parameters from the custom_connector_config in the relevant connector table rows
deprecated_found = set(kwargs.keys()) & self._DEPRECATED_PARAMS
if deprecated_found:
raise ConnectorValidationError(
f"Outdated Drupal Wiki connector configuration detected "
f"(found deprecated parameters: {', '.join(deprecated_found)}). "
f"Please delete and recreate this connector, or contact Onyx support "
f"for assistance with updating the configuration without deleting the connector."
)
# Reject any other unexpected parameters
if kwargs:
raise ConnectorValidationError(
f"Unexpected parameters for Drupal Wiki connector: {', '.join(kwargs.keys())}"
)
#########################################################
self.base_url = base_url.rstrip("/")
self.spaces = spaces or []
self.pages = pages or []
# Determine whether to include all spaces based on the selected tab
# If drupal_wiki_scope is "all_spaces", we should index all spaces
# If it's "specific_spaces", we should only index the specified spaces
# If it's None, we use the include_all_spaces parameter
if drupal_wiki_scope is not None:
logger.debug(f"drupal_wiki_scope is set to {drupal_wiki_scope}")
self.include_all_spaces = drupal_wiki_scope == "all_spaces"
# If scope is specific_spaces, include_all_spaces correctly defaults to False
else:
logger.debug(
f"drupal_wiki_scope is not set, using include_all_spaces={include_all_spaces}"
)
self.include_all_spaces = include_all_spaces
# If no specific spaces or pages are provided, index all spaces
self.include_all_spaces = not self.spaces and not self.pages
self.batch_size = batch_size
self.continue_on_failure = continue_on_failure
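
A small, self-contained sketch of the new scoping rule above (the helper name is made up): all spaces are indexed only when neither spaces nor pages is supplied.

def includes_all_spaces(spaces: list[str] | None, pages: list[str] | None) -> bool:
    # Mirrors `self.include_all_spaces = not self.spaces and not self.pages`.
    return not (spaces or []) and not (pages or [])

assert includes_all_spaces(None, None) is True       # nothing specified -> index everything
assert includes_all_spaces(["42"], None) is False     # specific spaces only
assert includes_all_spaces(None, ["1001"]) is False   # specific pages only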


@@ -12,7 +12,9 @@ from functools import partial
from typing import Any
from typing import cast
from typing import Protocol
from urllib.parse import parse_qs
from urllib.parse import urlparse
from urllib.parse import urlunparse
from google.auth.exceptions import RefreshError
from google.oauth2.credentials import Credentials as OAuthCredentials
@@ -64,6 +66,7 @@ from onyx.connectors.google_utils.shared_constants import USER_FIELDS
from onyx.connectors.interfaces import CheckpointedConnectorWithPermSync
from onyx.connectors.interfaces import CheckpointOutput
from onyx.connectors.interfaces import GenerateSlimDocumentOutput
from onyx.connectors.interfaces import NormalizationResult
from onyx.connectors.interfaces import SecondsSinceUnixEpoch
from onyx.connectors.interfaces import SlimConnectorWithPermSync
from onyx.connectors.models import ConnectorFailure
@@ -281,6 +284,54 @@ class GoogleDriveConnector(
)
return self._creds
@classmethod
@override
def normalize_url(cls, url: str) -> NormalizationResult:
"""Normalize a Google Drive URL to match the canonical Document.id format.
Reuses the connector's existing document ID creation logic from
onyx_document_id_from_drive_file.
"""
parsed = urlparse(url)
netloc = parsed.netloc.lower()
if not (
netloc.startswith("docs.google.com")
or netloc.startswith("drive.google.com")
):
return NormalizationResult(normalized_url=None, use_default=False)
# Handle ?id= query parameter case
query_params = parse_qs(parsed.query)
doc_id = query_params.get("id", [None])[0]
if doc_id:
scheme = parsed.scheme or "https"
netloc = "drive.google.com"
path = f"/file/d/{doc_id}"
params = ""
query = ""
fragment = ""
normalized = urlunparse(
(scheme, netloc, path, params, query, fragment)
).rstrip("/")
return NormalizationResult(normalized_url=normalized, use_default=False)
# Extract file ID and use connector's function
path_parts = parsed.path.split("/")
file_id = None
for i, part in enumerate(path_parts):
if part == "d" and i + 1 < len(path_parts):
file_id = path_parts[i + 1]
break
if not file_id:
return NormalizationResult(normalized_url=None, use_default=False)
# Create minimal file object for connector function
file_obj = {"webViewLink": url, "id": file_id}
normalized = onyx_document_id_from_drive_file(file_obj).rstrip("/")
return NormalizationResult(normalized_url=normalized, use_default=False)
# TODO: ensure returned new_creds_dict is actually persisted when this is called?
def load_credentials(self, credentials: dict[str, Any]) -> dict[str, str] | None:
try:
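
A minimal, self-contained sketch of the "?id=" branch above. Only this branch is reproduced; the "/d/<file_id>" branch delegates to onyx_document_id_from_drive_file, whose output format is not shown in this diff.

from urllib.parse import parse_qs, urlparse, urlunparse

def normalize_drive_open_url(url: str) -> str | None:
    # "open?id=..." style links normalize to a canonical /file/d/<id> URL.
    parsed = urlparse(url)
    doc_id = parse_qs(parsed.query).get("id", [None])[0]
    if not doc_id:
        return None
    return urlunparse(
        ("https", "drive.google.com", f"/file/d/{doc_id}", "", "", "")
    ).rstrip("/")

assert (
    normalize_drive_open_url("https://drive.google.com/open?id=1AbCdEfGh")
    == "https://drive.google.com/file/d/1AbCdEfGh"
)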


@@ -25,6 +25,18 @@ GenerateSlimDocumentOutput = Iterator[list[SlimDocument]]
CT = TypeVar("CT", bound=ConnectorCheckpoint)
class NormalizationResult(BaseModel):
"""Result of URL normalization attempt.
Attributes:
normalized_url: The normalized URL string, or None if normalization failed
use_default: If True, fall back to default normalizer. If False, return None.
"""
normalized_url: str | None
use_default: bool = False
class BaseConnector(abc.ABC, Generic[CT]):
REDIS_KEY_PREFIX = "da_connector_data:"
@@ -74,6 +86,15 @@ class BaseConnector(abc.ABC, Generic[CT]):
"""Implement if the underlying connector wants to skip/allow image downloading
based on the application level image analysis setting."""
@classmethod
def normalize_url(cls, url: str) -> "NormalizationResult":
"""Normalize a URL to match the canonical Document.id format used during ingestion.
Connectors that use URLs as document IDs should override this method.
Returns NormalizationResult with use_default=True if not implemented.
"""
return NormalizationResult(normalized_url=None, use_default=True)
def build_dummy_checkpoint(self) -> CT:
# TODO: find a way to make this work without type: ignore
return ConnectorCheckpoint(has_more=True) # type: ignore
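
A hedged sketch of how a caller of this hook might interpret the result (the resolve_document_id helper and default_normalize callable are hypothetical, not part of this diff): use_default=True means the connector has no custom normalizer and the generic one should run, while use_default=False with normalized_url=None means the URL was recognized as this connector's domain but no canonical ID could be derived.

from collections.abc import Callable

def resolve_document_id(
    connector_cls: type, url: str, default_normalize: Callable[[str], str | None]
) -> str | None:
    result = connector_cls.normalize_url(url)
    if result.use_default:
        # Connector did not implement a custom normalizer; fall back to the generic one.
        return default_normalize(url)
    # Custom normalizer ran; None here means "no match", not "try the default".
    return result.normalized_url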


@@ -18,6 +18,7 @@ from typing_extensions import override
from onyx.configs.app_configs import INDEX_BATCH_SIZE
from onyx.configs.app_configs import JIRA_CONNECTOR_LABELS_TO_SKIP
from onyx.configs.app_configs import JIRA_CONNECTOR_MAX_TICKET_SIZE
from onyx.configs.app_configs import JIRA_SLIM_PAGE_SIZE
from onyx.configs.constants import DocumentSource
from onyx.connectors.cross_connector_utils.miscellaneous_utils import (
is_atlassian_date_error,
@@ -57,7 +58,6 @@ logger = setup_logger()
ONE_HOUR = 3600
_MAX_RESULTS_FETCH_IDS = 5000 # 5000
_JIRA_SLIM_PAGE_SIZE = 500
_JIRA_FULL_PAGE_SIZE = 50
# Constants for Jira field names
@@ -683,7 +683,7 @@ class JiraConnector(
jira_client=self.jira_client,
jql=jql,
start=current_offset,
max_results=_JIRA_SLIM_PAGE_SIZE,
max_results=JIRA_SLIM_PAGE_SIZE,
all_issue_ids=checkpoint.all_issue_ids,
checkpoint_callback=checkpoint_callback,
nextPageToken=checkpoint.cursor,
@@ -703,11 +703,11 @@ class JiraConnector(
)
)
current_offset += 1
if len(slim_doc_batch) >= _JIRA_SLIM_PAGE_SIZE:
if len(slim_doc_batch) >= JIRA_SLIM_PAGE_SIZE:
yield slim_doc_batch
slim_doc_batch = []
self.update_checkpoint_for_next_run(
checkpoint, current_offset, prev_offset, _JIRA_SLIM_PAGE_SIZE
checkpoint, current_offset, prev_offset, JIRA_SLIM_PAGE_SIZE
)
prev_offset = current_offset
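
The hard-coded _JIRA_SLIM_PAGE_SIZE = 500 moves into app_configs as JIRA_SLIM_PAGE_SIZE. Assuming it follows the usual env-overridable pattern for app_configs constants (the environment plumbing below is an assumption, not shown in this diff), a definition along these lines would let deployments tune the slim-document page size:

import os

# Assumed shape of the new config constant; the default 500 matches the old hard-coded value.
JIRA_SLIM_PAGE_SIZE = int(os.environ.get("JIRA_SLIM_PAGE_SIZE", "500"))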


@@ -1,10 +1,13 @@
import os
import re
from datetime import datetime
from datetime import timezone
from typing import Any
from typing import cast
from urllib.parse import urlparse
import requests
from typing_extensions import override
from onyx.configs.app_configs import INDEX_BATCH_SIZE
from onyx.configs.app_configs import LINEAR_CLIENT_ID
@@ -16,6 +19,7 @@ from onyx.connectors.cross_connector_utils.miscellaneous_utils import (
from onyx.connectors.cross_connector_utils.miscellaneous_utils import time_str_to_utc
from onyx.connectors.interfaces import GenerateDocumentsOutput
from onyx.connectors.interfaces import LoadConnector
from onyx.connectors.interfaces import NormalizationResult
from onyx.connectors.interfaces import OAuthConnector
from onyx.connectors.interfaces import PollConnector
from onyx.connectors.interfaces import SecondsSinceUnixEpoch
@@ -312,6 +316,31 @@ class LinearConnector(LoadConnector, PollConnector, OAuthConnector):
yield from self._process_issues(start_str=start_time, end_str=end_time)
@classmethod
@override
def normalize_url(cls, url: str) -> NormalizationResult:
"""Extract Linear issue identifier from URL.
Linear URLs are like: https://linear.app/team/issue/IDENTIFIER/...
Returns the identifier (e.g., "DAN-2327") which can be used to match Document.link.
"""
parsed = urlparse(url)
netloc = parsed.netloc.lower()
if "linear.app" not in netloc:
return NormalizationResult(normalized_url=None, use_default=False)
# Extract identifier from path: /team/issue/IDENTIFIER/...
# Pattern: /{team}/issue/{identifier}/...
path_parts = [p for p in parsed.path.split("/") if p]
if len(path_parts) >= 3 and path_parts[1] == "issue":
identifier = path_parts[2]
# Validate identifier format (e.g., "DAN-2327")
if re.match(r"^[A-Z]+-\d+$", identifier):
return NormalizationResult(normalized_url=identifier, use_default=False)
return NormalizationResult(normalized_url=None, use_default=False)
if __name__ == "__main__":
connector = LinearConnector()
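
For illustration, a self-contained restatement of the extraction above (the team slug "acme" and issue title are invented; the identifier pattern and path layout come straight from the diff):

import re
from urllib.parse import urlparse

def linear_issue_identifier(url: str) -> str | None:
    parsed = urlparse(url)
    if "linear.app" not in parsed.netloc.lower():
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) >= 3 and parts[1] == "issue" and re.match(r"^[A-Z]+-\d+$", parts[2]):
        return parts[2]
    return None

assert linear_issue_identifier("https://linear.app/acme/issue/DAN-2327/fix-crash") == "DAN-2327"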


@@ -1,13 +1,17 @@
import re
from collections.abc import Generator
from datetime import datetime
from datetime import timezone
from typing import Any
from typing import cast
from typing import Optional
from urllib.parse import parse_qs
from urllib.parse import urlparse
import requests
from pydantic import BaseModel
from retry import retry
from typing_extensions import override
from onyx.configs.app_configs import INDEX_BATCH_SIZE
from onyx.configs.app_configs import NOTION_CONNECTOR_DISABLE_RECURSIVE_PAGE_LOOKUP
@@ -21,6 +25,7 @@ from onyx.connectors.exceptions import InsufficientPermissionsError
from onyx.connectors.exceptions import UnexpectedValidationError
from onyx.connectors.interfaces import GenerateDocumentsOutput
from onyx.connectors.interfaces import LoadConnector
from onyx.connectors.interfaces import NormalizationResult
from onyx.connectors.interfaces import PollConnector
from onyx.connectors.interfaces import SecondsSinceUnixEpoch
from onyx.connectors.models import ConnectorMissingCredentialError
@@ -101,6 +106,49 @@ class NotionConnector(LoadConnector, PollConnector):
# very large, this may not be practical.
self.recursive_index_enabled = recursive_index_enabled or self.root_page_id
@classmethod
@override
def normalize_url(cls, url: str) -> NormalizationResult:
"""Normalize a Notion URL to extract the page ID (UUID format)."""
parsed = urlparse(url)
netloc = parsed.netloc.lower()
if not ("notion.so" in netloc or "notion.site" in netloc):
return NormalizationResult(normalized_url=None, use_default=False)
# Extract page ID from path (format: "Title-PageID")
path_last = parsed.path.split("/")[-1]
candidate = path_last.split("-")[-1] if "-" in path_last else path_last
# Clean and format as UUID
candidate = re.sub(r"[^0-9a-fA-F-]", "", candidate)
cleaned = candidate.replace("-", "")
if len(cleaned) == 32 and re.fullmatch(r"[0-9a-fA-F]{32}", cleaned):
normalized_uuid = (
f"{cleaned[0:8]}-{cleaned[8:12]}-{cleaned[12:16]}-"
f"{cleaned[16:20]}-{cleaned[20:]}"
).lower()
return NormalizationResult(
normalized_url=normalized_uuid, use_default=False
)
# Try query params
params = parse_qs(parsed.query)
for key in ("p", "page_id"):
if key in params and params[key]:
candidate = params[key][0].replace("-", "")
if len(candidate) == 32 and re.fullmatch(r"[0-9a-fA-F]{32}", candidate):
normalized_uuid = (
f"{candidate[0:8]}-{candidate[8:12]}-{candidate[12:16]}-"
f"{candidate[16:20]}-{candidate[20:]}"
).lower()
return NormalizationResult(
normalized_url=normalized_uuid, use_default=False
)
return NormalizationResult(normalized_url=None, use_default=False)
@retry(tries=3, delay=1, backoff=2)
def _fetch_child_blocks(
self, block_id: str, cursor: str | None = None


@@ -15,6 +15,7 @@ from http.client import RemoteDisconnected
from typing import Any
from typing import cast
from urllib.error import URLError
from urllib.parse import urlparse
from pydantic import BaseModel
from redis import Redis
@@ -41,6 +42,7 @@ from onyx.connectors.interfaces import CheckpointOutput
from onyx.connectors.interfaces import CredentialsConnector
from onyx.connectors.interfaces import CredentialsProviderInterface
from onyx.connectors.interfaces import GenerateSlimDocumentOutput
from onyx.connectors.interfaces import NormalizationResult
from onyx.connectors.interfaces import SecondsSinceUnixEpoch
from onyx.connectors.interfaces import SlimConnectorWithPermSync
from onyx.connectors.models import BasicExpertInfo
@@ -626,6 +628,43 @@ class SlackConnector(
# self.delay_lock: str | None = None # the redis key for the shared lock
# self.delay_key: str | None = None # the redis key for the shared delay
@classmethod
@override
def normalize_url(cls, url: str) -> NormalizationResult:
"""Normalize a Slack URL to extract channel_id__thread_ts format."""
parsed = urlparse(url)
if "slack.com" not in parsed.netloc.lower():
return NormalizationResult(normalized_url=None, use_default=False)
# Slack document IDs are format: channel_id__thread_ts
# Extract from URL pattern: .../archives/{channel_id}/p{timestamp}
path_parts = parsed.path.split("/")
if "archives" not in path_parts:
return NormalizationResult(normalized_url=None, use_default=False)
archives_idx = path_parts.index("archives")
if archives_idx + 1 >= len(path_parts):
return NormalizationResult(normalized_url=None, use_default=False)
channel_id = path_parts[archives_idx + 1]
if archives_idx + 2 >= len(path_parts):
return NormalizationResult(normalized_url=None, use_default=False)
thread_part = path_parts[archives_idx + 2]
if not thread_part.startswith("p"):
return NormalizationResult(normalized_url=None, use_default=False)
# Convert p1234567890123456 to 1234567890.123456 format
timestamp_str = thread_part[1:] # Remove 'p' prefix
if len(timestamp_str) == 16:
# Insert dot at position 10 to match canonical format
thread_ts = f"{timestamp_str[:10]}.{timestamp_str[10:]}"
else:
thread_ts = timestamp_str
normalized = f"{channel_id}__{thread_ts}"
return NormalizationResult(normalized_url=normalized, use_default=False)
@staticmethod
def make_credential_prefix(key: str) -> str:
return f"connector:slack:credential_{key}"


@@ -1,4 +1,3 @@
import io
import ipaddress
import random
import socket
@@ -36,10 +35,11 @@ from onyx.connectors.interfaces import GenerateDocumentsOutput
from onyx.connectors.interfaces import LoadConnector
from onyx.connectors.models import Document
from onyx.connectors.models import TextSection
from onyx.file_processing.extract_file_text import read_pdf_file
from onyx.file_processing.html_utils import web_html_cleanup
from onyx.utils.logger import setup_logger
from onyx.utils.sitemap import list_pages_for_site
from onyx.utils.web_content import extract_pdf_text
from onyx.utils.web_content import is_pdf_resource
from shared_configs.configs import MULTI_TENANT
logger = setup_logger()
@@ -116,16 +116,6 @@ DEFAULT_HEADERS = {
"Sec-CH-UA-Platform": '"macOS"',
}
# Common PDF MIME types
PDF_MIME_TYPES = [
"application/pdf",
"application/x-pdf",
"application/acrobat",
"application/vnd.pdf",
"text/pdf",
"text/x-pdf",
]
class WEB_CONNECTOR_VALID_SETTINGS(str, Enum):
# Given a base site, index everything under that path
@@ -268,12 +258,6 @@ def get_internal_links(
return internal_links
def is_pdf_content(response: requests.Response) -> bool:
"""Check if the response contains PDF content based on content-type header"""
content_type = response.headers.get("content-type", "").lower()
return any(pdf_type in content_type for pdf_type in PDF_MIME_TYPES)
def start_playwright() -> Tuple[Playwright, BrowserContext]:
playwright = sync_playwright().start()
@@ -529,14 +513,13 @@ class WebConnector(LoadConnector):
head_response = requests.head(
initial_url, headers=DEFAULT_HEADERS, allow_redirects=True
)
is_pdf = is_pdf_content(head_response)
content_type = head_response.headers.get("content-type")
is_pdf = is_pdf_resource(initial_url, content_type)
if is_pdf or initial_url.lower().endswith(".pdf"):
if is_pdf:
# PDF files are not checked for links
response = requests.get(initial_url, headers=DEFAULT_HEADERS)
page_text, metadata, images = read_pdf_file(
file=io.BytesIO(response.content)
)
page_text, metadata = extract_pdf_text(response.content)
last_modified = response.headers.get("Last-Modified")
result.doc = Document(
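
The new is_pdf_resource helper lives in onyx.utils.web_content and its body is not part of this diff; as a rough, assumed approximation it presumably combines the URL-suffix check that used to sit at this call site with the MIME-type list that was removed above:

PDF_MIME_TYPES = (
    "application/pdf", "application/x-pdf", "application/acrobat",
    "application/vnd.pdf", "text/pdf", "text/x-pdf",
)

def is_pdf_resource_sketch(url: str, content_type: str | None) -> bool:
    # Assumed behavior only: treat .pdf URLs or PDF content types as PDFs.
    if url.lower().split("?", 1)[0].endswith(".pdf"):
        return True
    ct = (content_type or "").lower()
    return any(pdf_type in ct for pdf_type in PDF_MIME_TYPES)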


@@ -1016,9 +1016,15 @@ def slack_retrieval(
for query_string in query_strings
]
# If include_dm is True, add additional searches without channel filters
# This allows searching DMs/group DMs while still searching the specified channels
if entities and entities.get("include_dm"):
# If include_dm is True AND we're not already searching all channels,
# add additional searches without channel filters.
# This allows searching DMs/group DMs while still searching the specified channels.
# Skip this if search_all_channels is already True (would be duplicate queries).
if (
entities
and entities.get("include_dm")
and not entities.get("search_all_channels")
):
# Create a minimal entities dict that won't add channel filters
# This ensures we search ALL conversations (DMs, group DMs, private channels)
# BUT we still want to exclude channels specified in exclude_channels
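
Restating the guard above as a self-contained predicate (the helper name is made up): the extra channel-filter-free search for DMs is added only when include_dm is requested and search_all_channels is not already set, since the latter would make the extra queries duplicates.

def should_add_dm_search(entities: dict | None) -> bool:
    return bool(
        entities
        and entities.get("include_dm")
        and not entities.get("search_all_channels")
    )

assert should_add_dm_search({"include_dm": True}) is True
assert should_add_dm_search({"include_dm": True, "search_all_channels": True}) is False
assert should_add_dm_search(None) is False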


@@ -398,8 +398,8 @@ def extract_channel_references_from_query(query_text: str) -> set[str]:
channel_patterns = [
r"\bin\s+(?:the\s+)?([a-z0-9_-]+)\s+(?:slack\s+)?channels?\b", # "in the office channel"
r"\bfrom\s+(?:the\s+)?([a-z0-9_-]+)\s+(?:slack\s+)?channels?\b", # "from the office channel"
r"\bin\s+#([a-z0-9_-]+)\b", # "in #office"
r"\bfrom\s+#([a-z0-9_-]+)\b", # "from #office"
r"\bin[:\s]*#([a-z0-9_-]+)\b", # "in #office" or "in:#office"
r"\bfrom[:\s]*#([a-z0-9_-]+)\b", # "from #office" or "from:#office"
]
for pattern in channel_patterns:
@@ -566,6 +566,23 @@ def extract_content_words_from_recency_query(
return content_words_filtered[:MAX_CONTENT_WORDS]
def _is_valid_keyword_query(line: str) -> bool:
"""Check if a line looks like a valid keyword query vs explanatory text.
Returns False for lines that appear to be LLM explanations rather than keywords.
"""
# Reject lines that start with parentheses (explanatory notes)
if line.startswith("("):
return False
# Reject lines that are too long (likely sentences, not keywords)
# Keywords should be short - reject if > 50 chars or > 6 words
if len(line) > 50 or len(line.split()) > 6:
return False
return True
def expand_query_with_llm(query_text: str, llm: LLM) -> list[str]:
"""Use LLM to expand query into multiple search variations.
@@ -586,10 +603,18 @@ def expand_query_with_llm(query_text: str, llm: LLM) -> list[str]:
response_clean = _parse_llm_code_block_response(response)
# Split into lines and filter out empty lines
rephrased_queries = [
raw_queries = [
line.strip() for line in response_clean.split("\n") if line.strip()
]
# Filter out lines that look like explanatory text rather than keywords
rephrased_queries = [q for q in raw_queries if _is_valid_keyword_query(q)]
# Log if we filtered out garbage
if len(raw_queries) != len(rephrased_queries):
filtered_out = set(raw_queries) - set(rephrased_queries)
logger.warning(f"Filtered out non-keyword LLM responses: {filtered_out}")
# If no queries generated, use empty query
if not rephrased_queries:
logger.debug("No content keywords extracted from query expansion")


@@ -253,7 +253,7 @@ class RetrievalDetails(ChunkContext):
# Use LLM to determine whether to do a retrieval or only rely on existing history
# If the Persona is configured to not run search (0 chunks), this is bypassed
# If no Prompt is configured, only the search results are shown; this is bypassed
run_search: OptionalSearchSetting = OptionalSearchSetting.ALWAYS
run_search: OptionalSearchSetting = OptionalSearchSetting.AUTO
# Is this a real-time/streaming call or a question where Onyx can take more time?
# Used to determine reranking flow
real_time: bool = True


@@ -19,7 +19,6 @@ from onyx.db.models import Persona
from onyx.db.models import User
from onyx.document_index.interfaces import DocumentIndex
from onyx.llm.interfaces import LLM
from onyx.onyxbot.slack.models import SlackContext
from onyx.secondary_llm_flows.source_filter import extract_source_filter
from onyx.secondary_llm_flows.time_filter import extract_time_filter
from onyx.utils.logger import setup_logger
@@ -249,8 +248,6 @@ def search_pipeline(
db_session: Session,
auto_detect_filters: bool = False,
llm: LLM | None = None,
# Needed for federated Slack search
slack_context: SlackContext | None = None,
# If a project ID is provided, it will be exclusively scoped to that project
project_id: int | None = None,
) -> list[InferenceChunk]:
@@ -291,11 +288,9 @@ def search_pipeline(
retrieved_chunks = search_chunks(
query_request=query_request,
# Needed for federated Slack search
user_id=user.id if user else None,
document_index=document_index,
db_session=db_session,
slack_context=slack_context,
)
# For some specific connectors like Salesforce, a user that has access to an object doesn't mean


@@ -29,7 +29,6 @@ from onyx.document_index.vespa.shared_utils.utils import (
from onyx.federated_connectors.federated_retrieval import (
get_federated_retrieval_functions,
)
from onyx.onyxbot.slack.models import SlackContext
from onyx.secondary_llm_flows.query_expansion import multilingual_query_expansion
from onyx.utils.logger import setup_logger
from onyx.utils.threadpool_concurrency import run_functions_tuples_in_parallel
@@ -331,7 +330,6 @@ def retrieve_chunks(
retrieval_metrics_callback: (
Callable[[RetrievalMetricsContainer], None] | None
) = None,
slack_context: SlackContext | None = None,
) -> list[InferenceChunk]:
"""Returns a list of the best chunks from an initial keyword/semantic/ hybrid search."""
@@ -348,8 +346,7 @@ def retrieve_chunks(
user_id,
list(query.filters.source_type) if query.filters.source_type else None,
query.filters.document_set,
slack_context,
query.filters.user_file_ids,
user_file_ids=query.filters.user_file_ids,
)
federated_sources = set(
federated_retrieval_info.source.to_non_federated_source()
@@ -460,7 +457,6 @@ def search_chunks(
user_id: UUID | None,
document_index: DocumentIndex,
db_session: Session,
slack_context: SlackContext | None = None,
) -> list[InferenceChunk]:
run_queries: list[tuple[Callable, tuple]] = []
@@ -476,7 +472,6 @@ def search_chunks(
user_id=user_id,
source_types=list(source_filters) if source_filters else None,
document_set_names=query_request.filters.document_set,
slack_context=slack_context,
user_file_ids=query_request.filters.user_file_ids,
)

Some files were not shown because too many files have changed in this diff.