SubashMohan
fe029eccae
chore: add SharePoint sync environment variables to integration test ( #5197 )
...
* chore: add SharePoint sync environment variables to integration test workflows
* fix cubic comments
* test: skip SharePoint permission tests for non-enterprise
* test: update SharePoint permission tests to skip for non-enterprise environments
2025-08-18 03:21:04 +00:00
Wenxi
ea72af7698
fix sharepoint tests ( #5209 )
2025-08-17 22:25:47 +00:00
Wenxi
17abf85533
fix unpaused user files ( #5205 )
2025-08-16 01:39:16 +00:00
Wenxi
3bd162acb9
fix: sharepoint tests and indexing logic ( #5204 )
...
* don't index onedrive personal sites in sharepoint
* fix sharepoint tests and indexing behavior
* remove print
2025-08-15 18:19:42 -07:00
Evan Lohn
664ce441eb
generous timeout between docfetching finishing and docprocessing starting ( #5201 )
2025-08-15 15:43:01 -07:00
Wenxi
6863fbee54
fix: validate sharepoint connector with validate_connector_settings ( #5199 )
...
* validate sharepoint connector with validate_connector_settings
* fix test
* fix tests
2025-08-15 00:38:31 +00:00
Nils
a605bd4ca4
feat: make sharepoint documents and sharepoint pages optional ( #5183 )
...
* feat: make sharepoint documents and sharepoint pages optional
* fix: address review feedback for PR #5183
* fix: exclude personal sites from sharepoint connector
---------
Co-authored-by: Nils Kleinrahm <nils.kleinrahm@pledoc.de>
2025-08-14 15:17:23 -07:00
Dominic Feliton
0e8b5af619
fix(connector): user file helm start cmd + legacy file connector incompatibility ( #5195 )
...
* Fix user file helm start cmd + legacy file connector incompatibility
* typo
* remove unnecessary logic
* undo
* make recommended changes
* keep comment
* cleanup
* format
---------
Co-authored-by: Dominic Feliton <37809476+dominicfeliton@users.noreply.github.com>
2025-08-14 13:20:19 -07:00
SubashMohan
46f3af4f68
enhance file processing with content type handling ( #5196 )
2025-08-14 08:59:53 +00:00
Evan Lohn
2af64ebf4c
fix: ensure exception strings don't get swallowed ( #5192 )
...
* ensure exception strings don't get swallowed
* just send exception code
2025-08-13 20:05:16 +00:00
Evan Lohn
0eb1824158
fix: sf connector docs ( #5171 )
...
* fix: sf connector docs
* more sf logs
* better logs and new attempt
* add fields to error temporarily
* fix sf
---------
Co-authored-by: Wenxi <wenxi@onyx.app>
2025-08-13 17:52:32 +00:00
Chris Weaver
e0a9a6fb66
feat: okta profile tool ( #5184 )
...
* Initial Okta profile tool
* Improve
* Fix
* Improve
* Improve
* Address EL comments
2025-08-13 09:57:31 -07:00
Wenxi
55dc24fd27
fix: seeded total doc count ( #5188 )
...
* fix seeded total doc count
* fix seeded total doc count
2025-08-13 00:19:06 +00:00
Evan Lohn
da02962a67
fix: thread safe approach to docprocessing logging ( #5185 )
...
* thread safe approach to docprocessing logging
* unify approaches
* reset
2025-08-12 02:25:47 +00:00
SubashMohan
9bc62cc803
feat: sharepoint perm sync ( #5033 )
...
* sharepoint perm sync first draft
* feat: Implement SharePoint permission synchronization
* mypy fix
* remove commented code
* bot comments fixes and job failure fixes
* introduce generic way to upload certificates in credentials
* mypy fix
* add checkpointing to sharepoint connector
* add sharepoint integration tests
* Refactor SharePoint connector to derive tenant domain from verified domains and remove direct tenant domain input from credentials
* address review comments
* add permission sync to site pages
* mypy fix
* fix tests error
* fix tests and address comments
* Update file extraction behavior in SharePoint connector to continue processing on unprocessable files
2025-08-11 16:59:16 +00:00
Evan Lohn
bf6705a9a5
fix: max tokens param ( #5174 )
...
* max tokens param
* fix unit test
* fix unit test
2025-08-11 09:57:44 -07:00
Rei Meguro
df2fef3383
fix: removal of old tags + is_list differentiation ( #5147 )
...
* initial migration
* getting metadata from tags
* complete migration
* migration override for cloud
* fix: more robust structured tag gen
* tag and indexing update
* fix: move is_list to tags
* migration rebase
* test cases + bugfix on unique constraint
* fix logging
2025-08-10 22:39:33 +00:00
SubashMohan
8cec3448d7
fix: restrict user file access to current user only ( #5177 )
...
* fix: restrict user file access to current user only
* fix: enhance user file access control for recent folder
2025-08-10 19:00:18 +00:00
Wenxi
bacee0d09d
fix: sanitize slack payload before logging ( #5167 )
...
* sanitize slack payload before logging
* nit
2025-08-08 02:10:00 +00:00
Evan Lohn
297720c132
refactor: file processing ( #5136 )
...
* file processing refactor
* mypy
* CW comments
* address CW
2025-08-08 00:34:35 +00:00
Evan Lohn
bd4bd00cef
feat: office parsing markitdown ( #5115 )
...
* switch to markitdown untested
* passing tests
* reset file
* dotenv version
* docs
* add test file
* add doc
* fix integration test
2025-08-07 23:26:02 +00:00
Wenxi
cf193dee29
feat: support gpt5 models ( #5169 )
...
* support gpt5 models
* gpt5mini visible
2025-08-07 12:35:46 -07:00
Evan Lohn
1b47fa2700
fix: remove erroneous error case and add valid error ( #5163 )
...
* fix: remove erroneous error case and add valid error
* also address docfetching-docprocessing limbo
2025-08-07 18:17:00 +00:00
Wenxi Onyx
e1a305d18a
mask llm api key from logs
2025-08-07 00:01:29 -07:00
Evan Lohn
e2233d22c9
feat: salesforce custom query ( #5158 )
...
* WIP merged approach untested
* tested custom configs
* JT comments
* fix unit test
* CW comments
* fix unit test
2025-08-07 02:37:23 +00:00
Wenxi
1b2f4f3b87
fix: slash command slackbot to respond in private msg ( #5151 )
...
* fix slash command slackbot to respond in private msg
* rename confusing variable. fix slash message response in DMs
2025-08-05 19:03:38 -07:00
Evan Lohn
d85b55a9d2
no more scheduled stalling ( #5154 )
2025-08-05 20:17:44 +00:00
Chris Weaver
258e08abcd
feat: add customization via env vars for curator role ( #5150 )
...
* Add customization via env vars for curator role
* Simplify
* Simplify more
* Address comments
2025-08-05 09:58:36 -07:00
SubashMohan
146628e734
fix unsupported character error in minio migration ( #5145 )
...
* fix unsupported character error in minio migration
* slash fix
2025-08-04 12:42:07 -07:00
Wenxi
c1d4b08132
fix: minio file names ( #5138 )
...
* nit var clarity
* maintain file names in connector config for display
* remove unused util
* migration draft
* optional file names to not break existing instances
* backwards compatible
* backwards compatible
* migration logging
* update file conn tests
* unnecessary none
* mypy + explanatory comments
2025-08-01 20:31:29 +00:00
Wenxi
554cd0f891
fix: accept multiple zip types and fallback to extension ( #5135 )
...
* accept multiple zip types and fallback to extension
* move zip check to util
* mypy nit
2025-07-30 22:21:16 +00:00
Raunak Bhagat
f87d3e9849
fix: Make ungrounded types have a default name when sending to the frontend ( #5133 )
...
* Update names in map-comprehension
* Make default name for ungrounded types public
* Return the default name for ungrounded entity-types
* Update backend/onyx/db/entities.py
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
---------
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
2025-07-30 20:46:30 +00:00
SubashMohan
c442ebaff6
Feature/GitHub permission sync ( #4996 )
...
* github perm sync initial draft
* introduce github doc sync and perm sync
* remove specific start time check
* Refactor GitHub connector to use SlimCheckpointOutputWrapper for improved document handling
* Update GitHub sync frequency defaults from 30 minutes to 5 minutes
* Add stop signal handling and progress reporting in GitHub document sync
* Refactor tests for Confluence and Google Drive connectors to use a mock fetch function for document access
* change the doc_sync approach
* add static typing for document columns and where clause
* remove prefix logic in connector runner
* mypy fix
* code review changes
* mypy fix
* fix review comments
* add sort order
* Implement merge heads migration for Alembic and update Confluence and Google Drive test
* github unit tests fix
* delete merge head and rebase the docmetadata field migration
---------
Co-authored-by: Subash <subash@onyx.app>
2025-07-30 02:42:18 +00:00
Justin Tahara
0157ae099a
[Vespa] Update to optimized configuration pt.2 ( #5113 )
2025-07-28 20:42:31 +00:00
justin-tahara
565fb42457
Let's do this properly
2025-07-28 10:42:31 -07:00
justin-tahara
a50a8b4a12
[Vespa] Update to optimized configuration
2025-07-28 10:38:48 -07:00
Evan Lohn
4baf4e7d96
feat: pruning freq ( #5097 )
...
* pruning frequency increase
* add logs
2025-07-26 22:29:43 +00:00
Wenxi
8b7ab2eb66
onyx metadata minio fix + permissive unstructured fail ( #5085 )
2025-07-25 21:26:02 +00:00
Evan Lohn
650884d76a
fix: preserve error traces ( #5083 )
2025-07-25 18:56:11 +00:00
Evan Lohn
71037678c3
attempt to fix parsing of tricky template files ( #5080 )
2025-07-25 02:18:35 +00:00
Chris Weaver
68de1015e1
feat: support aspx files ( #5068 )
...
* Support aspx files
* Add fetching of site pages
* Improve
* Small enhancement
* more improvements
* Improvements
* Fix tests
2025-07-24 19:19:24 -07:00
Evan Lohn
e2b3a6e144
fix: drive external links ( #5079 )
2025-07-24 17:42:12 -07:00
Evan Lohn
4f04b09efa
add library to fall back to for tokenizing ( #5078 )
2025-07-24 11:15:07 -07:00
SubashMohan
5c4f44d258
fix: sharepoint lg files issue ( #5065 )
...
* add SharePoint file size threshold check
* Implement retry logic for SharePoint queries to handle rate limiting and server error
* mypy fix
* add content none check
* remove unreachable code from retry logic in sharepoint connector
2025-07-24 14:26:01 +00:00
Evan Lohn
19652ad60e
attempt fix for broken excel files ( #5071 )
2025-07-24 01:21:13 +00:00
Evan Lohn
70c96b6ab3
fix: remove locks from indexing callback ( #5070 )
2025-07-23 23:05:35 +00:00
Evan Lohn
bf1e2a2661
feat: avoid full rerun ( #5063 )
...
* fix: remove extra group sync
* second extra task
* minor improvement for non-checkpointed connectors
2025-07-23 18:01:23 +00:00
Evan Lohn
991d5e4203
fix: regen api key ( #5064 )
2025-07-23 03:36:51 +00:00
Evan Lohn
d21f012b04
fix: remove extra group sync ( #5061 )
...
* fix: remove extra group sync
* second extra task
2025-07-22 23:24:42 +00:00
Wenxi
86b7beab01
fix: too many internet chunks ( #5060 )
...
* minor internet search env vars
* add limit to internet search chunks
* note
* nits
2025-07-22 23:11:10 +00:00