feat: add Google Drive knowledge connector #3695

Closed
CalvinMagezi wants to merge 1 commit into archestra-ai:main from CalvinMagezi:feat/google-drive-connector

@CalvinMagezi

Summary

Closes #3689 — adds a Google Drive knowledge connector for syncing documents and files into the Archestra knowledge base.

  • Backend connector (google-drive-connector.ts): Uses the Drive REST API v3 directly (no SDK dependency). Supports dual auth: paste a service account JSON key (auto-generates JWT via node:crypto) or a raw OAuth 2.0 access token.
  • Incremental sync: Filters on modifiedTime so unchanged files are not re-synced; the checkpoint is applied with a 5-minute safety buffer.
  • Supported formats: Google Docs (exported as plain text), Sheets (CSV), Slides (plain text), .txt, .md, .csv, .json, .html.
  • Filtering: Shared Drive (driveId), folder (folderId), MIME type (mimeTypes).
  • Test suite: 11 vitest tests covering config validation, connection testing, sync batching, pagination, checkpoint advancement, folder/drive filtering, and query building.
  • Frontend: Google Drive option in create/edit dialogs with config fields for Shared Drive ID, folder ID, and MIME type filter.
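
The service-account path in the first bullet can be illustrated with a minimal sketch of an RS256 JWT assertion built with `node:crypto`. This is not the PR's code: `ServiceAccountKey`, `buildJwtAssertion`, the read-only Drive scope, and the throwaway demo key are assumptions made for illustration. The `aud` value is Google's OAuth 2.0 token endpoint.

```typescript
import { createSign, createVerify, generateKeyPairSync } from "node:crypto";

// Hypothetical shape: only the two key-file fields this sketch needs.
interface ServiceAccountKey {
  client_email: string;
  private_key: string; // PEM-encoded RSA private key
}

const toBase64Url = (input: string): string =>
  Buffer.from(input, "utf8").toString("base64url");

// Builds a signed RS256 JWT assertion for a service account.
function buildJwtAssertion(
  key: ServiceAccountKey,
  scope: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): string {
  const header = { alg: "RS256", typ: "JWT" };
  const claims = {
    iss: key.client_email,
    scope,
    aud: "https://oauth2.googleapis.com/token",
    iat: nowSeconds,
    exp: nowSeconds + 3600, // one hour, the longest lifetime Google accepts
  };
  const signingInput =
    `${toBase64Url(JSON.stringify(header))}.${toBase64Url(JSON.stringify(claims))}`;
  const signature = createSign("RSA-SHA256")
    .update(signingInput)
    .sign(key.private_key)
    .toString("base64url");
  return `${signingInput}.${signature}`;
}

// Demo with a throwaway key pair (assumption: a real key comes from the pasted JSON).
const { publicKey, privateKey } = generateKeyPairSync("rsa", {
  modulusLength: 2048,
  publicKeyEncoding: { type: "spki", format: "pem" },
  privateKeyEncoding: { type: "pkcs8", format: "pem" },
});
const demoJwt = buildJwtAssertion(
  { client_email: "demo@example.iam.gserviceaccount.com", private_key: privateKey },
  "https://www.googleapis.com/auth/drive.readonly",
  1_700_000_000,
);
const [demoHeader, demoClaims, demoSignature] = demoJwt.split(".");
const demoVerified = createVerify("RSA-SHA256")
  .update(`${demoHeader}.${demoClaims}`)
  .verify(publicKey, Buffer.from(demoSignature, "base64url"));
```

In a real flow the assertion would be POSTed to the token endpoint with `grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer`, and the returned access token used as a Bearer token on Drive requests, so both auth modes end up sending the same header.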

Files changed

| File | Change |
| --- | --- |
| platform/backend/src/knowledge-base/connectors/google-drive/google-drive-connector.ts | New: core connector implementation |
| platform/backend/src/knowledge-base/connectors/google-drive/google-drive-connector.test.ts | New: vitest test suite |
| platform/backend/src/knowledge-base/connectors/registry.ts | Register GoogleDriveConnector |
| platform/backend/src/types/knowledge-connector.ts | Add GoogleDriveConfigSchema, GoogleDriveCheckpointSchema, googledrive type |
| platform/shared/knowledge-base.ts | Add googledrive: "Google Drive" label |
| platform/frontend/public/icons/google-drive.png | Google Drive icon |
| platform/frontend/src/app/knowledge/knowledge-bases/_parts/connector-icons.tsx | Register icon |
| platform/frontend/src/app/knowledge/knowledge-bases/_parts/google-drive-config-fields.tsx | New: config fields component |
| platform/frontend/src/app/knowledge/knowledge-bases/_parts/create-connector-dialog.tsx | Add Google Drive option |
| platform/frontend/src/app/knowledge/knowledge-bases/_parts/edit-connector-dialog.tsx | Add Google Drive config fields |
| platform/frontend/src/app/knowledge/knowledge-bases/_parts/transform-config-array-fields.ts | Add mimeTypes to array fields |

Test plan

  • Create a Google Drive connector using a service account JSON key
  • Create a Google Drive connector using a raw OAuth access token
  • Run earn_discover to verify the connector appears in the list
  • Trigger a sync and verify Google Docs are ingested as plain text
  • Trigger a sync with folderId set and verify only files from that folder are synced
  • Verify the lastSyncedAt checkpoint advances after each sync
  • Run pnpm test in platform/backend to verify all unit tests pass
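
The plain-text ingestion check in the test plan depends on choosing an export format for each Google-native file type. The sketch below shows one way that mapping could look, based on the formats listed in the summary. `EXPORT_MIME_TYPES` and `contentUrl` are hypothetical names, and the rule for which non-Google types count as text is an assumption, not the PR's logic.

```typescript
// Google-native MIME type -> export MIME type, following the formats in the summary.
const EXPORT_MIME_TYPES: Record<string, string> = {
  "application/vnd.google-apps.document": "text/plain",
  "application/vnd.google-apps.spreadsheet": "text/csv",
  "application/vnd.google-apps.presentation": "text/plain",
};

const DRIVE_FILES_URL = "https://www.googleapis.com/drive/v3/files";

// Returns the URL that yields a file's text content, or null for unsupported types.
function contentUrl(fileId: string, mimeType: string): string | null {
  const id = encodeURIComponent(fileId);
  const exportType = EXPORT_MIME_TYPES[mimeType];
  if (exportType) {
    // Google-native files have no raw bytes and must go through the export endpoint.
    return `${DRIVE_FILES_URL}/${id}/export?mimeType=${encodeURIComponent(exportType)}`;
  }
  // Assumption: anything under text/* plus JSON is treated as text-like.
  const isText = mimeType.startsWith("text/") || mimeType === "application/json";
  return isText ? `${DRIVE_FILES_URL}/${id}?alt=media` : null;
}
```

For example, `contentUrl("abc", "application/vnd.google-apps.document")` points at the export endpoint with `mimeType=text%2Fplain`, while `contentUrl("abc", "image/png")` returns `null` so the file can be skipped.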

🤖 Generated with Claude Code

Adds a Google Drive connector for syncing documents and files from
Google Drive into the Archestra knowledge base.

**Backend** (`platform/backend/src/knowledge-base/connectors/google-drive/`):
- `GoogleDriveConnector` — implements `validateConfig`, `testConnection`,
  and `sync` via the Drive REST API v3 (no third-party SDK)
- Dual auth: service account JSON key (auto-generates JWT via `node:crypto`)
  or raw OAuth 2.0 access token
- Syncs Google Docs (exported as plain text), Sheets (CSV), Slides,
  `.txt`, `.md`, `.csv`, `.json`, `.html`, and other text MIME types
- Incremental sync via `modifiedTime` with 5-minute safety buffer
- Supports filtering by Shared Drive (`driveId`), folder (`folderId`),
  and MIME type (`mimeTypes`)
- Pagination via `nextPageToken`
- Full vitest test suite covering config validation, auth, sync batching,
  pagination, checkpoint advancement, and query building
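
The filtering and incremental-sync bullets above can be sketched as a query builder for the Drive v3 `files.list` endpoint. This is an illustration rather than the PR's implementation: `DriveFilterConfig`, `buildListParams`, the `trashed = false` clause, and the choice to subtract the buffer from the checkpoint are all assumptions.

```typescript
// Hypothetical config shape; field names follow the PR description.
interface DriveFilterConfig {
  driveId?: string;
  folderId?: string;
  mimeTypes?: string[];
}

const SAFETY_BUFFER_MS = 5 * 60 * 1000; // the 5-minute buffer from the description

// Escapes backslashes and single quotes for a Drive query string literal.
const quote = (value: string): string =>
  `'${value.replace(/\\/g, "\\\\").replace(/'/g, "\\'")}'`;

function buildListParams(
  config: DriveFilterConfig,
  lastSyncedAt?: Date,
): URLSearchParams {
  const clauses: string[] = ["trashed = false"]; // assumption: skip trashed files
  if (config.folderId) clauses.push(`${quote(config.folderId)} in parents`);
  if (config.mimeTypes?.length) {
    const anyOf = config.mimeTypes
      .map((mimeType) => `mimeType = ${quote(mimeType)}`)
      .join(" or ");
    clauses.push(`(${anyOf})`);
  }
  if (lastSyncedAt) {
    // Assumption: the buffer moves the cutoff earlier so edits that landed
    // around the previous sync are picked up again rather than missed.
    const cutoff = new Date(lastSyncedAt.getTime() - SAFETY_BUFFER_MS);
    clauses.push(`modifiedTime > ${quote(cutoff.toISOString())}`);
  }
  const params = new URLSearchParams({
    q: clauses.join(" and "),
    fields: "nextPageToken, files(id, name, mimeType, modifiedTime)",
    orderBy: "modifiedTime",
  });
  if (config.driveId) {
    // Shared Drive listing needs these flags in addition to driveId.
    params.set("corpora", "drive");
    params.set("driveId", config.driveId);
    params.set("includeItemsFromAllDrives", "true");
    params.set("supportsAllDrives", "true");
  }
  return params;
}
```

Each page of results would then be requested with these parameters plus `pageToken` set to the previous response's `nextPageToken`, until no token is returned.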

**Types** (`platform/backend/src/types/knowledge-connector.ts`):
- `GoogleDriveConfigSchema` / `GoogleDriveConfig`
- `GoogleDriveCheckpointSchema` / `GoogleDriveCheckpoint`
- `googledrive` literal added to `ConnectorTypeSchema` and union schemas

**Registry** (`registry.ts`): `googledrive` → `GoogleDriveConnector`

**Frontend**:
- `google-drive-config-fields.tsx` — Shared Drive ID, folder ID, MIME types
- `connector-icons.tsx` — Google Drive icon (`/icons/google-drive.png`)
- `create-connector-dialog.tsx` — Google Drive option + config fields
- `edit-connector-dialog.tsx` — Google Drive label + config fields
- `transform-config-array-fields.ts` — `mimeTypes` added to array fields

**Shared** (`platform/shared/knowledge-base.ts`):
- `googledrive: "Google Drive"` label

Closes archestra-ai#3689
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@joeyorlando
Contributor

hi there 👋 #3689 has already been assigned to another contributor, but thank you for your contribution 🙏
