// netlify_functions backend for refhub.io
versioned api backend for api-key access to refhub vaults. single function entrypoint dispatched by path segment. reads and writes the existing refhub data model directly — no parallel store.
These routes are for the logged-in RefHub frontend and reject rhk_... API keys.
GET /api/v1/keys
POST /api/v1/keys
POST /api/v1/keys/:keyId/revoke
DELETE /api/v1/keys/:keyId
POST /api/v1/recommendations ← semantic scholar similar papers
POST /api/v1/references ← semantic scholar cited papers
POST /api/v1/citations ← semantic scholar citing papers
POST /api/v1/lookup ← semantic scholar DOI/title → paper id
POST /api/v1/doi-metadata ← semantic scholar DOI metadata enrichment
POST /api/v1/search ← semantic scholar topic/paper search
GET /api/v1/google-drive
POST /api/v1/google-drive/connect
POST /api/v1/google-drive/folder
DELETE /api/v1/google-drive
GET /api/v1/google-drive/callback ← oauth callback, no bearer token
POST /api/v1/google-drive/vaults/:vaultId/items/:itemId/pdf
POST /api/v1/publications/:publicationId/pdf
GET /api/v1/audit
GET /api/v1/vaults
POST /api/v1/vaults
GET /api/v1/vaults/:vaultId
PATCH /api/v1/vaults/:vaultId
DELETE /api/v1/vaults/:vaultId
PATCH /api/v1/vaults/:vaultId/visibility
GET /api/v1/vaults/:vaultId/shares
POST /api/v1/vaults/:vaultId/shares
PATCH /api/v1/vaults/:vaultId/shares/:shareId
DELETE /api/v1/vaults/:vaultId/shares/:shareId
GET /api/v1/vaults/:vaultId/items
POST /api/v1/vaults/:vaultId/items
PATCH /api/v1/vaults/:vaultId/items/:itemId
DELETE /api/v1/vaults/:vaultId/items/:itemId
POST /api/v1/vaults/:vaultId/items/upsert
POST /api/v1/vaults/:vaultId/items/import-preview
POST /api/v1/vaults/:vaultId/items/:itemId/pdf
POST /api/v1/vaults/:vaultId/items/:itemId/pdf/session
POST /api/v1/vaults/:vaultId/items/:itemId/pdf/complete
GET /api/v1/vaults/:vaultId/tags
POST /api/v1/vaults/:vaultId/tags
PATCH /api/v1/vaults/:vaultId/tags/:tagId
DELETE /api/v1/vaults/:vaultId/tags/:tagId
POST /api/v1/vaults/:vaultId/tags/attach
POST /api/v1/vaults/:vaultId/tags/detach
GET /api/v1/vaults/:vaultId/relations
POST /api/v1/vaults/:vaultId/relations
PATCH /api/v1/vaults/:vaultId/relations/:relationId
DELETE /api/v1/vaults/:vaultId/relations/:relationId
POST /api/v1/vaults/:vaultId/import/doi
POST /api/v1/vaults/:vaultId/import/bibtex
POST /api/v1/vaults/:vaultId/import/url
GET /api/v1/vaults/:vaultId/search
GET /api/v1/vaults/:vaultId/stats
GET /api/v1/vaults/:vaultId/changes
GET /api/v1/vaults/:vaultId/export
GET /api/v1/vaults/:vaultId/audit
GET /api/v1/extension/google-drive-status
POST /api/v1/pdf-metadata
.netlify/
functions/
api-v1.js ← versioned router and handlers
src/
auth.js ← api-key parsing, hashing, verification, scope checks
config.js ← required env vars and runtime knobs
export.js ← json and bibtex export helpers
http.js ← shared http/error/json helpers
google-drive.js ← google oauth, drive folder, pdf upload helpers
semantic-scholar.js ← semantic scholar proxy normalization/helpers
bibtex.js ← bibtex parsing/serialization helpers
routes/ ← v2 vaults/items/tags/relations/search/import/audit handlers
netlify.toml ← redirects and function settings
package.json
all /api/v1/* traffic is routed to /.netlify/functions/api-v1 via netlify.toml.
two modes — never mix them.
api key (all data routes):
Authorization: Bearer rhk_<publicId>_<secret>
X-API-Key: rhk_<publicId>_<secret>
session jwt (management routes only):
Authorization: Bearer <supabase-session-jwt>
sending an api key to a management route returns 401 refhub_api_key_not_supported.
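A minimal sketch of how the two modes could be told apart before dispatch. The helper names, the return shapes, and the `unauthorized` code are hypothetical; only the header formats and `refhub_api_key_not_supported` come from the text above. It also assumes `<publicId>` contains no underscore.

```javascript
// Hypothetical helpers: names and return shapes are illustrative only.
const API_KEY_PATTERN = /^rhk_[^_\s]+_\S+$/;

function classifyCredential(headers) {
  // Header names are case-insensitive, so normalize them first.
  const normalized = {};
  for (const [name, value] of Object.entries(headers)) {
    normalized[name.toLowerCase()] = String(value).trim();
  }

  const xApiKey = normalized["x-api-key"];
  if (xApiKey) {
    return API_KEY_PATTERN.test(xApiKey)
      ? { kind: "api_key", token: xApiKey }
      : { kind: "invalid" };
  }

  const match = /^Bearer\s+(\S+)$/i.exec(normalized["authorization"] || "");
  if (!match) return { kind: "none" };

  const token = match[1];
  // Anything with the rhk_ prefix is treated as an API key. Everything else
  // is assumed to be a session JWT that still has to be verified elsewhere.
  if (token.startsWith("rhk_")) {
    return API_KEY_PATTERN.test(token)
      ? { kind: "api_key", token }
      : { kind: "invalid" };
  }
  return { kind: "session_jwt", token };
}

function authorizeManagementRoute(headers) {
  const credential = classifyCredential(headers);
  if (credential.kind === "api_key") {
    return { ok: false, status: 401, code: "refhub_api_key_not_supported" };
  }
  if (credential.kind !== "session_jwt") {
    // "unauthorized" is a placeholder code, not taken from the backend.
    return { ok: false, status: 401, code: "unauthorized" };
  }
  return { ok: true, token: credential.token };
}
```

Classifying once and branching on `kind` keeps the "never mix them" rule in a single place.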
key storage rules:
- only `key_hash` is stored; plaintext key material is never reconstructed
- `key_prefix` stores `rhk_<publicId>` for lookup
- `scopes` is a text array
- optional vault restrictions live in `api_key_vaults`
- `last_used_at` updated best-effort
- request outcomes written best-effort to `api_request_audit_logs`
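The storage rules above can be illustrated with a short sketch. The hash construction is an assumption: the source only says a pepper is used when hashing presented keys, so HMAC-SHA256 is picked here for illustration. The function names are hypothetical, and `<publicId>` is assumed to contain no underscore.

```javascript
const nodeCrypto = require("node:crypto");

// Split rhk_<publicId>_<secret> into the lookup prefix and the secret part.
function parseApiKey(presented) {
  const match = /^rhk_([^_\s]+)_(\S+)$/.exec(presented);
  if (!match) return null;
  return { keyPrefix: `rhk_${match[1]}`, secret: match[2] };
}

// Assumption: HMAC-SHA256 keyed with the pepper, hex-encoded.
function hashApiKey(presented, pepper) {
  return nodeCrypto.createHmac("sha256", pepper).update(presented).digest("hex");
}

// Compare in constant time so response timing does not reveal partial matches.
function verifyApiKey(presented, storedHash, pepper) {
  const candidate = Buffer.from(hashApiKey(presented, pepper), "hex");
  const stored = Buffer.from(storedHash, "hex");
  return candidate.length === stored.length && nodeCrypto.timingSafeEqual(candidate, stored);
}
```

In this sketch the row is looked up by `keyPrefix`, then the full presented key is hashed and compared against `key_hash`; the plaintext never needs to be stored.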
| scope | grants |
|---|---|
| `vaults:read` | list/read vaults, search, stats, changes, audit |
| `vaults:write` | add/update/delete items, tags, relations, import |
| `vaults:export` | export vault as json or bibtex |
| `vaults:admin` | create/update/delete vaults, visibility, shares |
required:
SUPABASE_URL
SUPABASE_SERVICE_ROLE_KEY
REFHUB_API_KEY_PEPPER ← used when hashing presented keys before comparison
optional:
SEMANTIC_SCHOLAR_API_KEY ← recommended for stable upstream rate limits
SEMANTIC_SCHOLAR_RATE_LIMIT_MAX_REQUESTS ← per-user request cap, defaults to 60
SEMANTIC_SCHOLAR_RATE_LIMIT_WINDOW_MS ← rate-limit window, defaults to 60000
SEMANTIC_SCHOLAR_TIMEOUT_MS ← upstream timeout, defaults to 8000
REFHUB_API_MAX_BULK_ITEMS ← defaults to 50
REFHUB_API_MAX_BODY_BYTES ← defaults to 52428800
REFHUB_API_ALLOWED_ORIGINS ← comma-separated browser origins; defaults to https://refhub.io plus localhost dev ports 3000, 5173, and 8081
REFHUB_API_AUDIT_DISABLED ← defaults to false
google drive (only when using drive storage flow):
GOOGLE_DRIVE_CLIENT_ID
GOOGLE_DRIVE_CLIENT_SECRET
GOOGLE_DRIVE_REDIRECT_URI
GOOGLE_DRIVE_STATE_SECRET
GOOGLE_DRIVE_TOKEN_SECRET
GOOGLE_DRIVE_FOLDER_NAME ← defaults to "refhub"
GOOGLE_DRIVE_MAX_UPLOAD_BYTES ← defaults to 26214400
for local dev, the backend loads from .env.local then .env before process.env.
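A sketch of how the required variables and documented defaults could be read into one config object. `loadConfig` and `readInt` are hypothetical names, the real `config.js` may be shaped differently, and the `http://localhost:<port>` form of the default origins is an assumption. Loading `.env.local` / `.env` files is left out.

```javascript
// Assumption: the localhost defaults are plain http origins.
const DEFAULT_ALLOWED_ORIGINS = [
  "https://refhub.io",
  "http://localhost:3000",
  "http://localhost:5173",
  "http://localhost:8081",
];

// Parse a positive integer env var, falling back when it is unset.
function readInt(env, name, fallback) {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const value = Number.parseInt(raw, 10);
  if (!Number.isFinite(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer`);
  }
  return value;
}

function loadConfig(env) {
  for (const name of ["SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY", "REFHUB_API_KEY_PEPPER"]) {
    if (!env[name]) throw new Error(`missing required env var: ${name}`);
  }
  const origins = env.REFHUB_API_ALLOWED_ORIGINS
    ? env.REFHUB_API_ALLOWED_ORIGINS.split(",").map((o) => o.trim()).filter(Boolean)
    : DEFAULT_ALLOWED_ORIGINS;

  return {
    supabaseUrl: env.SUPABASE_URL,
    semanticScholar: {
      apiKey: env.SEMANTIC_SCHOLAR_API_KEY || null,
      rateLimitMaxRequests: readInt(env, "SEMANTIC_SCHOLAR_RATE_LIMIT_MAX_REQUESTS", 60),
      rateLimitWindowMs: readInt(env, "SEMANTIC_SCHOLAR_RATE_LIMIT_WINDOW_MS", 60000),
      timeoutMs: readInt(env, "SEMANTIC_SCHOLAR_TIMEOUT_MS", 8000),
    },
    maxBulkItems: readInt(env, "REFHUB_API_MAX_BULK_ITEMS", 50),
    maxBodyBytes: readInt(env, "REFHUB_API_MAX_BODY_BYTES", 52428800),
    allowedOrigins: origins,
    auditDisabled: env.REFHUB_API_AUDIT_DISABLED === "true",
  };
}
```

Failing fast on a missing required variable surfaces misconfiguration at cold start rather than on the first authenticated request.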
reads and writes existing tables — no parallel api store:
vaults
vault_shares
vault_publications
publications
tags
publication_tags
publication_relations
write flow for new items:
- insert canonical row in `publications`
- insert vault-specific copy in `vault_publications`
- attach `publication_tags` against `vault_publication_id`
the function uses the supabase service-role key — rls is not the primary enforcement layer for this path. access control is enforced in-function by:
- api-key hash verification
- scope checks
- explicit vault restriction checks via `api_key_vaults`
- owner/share permission checks before read, write, or export
if this backend moves away from the service-role key, keep these checks and validate rls separately.
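The layered checks can be sketched as a single guard that runs after hash verification has succeeded. Everything here is illustrative: the input shape, the error codes, and the `viewer < editor < owner` ranking are assumptions, not taken from the backend.

```javascript
// Assumed ordering of vault permissions, lowest to highest.
const PERMISSION_RANK = { viewer: 1, editor: 2, owner: 3 };

function checkVaultAccess({ key, vaultId, requiredScope, requiredPermission, permission }) {
  // 1. The key must carry the scope the route requires.
  if (!key.scopes.includes(requiredScope)) {
    return { ok: false, status: 403, code: "insufficient_scope" };
  }
  // 2. A non-empty restriction list narrows the key to the listed vaults.
  if (key.vaultIds.length > 0 && !key.vaultIds.includes(vaultId)) {
    return { ok: false, status: 403, code: "vault_not_allowed" };
  }
  // 3. The key owner must hold enough owner/share permission on the vault.
  const held = PERMISSION_RANK[permission] || 0;
  if (held < PERMISSION_RANK[requiredPermission]) {
    return { ok: false, status: 403, code: "insufficient_permission" };
  }
  return { ok: true };
}
```

Because the service-role key bypasses rls, a guard of this kind has to run on every data route before any query touches vault rows.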
auth: supabase session jwt. returns api keys owned by the authenticated user.
{
"data": [
{
"id": "uuid",
"label": "research_sync_bot",
"description": "local sync job",
"key_prefix": "rhk_a1b2c3d4e5f6",
"scopes": ["vaults:read"],
"expires_at": null,
"revoked_at": null,
"last_used_at": null,
"created_at": "2026-03-24T08:30:00Z",
"vault_ids": ["uuid"]
}
],
"meta": { "request_id": "uuid" }
}
auth: supabase session jwt.
{
"label": "research_sync_bot",
"description": "local sync job",
"scopes": ["vaults:read", "vaults:write"],
"expires_at": "2026-06-22T08:30:00.000Z",
"vault_ids": ["uuid"]
}
rules:
- `label` required
- `scopes` must be a non-empty subset of valid scopes
- `expires_at` optional; must be a future iso-8601 timestamp when present
- `vault_ids` optional; every vault must be accessible to the authenticated user
- response returns plaintext key once as `secret`; only the hash is stored
auth: supabase session jwt. revokes a key owned by the authenticated user. record is soft-revoked via revoked_at — not deleted from storage.
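Because revocation is soft, every verification path has to check `revoked_at` (and `expires_at`) on the stored row. A minimal sketch, with `isKeyActive` as a hypothetical helper name and field names taken from the key listing response:

```javascript
function isKeyActive(key, now = new Date()) {
  // A soft-revoked key keeps its row but carries a revoked_at timestamp.
  if (key.revoked_at != null) return false;
  // expires_at is optional; null means the key does not expire.
  if (key.expires_at != null && new Date(key.expires_at) <= now) return false;
  return true;
}
```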
auth: supabase session jwt only — api keys explicitly rejected.
proxies semantic scholar server-side. applies lightweight per-user rate limiting; may return 429 rate_limit_exceeded. upstream failures returned as sanitized errors.
{ "paper_id": "DOI:10.1101/2020.02.20.958025", "limit": 10 }
- `paper_id` required: semantic scholar paper id or `DOI:<doi>`
- `limit` optional, `1–25`
- successful responses cached briefly in-process to reduce duplicate upstream calls
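The per-user rate limiting mentioned above could be as simple as a fixed-window counter. This is a sketch under that assumption; the actual limiter may use a different algorithm, and `createRateLimiter` is a hypothetical name. The defaults mirror `SEMANTIC_SCHOLAR_RATE_LIMIT_MAX_REQUESTS` and `SEMANTIC_SCHOLAR_RATE_LIMIT_WINDOW_MS`.

```javascript
function createRateLimiter({ maxRequests = 60, windowMs = 60000 } = {}) {
  const windows = new Map(); // userId -> { startedAt, count }

  return function check(userId, now = Date.now()) {
    let entry = windows.get(userId);
    // Start a new window on first use or once the previous one has elapsed.
    if (!entry || now - entry.startedAt >= windowMs) {
      entry = { startedAt: now, count: 0 };
      windows.set(userId, entry);
    }
    if (entry.count >= maxRequests) {
      const retryAfterMs = entry.startedAt + windowMs - now;
      return { allowed: false, status: 429, code: "rate_limit_exceeded", retryAfterMs };
    }
    entry.count += 1;
    return { allowed: true, remaining: maxRequests - entry.count };
  };
}
```

State held in a `Map` lives only as long as the function instance, so a limiter like this is lightweight rather than strict: counts reset whenever the instance is recycled.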
response shape (recommendations · references · citations):
{
"data": [
{
"paper_id": "52cdb6ed946dfed25113bd194d5e2bb843c66331",
"external_ids": { "DOI": "10.1101/2020.11.04.367797" },
"title": "example paper",
"abstract": "...",
"year": 2020,
"venue": "bioRxiv",
"url": "https://www.semanticscholar.org/paper/...",
"citation_count": 42,
"open_access_pdf_url": "https://...",
"authors": [{ "author_id": "12345", "name": "example author" }]
}
],
"meta": { "request_id": "uuid", "paper_id": "DOI:10.1101/...", "limit": 10 }
}
POST /api/v1/citations: same request/response shape as references; returns papers citing the seed paper.
auth: supabase session jwt only — api keys explicitly rejected.
resolves a DOI or title to a Semantic Scholar paper id. DOI lookups are normalized to DOI:<doi> locally; title lookups proxy Semantic Scholar paper search.
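The DOI branch needs no upstream call, which a small sketch makes concrete. `resolveLookup` and the `doi_or_title_required` code are hypothetical, and stripping a pasted `https://doi.org/` or `doi:` prefix is an extra tolerance assumed here, not documented behaviour.

```javascript
function resolveLookup(body) {
  if (typeof body.doi === "string" && body.doi.trim() !== "") {
    // Assumed leniency: accept resolver URLs and a leading "doi:" label.
    const bare = body.doi
      .trim()
      .replace(/^https?:\/\/(dx\.)?doi\.org\//i, "")
      .replace(/^doi:/i, "");
    return { queryType: "doi", paperId: `DOI:${bare}` };
  }
  if (typeof body.title === "string" && body.title.trim() !== "") {
    // Titles cannot be resolved locally; they need an upstream paper search.
    return { queryType: "title", searchQuery: body.title.trim() };
  }
  return { error: "doi_or_title_required" };
}
```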
{ "doi": "10.1145/3544548.3580907" }
{ "title": "attention is all you need" }
response:
{
"data": { "paper_id": "DOI:10.1145/3544548.3580907" },
"meta": { "request_id": "uuid", "query_type": "doi" }
}
auth: supabase session jwt only; api keys explicitly rejected.
server-side Semantic Scholar topic/paper search for empty-vault discovery. Uses Semantic Scholar /graph/v1/paper/search/bulk with compact paper fields and citationCount:desc sorting. Applies the same per-user Semantic Scholar rate limit, timeout, cache, stale-cache fallback on upstream 429, and sanitized error shape as the other Semantic Scholar routes.
{ "query": "visual analytics provenance", "limit": 25 }
- `query` required, minimum 2 characters
- `limit` optional, `1–25`
- successful responses use the normalized paper shape shown above
- `SEMANTIC_SCHOLAR_API_KEY` is optional but recommended; unauthenticated Semantic Scholar traffic uses a shared public pool and can return upstream `429`
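The cache plus stale-cache fallback on upstream `429` can be sketched as follows. All names are hypothetical, the 60-second TTL is an assumed value, and `fetchUpstream` is a synchronous stand-in returning `{ status, data }` so the flow stays easy to follow; a real handler would await the upstream request.

```javascript
function createStaleCache({ ttlMs = 60000 } = {}) {
  const entries = new Map(); // cacheKey -> { value, storedAt }
  return {
    set(key, value, now = Date.now()) {
      entries.set(key, { value, storedAt: now });
    },
    // Only returns entries younger than the TTL.
    getFresh(key, now = Date.now()) {
      const entry = entries.get(key);
      return entry && now - entry.storedAt < ttlMs ? entry.value : undefined;
    },
    // Returns any entry, however old, for use as a fallback.
    getStale(key) {
      const entry = entries.get(key);
      return entry ? entry.value : undefined;
    },
  };
}

function searchWithFallback(cache, key, fetchUpstream, now = Date.now()) {
  const fresh = cache.getFresh(key, now);
  if (fresh !== undefined) return { data: fresh, source: "cache" };

  const response = fetchUpstream();
  if (response.status === 200) {
    cache.set(key, response.data, now);
    return { data: response.data, source: "upstream" };
  }
  // Upstream is rate limiting: prefer an expired answer over an error.
  const stale = cache.getStale(key);
  if (response.status === 429 && stale !== undefined) {
    return { data: stale, source: "stale_cache" };
  }
  return { error: { status: response.status } };
}
```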
auth: supabase session jwt only — api keys explicitly rejected.
resolves a DOI against semantic scholar and returns structured metadata. applies per-user rate limiting; may return 429 rate_limit_exceeded. results cached briefly in-process.
{ "doi": "10.1145/3544548.3580907" }
- `doi` required: bare doi string (no `https://doi.org/` prefix)
{
"data": {
"title": "example paper",
"authors": ["author one", "author two"],
"year": 2023,
"journal": "CHI",
"doi": "10.1145/3544548.3580907",
"url": "https://doi.org/10.1145/3544548.3580907",
"abstract": "...",
"type": "inproceedings"
},
"meta": { "request_id": "uuid", "doi": "10.1145/3544548.3580907" }
}
data is null when the DOI is not found in semantic scholar. type is one of article · inproceedings · book · thesis · report.
auth: supabase session jwt only unless noted.
- `GET /api/v1/google-drive`: linked status, folder status, folder id/name.
- `POST /api/v1/google-drive/connect`: returns a Google OAuth `authorization_url`; accepts optional `return_to`.
- `GET /api/v1/google-drive/callback`: OAuth callback target, no bearer token; redirects back to the configured RefHub UI.
- `POST /api/v1/google-drive/folder`: ensure/recreate the managed Drive folder.
- `DELETE /api/v1/google-drive`: disconnect stored Drive credentials and best-effort revoke the refresh token.
- `GET /api/v1/extension/google-drive-status`: compact status shape for the browser extension/data route auth path.
POST /api/v1/pdf-metadata accepts { "source_url": "https://...pdf" } plus optional cookie_header/referer, fetches the PDF server-side, and returns best-effort DOI/title/authors/year/journal metadata. It returns empty metadata with a fetch_skipped note when the source PDF is not server-accessible instead of throwing a hard 500.
POST /api/v1/vaults/:vaultId/items/:itemId/pdf uploads or fetches a PDF for a vault item and stores it in linked Google Drive. Body can be raw application/pdf bytes or JSON with source_url plus optional cookie_header/referer. Raw PDF bodies are intentionally capped at the smallest of REFHUB_API_MAX_BODY_BYTES, GOOGLE_DRIVE_MAX_UPLOAD_BYTES, and the Netlify synchronous Function payload ceiling (6 MiB); larger PDFs should use the resumable session flow below so bytes go directly from browser to Google Drive.
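The "smallest of three" cap works out as below. The function names are hypothetical; the defaults are the documented ones, and the 6 MiB ceiling is the figure quoted above.

```javascript
const NETLIFY_SYNC_PAYLOAD_CEILING_BYTES = 6 * 1024 * 1024; // 6 MiB

function effectiveRawPdfLimit({
  maxBodyBytes = 52428800, // REFHUB_API_MAX_BODY_BYTES default
  driveMaxUploadBytes = 26214400, // GOOGLE_DRIVE_MAX_UPLOAD_BYTES default
} = {}) {
  return Math.min(maxBodyBytes, driveMaxUploadBytes, NETLIFY_SYNC_PAYLOAD_CEILING_BYTES);
}

// Decide whether a client should POST raw bytes or open a resumable session.
function chooseUploadPath(pdfSizeBytes, limits) {
  return pdfSizeBytes <= effectiveRawPdfLimit(limits) ? "raw_post" : "resumable_session";
}
```

With the default settings the platform ceiling is the binding limit, so raw bodies above 6291456 bytes fall to the session flow.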
POST /api/v1/vaults/:vaultId/items/:itemId/pdf/session creates a browser-side Google Drive resumable upload session. The API forwards the validated request Origin to Google when creating the session so browser-direct PUTs receive matching Drive CORS headers. If REFHUB_API_ALLOWED_ORIGINS is set explicitly, include every dev origin you use, e.g. http://localhost:8081; arbitrary origins are not reflected.
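A sketch of the origin validation implied here: the request `Origin` is forwarded only when it matches the allow list exactly, and is never echoed back otherwise. `resolveAllowedOrigin` is a hypothetical name.

```javascript
// Returns the origin to forward, or null when it is not allow-listed.
function resolveAllowedOrigin(requestOrigin, allowedOrigins) {
  if (typeof requestOrigin !== "string" || requestOrigin === "") return null;
  let parsed;
  try {
    parsed = new URL(requestOrigin);
  } catch {
    return null; // not a parseable origin
  }
  // Compare the serialized origin so paths or trailing slashes are ignored.
  return allowedOrigins.includes(parsed.origin) ? parsed.origin : null;
}
```

An explicit list such as `["https://refhub.io", "http://localhost:8081"]` would accept those two origins and reject `http://localhost:3000`, which is why every dev origin has to be listed once the variable is set.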
POST /api/v1/vaults/:vaultId/items/:itemId/pdf/complete records a browser-completed Drive upload. Body must include file_id; web_view_link and source_url are optional.
auth: supabase session jwt only. uploads a PDF for a publication the authenticated user owns and stores it in their linked Google Drive. records the asset in publication_pdf_assets.
request body: raw application/pdf bytes.
- publication must be owned by the authenticated user
- google drive must be linked for the account
- file size capped at `GOOGLE_DRIVE_MAX_UPLOAD_BYTES` (default 26214400 bytes, 25 MiB)
{
"data": {
"attempted": true,
"stored": true,
"provider": "google_drive",
"fileId": "1BcnjrInjOGsnM142vNlg4KNMriooAc9u",
"folderId": "1-HxdtCdUxv03KEN-syenBc18fvnRl9EW",
"folderName": "refhub",
"pdfUrl": "https://drive.google.com/file/d/.../view?usp=drivesdk",
"sourceUrl": null
},
"meta": { "request_id": "uuid" }
}
errors: 404 publication_not_found · 503 drive_not_linked · 502 drive_upload_failed.
requires migration 20260513000000_publication_pdf_assets_library_uploads.sql — drops the NOT NULL constraint on vault_publication_id and adds the partial unique index (publication_id, storage_provider) WHERE publication_id IS NOT NULL AND vault_publication_id IS NULL.
scope: vaults:read. returns vaults accessible through ownership or explicit share, narrowed by api_key_vaults when set.
{
"data": [
{
"id": "uuid",
"name": "ai reading list",
"visibility": "private",
"permission": "owner",
"item_count": 12,
"updated_at": "2026-03-23T18:00:00Z"
}
],
"meta": { "request_id": "uuid" }
}
scope: vaults:read. returns vault metadata + vault_publications + vault-scoped tags + publication_tags + publication_relations.
POST /api/v1/vaults, PATCH /api/v1/vaults/:vaultId, DELETE /api/v1/vaults/:vaultId, PATCH /api/v1/vaults/:vaultId/visibility, and /shares routes require vaults:admin or an authenticated management user with owner/admin access.
Tag, relation, search, stats, changes, import, and audit routes are implemented under src/routes/*. They use the same vault access resolver and scope model:
- read/search/stats/changes/audit: `vaults:read`
- add/update/delete items, tags, relations, imports: `vaults:write`
- export: `vaults:export`
- vault CRUD/visibility/shares: `vaults:admin`
scope: vaults:write · permission: editor.
{
"items": [
{
"title": "attention is all you need",
"authors": ["ashish vaswani"],
"year": 2017,
"publication_type": "article",
"doi": "10.48550/arXiv.1706.03762",
"tag_ids": ["uuid"]
}
]
}
- bulk insert supported
- `tag_ids` must already exist in the vault; no implicit tag creation
- requests above `REFHUB_API_MAX_BODY_BYTES` rejected with `413`
- pre-validates full batch and attempts rollback on insert failure
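Pre-validating the full batch could look like the sketch below. `validateItemBatch` and its error codes are hypothetical, and treating a non-empty `title` as the minimum required field is an assumption. The default of 50 mirrors `REFHUB_API_MAX_BULK_ITEMS`.

```javascript
// Validate every item before inserting anything, so one bad item rejects the request.
function validateItemBatch(items, vaultTagIds, maxBulkItems = 50) {
  if (!Array.isArray(items) || items.length === 0) {
    return { ok: false, errors: [{ index: null, code: "items_required" }] };
  }
  if (items.length > maxBulkItems) {
    return { ok: false, errors: [{ index: null, code: "too_many_items" }] };
  }
  const known = new Set(vaultTagIds);
  const errors = [];
  items.forEach((item, index) => {
    // Assumption: a non-empty title is the minimum required field.
    if (typeof item.title !== "string" || item.title.trim() === "") {
      errors.push({ index, code: "title_required" });
    }
    for (const tagId of item.tag_ids || []) {
      // Tags are never created implicitly; unknown ids fail validation.
      if (!known.has(tagId)) errors.push({ index, code: "unknown_tag", tagId });
    }
  });
  return errors.length === 0 ? { ok: true } : { ok: false, errors };
}
```

Collecting all errors with their item index lets a caller fix the whole batch in one round trip, and validating first keeps the rollback path for genuine insert failures only.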
scope: vaults:write · permission: editor. partial update. if tag_ids is present it replaces the full tag set.
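The replace semantics for `tag_ids` can be sketched as a set difference. `planTagReplacement` is a hypothetical helper, and reading an empty array as "clear all tags" is an interpretation of the rule above rather than confirmed behaviour.

```javascript
// Compute the attach/detach operations needed to reach the requested tag set.
function planTagReplacement(currentTagIds, patch) {
  // tag_ids absent: leave tags untouched. tag_ids present: replace the full set.
  if (!Object.prototype.hasOwnProperty.call(patch, "tag_ids")) {
    return { attach: [], detach: [] };
  }
  const current = new Set(currentTagIds);
  const next = new Set(patch.tag_ids);
  return {
    attach: [...next].filter((id) => !current.has(id)),
    detach: [...current].filter((id) => !next.has(id)),
  };
}
```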
scope: vaults:export · permission: viewer. supported formats: json · bibtex.
each request writes one audit row with:
api_key_id • owner_user_id • vault_id • method • path
response_status • request_id • latency • caller_ip • user_agent
best-effort — must not block successful api responses. failures emitted to function logs only.
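A sketch of the best-effort pattern: build the row from the fields listed above, then write it through a wrapper that never rejects. The function names and the input shape are hypothetical, `insertRow` stands in for whatever performs the database insert, and measuring `latency` in milliseconds is an assumption.

```javascript
function buildAuditRow(request, now = Date.now()) {
  return {
    api_key_id: request.apiKeyId ?? null,
    owner_user_id: request.ownerUserId ?? null,
    vault_id: request.vaultId ?? null,
    method: request.method,
    path: request.path,
    response_status: request.status,
    request_id: request.requestId,
    latency: now - request.startedAt, // assumed to be milliseconds
    caller_ip: request.callerIp ?? null,
    user_agent: request.userAgent ?? null,
  };
}

// Resolves to true or false; audit failures are logged and swallowed.
async function writeAuditBestEffort(insertRow, row, log = console.error) {
  try {
    await insertRow(row);
    return true;
  } catch (error) {
    log("audit write failed", { request_id: row.request_id, message: error.message });
    return false;
  }
}
```

Since the wrapper cannot reject, a handler can send its response regardless of whether the audit insert succeeds.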