GitHub - rba100/jina-firecrawl-api

Jina Firecrawl API Replacement for LibreChat

This project is a drop-in replacement for Firecrawl, designed for LibreChat, but powered by Jina.AI. It exposes a /v1/scrape endpoint compatible with Firecrawl's API.

Usage

In LibreChat, set FIRECRAWL_API_URL to http://localhost:3002/ (or wherever you host this service) and FIRECRAWL_API_KEY to your Jina.AI key.

Run directly from GitHub Container Registry (recommended):

docker run -d -p 3002:8080 ghcr.io/rba100/jina-firecrawl-api:latest

This pulls and runs the latest published image from ghcr.io. It needs no configuration: just use your Jina API key with it.

Or run with Docker Compose (builds locally):

docker compose up -d

Or run directly (requires .NET 9):

dotnet run

The API will be available at http://localhost:3002 by default.

How it works

For most URLs, requests are proxied to Jina.AI's r.jina.ai service. You must provide your Jina API key as the Bearer token in the Authorization header; this key is forwarded to Jina for those requests.
For PDF URLs, the service downloads and parses the PDF directly. Any API key is accepted for PDF requests.
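The two paths above can be sketched roughly as follows. This is an illustration only, not the project's actual code: `looks_like_pdf` and `plan_request` are hypothetical names, and detecting PDFs by a `.pdf` path suffix is an assumption (the real service may decide differently, for example by content type).

```python
from urllib.parse import urlparse

# Jina's reader endpoint; the target URL is appended to it.
JINA_READER_BASE = "https://r.jina.ai/"


def looks_like_pdf(url: str) -> bool:
    # Assumption: a URL counts as a PDF when its path ends in ".pdf".
    return urlparse(url).path.lower().endswith(".pdf")


def plan_request(url: str, api_key: str) -> dict:
    """Describe how a scrape request for `url` would be handled."""
    if looks_like_pdf(url):
        # PDFs are downloaded and parsed directly; the key is not forwarded.
        return {"handler": "local_pdf", "fetch_url": url, "headers": {}}
    # Everything else is proxied to Jina with the caller's key as Bearer token.
    return {
        "handler": "jina",
        "fetch_url": JINA_READER_BASE + url,
        "headers": {"Authorization": f"Bearer {api_key}"},
    }
```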

Configuration

Timeout

You can configure the timeout for scraping operations (in seconds) via either appsettings.json or the SCRAPE__TIMEOUTSECONDS environment variable. The default is 15 seconds.

Environment variable:
Set SCRAPE__TIMEOUTSECONDS (note the double underscore) to your desired timeout value. This is the .NET convention for mapping environment variables to configuration sections and properties (e.g., Scrape:TimeoutSeconds in appsettings.json maps to SCRAPE__TIMEOUTSECONDS as an environment variable).

appsettings.json:
Add or edit the following section:
```
"Scrape": {
  "TimeoutSeconds": 20
}
```
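To make the naming rule concrete, here is a small sketch of how a configuration path turns into an environment variable name. `env_var_name` is a hypothetical helper written for illustration; it is not part of this project or of .NET. Upper-casing is only a convention, since .NET treats configuration keys case-insensitively.

```python
def env_var_name(config_key: str) -> str:
    """Map a .NET configuration path to its environment variable form.

    Each ":" separating a section from a property becomes a double underscore.
    """
    return config_key.replace(":", "__").upper()


# The timeout setting described above.
print(env_var_name("Scrape:TimeoutSeconds"))
```

The resulting name is the one you would set in the container or shell environment before starting the service.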

This timeout also controls the "fallback" timeout passed to Jina. If a page takes too long to load with JavaScript execution, Jina will abort the browser-based scrape and fall back to scraping the raw HTML (without JS execution), which is much faster.

Tradeoff:
The fallback improves speed for slow or problematic pages, but may return incomplete or less accurate content for sites that are slow to load or require JavaScript to render important information.

Endpoint

POST /v1/scrape

Request body:

{
  "url": "<URL_TO_SCRAPE>"
}

Headers:

Authorization: Bearer <your-jina-api-key>

Response:

On success: returns markdown (and empty html) in the data field, plus metadata.
On error: returns success: false and error metadata.
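A minimal client sketch for this endpoint, using only the Python standard library. The helper names `build_scrape_request` and `extract_markdown` are hypothetical, and the response shape (a `success` flag and a `data` object holding `markdown`) is assumed from the description above rather than taken from the service's code.

```python
import json
import urllib.request


def build_scrape_request(base_url: str, api_key: str, target_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/scrape request."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/scrape",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_markdown(payload: dict) -> str:
    """Return the markdown from a decoded response, or raise on an error response."""
    if payload.get("success") is False:
        raise RuntimeError(f"scrape failed: {payload}")
    # Assumed layout: markdown sits inside the "data" field.
    return payload["data"]["markdown"]
```

To send the request, pass it to `urllib.request.urlopen(req, timeout=30)`, decode the body with `json.load`, and hand the result to `extract_markdown`. The client-side timeout of 30 seconds is an arbitrary choice that leaves headroom over the service's default 15-second scrape timeout.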
