TheWebScrapingClub/scraping-wiki

The Web Scraping Wiki

A structured, LLM-maintained knowledge base on anti-bot systems, scraping tools, browser fingerprinting, and proxy infrastructure. It compiles what we have tested across more than 300 articles published by The Web Scraping Club since 2022, cross-referenced and updated daily.

New here? Read About.md for what this is, where the sources come from, how it is maintained, and how to open it as an Obsidian vault.

The Web Scraping Wiki is free to read and stays that way thanks to the companies below. If you want your product in front of a technical audience that scrapes for a living, get in touch.

Great Sponsors

Your company here: a short description of what you offer and why it matters to people scraping the web.

Want your logo here? Reach out and we will add you.

Awesome Sponsors

Want your logo here? Reach out and we will add you.

Index

The full, page-by-page catalog lives in index.md, regenerated on every daily run. The shortcuts below jump straight to each section.

  • Entities (117) — tools, libraries, stealth browsers, anti-bot products, and proxy networks, one page each.
  • Concepts (29) — detection techniques, patterns, and domain ideas, with the signals they rely on.
  • Comparisons (3) — side-by-side analyses when the question is "which one and when?".
  • Timelines (1) — how a moving target evolved across sources over time.
  • Canvases (1) — JSON Canvas landscape maps, rendered as interactive whiteboards in Obsidian.
  • Views (4) — Obsidian Bases live queries that surface pages by frontmatter.

For everything else (sources, maintenance pipeline, Obsidian setup, error reporting, license), see About.md.
