`node-re2` provides Node.js bindings for RE2: a fast, safe alternative to backtracking regular expression engines. The npm package name is `re2`. It is a C++ native addon built with `node-gyp` and `nan`.
For project structure, module dependencies, and the architecture overview see ARCHITECTURE.md. For detailed usage docs see the README and the wiki.
This project uses git submodules for vendored dependencies (RE2 and Abseil):
git clone --recursive git@github.com:uhop/node-re2.git
cd node-re2
npm install

If a prebuilt artifact cannot be downloaded, the native addon is built locally via `node-gyp`.
- Install: `npm install` (downloads prebuilt artifact or builds from source)
- Build (release): `npm run rebuild` (or `node-gyp -j max rebuild`)
- Build (debug): `npm run rebuild:dev` (or `node-gyp -j max rebuild --debug`)
- Test: `npm test` (runs `tape6 --flags FO`, worker threads)
- Test (sequential): `npm run test:seq`
- Test (multi-process): `npm run test:proc`
- Test (single file): `node tests/test-<name>.mjs`
- TypeScript check: `npm run ts-check`
- Lint: `npm run lint` (Prettier check)
- Lint fix: `npm run lint:fix` (Prettier write)
- Verify build: `npm run verify-build`
node-re2/
├── package.json # Package config; "tape6" section configures test discovery
├── binding.gyp # node-gyp build configuration for the C++ addon
├── re2.js # Main entry point: loads native addon, sets up Symbol aliases
├── re2.d.ts # TypeScript declarations for the public API
├── tsconfig.json # TypeScript config (noEmit, strict, types: ["node"])
├── lib/ # C++ source code (native addon)
│ ├── addon.cc # Node.js addon initialization, method registration
│ ├── wrapped_re2.h # WrappedRE2 class definition (core C++ wrapper)
│ ├── wrapped_re2_set.h # WrappedRE2Set class definition (RE2.Set wrapper)
│ ├── isolate_data.h # Per-isolate data struct for thread-safe addon state
│ ├── new.cc # Constructor: parse pattern/flags, create RE2 instance
│ ├── exec.cc # RE2.prototype.exec() implementation
│ ├── test.cc # RE2.prototype.test() implementation
│ ├── match.cc # RE2.prototype.match() implementation
│ ├── replace.cc # RE2.prototype.replace() implementation
│ ├── search.cc # RE2.prototype.search() implementation
│ ├── split.cc # RE2.prototype.split() implementation
│ ├── to_string.cc # RE2.prototype.toString() implementation
│ ├── accessors.cc # Property accessors (source, flags, lastIndex, etc.)
│ ├── pattern.cc # Pattern translation (RegExp → RE2 syntax, Unicode classes)
│ ├── set.cc # RE2.Set implementation (multi-pattern matching)
│ ├── util.cc # Shared utilities (UTF-8/UTF-16 conversion, buffer helpers)
│ ├── util.h # Utility declarations
│ └── pattern.h # Pattern translation declarations
├── scripts/
│ └── verify-build.js # Quick smoke test for the built addon
├── tests/ # Test files (test-*.mjs using tape-six)
├── ts-tests/ # TypeScript type-checking tests
│ └── test-types.ts # Verifies type declarations compile correctly
├── bench/ # Benchmarks
├── vendor/ # Vendored C++ dependencies (git submodules)
│ ├── re2/ # Google RE2 library source
│ └── abseil-cpp/ # Abseil C++ library (RE2 dependency)
└── .github/ # CI workflows, Dependabot config, actions
- CommonJS throughout (`"type": "commonjs"` in `package.json`).
- No transpilation: JavaScript code runs directly.
- C++ code uses tabs for indentation, 4-wide. JavaScript uses 2-space indentation.
- Prettier for JS/TS formatting (see `.prettierrc`): 80 char width, single quotes, no bracket spacing, no trailing commas, arrow parens "avoid".
- nan (Native Abstractions for Node.js) for the C++ addon API.
- Semicolons are enforced by Prettier (default `semi: true`).
- Imports use `require()` syntax in source, `import` in tests (`.mjs`).
- Do not modify vendored code. Never edit files under `vendor/`. They are git submodules.
- Do not modify or delete test expectations without understanding why they changed.
- Do not add comments or remove comments unless explicitly asked.
- Keep `re2.js` and `re2.d.ts` in sync. All public API exposed from `re2.js` must be typed in `re2.d.ts`.
- The addon must build on all supported platforms: Linux (x64, arm64, Alpine), macOS (x64, arm64), Windows (x64, arm64).
- RE2 is always Unicode-mode. The `u` flag is always added implicitly.
- Buffer support is a first-class feature. All methods that accept strings must also accept Buffers, returning Buffers when given Buffer input.
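
The Buffer rule can be read as a type-preserving contract: string in, string out; Buffer in, Buffer out. The sketch below is not the addon's implementation (the source states only the rule). It uses the built-in `RegExp` as a stand-in engine and a hypothetical helper, `replacePreservingType`, and it assumes Buffers hold UTF-8 text.

```javascript
'use strict';

// Hypothetical helper illustrating the "Buffer in, Buffer out" contract.
// The built-in RegExp stands in for the engine; UTF-8 Buffers are assumed.
function replacePreservingType(re, input, replacement) {
  if (Buffer.isBuffer(input)) {
    // Decode, run the string-based operation, then re-encode the result.
    const text = input.toString('utf8');
    return Buffer.from(text.replace(re, replacement), 'utf8');
  }
  return String(input).replace(re, replacement);
}

const fromString = replacePreservingType(/b+/g, 'abbc', '-');
const fromBuffer = replacePreservingType(/b+/g, Buffer.from('abbc'), '-');

console.log(typeof fromString, fromString); // string a-c
console.log(Buffer.isBuffer(fromBuffer), fromBuffer.toString()); // true a-c
```

A test for a new method can check both branches the same way: assert the result type first, then compare the decoded content.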
- `re2.js` is the main entry point. It loads the native C++ addon from `build/Release/re2.node` and sets up `Symbol.match`, `Symbol.search`, `Symbol.replace`, `Symbol.split`, and `Symbol.matchAll` on the prototype.
- The C++ addon (`lib/*.cc`) wraps Google's RE2 library via nan. Each RegExp method has its own `.cc` file.
- `lib/new.cc` handles construction: parsing patterns, translating RegExp syntax to RE2 syntax (via `lib/pattern.cc`), and creating the underlying `re2::RE2` instance.
- `lib/pattern.cc` translates JavaScript RegExp features to RE2 equivalents, including Unicode class names (`\p{Letter}` → `\p{L}`, `\p{Script=Latin}` → `\p{Latin}`).
- `lib/set.cc` implements `RE2.Set` for multi-pattern matching using `re2::RE2::Set`.
- `lib/util.cc` provides UTF-8 ↔ UTF-16 conversion helpers and buffer utilities.
- Prebuilt native artifacts are hosted on GitHub Releases and downloaded at install time via `install-artifact-from-github`.
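
The well-known symbols installed on the prototype are what let built-in string methods such as `String.prototype.search` or `String.prototype.replace` delegate to a non-`RegExp` object. The following sketch shows that language mechanism with a toy class, `LiteralFinder`, which is hypothetical and unrelated to the real `RE2` class: it only finds a literal substring.

```javascript
'use strict';

// Toy stand-in for a regex-like object (hypothetical; not the RE2 class).
// Any object exposing these well-known symbols plugs into String methods.
class LiteralFinder {
  constructor(needle) {
    this.needle = needle;
  }

  // Invoked by String.prototype.search(finder).
  [Symbol.search](str) {
    return str.indexOf(this.needle);
  }

  // Invoked by String.prototype.replace(finder, replacement).
  // Replaces only the first occurrence, like a non-global regular expression.
  [Symbol.replace](str, replacement) {
    const at = str.indexOf(this.needle);
    if (at < 0) return str;
    return str.slice(0, at) + replacement + str.slice(at + this.needle.length);
  }

  // Invoked by String.prototype.split(finder, limit).
  [Symbol.split](str, limit) {
    return str.split(this.needle, limit);
  }
}

const finder = new LiteralFinder('-');
console.log('a-b-c'.search(finder)); // 1
console.log('a-b-c'.replace(finder, '+')); // a+b-c
console.log('a-b-c'.split(finder)); // [ 'a', 'b', 'c' ]
```

Because the string methods look these symbols up on the argument, a change to one of the aliased methods affects both the direct call and the corresponding `String.prototype` call path.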
import test from 'tape-six';
import {RE2} from '../re2.js';
test('example', t => {
const re = new RE2('a(b*)', 'i');
const result = re.exec('aBbC');
t.ok(result);
t.equal(result[0], 'aBb');
t.equal(result[1], 'Bb');
});

- Test files use `tape-six`: `.mjs` for runtime tests, `.ts` for TypeScript typing tests.
- Test file naming convention: `test-*.mjs` in `tests/`, `test-*.ts` in `ts-tests/`.
- Tests are configured in `package.json` under the `"tape6"` section.
- Test files should be directly executable: `node tests/test-foo.mjs`.
- The library is a drop-in replacement for `RegExp`: the `RE2` object emulates the standard `RegExp` API.
- `RE2.Set` provides multi-pattern matching: `new RE2.Set(patterns, flags, options)`.
- Static helpers: `RE2.getUtf8Length(str)`, `RE2.getUtf16Length(buf)`.
- `RE2.unicodeWarningLevel` controls behavior when non-Unicode regexps are created.
- The `install` script tries to download a prebuilt `.node` artifact before falling back to `node-gyp rebuild`.
- All C++ source is in `lib/`, all vendored third-party C++ is in `vendor/`.
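
The two static helpers relate string length to UTF-8 byte length. Assuming `RE2.getUtf8Length(str)` reports the bytes needed to encode a string as UTF-8 and `RE2.getUtf16Length(buf)` reports the UTF-16 code units needed to decode a UTF-8 Buffer (an interpretation worth confirming against the README), the same quantities can be computed in plain Node.js for valid input. The function names below are hypothetical and exist only to cross-check results by hand.

```javascript
'use strict';

// Plain Node.js equivalents (hypothetical names); valid input is assumed.

// Bytes needed to store `str` as UTF-8.
function utf8LengthOf(str) {
  return Buffer.byteLength(str, 'utf8');
}

// UTF-16 code units needed to hold the text stored in a UTF-8 `buf`.
function utf16LengthOf(buf) {
  return buf.toString('utf8').length;
}

// "h", e-acute, "llo", space, then one emoji outside the BMP.
const sample = 'h\u00e9llo \u{1F600}';

console.log(sample.length); // 8
console.log(utf8LengthOf(sample)); // 11
console.log(utf16LengthOf(Buffer.from(sample, 'utf8'))); // 8
```

The two counts differ whenever the text contains non-ASCII characters, which is why offsets such as `lastIndex` need care when switching between string and Buffer input.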