A lightweight and powerful Parser Combinator library for Rust, ported from the C library mpc.
mpc-rs is a parser combinator library that allows you to build powerful parsers using simple, composable building blocks. Parser combinators are functions that take parsers as input and return new parsers as output, enabling you to construct complex parsers from simple ones.
You should consider using mpc-rs when you need to:
- Build a new programming language
- Parse a new data format
- Parse existing programming languages
- Parse existing data formats
- Embed a domain-specific language
- Implement complex text processing
Key features:
- Type-Generic: Works with any Rust type through Box<dyn Any>
- Predictive, Recursive Descent: Efficient parsing with backtracking
- Easy to Integrate: Single library crate
- Automatic Error Reporting: Detailed error messages with position info
- Memory Safe: Leverages Rust's ownership system
- Composable: Build complex parsers from simple ones
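To make the type-generic point concrete, here is a minimal standalone sketch of how values of different types can travel through one `Box<dyn Any>` slot and be recovered with `downcast_ref`. It uses only the standard library; the names `Val`, `as_i32`, `as_str`, and `sample_values` are hypothetical and are not part of mpc-rs.

```rust
use std::any::Any;

/// A type-erased result slot (illustrative alias, not the library's type).
type Val = Box<dyn Any>;

/// Reads the value back as an `i32`, if that is what the box holds.
fn as_i32(v: &Val) -> Option<i32> {
    v.downcast_ref::<i32>().copied()
}

/// Reads the value back as a string slice, if the box holds a `String`.
fn as_str(v: &Val) -> Option<&str> {
    v.downcast_ref::<String>().map(String::as_str)
}

/// Builds a heterogeneous list of results, as a sequence of parsers might.
fn sample_values() -> Vec<Val> {
    let vals: Vec<Val> = vec![Box::new(42_i32), Box::new(String::from("hello"))];
    vals
}
```

A wrong guess about the stored type yields `None` rather than a panic, which is why the examples in this README that call `.unwrap()` on a downcast rely on knowing what each parser produces.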
Add this to your Cargo.toml:
[dependencies]
mpc = { path = "../mpc-rs" } # Adjust path as needed
Or, for a published version (when available):
[dependencies]
mpc = "0.1"
Here's how to create a simple mathematical expression parser:
use mpc::*;
fn main() {
    // Define basic parsers
    let number = mpc_digits();
    let plus = mpc_char('+');
    // NOTE: `expression()` is not defined in this snippet; it is assumed to be
    // a helper that returns the (recursive) expression parser.
    // A term is a number or a parenthesized expression. The fold keeps the
    // inner expression (index 1) rather than the opening parenthesis.
    let term = mpc_or(vec![
        number.clone(),
        mpc_and(vec![mpc_char('('), expression(), mpc_char(')')], |_, xs| xs[1].clone()),
    ]);
    // An expression is `term + expression` or a single term
    let expression = mpc_or(vec![
        mpc_and(vec![term.clone(), plus.clone(), expression()], |_, xs| {
            // Combine: term + expr
            Box::new(format!(
                "(+ {} {})",
                xs[0].downcast_ref::<String>().unwrap(),
                xs[2].downcast_ref::<String>().unwrap()
            ))
        }),
        term,
    ]);
    // Parse some input (no spaces: this grammar does not skip whitespace)
    match mpc_parse("input", "(4+2)", &expression) {
        MpcResult::Ok(result) => {
            println!("Parsed: {:?}", result.downcast_ref::<String>());
        }
        MpcResult::Err(err) => {
            err.print();
        }
    }
}
A parser is a function that takes input and returns either a successfully parsed value or an error. Every parser is an MpcParser value whose behavior is selected by its MpcParserType variant.
Combinators take parsers and return new parsers:
- mpc_and(): Sequence multiple parsers
- mpc_or(): Try alternatives
- mpc_many(): Zero or more repetitions
- mpc_many1(): One or more repetitions
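The four combinators above can be illustrated with a small standalone interpreter. This is a sketch of the general technique only, not the mpc-rs implementation: the enum `P` and the functions `run` and `many` are hypothetical, and results are simply the matched text rather than `Box<dyn Any>` values.

```rust
/// Hypothetical parser description; each variant is interpreted by `run`.
enum P {
    Char(char),    // one specific character
    And(Vec<P>),   // every parser in sequence
    Or(Vec<P>),    // first alternative that matches
    Many(Box<P>),  // zero or more repetitions
    Many1(Box<P>), // one or more repetitions
}

/// Repeats `q` until it fails, collecting the matched text.
fn many<'a>(q: &P, input: &'a str) -> (String, &'a str) {
    let mut out = String::new();
    let mut rest = input;
    while let Some((s, r)) = run(q, rest) {
        if r.len() == rest.len() {
            break; // `q` matched without consuming input; stop to avoid looping forever
        }
        out.push_str(&s);
        rest = r;
    }
    (out, rest)
}

/// Runs `p` on `input`, returning the matched text and the remaining input.
fn run<'a>(p: &P, input: &'a str) -> Option<(String, &'a str)> {
    match p {
        P::Char(c) => {
            let rest = input.strip_prefix(*c)?;
            Some((c.to_string(), rest))
        }
        P::And(ps) => {
            let mut out = String::new();
            let mut rest = input;
            for q in ps {
                // Any failure aborts the whole sequence.
                let (s, r) = run(q, rest)?;
                out.push_str(&s);
                rest = r;
            }
            Some((out, rest))
        }
        // Each alternative restarts from the same position (backtracking).
        P::Or(ps) => ps.iter().find_map(|q| run(q, input)),
        P::Many(q) => Some(many(q, input)),
        P::Many1(q) => {
            let (first, rest) = run(q, input)?;
            let (more, rest) = many(q, rest);
            Some((first + &more, rest))
        }
    }
}
```

For instance, `P::And(vec![P::Many1(Box::new(P::Or(vec![P::Char('a'), P::Char('b')]))), P::Char('!')])` accepts one or more `a`/`b` characters followed by `!`. Note that the zero-or-more form always succeeds, possibly with an empty match, while the one-or-more form can fail.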
Parsing returns an MpcResult:
pub enum MpcResult {
    Ok(MpcVal),  // Box<dyn Any> containing parsed value
    Err(MpcErr), // Error with position and expected tokens
}
Character and string parsers:
| Function | Description | Example |
|---|---|---|
| mpc_any() | Matches any single character | |
| mpc_char(c) | Matches specific character | mpc_char('a') |
| mpc_range(s, e) | Matches character in range | mpc_range('0', '9') |
| mpc_oneof(s) | Matches any char in string | mpc_oneof("abc") |
| mpc_noneof(s) | Matches any char not in string | mpc_noneof(" \t\n") |
| mpc_satisfy(f) | Matches char satisfying function | mpc_satisfy(...) |
| mpc_string(s) | Matches exact string | mpc_string("hello") |
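Most of the single-character parsers in the table above can be viewed as one predicate-driven primitive with different predicates. The following standalone sketch shows that idea with plain functions; the names `satisfy`, `any`, `range`, `one_of`, and `none_of` are illustrative and their signatures are assumptions, not the mpc-rs API.

```rust
/// Consumes one character if `pred` accepts it; returns it with the remaining input.
fn satisfy(input: &str, pred: impl Fn(char) -> bool) -> Option<(char, &str)> {
    let c = input.chars().next()?;
    if pred(c) {
        Some((c, &input[c.len_utf8()..]))
    } else {
        None
    }
}

/// Any single character.
fn any(input: &str) -> Option<(char, &str)> {
    satisfy(input, |_| true)
}

/// A character in the inclusive range `lo..=hi`.
fn range(input: &str, lo: char, hi: char) -> Option<(char, &str)> {
    satisfy(input, |c| (lo..=hi).contains(&c))
}

/// A character that appears in `set`.
fn one_of<'a>(input: &'a str, set: &str) -> Option<(char, &'a str)> {
    satisfy(input, |c| set.contains(c))
}

/// A character that does not appear in `set`.
fn none_of<'a>(input: &'a str, set: &str) -> Option<(char, &'a str)> {
    satisfy(input, |c| !set.contains(c))
}
```

All of these fail on empty input, because there is no character to test.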
Other basic parsers:
| Function | Description |
|---|---|
| mpc_pass() | Always succeeds, consumes nothing |
| mpc_fail(msg) | Always fails with message |
| mpc_lift(f) | Consumes nothing, returns function result |
| mpc_lift_val(val) | Consumes nothing, returns value |
| mpc_anchor(f) | Checks condition without consuming |
| mpc_state() | Returns current parser state |
Combinators:
| Function | Description | Example |
|---|---|---|
| mpc_and(parsers, fold) | Sequence parsers | mpc_and(vec![a, b], fold_fn) |
| mpc_or(parsers) | Alternative parsers | mpc_or(vec![a, b]) |
| mpc_many(parser, fold) | Zero or more | mpc_many(digit, strfold) |
| mpc_many1(parser, fold) | One or more | mpc_many1(digit, strfold) |
| mpc_count(n, parser, fold) | Exactly n times | mpc_count(3, digit, strfold) |
| mpc_sepby(parser, sep, fold) | Separated by separator | mpc_sepby(item, comma, fold) |
| mpc_sepby1(parser, sep, fold) | One or more separated | mpc_sepby1(item, comma, fold) |
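The separator-based combinators are the least obvious entries in the table above, so here is a standalone sketch of the usual technique. It assumes that separators are dropped from the result and that a trailing separator is left unconsumed; whether mpc-rs behaves the same way is not stated in this README. The functions `number`, `comma`, `sep_by`, and `sep_by1` are hypothetical.

```rust
/// Parses a run of ASCII digits as a `u32` (illustrative item parser).
fn number(input: &str) -> Option<(u32, &str)> {
    let end = input.find(|c: char| !c.is_ascii_digit()).unwrap_or(input.len());
    if end == 0 {
        return None;
    }
    let value = input[..end].parse::<u32>().ok()?;
    Some((value, &input[end..]))
}

/// Parses a single comma separator.
fn comma(input: &str) -> Option<((), &str)> {
    input.strip_prefix(',').map(|rest| ((), rest))
}

/// Zero or more `item`s separated by `sep`, collected into a `Vec`.
fn sep_by<'a, T, S>(
    input: &'a str,
    item: fn(&str) -> Option<(T, &str)>,
    sep: fn(&str) -> Option<(S, &str)>,
) -> (Vec<T>, &'a str) {
    let mut out = Vec::new();
    let mut rest = match item(input) {
        Some((first, r)) => {
            out.push(first);
            r
        }
        None => return (out, input),
    };
    // Only commit to a separator when an item follows it.
    while let Some((_, after_sep)) = sep(rest) {
        match item(after_sep) {
            Some((v, r)) => {
                out.push(v);
                rest = r;
            }
            None => break,
        }
    }
    (out, rest)
}

/// Like `sep_by`, but fails when no item is found.
fn sep_by1<'a, T, S>(
    input: &'a str,
    item: fn(&str) -> Option<(T, &str)>,
    sep: fn(&str) -> Option<(S, &str)>,
) -> Option<(Vec<T>, &'a str)> {
    let (items, rest) = sep_by(input, item, sep);
    if items.is_empty() {
        None
    } else {
        Some((items, rest))
    }
}
```

With these definitions, `sep_by("1,22,3;", number, comma)` collects the three numbers and leaves `";"` as the remaining input.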
Common parsers:
| Function | Description |
|---|---|
| mpc_digit() | Single digit (0-9) |
| mpc_digits() | One or more digits |
| mpc_alpha() | Alphabetic character |
| mpc_alphanum() | Alphanumeric character |
| mpc_whitespace() | Single whitespace char |
| mpc_whitespaces() | Zero or more whitespace |
| mpc_lower() | Lowercase letter |
| mpc_upper() | Uppercase letter |
| mpc_eoi() | End of input |
| mpc_soi() | Start of input |
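The table distinguishes "one or more" parsers (such as digits) from "zero or more" parsers (such as whitespace). The difference matters in practice: the first kind can fail, the second always succeeds. A minimal standalone sketch, with hypothetical names `split_while`, `digits`, and `whitespaces` that are not the mpc-rs functions:

```rust
/// Splits `input` into the longest prefix whose characters satisfy `pred`, and the rest.
fn split_while(input: &str, pred: impl Fn(char) -> bool) -> (&str, &str) {
    let end = input.find(|c: char| !pred(c)).unwrap_or(input.len());
    input.split_at(end)
}

/// One or more ASCII digits: fails when the input does not start with a digit.
fn digits(input: &str) -> Option<(&str, &str)> {
    let (matched, rest) = split_while(input, |c| c.is_ascii_digit());
    if matched.is_empty() {
        None
    } else {
        Some((matched, rest))
    }
}

/// Zero or more whitespace characters: always succeeds, possibly with an empty match.
fn whitespaces(input: &str) -> (&str, &str) {
    split_while(input, char::is_whitespace)
}
```

A grammar that should tolerate spaces, such as `4 + 2`, would typically run a zero-or-more whitespace parser between tokens.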
AST functions:
| Function | Description | Example |
|---|---|---|
| mpca_tag(parser, tag) | Tag parser result | mpca_tag(number, "number") |
| mpca_root(parser) | Mark as AST root | mpca_root(expression) |
Fold functions:
| Function | Description |
|---|---|
| mpcf_strfold | Concatenate strings |
| mpcf_fst | Return first result |
| mpcf_null | Return unit |
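A fold function receives the results of the sub-parsers and reduces them to a single value. The sketch below shows what the three folds listed above might look like over a `Vec<Box<dyn Any>>`. The single-argument signatures are an assumption made for brevity (the closures elsewhere in this README take two arguments), and `strfold`, `fst`, and `null` are standalone illustrations rather than the library's functions.

```rust
use std::any::Any;

/// A type-erased result slot (illustrative alias).
type Val = Box<dyn Any>;

/// Concatenates every `String` or `char` result into one `String`; other types are skipped.
fn strfold(xs: Vec<Val>) -> Val {
    let mut out = String::new();
    for x in &xs {
        if let Some(s) = x.downcast_ref::<String>() {
            out.push_str(s);
        } else if let Some(c) = x.downcast_ref::<char>() {
            out.push(*c);
        }
    }
    Box::new(out)
}

/// Keeps the first result and drops the rest; returns unit if there are no results.
fn fst(xs: Vec<Val>) -> Val {
    xs.into_iter().next().unwrap_or_else(|| Box::new(()) as Val)
}

/// Discards all results and returns unit.
fn null(_xs: Vec<Val>) -> Val {
    Box::new(())
}
```

A "keep the first result" fold applied to a sequence like `'(' expr ')'` would return the opening parenthesis, so a custom closure that picks the middle element is the better fit in that case.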
Parse function:
| Function | Description | Example |
|---|---|---|
| mpc_parse(filename, input, parser) | Parse string input | mpc_parse("file", "input", &parser) |
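To show what an entry point of this shape typically reports on failure, here is a standalone sketch of a toy parse function that returns either a value or an error carrying a filename, a zero-based row and column, and a list of expected tokens. The type `ParseError` and the functions `position` and `parse_digit_lines` are hypothetical; the actual fields of `MpcErr` may differ.

```rust
use std::fmt;

/// Hypothetical error type: where parsing stopped and what was expected there.
#[derive(Debug, PartialEq)]
struct ParseError {
    filename: String,
    row: usize, // zero-based line
    col: usize, // zero-based column, counted in characters
    expected: Vec<String>,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Positions are shown one-based, as editors usually display them.
        write!(
            f,
            "{}:{}:{}: expected {}",
            self.filename,
            self.row + 1,
            self.col + 1,
            self.expected.join(" or ")
        )
    }
}

/// Converts a byte offset into a zero-based (row, col) pair.
fn position(input: &str, offset: usize) -> (usize, usize) {
    let before = &input[..offset];
    let row = before.matches('\n').count();
    let col = before.rsplit('\n').next().map_or(0, |line| line.chars().count());
    (row, col)
}

/// Toy entry point: accepts input made only of ASCII digits and newlines,
/// and returns the number of lines on success.
fn parse_digit_lines(filename: &str, input: &str) -> Result<usize, ParseError> {
    match input.char_indices().find(|&(_, c)| !(c.is_ascii_digit() || c == '\n')) {
        None => Ok(input.lines().count()),
        Some((offset, _)) => {
            let (row, col) = position(input, offset);
            Err(ParseError {
                filename: filename.to_string(),
                row,
                col,
                expected: vec!["digit".to_string(), "newline".to_string()],
            })
        }
    }
}
```

The filename is used only for reporting; it lets error messages point at a location a user can open.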
use mpc::*;
// Simple calculator: parses and evaluates a single addition such as "1+2".
// Chained sums would need a recursive parser reference, which this snippet
// does not set up.
fn main() {
    let number = mpc_digits();
    let plus = mpc_char('+');
    // number '+' number, folded into an i32 sum
    let sum = mpc_and(vec![number.clone(), plus, number], |_, xs| {
        let a: i32 = xs[0].downcast_ref::<String>().unwrap().parse().unwrap();
        let b: i32 = xs[2].downcast_ref::<String>().unwrap().parse().unwrap();
        Box::new(a + b)
    });
    match mpc_parse("calc", "1+2", &sum) {
        MpcResult::Ok(val) => println!("Result: {}", val.downcast_ref::<i32>().unwrap()),
        MpcResult::Err(e) => e.print(),
    }
}
use mpc::*;
// Simplified JSON parser
fn json_value() -> MpcParser {
    mpc_or(vec![
        json_string(),
        json_number(),
        json_bool(),
        // Placeholder for objects: only recognizes the opening brace
        mpc_and(vec![mpc_char('{'), mpc_pass()], |_, _| Box::new("object")),
    ])
}
fn json_string() -> MpcParser {
    // Opening quote, any run of non-quote characters, closing quote
    mpc_and(vec![
        mpc_char('"'),
        mpc_many(mpc_noneof("\""), mpcf_strfold),
        mpc_char('"'),
    ], |_, xs| xs[1].clone())
}
fn json_number() -> MpcParser {
    mpc_digits()
}
fn json_bool() -> MpcParser {
    mpc_or(vec![
        mpc_string("true"),
        mpc_string("false"),
    ])
}
mpc-rs provides detailed error information:
match mpc_parse("input", "invalid", &parser) {
    MpcResult::Ok(result) => {
        // Process result
    }
    MpcResult::Err(err) => {
        println!("Parse error at line {}, column {}: {}",
            err.state.row + 1,
            err.state.col + 1,
            err.failure);
        println!("Expected: {}", err.expected.join(", "));
    }
}
# Build the library
cargo build
# Run tests
cargo test
# Run examples
cargo run --bin simple_test
mpc-rs uses recursive descent with backtracking, making it suitable for most parsing tasks. For maximum performance on LL(1) grammars, consider predictive parsing, where one token of lookahead selects the alternative and no backtracking is needed.
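The difference between backtracking and predictive choice can be sketched in a few lines. This is a standalone illustration of the two techniques, not a description of how mpc-rs implements them; `choice_backtracking` and `value_predictive` are hypothetical names.

```rust
/// Backtracking choice: try each literal in turn, always from the same start position.
fn choice_backtracking<'a>(
    input: &'a str,
    alts: &[&'static str],
) -> Option<(&'static str, &'a str)> {
    alts.iter()
        .find_map(|lit| input.strip_prefix(*lit).map(|rest| (*lit, rest)))
}

/// Predictive (LL(1)-style) choice: peek at one character and commit to a single branch.
fn value_predictive(input: &str) -> Option<(&'static str, &str)> {
    let lit = match input.chars().next()? {
        't' => "true",
        'f' => "false",
        'n' => "null",
        _ => return None,
    };
    input.strip_prefix(lit).map(|rest| (lit, rest))
}
```

Both functions accept the same inputs here because the three literals start with distinct characters. The backtracking version may test several alternatives before finding a match, while the predictive version tests exactly one; the predictive form stops working as soon as two alternatives share a first character.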
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure code compiles and tests pass
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
This library is a port of the excellent mpc C library by Daniel Holden. The original C implementation provided the foundation and inspiration for this Rust version.