[QF_S Benchmark] seq vs nseq on 299 QF_S benchmarks (ostrich.zip), c3 branch #9793

2026-06-09T13:47:08Z

github-actions[bot]
Bot Jun 9, 2026

Date: 2026-06-09
Branch: c3 · Commit: 3c39fc4
Workflow Run: #27208521350
Files benchmarked: 299 (from tests/ostrich.zip, 5 s timeout, Z3 internal timeout 4 s)

Summary

Metric	seq	nseq
Files solved (sat/unsat)	274	264
Timeouts	18	35
Unknown	7	0
Median solve time (solved)	11 ms	11 ms
Mean solve time (solved)	37 ms	34 ms
Disagreements (sat≠unsat)	—	0 ✅

Both solvers agree on every file where both return a definite result. seq solves 10 more files overall; nseq has 17 more timeouts but returns unknown on 0 files (vs. 7 for seq).
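The totals in the summary table can be cross-checked with a few lines of arithmetic. The sketch below uses only figures copied from the table; the helper names `check_totals` and `difference` are illustrative and not part of the benchmark workflow.

```python
# Figures copied from the summary table above.
TOTAL_FILES = 299
summary = {
    "seq":  {"solved": 274, "timeout": 18, "unknown": 7},
    "nseq": {"solved": 264, "timeout": 35, "unknown": 0},
}

def check_totals(summary, total):
    """Return True if every solver's outcome counts add up to `total`."""
    return all(sum(counts.values()) == total for counts in summary.values())

def difference(summary, key):
    """seq count minus nseq count for one outcome category."""
    return summary["seq"][key] - summary["nseq"][key]

print(check_totals(summary, TOTAL_FILES))  # True
print(difference(summary, "solved"))       # 10
print(difference(summary, "timeout"))      # -17
```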

Performance Comparison

seq-fast / nseq-slow — 23 files (seq < 2 s, nseq timed out)

File	seq (ms)	nseq (ms)	seq	nseq
concat_backjump_bug.smt2	74	4057	sat	timeout
contains-4.smt2	10	4014	unsat	timeout
str.to_int_5.smt2	200	4009	sat	timeout
str.to_int_6.smt2	151	4009	sat	timeout
prefix-suffix.smt2	10	4008	unsat	timeout
parikh-constraints.smt2	22	4006	sat	timeout
str.from_int_6.smt2	75	4006	sat	timeout
word-equation-3.smt2	128	4006	unsat	timeout
str-leq7.smt2	35	637	sat	timeout
str-lt2.smt2	81	236	unsat	timeout

The other 13 files in this bucket (extensions omitted): cyclic-xy, failedProp, failedProp2, indexof_const_index_sat/unsat, indexof_var_sat/unsat, norn-benchmark-9f, str-leq11/12, str-lt, substr_var_sat, substr_const_len_unsat.

nseq-fast / seq-slow — 9 files (nseq < 2 s, seq timed out)

File	seq (ms)	nseq (ms)	seq	nseq
str.to_int_4.smt2	4019	11	timeout	unsat
noodles-unsat3.smt2	4013	23	timeout	unsat
bigSubstrIdx.smt2	4011	370	timeout	sat
nonlinear.smt2	4011	24	timeout	unsat
noodles-unsat7.smt2	4011	13	timeout	unsat
concat-regex2.smt2	4008	11	timeout	unsat
noodles-unsat8.smt2	4008	18	timeout	sat
indexof-2.smt2	4006	19	timeout	unsat
regexdeep.smt2	4006	95	timeout	sat
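The two performance buckets above follow a simple rule: one solver returns a definite answer in under 2 s while the other times out. A minimal sketch of that rule is shown here, assuming each row is available as times in milliseconds plus a verdict string; the function name `perf_bucket` and the row layout are hypothetical, not taken from the benchmark workflow.

```python
FAST_LIMIT_MS = 2000  # "fast" threshold taken from the section headings

def perf_bucket(seq_ms, seq_result, nseq_ms, nseq_result):
    """Classify one benchmark row into a performance bucket, or None."""
    definite = ("sat", "unsat")
    if seq_result in definite and seq_ms < FAST_LIMIT_MS and nseq_result == "timeout":
        return "seq-fast / nseq-slow"
    if nseq_result in definite and nseq_ms < FAST_LIMIT_MS and seq_result == "timeout":
        return "nseq-fast / seq-slow"
    return None  # both solved, both timed out, or an unknown is involved

# Two rows copied from the tables above: (file, seq ms, seq, nseq ms, nseq).
rows = [
    ("contains-4.smt2", 10, "unsat", 4014, "timeout"),
    ("str.to_int_4.smt2", 4019, "timeout", 11, "unsat"),
]
for name, *row in rows:
    print(name, "->", perf_bucket(*row))
```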

Correctness

Disagreements (sat vs unsat): 0 ✅

nseq is strictly more complete on 5 of the 7 files where seq returns `unknown`:

File	seq	nseq
bug-58-replace-re.smt2	unknown	sat
contains-8.smt2	unknown	sat
replace_empty_string.smt2	unknown	sat
replace_shortest_sat.smt2	unknown	sat
test-replace-regex3.smt2	unknown	sat

Both timeout (9 files):

artur-unsat-we, artur-unsat, nikolai-unsat, noodles-unsat, noodles-unsat2, noodles-unsat5, noodles-unsat6, noodles-unsat9, substring-bug2.
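The correctness categories in this section can be expressed as a small comparison over the two verdicts. This is an illustrative sketch only: the function `compare` and its labels are assumptions made for this example, and the report does not say how the workflow computes these categories.

```python
DEFINITE = {"sat", "unsat"}

def compare(seq_result, nseq_result):
    """Label a pair of verdicts using the categories from this section."""
    if seq_result in DEFINITE and nseq_result in DEFINITE:
        # Only a sat-vs-unsat split counts as a disagreement.
        return "agree" if seq_result == nseq_result else "disagreement"
    if seq_result == "timeout" and nseq_result == "timeout":
        return "both timeout"
    if seq_result == "unknown" and nseq_result in DEFINITE:
        return "nseq more complete"
    return "other"

print(compare("unknown", "sat"))      # contains-8.smt2 row
print(compare("timeout", "timeout"))  # e.g. substring-bug2
```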

seq Trace Analysis

The seq solver uses iterative deepening: it alternates increase-depth (unfolding string axioms) and increase-length (growing bounds on string variables) until solved or refuted.

concat_backjump_bug.smt2 (74 ms, sat): needed depth 5 and var_9 length 2 to find a model.
cyclic-xy.smt2 (15 ms, unsat): extended x to length 4 at depth 3 to prove unsat.
failedProp.smt2 (10 ms, unsat): discharged at depth 2, length 2 — minimal search.
contains-4.smt2 (10 ms, unsat): resolved by preprocessing before the deepening loop started.
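The deepening loop described above can be pictured roughly as follows. This is a toy sketch, not Z3's code: the callback `try_solve`, the starting bounds, and the way the next step is chosen are all hypothetical stand-ins. The toy problem borrows the depth 5, length 2 figures from the concat_backjump_bug.smt2 trace purely for illustration.

```python
def iterative_deepening(try_solve, max_rounds=20):
    """Alternate increase-depth and increase-length until a definite answer.

    `try_solve(depth, length)` is a hypothetical callback returning
    "sat", "unsat", or a hint: "increase-depth" / "increase-length".
    """
    depth, length = 1, 1
    for _ in range(max_rounds):
        verdict = try_solve(depth, length)
        if verdict in ("sat", "unsat"):
            return verdict, depth, length
        if verdict == "increase-depth":
            depth += 1    # unfold string axioms one level further
        else:
            length += 1   # allow longer values for string variables
    return "unknown", depth, length

# Toy problem: a model exists only once depth >= 5 and length >= 2.
def toy(depth, length):
    if depth < 5:
        return "increase-depth"
    if length < 2:
        return "increase-length"
    return "sat"

print(iterative_deepening(toy))  # ('sat', 5, 2)
```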

The nseq (ZIPT-based) solver uses a fundamentally different algorithm; its failures on the above cases suggest incomplete handling of str.to_int, str.from_int, ordering predicates (str.<=, str.<), Parikh constraints, and certain concat patterns.

To reproduce: build Z3 from the c3 branch and run `z3 smt.string_solver=SOLVER -T:10 <file.smt2>` with SOLVER set to seq or nseq. The benchmarks come from tests/ostrich.zip (299 files).
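A small driver for the reproduction command might look like the sketch below. It assumes a `z3` binary built from the c3 branch is on the PATH and that Z3 prints its verdict (`sat`, `unsat`, `unknown`, or `timeout`) as the first line of stdout. The helper names are hypothetical, and this is not the script used by the benchmark workflow.

```python
import subprocess

def build_command(solver, path, timeout_s=10):
    """Command line from the reproduction note (solver is 'seq' or 'nseq')."""
    return ["z3", f"smt.string_solver={solver}", f"-T:{timeout_s}", path]

def classify(stdout):
    """Map Z3's first output line to sat / unsat / unknown / timeout."""
    text = stdout.strip()
    first = text.splitlines()[0].strip() if text else ""
    # Anything unexpected (errors, empty output) is reported as unknown.
    return first if first in ("sat", "unsat", "unknown", "timeout") else "unknown"

def run(solver, path, timeout_s=10):
    """Run one solver on one file and return its verdict."""
    try:
        proc = subprocess.run(
            build_command(solver, path, timeout_s),
            capture_output=True, text=True,
            timeout=timeout_s + 5,  # safety margin over Z3's own -T limit
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return classify(proc.stdout)

def compare_file(path):
    """Return (seq verdict, nseq verdict) for one benchmark file."""
    return run("seq", path), run("nseq", path)
```

Calling `compare_file("contains-4.smt2")` on an extracted benchmark would then return a pair of verdicts to compare against the tables in this report.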

Generated by QF_S String Solver Benchmark · sonnet46 3.5M

expires on Jun 16, 2026, 1:47 PM UTC

2026-06-10T01:50:53Z

github-actions[bot]
Bot Jun 10, 2026

This discussion has been marked as outdated by QF_S String Solver Benchmark.

A newer discussion is available at Discussion #9802.
