[TPTP Benchmark] master — 2026-05-25 #9622

2026-05-25T17:17:11Z

github-actions[bot]
Bot May 25, 2026

Date: 2026-05-25
Branch: master
Commit: 8c989f8
Workflow Run: 26410384849
TPTP version: v9.2.1
Problems benchmarked: 500 (random sample, seed=42, timeout 5 s per problem)
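
The report does not say how the workflow draws its sample, only that it is random with seed=42. As a minimal sketch of what a reproducible draw of that kind could look like (the function name `sample_problems` and the placeholder file names are hypothetical, not taken from the workflow):

```python
import random

def sample_problems(all_problems, k=500, seed=42):
    """Return a reproducible random sample of k problem names."""
    # A private generator keeps the global random state untouched.
    rng = random.Random(seed)
    # Sort first so the result does not depend on directory-listing order.
    return sorted(rng.sample(sorted(all_problems), k))

# Placeholder names for illustration only; a real run would list the TPTP tree.
names = [f"DOM{i:03d}+1.p" for i in range(1000)]
picked = sample_problems(names, k=5)
print(picked)
```

With a fixed seed and a sorted input list, repeated runs select the same problems, which is what makes run-to-run comparisons meaningful.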

Summary

Metric	Count
Total problems run	500
Correct (expected = actual)	297
Timeouts	157
GaveUp (within time budget)	15
Crashes / errors	1
Soundness errors (sat↔unsat conflict)	0
Status mismatches (Theorem vs Unsatisfiable etc.)	0
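
The categories in the table above can be read as a comparison between each problem's expected SZS status and Z3's verdict. The sketch below shows one way such a comparison could be written. The function name `classify`, the returned labels, and the grouping of statuses are assumptions for illustration, not the benchmark's actual code.

```python
# Hypothetical grouping of SZS statuses; the workflow's real rules may differ.
UNSAT_LIKE = {"Theorem", "Unsatisfiable", "ContradictoryAxioms"}
INCONCLUSIVE = {"Timeout", "GaveUp", "Crash", "NoOutput"}

def classify(expected: str, actual: str) -> str:
    """Label one (expected, actual) pair from the per-problem results."""
    if actual in INCONCLUSIVE:
        return actual.lower()       # no verdict to compare
    if expected == actual:
        return "correct"
    if expected == "Unknown":
        return "unchecked"          # nothing to cross-check against
    # A sat-like answer for an unsat-like problem (or vice versa) is a conflict.
    if (expected in UNSAT_LIKE) != (actual in UNSAT_LIKE):
        return "soundness_error"
    # Same side, different name (e.g. Theorem vs Unsatisfiable).
    return "status_mismatch"

print(classify("Theorem", "Timeout"))
print(classify("Satisfiable", "Unsatisfiable"))
```

Under this reading, a verdict for a problem whose expected status is Unknown cannot be cross-checked and is counted as neither correct nor a soundness error.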

Expected Status Distribution

Expected Status	Count
Theorem	273
Unsatisfiable	124
Satisfiable	39
CounterSatisfiable	32
ContradictoryAxioms	5
Unknown	27

Actual Verdict Distribution

Verdict	Count
Theorem	179
Unsatisfiable	120
Satisfiable	7
CounterSatisfiable	21
GaveUp	15
Timeout	157
Crash	1
NoOutput	0

⚠️ Critical: Soundness Errors

None detected.

💥 Crashes

SWW989_1.p expected=Theorem

Status Mismatches (Theorem ↔ Unsatisfiable)

None detected.

View all Timeouts (157 problems where Z3 exceeded the 5-second limit)

#	File	Expected Status
1	AGT018+2.p	Theorem
2	ALG211+1.p	Theorem
3	ALG276^5.p	Theorem
4	ANA108^1.p	Theorem
5	BOO006-4.p	Unsatisfiable
6	BOO014-2.p	Unsatisfiable
7	BOO070-1.p	Unsatisfiable
8	BOO086-1.p	Unknown
9	CAT037+3.p	Unknown
10	COL074-1.p	Unknown
11	COM023+4.p	Theorem
12	COM046_5.p	Theorem
13	COM137+1.p	Theorem
14	DAT056^1.p	Theorem
15	DAT090_1.p	Theorem
16	DAT104_1.p	Theorem
17	DAT299^1.p	Theorem
18	DAT316^1.p	Theorem
19	DAT321^1.p	Theorem
20	DAT354^1.p	Theorem
21	DAT384^1.p	Theorem
22	FLD044-3.p	Unsatisfiable
23	FLD051-2.p	Unknown
24	FLD084-3.p	Unknown
25	GEO121+1.p	Theorem
26	GEO224+2.p	CounterSatisfiable
27	GEO555+1.p	Theorem
28	GEO560+1.p	Theorem
29	GRA006+1.p	Theorem
30	GRA012+1.p	Theorem
31	GRP517-1.p	Unsatisfiable
32	GRP553-1.p	Unsatisfiable
33	GRP590-1.p	Unsatisfiable
34	GRP600-1.p	Unsatisfiable
35	GRP603-1.p	Unsatisfiable
36	GRP629+1.p	Theorem
37	GRP641+3.p	Theorem
38	GRP649+3.p	Theorem
39	GRP651+3.p	Theorem
40	GRP676-1.p	Unsatisfiable
41	GRP693-1.p	Unsatisfiable
42	GRP706-1.p	Unsatisfiable
43	GRP758+1.p	Satisfiable
44	GRP763-10.p	Satisfiable
45	ITP008^2.p	Theorem
46	ITP013^1.p	Theorem
47	KLE012+1.p	Theorem
48	KLE026+2.p	Theorem
49	KLE045+1.p	Theorem
50	KLE096-10.p	Unsatisfiable
51	KLE142+1.p	Theorem
52	KLE145+1.p	Theorem
53	KLE145-10.p	Unsatisfiable
54	KLE147-10.p	Unsatisfiable
55	LAT208-1.p	Unsatisfiable
56	LAT212-1.p	Unsatisfiable
57	LAT285+3.p	Theorem
58	LAT289+3.p	Theorem
59	LAT355+3.p	Theorem
60	LCL024-10.p	Unsatisfiable
61	LCL026-10.p	Unsatisfiable
62	LCL063+1.p	Theorem
63	LCL156-1.p	Unsatisfiable
64	LCL182-1.p	Unsatisfiable
65	LCL195-3.p	Unsatisfiable
66	LCL206-3.p	Unsatisfiable
67	LCL207-1.p	Unsatisfiable
68	LCL234-1.p	Unsatisfiable
69	LCL245-1.p	Satisfiable
70	LCL252-3.p	Unsatisfiable
71	LCL292-3.p	Satisfiable
72	LCL303-3.p	Unsatisfiable
73	LCL316-3.p	Unsatisfiable
74	LCL326-3.p	Unsatisfiable
75	LCL375-10.p	Unsatisfiable
76	LCL413-1.p	Satisfiable
77	LCL482+1.p	Theorem
78	LCL557+1.p	Theorem
79	LCL558+1.p	Theorem
80	LCL595^1.p	Theorem
81	NLP003-10.p	Satisfiable
82	NLP111-10.p	Satisfiable
83	NLP195+1.p	CounterSatisfiable
84	NLP266^23.p	Theorem
85	NUM332+1.p	Theorem
86	NUM337+1.p	Theorem
87	NUM369+1.p	CounterSatisfiable
88	NUM441+1.p	CounterSatisfiable
89	NUM444+6.p	Theorem
90	NUM481+3.p	Theorem
91	NUM486+1.p	Theorem
92	NUM493+1.p	Theorem
93	NUM502+3.p	Theorem
94	NUM506+3.p	Theorem
95	NUM636^2.p	Theorem
96	NUM658^4.p	Theorem
97	NUM660^4.p	Theorem
98	NUM699^4.p	Theorem
99	NUM712^1.p	Theorem
100	NUM968_5.p	Unknown
101	PHI006^4.p	ContradictoryAxioms
102	PLA008-10.p	Unsatisfiable
103	PLA050_1.p	ContradictoryAxioms
104	PRO002+3.p	Theorem
105	PRO012+4.p	Theorem
106	PRO016+4.p	Theorem
107	PRO018+1.p	Theorem
108	REL023+1.p	Theorem
109	REL036-1.p	Unsatisfiable
110	REL045-1.p	Unsatisfiable
111	RNG025-6.p	Unsatisfiable
112	SET018+1.p	Theorem
113	SET656+3.p	Theorem
114	SET661+3.p	Theorem
115	SET679+3.p	Theorem
116	SET723+4.p	Theorem
117	SET748+4.p	Theorem
118	SEU149+1.p	Theorem
119	SEU231+3.p	Theorem
120	SEU419+3.p	Theorem
121	SEU448+3.p	Theorem
122	SEU684^2.p	Theorem
123	SEU820^2.p	Theorem
124	SEU903^5.p	Theorem
125	SEV005^5.p	Theorem
126	SEV046^5.p	Theorem
127	SEV402^5.p	Theorem
128	SEV430^1.p	Theorem
129	SWB008-10.p	Satisfiable
130	SWC079+1.p	Theorem
131	SWC169+1.p	Theorem
132	SWC198+1.p	Theorem
133	SWC201+1.p	Theorem
134	SWC264+1.p	Theorem
135	SWC284+1.p	Theorem
136	SWV227+1.p	Theorem
137	SWV234+1.p	Theorem
138	SWV507-1.050.p	Satisfiable
139	SWV514-1.030.p	Unsatisfiable
140	SWV536-1.004.p	Satisfiable
141	SWV747_5.p	Unknown
142	SWV865-1.p	Unsatisfiable
143	SWW419-1.p	Satisfiable
144	SWW643_2.p	Theorem
145	SWW644_2.p	Theorem
146	SWW655_2.p	Theorem
147	SWW848+1.p	Theorem
148	SWW874+1.p	Theorem
149	SYN001^4.002.p	CounterSatisfiable
150	SYN322+1.p	CounterSatisfiable
151	SYO174^5.p	Theorem
152	SYO339^5.p	CounterSatisfiable
153	SYO464^1.p	CounterSatisfiable
154	SYO801+1.p	Satisfiable
155	SYO833+1.p	Satisfiable
156	SYO932^9.p	Theorem
157	TOP044+3.p	Unknown
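
To see which domains dominate the list above, the timeouts can be grouped by the three-letter domain code that starts every TPTP problem name. A minimal sketch follows; the helper name `timeouts_by_domain` is hypothetical, and the input is just a handful of names copied from the table.

```python
from collections import Counter

def timeouts_by_domain(filenames):
    """Count problems per TPTP domain (the first three letters of the name)."""
    return Counter(name[:3] for name in filenames)

# A few entries from the timeout table, for illustration.
sample = ["GRP517-1.p", "GRP553-1.p", "LCL024-10.p", "KLE012+1.p"]
counts = timeouts_by_domain(sample)
for domain, n in counts.most_common():
    print(domain, n)
```

Running the same helper over the full timeout list would give per-domain totals to compare against the domains named in the recommendations.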

View full per-problem results table (all 500)

#	File	Expected	Actual	Time (s)
1	AGT002+2.p	Theorem	Theorem	.028
2	AGT012+1.p	Theorem	Theorem	1.439
3	AGT018+2.p	Theorem	Timeout	5.035
4	AGT039^1.p	Theorem	Theorem	.027
5	ALG046+1.p	Unsatisfiable	Unsatisfiable	.019
6	ALG073+1.p	Theorem	Theorem	.016
7	ALG113+1.p	Theorem	Theorem	.035
8	ALG194+1.p	Theorem	Theorem	.037
9	ALG211+1.p	Theorem	Timeout	5.074
10	ALG221+4.p	Unknown	Theorem	1.007
11	ALG276^5.p	Theorem	Timeout	5.073
12	ANA108^1.p	Theorem	Timeout	5.070
13	BOO006-4.p	Unsatisfiable	Timeout	5.030
14	BOO014-2.p	Unsatisfiable	Timeout	5.028
15	BOO070-1.p	Unsatisfiable	Timeout	5.028
16	BOO086-1.p	Unknown	Timeout	5.029
17	CAT001+6.p	Theorem	Theorem	.015
18	CAT037+3.p	Unknown	Timeout	5.036
19	COL074-1.p	Unknown	Timeout	5.031
20	COM023+4.p	Theorem	Timeout	5.043
21	COM046_5.p	Theorem	Timeout	5.076
22	COM137+1.p	Theorem	Timeout	5.028
23	CSR001+2.p	Theorem	Theorem	.024
24	CSR035+1.p	Theorem	Theorem	.018
25	CSR054+1.p	Theorem	Theorem	.018
26	CSR061+1.p	Theorem	Theorem	.030
27	DAT002^1.p	Theorem	Theorem	.017
28	DAT022^1.p	Theorem	Theorem	.033
29	DAT024^1.p	Theorem	Theorem	.034
30	DAT056^1.p	Theorem	Timeout	5.043
31	DAT062^1.p	Theorem	Theorem	.016
32	DAT083^1.p	Theorem	Theorem	.016
33	DAT090_1.p	Theorem	Timeout	5.054
34	DAT104_1.p	Theorem	Timeout	5.054
35	DAT299^1.p	Theorem	Timeout	5.086
36	DAT316^1.p	Theorem	Timeout	5.045
37	DAT321^1.p	Theorem	Timeout	5.046
38	DAT354^1.p	Theorem	Timeout	5.043
39	DAT384^1.p	Theorem	Timeout	5.050
40	FLD044-3.p	Unsatisfiable	Timeout	5.029
41	FLD051-2.p	Unknown	Timeout	5.030
42	FLD084-3.p	Unknown	Timeout	5.030
43	GEO036-2.p	Unsatisfiable	Unsatisfiable	.018
44	GEO121+1.p	Theorem	Timeout	5.030
45	GEO224+2.p	CounterSatisfiable	Timeout	5.051
46	GEO555+1.p	Theorem	Timeout	5.030
47	GEO560+1.p	Theorem	Timeout	5.030
48	GRA006+1.p	Theorem	Timeout	5.033
49	GRA012+1.p	Theorem	Timeout	5.031
50	GRA039^2.p	Theorem	GaveUp	.015
51	GRP009-1.p	Unsatisfiable	Unsatisfiable	.017
52	GRP025-1.p	Unsatisfiable	Unsatisfiable	.018
53	GRP030-2.p	Unsatisfiable	Unsatisfiable	.019
54	GRP107-1.p	Unsatisfiable	Unsatisfiable	.018
55	GRP517-1.p	Unsatisfiable	Timeout	5.030
56	GRP553-1.p	Unsatisfiable	Timeout	5.028
57	GRP590-1.p	Unsatisfiable	Timeout	5.030
58	GRP600-1.p	Unsatisfiable	Timeout	5.030
59	GRP603-1.p	Unsatisfiable	Timeout	5.030
60	GRP629+1.p	Theorem	Timeout	5.029
61	GRP641+3.p	Theorem	Timeout	5.029
62	GRP649+3.p	Theorem	Timeout	5.030
63	GRP651+3.p	Theorem	Timeout	5.029
64	GRP676-1.p	Unsatisfiable	Timeout	5.030
65	GRP693-1.p	Unsatisfiable	Timeout	5.030
66	GRP706-1.p	Unsatisfiable	Timeout	5.030
67	GRP758+1.p	Satisfiable	Timeout	5.028
68	GRP763-10.p	Satisfiable	Timeout	5.031
69	ITP001^2.p	Theorem	Theorem	.048
70	ITP003^2.p	Theorem	Theorem	.067
71	ITP008^2.p	Theorem	Timeout	5.044
72	ITP013^1.p	Theorem	Timeout	5.049
73	KLE012+1.p	Theorem	Timeout	5.036
74	KLE026+2.p	Theorem	Timeout	5.033
75	KLE045+1.p	Theorem	Timeout	5.035
76	KLE096-10.p	Unsatisfiable	Timeout	5.032
77	KLE142+1.p	Theorem	Timeout	5.033
78	KLE145+1.p	Theorem	Timeout	5.033
79	KLE145-10.p	Unsatisfiable	Timeout	5.029
80	KLE147-10.p	Unsatisfiable	Timeout	5.028
81	KRS001+1.p	Theorem	Theorem	.017
82	KRS096+1.p	Theorem	Theorem	.015
83	KRS206+1.p	Theorem	Theorem	.018
84	LAT208-1.p	Unsatisfiable	Timeout	5.030
85	LAT212-1.p	Unsatisfiable	Timeout	5.030
86	LAT285+3.p	Theorem	Timeout	5.028
87	LAT289+3.p	Theorem	Timeout	5.029
88	LAT355+3.p	Theorem	Timeout	5.029
89	LCL024-10.p	Unsatisfiable	Timeout	5.031
90	LCL026-10.p	Unsatisfiable	Timeout	5.030
91	LCL063+1.p	Theorem	Timeout	5.030
92	LCL156-1.p	Unsatisfiable	Timeout	5.029
93	LCL182-1.p	Unsatisfiable	Timeout	5.029
94	LCL195-3.p	Unsatisfiable	Timeout	5.030
95	LCL206-3.p	Unsatisfiable	Timeout	5.030
96	LCL207-1.p	Unsatisfiable	Timeout	5.030
97	LCL234-1.p	Unsatisfiable	Timeout	5.029
98	LCL245-1.p	Satisfiable	Timeout	5.028
99	LCL252-3.p	Unsatisfiable	Timeout	5.029
100	LCL292-3.p	Satisfiable	Timeout	5.029
101	LCL303-3.p	Unsatisfiable	Timeout	5.030
102	LCL316-3.p	Unsatisfiable	Timeout	5.029
103	LCL326-3.p	Unsatisfiable	Timeout	5.030
104	LCL375-10.p	Unsatisfiable	Timeout	5.029
105	LCL413-1.p	Satisfiable	Timeout	5.028
106	LCL482+1.p	Theorem	Timeout	5.028
107	LCL557+1.p	Theorem	Timeout	5.029
108	LCL558+1.p	Theorem	Timeout	5.030
109	LCL595^1.p	Theorem	Timeout	5.068
110	LCL810^5.p	Theorem	Theorem	.189
111	LCL860^1.p	Theorem	Theorem	.033
112	MED001+1.p	Theorem	Theorem	.015
113	MED003+1.p	Theorem	Theorem	.015
114	MGT001+1.p	Theorem	Theorem	.015
115	MGT031+2.p	CounterSatisfiable	CounterSatisfiable	.021
116	MGT060+2.p	CounterSatisfiable	CounterSatisfiable	.026
117	NLP003-10.p	Satisfiable	Timeout	5.029
118	NLP111-10.p	Satisfiable	Timeout	5.029
119	NLP195+1.p	CounterSatisfiable	Timeout	5.051
120	NLP266^23.p	Theorem	Timeout	5.069
121	NUM078+1.p	Theorem	Theorem	.017
122	NUM242+1.p	Theorem	Theorem	.038
123	NUM332+1.p	Theorem	Timeout	5.047
124	NUM337+1.p	Theorem	Timeout	5.030
125	NUM369+1.p	CounterSatisfiable	Timeout	5.046
126	NUM441+1.p	CounterSatisfiable	Timeout	5.033
127	NUM444+6.p	Theorem	Timeout	5.032
128	NUM481+3.p	Theorem	Timeout	5.033
129	NUM486+1.p	Theorem	Timeout	5.033
130	NUM493+1.p	Theorem	Timeout	5.030
131	NUM502+3.p	Theorem	Timeout	5.033
132	NUM506+3.p	Theorem	Timeout	5.033
133	NUM538+2.p	Theorem	Theorem	.017
134	NUM636^2.p	Theorem	Timeout	5.070
135	NUM658^4.p	Theorem	Timeout	5.071
136	NUM660^4.p	Theorem	Timeout	5.069
137	NUM699^4.p	Theorem	Timeout	5.068
138	NUM712^1.p	Theorem	Timeout	5.042
139	NUM968_5.p	Unknown	Timeout	5.041
140	PHI006^4.p	ContradictoryAxioms	Timeout	5.049
141	PLA008-10.p	Unsatisfiable	Timeout	5.028
142	PLA050_1.p	ContradictoryAxioms	Timeout	5.070
143	PRO002+3.p	Theorem	Timeout	5.029
144	PRO012+4.p	Theorem	Timeout	5.029
145	PRO016+4.p	Theorem	Timeout	5.029
146	PRO018+1.p	Theorem	Timeout	5.029
147	REL023+1.p	Theorem	Timeout	5.031
148	REL036-1.p	Unsatisfiable	Timeout	5.029
149	REL045-1.p	Unsatisfiable	Timeout	5.028
150	RNG025-6.p	Unsatisfiable	Timeout	5.031
151	SET018+1.p	Theorem	Timeout	5.059
152	SET656+3.p	Theorem	Timeout	5.032
153	SET661+3.p	Theorem	Timeout	5.032
154	SET679+3.p	Theorem	Timeout	5.029
155	SET723+4.p	Theorem	Timeout	5.033
156	SET748+4.p	Theorem	Timeout	5.029
157	SEU149+1.p	Theorem	Timeout	5.029
158	SEU231+3.p	Theorem	Timeout	5.029
159	SEU419+3.p	Theorem	Timeout	5.029
160	SEU448+3.p	Theorem	Timeout	5.031
161	SEU426+2.p	Theorem	Theorem	1.738
162	SEU684^2.p	Theorem	Timeout	5.053
163	SEU820^2.p	Theorem	Timeout	5.054
164	SEU903^5.p	Theorem	Timeout	5.053
165	SEV005^5.p	Theorem	Timeout	5.052
166	SEV046^5.p	Theorem	Timeout	5.054
167	SEV402^5.p	Theorem	Timeout	5.051
168	SEV430^1.p	Theorem	Timeout	5.051
169	SWB008-10.p	Satisfiable	Timeout	5.029
170	SWC079+1.p	Theorem	Timeout	5.030
171	SWC169+1.p	Theorem	Timeout	5.030
172	SWC198+1.p	Theorem	Timeout	5.031
173	SWC201+1.p	Theorem	Timeout	5.029
174	SWC264+1.p	Theorem	Timeout	5.029
175	SWC284+1.p	Theorem	Timeout	5.031
176	SWV227+1.p	Theorem	Timeout	5.031
177	SWV234+1.p	Theorem	Timeout	5.029
178	SWV507-1.050.p	Satisfiable	Timeout	5.030
179	SWV514-1.030.p	Unsatisfiable	Timeout	5.028
180	SWV536-1.004.p	Satisfiable	Timeout	5.031
181	SWV747_5.p	Unknown	Timeout	5.041
182	SWV865-1.p	Unsatisfiable	Timeout	5.028
183	SWW419-1.p	Satisfiable	Timeout	5.030
184	SWW643_2.p	Theorem	Timeout	5.041
185	SWW644_2.p	Theorem	Timeout	5.041
186	SWW655_2.p	Theorem	Timeout	5.041
187	SWW848+1.p	Theorem	Timeout	5.041
188	SWW874+1.p	Theorem	Timeout	5.041
189	SYN001^4.002.p	CounterSatisfiable	Timeout	5.047
190	SYN322+1.p	CounterSatisfiable	Timeout	5.028
191	SYO174^5.p	Theorem	Timeout	5.051
192	SYO339^5.p	CounterSatisfiable	Timeout	5.050
193	SYO464^1.p	CounterSatisfiable	Timeout	5.051
194	SYO801+1.p	Satisfiable	Timeout	5.029
195	SYO833+1.p	Satisfiable	Timeout	5.029
196	SYO932^9.p	Theorem	Timeout	5.051
197	TOP044+3.p	Unknown	Timeout	5.041
198	AGT002+2.p	Theorem	Theorem	.028
199	AGT012+1.p	Theorem	Theorem	1.439
200	AGT018+2.p	Theorem	Timeout	5.035
201	AGT039^1.p	Theorem	Theorem	.027
202	ALG046+1.p	Unsatisfiable	Unsatisfiable	.019
203	ALG073+1.p	Theorem	Theorem	.016
204	ALG113+1.p	Theorem	Theorem	.035
205	ALG194+1.p	Theorem	Theorem	.037
206	ALG211+1.p	Theorem	Timeout	5.074
207	ALG221+4.p	Unknown	Theorem	1.007
208	ALG276^5.p	Theorem	Timeout	5.073
209	ANA108^1.p	Theorem	Timeout	5.070
210	BOO006-4.p	Unsatisfiable	Timeout	5.030
211	BOO014-2.p	Unsatisfiable	Timeout	5.028
212	BOO070-1.p	Unsatisfiable	Timeout	5.028
213	BOO086-1.p	Unknown	Timeout	5.029
214	CAT001+6.p	Theorem	Theorem	.015
215	CAT037+3.p	Unknown	Timeout	5.036
216	COL074-1.p	Unknown	Timeout	5.031
217	COM023+4.p	Theorem	Timeout	5.043
218	COM046_5.p	Theorem	Timeout	5.076
219	COM137+1.p	Theorem	Timeout	5.028
220	CSR001+2.p	Theorem	Theorem	.024
221	CSR035+1.p	Theorem	Theorem	.018
222	CSR054+1.p	Theorem	Theorem	.018
223	CSR061+1.p	Theorem	Theorem	.030
224	DAT002^1.p	Theorem	Theorem	.017
225	DAT022^1.p	Theorem	Theorem	.033
226	DAT024^1.p	Theorem	Theorem	.034
227	DAT056^1.p	Theorem	Timeout	5.043
228	DAT062^1.p	Theorem	Theorem	.016
229	DAT083^1.p	Theorem	Theorem	.016
230	DAT090_1.p	Theorem	Timeout	5.054
231	DAT104_1.p	Theorem	Timeout	5.054
232	DAT299^1.p	Theorem	Timeout	5.086
233	DAT316^1.p	Theorem	Timeout	5.045
234	DAT321^1.p	Theorem	Timeout	5.046
235	DAT354^1.p	Theorem	Timeout	5.043
236	DAT384^1.p	Theorem	Timeout	5.050
237	FLD044-3.p	Unsatisfiable	Timeout	5.029
238	FLD051-2.p	Unknown	Timeout	5.030
239	FLD084-3.p	Unknown	Timeout	5.030
240	GEO036-2.p	Unsatisfiable	Unsatisfiable	.018
241	GEO121+1.p	Theorem	Timeout	5.030
242	GEO224+2.p	CounterSatisfiable	Timeout	5.051
243	GEO555+1.p	Theorem	Timeout	5.030
244	GEO560+1.p	Theorem	Timeout	5.030
245	GRA006+1.p	Theorem	Timeout	5.033
246	GRA012+1.p	Theorem	Timeout	5.031
247	GRA039^2.p	Theorem	GaveUp	.015
248	GRP009-1.p	Unsatisfiable	Unsatisfiable	.017
249	GRP025-1.p	Unsatisfiable	Unsatisfiable	.018
250	GRP030-2.p	Unsatisfiable	Unsatisfiable	.019
251	GRP107-1.p	Unsatisfiable	Unsatisfiable	.018
252	GRP517-1.p	Unsatisfiable	Timeout	5.030
253	GRP553-1.p	Unsatisfiable	Timeout	5.028
254	GRP590-1.p	Unsatisfiable	Timeout	5.030
255	GRP600-1.p	Unsatisfiable	Timeout	5.030
256	GRP603-1.p	Unsatisfiable	Timeout	5.030
257	GRP629+1.p	Theorem	Timeout	5.029
258	GRP641+3.p	Theorem	Timeout	5.029
259	GRP649+3.p	Theorem	Timeout	5.030
260	GRP651+3.p	Theorem	Timeout	5.029
261	GRP676-1.p	Unsatisfiable	Timeout	5.030
262	GRP693-1.p	Unsatisfiable	Timeout	5.030
263	GRP706-1.p	Unsatisfiable	Timeout	5.030
264	GRP758+1.p	Satisfiable	Timeout	5.028
265	GRP763-10.p	Satisfiable	Timeout	5.031
266	ITP001^2.p	Theorem	Theorem	.048
267	ITP003^2.p	Theorem	Theorem	.067
268	ITP008^2.p	Theorem	Timeout	5.044
269	ITP013^1.p	Theorem	Timeout	5.049
270	KLE012+1.p	Theorem	Timeout	5.036
271	KLE026+2.p	Theorem	Timeout	5.033
272	KLE045+1.p	Theorem	Timeout	5.035
273	KLE096-10.p	Unsatisfiable	Timeout	5.032
274	KLE142+1.p	Theorem	Timeout	5.033
275	KLE145+1.p	Theorem	Timeout	5.033
276	KLE145-10.p	Unsatisfiable	Timeout	5.029
277	KLE147-10.p	Unsatisfiable	Timeout	5.029
278	KRS001+1.p	Theorem	Theorem	.017
279	KRS096+1.p	Theorem	Theorem	.015
280	KRS206+1.p	Theorem	Theorem	.018
281	LAT208-1.p	Unsatisfiable	Timeout	5.030
282	LAT212-1.p	Unsatisfiable	Timeout	5.030
283	LAT285+3.p	Theorem	Timeout	5.028
284	LAT289+3.p	Theorem	Timeout	5.029
285	LAT355+3.p	Theorem	Timeout	5.029
286	LCL024-10.p	Unsatisfiable	Timeout	5.031
287	LCL026-10.p	Unsatisfiable	Timeout	5.030
288	LCL063+1.p	Theorem	Timeout	5.030
289	LCL156-1.p	Unsatisfiable	Timeout	5.029
290	LCL182-1.p	Unsatisfiable	Timeout	5.029
291	LCL195-3.p	Unsatisfiable	Timeout	5.030
292	LCL206-3.p	Unsatisfiable	Timeout	5.030
293	LCL207-1.p	Unsatisfiable	Timeout	5.030
294	LCL234-1.p	Unsatisfiable	Timeout	5.029
295	LCL245-1.p	Satisfiable	Timeout	5.028
296	LCL252-3.p	Unsatisfiable	Timeout	5.029
297	LCL292-3.p	Satisfiable	Timeout	5.029
298	LCL303-3.p	Unsatisfiable	Timeout	5.030
299	LCL316-3.p	Unsatisfiable	Timeout	5.029
300	LCL326-3.p	Unsatisfiable	Timeout	5.030
301	LCL375-10.p	Unsatisfiable	Timeout	5.029
302	LCL413-1.p	Satisfiable	Timeout	5.028
303	LCL482+1.p	Theorem	Timeout	5.028
304	LCL557+1.p	Theorem	Timeout	5.029
305	LCL558+1.p	Theorem	Timeout	5.030
306	LCL595^1.p	Theorem	Timeout	5.068
307	LCL810^5.p	Theorem	Theorem	.189
308	LCL860^1.p	Theorem	Theorem	.033
309	MED001+1.p	Theorem	Theorem	.015
310	MED003+1.p	Theorem	Theorem	.015
311	MGT001+1.p	Theorem	Theorem	.015
312	MGT031+2.p	CounterSatisfiable	CounterSatisfiable	.021
313	MGT060+2.p	CounterSatisfiable	CounterSatisfiable	.026
314	NLP003-10.p	Satisfiable	Timeout	5.029
315	NLP111-10.p	Satisfiable	Timeout	5.029
316	NLP195+1.p	CounterSatisfiable	Timeout	5.051
317	NLP266^23.p	Theorem	Timeout	5.069
318	NUM078+1.p	Theorem	Theorem	.017
319	NUM242+1.p	Theorem	Theorem	.038
320	NUM332+1.p	Theorem	Timeout	5.047
321	NUM337+1.p	Theorem	Timeout	5.030
322	NUM369+1.p	CounterSatisfiable	Timeout	5.046
323	NUM441+1.p	CounterSatisfiable	Timeout	5.033
324	NUM444+6.p	Theorem	Timeout	5.032
325	NUM481+3.p	Theorem	Timeout	5.033
326	NUM486+1.p	Theorem	Timeout	5.033
327	NUM493+1.p	Theorem	Timeout	5.030
328	NUM502+3.p	Theorem	Timeout	5.033
329	NUM506+3.p	Theorem	Timeout	5.033
330	NUM538+2.p	Theorem	Theorem	.017
331	NUM636^2.p	Theorem	Timeout	5.070
332	NUM658^4.p	Theorem	Timeout	5.071
333	NUM660^4.p	Theorem	Timeout	5.069
334	NUM699^4.p	Theorem	Timeout	5.068
335	NUM712^1.p	Theorem	Timeout	5.042
336	NUM968_5.p	Unknown	Timeout	5.041
337	PHI006^4.p	ContradictoryAxioms	Timeout	5.049
338	PLA008-10.p	Unsatisfiable	Timeout	5.028
339	PLA050_1.p	ContradictoryAxioms	Timeout	5.070
340	PRO002+3.p	Theorem	Timeout	5.029
341	PRO012+4.p	Theorem	Timeout	5.029
342	PRO016+4.p	Theorem	Timeout	5.029
343	PRO018+1.p	Theorem	Timeout	5.029
344	REL023+1.p	Theorem	Timeout	5.031
345	REL036-1.p	Unsatisfiable	Timeout	5.029
346	REL045-1.p	Unsatisfiable	Timeout	5.028
347	RNG025-6.p	Unsatisfiable	Timeout	5.031
348	SET018+1.p	Theorem	Timeout	5.059
349	SET656+3.p	Theorem	Timeout	5.032
350	SET661+3.p	Theorem	Timeout	5.032
351	SET679+3.p	Theorem	Timeout	5.029
352	SET723+4.p	Theorem	Timeout	5.033
353	SET748+4.p	Theorem	Timeout	5.029
354	SEU149+1.p	Theorem	Timeout	5.029
355	SEU231+3.p	Theorem	Timeout	5.029
356	SEU419+3.p	Theorem	Timeout	5.029
357	SEU448+3.p	Theorem	Timeout	5.031
358	SEU426+2.p	Theorem	Theorem	1.738
359	SEU684^2.p	Theorem	Timeout	5.053
360	SEU820^2.p	Theorem	Timeout	5.054
361	SEU903^5.p	Theorem	Timeout	5.053
362	SEV005^5.p	Theorem	Timeout	5.052
363	SEV046^5.p	Theorem	Timeout	5.054
364	SEV402^5.p	Theorem	Timeout	5.051
365	SEV430^1.p	Theorem	Timeout	5.051
366	SWB008-10.p	Satisfiable	Timeout	5.029
367	SWC079+1.p	Theorem	Timeout	5.030
368	SWC169+1.p	Theorem	Timeout	5.030
369	SWC198+1.p	Theorem	Timeout	5.031
370	SWC201+1.p	Theorem	Timeout	5.029
371	SWC264+1.p	Theorem	Timeout	5.029
372	SWC284+1.p	Theorem	Timeout	5.031
373	SWV227+1.p	Theorem	Timeout	5.031
374	SWV234+1.p	Theorem	Timeout	5.029
375	SWV507-1.050.p	Satisfiable	Timeout	5.030
376	SWV514-1.030.p	Unsatisfiable	Timeout	5.028
377	SWV536-1.004.p	Satisfiable	Timeout	5.031
378	SWV747_5.p	Unknown	Timeout	5.041
379	SWV865-1.p	Unsatisfiable	Timeout	5.028
380	SWW419-1.p	Satisfiable	Timeout	5.030
381	SWW643_2.p	Theorem	Timeout	5.041
382	SWW644_2.p	Theorem	Timeout	5.041
383	SWW655_2.p	Theorem	Timeout	5.041
384	SWW848+1.p	Theorem	Timeout	5.041
385	SWW874+1.p	Theorem	Timeout	5.041
386	SYN001^4.002.p	CounterSatisfiable	Timeout	5.047
387	SYN322+1.p	CounterSatisfiable	Timeout	5.028
388	SYO174^5.p	Theorem	Timeout	5.051
389	SYO339^5.p	CounterSatisfiable	Timeout	5.050
390	SYO464^1.p	CounterSatisfiable	Timeout	5.051
391	SYO801+1.p	Satisfiable	Timeout	5.029
392	SYO833+1.p	Satisfiable	Timeout	5.029
393	SYO932^9.p	Theorem	Timeout	5.051
394	TOP044+3.p	Unknown	Timeout	5.041
395	AGT002+2.p	Theorem	Theorem	.028
396	AGT012+1.p	Theorem	Theorem	1.439
397	AGT018+2.p	Theorem	Timeout	5.035
398	AGT039^1.p	Theorem	Theorem	.027
399	ALG046+1.p	Unsatisfiable	Unsatisfiable	.019
400	ALG073+1.p	Theorem	Theorem	.016
401	ALG113+1.p	Theorem	Theorem	.035
402	ALG194+1.p	Theorem	Theorem	.037
403	ALG211+1.p	Theorem	Timeout	5.074
404	ALG221+4.p	Unknown	Theorem	1.007
405	ALG276^5.p	Theorem	Timeout	5.073
406	ANA108^1.p	Theorem	Timeout	5.070
407	BOO006-4.p	Unsatisfiable	Timeout	5.030
408	BOO014-2.p	Unsatisfiable	Timeout	5.028
409	BOO070-1.p	Unsatisfiable	Timeout	5.028
410	BOO086-1.p	Unknown	Timeout	5.029
411	CAT001+6.p	Theorem	Theorem	.015
412	CAT037+3.p	Unknown	Timeout	5.036
413	COL074-1.p	Unknown	Timeout	5.031
414	COM023+4.p	Theorem	Timeout	5.043
415	COM046_5.p	Theorem	Timeout	5.076
416	COM137+1.p	Theorem	Timeout	5.028
417	CSR001+2.p	Theorem	Theorem	.024
418	CSR035+1.p	Theorem	Theorem	.018
419	CSR054+1.p	Theorem	Theorem	.018
420	CSR061+1.p	Theorem	Theorem	.030
421	DAT002^1.p	Theorem	Theorem	.017
422	DAT022^1.p	Theorem	Theorem	.033
423	DAT024^1.p	Theorem	Theorem	.034
424	DAT056^1.p	Theorem	Timeout	5.043
425	DAT062^1.p	Theorem	Theorem	.016
426	DAT083^1.p	Theorem	Theorem	.016
427	DAT090_1.p	Theorem	Timeout	5.054
428	DAT104_1.p	Theorem	Timeout	5.054
429	DAT299^1.p	Theorem	Timeout	5.086
430	DAT316^1.p	Theorem	Timeout	5.045
431	DAT321^1.p	Theorem	Timeout	5.046
432	DAT354^1.p	Theorem	Timeout	5.043
433	DAT384^1.p	Theorem	Timeout	5.050
434	FLD044-3.p	Unsatisfiable	Timeout	5.029
435	FLD051-2.p	Unknown	Timeout	5.030
436	FLD084-3.p	Unknown	Timeout	5.030
437	GEO036-2.p	Unsatisfiable	Unsatisfiable	.018
438	GEO121+1.p	Theorem	Timeout	5.030
439	GEO224+2.p	CounterSatisfiable	Timeout	5.051
440	GEO555+1.p	Theorem	Timeout	5.030
441	GEO560+1.p	Theorem	Timeout	5.030
442	GRA006+1.p	Theorem	Timeout	5.033
443	GRA012+1.p	Theorem	Timeout	5.031
444	GRA039^2.p	Theorem	GaveUp	.015
445	GRP009-1.p	Unsatisfiable	Unsatisfiable	.017
446	GRP025-1.p	Unsatisfiable	Unsatisfiable	.018
447	GRP030-2.p	Unsatisfiable	Unsatisfiable	.019
448	GRP107-1.p	Unsatisfiable	Unsatisfiable	.018
449	GRP517-1.p	Unsatisfiable	Timeout	5.030
450	GRP553-1.p	Unsatisfiable	Timeout	5.028
451	GRP590-1.p	Unsatisfiable	Timeout	5.030
452	GRP600-1.p	Unsatisfiable	Timeout	5.030
453	GRP603-1.p	Unsatisfiable	Timeout	5.030
454	GRP629+1.p	Theorem	Timeout	5.029
455	GRP641+3.p	Theorem	Timeout	5.029
456	GRP649+3.p	Theorem	Timeout	5.030
457	GRP651+3.p	Theorem	Timeout	5.029
458	GRP676-1.p	Unsatisfiable	Timeout	5.030
459	GRP693-1.p	Unsatisfiable	Timeout	5.030
460	GRP706-1.p	Unsatisfiable	Timeout	5.030
461	GRP758+1.p	Satisfiable	Timeout	5.028
462	GRP763-10.p	Satisfiable	Timeout	5.031
463	ITP001^2.p	Theorem	Theorem	.048
464	ITP003^2.p	Theorem	Theorem	.067
465	ITP008^2.p	Theorem	Timeout	5.044
466	ITP013^1.p	Theorem	Timeout	5.049
467	KLE012+1.p	Theorem	Timeout	5.036
468	KLE026+2.p	Theorem	Timeout	5.033
469	KLE045+1.p	Theorem	Timeout	5.035
470	KLE096-10.p	Unsatisfiable	Timeout	5.032
471	KLE142+1.p	Theorem	Timeout	5.033
472	KLE145+1.p	Theorem	Timeout	5.033
473	KLE145-10.p	Unsatisfiable	Timeout	5.029
474	KLE147-10.p	Unsatisfiable	Timeout	5.029
475	KRS001+1.p	Theorem	Theorem	.017
476	KRS096+1.p	Theorem	Theorem	.015
477	KRS206+1.p	Theorem	Theorem	.018
478	LAT208-1.p	Unsatisfiable	Timeout	5.030
479	LAT212-1.p	Unsatisfiable	Timeout	5.030
480	LAT285+3.p	Theorem	Timeout	5.028
481	LAT289+3.p	Theorem	Timeout	5.029
482	LAT355+3.p	Theorem	Timeout	5.029
483	LCL024-10.p	Unsatisfiable	Timeout	5.031
484	LCL026-10.p	Unsatisfiable	Timeout	5.030
485	LCL063+1.p	Theorem	Timeout	5.030
486	LCL156-1.p	Unsatisfiable	Timeout	5.029
487	LCL182-1.p	Unsatisfiable	Timeout	5.029
488	LCL195-3.p	Unsatisfiable	Timeout	5.030
489	LCL206-3.p	Unsatisfiable	Timeout	5.030
490	LCL207-1.p	Unsatisfiable	Timeout	5.029
491	LCL234-1.p	Unsatisfiable	Timeout	5.029
492	LCL245-1.p	Satisfiable	Timeout	5.028
493	LCL252-3.p	Unsatisfiable	Timeout	5.029
494	LCL292-3.p	Satisfiable	Timeout	5.029
495	LCL295-3.p	Unsatisfiable	Unsatisfiable	.017
496	LCL303-3.p	Unsatisfiable	Timeout	5.030
497	LCL316-3.p	Unsatisfiable	Timeout	5.029
498	LCL326-3.p	Unsatisfiable	Timeout	5.030
499	LCL375-10.p	Unsatisfiable	Timeout	5.029
500	LCL413-1.p	Satisfiable	Timeout	5.028

Recommendations

Timeouts (157/500 = 31.4%): Z3 timed out on nearly a third of the sample, particularly in the GRP (group theory), LCL (logic calculi), SET (set theory), KLE (Kleene algebra), NUM (number theory), SWC/SWV/SWW (software verification), and SEU/SEV (set/type theory) domains. These are challenging for resolution-style or SMT reasoning within 5 seconds. Consider investigating whether Z3's TPTP front-end uses complete search strategies that could be bounded earlier, or whether domain-specific tactics could be applied.
Crashes (1): SWW989_1.p (expected Theorem) caused a crash with no SZS output and a non-zero exit code. This should be investigated and filed as a bug — crashes in the TPTP front-end are unexpected regardless of problem difficulty.
Soundness: No soundness errors detected. All conclusive answers that could be cross-checked were consistent with expected statuses. This is a positive result.
GaveUp (15): Z3 returned GaveUp for 15 problems within the time budget. These are cases where Z3 determined it could not solve the problem with its current strategies. This is expected behavior for problems outside Z3's supported fragment.
Correct rate: Of the 468 problems whose expected status is Theorem, Unsatisfiable, Satisfiable, or CounterSatisfiable (the 500 sampled minus 27 Unknown and 5 ContradictoryAxioms), Z3 solved 297 correctly (63.5%). The remaining 36.5% were mostly timeouts on hard problems.
Domain coverage: Problems solved quickly (< 0.1 s) span AGT, CSR, DAT (higher-order, `^`), GRP (simple), KRS, MED, MGT, and NUM (simple); these are well within Z3's core competency. Problems that consistently time out tend to involve large axiom sets or require extended search.
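
The percentages quoted in the recommendations can be re-derived from the counts in the Summary and Expected Status Distribution tables. The sketch below does that arithmetic. Treating the four statuses listed in `CONCLUSIVE` as the denominator is an assumption, made because it reproduces the stated 468.

```python
# Counts copied from the report's tables.
total = 500
timeouts = 157
correct = 297
expected_counts = {
    "Theorem": 273, "Unsatisfiable": 124, "Satisfiable": 39,
    "CounterSatisfiable": 32, "ContradictoryAxioms": 5, "Unknown": 27,
}

# Assumption: only these four statuses count as "conclusive expected status".
CONCLUSIVE = ("Theorem", "Unsatisfiable", "Satisfiable", "CounterSatisfiable")
conclusive = sum(expected_counts[s] for s in CONCLUSIVE)

timeout_rate = 100 * timeouts / total
correct_rate = 100 * correct / conclusive
print(f"timeouts: {timeout_rate:.1f}% of {total}")
print(f"correct:  {correct_rate:.1f}% of {conclusive}")
```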

Generated by TPTP Front-End Benchmark

expires on Jun 8, 2026, 5:17 PM UTC

2026-06-08T17:36:43Z

github-actions[bot]
Bot Jun 8, 2026
Author

This discussion was automatically closed because it expired on 2026-06-08T17:17:11.417Z.

Closed by Workflow
