Skip to content

Commit ef4aca7

Browse files
committed
feat: production-grade SDK hardening (7-phase audit)
Phase 1 - Data loss & correctness: - Default store_intermediate to True (prevents silent data loss) - Smart defaults: build() no longer unconditionally injects doc_intel - DataPackage.created_at uses server timestamp when available - Replace silent exception swallowing with logged parse_warnings - Add transient retry logic to pipeline wait() for HTTP 5xx - Fix file_to_base64 filename parameter priority - Fix PipelineResultResponse private fields with ConfigDict(extra=allow) Phase 2 - Exception handling & async: - Catch httpx.TransportError (subsumes Timeout + Connect) - Add TransportError intermediate class in exception hierarchy - Extract shared _parse_api_response to eliminate sync/async duplication - Fix data.get("success") identity comparison - Move Content-Type header from client init to post() only - Fix AsyncPipeline.validate to async def - Add NullHandler to logger; rename format -> fmt parameter - Integrate request/response logging into base client Phase 3 - DRY refactoring: - Deduplicate enrichment models (import from chunking) - Extract shared _inject_response_metadata / _build_common_request_body - Consolidate retry constants into _constants.py - Remove dead code (_looks_like_file_path, unused API_VERSION) - Add cycle detection to topological sort Phase 4 - Developer experience: - Client-side validation for threshold, mode, dimension parameters - from_yaml returns PipelineBuilder for method chaining - add() resolves aliases and rejects duplicates - Informative __repr__ on JobError / JobTimeoutError - Add on_progress callback to wait() with proper Callable typing - cancel() returns typed JobCancelResponse Phase 5 - Model & type safety: - Add RESUMABLE to JobStatus Literal - Type pipeline_report as PipelineReport | None - FileInput model_validator enforces at-least-one-of base64/url - PipelineValidationError inherits from LatenceError - Constrain Entity.score and Relation.confidence to [0.0, 1.0] - Remove "chunking" from ServiceName Literal Phase 6 - Test coverage (200 tests): - New test_models.py: Pydantic parsing, constraints, all statuses - New test_data_package.py: composition, merge, archive, warnings - New test_builder_validation.py: smart defaults, strict, aliases, cycles Phase 7 - Packaging: - Add py.typed marker for mypy - Pin httpx<1 and pydantic<3 upper bounds - Secure API key: _api_key with @Property, masked __repr__ Critical fix found during final reflection: - Fix execute() body construction to map configs by service NAME not positional INDEX (broken when validation auto-injects services) Made-with: Cursor
1 parent 46e10c4 commit ef4aca7

29 files changed

Lines changed: 2103 additions & 818 deletions

SDK_TUTORIAL.md

Lines changed: 127 additions & 46 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22

33
This tutorial covers every feature of the Latence AI Python SDK: the **Data Intelligence Pipeline** for multi-stage document processing, **direct API access** to individual services, job management, credits, async usage, file handling, error handling, and configuration.
44

5-
> **Prerequisites**: Python 3.10+, a Latence API key from [app.latence.ai](https://app.latence.ai)
5+
> **Prerequisites**: Python 3.9+, a Latence API key from [app.latence.ai](https://app.latence.ai)
66
77
---
88

@@ -143,11 +143,12 @@ The `steps` dict lets you configure each stage. Keys are short aliases:
143143
| `knowledge_graph` or `ontology` | Relation Extraction |
144144
| `redaction` | PII Redaction |
145145
| `compression` | Text Compression |
146-
| `chunking` | Text Chunking |
147146
| `embedding` | Dense Embeddings |
148147
| `colbert` | ColBERT Embeddings |
149148
| `colpali` | ColPali Embeddings |
150149

150+
> **Note**: Chunking is not available as a pipeline step. Use `client.experimental.chunking.chunk()` for standalone text chunking.
151+
151152
```python
152153
job = client.pipeline.run(
153154
files=["financial_report.pdf"],
@@ -222,11 +223,10 @@ builder.doc_intel(
222223
mode="default", # "default" or "performance"
223224
output_format="markdown", # "markdown", "json", "html", "xlsx"
224225
max_pages=None, # limit pages processed
225-
target_longest=None, # image preprocessing dimension
226-
use_layout_detection=True,
227-
use_chart_recognition=True,
228-
use_seal_recognition=False,
226+
use_ocr_for_image_block=False, # extract text from embedded images (+$0.25/1k pages)
229227
)
228+
# Layout detection, chart/seal recognition, and auto-rotate are pre-configured
229+
# for optimal pipeline results. For full control, use the direct API.
230230

231231
# Entity Extraction
232232
builder.extraction(
@@ -247,52 +247,43 @@ builder.ontology(
247247
optimize_relations=True, # refine relation labels (1.5x credits)
248248
predict_missing_relations=False, # predict implicit links (2.5x credits)
249249
relation_threshold=0.5,
250-
kg_output_format="custom", # "custom", "property_graph", "rdf_turtle"
250+
kg_output_format="custom", # "custom", "property_graph", "rdf"
251251
)
252252

253253
# Redaction
254254
builder.redaction(
255-
mode="balanced", # "balanced" or "strict"
255+
mode="balanced", # "balanced", "strict", "recall", "precision"
256256
threshold=0.3,
257257
redact=True,
258258
redaction_mode="mask", # "mask" or "replace"
259-
enable_refinement=False,
260259
chunk_size=1024,
261260
)
261+
# Full LLM refinement is always enabled in pipeline redaction for quality.
262+
# For manual refinement control, use the direct API.
262263

263264
# Compression
264265
builder.compression(
265266
compression_rate=0.5, # fraction of tokens to remove (0.0-1.0)
266-
force_preserve_digit=False,
267+
force_preserve_digit=True, # preserve numeric tokens (default: True)
267268
force_tokens=None, # tokens to always keep, e.g. ["API", "JSON"]
268269
apply_toon=False, # TOON encoding (+$0.50/1M tokens)
269-
chunk_size=512,
270+
chunk_size=4096, # max tokens per chunk (default: 4096)
270271
fallback_mode=True,
271272
)
272273

273-
# Chunking
274-
builder.chunking(
275-
strategy="hybrid", # "character", "token", "semantic", "hybrid"
276-
chunk_size=512, # 64-8192
277-
chunk_overlap=50,
278-
min_chunk_size=64,
279-
semantic_threshold=0.5, # 0.1-0.95 (semantic/hybrid only)
280-
semantic_window_size=3, # 1-10 (semantic/hybrid only)
281-
)
282-
283-
# Embedding
274+
# Embedding (experimental pipeline step)
284275
builder.embedding(
285276
dimension=512, # 256, 512, 768, or 1024
286277
encoding_format="float", # "float" or "base64"
287278
)
288279

289-
# ColBERT
280+
# ColBERT (experimental pipeline step)
290281
builder.colbert(
291282
is_query=False,
292283
query_expansion=False,
293284
)
294285

295-
# ColPali
286+
# ColPali (experimental pipeline step)
296287
builder.colpali(is_query=False)
297288

298289
# Pipeline options
@@ -302,15 +293,36 @@ builder.strict() # disable auto-injection of services
302293
config = builder.build()
303294
```
304295

296+
> **Note**: Chunking is not available as a pipeline step -- `builder.chunking()` raises `NotImplementedError`. Use `client.experimental.chunking.chunk()` for standalone text chunking.
297+
298+
### Pipeline execution model
299+
300+
The pipeline worker executes services as a **directed acyclic graph (DAG)**, not a linear chain. Services that share the same parent can run concurrently:
301+
302+
```
303+
┌─── extraction ──── ontology
304+
305+
document_intelligence ─┼─── redaction
306+
307+
├─── compression
308+
309+
├─── embedding
310+
311+
├─── colbert
312+
313+
└─── colpali
314+
```
315+
316+
This means `extraction`, `redaction`, `compression`, and embedding services all run in parallel once `document_intelligence` completes. The SDK and worker automatically handle ordering -- you just declare which services you want.
317+
305318
---
306319

307320
## 5. Pipeline: PipelineConfig Object
308321

309322
For maximum control, construct a `PipelineConfig` directly:
310323

311324
```python
312-
from latence import PipelineConfig
313-
from latence._models.pipeline import ServiceConfig
325+
from latence import PipelineConfig, ServiceConfig
314326

315327
config = PipelineConfig(
316328
services=[
@@ -352,8 +364,6 @@ steps:
352364
document_intelligence:
353365
mode: performance
354366
output_format: markdown
355-
use_layout_detection: true
356-
use_chart_recognition: false
357367

358368
extraction:
359369
label_mode: hybrid
@@ -419,6 +429,62 @@ job.cancel()
419429
pkg = job.data_package
420430
```
421431

432+
### Resumable jobs
433+
434+
If a pipeline fails partway through, it may enter `RESUMABLE` status. Completed stages are checkpointed and only remaining stages re-execute on resume:
435+
436+
```python
437+
try:
438+
pkg = job.wait_for_completion()
439+
except JobError as e:
440+
if e.is_resumable:
441+
print(f"Job failed at a stage but is resumable: {e.message}")
442+
pkg = job.resume().wait_for_completion()
443+
else:
444+
raise
445+
```
446+
447+
### Intermediate results and report
448+
449+
Access per-stage download URLs and the structured pipeline report while a job is running or after completion:
450+
451+
```python
452+
# Per-stage download URLs (presigned B2 URLs to results.jsonl)
453+
stages = job.intermediate_results()
454+
for stage in stages:
455+
print(f"{stage.service}: {stage.download_url}")
456+
457+
# Structured pipeline report (dataset facts, per-stage metrics)
458+
report = job.report
459+
if report:
460+
print(report)
461+
```
462+
463+
### Validate before running
464+
465+
Check a pipeline configuration without executing it:
466+
467+
```python
468+
from latence import PipelineBuilder
469+
470+
builder = PipelineBuilder().doc_intel().extraction().ontology()
471+
result = client.pipeline.validate(builder, files=["doc.pdf"])
472+
print(result.valid) # True/False
473+
print(result.errors) # list of errors
474+
print(result.warnings) # list of warnings
475+
print(result.auto_injected) # services auto-added
476+
```
477+
478+
### Get available stages
479+
480+
List the per-stage download links for a completed job:
481+
482+
```python
483+
stages = client.pipeline.stages("pipe_abc123")
484+
for s in stages:
485+
print(f"{s.service}: {s.download_url}")
486+
```
487+
422488
### Status values
423489

424490
| Status | Meaning |
@@ -428,6 +494,7 @@ pkg = job.data_package
428494
| `COMPLETED` | Finished successfully |
429495
| `CACHED` | Results retrieved from cache |
430496
| `PULLED` | Results pulled from storage |
497+
| `RESUMABLE` | Failed partway through; call `job.resume()` to continue |
431498
| `FAILED` | Pipeline failed |
432499
| `CANCELLED` | Cancelled by user |
433500

@@ -508,10 +575,10 @@ if pkg.compression:
508575
if pkg.chunking:
509576
print(pkg.chunking.summary.num_chunks)
510577
print(pkg.chunking.summary.strategy) # "hybrid"
511-
print(pkg.chunking.summary.avg_chunk_size)
578+
print(pkg.chunking.summary.chunk_size) # target chunk size parameter
512579

513580
for chunk in pkg.chunking.chunks:
514-
print(f"Chunk: {chunk[:80]}...")
581+
print(f"Chunk: {chunk}")
515582
```
516583

517584
### Enrichment section
@@ -525,18 +592,23 @@ if pkg.enrichment:
525592
for chunk in pkg.enrichment.chunks:
526593
print(chunk)
527594

528-
for feature_set in pkg.enrichment.features:
529-
print(feature_set)
595+
for name, data in pkg.enrichment.features.items():
596+
print(f"{name}: {data}")
530597
```
531598

532599
### Quality report
533600

534601
```python
535602
print(pkg.quality.total_cost_usd)
536-
print(pkg.quality.pipeline_name)
537-
print(pkg.quality.services_run) # ["document_intelligence", "extraction", "ontology"]
538-
print(pkg.quality.total_stages)
539-
print(pkg.quality.execution_summary)
603+
print(pkg.quality.total_processing_time_ms)
604+
605+
for stage in pkg.quality.stages:
606+
print(f"{stage.service}: {stage.status} ({stage.processing_time_ms}ms, ${stage.credits_used})")
607+
608+
# Confidence scores
609+
print(pkg.quality.confidence.entity_avg_confidence)
610+
print(pkg.quality.confidence.graph_completeness)
611+
print(pkg.quality.confidence.ocr_quality)
540612
```
541613

542614
### Download as ZIP archive
@@ -548,28 +620,37 @@ print(f"Saved to {path}")
548620

549621
Archive structure:
550622
```
551-
results/
623+
{pipeline_name}/
552624
README.md
553625
document.md
554626
pages/
555627
page_001.md
556628
page_002.md
557629
entities.json
558630
knowledge_graph.json
559-
redaction.json
560-
compression.json
561-
summary.json
631+
redaction.json (if redaction ran)
632+
compression.json (if compression ran)
633+
chunking.json (if chunking ran)
634+
enrichment.json (if enrichment ran)
635+
quality_report.json
636+
metadata.json
562637
```
563638

564639
### Merge into flat dict
565640

566641
```python
567642
merged = pkg.merge()
568643
# {
644+
# "id": "pipe_xxx",
645+
# "name": "My Pipeline",
646+
# "status": "COMPLETED",
647+
# "created_at": "2025-...",
569648
# "documents": [{"filename": "doc.pdf", "markdown": "...", "entities": [...], ...}],
570-
# "stats": {"documents": 1, "pages": 5, "entities": {...}, ...},
571-
# "meta": {"pipeline_name": "...", "services": [...], ...},
649+
# "summary": {"documents": 1, "pages": 5, "entities": {...}, "relations": {...}, ...},
572650
# }
651+
652+
# Save merged output directly to a JSON file:
653+
pkg.merge(save_to="./results.json")
573654
```
574655

575656
---
@@ -827,7 +908,7 @@ result = red.detect_pii(
827908
result = red.detect_pii(
828909
text="...",
829910
config={
830-
"mode": "balanced", # "balanced" | "strict"
911+
"mode": "balanced", # "balanced" | "strict" | "recall" | "precision"
831912
"threshold": 0.3, # confidence threshold (0.0-1.0)
832913
"redact": True, # detect only vs. detect and redact
833914
"redaction_mode": "mask", # "mask" | "replace"
@@ -879,7 +960,7 @@ result = ont.build_graph(
879960
"resolve_entities": True, # merge duplicates (2.0x credits)
880961
"optimize_relations": True, # refine labels with LLM (1.5x credits)
881962
"predict_missing_relations": True, # predict implicit links (2.5x credits)
882-
"kg_output_format": "custom", # "custom" | "rdf_turtle" | "property_graph"
963+
"kg_output_format": "custom", # "custom" | "rdf" | "property_graph"
883964
"relation_threshold": 0.6,
884965
"symmetric": True,
885966
"generate_knowledge_graph": True,
@@ -901,14 +982,14 @@ result = ont.build_graph(
901982
)
902983
```
903984

904-
### RDF/Turtle format
985+
### RDF format
905986

906987
```python
907988
result = ont.build_graph(
908989
text="...",
909990
entities=[...],
910991
config={
911-
"kg_output_format": "rdf_turtle",
992+
"kg_output_format": "rdf",
912993
"namespace_uri": "http://example.org/ontology#",
913994
},
914995
)

pyproject.toml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -24,8 +24,8 @@ classifiers = [
2424
"Typing :: Typed",
2525
]
2626
dependencies = [
27-
"httpx[http2]>=0.27.0",
28-
"pydantic>=2.0.0",
27+
"httpx[http2]>=0.27.0,<1",
28+
"pydantic>=2.0.0,<3",
2929
]
3030

3131
[project.optional-dependencies]

src/latence/__init__.py

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -84,6 +84,7 @@
8484
NotFoundError,
8585
RateLimitError,
8686
ServerError,
87+
TransportError,
8788
ValidationError,
8889
)
8990

@@ -130,6 +131,7 @@
130131
PipelineReport,
131132
PipelineResultResponse,
132133
PipelineSubmitResponse,
134+
ServiceConfig,
133135
StageDownload,
134136
StageStatus,
135137
)
@@ -151,6 +153,7 @@
151153
"ValidationError",
152154
"RateLimitError",
153155
"ServerError",
156+
"TransportError",
154157
"APIConnectionError",
155158
"APITimeoutError",
156159
"JobError",
@@ -160,7 +163,6 @@
160163
"Job",
161164
"AsyncJob",
162165
"DataPackage",
163-
"PipelineBuilder",
164166
# Common models
165167
"Entity",
166168
"KnowledgeGraph",
@@ -191,6 +193,7 @@
191193
"PipelineReport",
192194
"PipelineResultResponse",
193195
"PipelineSubmitResponse",
196+
"ServiceConfig",
194197
"StageDownload",
195198
"StageStatus",
196199
]

0 commit comments

Comments
 (0)