Complete ocr_openai.py Utility with TUI Interface, Chat Upload Integration, and Text Parsing #13

@suprch4rg3d

Description:

The ocr_openai.py utility is currently incomplete and needs to be brought in line with the other utilities in the project. The following enhancements are required:

### Objectives:

- Implement a TUI (Text User Interface) for ocr_openai.py, consistent with existing utilities such as ocr_tesseract.py
- Enable both batch and interactive OCR processing using OpenAI's vision-capable models (e.g., GPT-4o) through the API, supporting image and PDF input files
- Integrate the tool with Chainlit's drag-and-drop interface to automatically route uploaded manuals through this OCR pipeline
- Ensure configuration and logging behavior is consistent with the rest of the CLI tools in the suite
- Add a text parser to extract and structure the OCR results into usable formats (e.g., Markdown or JSON)
    - Do not forget this step: the parser is critical for downstream processing
- Optionally implement fallback handling for API failures, including timeouts, rate limits, or input size violations
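
To make the last two objectives more concrete, here is a minimal sketch of what the parser and the fallback handling might look like. Everything in it is an assumption for illustration: the function names (`parse_ocr_text`, `to_markdown`, `with_retries`), the heading heuristic, and the choice of retryable exceptions are hypothetical and not taken from the existing codebase. The retry helper wraps any callable, so the actual OpenAI request is deliberately left out.

```python
import re
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical heading heuristic: numbered headings ("1. Safety", "2.3 Wiring")
# or short ALL-CAPS lines. Body lines that start with a number will also match,
# so real manuals will likely need a tuned rule.
_HEADING = re.compile(r"^(?:\d+(?:\.\d+)*\.?\s+\S.*|[A-Z][A-Z0-9 \-]{2,60})$")


def parse_ocr_text(text: str) -> list[dict]:
    """Split raw OCR text into a list of {"title": str, "body": str} sections."""
    sections = [{"title": "", "body": []}]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if _HEADING.match(line):
            sections.append({"title": line, "body": []})
        else:
            sections[-1]["body"].append(line)
    # Drop sections that ended up with neither a title nor any body text.
    return [
        {"title": s["title"], "body": " ".join(s["body"])}
        for s in sections
        if s["title"] or s["body"]
    ]


def to_markdown(sections: list[dict]) -> str:
    """Render parsed sections as Markdown, one level-2 heading per titled section."""
    parts = []
    for s in sections:
        if s["title"]:
            parts.append(f"## {s['title']}")
        if s["body"]:
            parts.append(s["body"])
    return "\n\n".join(parts)


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: tuple = (TimeoutError, ConnectionError),  # assumed retryable errors
) -> T:
    """Run `call`, retrying with exponential backoff on the given exceptions."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error to the caller
            time.sleep(base_delay * 2 ** attempt)
    raise RuntimeError("unreachable")
```

The parsed sections are plain dictionaries, so `json.dumps(parse_ocr_text(text))` would give the JSON form and `to_markdown(...)` the Markdown form. In the real utility, `retry_on` would presumably list whatever timeout and rate-limit exceptions the API client raises, and input size violations would need a separate check before the request is sent.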

Metadata

Labels: enhancement (New feature or request)
