oomol-lab · Moskize91 · Feb 3, 2026 · coderabbitai
diff --git a/README.md b/README.md
@@ -95,8 +95,8 @@ transform_markdown(
     ignore_pdf_errors=False,  # Optional: continue on PDF rendering errors
     ignore_ocr_errors=False,  # Optional: continue on OCR recognition errors
     generate_plot=False,  # Optional: generate visualization charts
-    toc_mode=TocExtractionMode.NO_TOC_PAGE,  # Optional: TOC extraction mode
     toc_llm=None,  # Optional: LLM instance for enhanced TOC extraction
+    toc_assumed=False,  # Optional: whether to assume TOC pages exist (default: False)
 )
 ```
 
@@ -118,8 +118,8 @@ transform_epub(
     ignore_pdf_errors=False,  # Optional: continue on PDF rendering errors
     ignore_ocr_errors=False,  # Optional: continue on OCR recognition errors
     generate_plot=False,  # Optional: generate visualization charts
-    toc_mode=TocExtractionMode.AUTO_DETECT,  # Optional: TOC extraction mode
     toc_llm=None,  # Optional: LLM instance for enhanced TOC extraction
+    toc_assumed=True,  # Optional: whether to assume TOC pages exist (default: True for EPUB)
     book_meta=BookMeta(
         title="Book Title",
         authors=["Author 1", "Author 2"],
@@ -208,20 +208,19 @@ The `inline_latex` parameter (EPUB only, default: `True`) controls whether to pr
 
 ### Table of Contents Detection
 
-The `toc_mode` parameter controls how pdf-craft extracts table of contents information:
+The `toc_assumed` parameter controls how pdf-craft handles table of contents extraction:
 
-- `TocExtractionMode.NO_TOC_PAGE` (default for Markdown): Generates TOC based on document headings only, without detecting TOC pages
-- `TocExtractionMode.AUTO_DETECT` (default for EPUB): Detects TOC pages using statistical analysis and extracts chapter structure
-- `TocExtractionMode.LLM_ENHANCED`: Detects TOC pages and uses LLM to extract hierarchical chapter structure with improved accuracy. **Requires `toc_llm` parameter to be configured.**
+- `False` (default for Markdown): Assumes no TOC pages exist. The conversion generates the TOC from document headings only, without detecting or processing TOC pages.
+- `True` (default for EPUB): Assumes TOC pages exist. The conversion uses statistical analysis to detect TOC pages and extract chapter structure.
 
-For books with complex chapter hierarchies, `LLM_ENHANCED` mode provides the most accurate results.
+For books with complex chapter hierarchies, you can configure the optional `toc_llm` parameter to enable LLM-powered chapter title analysis, which provides more accurate TOC hierarchy detection.
 
 #### LLM-Enhanced TOC Extraction
 
 To use LLM-enhanced TOC extraction, you need to configure an LLM instance:
 
 ```python
-from pdf_craft import transform_epub, BookMeta, LLM, TocExtractionMode
+from pdf_craft import transform_epub, BookMeta, LLM
 
 # Configure LLM for TOC extraction
 toc_llm = LLM(
@@ -237,8 +236,8 @@ toc_llm = LLM(
 transform_epub(
     pdf_path="input.pdf",
     epub_path="output.epub",
-    toc_mode=TocExtractionMode.LLM_ENHANCED,
-    toc_llm=toc_llm,
+    toc_assumed=True,  # Enable TOC detection
+    toc_llm=toc_llm,  # Enable LLM-powered chapter title analysis
     book_meta=BookMeta(
         title="Book Title",
         authors=["Author"],

diff --git a/README_zh-CN.md b/README_zh-CN.md
@@ -95,8 +95,8 @@ transform_markdown(
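     ignore_pdf_errors=False,  # Optional: continue on PDF rendering errors
     ignore_ocr_errors=False,  # Optional: continue on OCR recognition errors
     generate_plot=False,  # Optional: generate visualization charts
-    toc_mode=TocExtractionMode.NO_TOC_PAGE,  # Optional: TOC extraction mode
     toc_llm=None,  # Optional: LLM instance for enhanced TOC extraction
+    toc_assumed=False,  # Optional: whether to assume TOC pages exist (default: False)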
 )
 ```
 
@@ -118,8 +118,8 @@ transform_epub(
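     ignore_pdf_errors=False,  # Optional: continue on PDF rendering errors
     ignore_ocr_errors=False,  # Optional: continue on OCR recognition errors
     generate_plot=False,  # Optional: generate visualization charts
-    toc_mode=TocExtractionMode.AUTO_DETECT,  # Optional: TOC extraction mode
     toc_llm=None,  # Optional: LLM instance for enhanced TOC extraction
+    toc_assumed=True,  # Optional: whether to assume TOC pages exist (default: True for EPUB)
     book_meta=BookMeta(
         title="Book Title",
         authors=["Author 1", "Author 2"],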
@@ -208,20 +208,19 @@ transform_markdown(
 
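 ### Table of Contents Detection
 
-The `toc_mode` parameter controls how pdf-craft extracts table of contents information:
+The `toc_assumed` parameter controls how pdf-craft handles table of contents extraction:
 
-- `TocExtractionMode.NO_TOC_PAGE` (default for Markdown): Generates the TOC from document headings only, without detecting TOC pages
-- `TocExtractionMode.AUTO_DETECT` (default for EPUB): Detects TOC pages using statistical analysis and extracts chapter structure
-- `TocExtractionMode.LLM_ENHANCED`: Detects TOC pages and uses an LLM to extract a hierarchical chapter structure with higher accuracy. **Requires the `toc_llm` parameter to be configured.**
+- `False` (default for Markdown): Assumes no TOC pages exist. The conversion generates the TOC from document headings only, without detecting or processing TOC pages.
+- `True` (default for EPUB): Assumes TOC pages exist. The conversion uses statistical analysis to detect TOC pages and extract chapter structure.
 
-For books with complex chapter hierarchies, `LLM_ENHANCED` mode provides the most accurate results.
+For books with complex chapter hierarchies, you can configure the optional `toc_llm` parameter to enable LLM-powered chapter title analysis, which provides more accurate TOC hierarchy detection.
 
 #### LLM-Enhanced TOC Extraction
 
 To use LLM-enhanced TOC extraction, you need to configure an LLM instance: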
 
 ```python
-from pdf_craft import transform_epub, BookMeta, LLM, TocExtractionMode
+from pdf_craft import transform_epub, BookMeta, LLM
 
 # Configure LLM for TOC extraction
 toc_llm = LLM(
@@ -237,8 +236,8 @@ toc_llm = LLM(
 transform_epub(
     pdf_path="input.pdf",
     epub_path="output.epub",
-    toc_mode=TocExtractionMode.LLM_ENHANCED,
-    toc_llm=toc_llm,
+    toc_assumed=True,  # Enable TOC detection
+    toc_llm=toc_llm,  # Enable LLM-powered chapter title analysis
     book_meta=BookMeta(
         title="Book Title",
         authors=["Author"],

diff --git a/docs/changelog/v1.0.10.md b/docs/changelog/v1.0.10.md
@@ -0,0 +1,110 @@
+This release simplifies the table of contents (TOC) extraction API by replacing enum-based modes with a boolean flag, while adding LLM-powered chapter title analysis capabilities for improved TOC hierarchy detection.
+
+## What's Changed
+
+### Breaking Changes
+
+* **Simplified TOC API**: Replaced `TocExtractionMode` enum with a simpler `toc_assumed` boolean parameter in https://github.com/oomol-lab/pdf-craft/pull/341
+  - Removed `toc_mode` parameter from `transform_markdown()` and `transform_epub()` functions
+  - Removed `TocExtractionMode` from public API exports
+  - Introduced `toc_assumed` boolean flag to control TOC detection behavior
+
+### Features
+
+* **LLM-Powered Chapter Title Analysis**: Added support for LLM-based analysis of chapter titles to enhance TOC extraction accuracy in https://github.com/oomol-lab/pdf-craft/pull/341
+  - Automatically analyzes chapter title hierarchies when `toc_llm` is configured
+  - Detects chapter levels more accurately for complex book structures
+  - Falls back to standard analysis when the LLM is unavailable or encounters errors
+
+### Improvements
+
+* **Enhanced Error Handling**: Added robust error handling for LLM-based analysis with automatic recovery mechanisms in https://github.com/oomol-lab/pdf-craft/pull/341
+  - Better error diagnostics for LLM analysis failures
+  - Graceful degradation when LLM analysis fails, ensuring conversion continues successfully
+
+## Migration Guide
+
+If you were using `toc_mode` in previous versions, update your code as follows:
+
+### Previous API (v1.0.9 and earlier)
+
+```python
+from pdf_craft import transform_markdown, transform_epub, TocExtractionMode
+
+# For Markdown conversion
+transform_markdown(
+    pdf_path="input.pdf",
+    markdown_path="output.md",
+    toc_mode=TocExtractionMode.NO_TOC_PAGE,  # Old parameter
+)
+
+# For EPUB conversion
+transform_epub(
+    pdf_path="input.pdf",
+    epub_path="output.epub",
+    toc_mode=TocExtractionMode.AUTO_DETECT,  # Old parameter
+)
+```
+
+### New API (v1.0.10)
+
+```python
+from pdf_craft import transform_markdown, transform_epub
+
+# For Markdown conversion (assumes no TOC pages by default)
+transform_markdown(
+    pdf_path="input.pdf",
+    markdown_path="output.md",
+    toc_assumed=False,  # New boolean parameter (default: False)
+)
+
+# For EPUB conversion (assumes TOC pages exist)
+transform_epub(
+    pdf_path="input.pdf",
+    epub_path="output.epub",
+    toc_assumed=True,  # New boolean parameter
+)
+```
+
+### Migration Mapping
+
+| Old `toc_mode` Value | New `toc_assumed` Value |
+|---------------------|------------------------|
+| `TocExtractionMode.NO_TOC_PAGE` | `False` |
+| `TocExtractionMode.AUTO_DETECT` | `True` |
+| `TocExtractionMode.LLM_ENHANCED` | `True` (with `toc_llm` configured) |
+
+## LLM-Enhanced TOC Extraction
+
+To use LLM-powered chapter title analysis:
+
+```python
+from pdf_craft import transform_epub, BookMeta, LLM
+
+# Configure LLM for TOC enhancement
+toc_llm = LLM(
+    key="your-api-key",
+    url="https://api.openai.com/v1",
+    model="gpt-4",
+    token_encoding="cl100k_base",
+)
+
+transform_epub(
+    pdf_path="input.pdf",
+    epub_path="output.epub",
+    toc_assumed=True,  # Enable TOC detection
+    toc_llm=toc_llm,   # Enable LLM-powered analysis
+    book_meta=BookMeta(
+        title="Book Title",
+        authors=["Author"],
+    ),
+)
+```
+
+## Notes
+
+- The `toc_assumed` parameter defaults to `False` for Markdown conversion and `True` for EPUB conversion, matching the previous defaults (`NO_TOC_PAGE` and `AUTO_DETECT` respectively)
+- LLM-powered chapter title analysis is optional and automatically falls back to standard analysis if not configured or if errors occur
+- The new API replaces the three `TocExtractionMode` values with a single boolean, plus the optional `toc_llm` for LLM-powered analysis
+
+**Full Changelog**: https://github.com/oomol-lab/pdf-craft/compare/v1.0.9...v1.0.10
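
A possible migration aid, sketched from the Migration Mapping table in the changelog above. The helper name `migrate_toc_kwargs` and its lookup table are hypothetical and not part of pdf-craft. The old mode is passed as a plain string because `TocExtractionMode` is no longer exported. The sketch also assumes, as the pre-1.0.10 README stated, that `LLM_ENHANCED` needs a configured `toc_llm`.

```python
from typing import Any, Dict, Optional

# Hypothetical lookup (not part of pdf-craft): old TocExtractionMode member
# names mapped to the new toc_assumed boolean, per the changelog's table.
_TOC_ASSUMED_BY_OLD_MODE = {
    "NO_TOC_PAGE": False,
    "AUTO_DETECT": True,
    "LLM_ENHANCED": True,
}


def migrate_toc_kwargs(old_mode: str, toc_llm: Optional[Any] = None) -> Dict[str, Any]:
    """Return v1.0.10 keyword arguments equivalent to an old toc_mode name."""
    if old_mode not in _TOC_ASSUMED_BY_OLD_MODE:
        raise ValueError(f"unknown toc_mode name: {old_mode!r}")

    kwargs: Dict[str, Any] = {"toc_assumed": _TOC_ASSUMED_BY_OLD_MODE[old_mode]}

    if old_mode == "LLM_ENHANCED":
        # Assumption carried over from the old README: this mode required an LLM.
        if toc_llm is None:
            raise ValueError("LLM_ENHANCED requires a configured toc_llm")
        # The mapping table pairs toc_llm only with LLM_ENHANCED, so other
        # modes return toc_assumed alone.
        kwargs["toc_llm"] = toc_llm

    return kwargs
```

The returned dictionary could then be unpacked into either entry point, for example `transform_epub(pdf_path="input.pdf", epub_path="output.epub", **migrate_toc_kwargs("AUTO_DETECT"))`, assuming the signatures shown in the README diff.
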
diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"
 
 [tool.poetry]
 name = "pdf-craft"
-version = "1.0.9"
+version = "1.0.10"
 description = "PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books."
 license = "MIT"
 authors = ["Tao Zeyu <i@taozeyu.com>"]