RAG Knowledge Base: Add advanced chunking strategies #421

@JayBon24

Description

Self Checks

  • I have searched for existing issues, including closed ones.
  • [FOR CHINESE USERS] Please be sure to submit issues in English, thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

The platform's RAG knowledge base currently offers only three text chunking options. We suggest adding mainstream advanced strategies, especially Semantic Chunking, which uses embeddings to measure the similarity between adjacent sentences or paragraphs and splits only where the topic actually changes. This keeps chunks more coherent and reduces the context fragmentation caused by fixed-size or purely rule-based splitting.
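To make the Semantic Chunking idea concrete, here is a minimal sketch, not a proposed implementation. It assumes the text has already been split into sentences, and it uses a toy bag-of-words `toy_embed` function as a stand-in for a real embedding model. The function names and the `threshold` value are hypothetical and would need tuning against a real model.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List

Vector = Dict[str, float]


def toy_embed(sentence: str) -> Vector:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return dict(Counter(re.findall(r"[a-z0-9]+", sentence.lower())))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(
    sentences: List[str],
    embed: Callable[[str], Vector] = toy_embed,
    threshold: float = 0.2,  # hypothetical value, not a recommended default
) -> List[str]:
    """Start a new chunk whenever adjacent sentences fall below the threshold."""
    chunks: List[str] = []
    current: List[str] = []
    prev_vec: Vector = {}
    for sentence in sentences:
        vec = embed(sentence)
        # Split only when the similarity to the previous sentence drops,
        # which is treated here as a sign that the topic has changed.
        if current and cosine(prev_vec, vec) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
        prev_vec = vec
    if current:
        chunks.append(" ".join(current))
    return chunks
```

In a real integration, `embed` would call whichever embedding model the knowledge base is already configured with, and the threshold could be exposed as a user setting alongside the existing chunking options.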
We also recommend adding QA-based Chunking / QA Augmentation: during indexing, an LLM reads each document section and generates 3–5 hypothetical questions for it, and these questions are stored as retrieval anchors. At query time, the user's question is matched against the generated questions, which can improve retrieval precision and intent matching, especially for complex or indirect queries.
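The QA Augmentation flow could look roughly like the sketch below. Everything here is illustrative: `generate_questions` is a hypothetical callable standing in for the LLM call, and matching uses simple token overlap (Jaccard similarity) in place of the embedding search a real knowledge base would use.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Anchor:
    """A generated question that points back to the section it came from."""
    question: str
    section_id: int


def tokens(text: str) -> Set[str]:
    """Lowercased word set, used for the toy overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_anchor_index(
    sections: List[str],
    generate_questions: Callable[[str], List[str]],  # hypothetical LLM wrapper
) -> List[Anchor]:
    """Indexing step: store every generated question as a retrieval anchor."""
    index: List[Anchor] = []
    for section_id, section in enumerate(sections):
        for question in generate_questions(section):
            index.append(Anchor(question, section_id))
    return index


def retrieve(query: str, index: List[Anchor], sections: List[str]) -> Optional[str]:
    """Query step: return the section whose anchor best matches the query."""
    if not index:
        return None
    query_tokens = tokens(query)

    def jaccard(anchor: Anchor) -> float:
        anchor_tokens = tokens(anchor.question)
        union = query_tokens | anchor_tokens
        return len(query_tokens & anchor_tokens) / len(union) if union else 0.0

    best = max(index, key=jaccard)
    return sections[best.section_id]
```

Because the anchors are questions rather than raw passages, a user query is compared against text written in the same form, which is the intent-matching benefit described above.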

2. Additional context or comments

As shown in the screenshot:
[Image]

3. Can you help us with this feature?

  • I am interested in contributing to this feature.

Metadata

Labels

enhancement: New feature or request
