[Question] About modeling document-chunk-entity graph in RAG use case #1616

### Do you need to ask a question?

- [x] I have searched the existing questions and discussions and this question is not already answered.
- [x] I believe this is a legitimate question, not just a bug or feature request.

### Your Question

Hi LightRAG team,

We’re working on an ontology-based RAG project using supply-chain data and official USTR documents. We are using LightRAG as a base and rely heavily on a graph engine to model relationships between documents and the domain, connected by **event nodes** that hold date information.

We have some questions about best practices for modeling documents in the graph:

### 1. About using chunk_id as source_id instead of document_id
In LightRAG’s example (`example/insert_custom_kg.py`), each chunk is assigned a unique `source_id`. We’re currently following this pattern, but we wonder whether it is ideal for our case.

Since our downstream queries and reasoning are often document-level (e.g., linking supply-chain events to official documents), would it make more sense to assign the source_id based on the document instead of each chunk?
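
To make the two options concrete, here is a minimal sketch of what we mean. The `chunk_id` and `doc_id` fields, the helper names, and the sample records are our own illustration and not part of LightRAG; only the `content` / `source_id` shape is modeled on the example script.

```python
from collections import defaultdict

# Hypothetical chunk records; "chunk_id" and "doc_id" are our own bookkeeping fields.
CHUNKS = [
    {"chunk_id": "doc-A#0", "doc_id": "doc-A", "content": "Tariff notice, part 1"},
    {"chunk_id": "doc-A#1", "doc_id": "doc-A", "content": "Tariff notice, part 2"},
    {"chunk_id": "doc-B#0", "doc_id": "doc-B", "content": "Exclusion list"},
]


def source_id_for(chunk: dict, level: str) -> str:
    """Return the value we would store as source_id for this chunk."""
    if level == "chunk":
        return chunk["chunk_id"]
    if level == "document":
        return chunk["doc_id"]
    raise ValueError(f"unknown level: {level!r}")


def build_chunk_payload(chunks: list[dict], level: str) -> list[dict]:
    """Build chunk entries shaped like those in the custom-KG example."""
    return [
        {"content": c["content"], "source_id": source_id_for(c, level)}
        for c in chunks
    ]


def chunks_by_document(chunks: list[dict]) -> dict[str, list[str]]:
    """Side index from document id to its chunk ids, kept outside the graph."""
    index: dict[str, list[str]] = defaultdict(list)
    for c in chunks:
        index[c["doc_id"]].append(c["chunk_id"])
    return dict(index)


chunk_level = build_chunk_payload(CHUNKS, "chunk")
document_level = build_chunk_payload(CHUNKS, "document")
doc_index = chunks_by_document(CHUNKS)
```

With `level="document"`, the two chunks of `doc-A` share one `source_id`, so chunk-level provenance is no longer recoverable from the graph alone. A side index like `chunks_by_document` might let us keep chunk-level `source_id` and still roll up to documents, but we are not sure whether that is the intended pattern.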

### 2. Use an existing entity node or create a new one?
We currently extract entities and relationships from each chunk of a USTR document. Naturally, entities with the same name are extracted from multiple chunks with slightly different descriptions.
We are considering the two approaches below and would love your advice:

**Approach 1: Use the existing entity node**
- Pros: Avoids redundancy; makes it easier to count nodes when filtering by entity name
- Cons: Hard to reconcile the differing entity descriptions; risk of losing context

**Approach 2: Create a new entity node per chunk**
- Pros: Keeps contextual information intact; no conflicts in metadata
- Cons: Causes duplication; harder to analyze globally
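
As a rough sketch of how the two approaches differ in the data we would insert (plain Python, no LightRAG calls): the `"<SEP>"` separator, the `name@source_id` naming scheme, the `canonical_name` field, and the sample mentions are all assumptions made for illustration.

```python
SEP = "<SEP>"  # assumed separator for joining multiple values in one field

# Hypothetical entity mentions, one per (entity, chunk) extraction.
MENTIONS = [
    {"entity_name": "USTR", "description": "Agency issuing the notice", "source_id": "doc-A#0"},
    {"entity_name": "USTR", "description": "Agency granting exclusions", "source_id": "doc-B#0"},
    {"entity_name": "Section 301", "description": "Statute cited", "source_id": "doc-A#0"},
]


def merge_by_name(mentions: list[dict]) -> list[dict]:
    """Approach 1: one node per entity name, descriptions and sources joined."""
    merged: dict[str, dict] = {}
    for m in mentions:
        node = merged.setdefault(m["entity_name"], {"descriptions": [], "source_ids": []})
        if m["description"] not in node["descriptions"]:
            node["descriptions"].append(m["description"])
        if m["source_id"] not in node["source_ids"]:
            node["source_ids"].append(m["source_id"])
    return [
        {
            "entity_name": name,
            "description": SEP.join(node["descriptions"]),
            "source_id": SEP.join(node["source_ids"]),
        }
        for name, node in merged.items()
    ]


def node_per_chunk(mentions: list[dict]) -> list[dict]:
    """Approach 2: one node per mention, with a chunk-qualified name."""
    return [
        {
            "entity_name": f'{m["entity_name"]}@{m["source_id"]}',
            "canonical_name": m["entity_name"],  # kept so nodes can still be grouped
            "description": m["description"],
            "source_id": m["source_id"],
        }
        for m in mentions
    ]


merged_nodes = merge_by_name(MENTIONS)
per_chunk_nodes = node_per_chunk(MENTIONS)
```

In this sketch, Approach 1 yields two nodes, with each description still traceable only by its position in the joined string, while Approach 2 yields three nodes that need a grouping key such as `canonical_name` for global analysis.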

Thank you so much in advance :)
Your insights would be very helpful as we scale our RAG project.

Best Regards,
Byeonggyu

### Additional Context

_No response_
