How are the changes applied? Can git-callbacks invoke update process? #1401
I am using CocoIndex to embed a codebase, and I want to understand how updates are carried out. Q1) If a file is modified, are only the relevant chunks updated (since both the old and new content are available), or is the whole file processed again? Q2) Can I invoke the update process using the content of a successfully merged PR (changes, old content, new content) provided by a git webhook payload?
Replies: 3 comments 1 reply
I think for Q2, I can use a custom "Target" class with a mutate function to customize how data is updated. This framework is way better than LlamaIndex!!! 🎊
Thanks @Paradoxinit for the discussion! LlamaIndex is also a great project. To the best of my knowledge, they are not focusing on building a declarative data framework at the moment, so there is probably not much to compare :) Glad CocoIndex is helpful! To be transparent: we only offer a native GitHub hook in the enterprise version at the moment (it doesn't need to exist as part of the core engine that we offer as open source), but as @georgeh0 suggested, you can always make it part of a GitHub workflow or build a custom target around it. We will keep the community posted if there is any change of plans. Love your discussions!
CocoIndex caches on the boundary of "CocoIndex functions". For built-in functions, caching is enabled for heavy processing (e.g. running an embedding model or an LLM); for custom functions, caching can be enabled with cache=True (document). Once a file is modified, cached output is reused for every function invocation whose input is unchanged (e.g. if some chunks don't change, their cached embeddings are reused instead of being recomputed).
This blog has more explanations of how CocoIndex minimizes recomputation across changes.
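To make the chunk-level reuse concrete, here is a minimal conceptual sketch in plain Python. It is not CocoIndex's actual implementation or API: `fake_embed`, `CachedFunction`, `split_into_chunks`, and `index_file` are hypothetical names invented for illustration, and the cache is simply keyed on a hash of each chunk's text. It only shows the idea that when one chunk of a file changes, the expensive function runs again for that chunk alone.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def fake_embed(text: str) -> List[float]:
    """Stand-in for an expensive embedding model (hypothetical)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255 for b in digest[:4]]


class CachedFunction:
    """Memoizes a function on a hash of its input (illustrative only)."""

    def __init__(self, fn: Callable[[str], List[float]]) -> None:
        self.fn = fn
        self.cache: Dict[str, List[float]] = {}
        self.calls = 0  # number of real (uncached) invocations

    def __call__(self, text: str) -> List[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.fn(text)
        return self.cache[key]


def split_into_chunks(source: str) -> List[str]:
    """Naive chunker: one chunk per blank-line-separated block."""
    return [block for block in source.split("\n\n") if block.strip()]


def index_file(
    source: str, embed: Callable[[str], List[float]]
) -> List[Tuple[str, List[float]]]:
    """Re-chunks the whole file, but embeds through the cache."""
    return [(chunk, embed(chunk)) for chunk in split_into_chunks(source)]


embed = CachedFunction(fake_embed)

v1 = "def a():\n    return 1\n\ndef b():\n    return 2\n\ndef c():\n    return 3"
v2 = v1.replace("return 2", "return 20")  # only the middle chunk changes

index_file(v1, embed)
calls_after_first_run = embed.calls

index_file(v2, embed)
calls_after_second_run = embed.calls

print(calls_after_first_run, calls_after_second_run)
```

In this sketch the file is still read and chunked again on every run, since chunking is cheap; only the embedding step is skipped for chunks whose text is unchanged.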
We haven't integrated with webhooks yet, but one possible solution is to: