Making Hudi Github Project More Agentic #18324
Replies: 4 comments 6 replies
-
|
Yes, I think this would be very useful for the community. How do we plan to keep the AGENTS.md/ARCHITECTURE.md files updated as major changes land? Also, if you could share some of your own experience using agents with the Hudi repo, that would be very helpful for the community. |
-
|
@soumilshah1995 One idea that @vinothchandar brought up in the Hudi community sync was revamping the current hudi-cli to embrace this new era of agents. Ideally giving |
-
|
I think this is indeed a very important step for the Hudi community to take. I feel there is a fundamental gap here that an AI-native experience can bridge for the project. For many years we've had a divide: users want advanced, powerful features, but they don't want to deal with the complexity of learning and tuning configs. Without AI, unfortunately, having both was not possible. As with any other database, you get a system that can do many things powerfully, but you also need to spend the time learning it. As the data lake became a lakehouse, data engineers needed to level up and become something like data lake administrators. Unfortunately, this never happened. Even today, engineers are sold completely closed solutions for the sake of ease of use. But I think AI gives us an opportunity to bridge this gap once and for all for open source.
I would like to reimagine the project's interface, for any user building a lakehouse, as a Claude Code or Codex-like terminal experience. With lots of will and a little bit of skill, I think we can get there much faster than we expect. The vision is that you create, build, optimize, and deploy everything from an agentic terminal CLI. For starters
I built a toy MCP server with some 27 tools over the weekend, with auto-discovery of tables (local fs for now; we can bring catalogs and cloud storage in pretty easily). I think we can even make |
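For anyone curious what local auto-discovery could look like, here is a minimal sketch. It is not the code from the toy MCP server mentioned above; it only assumes the standard Hudi layout, where a table's base path contains a `.hoodie` directory holding `hoodie.properties`. The function name `discover_hudi_tables` is hypothetical.

```python
import os
from pathlib import Path


def discover_hudi_tables(root):
    """Return sorted base paths under `root` that look like Hudi tables.

    Assumption: a directory counts as a table if it contains
    `.hoodie/hoodie.properties`. Hypothetical helper, not part of any Hudi API.
    """
    tables = []
    for dirpath, dirnames, _filenames in os.walk(root):
        marker = Path(dirpath) / ".hoodie" / "hoodie.properties"
        if marker.is_file():
            tables.append(str(Path(dirpath)))
            # Do not descend into the table's own partitions or metadata.
            dirnames[:] = []
    return sorted(tables)
```

A tool like this could back a "list tables" call in an agentic CLI. Catalogs and cloud storage would need their own listing logic.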
-
|
@soumilshah1995 +1 on the original AGENTS.md/ARCHITECTURE.md/CLAUDE.md proposal. I think @yihua has a PR for a CLAUDE.md file; we should consider that as well. Happy to take contributions around this: get it landed and keep chipping away. |
-
Hi everyone, I have a small proposal that could make our open-source GitHub repos much more AI-agent friendly and easier to maintain.
Idea: introduce a few lightweight repo conventions so AI agents understand the project better and can contribute safely.
A. AGENTS.md
A file that explains the repository structure, coding conventions, and important modules so AI agents understand how the project is organized before making changes.
B. ARCHITECTURE.md
A high-level overview of the system architecture, key components, and data flow. This helps agents reason about design decisions instead of only reading code.
C. Documentation Skill (AI-based)
Introduce an automated documentation workflow where an AI agent periodically scans the repo and generates or updates documentation for APIs, modules, and examples. This could help keep open-source documentation consistently up to date.
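As a rough sketch of the scanning half of item C only, the snippet below lists modules that have a build file but no documentation file, which an agent could then be asked to document. The choice of `pom.xml` as the module marker and `README.md` as the doc file are assumptions for illustration, and `modules_missing_docs` is a hypothetical name. The generation step itself is left out.

```python
from pathlib import Path


def modules_missing_docs(repo_root, build_file="pom.xml", doc_file="README.md"):
    """List module directories (relative to repo_root) lacking a doc file.

    Assumption: a "module" is any directory containing `build_file`.
    Hypothetical helper for illustration only.
    """
    root = Path(repo_root)
    missing = []
    for build in root.rglob(build_file):
        module_dir = build.parent
        if not (module_dir / doc_file).exists():
            # The repo root itself is reported as "."
            missing.append(module_dir.relative_to(root).as_posix())
    return sorted(missing)
```

Running this on a schedule (for example in CI) would give the agent a concrete worklist instead of a full-repo scan each time.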
Goal:
Make the repo easier for both humans and AI agents to understand, improve onboarding, and enable more reliable automated contributions.
Happy to experiment with a small prototype if folks think this could be useful.