Three parts
Merkle <=> VFS
We need to expose merkle storages as a VFS. This is mostly about structuring and designing how we encode and store UNIX metadata inside the merkle storage. The storage integration for gridway (caching, rocksdb integration, etc.) is separate work and orthogonal to the VFS work. The main focus is how we build an abstract adapter layer above the backing merkle storage, so that WASI recognizes this storage as a UNIX filesystem instead of a raw kvstore.
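To make the adapter-layer idea concrete, here is a minimal sketch, assuming the merkle store exposes only get/put over byte keys. All names (`KvStore`, `MerkleVfs`, the `d:`/`m:` key prefixes, the metadata encoding) are hypothetical, not from the codebase:

```rust
use std::collections::BTreeMap;

/// Assumed minimal surface of the backing merkle store (JMT/Merk stand-in).
pub trait KvStore {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    fn put(&mut self, key: &[u8], value: Vec<u8>);
}

/// Hypothetical UNIX metadata persisted alongside each file's bytes.
#[derive(Debug, Clone, PartialEq)]
pub struct Inode {
    pub mode: u16, // rwx bits, e.g. 0o644
    pub size: u64,
}

/// Adapter layer: maps filesystem paths onto kvstore keys. File data lives
/// under "d:<path>", metadata under "m:<path>" (key scheme is illustrative).
pub struct MerkleVfs<S: KvStore> {
    store: S,
}

impl<S: KvStore> MerkleVfs<S> {
    pub fn new(store: S) -> Self { Self { store } }

    pub fn write(&mut self, path: &str, data: &[u8], mode: u16) {
        self.store.put(format!("d:{path}").as_bytes(), data.to_vec());
        let meta = Inode { mode, size: data.len() as u64 };
        // Naive text encoding; a real impl would use a fixed binary layout.
        let enc = format!("{} {}", meta.mode, meta.size).into_bytes();
        self.store.put(format!("m:{path}").as_bytes(), enc);
    }

    pub fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.store.get(format!("d:{path}").as_bytes())
    }

    pub fn stat(&self, path: &str) -> Option<Inode> {
        let raw = self.store.get(format!("m:{path}").as_bytes())?;
        let s = String::from_utf8(raw).ok()?;
        let mut it = s.split(' ');
        Some(Inode {
            mode: it.next()?.parse().ok()?,
            size: it.next()?.parse().ok()?,
        })
    }
}

/// In-memory stand-in for the real merkle backend, for illustration only.
impl KvStore for BTreeMap<Vec<u8>, Vec<u8>> {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        BTreeMap::get(self, key).cloned()
    }
    fn put(&mut self, key: &[u8], value: Vec<u8>) {
        self.insert(key.to_vec(), value);
    }
}
```

The point is only the layering: everything filesystem-shaped (paths, metadata, modes) lives in the adapter, while the backend stays a plain byte-keyed store.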
Some bullet points
- which exact set of POSIX calls to support. For example, we may not need to support symlinks or lock-based access.
- how we should store the directory structure. Since we plan to support both JMT and Merk, we can't rely on iterators (JMT doesn't have them). We may need some secondary structure to store iterable children, but we also don't want to make the implementation too complex. Needs more research.
- how we should store permission info. rwx is a good abstraction and could actually integrate nicely with the blockchain context. But we have to decide whether to go for a full ocap mindset or lean more toward sticking with POSIX.
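One shape the secondary structure from the second bullet could take: keep each directory's sorted child list as a single value under its own key, so listing works even on a store with no native iteration. A sketch only; the `c:<dir>` key scheme and `'\0'` separator are illustrative choices, not decisions:

```rust
use std::collections::BTreeMap;

/// In-memory stand-in for the byte-keyed merkle store.
type Kv = BTreeMap<Vec<u8>, Vec<u8>>;

/// Secondary-index key holding a directory's children (illustrative scheme).
fn children_key(dir: &str) -> Vec<u8> {
    format!("c:{dir}").into_bytes()
}

/// List a directory in O(children) without any kvstore iterator support.
fn list_children(kv: &Kv, dir: &str) -> Vec<String> {
    match kv.get(&children_key(dir)) {
        Some(raw) => String::from_utf8_lossy(raw)
            .split('\0')
            .filter(|s| !s.is_empty())
            .map(str::to_string)
            .collect(),
        None => Vec::new(),
    }
}

/// Insert a child, keeping the list sorted so prefix/range scans stay cheap.
fn add_child(kv: &mut Kv, dir: &str, name: &str) {
    let mut kids = list_children(kv, dir);
    if let Err(pos) = kids.binary_search(&name.to_string()) {
        kids.insert(pos, name.to_string());
        kv.insert(children_key(dir), kids.join("\0").into_bytes());
    }
}
```

The obvious cost is that every create/delete rewrites the whole child list, which is exactly the kind of overhead the complexity concern in the bullet is about; sharded or B-tree-style child lists would trade complexity for cheaper updates.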
VFS <=> WASI
We need to actually make the VFS accessible from WASI, which has been painful so far. WASI 0.2 is based on components; by substituting the filesystem world with our host implementation, we can intercept all filesystem-related syscalls and handle them in our callback. See add_to_linker in each bindgen-generated file and its usages. The problem is that the complex type signatures and the bindgen involved make it really hard for an agent to solve this task. I think it would be better if I resolve this part myself.
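For orientation, the shape of the substitution looks roughly like this. The actual wasmtime `add_to_linker` glue and bindgen-generated traits are deliberately omitted (that is the hard part the paragraph refers to); `WasiFs` and `MerkleHost` below are hypothetical stand-ins showing only the pattern of a host-side trait implementation intercepting guest calls:

```rust
/// Stand-in for the bindgen-generated filesystem-world host trait.
trait WasiFs {
    fn open(&mut self, path: &str) -> Result<u32, String>;
    fn read(&mut self, fd: u32) -> Result<Vec<u8>, String>;
}

/// Our host implementation: every guest filesystem call lands here instead
/// of the real OS, so we can route it to the merkle-backed VFS.
struct MerkleHost {
    files: Vec<(String, Vec<u8>)>, // stand-in for the VFS layer
}

impl WasiFs for MerkleHost {
    fn open(&mut self, path: &str) -> Result<u32, String> {
        self.files
            .iter()
            .position(|(p, _)| p == path)
            .map(|i| i as u32)
            .ok_or_else(|| format!("ENOENT: {path}"))
    }

    fn read(&mut self, fd: u32) -> Result<Vec<u8>, String> {
        self.files
            .get(fd as usize)
            .map(|(_, data)| data.clone())
            .ok_or_else(|| "EBADF".to_string())
    }
}
```

In the real integration, wiring an implementation like this into the component linker (the `add_to_linker` step) is where the type-signature complexity lives.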
WASI <=> program
We need to design how programs actually use the filesystem API. Since it is actually backed by a key-value store, we can let users naturally create a new file for each entry. Or we can use it like an actual filesystem, where an advanced DB management system like SQLite layers on top.
Both have major technical drawbacks:
- file-per-entry means we have to expect something like a million files per directory. Without iteration support, this causes problems when we want to, say, find a file with a specific prefix without knowing the whole file name. Basically we give up DB-like features this way.
- an aggregated file is good, but it means guest-side overhead for managing indexing and sorting. Also, if rocksdb does not handle a partial write faster than a whole rewrite, all of this is meaningless: we would have to reflect that in the gas price, and an aggregated file would be an order of magnitude more expensive than individual files.
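To make the first drawback concrete: a prefix query under file-per-entry has no kvstore-level support, so the best available move is searching whatever directory listing exists. A sketch, assuming a sorted listing of names is available (e.g. from a secondary child index); `find_with_prefix` is a hypothetical helper:

```rust
/// Prefix lookup over a sorted directory listing. Without iteration in the
/// backing store, this listing itself is the thing file-per-entry lacks.
fn find_with_prefix<'a>(names: &'a [String], prefix: &str) -> Vec<&'a str> {
    // Binary-search to the first candidate, then walk while the prefix holds.
    let start = names.partition_point(|n| n.as_str() < prefix);
    names[start..]
        .iter()
        .take_while(|n| n.starts_with(prefix))
        .map(|n| n.as_str())
        .collect()
}
```

With a sorted listing this is O(log n + matches); without one, the same query degenerates to scanning all keys, which the backing stores cannot even offer here.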
I think we should ideally support both. At the interface level both are permitted. Maybe we can incentivize users toward the individual-file approach when they don't have to deal with unknown keys or range queries and ocap permissions matter. And if rocksdb handles partial writes more efficiently, the gas price of partially rewriting a file would not exceed what it would cost as a separate file, so users can opt in to a middleware that handles the file this way, possibly with more features to use.
@claude read the text and evaluate whether this is a correct description of the current codebase and whether all my assumptions hold. Then write a good plan/proposal/report document based on this.