Proposed solution:
- a Go interface is encoded as a top-level Proto3 message. In the example below, the Animal go interface is represented by a Animal proto3 message.
type Animal interface
type Dog struct
type Cat struct
type House struct {
Pet Animal
}
type Skateboard struct {
Decal Image
Rider Animal
}
becomes
message House {
Animal pet = 1;
}
message Skateboard {
Image decal = 1;
Animal rider = 2;
}
message Animal {
oneof concrete {
Dog dog = 1;
Cat cat = 2;
// New fields can be appended here,
// but field numbers can never be released
// for incrementing version numbers
// (see below).
}
// Plus, some sort of information about:
// - the fact that this represents an Amino interface
// - the version/revision number, incrementing
// - namespace under which concrete types are being registered
// NOTE: I may be butchering proto3 options. LMK.
option (amino.interface) = true;
option (amino.version) = 1;
option (amino.namespace) = "cosmos-hub/x/amino-example";
}
The first option, option (amino.interface) = true; serves as a hint to code generators or other client libs, that this is an interface. The second option, option (amino.version) = 1 allows clients to keep track of changes and declare compatibility information between code and data over different versions. The third option, option (amino.namespace) = "cosmos-hub/x/amino-example"; tells the code generator and the programmer that the concrete types registered under this Animal interface are for the module "cosmos-hub/x/amino-example". But I don't think we should call them modules, I think we should just call them namespaces -- not that I prefer it that way, but that I believe we must make this distinction, but that's a separate conversation, maybe.
To be specific with an example, let's talk about the PubKey interface in the Cosmos SDK. Concrete types of the standard PubKey interface include PubKeyEd25519 and PubKeySecp256k1. The Cosmos hub would probably register all usages of the PubKey interface under the same namespace of "tendermint".
var cdc = amino.NewCodec()
func init() {
// Register the PubKey interface under the namespace of "tendermint".
// This would be the standard usage as provided by Tendermint code.
i := cdc.RegisterInterface((*crypto.PubKey)(nil), {namespace="tendermint", version=1})
// Register an Ed25519 key, with field number of 1.
i.RegisterConcrete(1, PubKeyEd25519{}, "PubKeyEd25519")
// Register a Secp256k1 key, with field number of 2.
i.RegisterConcrete(2, PubKeySecp256k1{}, "PubKeySecp256k1")
// The same codec can hold multiply different namespaced versions of the same interface.
// This example is for some experimental logic with unsafe pubkeys.
i2 := cdc.RegisterInterface((*crypto.PubKey)(nil), {namespace="unsafe", version=1})
i2.RegisterConcrete(1, PubKeyEd25519{}, "PubKeyEd25519")
i2.RegisterConcrete(2, PubKeyUnsafeExample{}, "PubKeyUnsafeExample")
}
// In this example, we're showing how Go tags are used to denote to Amino
// which field numbers to use.
type Example struct {
Key1 PubKey `amino.namespace:"tendermint"`
Key2 PubKey `amino.namespace:"unsafe"`
}
The intent of the version is primarily to be explicit about code compatibility, but there may be other benefits too. For example, we could create tooling that scans code and automatically figures out if maybe the version & registered concrete types are not consistent with what is registered on some registry service, thus catching bugs... in the long future maybe this becomes part of dependency management, and project Makefiles etc can help fetch updates from a blockchain repository. Notice that we could support idempotency of concrete registrations per interface/version, so that all code that needs to use a particular namespaced interface can register it upon init() of that module without fear of panics or errors.
If there is no namespace="..." specified for an interface, then it should do something consistent. Maybe we can say that not specifying the namespace is the same as setting it to the default namespace, the global namespace.
In any case, the biggest change is that registering concrete types happens on that interface (or at least a namespaced version of it).
Rationale/benefits.
- (relatively) minimal changes to Amino code -- for every interface we need to be explicit about the concrete types to list under. we can program an option in amino to help print warnings it if appears that we may have missed something during the migration.
- complete compatibility with Proto3 with no need for disamb/prefix calculation.
- compact enough, faster than Any (ergo is already needed, ergo will probably be adopted).
- a direction toward full specification such that it becomes standard in relevant languages.
From #267 (comment):
There are many ways I can think of to do this:
(1) have prefixes at both the module level and type-level that define the oneof field number
(2) auto-generate one-of prefixes in InitChain and store them in state so that they're always the same for any chain
(3) create a cross-chain schema registry of oneof prefixes
These are all interesting ideas, but I think they all suffer from what was mentioned before - namely that different chains will have different .proto files for a lot of the same modules. I would like to see an ecosystem with reusable client side modules corresponding to on-chain modules and having different message definitions for the same module on different chains will be problematic without some special infrastructure to support this. So in general, I see this approach as another premature optimization over proto.Any.
Thank you, responding in order:
(1) (assuming you mean prefix in the general sense, and not the current amino disamb/prefix sense), I think another way to express it is, "make the field number a composite of two things; the module and the type". And this would be helpful in the sense that it would provide something similar to what the amino disamb/prefix bytes tries to provide with interfaces -- some practical conflict prevention system, such that there is no/less need to worry about local codecs, because they can all be thrown in a global per-module namespace. But I'm not sure, is there as line number in the PR that you can point me to, so I can understand more? (Should I just read the whole PR?).
(2) I do want the latest version of the schemas used by a chain to be in the state as well for some clients. Also, using the state to keep track of duplicates or conflicts (or perhaps keeping a priority ordering of concrete types) seems like a good idea.
(3) is basically what I am about to propose.
These are all interesting ideas, but I think they all suffer from what was mentioned before - namely that different chains will have different .proto files for a lot of the same modules.
Agreed!
I would like to see an ecosystem with reusable client side modules corresponding to on-chain modules and having different message definitions for the same module on different chains will be problematic without some special infrastructure to support this. So in general, I see this approach as another premature optimization over proto.Any.
I like the way you put it, or described it.
I don't think our standards will be adopted, if we had to use the proto.Any spec exactly, which isn't (AFAIK) fully supported in implementations, and doesn't quite suite needs of power users. Since power users already require an optimized standard, such a standard will emerge, so if we know how to, and it isn't hard to, we should go with that.
What are valid namespaces?
I think that namespaces should be in the form of a domain+path (I think maybe a https:// or cosmos:// spec is not needed yet, and could maybe default to a blockchain in the future). Maybe "tendermint" and a few other names can be special, and not require a "dot". Maybe "cosmos", "sdk", and "tendermint" are reserved? I dunno.
Proto4?
Going with the previous example, maybe in Protobuf4, we can have:
oneof tendermint/PubKey {
option (version)=1;
Ed25519 = 1;
Secp256k1 = 2;
}
message Example {
tendermint/PubKey key1 = 1;
unsafe/PubKey key2 = 2;
}
Other
The original proposal wording is here: https://gist.github.com/jaekwon/1b9dd3b2accd4b5f49661b49536232dc but I think the wording is confusing, so I tried to expand upon it here.
Proposed solution:
becomes
The first option,
option (amino.interface) = true;serves as a hint to code generators or other client libs, that this is an interface. The second option,option (amino.version) = 1allows clients to keep track of changes and declare compatibility information between code and data over different versions. The third option,option (amino.namespace) = "cosmos-hub/x/amino-example";tells the code generator and the programmer that the concrete types registered under this Animal interface are for the module "cosmos-hub/x/amino-example". But I don't think we should call them modules, I think we should just call them namespaces -- not that I prefer it that way, but that I believe we must make this distinction, but that's a separate conversation, maybe.To be specific with an example, let's talk about the PubKey interface in the Cosmos SDK. Concrete types of the standard PubKey interface include PubKeyEd25519 and PubKeySecp256k1. The Cosmos hub would probably register all usages of the PubKey interface under the same namespace of "tendermint".
The intent of the version is primarily to be explicit about code compatibility, but there may be other benefits too. For example, we could create tooling that scans code and automatically figures out if maybe the version & registered concrete types are not consistent with what is registered on some registry service, thus catching bugs... in the long future maybe this becomes part of dependency management, and project Makefiles etc can help fetch updates from a blockchain repository. Notice that we could support idempotency of concrete registrations per interface/version, so that all code that needs to use a particular namespaced interface can register it upon init() of that module without fear of panics or errors.
If there is no
namespace="..."specified for an interface, then it should do something consistent. Maybe we can say that not specifying the namespace is the same as setting it to the default namespace, the global namespace.In any case, the biggest change is that registering concrete types happens on that interface (or at least a namespaced version of it).
Rationale/benefits.
From #267 (comment):
Thank you, responding in order:
(1) (assuming you mean prefix in the general sense, and not the current amino disamb/prefix sense), I think another way to express it is, "make the field number a composite of two things; the module and the type". And this would be helpful in the sense that it would provide something similar to what the amino disamb/prefix bytes tries to provide with interfaces -- some practical conflict prevention system, such that there is no/less need to worry about local codecs, because they can all be thrown in a global per-module namespace. But I'm not sure, is there as line number in the PR that you can point me to, so I can understand more? (Should I just read the whole PR?).
(2) I do want the latest version of the schemas used by a chain to be in the state as well for some clients. Also, using the state to keep track of duplicates or conflicts (or perhaps keeping a priority ordering of concrete types) seems like a good idea.
(3) is basically what I am about to propose.
Agreed!
I like the way you put it, or described it.
I don't think our standards will be adopted, if we had to use the proto.Any spec exactly, which isn't (AFAIK) fully supported in implementations, and doesn't quite suite needs of power users. Since power users already require an optimized standard, such a standard will emerge, so if we know how to, and it isn't hard to, we should go with that.
What are valid namespaces?
I think that namespaces should be in the form of a domain+path (I think maybe a https:// or cosmos:// spec is not needed yet, and could maybe default to a blockchain in the future). Maybe "tendermint" and a few other names can be special, and not require a "dot". Maybe "cosmos", "sdk", and "tendermint" are reserved? I dunno.
Proto4?
Going with the previous example, maybe in Protobuf4, we can have:
Other
The original proposal wording is here: https://gist.github.com/jaekwon/1b9dd3b2accd4b5f49661b49536232dc but I think the wording is confusing, so I tried to expand upon it here.