This repository was archived by the owner on Feb 25, 2026. It is now read-only.
File: adrs/0009-repository-identity-and-discovery.md
Lines changed: 58 additions & 31 deletions
Original file line number
Diff line number
Diff line change
@@ -10,40 +10,40 @@ This ADR documents the architecture for `eka` repository identity, atom naming,
10
10
11
11
The core problems to be solved are:
12
12
13
-
1. **Consistent Identity Model**: An atom's identity is a two-part system: a machine-verifiable component (the root commit hash) and a human-readable name (`label`). The repository identity model should follow this same successful pattern. A formal mechanism is needed to add a user-definable naming component to the repository's identity. A primary consequence of this is the robust disambiguation of forks from mirrors.
13
+
1. **Consistent Identity Model**: An atom's identity is a two-part system: a machine-verifiable component (the root commit hash) and a human-readable name (`label`). The repository identity model should follow a similar pattern. A formal mechanism is needed to establish repository identity that provides robust disambiguation of forks from mirrors.
14
14
2. **Source of Truth**: A repository's composition is implicitly defined by the atoms present on the filesystem. A formal, declarative manifest is needed to act as the single, unambiguous source of truth.
15
15
3. **Discovery Inefficiency**: A performant method is needed to discover all atoms within a local checkout without expensive filesystem traversals.
16
16
4. **Terminology Ambiguity**: The historical use of `tag` for an atom's unique identifier is ambiguous when juxtaposed with the `tags` metadata list.
17
-
5. **Remote Discovery**: The purpose of the existing "Manifest Ref" must be clarified as the primary mechanism for all remote metadata discovery.
18
17
19
18
## Decision
20
19
21
-
The architecture is centered on a new root `ekala.toml` manifest as the single source of truth. It also formalizes terminology and clarifies the role of the "Manifest Ref" for remote discovery.
20
+
The architecture is centered on a new root `ekala.toml` manifest as the single source of truth. It establishes repository identity through initialization commits with entropy injection, providing robust fork disambiguation and temporal anchoring.
22
21
23
22
### 1. The Source of Truth: `ekala.toml` (New)
24
23
25
24
A single `ekala.toml` file **must** exist at the root of the repository. Its primary purpose is to serve as the **single source of truth** for the repository's composition.
26
25
27
-
- **Function**: It defines the repository's canonical `label` for fork disambiguation and provides a complete, static index of all `packages` (atoms) it contains.
26
+
- **Function**: It provides a complete, static index of all `packages` (atoms) it contains. Repository identity is established through the initialization process rather than explicit naming. It also supports optional metadata for enhanced discoverability.
28
27
- **Format**:
29
28
30
29
```toml
31
30
# ekala.toml
32
31
33
-
[project]
34
-
# The canonical, human-readable name for this repository.
35
-
# This is mixed into the atom ID hash to disambiguate forks.
36
-
label = "my-project"
37
-
38
-
# An optional list of tags for logically grouping entire repositories.
39
-
tags = ["ui-kit", "experimental"]
40
-
41
32
# A flat list of all atoms in this repository, identified by their path.
42
33
# The publisher will enforce that all atom names are unique within the repository.
34
+
[set]
43
35
packages = [
44
36
"path/to/ui-kit/button",
45
37
"path/to/core/validator",
46
38
]
39
+
40
+
41
+
# Optional key-value metadata for structured filtering and queries
42
+
[metadata]
43
+
domain = "my-company.com"
44
+
license = "MIT"
45
+
# Optional tags for simple categorization
46
+
tags = ["ui-kit", "experimental"]
47
47
```
48
48
49
49
### 2. The Atom and its Metadata: `atom.toml` (Terminology Change)
@@ -62,28 +62,44 @@ Each atom continues to be defined by an `atom.toml` file. This ADR formalizes a
62
62
label = "button"
63
63
version = "1.0.0"
64
64
65
+
66
+
# Optional key-value metadata for structured filtering and queries
67
+
[metadata]
68
+
license = "MIT"
69
+
maintainer = "ui-team@company.com"
65
70
# An optional list of arbitrary strings for logical grouping.
66
71
# This is the foundation for metadata-driven collections.
67
72
tags = ["ui", "interactive"]
68
73
```
69
74
70
-
### 3. Atom Identity (Formalized)
75
+
### 3. Repository Identity (New)
76
+
77
+
Repository identity is established through an initialization commit with entropy injection, providing robust disambiguation and temporal anchoring. Unlike atoms (which are individual components that benefit from human-readable names), repositories are collections of components where temporal identity provides clearer provenance tracking.
78
+
79
+
**Note**: This initialization commit mechanism is outlined here but will be implemented post-MVP to avoid delaying the core functionality.
80
+
81
+
- **Initialization Process**: When `eka init` is run, a special initialization commit is created. This commit includes injected entropy (random data for cryptographic strength) in its header along with a unique "ekala" identifier. Git commits are snapshots of repository state with metadata; headers contain additional information like author details.
82
+
- **Identity Components**: Repository identity is defined by this initialization commit, which implicitly includes the repository's complete history (including the original root commit) through Git's ancestry system. Each commit references its parent(s) by hash, forming a directed acyclic graph that links the initialization point to the repository's entire development timeline.
83
+
- **Temporal Anchoring**: The init commit establishes a clear point in history when the repository was explicitly configured for Ekala, preventing publication of atoms created before this point and enabling precise analysis of when forks occurred. This creates a temporal boundary that distinguishes "before Ekala" from "after Ekala" in the repository's history.
84
+
- **Fork Tracking**: The unique "ekala" identifier in init commit headers allows tracking of repository reinitializations and fork points by marking commits that represent new identity establishments, providing a historical record of when repositories established independent identities.
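To make the effect of entropy injection concrete, the sketch below computes the object ID Git would assign to a commit that carries an extra header. It is only an illustration under stated assumptions: the header name `ekala-init`, the author identity, and the commit message are hypothetical, the empty tree is used as a stand-in for the real tree, and the hashing shown applies to SHA-1 repositories.

```python
import hashlib
import secrets

# Git's well-known ID for the empty tree, used here as a placeholder tree.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def fresh_entropy() -> str:
    """Return 32 random bytes, hex-encoded, to inject into the init commit."""
    return secrets.token_hex(32)


def init_commit_id(parent: str, entropy_hex: str, tree: str = EMPTY_TREE) -> str:
    """Compute the SHA-1 object ID of a sketch init commit.

    The `ekala-init` header is a hypothetical name; the ADR does not fix one.
    """
    ident = "Example <example@example.invalid> 0 +0000"
    body = (
        f"tree {tree}\n"
        f"parent {parent}\n"
        f"author {ident}\n"
        f"committer {ident}\n"
        f"ekala-init {entropy_hex}\n"  # extra header carrying the entropy
        "\n"
        "init: ekala\n"
    ).encode()
    # Git hashes "commit <size>\0" followed by the raw commit body.
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()
```

Two repositories that share the same parent history but call `init_commit_id` with different `fresh_entropy()` values end up with different init commit hashes, which is the property the fork disambiguation relies on.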
85
+
86
+
### 4. Atom Identity (Formalized)
71
87
72
88
An atom's identity is a cryptographic hash. This ADR formalizes its components.
73
89
74
90
- **Hashing Components**: The ID is derived from two components:
75
-
1. The repository's **root commit hash**.
91
+
1. The repository's **init commit hash** (which implicitly encodes the entire repository history including the root commit through Git's parent chain system).
76
92
2. The atom's `label` (as defined in its `atom.toml`).
77
-
- **Fork Disambiguation**: The `project.label` from the root `ekala.toml` is incorporated into the hashing process, ensuring that forks with identical roots produce unique atom IDs.
93
+
- **Fork Disambiguation**: The init commit identity ensures that repositories with different initialization histories produce unique atom IDs, even if they share the same root of history.
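The two hashing components can be combined as in the following sketch. The ADR does not specify the hash function or the encoding, so SHA-256 and the length-prefixed layout used here are assumptions chosen for illustration, and `atom_id` is a hypothetical name.

```python
import hashlib


def atom_id(init_commit: str, label: str) -> str:
    """Derive an illustrative atom ID from the init commit hash and the atom label.

    Assumptions: SHA-256 as the hash and an 8-byte length prefix per field.
    """
    h = hashlib.sha256()
    # Length-prefix each field so the boundary between them is unambiguous.
    for field in (bytes.fromhex(init_commit), label.encode("utf-8")):
        h.update(len(field).to_bytes(8, "big"))
        h.update(field)
    return h.hexdigest()
```

Under this scheme, an atom labelled `button` in two forks that were initialized separately gets two distinct IDs, while the same repository always reproduces the same ID for the same label.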
78
94
79
-
### 4. Git Refspec Architecture
95
+
### 5. Git Refspec Architecture
80
96
81
97
To support this architecture, a unified and consistent Git refspec is required. All `ekala`-specific refs will live under the `refs/ekala/` namespace.
82
98
83
-
- **Repository Identity**: The repository's canonical name is advertised in a single, top-level ref.
99
+
- **Repository Identity**: Repository identity is established through a single ref that points to the latest initialization commit, leveraging Git's Merkle DAG structure (where each commit references its parent commits by hash).
84
100
85
-
- **Format**: `refs/ekala/project/<project-label>`
86
-
- **Content**: This ref points to the repository's root commit hash.
101
+
- **Format**: `refs/ekala/init`
102
+
- **Content**: Points to the entropy-injected initialization commit hash. The root commit is implicitly encoded through the commit's ancestry chain, eliminating the need for a separate root ref.
87
103
88
104
- **Atom Content**: The primary ref for an atom points directly to its content. This path is optimized for the most common operation.
89
105
@@ -93,20 +109,28 @@ To support this architecture, a unified and consistent Git refspec is required.
A project rename is a critical lifecycle event that must be handled gracefully. This is managed in the manifest, which is the single source of truth.
114
+
Repository identity evolution is handled through the immutable initialization commit system, eliminating the need for the complex deprecation mechanisms in the previous draft.
99
115
100
-
- **Mechanism**: The `ekala.toml` manifest is extended with an optional `deprecated.labels` field.
101
-
```toml
102
-
[project]
103
-
label = "new-project-name"
104
-
deprecated.labels = ["old-project-name"]
105
-
```
106
-
- **Publisher Behavior**: The `eka publish` command will publish the primary ref (`refs/ekala/project/new-project-name`) and a special deprecation ref (`refs/ekala/deprecated/old-project-name`).
107
-
- **Resolver Behavior**: When resolving a dependency on an old name, the resolver will discover the deprecation ref, follow it to the new name, and emit a warning to the user, ensuring a non-breaking upgrade path.
116
+
- **Mechanism**: Since repository identity is tied to immutable Git commits rather than mutable labels, identity changes require explicit reinitialization. This provides clean-slate evolution without legacy baggage.
117
+
- **Publisher Behavior**: The `eka init` command creates an initialization commit containing the `ekala.toml` manifest and publishes the `refs/ekala/init` ref pointing to it, establishing the repository's identity.
118
+
- **Resolver Behavior**: Resolvers verify atom authenticity by checking that the atom's identity components match the published repository's initialization commit hash.
119
+
120
+
### 6. Alternatives Considered
121
+
122
+
#### User-Managed Repository Labels
123
+
124
+
A system of user-defined repository labels (similar to atom labels) was considered as an alternative to initialization commits. This would involve adding a `label` field to `ekala.toml` and incorporating it into repository identity calculations.
125
+
126
+
**Why Rejected:**
127
+
128
+
- **Collection vs Component**: Atoms are individual components that benefit from human-readable names for coordination. Repositories are collections of components where temporal identity provides clearer provenance tracking and avoids naming conflicts in a decentralized system.
129
+
- **Maintenance Complexity**: User-managed labels require deprecation mechanisms for renames, adding complexity that temporal identity avoids through immutable Git commits.
130
+
- **Coordination Overhead**: Labels create social coordination challenges (name conflicts, ownership disputes) that temporal identity sidesteps by using cryptographic time-based identity instead of human names.
131
+
- **Decentralized Constraints**: Without central registries, label-based coordination becomes impractical at scale, while temporal identity works naturally in distributed environments.
108
132
109
-
### 5. Logical Grouping: Tags over Formal Sets
133
+
#### Logical Grouping: Tags over Formal Sets
110
134
111
135
A rigid, filesystem-based `set` hierarchy was considered as a mechanism for grouping atoms. This approach was **rejected** because it conflates **physical layout** with **logical grouping**, is inflexible (an atom can only belong to one set), and does not work across repository boundaries.
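The contrast with a one-set-per-atom hierarchy is easy to see in a small sketch of tag-based grouping. The function name `group_by_tag` and the input shape (atom label mapped to its `tags` list) are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_tag(atoms: dict[str, list[str]]) -> dict[str, set[str]]:
    """Build a tag -> atom labels index from each atom's `tags` list."""
    index: defaultdict[str, set[str]] = defaultdict(set)
    for label, tags in atoms.items():
        for tag in tags:
            # An atom is added to every group it is tagged with.
            index[tag].add(label)
    return dict(index)
```

An atom tagged `["ui", "interactive"]` appears in both groups at once, and the input can mix atoms from several repositories, neither of which a filesystem-based set allows.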
112
136
@@ -116,11 +140,14 @@ The chosen `tags` system is superior because it is a **metadata-driven** approac
116
140
117
141
**Pros**:
118
142
119
-
- **Unambiguous Identity**: The `project.label` in the root `ekala.toml`, combined with the root commit hash and atom `label`, provides a robust, fork-safe identity for all atoms.
143
+
- **Robust Identity**: The initialization commit system provides cryptographically verifiable provenance with temporal anchoring, enabling precise fork analysis and preventing publication of atoms created before explicit Ekala initialization.
144
+
- **Simplified Evolution**: Repository identity changes require clean reinitialization rather than complex deprecation management, providing clearer lifecycle semantics.
145
+
- **Cryptographic Strength**: Entropy injection makes each init commit hash unique with overwhelming probability, while Git's content addressing provides immutability at no extra cost.
120
146
- **Clear Terminology**: Deprecating `tag` in favor of `label` for the unique identifier resolves a major point of confusion.
121
147
- **Performant Discovery**: Both local discovery (reading the root `ekala.toml`) and remote discovery (querying for manifest refs) are extremely fast and avoid filesystem traversals or repository clones.
122
148
- **Clear and Formalized**: The architecture is now based on a clear set of rules, with a single source of truth and precise terminology.
123
149
- **Flexible Grouping**: The metadata-driven `tags` system allows for flexible, multi-faceted grouping of atoms, which is not possible with a rigid, filesystem-based hierarchy.
150
+
- **Rich Metadata**: The dual tagging system (tags + key-value metadata) enables both simple categorization and structured queries, supporting advanced decentralized discovery through systems like Eos.