EU Data Act: store data points by key/name, live SoC fix#31259
Merged
Conversation
Replaces the dual name+GUID map indexing with a flat []point. Each point carries its Key (GUID) and Name; find() matches by key first, then name, so a point is stored once yet looked up by either. Drops the knownKeys allowlist and the double-store/double-log of generically-named points (e.g. the ID.3 range). Behaviour preserved: id-based dedup keeps newest-per-id (last wins on equal timestamps), merge stamps delivery Seq, lookup picks the freshest field.
…plify # Conflicts: # vehicle/vw/eudataact/store.go # vehicle/vw/eudataact/types.go
Contributor
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The new
findhelper does a linear scan over the entire slice on every lookup and is now used inmerge,lookup, and tests; if the number of points per vehicle grows this could become a hot path, so consider maintaining a small auxiliary index (e.g. amap[id]int]) or precomputing a map when decoding to avoid repeated O(n) scans. - Given that
point.id()prefersKeyoverNamewhilefindmatches bothKeyandName, it might be worth adding a small comment or invariant check to ensure there are no cases where two distinct logical signals share the sameNamebut differ inKey, since that could makefindambiguous despite the deduplication logic.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `find` helper does a linear scan over the entire slice on every lookup and is now used in `merge`, `lookup`, and tests; if the number of points per vehicle grows this could become a hot path, so consider maintaining a small auxiliary index (e.g. a `map[id]int]`) or precomputing a map when decoding to avoid repeated O(n) scans.
- Given that `point.id()` prefers `Key` over `Name` while `find` matches both `Key` and `Name`, it might be worth adding a small comment or invariant check to ensure there are no cases where two distinct logical signals share the same `Name` but differ in `Key`, since that could make `find` ambiguous despite the deduplication logic.
## Individual Comments
### Comment 1
<location path="vehicle/vw/eudataact/types.go" line_range="202-211" />
<code_context>
- res[name] = p
- }
+// points decodes data points, keeping the newest entry per id (see point.id).
+func points(data []dataPoint) []point {
+ var res []point
- for _, p := range data {
- if p.Value == "" {
+ for _, dp := range data {
+ if dp.Value == "" {
continue
}
var ts time.Time
- if p.TimestampUtc != nil {
- ts = *p.TimestampUtc
+ if dp.TimestampUtc != nil {
+ ts = *dp.TimestampUtc
+ }
+ p := point{Key: dp.Key, Name: dp.DataFieldName, Value: dp.Value, Timestamp: ts}
+ if p.id() == "" {
+ continue
}
- pt := point{Key: p.Key, Name: p.DataFieldName, Value: p.Value, Timestamp: ts}
- set(p.DataFieldName, pt)
- if _, ok := knownKeys[p.Key]; ok {
- set(p.Key, pt)
+ if e := find(res, p.id()); e != nil {
+ // newest wins; on equal timestamps the later entry wins
+ if !e.Timestamp.After(p.Timestamp) {
</code_context>
<issue_to_address>
**suggestion (performance):** Current slice-based deduplication is O(n²); consider a map-based index to avoid quadratic behavior on larger datasets.
Using `find` on an ever-growing `[]point` makes deduplication O(n²). If this runs on larger datasets or in a hot path, it may become a bottleneck. Consider maintaining a `map[string]int` (id → index into `res`) while building `res`, so each lookup is O(1) and order is preserved. The same index map can be reused to avoid repeated scans in `merge` and `lookup`.
Suggested implementation:
```golang
// points decodes data points, keeping the newest entry per id (see point.id).
// It preserves the order of first occurrence while deduplicating by id.
func points(data []dataPoint) []point {
var (
res []point
index = make(map[string]int, len(data)) // id -> index in res
)
for _, dp := range data {
if dp.Value == "" {
continue
}
var ts time.Time
if dp.TimestampUtc != nil {
ts = *dp.TimestampUtc
}
p := point{
Key: dp.Key,
Name: dp.DataFieldName,
Value: dp.Value,
Timestamp: ts,
}
id := p.id()
if id == "" {
continue
}
if i, ok := index[id]; ok {
// newest wins; on equal timestamps the later entry wins
existing := &res[i]
if !existing.Timestamp.After(p.Timestamp) {
*existing = p
}
continue
}
index[id] = len(res)
res = append(res, p)
}
return res
}
```
The comment also suggests reusing an index map in `merge` and `lookup`. To fully implement that:
1. Identify any `merge` / `lookup` helpers that currently scan `[]point` linearly (possibly using `find`).
2. Introduce a shared `map[string]int` (or pass it as a parameter) in those functions, using the same pattern as above to avoid repeated O(n) scans.
3. Remove or limit use of `find` where the map-based index is available, to ensure lookups and merges remain O(1) per operation.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Comment on lines
+202
to
+211
| func points(data []dataPoint) []point { | ||
| var res []point | ||
|
|
||
| for _, p := range data { | ||
| if p.Value == "" { | ||
| for _, dp := range data { | ||
| if dp.Value == "" { | ||
| continue | ||
| } | ||
|
|
||
| var ts time.Time | ||
| if p.TimestampUtc != nil { | ||
| ts = *p.TimestampUtc | ||
| if dp.TimestampUtc != nil { |
Contributor
There was a problem hiding this comment.
suggestion (performance): Current slice-based deduplication is O(n²); consider a map-based index to avoid quadratic behavior on larger datasets.
Using find on an ever-growing []point makes deduplication O(n²). If this runs on larger datasets or in a hot path, it may become a bottleneck. Consider maintaining a map[string]int (id → index into res) while building res, so each lookup is O(1) and order is preserved. The same index map can be reused to avoid repeated scans in merge and lookup.
Suggested implementation:
// points decodes data points, keeping the newest entry per id (see point.id).
// It preserves the order of first occurrence while deduplicating by id.
func points(data []dataPoint) []point {
var (
res []point
index = make(map[string]int, len(data)) // id -> index in res
)
for _, dp := range data {
if dp.Value == "" {
continue
}
var ts time.Time
if dp.TimestampUtc != nil {
ts = *dp.TimestampUtc
}
p := point{
Key: dp.Key,
Name: dp.DataFieldName,
Value: dp.Value,
Timestamp: ts,
}
id := p.id()
if id == "" {
continue
}
if i, ok := index[id]; ok {
// newest wins; on equal timestamps the later entry wins
existing := &res[i]
if !existing.Timestamp.After(p.Timestamp) {
*existing = p
}
continue
}
index[id] = len(res)
res = append(res, p)
}
return res
}The comment also suggests reusing an index map in merge and lookup. To fully implement that:
- Identify any
merge/lookuphelpers that currently scan[]pointlinearly (possibly usingfind). - Introduce a shared
map[string]int(or pass it as a parameter) in those functions, using the same pattern as above to avoid repeated O(n) scans. - Remove or limit use of
findwhere the map-based index is available, to ensure lookups and merges remain O(1) per operation.
…plify # Conflicts: # vehicle/vw/eudataact/eudataact_test.go # vehicle/vw/eudataact/provider.go # vehicle/vw/eudataact/types.go
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix #31256
Stores delivered data points in a slice keyed by GUID, falling back to the field name, so non-unique data field names can be disambiguated by key.
The official V5.0 data dictionary (added as
vehicle/vw/eudataact/datadictionary.jsonfor reference) confirms many field names are not unique:battery_state_report.socmaps to 4 GUIDs,settings.target_socto 4,mileage.valueto 3,charging_state_report.*to 4 each.SoC now reads the live fields only —
battery_level_HV.value,state_of_charge,hv_soc.battery_state_report.socis dropped: the dictionary defines it as "State of Charging when charging is started" (a charge-start snapshot, not the live value), and thebattery_level_HV.state == VALIDgating is removed.