indexer: backups are now a deterministic set of SQL statements#1338
indexer: backups are now a deterministic set of SQL statements#1338altergui wants to merge 1 commit into
Conversation
Pull Request Test Coverage Report for Build 9498381141Details
💛 - Coveralls |
062a1a5 to
cc46776
Compare
Pull Request Test Coverage Report for Build 9498587081Details
💛 - Coveralls |
Pull Request Test Coverage Report for Build 9498735272Details
💛 - Coveralls |
mvdan
left a comment
There was a problem hiding this comment.
Thanks for doing this. I said I would, but haven't had the time. LGTM with some nits.
Did you consider a test? Just to make sure that we don't regress with any other non-determinism in SQL backups in the future.
Also, I wonder if we should rename the indexer sql "backup" feature to "snapshot", as the latter is perhaps more well understood to mean "deterministic".
| statement.WriteString(line) | ||
| statement.WriteString("\n") | ||
|
|
||
| if strings.HasSuffix(line, ";") { |
There was a problem hiding this comment.
are you sure that the entire text ends with ;? what if it's like
foo;
bar;
baz
in that case you would ignore baz, so perhaps add a sanity check at the end that there's no remaining statement.
|
Could you explain why this is now deterministic? thanks |
cc46776 to
47767ee
Compare
Pull Request Test Coverage Report for Build 9546598969Details
💛 - Coveralls |
|
it's deterministic because the SQL statements produced by each node are the same (bit-by-bit) . this was not the case with the raw db format. |
they are a (zstd compressed) set of SQL statements also, relevant methods now use io.Reader and io.Writer
47767ee to
f345045
Compare
this is much less performant that simply exporting/importing the raw binary database, but makes the backup deterministic, allowing StateSync to pull the snapshot from many nodes at the same time