RFC: Add format_major_version and format_minor_version to thrift metadata #582

Draft
alamb wants to merge 3 commits into apache:master from alamb:alamb/parquet-versions-option-2

alamb commented Jun 9, 2026

Contributor

Rationale for this change

NOTE: there is an alternate RFC here

As described on the mailing list thread about versions, there is currently no way for a Parquet reader to know which of the many Parquet features it may encounter in a particular file.

The current version field in the Thrift metadata is insufficient because:

  1. There is no agreed-upon definition of version, and many writers use it incorrectly:

* As of December 2025, there is no agreed upon consensus of what constitutes
* version 2 of the file. For maximum compatibility with readers, writers should
* always populate "1" for version. For maximum compatibility with writers,
* readers should accept "1" and "2" interchangeably. All other versions are
* reserved for potential future use-cases.
*/

  2. Even if we agreed to use the version field, version "2" includes several forward-incompatible changes (see "Document Parquet Features by Version", parquet-site#186), so a reader cannot know which features it may encounter.

What changes are included in this PR?

Add new format_major_version and format_minor_version fields to the Thrift metadata to encode the version of parquet-format the file was written against. Readers can use these fields to determine which features they may encounter.
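As a rough illustration of how a reader might consume the two fields, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `FileMetaData` dataclass is a hypothetical stand-in for the decoded Thrift footer, `READER_MAX_VERSION` is a made-up value, and the comparison rule (a file may contain unknown features when its declared version is newer than the one the reader implements) is one possible interpretation, not something this RFC defines.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical: the newest parquet-format version this reader implements.
READER_MAX_VERSION: Tuple[int, int] = (2, 12)


@dataclass
class FileMetaData:
    """Hypothetical stand-in for the decoded Thrift footer metadata."""
    version: int
    # Both proposed fields are optional, so files from older writers omit them.
    format_major_version: Optional[int] = None
    format_minor_version: Optional[int] = None


def declared_format_version(meta: FileMetaData) -> Optional[Tuple[int, int]]:
    """Return (major, minor) if the writer set both fields, else None."""
    if meta.format_major_version is None or meta.format_minor_version is None:
        return None
    return (meta.format_major_version, meta.format_minor_version)


def may_contain_unknown_features(
    meta: FileMetaData,
    reader_max: Tuple[int, int] = READER_MAX_VERSION,
) -> Optional[bool]:
    """Assumed rule: a declared version newer than the reader's is a warning sign.

    Returns None when the fields are absent, because nothing can be
    concluded about a file that does not declare its format version.
    """
    declared = declared_format_version(meta)
    if declared is None:
        return None
    # Tuples compare lexicographically: major first, then minor.
    return declared > reader_max
```

A reader could call `may_contain_unknown_features(meta)` after decoding the footer and, depending on the result, proceed, warn, or reject the file. How strictly to react is left open here, as it is in the RFC.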

Open questions:

  • Guidance for writing with maximum compatibility (set version to 2 and then use major/minor version?)
  • Should new versions of parquet-format require writers to set this field?
  • Would we ever change the version field?

These fields would be ignored by older readers.

Do these changes have PoC implementations?

Not yet
