PARQUET-1125: Add UUID logical type. #71

Closed
13 changes: 12 additions & 1 deletion LogicalTypes.md
@@ -48,7 +48,18 @@ was converted from an enumerated type in another data model (e.g. Thrift, Avro,
Applications using a data model lacking a native enum type should interpret an
`ENUM`-annotated field as a UTF-8 encoded string.

-The sort order used for `ENUM`s is `UNSIGNED` byte-wise comparison.
+The sort order used for `ENUM` values is unsigned byte-wise comparison.
+
+### UUID
+
+`UUID` annotates a 16-byte fixed-length binary. The value is encoded in
+big-endian byte order, so that `00112233-4455-6677-8899-aabbccddeeff` is encoded as the
+bytes `00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff`
+(this example is from [Wikipedia's UUID page][wiki-uuid]).
+
+The sort order used for `UUID` values is unsigned byte-wise comparison.
+
+[wiki-uuid]: https://en.wikipedia.org/wiki/Universally_unique_identifier
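
The encoding and sort order proposed for `UUID` can be illustrated with a minimal Python sketch. It is not part of the change itself: the helper names `encode_uuid`, `decode_uuid`, and `sort_uuids` are hypothetical, and the sketch assumes the standard-library `uuid` module, whose `UUID.bytes` attribute yields the 16 bytes in big-endian order.

```python
import uuid


def encode_uuid(text):
    """Encode a UUID string as 16 big-endian bytes (hypothetical helper)."""
    return uuid.UUID(text).bytes


def decode_uuid(raw):
    """Decode 16 big-endian bytes back into the canonical UUID string."""
    if len(raw) != 16:
        raise ValueError(f"UUID values must be exactly 16 bytes, got {len(raw)}")
    return str(uuid.UUID(bytes=raw))


def sort_uuids(values):
    """Sort encoded UUIDs by unsigned byte-wise comparison.

    Python compares bytes objects lexicographically and treats every byte
    as an unsigned integer in 0..255, which matches the proposed order.
    """
    return sorted(values)


example = encode_uuid("00112233-4455-6677-8899-aabbccddeeff")
print(example.hex(" "))  # 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff
```

Unsigned comparison matters for values whose leading byte is `0x80` or higher: under a signed byte comparison those would sort before `0x7f...`, whereas the order described here places them after.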

## Numeric Types

1 change: 1 addition & 0 deletions src/main/thrift/parquet.thrift
@@ -226,6 +226,7 @@ struct Statistics {

/** Empty structs to use as logical type annotations */
struct StringType {} // allowed for BINARY, must be encoded with UTF-8
+struct UUIDType {} // allowed for FIXED[16], must be encoded as raw UUID bytes
struct MapType {} // see LogicalTypes.md
struct ListType {} // see LogicalTypes.md
struct EnumType {} // allowed for BINARY, must be encoded with UTF-8
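
The constraint in the `UUIDType` comment (the annotation is only allowed on a 16-byte fixed-length column) could be checked by a reader or writer roughly as follows. This is a sketch under stated assumptions: `ColumnSchema` and `check_uuid_annotation` are hypothetical names invented for illustration and do not correspond to any Parquet library API. Only the physical type name `FIXED_LEN_BYTE_ARRAY` comes from the Parquet format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColumnSchema:
    """Hypothetical, simplified view of a column's schema element."""
    name: str
    physical_type: str           # e.g. "FIXED_LEN_BYTE_ARRAY" or "BINARY"
    type_length: Optional[int]   # only meaningful for fixed-length types
    logical_type: Optional[str]  # e.g. "UUID", "STRING", or None


def check_uuid_annotation(column):
    """Raise ValueError if a UUID annotation sits on an unsupported type."""
    if column.logical_type != "UUID":
        return  # nothing to check for other annotations
    if column.physical_type != "FIXED_LEN_BYTE_ARRAY" or column.type_length != 16:
        raise ValueError(
            f"column {column.name!r}: UUID requires FIXED_LEN_BYTE_ARRAY(16), "
            f"got {column.physical_type}({column.type_length})"
        )


valid = ColumnSchema("id", "FIXED_LEN_BYTE_ARRAY", 16, "UUID")
check_uuid_annotation(valid)  # passes silently
```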