Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVRO-3649: Fix for union type to match default values for any innertype #1922

Closed
wants to merge 3 commits into from

Conversation

clesaec
Copy link
Contributor

@clesaec clesaec commented Oct 20, 2022

What is the purpose of the change

Change the logic on Union type to validate a value (default value in particular) for AVRO-3649. Instead of check value with first union inner type, try with inner type until find a match.

Verifying this change

Some test were updated.

Documentation

  • Does this pull request introduce a new feature? no

@github-actions github-actions bot added the Java Pull Requests for Java binding label Oct 20, 2022
* @param jsonValue a value to check against the schema
* @return true if the value is valid according to this schema
*/
public boolean isValidDefault(JsonNode jsonValue) {

Check notice

Code scanning / CodeQL

Missing Override annotation Note

This method overrides
Schema.isValidDefault
; it is advisable to add an Override annotation.
@RyanSkraba RyanSkraba changed the title avro-3649: POC for union type to match default values for any innertype AVRO-3649: POC for union type to match default values for any innertype Oct 20, 2022
@KalleOlaviNiemitalo
Copy link
Contributor

Does this pull request introduce a new feature? no

It does. This would have to be documented and implemented for other languages too.

If there is a union ["string", "bytes"], and the default value is a string, then this won't know which member was meant. That may be ok for nullable fields but makes the feature a bit more complex to document.

@clesaec
Copy link
Contributor Author

clesaec commented Oct 25, 2022

For bytes, even this code doesn't work

    Schema schemaBytes = Schema.create(Type.BYTES);
    BinaryNode validBinary = new BinaryNode("Hello".getBytes(StandardCharsets.UTF_8));
    assertTrue(schemaBytes.isValidValue(validBinary));

It it may be another issue.

For docs, indeed, text is now reviewed

For other languages, i will made other PR (otherwise, this one will be too big)

@KalleOlaviNiemitalo
Copy link
Contributor

BinaryNode is documented as "Base64 encoded binary value", but Avro does not use Base64 in field default values or in JSON encoding, so I don't think Schema.isValidValue needs to support the BinaryNode type.

@clesaec
Copy link
Contributor Author

clesaec commented Oct 25, 2022

for other languages too
This PR is for rust

Comment on lines 621 to 622
boolean acceptNull = fieldSchema.getTypes().stream().map(Schema::getType).anyMatch((t) -> t == Schema.Type.NULL);
if (acceptNull) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One can also call fieldSchema.isNullable() here

Comment on lines -1656 to -1813
case UNION: // union default: first branch
return isValidDefault(schema.getTypes().get(0), defaultValue);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me that these changes make any value for the null schema or a union an invalid value. What am I overlooking?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static method isValidDefault (line 1650 on new version) is private, called by new virtual function "public boolean isValidValue(final Object value) {", line 1645.
This function is redefine for NULL Schema (line 1387) and for union schema (line 1248).
May be the main flaw of this PR is to keep the static isValidDefault method and not build function isValidValue for each schema type, but it also would duplicate code for STRING, BYTES, ENUM for example (See here)).

@clesaec clesaec requested a review from opwvhk January 4, 2023 07:27
@clesaec clesaec force-pushed the avro-3649_unionDefaultPOC branch from 34a76ea to 343e343 Compare January 5, 2023 11:11
@clesaec clesaec changed the title AVRO-3649: POC for union type to match default values for any innertype AVRO-3649: Fix for union type to match default values for any innertype Jan 20, 2023
@clesaec clesaec force-pushed the avro-3649_unionDefaultPOC branch from 343e343 to 0d5d9b9 Compare June 15, 2023 09:04
@clesaec clesaec force-pushed the avro-3649_unionDefaultPOC branch from 0d5d9b9 to 7af5cc8 Compare September 19, 2023 09:36
@clesaec clesaec closed this Sep 19, 2023
@clesaec clesaec deleted the avro-3649_unionDefaultPOC branch September 19, 2023 09:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Java Pull Requests for Java binding website
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants