Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: add variant type support #11831

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

aihuaxu
Copy link
Contributor

@aihuaxu aihuaxu commented Dec 19, 2024

This is to add some required changes in API and core module for Variant support, including:

  • Add isVariantType() method for variant type
  • Add variant support in ReassignIds and TypeUtil
  • Add variant support in avro projection.

Part of: #10392

@aihuaxu aihuaxu marked this pull request as ready for review December 19, 2024 23:00
@aihuaxu
Copy link
Contributor Author

aihuaxu commented Dec 19, 2024

cc @rdblue, @RussellSpitzer, @flyrain and @JonasJ-ap. This is to add the changes in core to support variant type.

@@ -39,7 +42,7 @@ class Identity<T> implements Transform<T, T> {
@Deprecated
public static <I> Identity<I> get(Type type) {
Preconditions.checkArgument(
type.typeId() != Type.TypeID.VARIANT, "Unsupported type for identity: %s", type);
!UNSUPPORTED_TYPES.contains(type), "Unsupported type for identity: %s", type);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change this? It works just fine right? We can alter this when we add another type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It works fine. I thought you mentioned that you wanted to make such change in a previous PR. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I might have. I would normally structure it as the contains at first but wouldn't necessarily make that change to existing code.

@@ -709,6 +709,10 @@ public T map(Types.MapType map, Supplier<T> keyResult, Supplier<T> valueResult)
return null;
}

public T variant() {
return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be safer to throw UnsupportedOperationException here instead? I think that's what we have done for expression visitors when we add new cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see others like map()/list() all return null here.

@@ -132,6 +133,8 @@ static void toJson(Type.PrimitiveType primitive, JsonGenerator generator) throws
static void toJson(Type type, JsonGenerator generator) throws IOException {
if (type.isPrimitiveType()) {
toJson(type.asPrimitiveType(), generator);
} else if (type.isVariantType()) {
generator.writeString(type.toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update TestSchemaParser?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we don't have TestSchemaParser. I'm trying to use TestMetadataUpdateParser.java indirectly test the change here. Let me know if we should create TestSchemaParser.

@@ -42,6 +42,7 @@ private SchemaParser() {}
private static final String STRUCT = "struct";
private static final String LIST = "list";
private static final String MAP = "map";
private static final String VARIANT = "variant";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Types don't typically go here. Maybe add this to fromPrimitiveString. It fits better there, even if it isn't a primitive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of that. But PrimitiveHolder calls fromPrimitiveString() which is used in PrimitiveType class. Not sure if it would cause an issue in serialization.

/** Replacement for primitive types in Java Serialization. */
class PrimitiveHolder implements Serializable {
private String typeAsString = null;
...
Object readResolve() throws ObjectStreamException {
return Types.fromPrimitiveString(typeAsString);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also fromPrimitiveString() is returning PrimitiveType. I think it's better not to mix variant with primitive. I create typeFromTypeString(String typeString) to handle both cases. Let me know how that looks.

@@ -166,6 +169,10 @@ public static String toJson(Schema schema, boolean pretty) {

private static Type typeFromJson(JsonNode json) {
if (json.isTextual()) {
if (VARIANT.equalsIgnoreCase(json.asText())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think fromPrimitiveString should handle this.

@@ -52,6 +56,15 @@ public class TestMetadataUpdateParser {
Types.NestedField.required(1, "id", Types.IntegerType.get()),
Types.NestedField.optional(2, "data", Types.StringType.get()));

private static final Schema ID_VARIANTDATA_SCHEMA =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't the right place for a Variant test. What are you trying to test?

Copy link
Contributor Author

@aihuaxu aihuaxu Dec 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no TestSchemaParser. I'm trying to test SchemaParser indirectly. Let me know if we should add TestSchemaParser or we don't have that because it's tested indirectly.

@@ -150,4 +152,16 @@ public void projectWithMapSchemaChanged() {
.as("Result of buildAvroProjection is missing some IDs")
.isFalse();
}

@Test
public void testVariantConversion() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the projection test suite the right place for conversion tests? Isn't there a conversion test suite?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Let me move there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants