Remove unneeded metadata read during update event generation #11829

Status: Open. 2 commits to merge into base: main

grantatspothero
Contributor

Follow-up to #10523.

That PR removed unnecessary object store reads after commit, but I missed one. SnapshotProducer.notifyListeners has the same problem: it reads metadata from the object store instead of using the in-memory committed Snapshot object.
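
As a rough illustration of the idea, the event can be built entirely from the snapshot object already held in memory after the commit. This is a minimal sketch, not the PR's code: `SketchSnapshot`, `SketchEvent`, and `UpdateEventSketch` are hypothetical, simplified stand-ins for Iceberg's `Snapshot` and `CreateSnapshotEvent`.

```java
import java.util.Map;

// Hypothetical, simplified stand-in for the committed snapshot (assumption, not Iceberg's interface).
final class SketchSnapshot {
  final long snapshotId;
  final long sequenceNumber;
  final Map<String, String> summary;

  SketchSnapshot(long snapshotId, long sequenceNumber, Map<String, String> summary) {
    this.snapshotId = snapshotId;
    this.sequenceNumber = sequenceNumber;
    this.summary = summary;
  }
}

// Hypothetical stand-in for the event handed to listeners.
final class SketchEvent {
  final String tableName;
  final String operation;
  final long snapshotId;
  final long sequenceNumber;
  final Map<String, String> summary;

  SketchEvent(
      String tableName,
      String operation,
      long snapshotId,
      long sequenceNumber,
      Map<String, String> summary) {
    this.tableName = tableName;
    this.operation = operation;
    this.snapshotId = snapshotId;
    this.sequenceNumber = sequenceNumber;
    this.summary = summary;
  }
}

final class UpdateEventSketch {
  private UpdateEventSketch() {}

  // Every field of the event comes from the in-memory snapshot;
  // nothing is re-read from storage after the commit.
  static SketchEvent updateEvent(String tableName, String operation, SketchSnapshot committed) {
    return new SketchEvent(
        tableName, operation, committed.snapshotId, committed.sequenceNumber, committed.summary);
  }
}
```

Because the committed snapshot is always available in this shape, the fallback path for a failed refresh (and its warning about an omitted sequence number) is no longer needed.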

github-actions bot added the core label Dec 19, 2024
grantatspothero force-pushed the gn/removeUnneededMetadataReadUpdateEvent branch 2 times, most recently from ad81312 to 28984fe (December 19, 2024 21:28)
@@ -475,10 +475,14 @@ public void commit() {
}
}

Object updateEvent(Snapshot committedSnapshot) {
Contributor Author

PendingUpdate.updateEvent is currently only used in SnapshotProducer, so I could change the interface directly to updateEvent(Snapshot committedSnapshot), but I did not want to make a backwards-incompatible API change.

The whole update event/listener functionality seems untouched for years.

Contributor

SnapshotProducer is package-private, so I think we're OK on backwards compatibility: no public API is being broken.

grantatspothero force-pushed the gn/removeUnneededMetadataReadUpdateEvent branch from 28984fe to a710e1c (December 20, 2024 18:40)
grantatspothero force-pushed the gn/removeUnneededMetadataReadUpdateEvent branch from a710e1c to 072bacf (December 20, 2024 18:43)
amogh-jahagirdar self-requested a review January 1, 2025 01:48
Contributor

amogh-jahagirdar left a comment

Thanks @grantatspothero. I agree with the principal idea of this change: derive the update event from the committed snapshot, so there is no need to potentially read the whole metadata again right after the commit. I just had some comments on the implementation; it would also be ideal to have tests that verify the produced event has the expected properties.

There's a broader question about whether the listener API is still needed. These days, at least on the commit path, the commit report sent to REST implementations carries all of these details. Still, there are probably legitimate uses in non-REST setups, or for generic patterns such as sending events to a queue. The interface is straightforward and lightweight, and users can put whatever complexity they want in their own implementations.
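
To make the "send events to a queue" pattern concrete, here is a minimal sketch of a listener that hands events off to an in-process queue. `SketchListener` and `QueueingListener` are hypothetical names introduced for illustration; the single-method callback shape is an assumption and is not copied from Iceberg's listener interface.

```java
import java.util.concurrent.BlockingQueue;

// Hypothetical single-method callback interface (assumption for this sketch).
interface SketchListener<E> {
  void onEvent(E event);
}

// Forwards each event to a queue so a separate consumer can process it later.
final class QueueingListener<E> implements SketchListener<E> {
  private final BlockingQueue<E> queue;

  QueueingListener(BlockingQueue<E> queue) {
    this.queue = queue;
  }

  @Override
  public void onEvent(E event) {
    // offer() does not block the caller. On a bounded queue it returns false
    // when the queue is full, in which case this sketch simply drops the event.
    queue.offer(event);
  }
}
```

Any heavier logic (batching, retries, publishing to an external system) would live in the consumer that drains the queue, which keeps the callback itself cheap.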

Comment on lines +163 to +168
ValidationException.check(
snapshotId == committedSnapshot.snapshotId(),
"Committed snapshotId %s does not match expected snapshotId %s",
committedSnapshot.snapshotId(),
snapshotId);
long sequenceNumber = committedSnapshot.sequenceNumber();
Contributor

Can we move this logic up a level? It's common to all the implementations.
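
One way to read this suggestion is as a template method: the base class performs the shared check once and delegates only the operation-specific part. The sketch below assumes that reading. `CommittedSnapshot`, `ProducerSketch`, `AppendSketch`, and `buildEvent` are hypothetical names, and `IllegalStateException` stands in for the `ValidationException.check` call in the diff.

```java
import java.util.Map;

// Hypothetical stand-in for the committed snapshot (assumption, not Iceberg's interface).
final class CommittedSnapshot {
  final long snapshotId;
  final long sequenceNumber;
  final Map<String, String> summary;

  CommittedSnapshot(long snapshotId, long sequenceNumber, Map<String, String> summary) {
    this.snapshotId = snapshotId;
    this.sequenceNumber = sequenceNumber;
    this.summary = summary;
  }
}

abstract class ProducerSketch {
  private final long expectedSnapshotId;

  ProducerSketch(long expectedSnapshotId) {
    this.expectedSnapshotId = expectedSnapshotId;
  }

  // Shared step: the ID check lives here once instead of in every subclass.
  final Object updateEvent(CommittedSnapshot committed) {
    if (expectedSnapshotId != committed.snapshotId) {
      throw new IllegalStateException(
          String.format(
              "Committed snapshotId %s does not match expected snapshotId %s",
              committed.snapshotId, expectedSnapshotId));
    }
    return buildEvent(committed);
  }

  // Operation-specific step: each subclass only decides which event to build.
  abstract Object buildEvent(CommittedSnapshot committed);
}

// Hypothetical subclass; a real producer would return a typed event object.
final class AppendSketch extends ProducerSketch {
  AppendSketch(long expectedSnapshotId) {
    super(expectedSnapshotId);
  }

  @Override
  Object buildEvent(CommittedSnapshot committed) {
    return "append:" + committed.snapshotId + ":" + committed.sequenceNumber;
  }
}
```

If the validation is dropped altogether, the same structure still applies: the base method would simply extract the common fields and pass them to the subclass.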

Comment on lines 956 to +961
long snapshotId = snapshotId();
Snapshot justSaved = ops().refresh().snapshot(snapshotId);
long sequenceNumber = TableMetadata.INVALID_SEQUENCE_NUMBER;
Map<String, String> summary;
if (justSaved == null) {
// The snapshot just saved may not be present if the latest metadata couldn't be loaded due to
// eventual
// consistency problems in refresh.
LOG.warn("Failed to load committed snapshot: omitting sequence number from notifications");
summary = summary();
} else {
sequenceNumber = justSaved.sequenceNumber();
summary = justSaved.summary();
}

return new CreateSnapshotEvent(tableName, operation(), snapshotId, sequenceNumber, summary);
ValidationException.check(
snapshotId == committedSnapshot.snapshotId(),
"Committed snapshotId %s does not match expected snapshotId %s",
committedSnapshot.snapshotId(),
snapshotId);
Contributor

Do we really need the validation? As I understand it, the principle of this change is that the update event is always derived from the passed-in committed snapshot. I think passing committedSnapshot.snapshotId() to the event suffices.
