Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add partition delete logic in sync #243

Merged
merged 13 commits into from
Oct 11, 2024

Conversation

sudiptob2
Copy link
Collaborator

closes #239

@sudiptob2 sudiptob2 force-pushed the feat/239/handle-partition-delete branch from bda4b34 to 05b7746 Compare September 30, 2024 13:03
@sudiptob2 sudiptob2 marked this pull request as ready for review September 30, 2024 13:25
@@ -54,19 +57,18 @@ func (s *PostgresRepository) UpsertPartitions(ctx context.Context, partitions []
return nil
}

func (s *PostgresRepository) GetAllPartitions(ctx context.Context) ([]*dao.PartitionInfo, error) {
func (s *PostgresRepository) GetAllPartitions(ctx context.Context) ([]*model.PartitionInfo, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be preparing for pagination parameters here? It seems unlikely we'd ever want to return everything to the front end all at once...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, we also have several other APIs that need to be updated with pagination. This task is tracked in #245

@sudiptob2 sudiptob2 force-pushed the feat/239/handle-partition-delete branch from 444946f to cadcc6e Compare September 30, 2024 17:32
return fmt.Errorf("could not upsert partitions: %w", err)
}
// Delete partitions that are not present in the API response
deleteCandidates, err := s.findPartitionDeleteCandidates(ctx, partitions)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since workqueue has retries, you should probably build up deleteCandidates before pushing the task on it. If we ever retry, there is no need to re-calculate them

}

deletedAt := time.Now().Unix()
query := `UPDATE partitions SET deleted_at = $1 WHERE name = ANY($2)`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably target these partitions by ID. Imagine that we had a partition named "example". Then we removed it. Then we re-installed it. Then we remove it again. Currently, in the database, we have multiple partition records with the name "example", but only one of them should have deleted_at set to now.

@@ -54,19 +57,18 @@ func (s *PostgresRepository) UpsertPartitions(ctx context.Context, partitions []
return nil
}

func (s *PostgresRepository) GetAllPartitions(ctx context.Context) ([]*dao.PartitionInfo, error) {
func (s *PostgresRepository) GetAllPartitions(ctx context.Context) ([]*model.PartitionInfo, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, you should probably get only active partitions. We don't need to get deleted partitions to operate on them.
We should also change the name, so we know that we are fetching active partitions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, Thanks for the awesome reviews 🚀
I found that, GetAllPartitions is also being used by the webservice here. I think frontend might need the deleted partition data as well to show some graphical design on them. Therefore, I updated the logic in findPartitionDeleteCandidates()

@sudiptob2 sudiptob2 requested a review from nikola-jokic October 1, 2024 13:35
}

// Fetch partitions from the database
dbPartitions, err := s.repo.GetAllPartitions(ctx)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are skipping stuff that is deleted, why not simply fetch partitions from the database where deleted_at is null
This is especially important when the amount of data builds up. ☺️

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah true, I found that, GetAllPartitions is also being used by the web service here. Should we embed the is null logic in GetAllPartitions() as this method is not used in internal call only.

I think we have the following two approaches here,

  1. Implement a different GetAllActivePartition() for internal call.
  2. once we add support for pagination and filtering we will be able to filter the null only partitions in our internal calls when needed.

Let me know your thoughts

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented seperate GetActivePartitions()

@sudiptob2 sudiptob2 requested a review from nikola-jokic October 1, 2024 15:59
@sudiptob2 sudiptob2 force-pushed the feat/239/handle-partition-delete branch from 307e3df to 639f299 Compare October 2, 2024 15:25
nikola-jokic
nikola-jokic previously approved these changes Oct 3, 2024
Copy link
Collaborator

@nikola-jokic nikola-jokic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can push the work of finding partitions to the database to do something like:

DELETE FROM partitions where id IN (
  select id from partitions where deleted_at IS NULL AND NOT(id = ANY($1))
)

But I will leave it up to you if you want to do it now or we can proceed with this for now

@sudiptob2
Copy link
Collaborator Author

We can push the work of finding partitions to the database to do something like:
Hmm.. seems cleaner, we can reduce code in this way 👍

@sudiptob2 sudiptob2 force-pushed the feat/239/handle-partition-delete branch 2 times, most recently from efc8d8f to 4f06b06 Compare October 3, 2024 15:31
@@ -88,3 +126,25 @@ func (s *PostgresRepository) GetAllPartitions(ctx context.Context) ([]*dao.Parti
}
return partitions, nil
}

func (s *PostgresRepository) DeletePartitions(ctx context.Context, partitions []*dao.PartitionInfo) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to DeleteActivePartitionsNotInPartitions or something similar. The name is misleading ☺️

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the name, but still not sure about the wording 😐. Let me know how it sounds.

@sudiptob2 sudiptob2 requested a review from nikola-jokic October 4, 2024 13:23
@sudiptob2 sudiptob2 force-pushed the feat/239/handle-partition-delete branch from de4dcb8 to 181b855 Compare October 8, 2024 14:48
@sudiptob2 sudiptob2 merged commit f5a2063 into G-Research:main Oct 11, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement partition delete
3 participants