Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up existing indices on Upload (aka ReportSession) #442

Merged
merged 1 commit into from
Jan 15, 2025

Conversation

Swatinem
Copy link
Contributor

@Swatinem Swatinem commented Nov 28, 2024

This in particular cleans up the existing indices that exist in the production DB but not in django:

We drop these indices for the following reasons:

  • upload_index_id_type_number: This matches the above state migration,
    and it looks like the index does not actually exist in the production DB.

The following indices exist in the production DB, but not in django state/migrations:

  • reports_upload_order_number_idx:
    We never query by the order_number alone, so this index is likely unused.
  • reports_upload_report_id_f6b4ffae: Queries on report_id should already been covered by the
    newly added index on report_id+upload_type.
  • reports_upload_report_id_upload_type_index_ccnew:
    This seems to be a manually added variant of the upload_report_type_idx index and is thus duplicated.
  • reports_upload_report_id_upload_type_order_number_index:
    This is the same as the above, except with an additional order_number.
    We do use it in queries, but I doubt the index pulls its weight, as the order_number changes quite
    frequently so the index is costly to maintain.

@Swatinem Swatinem self-assigned this Nov 28, 2024
@Swatinem
Copy link
Contributor Author

Looks like the created_at index might be used only for this query here:
https://github.com/codecov/worker/blob/ad3621cbee3619ec1b47ee55a1050e8be4343dda/tasks/brolly_stats_rollup.py#L53

But for that case, I believe having an index would make sense, as otherwise it would do a full table scan there :-D

@Swatinem Swatinem force-pushed the swatinem/upload-idx-part1 branch from 04713aa to 3af2101 Compare January 8, 2025 08:05
Base automatically changed from swatinem/upload-idx-part1 to main January 8, 2025 08:24
@Swatinem Swatinem force-pushed the swatinem/upload-idx-part2 branch from 20fc6fa to ca4ae14 Compare January 8, 2025 09:39
Copy link

codecov bot commented Jan 8, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.97%. Comparing base (99581a7) to head (ad1607c).
Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #442      +/-   ##
==========================================
- Coverage   90.55%   89.97%   -0.58%     
==========================================
  Files         404      324      -80     
  Lines       12562     9197    -3365     
  Branches     2107     1633     -474     
==========================================
- Hits        11375     8275    -3100     
+ Misses       1078      859     -219     
+ Partials      109       63      -46     
Flag Coverage Δ
shared-docker-uploader ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Swatinem Swatinem force-pushed the swatinem/upload-idx-part2 branch from ca4ae14 to 5c43c07 Compare January 9, 2025 08:15
@Swatinem
Copy link
Contributor Author

Swatinem commented Jan 9, 2025

I updated the PR to add a bunch more manually added indices to the cleanup, and after running this, the state of the production DB should be fully aligned with the django migration state.

@Swatinem Swatinem force-pushed the swatinem/upload-idx-part2 branch from 5c43c07 to 610dd21 Compare January 10, 2025 09:08
This in particular cleans up the existing indices that exist in the production DB but not in django:

We drop these indices for the following reasons:
- `upload_index_id_type_number`: This matches the above state migration,
  and it looks like the index does not actually exist in the production DB.

The following indices exist in the production DB, but not in django state/migrations:
- `reports_upload_order_number_idx`:
  We never query by the `order_number` alone, so this index is likely unused.
- `reports_upload_report_id_f6b4ffae`: Queries on `report_id` should already been covered by the
   newly added index on `report_id`+`upload_type`.
- `reports_upload_report_id_upload_type_index_ccnew`:
  This seems to be a manually added variant of the `upload_report_type_idx` index and is thus duplicated.
- `reports_upload_report_id_upload_type_order_number_index`:
  This is the same as the above, except with an additional `order_number`.
  We do use it in queries, but I doubt the index pulls its weight, as the `order_number` changes quite
  frequently so the index is costly to maintain.
@Swatinem Swatinem force-pushed the swatinem/upload-idx-part2 branch from 610dd21 to ad1607c Compare January 10, 2025 14:17
@Swatinem Swatinem added this pull request to the merge queue Jan 15, 2025
Merged via the queue into main with commit 07b39b2 Jan 15, 2025
6 checks passed
@Swatinem Swatinem deleted the swatinem/upload-idx-part2 branch January 15, 2025 08:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants