
Try to reproduce #19179 (#19191)

Open · wants to merge 1 commit into main
Conversation

@serathius (Member) commented Jan 14, 2025

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: serathius

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

codecov bot commented Jan 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.78%. Comparing base (9eb85ee) to head (54813cf).
Report is 29 commits behind head on main.

Additional details and impacted files

see 26 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #19191      +/-   ##
==========================================
+ Coverage   68.77%   68.78%   +0.01%     
==========================================
  Files         420      420              
  Lines       35641    35629      -12     
==========================================
- Hits        24511    24507       -4     
+ Misses       9707     9700       -7     
+ Partials     1423     1422       -1     

Continue to review full report in Codecov by Sentry.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9eb85ee...54813cf. Read the comment docs.

@serathius (Member, Author)

cc @fuweid, any recommendation on how to reproduce #19179?

@fuweid (Member) commented Jan 15, 2025

Hi @serathius

I can't reproduce it locally, but I used etcd-dump-logs to extract one observation from the first three failed pipeline runs:

https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/directory/pull-etcd-robustness-amd64/1877585036438409216
https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/directory/pull-etcd-robustness-arm64/1877364764741472256
https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/directory/pull-etcd-robustness-arm64/1877466683589791744

There were only two kinds of requests before the compaction panic: creations and deletions.
Even though the Update type has a high weight during the random pick, there are no updates among the first 400+ requests.
As a result, the creation revision had already been deleted in the first compaction batch before the panic. After restart, etcd's key index ignores keys whose only remaining revision is a tombstone. When etcd then resumes compaction, those keys are absent from the index, so compaction deletes all of their tombstones, even though it should keep them.

I am still investigating why there are no updates before the compaction panic. Hope that information helps.
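The failure sequence described above can be made concrete with a toy model (hypothetical types and function names; this is not etcd's actual keyIndex code): a key is created and then deleted, the creation revision is compacted away before the panic, and after restart the rebuilt index drops the key, so the resumed compaction also removes the tombstone that, per the description, should have been kept.

```go
package main

import "fmt"

// rev is one revision of a key; tombstone marks a delete.
type rev struct {
	main      int64
	tombstone bool
}

// rebuildIndex mimics the restore step described above: a key whose only
// surviving revision is a tombstone is skipped when the index is rebuilt.
func rebuildIndex(store map[string][]rev) map[string]bool {
	idx := make(map[string]bool)
	for k, revs := range store {
		if len(revs) == 1 && revs[0].tombstone {
			continue // dropped: only a tombstone survived the partial compaction
		}
		idx[k] = true
	}
	return idx
}

// resumeCompaction removes every revision at or below compactRev for keys
// that are absent from the index; an indexed key would keep its tombstone.
func resumeCompaction(store map[string][]rev, idx map[string]bool, compactRev int64) {
	for k, revs := range store {
		if idx[k] {
			continue
		}
		var kept []rev
		for _, r := range revs {
			if r.main > compactRev {
				kept = append(kept, r)
			}
		}
		if len(kept) == 0 {
			delete(store, k)
		} else {
			store[k] = kept
		}
	}
}

func main() {
	// "foo" was created at rev 2 and deleted at rev 5. The first compaction
	// batch removed rev 2, then the process panicked, leaving only the
	// tombstone on disk.
	store := map[string][]rev{"foo": {{main: 5, tombstone: true}}}

	idx := rebuildIndex(store)      // "foo" never makes it back into the index
	resumeCompaction(store, idx, 6) // so its tombstone is removed as well

	fmt.Println(len(store)) // 0: the tombstone that should survive is gone
}
```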

@serathius serathius changed the title Run Issue17780 10 times Reproduce #19179 Jan 16, 2025
@serathius serathius changed the title Reproduce #19179 Try to reproduce #19179 Jan 16, 2025
Signed-off-by: Marek Siarkowicz <[email protected]>
@k8s-ci-robot

@serathius: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: pull-etcd-verify
Commit: 54813cf
Required: true
Rerun command: /test pull-etcd-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@serathius (Member, Author)

There were only two kinds of requests before the compaction panic: creations and deletions.

I think I can guess the reason. In the Kubernetes traffic, operations are based on local state that is fed from watch, and the traffic tries to balance the number of objects in storage around an average. If the watch was badly delayed, there would be a long period where the traffic executed only creates, because it was not yet aware of any existing objects. When the watch caught up, the traffic would skip random operations and immediately issue deletions, because there were now too many objects in storage.

Still no progress on reproduction.
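A minimal sketch of this watch-lag hypothesis (a hypothetical simulation, not the actual robustness-test traffic code): a client balances the object count toward a target, but decides each operation from a local view that lags its own writes by a fixed number of steps. With a lag of 10 steps and a target of 3 objects, it issues a long run of creates followed by a burst of deletes, matching the pattern seen in the dumped logs.

```go
package main

import "fmt"

// simulateTraffic models a client that balances the object count toward
// target but decides each operation from a watch-fed local view that lags
// `lag` steps behind its own writes. It returns the sequence of operations.
func simulateTraffic(steps, target, lag int) []string {
	deltas := make([]int, 0, steps) // +1 for a create, -1 for a delete
	ops := make([]string, 0, steps)
	for i := 0; i < steps; i++ {
		// Only effects older than `lag` steps are visible through the watch.
		local := 0
		for j := 0; j < i-lag; j++ {
			local += deltas[j]
		}
		if local < target {
			ops = append(ops, "create")
			deltas = append(deltas, 1)
		} else {
			ops = append(ops, "delete")
			deltas = append(deltas, -1)
		}
	}
	return ops
}

func main() {
	// With a lag of 10 and a target of 3, the delayed view stays below the
	// target long after storage has overshot it, so the client first emits
	// a long run of creates and then switches to deletes only.
	fmt.Println(simulateTraffic(20, 3, 10))
}
```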

@k8s-ci-robot

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

3 participants