
Verification error in TestV3WatchRestoreSnapshotUnsync #13922

Closed
Tracked by #13894
serathius opened this issue Apr 11, 2022 · 13 comments · Fixed by #13930

Comments

serathius (Member) commented Apr 11, 2022

We have recently enabled ETCD_VERIFY in integration tests (#13903) with the hope of automatically verifying the fix for the recent data inconsistency issue. Apart from introducing a new check dedicated to data inconsistency, we also enabled a previously existing check for the consistent index. Based on the recent wave of flakes on the main branch, we have an issue.

All the failures I found happen in the TestV3WatchRestoreSnapshotUnsync test:

logger.go:130: 2022-04-11T08:23:27.626Z	FATAL	m0	Verification failed	{"member": "m0", "data-dir": "/tmp/TestV3WatchRestoreSnapshotUnsync2779563335/002/etcd2880677907", "error": "backend.ConsistentIndex (8) must be >= last snapshot index (24)"}
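
For context, the invariant behind this failure boils down to roughly the following (a self-contained sketch with invented names; the real check sits behind the ETCD_VERIFY machinery and uses etcd's own types):

package verifysketch

import "fmt"

// checkConsistentIndex sketches the invariant reported above: the consistent
// index persisted in the bbolt backend must never lag behind the index of the
// last applied raft snapshot.
func checkConsistentIndex(backendConsistentIndex, lastSnapshotIndex uint64) error {
	if backendConsistentIndex < lastSnapshotIndex {
		return fmt.Errorf("backend.ConsistentIndex (%d) must be >= last snapshot index (%d)",
			backendConsistentIndex, lastSnapshotIndex)
	}
	return nil
}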

Examples:

Our first objective is to verify whether the issue exists on the release-3.5 branch, as this would impact the upcoming v3.5.3 release. The second is to fix the issue on the main branch.

ahrtr (Member) commented Apr 11, 2022

I will take a deep dive into this tomorrow.

ahrtr (Member) commented Apr 11, 2022

@serathius @ptabor @spzala Please freeze the main branch until I figure out the root cause.

serathius (Member Author) commented Apr 11, 2022

I'm having problems reproducing this even on the main branch (a prerequisite to confirming it on the release-3.5 branch).

Running

etcd $ cd tests
tests $ ETCD_VERIFY=all go test ./integration --count=1
ok      go.etcd.io/etcd/tests/v3/integration    340.612s
tests $ ETCD_VERIFY=all go test ./integration --run=TestV3WatchRestoreSnapshotUnsync --count=10
ok      go.etcd.io/etcd/tests/v3/integration    52.653s

serathius (Member Author) commented:

OK, I reproduced the issue locally on the master branch; again, my problems reproducing it came from /tmp being a normal disk partition.

serathius (Member Author) commented:

Confirmed that the issue is reproducible on release-3.5.

serathius (Member Author) commented:

Problem: I can't confirm whether this is a new issue in v3.5, as the ETCD_VERIFY checks were not present in v3.4 :(

I would like confirmation from the author @ptabor:

  • Why were those checks added but not enabled by default?
  • Can we trust the consistent index check, or should we disable it, given that it was not run before we wanted to test the data inconsistency fix?
  • Should this issue block v3.5?

ptabor (Contributor) commented Apr 11, 2022

Why were those checks added but not enabled?

I would expect them to be enabled in integration tests by the testing.BeforeTest(...) logic:

os.Setenv(verify.ENV_VERIFY, verify.ENV_VERIFY_ALL_VALUE)
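
For illustration, a guard consuming that environment variable looks roughly like the sketch below (self-contained, with the constants re-declared locally; the real constants and plumbing live in etcd's verify package):

package verifysketch

import "os"

// Re-declared locally so the sketch is self-contained; the real values are
// exported by etcd's verify package and set by the test helper above.
const (
	envVerify         = "ETCD_VERIFY"
	envVerifyAllValue = "all"
)

// verificationEnabled reports whether the expensive runtime checks should run.
func verificationEnabled() bool {
	return os.Getenv(envVerify) == envVerifyAllValue
}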

Can we trust the consistent index check ... ?

I think it's a valid (critical!) problem, related to the consistency-issue fix:

https://github.com/etcd-io/etcd/runs/5968635467?check_suite_focus=true

m0 was disconnected, missed some commits, and on reconnecting it gets the snapshot (from m2, the new leader):

logger.go:130: 2022-04-11T08:23:25.036Z	INFO	m2	saved snapshot	{"member": "m2", "snapshot-index": 22}
logger.go:130: 2022-04-11T08:23:25.506Z	INFO	m2	sending merged snapshot	{"member": "m2", "from": "e66e5fda013bdcb6", "to": "55c81bc7397fe82b", "bytes": 37742, "size": "38 kB"}

This triggers the createMergedSnapshotMessage code path within the applyAll method (but not applyEntries).
That code runs in sequence with applyEntries and calls:

dbsnap := s.be.Snapshot()

that calls:

func (t *batchTx) Commit() {
	t.lock()
	t.commit(false)
	t.Unlock()
}

so it does NOT flush the consistencyIndex... but there might be just-applied transactions in the backend.

Thus m0 receives a snapshot announced as containing index 24, with the consistent index at 8:

logger.go:130: 2022-04-11T08:23:25.510Z	INFO	m0	applied incoming Raft snapshot	{"member": "m0", "snapshot-index": 24}
logger.go:130: 2022-04-11T08:23:25.510Z	INFO	m0	applying snapshot	{"member": "m0", "current-snapshot-index": 0, "current-applied-index": 8, "incoming-leader-snapshot-index": 24, "incoming-leader-snapshot-term": 33}
logger.go:130: 2022-04-11T08:23:25.513Z	INFO	m0	restored mvcc store	{"member": "m0", "consistent-index": 8}
logger.go:130: 2022-04-11T08:23:25.526Z	INFO	m0	applied snapshot	{"member": "m0", "current-snapshot-index": 24, "current-applied-index": 24, "incoming-leader-snapshot-index": 24, "incoming-leader-snapshot-term": 33}
logger.go:130: 2022-04-11T08:23:27.626Z	FATAL	m0	Verification failed	{"member": "m0", "data-dir": "/tmp/TestV3WatchRestoreSnapshotUnsync2779563335/002/etcd2880677907", "error": "backend.ConsistentIndex (8) must be >= last snapshot index (24)"}
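
To make this concrete, here is a toy model of the scenario this comment hypothesizes (plain Go with invented names, not etcd code; note that ahrtr questions this hypothesis further down the thread):

package snapshotsketch

// The point being illustrated: if the commit triggered by be.Snapshot()
// carries the just-applied key/value payloads but not an up-to-date consistent
// index, the db shipped inside the merged snapshot still describes itself with
// the old index (8) even though the snapshot message is announced at index 24.

type dbFile struct {
	consistentIndex uint64            // what "restored mvcc store" reports on m0
	kv              map[string]string // applied payloads
}

type backend struct {
	pending   map[string]string // applied but not yet committed writes
	committed dbFile
}

func newBackend(persistedIndex uint64) *backend {
	return &backend{
		pending:   map[string]string{},
		committed: dbFile{consistentIndex: persistedIndex, kv: map[string]string{}},
	}
}

// commitWithoutFlushingCI mimics what this comment claims batchTx.Commit()
// does on this code path: commit the payloads, leave the stored consistent
// index untouched.
func (b *backend) commitWithoutFlushingCI() {
	for k, v := range b.pending {
		b.committed.kv[k] = v
	}
	b.pending = map[string]string{}
}

// snapshot mimics be.Snapshot() inside createMergedSnapshotMessage: commit,
// then hand out a copy of the db file to be merged into the raft snapshot.
func (b *backend) snapshot() dbFile {
	b.commitWithoutFlushingCI()
	return b.committed
}

Restoring such a db on m0 and then running the consistent-index check sketched earlier yields exactly the "backend.ConsistentIndex (8) must be >= last snapshot index (24)" failure shown in the logs.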

Solutions:

  • The Commit done by Snapshot should use LockWithinApply to create a snapshot that contains both the commit-index and the applied payloads.
  • I would actually consider moving all Commits to the 'Apply' path (or making them 'exclusive' with ApplyEntries), such that we never produce files with a 'lagging' consistent-index.

ahrtr (Member) commented Apr 11, 2022

so it does NOT flush the consistencyIndex... but there might be just-applied transactions in the backend.

@ptabor I don't think this is the root cause, because we persist the consistent_index in OnPreCommitUnsafe.

UPDATE: Got it, the reason is that we did not call LockInsideApply!!!

serathius (Member Author) commented Apr 11, 2022

Great to hear we know the root cause, @ahrtr! Awaiting the PR with the fix.

ahrtr (Member) commented Apr 11, 2022

On second thought, I still don't think it's the root cause.

If there is at least one entry in apply.Entries, then the index of the entries being applied must have been saved to the consistent_index; accordingly, it will be persisted to the db in OnPreCommitUnsafe:

t.backend.hooks.OnPreCommitUnsafe(t)

If there is no entry at all in the current apply.Entries and createMergedSnapshotMessage is triggered, then the consistent_index will not be persisted, just as @ptabor pointed out. But from another perspective, this also means there is no need to persist the consistent_index, because its value doesn't change.
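
To make the role of that hook concrete, here is a minimal sketch (simplified types and invented names, not the actual etcd hook; the bucket and key names are only suggestive):

package hooksketch

import "encoding/binary"

// consistentIndexer stands in for the component that tracks, in memory, the
// raft index of the last applied entry.
type consistentIndexer interface {
	ConsistentIndex() uint64
}

// batchTx stands in for the backend's batch transaction.
type batchTx interface {
	UnsafePut(bucket, key, value []byte)
}

// backendHooks sketches the hook that runs just before every backend commit.
type backendHooks struct {
	indexer consistentIndexer
}

// OnPreCommitUnsafe copies the current in-memory consistent index into the db
// so the committed file is self-describing. If no entries were applied since
// the last commit, the in-memory value does not change, so neither does the
// persisted one; that is the observation made above.
func (h *backendHooks) OnPreCommitUnsafe(tx batchTx) {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, h.indexer.ConsistentIndex())
	tx.UnsafePut([]byte("meta"), []byte("consistent_index"), buf)
}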

@serathius can you always/easily reproduce this issue? I tried a couple of times but couldn't reproduce it.

serathius (Member Author) commented Apr 11, 2022

Yes, I can always reproduce it with the command:

 ETCD_VERIFY=all go test ./integration --run=TestV3WatchRestoreSnapshotUnsync --count=10 -v

One thing I needed to make sure of is that the TMPDIR environment variable points to an in-memory filesystem; this should be the default on a standard setup where /tmp is mounted as tmpfs.

ahrtr (Member) commented Apr 11, 2022

git bisect led to the following commit. I will take a closer look.


$ git bisect bad
1ea53d527e09382e32f3ef7ddc562b4e5a9c5d93 is the first bad commit
commit 1ea53d527e09382e32f3ef7ddc562b4e5a9c5d93
Author: Marek Siarkowicz <[email protected]>
Date:   Thu Apr 7 17:32:21 2022 +0200

    server: Save consistency index and term to backend even when they decrease
    
    Reason to store CI and term in backend was to make db fully independent
    snapshot, it was never meant to interfere with apply logic. Skip of CI
    was introduced for v2->v3 migration where we wanted to prevent it from
    decreasing when replaying wal in
    https://github.com/etcd-io/etcd/pull/5391. By mistake it was added to
    apply flow during refactor in
    https://github.com/etcd-io/etcd/pull/12855#commitcomment-70713670.
    
    Consistency index and term should only be negotiated and used by raft to make
    decisions. Their values should only driven by raft state machine and
    backend should only be responsible for storing them.

:040000 040000 fc2af0b5ab5d038542ddc860a0497748ecd1a376 c1e605feca42c6b78c21a2ac59e076dec69a5e71 M	etcdutl
:040000 040000 9a25be29aa428ed8789dcf117fe11cdba52c6e4b fc0ed8bc6f7d5a84b72dbceb1965691e83c5beea M	server

ahrtr (Member) commented Apr 12, 2022

Eventually I figured out the root cause; please see the explanation in PR 13930.

This should be a legacy issue, so we need to backport the fix to 3.5 as well. I will do that once this one is merged.
