Data gets corrupted when incrby happens on a key after TTL expires #875

Open
pszabop opened this issue Dec 12, 2024 · 5 comments

Comments

@pszabop

pszabop commented Dec 12, 2024

Describe the bug

When a key expires, and then I attempt to increment it with the value of 0 (INCRBY), the key becomes corrupted.

I have a program that increments the key by zero if the key does not exist, and events arrive every few seconds that cause this logic chain to execute. Here's the output of manually polling that entry with a redis CLI client while this happens:

You can see the TTL count down, then the value go to nil and the TTL to -2 as expected. But when the program attempts to increment by zero, a garbage value appears. The key is then permanently corrupted: it never recovers unless I flush the database or DEL that key.

garnetlocal:6379> ttl session:ae5d5924-2304-4037-99d2-c546fcf1a22c
(integer) 5
garnetlocal:6379> get session:ae5d5924-2304-4037-99d2-c546fcf1a22c
"0"
garnetlocal:6379> ttl session:ae5d5924-2304-4037-99d2-c546fcf1a22c
(integer) 4
garnetlocal:6379> get session:ae5d5924-2304-4037-99d2-c546fcf1a22c
"0"
garnetlocal:6379> ttl session:ae5d5924-2304-4037-99d2-c546fcf1a22c
(integer) 2
garnetlocal:6379> get session:ae5d5924-2304-4037-99d2-c546fcf1a22c
"0"
garnetlocal:6379> ttl session:ae5d5924-2304-4037-99d2-c546fcf1a22c
(integer) 0
garnetlocal:6379> get session:ae5d5924-2304-4037-99d2-c546fcf1a22c
(nil)
garnetlocal:6379> ttl session:ae5d5924-2304-4037-99d2-c546fcf1a22c
(integer) -2
garnetlocal:6379> get session:ae5d5924-2304-4037-99d2-c546fcf1a22c
(nil)
garnetlocal:6379> get session:ae5d5924-2304-4037-99d2-c546fcf1a22c
(nil)
garnetlocal:6379> get session:ae5d5924-2304-4037-99d2-c546fcf1a22c
"0V\xdcQ\xf3\x1a\xdd\b0"

Steps to reproduce the bug

  1. INCRBY some random key by zero and set its TTL to some number (say 30).
  2. Poll the key every second with GET and TTL and watch the TTL count down to zero.
  3. After the key times out, INCRBY it by zero and set its TTL in a manner identical to step (1).
  4. You should observe a garbage value in the key. There is also an error when trying to increment the key again, because its value is no longer an integer.

Expected behavior

At step 4, after incrementing a (nil) entry by zero, I should get a value of zero in the key.
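To make the expected semantics concrete, here is a minimal sketch of a toy in-memory model of INCRBY, EXPIRE and GET. This is not Garnet's implementation: the `Model` type, its logical clock and all method names are hypothetical and exist only for illustration. The one behavior it encodes is the usual Redis rule that an expired key is treated as missing, so INCRBY on it starts again from zero.

```rust
use std::collections::HashMap;

/// Toy model of integer string keys with optional expiry (illustration only).
struct Model {
    now: u64,                                  // logical clock, in seconds
    data: HashMap<String, (i64, Option<u64>)>, // value and optional expiry deadline
}

impl Model {
    fn new() -> Self {
        Model { now: 0, data: HashMap::new() }
    }

    /// Remove the key if its deadline has passed, so it behaves as missing.
    fn purge_if_expired(&mut self, key: &str) {
        let expired = matches!(
            self.data.get(key),
            Some((_, Some(deadline))) if *deadline <= self.now
        );
        if expired {
            self.data.remove(key);
        }
    }

    /// INCRBY: a missing (or expired) key starts from 0.
    fn incrby(&mut self, key: &str, delta: i64) -> i64 {
        self.purge_if_expired(key);
        let entry = self.data.entry(key.to_string()).or_insert((0, None));
        entry.0 += delta;
        entry.0
    }

    /// EXPIRE: returns false when the key does not exist.
    fn expire(&mut self, key: &str, ttl_secs: u64) -> bool {
        self.purge_if_expired(key);
        match self.data.get_mut(key) {
            Some(entry) => {
                entry.1 = Some(self.now + ttl_secs);
                true
            }
            None => false,
        }
    }

    /// GET: None plays the role of (nil).
    fn get(&mut self, key: &str) -> Option<i64> {
        self.purge_if_expired(key);
        self.data.get(key).map(|entry| entry.0)
    }

    /// Advance the logical clock instead of sleeping.
    fn advance(&mut self, secs: u64) {
        self.now += secs;
    }
}
```

Walking this model through the steps above (INCRBY by 0, EXPIRE 30, advance past the deadline, INCRBY by 0 again) leaves the key holding 0, which is the behavior I expect from the server.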

Screenshots

No response

Release version

GarnetServer --version gives me an error, so here's the docker container info instead

            "Labels": {
                "org.opencontainers.image.created": "2024-09-30T19:25:02.511Z",
                "org.opencontainers.image.description": "Garnet is a remote cache-store from Microsoft Research that offers strong performance (throughput and latency), scalability, storage, recovery, cluster sharding, key migration, and replication features. Garnet can work with existing Redis clients.",
                "org.opencontainers.image.licenses": "MIT",
                "org.opencontainers.image.revision": "64636ce1efd6486b5d9e553bc3f6c364710c1b36",
                "org.opencontainers.image.source": "https://github.com/microsoft/garnet",
                "org.opencontainers.image.title": "garnet",
                "org.opencontainers.image.url": "https://github.com/microsoft/garnet",
                "org.opencontainers.image.version": "main"
            }

IDE

No response

OS version

Docker Engine: 20.10.17 (on OSX)

Additional context

I'm using Rust's client API with a pipeline. This bug does not reproduce on redict 7.1.3.

@pszabop changed the title from "Data gets corrupted when incr happens on a key after TTL expires" to "Data gets corrupted when incrby happens on a key after TTL expires" on Dec 13, 2024
@Vijay-Nirmal
Contributor

I tried to reproduce this but was not able to. My attempt is below. Let me know if I did something wrong.

127.0.0.1:6379> SET key 5
OK
127.0.0.1:6379> INCRBY key 0
(integer) 5
127.0.0.1:6379> EXPIRE key 30
(integer) 1
127.0.0.1:6379> GET key
"5"
127.0.0.1:6379> TTL key
(integer) 17
127.0.0.1:6379> GET key
"5"
127.0.0.1:6379> TTL key
(integer) 6
127.0.0.1:6379> TTL key
(integer) 0
127.0.0.1:6379> GET key
(nil)
127.0.0.1:6379> TTL key
(integer) -2
127.0.0.1:6379> INCRBY key 0
(integer) 0
127.0.0.1:6379> EXPIRE key 30
(integer) 1
127.0.0.1:6379> GET key
"0"
127.0.0.1:6379> GET key
"0"

@badrishc
Contributor

badrishc commented Dec 13, 2024

If you are not able to provide a sequence of redis-cli commands that reproduces this error then:

@pszabop
Author

pszabop commented Dec 13, 2024

Maybe it has something to do with the pipelined commands. When I get a chance I'll try running redis-cli with --pipe.

Here's the Rust code that shows the pipeline command used:

let _: () = redis::pipe()
    .incr(session_key.as_ref().unwrap(), recording_score)
    .expire(session_key.as_ref().unwrap(), self.session_blocking_ttl as i64)
    .query_async(&mut con)
    .await
    .unwrap();
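For anyone trying to reproduce this without the Rust client, here is a rough sketch of what that pipeline should put on the wire: an INCRBY and an EXPIRE encoded as RESP arrays and sent back to back. It uses only the standard library. The function names are hypothetical, and it assumes the increment is an integer, so that the client issues INCRBY as the issue title suggests.

```rust
/// Encode one command as a RESP array of bulk strings.
fn encode_command(args: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    // Array header: number of arguments, including the command name.
    out.extend_from_slice(format!("*{}\r\n", args.len()).as_bytes());
    for arg in args {
        // Each argument is a bulk string prefixed with its length in bytes.
        out.extend_from_slice(format!("${}\r\n", arg.len()).as_bytes());
        out.extend_from_slice(arg.as_bytes());
        out.extend_from_slice(b"\r\n");
    }
    out
}

/// Build the pipelined payload: INCRBY immediately followed by EXPIRE.
/// Assumption: the delta is an integer and the TTL is in seconds.
fn incr_expire_pipeline(key: &str, delta: i64, ttl_secs: i64) -> Vec<u8> {
    let delta = delta.to_string();
    let ttl = ttl_secs.to_string();
    let mut payload = encode_command(&["INCRBY", key, delta.as_str()]);
    payload.extend(encode_command(&["EXPIRE", key, ttl.as_str()]));
    payload
}
```

The bytes returned by `incr_expire_pipeline("some-key", 0, 30)` could be written to a file and fed to redis-cli with --pipe, or written directly to a TCP connection to the server, which should exercise the same two-command pipeline without involving the client library.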

@badrishc
Contributor

Any update on getting a repro for this?

@pszabop
Author

pszabop commented Dec 31, 2024

Sorry, busy holiday season plus working on getting a release out. I'll get to this...

Another piece of info: I was regularly using flushdb between tests.
