Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High memory usage (memory leak?) with 1 node cluster and small DB (but a lot of updates) #18382

Open
4 tasks done
vitalif opened this issue Jul 30, 2024 · 11 comments
Open
4 tasks done
Labels
area/performance priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. type/bug

Comments

@vitalif
Copy link
Contributor

vitalif commented Jul 30, 2024

Bug report criteria

What happened?

My etcd 3.5.8 ate 12 GB of RAM...
The database size is < 2 MB (etcdctl get --write-out=json --prefix / gives me 1966139 byte dump).
The database takes 1.5 GB on disk (/var/lib/etcd0.etcd).
There are 6 active streams.
Client traffic shows 2.2 MB/s in and 2.5 MB/s out in monitoring dashboard.
There are 128+5 keys which are bumped continuosly through leases.
There are other 139 keys which are updated every 30 seconds.

etcd_obozhralsya

As I've already experienced such memory usage with this etcd, it was started with --enable-pprof=true, pprof shows the following:

# go tool pprof http://192.168.1.144:2379/debug/pprof/heap
Fetching profile over HTTP from http://192.168.1.144:2379/debug/pprof/heap
Saved profile in /root/pprof/pprof.etcd.alloc_objects.alloc_space.inuse_objects.inuse_space.012.pb.gz
File: etcd
Type: inuse_space
Time: Jul 31, 2024 at 1:45am (MSK)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 10.88GB, 99.59% of 10.92GB total
Dropped 179 nodes (cum <= 0.05GB)
Showing top 10 nodes out of 17
      flat  flat%   sum%        cum   cum%
   10.88GB 99.59% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/api/v3/etcdserverpb.(*InternalRaftRequest).Marshal
         0     0% 99.59%    10.88GB 99.59%  github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1
         0     0% 99.59%    10.88GB 99.59%  github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1
         0     0% 99.59%    10.88GB 99.59%  github.com/grpc-ecosystem/go-grpc-prometheus.(*ServerMetrics).UnaryServerInterceptor.func1
         0     0% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/api/v3/etcdserverpb._KV_Txn_Handler
         0     0% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/api/v3/etcdserverpb._KV_Txn_Handler.func1
         0     0% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).Txn
         0     0% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).processInternalRaftRequestOnce
         0     0% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).raftRequest (inline)
         0     0% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).raftRequestOnce

Is it a memory leak or what?...

What did you expect to happen?

I expected etcd to consume at most 1 GB of memory with 2 MB database

How can we reproduce it (as minimally and precisely as possible)?

Install Vitastor on 1 node. :-) but I'm not sure it will be reproduced easily.

Anything else we need to know?

It's still running, I can collect other information or connect with debugger if you tell me what I should collect. :-)

Etcd version (please run commands below)

$ etcd --version
etcd Version: 3.5.8
Git SHA: 217d183e5
Go Version: go1.19.5
Go OS/Arch: linux/amd64

$ etcdctl version
etcdctl version: 3.5.8
API version: 3.5

Etcd configuration (command line flags or environment variables)

Startup options:

etcd -name etcd0 --data-dir /var/lib/etcd0.etcd --enable-pprof=true --snapshot-count=1000 --advertise-client-urls http://192.168.1.144:2379 --listen-client-urls http://192.168.1.144:2379 --initial-advertise-peer-urls http://192.168.1.144:2380 --listen-peer-urls http://192.168.1.144:2380 --initial-cluster-token vitastor-etcd-1 --initial-cluster etcd0=http://192.168.1.144:2380 --initial-cluster-state new --max-txn-ops=100000 --max-request-bytes=104857600 --auto-compaction-retention=10 --auto-compaction-mode=revision

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
+------------------+---------+-------+---------------------------+---------------------------+------------+
|        ID        | STATUS  | NAME  |        PEER ADDRS         |       CLIENT ADDRS        | IS LEARNER |
+------------------+---------+-------+---------------------------+---------------------------+------------+
| 720141fbb7123c6f | started | etcd0 | http://192.168.1.144:2380 | http://192.168.1.144:2379 |      false |
+------------------+---------+-------+---------------------------+---------------------------+------------+

$ etcdctl --endpoints=<member list> endpoint status -w table
+--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|      ENDPOINT      |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.1.144:2379 | 720141fbb7123c6f |   3.5.8 |  1.2 GB |      true |      false |        51 |   16960478 |           16960478 |        |
+--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

Relevant log output

No response

@serathius
Copy link
Member

Looks like #17098 for me, but to confirm. Can you provide:

  • percentage of PUT requests vs all other requests.
  • average size of payload in PUT request

@ahrtr
Copy link
Member

ahrtr commented Jul 31, 2024

Could you also attach the generated pprof.etcd.alloc_objects.alloc_space.inuse_objects.inuse_space.*.pb.gz file?

@jmhbnz jmhbnz added area/performance priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Aug 1, 2024
@ahrtr
Copy link
Member

ahrtr commented Aug 5, 2024

The database size is < 2 MB (etcdctl get --write-out=json --prefix / gives me 1966139 byte dump).
The database takes 1.5 GB on disk (/var/lib/etcd0.etcd).

Please also perform a compaction + defragmentation, and retry, please let's know the result. thx

@socketpair
Copy link

socketpair commented Aug 5, 2024

@ahrtr Topic starter concerns on RAM consumption, not about disk usage. If they are connected somehow, yes it might help, but disk usage itself is not the issue. Note, RAM usage is about 10 times bigger then the disk usage.

@ahrtr
Copy link
Member

ahrtr commented Aug 5, 2024

If they are connected somehow

They are related, bbolt mmaps the file into memory directly.

@vitalif
Copy link
Contributor Author

vitalif commented Aug 5, 2024

Here's the profile pprof.etcd.alloc_objects.alloc_space.inuse_objects.inuse_space.012.pb.gz

Percentage of PUT requests I think was close to 100% )). There were puts + 6 watchers, that's all. RPC rate graph shows 0.85 rps and 2.5 MByte/s traffic.

Memory usage actually dropped the next day along with update traffic when I restarted my software and it stopped updating... something. I think it was old statistics but I'm not sure. 2.5 MB/s is quite a lot given that total amount of data in etcd is 2 MB :-). My software updates statistics every 5 seconds in that installation, so 2.5 MB/s is 12.5 MB every 5 seconds... Still more than 2 MB. So I'm not sure what it was. Maybe these were some unapplied compare failures and retries... I'll monitor this instance for more time.

Here's the screenshot from grafana:
image

@socketpair
Copy link

socketpair commented Aug 5, 2024

@ahrtr okay, but

  1. does it map multiple times to different places ?
  2. If yes - it does not increase memory usage (RSS) because of shared memory.
  3. if no, why sizes differ so much ? seems it allocates huge amounts of anonymous memory (copies ?)

@socketpair
Copy link

socketpair commented Aug 5, 2024

I'm not an expert in Go. Is there any memory mapper ? I mean a tool that shows what memory block is used for.

@clement2026
Copy link
Contributor

clement2026 commented Sep 6, 2024

@socketpair I get that you’re concerned about memory size. You might want to try setting the --experimental-snapshot-catchup-entries flag. The default value is 5000, but you can lower it to, say, 500, which should bring your memory usage down to around 3GB.

The downside of setting a low value for --experimental-snapshot-catchup-entries is that slow followers might need to catch up with the leader using a snapshot, which is less efficient. But if you’re running a single instance of etcd, you can even set it to 1 and can keep your memory usage under 2GB.

Based on your posts, it seems the average size of a put request is around 2MB.

Your etcd instance’s heap size is roughly between (--experimental-snapshot-catchup-entries) * 2MB and (--experimental-snapshot-catchup-entries + --snapshot-count) * 2MB. In your case, that’s about 10GB to 12GB.

I’m sharing some tests here for your reference.

Test 1

--experimental-snapshot-catchup-entries 5000

Setup:

rm -rf etcd0.etcd; etcd -name etcd0 \
--data-dir etcd0.etcd \
--enable-pprof=true \
--snapshot-count=1000 \
--advertise-client-urls http://localhost:2379 \
--listen-client-urls http://localhost:2379 \
--initial-advertise-peer-urls http://localhost:2380 \
--listen-peer-urls http://localhost:2380 \
--initial-cluster-token vitastor-etcd-1 \
--initial-cluster etcd0=http://localhost:2380 \
--initial-cluster-state new \
--max-txn-ops=100000 \
--max-request-bytes=104857600 \
--auto-compaction-mode=periodic \
--auto-compaction-retention=5s \
--experimental-snapshot-catchup-entries 5000

Benchmark script:

./bin/tools/benchmark txn-mixed --total=50000000 --val-size=2000000 --key-space-size=1 --rw-ratio=1
Xnip2024-09-06_14-55-54

Test 2

The setup and benchmark script is the same except --experimental-snapshot-catchup-entries 500.

Xnip2024-09-06_14-39-17

Test 3

--experimental-snapshot-catchup-entries 1

image

Hope it helps.

@serathius
Copy link
Member

Awesome analysis @clement2026, makes me very excited about your work in #17098. Would be awesome to paste those findings there too.

@clement2026
Copy link
Contributor

I’m glad you found the analysis helpful. I’ll include them in #17098.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/performance priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. type/bug
Development

No branches or pull requests

6 participants