env.sh.eex is preventing the ability to run multiple self-hosted realtime instances in a cluster #1075

Open
2 tasks done
Towerful opened this issue Jun 16, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@Towerful

Bug report

  • I confirm this is a bug with Supabase, not with my own application.

  • I confirm I have searched the Docs, GitHub Discussions, and Discord.

Describe the bug

Currently, https://github.com/supabase/realtime/blob/cd04f2f744834296b5a4b3e360e95c3fab5f9165/rel/env.sh.eex makes it impossible to run the Postgres (or any other) cluster strategy on a self-hosted instance.

None of the cases the script handles can be met or configured on a self-hosted instance. Getting the ip variable set is difficult and fragile at best, so it falls back to 127.0.0.1.
This produces cluster connection logs such as SYN[[email protected]], and the cluster strategy breaks.

To Reproduce

I'm moving a lot of this over to a local k8s cluster, so these reproduction steps may not be as clear as they should be.

I think the Supabase Docker Compose file could be tweaked with CLUSTER_STRATEGIES=POSTGRES to try to get the cluster strategy to work.
The realtime service will have to be duplicated to run 2 instances. Because realtime containers with a broken cluster strategy fight over the same replication slot, SLOT_NAME_SUFFIX needs to be unique to each container.

Both containers will connect to their respective replication slots and will handle Postgres realtime updates fine.
However, broadcast between the instances will not work (broadcast only works within a single instance).
I don't know how to direct traffic between the 2 instances in this setup (previously I used an external HAProxy; k8s handles that automatically with a Service).

This is because the first step of env.sh.eex is to try to extract the instance's IP address from /etc/hosts. If the entry doesn't exactly match the fly.io configuration, the lookup yields an empty string.
Later on, since ip is an empty string and no other condition is met, it defaults to 127.0.0.1.
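A rough sketch of the fallback described above, for anyone who wants to see it in isolation. This is not the actual contents of env.sh.eex. The function name resolve_ip and the hosts-file marker fly-local-6pn are assumptions made for illustration.

```shell
#!/bin/sh
# Simplified illustration only; NOT copied from env.sh.eex.
# resolve_ip and the "fly-local-6pn" marker are assumptions for this sketch.
resolve_ip() {
    hosts_file="${1:-/etc/hosts}"
    # Take the address of the first line mentioning the Fly.io-style marker.
    # On a self-hosted container no such line exists, so this stays empty.
    ip=$(awk '/fly-local-6pn/ { print $1; exit }' "$hosts_file" 2>/dev/null)
    # Empty result: fall back to loopback, which peer nodes cannot reach.
    if [ -z "$ip" ]; then
        ip=127.0.0.1
    fi
    echo "$ip"
}
```

Calling resolve_ip on a typical Docker or k8s container returns 127.0.0.1, so every instance ends up advertising a node name ending in @127.0.0.1 and the peers can never connect to each other.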

Expected behavior

A way to set ip, RELEASE_DISTRIBUTION, and RELEASE_NODE manually, so that more advanced self-hosters can configure clustering. Some additional logging around this could also be helpful.
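Something along these lines is what I have in mind. It is only a sketch, not the maintainers' fix and not a tested patch for env.sh.eex. NODE_IP is a hypothetical variable name, and using NODE_NAME with a default of realtime is also an assumption here; RELEASE_DISTRIBUTION and RELEASE_NODE are the standard Elixir release variables.

```shell
#!/bin/sh
# Sketch of an override-friendly setup; names marked hypothetical are not
# taken from the realtime repository.
configure_node() {
    # Keep an explicitly configured distribution mode; default to long names.
    RELEASE_DISTRIBUTION="${RELEASE_DISTRIBUTION:-name}"
    # Only derive a node name when the operator has not set one.
    if [ -z "${RELEASE_NODE:-}" ]; then
        ip="${NODE_IP:-127.0.0.1}"   # NODE_IP is hypothetical
        if [ "$ip" = "127.0.0.1" ]; then
            # Log the fallback so a broken cluster setup is visible at startup.
            echo "warning: NODE_IP not set, using 127.0.0.1; clustering will not work" >&2
        fi
        RELEASE_NODE="${NODE_NAME:-realtime}@${ip}"
    fi
    export RELEASE_DISTRIBUTION RELEASE_NODE
}
```

With this shape, a self-hoster can either pass a full RELEASE_NODE or just a reachable address (in k8s, for example, the pod IP exposed through the downward API), and the warning makes the 127.0.0.1 fallback obvious in the logs.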

Additional Context

I commented on the issue #760 (specifically #760 (comment) ) regarding this, with a fix that is working for me.

This includes logs of the cluster strategy working between multiple instances, with many messages like Node [email protected] has joined the cluster, sending discover message. These are completely absent when it fails to configure an IP and falls back to 127.0.0.1 addresses.

I have rebuilt the image "internally" with this new env.sh.eex and have been using & testing it. There is now only 1 replication slot being used, and broadcast between instances appears to be working correctly.

I'm not great with bash and I can't test this within your environment. I'm also not good with GitHub pull requests, so I'll let you form the final fix for this :)

Again, sorry for the poor bug report, but hopefully it is enough

@Towerful Towerful added the bug Something isn't working label Jun 16, 2024
@Towerful Towerful changed the title env.sh.eex is preventing the ability to run multiple self-hosted realtime instances env.sh.eex is preventing the ability to run multiple self-hosted realtime instances in a cluster Jun 16, 2024
@filipecabaco
Contributor

Hi @Towerful thank you for reporting.

Good catch! I will try to clean this up to improve our self-hosting.

Do you want to open the PR so we can work together to fix the issue?

@Towerful
Author

I presume I would have to fork, update my fork, then open a PR from that?
Like I said, I'm not great with git/GitHub, but I can look into it.
I'm more than happy for you to do it if it's easier; I'm not bothered about attribution etc.

@filipecabaco
Contributor

Ok then I can tackle it 👍 I will ping you to also check the PR
