# Update the discovery service protocol for v3 discovery #583

Merged 1 commit on Sep 14, 2022.
94 changes: 42 additions & 52 deletions in content/en/docs/v3.6/dev-internal/discovery_protocol.md

weight: 1500
description: Discover other etcd members in a cluster bootstrap phase
---

The discovery service protocol helps new etcd members discover all other members during the cluster bootstrap phase using a shared discovery token and endpoint list.

The discovery service protocol is _only_ used during the cluster bootstrap phase; it cannot be used for runtime reconfiguration or cluster monitoring.

The protocol uses a new discovery token to bootstrap one _unique_ etcd cluster. One discovery token can represent only one etcd cluster: once the discovery protocol has started with a token, even if it fails halfway, that token must not be reused to bootstrap another etcd cluster.

The rest of this article walks through the discovery process with examples that correspond to a self-hosted discovery cluster.

Note that this document is only for v3 discovery. Check the previous document for more details on [v2 discovery][v2-discovery].

## Protocol workflow

The idea of the discovery protocol is to use a separate etcd cluster to coordinate the bootstrap of a new cluster. First, all new members interact with the discovery service and help generate the expected member list. Then each new member bootstraps its server using this list, which performs the same function as the `--initial-cluster` flag, as the sketch below illustrates.
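For comparison, a static bootstrap of the same cluster would pass the full member list explicitly. This is a minimal sketch; the member names, IPs, and ports are hypothetical:

```
etcd --name infra0 \
  --initial-advertise-peer-urls http://10.0.1.10:2380 \
  --initial-cluster "infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380" \
  --initial-cluster-state new
```

The discovery workflow below produces exactly this member list automatically.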

In the following example workflow, each step of the protocol is listed as an `etcdctl` command for ease of understanding, and we assume that `http://example.com:2379` hosts an etcd cluster acting as the discovery service.

By convention the etcd discovery protocol uses the key prefix `/_etcd/registry`.
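Under this prefix, the workflow below uses two kinds of keys, summarized here for reference (derived from the commands in this document):

```
/_etcd/registry/${UUID}/_config/size          # expected cluster size
/_etcd/registry/${UUID}/members/${member_id}  # one key per registered member
```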


### Creating a new discovery token

```
UUID=$(uuidgen)
```

### Specifying the expected cluster size
A cluster size must be specified for the discovery token. The size is used by the discovery service to know when it has found all the members that will initially form the cluster.

```
etcdctl --endpoints=http://example.com:2379 put /_etcd/registry/${UUID}/_config/size ${cluster_size}
```

---

Review thread on this change:

**Member:** Not sure we can switch the curl commands to `etcdctl`, as it is not obvious how a developer would reimplement it. The v2 discovery protocol was based on the v2 API, which was a simple REST API that could be called via curl, showcasing how to implement it. The problem is that the v3 API uses gRPC, which is a binary protocol, so it's no longer obvious how we should explain it.

Could we maybe provide a link to the gRPC proto definition and list which methods need to be implemented? It would also be great if we could provide examples using a client similar to curl, but native to gRPC.

**Member Author:** Developers just need to follow the guide if they want to bootstrap a new cluster using v3 discovery. Probably we should add a simple example (as a new section) to help developers understand it.

With regard to migrating from v2 discovery to v3 discovery, @ptabor previously left a comment. Both v2 and v3 discovery are only useful at the very first bootstrap. Once etcd members have local data, they do not depend on the discovery service any more. So it doesn't make much sense to migrate the previous v2 discovery records?

We don't know the real data on how many users deploy v2 discovery in their production environments. Most likely the use cases are very few, so users can manually perform the migration if they want.

What do you think?

**Member:**

> Developers just need to follow the guide if they want to bootstrap a new cluster using v3 discovery. Probably we should add a simple example (as a new section) to help developers understand it.

My understanding is that this documentation is not for developers to learn from, but to enable them to implement their own service serving discovery without deploying full etcd, for example https://github.com/etcd-io/discoveryserver. I would have expected this to be caught when we were migrating tests from v2 to v3, but it wasn't.

> With regard to migrating from v2 discovery to v3 discovery ... So it doesn't make much sense to migrate the previous v2 discovery records?

I think so.

> We don't know the real data on how many users deploy v2 discovery in their production environments. Most likely the use cases are very few, so users can manually perform the migration if they want.

I think we could look into the metrics of https://github.com/etcd-io/discoveryserver, which is the public etcd service discovery maintained by the etcd project.

> What do you think?

Feels like we should also consider implementing v3 discovery in https://github.com/etcd-io/discoveryserver. However, I would be worried about changing and deploying a project that hasn't seen many updates in the last couple of years.

TODO:

  • Look at usage metrics of discoveryserver
  • Decide if we want to implement v3 in discoveryserver
  • Add tests to etcd that use discoveryserver, to make sure we don't break things in the future

**Member Author:**

> My understanding is that this documentation is not for developers to learn from, but to enable them to implement their own service serving discovery without deploying full etcd ...

Not really, per my understanding. discovery.etcd.io (discoveryserver) is just a router, which routes client requests to the backend etcd cluster. Please note that users do not necessarily need to use the public discovery API; they can just deploy an etcd cluster as the discovery service themselves.

> I think we could look into the metrics of https://github.com/etcd-io/discoveryserver ...

This is a good point. But just as I mentioned above, users do not necessarily use the public discovery API. Anyway, we can take a look at the metrics first. Do you or @ptabor have information on the backend etcd server supporting the public discovery service?

> Decide if we want to implement v3 in discoveryserver

I don't think we should implement a public discovery service for v3 discovery. But it's open to discussion.

**Member (@serathius, May 9, 2022):** I read some discoveryserver documentation; you are right, it's just a router to the etcd server underneath. Question: should we also implement v3 discovery support in it, to avoid removing the ability for users to use the public discovery service in v3.6?

**Member Author:** A couple of comments:

  1. There are already several e2e cases covering v2 discovery. We can remove them in 3.7 or 3.8 when clientv2 is completely removed.
  2. With regard to whether we should support a public v3 discovery service, we can discuss that in a separate session after we investigate the existing data in the public v2 discovery service. Personally I don't think it's a priority for now.
  3. I will try to add a new example section to the doc in a separate PR.

This PR should be good.

**Member:** cc @ptabor

---

Usually the cluster size is 3, 5 or 7. Check [optimal cluster size][cluster-size] for more details.
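Putting the token creation steps together, a minimal sketch against the example discovery cluster (the size value of 3 is illustrative):

```
UUID=$(uuidgen)
cluster_size=3
etcdctl --endpoints=http://example.com:2379 put /_etcd/registry/${UUID}/_config/size ${cluster_size}
# verify that the size has been stored
etcdctl --endpoints=http://example.com:2379 get /_etcd/registry/${UUID}/_config/size --print-value-only
```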

### Bringing up etcd processes

Set the discovery token `${UUID}` via the `--discovery-token` flag, and set the endpoints of the etcd cluster backing the discovery service via the `--discovery-endpoints` flag. This enables v3 discovery to bootstrap the etcd cluster.
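For example, a minimal sketch of bringing up one member this way (the member name and peer URL are hypothetical):

```
etcd --name infra0 \
  --initial-advertise-peer-urls http://10.0.1.10:2380 \
  --listen-peer-urls http://10.0.1.10:2380 \
  --discovery-token ${UUID} \
  --discovery-endpoints http://example.com:2379
```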

Every etcd process will follow the next few steps internally if `--discovery-token` and `--discovery-endpoints` flags are given.

If the discovery service enables client certificate authentication, configure the following flags. They follow exactly the same usage as the corresponding `etcdctl` flags for communicating with an etcd cluster.
```
--discovery-insecure-transport
--discovery-insecure-skip-tls-verify
--discovery-cert
--discovery-key
--discovery-cacert
```
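For instance, a sketch with hypothetical certificate paths:

```
etcd --name infra0 \
  --discovery-token ${UUID} \
  --discovery-endpoints https://example.com:2379 \
  --discovery-cacert /etc/etcd/discovery-ca.crt \
  --discovery-cert /etc/etcd/discovery-client.crt \
  --discovery-key /etc/etcd/discovery-client.key
```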


If the discovery service enables role-based authentication, configure the following flags. They follow exactly the same usage as the corresponding `etcdctl` flags for communicating with an etcd cluster.
```
--discovery-user
--discovery-password
```
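Similarly, a sketch with a hypothetical user and password:

```
etcd --name infra0 \
  --discovery-token ${UUID} \
  --discovery-endpoints http://example.com:2379 \
  --discovery-user discovery \
  --discovery-password changeme
```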


The default time or timeout values can also be changed using the following flags, which follow exactly the same usage as the corresponding `etcdctl` flags.
```
--discovery-dial-timeout
--discovery-request-timeout
--discovery-keepalive-time
--discovery-keepalive-timeout
```
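A sketch with explicitly tuned timeouts (the values shown are illustrative, not recommendations):

```
etcd --name infra0 \
  --discovery-token ${UUID} \
  --discovery-endpoints http://example.com:2379 \
  --discovery-dial-timeout 2s \
  --discovery-request-timeout 5s \
  --discovery-keepalive-time 2s \
  --discovery-keepalive-timeout 6s
```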

### Registering itself

The first thing each etcd process does is register itself as a member of the new cluster. This is done by creating its member ID as a key under the registry prefix.

```
etcdctl --endpoints=http://example.com:2379 put /_etcd/registry/${UUID}/members/${member_id} "${member_name}=${member_peer_url_1}&${member_name}=${member_peer_url_2}"
```
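As a concrete illustration (the member ID, name, and peer URL below are hypothetical):

```
etcdctl --endpoints=http://example.com:2379 put \
  /_etcd/registry/${UUID}/members/8e9e05c52164694d \
  "infra0=http://10.0.1.10:2380"
```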

### Checking the status

It checks the expected cluster size and registration status, and decides what the next action is.

```
etcdctl --endpoints=http://example.com:2379 get /_etcd/registry/${UUID}/_config/size
etcdctl --endpoints=http://example.com:2379 get /_etcd/registry/${UUID}/members
```
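A hedged sketch of the comparison an implementation might make (the counting pipeline is illustrative):

```
size=$(etcdctl --endpoints=http://example.com:2379 get /_etcd/registry/${UUID}/_config/size --print-value-only)
# --keys-only prints one key per match; count the non-empty lines
registered=$(etcdctl --endpoints=http://example.com:2379 get /_etcd/registry/${UUID}/members --prefix --keys-only | grep -c .)
if [ "${registered}" -lt "${size}" ]; then
  echo "still waiting for $((size - registered)) more member(s)"
fi
```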

If the registered members are not yet enough, it will wait for the remaining members to appear.

If the number of registered members is larger than the expected size N, the first N registered members are treated as the member list for the cluster. If the member itself is in the member list, the discovery procedure succeeds, and it fetches all peers through the member list. If it is not in the member list, the discovery procedure fails because the cluster is already full.

The member may check the cluster status even before registering itself, so it can fail quickly if the cluster is already full.
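A sketch of selecting the first N members by creation order, using `etcdctl`'s sort flags:

```
etcdctl --endpoints=http://example.com:2379 get /_etcd/registry/${UUID}/members \
  --prefix --sort-by=CREATE --order=ASCEND --limit=${cluster_size}
```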

### Waiting for all members

The wait process keeps watching the key prefix `/_etcd/registry/${UUID}/members` until all members are found.

```
etcdctl --endpoints=http://example.com:2379 watch /_etcd/registry/${UUID}/members --prefix
```

[api]: /docs/v2.3/api#waiting-for-a-change
[v2-discovery]: /docs/v3.5/dev-internal/discovery_protocol
[cluster-size]: /docs/v2.3/admin_guide#optimal-cluster-size
[expected-cluster-size]: #specifying-the-expected-cluster-size
[new-discovery-token]: #creating-a-new-discovery-token