update the discovery service protocol for v3 discovery
ahrtr committed May 7, 2022
1 parent 6fbdc8e commit 703f7ad
Showing 1 changed file with 42 additions and 52 deletions.
94 changes: 42 additions & 52 deletions content/en/docs/v3.6/dev-internal/discovery_protocol.md
@@ -4,21 +4,24 @@ weight: 1500
description: Discover other etcd members in a cluster bootstrap phase
---

Discovery service protocol helps new etcd member to discover all other members in cluster bootstrap phase using a shared discovery URL.
The discovery service protocol helps a new etcd member discover all other members during the cluster bootstrap phase using a shared discovery token and endpoint list.

Discovery service protocol is _only_ used in cluster bootstrap phase, and cannot be used for runtime reconfiguration or cluster monitoring.

The protocol uses a new discovery token to bootstrap one _unique_ etcd cluster. Remember that one discovery token can represent only one etcd cluster. Once the discovery protocol has started on a token, even if it fails halfway, that token must not be used to bootstrap another etcd cluster.

The rest of this article will walk through the discovery process with examples that correspond to a self-hosted discovery cluster. The public discovery service, discovery.etcd.io, functions the same way, but with a layer of polish to abstract away ugly URLs, generate UUIDs automatically, and provide some protections against excessive requests. At its core, the public discovery service still uses an etcd cluster as the data store as described in this document.
The rest of this article will walk through the discovery process with examples that correspond to a self-hosted discovery cluster.

Note that this document covers only v3 discovery. See the previous version of this document for details on [v2 discovery][v2-discovery].

## Protocol workflow

The idea of the discovery protocol is to use an internal etcd cluster to coordinate the bootstrap of a new cluster. First, all new members interact with the discovery service and help to generate the expected member list. Then each new member bootstraps its server using this list, which serves the same purpose as the `--initial-cluster` flag.

In the following example workflow, we will list each step of protocol in curl format for ease of understanding.
In the following example workflow, we will list each step of protocol using `etcdctl` command for ease of understanding, and we assume that `http://example.com:2379` hosts an etcd cluster for discovery service.

By convention the etcd discovery protocol uses the key prefix `/_etcd/registry`.

By convention the etcd discovery protocol uses the key prefix `_etcd/registry`. If `http://example.com` hosts an etcd cluster for discovery service, a full URL to discovery keyspace will be `http://example.com/v2/keys/_etcd/registry`. We will use this as the URL prefix in the example.

### Creating a new discovery token

@@ -33,85 +36,72 @@ UUID=$(uuidgen)
The discovery token expects a cluster size that must be specified. The size is used by the discovery service to know when it has found all members that will initially form the cluster.

```
curl -X PUT http://example.com/v2/keys/_etcd/registry/${UUID}/_config/size -d value=${cluster_size}
etcdctl --endpoints=http://example.com:2379 put /_etcd/registry/${UUID}/_config/size ${cluster_size}
```

Usually the cluster size is 3, 5 or 7. Check [optimal cluster size][cluster-size] for more details.
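
The keys used in this walkthrough all sit under the same registry prefix. As a rough sketch of that layout, the helpers below build the key names from a discovery token. The function names are hypothetical and exist only for illustration; they are not part of etcd or `etcdctl`.

```shell
# Hypothetical helpers that build the v3 discovery keys described in this document.
# They only print key names and never contact an etcd cluster.
registry_prefix="/_etcd/registry"

# size_key TOKEN -> key that stores the expected cluster size
size_key() {
  printf '%s/%s/_config/size\n' "$registry_prefix" "$1"
}

# members_prefix TOKEN -> prefix under which members register
members_prefix() {
  printf '%s/%s/members\n' "$registry_prefix" "$1"
}

# member_key TOKEN MEMBER_ID -> key that one member creates for itself
member_key() {
  printf '%s/%s/members/%s\n' "$registry_prefix" "$1" "$2"
}
```

For example, `size_key "${UUID}"` prints the key that the `put` command above writes the cluster size to.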

### Bringing up etcd processes

Given the discovery URL, use it as `-discovery` flag and bring up etcd processes. Every etcd process will follow this next few steps internally if given a `-discovery` flag.

### Registering itself
Pass the discovery token `${UUID}` in the `--discovery-token` flag, and pass the endpoints of the etcd cluster backing the discovery service in the `--discovery-endpoints` flag. This enables v3 discovery to bootstrap the etcd cluster.

The first thing for etcd process is to register itself into the discovery URL as a member. This is done by creating member ID as a key in the discovery URL.
Every etcd process will follow the next few steps internally if `--discovery-token` and `--discovery-endpoints` flags are given.
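
To make the flag usage concrete, here is a minimal dry-run sketch that prints the command line one new member might be started with. The helper name and its argument order are hypothetical, `--discovery-endpoints` is assumed to accept a comma-separated list, and a real deployment would normally need further flags (client URLs, data directory, and so on).

```shell
# Hypothetical dry-run helper: prints an etcd command line, starts nothing.
# $1: member name, $2: peer URL, $3: discovery token,
# $4: discovery endpoints (assumed comma-separated)
etcd_discovery_cmd() {
  printf 'etcd --name %s --initial-advertise-peer-urls %s --listen-peer-urls %s --discovery-token %s --discovery-endpoints %s\n' \
    "$1" "$2" "$2" "$3" "$4"
}
```

Calling `etcd_discovery_cmd infra0 http://10.0.1.10:2380 "${UUID}" http://example.com:2379` prints a command that can be reviewed before it is run by hand.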

If the discovery service enables client cert authentication, configure the following flags. They follow exactly the same usage as using `etcdctl` to communicate with an etcd cluster.
```
curl -X PUT http://example.com/v2/keys/_etcd/registry/${UUID}/${member_id}?prevExist=false -d value="${member_name}=${member_peer_url_1}&${member_name}=${member_peer_url_2}"
--discovery-insecure-transport
--discovery-insecure-skip-tls-verify
--discovery-cert
--discovery-key
--discovery-cacert
```

### Checking the status

It checks the expected cluster size and registration status in discovery URL, and decides what the next action is.

If the discovery service enables role based authentication, configure the following flags. They follow exactly the same usage as using `etcdctl` to communicate with an etcd cluster.
```
curl -X GET http://example.com/v2/keys/_etcd/registry/${UUID}/_config/size
curl -X GET http://example.com/v2/keys/_etcd/registry/${UUID}
--discovery-user
--discovery-password
```

If registered members are still not enough, it will wait for left members to appear.

If the number of registered members is bigger than the expected size N, it treats the first N registered members as the member list for the cluster. If the member itself is in the member list, the discovery procedure succeeds and it fetches all peers through the member list. If it is not in the member list, the discovery procedure finishes with the failure that the cluster has been full.

In etcd implementation, the member may check the cluster status even before registering itself. So it could fail quickly if the cluster has been full.

### Waiting for all members

The wait process is described in detail in the [etcd API documentation][api].

The default time or timeout values can also be changed using the following flags, which follow exactly the same usage as using `etcdctl` to communicate with an etcd cluster.
```
curl -X GET "http://example.com/v2/keys/_etcd/registry/${UUID}?wait=true&waitIndex=${current_etcd_index}"
--discovery-dial-timeout
--discovery-request-timeout
--discovery-keepalive-time
--discovery-keepalive-timeout
```

It keeps waiting until finding all members.

## Public discovery service
### Registering itself

CoreOS Inc. hosts a public discovery service at https://discovery.etcd.io/ , which provides some nice features for ease of use.
The first thing each etcd process does is register itself as a member of the new cluster. It does this by creating a key named after its member ID under the cluster's registry key.

### Mask key prefix
```
etcdctl --endpoints=http://example.com:2379 put /_etcd/registry/${UUID}/members/${member_id} "${member_name}=${member_peer_url_1}&${member_name}=${member_peer_url_2}"
```
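
The registered value is a list of `name=peer_url` pairs joined by `&`, one pair per peer URL. A small sketch of how such a value could be assembled follows; `member_value` is a hypothetical helper, not an etcd command. Because the value contains `&`, it has to be quoted when handed to `etcdctl` from a shell.

```shell
# Hypothetical helper: joins "name=url" pairs with "&".
# $1: member name, remaining arguments: that member's peer URLs
member_value() {
  name=$1
  shift
  value=""
  for url in "$@"; do
    # Prepend "&" only when a pair has already been added.
    value="${value:+${value}&}${name}=${url}"
  done
  printf '%s\n' "$value"
}
```

The result can then be passed as a single quoted argument, for example `put "$key" "$(member_value infra0 http://10.0.1.10:2380)"`.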

Public discovery service will redirect `https://discovery.etcd.io/${UUID}` to etcd cluster behind for the key at `/v2/keys/_etcd/registry`. It masks register key prefix for short and readable discovery url.
### Checking the status

### Get new token
Each member then checks the expected cluster size and the registration status, and decides what to do next.

```
GET /new
Sent query:
size=${cluster_size}
Possible status codes:
200 OK
400 Bad Request
200 Body:
generated discovery url
etcdctl --endpoints=http://example.com:2379 get /_etcd/registry/${UUID}/_config/size
etcdctl --endpoints=http://example.com:2379 get /_etcd/registry/${UUID}/members --prefix
```

The generation process in the service follows the steps from [Creating a New Discovery Token][new-discovery-token] to [Specifying the Expected Cluster Size][expected-cluster-size].
If not enough members have registered yet, it waits for the other members to appear.

### Check discovery status
If the number of registered members is greater than the expected size N, it treats the first N registered members as the member list for the cluster. If the member itself is in the member list, the discovery procedure succeeds, and it fetches all peers through the member list. If it is not in the member list, the discovery procedure fails because the cluster is already full.

```
GET /${UUID}
```
The member may check the cluster status even before registering itself, so it can fail quickly if the cluster is already full.
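
The decision described above can be sketched as a small shell function. This is an illustration under stated assumptions, not the etcd implementation: the member IDs are assumed to be passed in registration order (earliest first), and the outcome labels `wait`, `ok`, and `full` are made up for this example.

```shell
# discovery_decision SIZE SELF_ID MEMBER_ID...
# Prints "wait" if fewer than SIZE members have registered, "ok" if SELF_ID
# is among the first SIZE members, and "full" otherwise.
discovery_decision() {
  size=$1
  self=$2
  shift 2
  if [ "$#" -lt "$size" ]; then
    echo wait
    return 0
  fi
  i=0
  for id in "$@"; do
    i=$((i + 1))
    # Only the first SIZE registered members form the cluster.
    [ "$i" -gt "$size" ] && break
    if [ "$id" = "$self" ]; then
      echo ok
      return 0
    fi
  done
  echo full
}
```

With an expected size of 3, a member that registered fourth gets `full`, while any of the first three gets `ok`.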

The status for this discovery token, including the machines that have been registered, can be checked by requesting the value of the UUID.
### Waiting for all members

### Open-source repository
The wait process keeps watching the key prefix `/_etcd/registry/${UUID}/members` until all members have registered.

The repository is located at https://github.com/coreos/discovery.etcd.io. It could be used to build a custom discovery service.
```
etcdctl --endpoints=http://example.com:2379 watch /_etcd/registry/${UUID}/members --prefix
```
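
For manual experiments, a polling loop can stand in for the watch. The sketch below is a simplification that re-reads the member keys once per second instead of watching them; `count_members` and `wait_for_members` are hypothetical helper names, and the count relies on `etcdctl get --keys-only` printing one non-empty line per key.

```shell
# count_members ENDPOINT TOKEN
# Prints the number of keys currently under the token's members prefix.
count_members() {
  etcdctl --endpoints="$1" get "/_etcd/registry/$2/members" --prefix --keys-only | grep -c .
}

# wait_for_members ENDPOINT TOKEN SIZE
# Polls once per second until at least SIZE members have registered.
wait_for_members() {
  while [ "$(count_members "$1" "$2")" -lt "$3" ]; do
    sleep 1
  done
}
```

For example, `wait_for_members http://example.com:2379 "${UUID}" "${cluster_size}"` returns once enough member keys exist.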

[api]: /docs/v2.3/api#waiting-for-a-change
[v2-discovery]: /docs/v3.5/dev-internal/discovery_protocol
[cluster-size]: /docs/v2.3/admin_guide#optimal-cluster-size
[expected-cluster-size]: #specifying-the-expected-cluster-size
[new-discovery-token]: #creating-a-new-discovery-token
