set up SIG-etcd (#7372)
* Create SIG-etcd

Co-authored-by: Josh Berkus <[email protected]>
Co-authored-by: Marek Siarkowicz <[email protected]>

* Remove unnecessary deviations

* Update sig-etcd/charter.md

Co-authored-by: Benjamin Wang <[email protected]>

* Update charter.md to remove OWNERS file from deviation

Adding OWNERS file will be a hard requirement for etcd repo. I also added an issue in etcd repo for tracking: etcd-io/etcd#16367.

* update charter.md, vision.md and README.md to address comments

* update sig-etcd with new chairs

* Update charter.md

* Update charter.md to include implicit k8s-etcd-contract as part of sig-etcd's responsibility in a sentence, instead of a linked google doc

* Update sig-etcd/vision.md

Co-authored-by: Tim Bannister <[email protected]>

* Update sig-etcd/vision.md

Co-authored-by: Tim Bannister <[email protected]>

* update etcd meeting link, time and youtube link

---------

Co-authored-by: Josh Berkus <[email protected]>
Co-authored-by: Marek Siarkowicz <[email protected]>
Co-authored-by: Wenjia <[email protected]>
Co-authored-by: Benjamin Wang <[email protected]>
Co-authored-by: Tim Bannister <[email protected]>
6 people authored Sep 12, 2023
1 parent e7383ee commit 3edd1c6
Showing 8 changed files with 410 additions and 0 deletions.
5 changes: 5 additions & 0 deletions OWNERS_ALIASES
@@ -54,6 +54,11 @@ aliases:
    - reylejano
    - sftim
    - tengqm
  sig-etcd-leads:
    - ahrtr
    - jmhbnz
    - serathius
    - wenjiaswe
  sig-instrumentation-leads:
    - dashpole
    - dgrisonnet
1 change: 1 addition & 0 deletions liaisons.md
@@ -40,6 +40,7 @@ members will assume one of the departing members groups.
| [SIG Cluster Lifecycle](sig-cluster-lifecycle/README.md) | Nabarun Pal (**[@palnabarun](https://github.com/palnabarun)**) |
| [SIG Contributor Experience](sig-contributor-experience/README.md) | Bob Killen (**[@mrbobbytables](https://github.com/mrbobbytables)**) |
| [SIG Docs](sig-docs/README.md) | Carlos Tadeu Panato Jr. (**[@cpanato](https://github.com/cpanato)**) |
| [SIG etcd](sig-etcd/README.md) | TBD (**[@TBD](https://github.com/TBD)**) |
| [SIG Instrumentation](sig-instrumentation/README.md) | Christoph Blecker (**[@cblecker](https://github.com/cblecker)**) |
| [SIG K8s Infra](sig-k8s-infra/README.md) | Stephen Augustus (**[@justaugustus](https://github.com/justaugustus)**) |
| [SIG Multicluster](sig-multicluster/README.md) | Bob Killen (**[@mrbobbytables](https://github.com/mrbobbytables)**) |
8 changes: 8 additions & 0 deletions sig-etcd/OWNERS
@@ -0,0 +1,8 @@
# See the OWNERS docs at https://go.k8s.io/owners

reviewers:
- sig-etcd-leads
approvers:
- sig-etcd-leads
labels:
- sig/etcd
118 changes: 118 additions & 0 deletions sig-etcd/README.md
@@ -0,0 +1,118 @@
<!---
This is an autogenerated file!
Please do not edit this file directly, but instead make changes to the
sigs.yaml file in the project root.
To understand how this file is generated, see https://git.k8s.io/community/generator/README.md
--->
# etcd Special Interest Group

etcd is a production-ready store for building cloud-native distributed systems and managing cloud-native infrastructure via orchestrators like Kubernetes.
Etcd should provide distributed system primitives (such as distributed locking and leader election) that allow users to create scalable, highly available and fault-tolerant systems.
Etcd is the place to store the infrastructure configuration, not only as part of Kubernetes, but also as a standalone solution.

The [charter](charter.md) defines the scope and governance of the etcd Special Interest Group.

## Meetings
*Joining the [mailing list](https://groups.google.com/g/etcd-dev) for the group will typically add invites for the following meetings to your calendar.*
* Regular SIG Meeting: [Thursdays at 11:00 PT (Pacific Time)](https://zoom.us/my/cncfetcdproject) (biweekly). [Convert to your timezone](http://www.thetimezoneconverter.com/?t=11:00&tz=PT%20%28Pacific%20Time%29).
  * [Meeting notes and Agenda](https://docs.google.com/document/d/16XEGyPBisZvmmoIHSZzv__LoyOeluC5a4x353CX0SIM/edit?usp=sharing).
  * [Meeting recordings](https://www.youtube.com/playlist?list=PLRGL688DpO9rtufHbiunuCHddYY6MGkwW).

## Leadership

### Chairs
The Chairs of the SIG run operations and processes governing the SIG.

* James Blair (**[@jmhbnz](https://github.com/jmhbnz)**), Red Hat
* Wenjia Zhang (**[@wenjiaswe](https://github.com/wenjiaswe)**), Google

### Technical Leads
The Technical Leads of the SIG establish new subprojects, decommission existing
subprojects, and resolve cross-subproject technical issues and decisions.

* Benjamin Wang (**[@ahrtr](https://github.com/ahrtr)**), VMware
* Marek Siarkowicz (**[@serathius](https://github.com/serathius)**), Google

## Contact
- Slack: [#etcd](https://kubernetes.slack.com/messages/etcd)
- [Mailing list](https://groups.google.com/g/etcd-dev)
- [Open Community Issues/PRs](https://github.com/kubernetes/community/labels/sig%2Fetcd)
- GitHub Teams:
    - [@kubernetes/sig-etcd-leads](https://github.com/orgs/kubernetes/teams/sig-etcd-leads) - SIG Chairs and Tech Leads
- Steering Committee Liaison: TBD (**[@TBD](https://github.com/TBD)**)

## Subprojects

The following [subprojects][subproject-definition] are owned by sig-etcd:
### bbolt
An embedded key/value database for Go.
- **Owners:**
  - [etcd-io/bbolt/MAINTAINERS](https://github.com/etcd-io/bbolt/blob/master/MAINTAINERS)
### cetcd
Serve Consul with etcd
- **Owners:**
  - [etcd-io/cetcd/MAINTAINERS](https://github.com/etcd-io/cetcd/blob/master/MAINTAINERS)
### dbtester
Distributed database benchmark tester
- **Owners:**
  - [etcd-io/dbtester/MAINTAINERS](https://github.com/etcd-io/dbtester/blob/master/MAINTAINERS)
### discovery.etcd.io
Kubernetes manifests powering discovery.etcd.io
- **Owners:**
  - [etcd-io/discovery.etcd.io/MAINTAINERS](https://github.com/etcd-io/discovery.etcd.io/blob/master/MAINTAINERS)
### discoveryserver
Public etcd Discovery Service
- **Owners:**
  - [etcd-io/discoveryserver/MAINTAINERS](https://github.com/etcd-io/discoveryserver/blob/master/MAINTAINERS)
### etcd
Distributed reliable key-value store for the most critical data of a distributed system
- **Owners:**
  - [etcd-io/etcd/MAINTAINERS](https://github.com/etcd-io/etcd/blob/master/MAINTAINERS)
### etcd-play
etcd playground
- **Owners:**
  - [etcd-io/etcd-play/MAINTAINERS](https://github.com/etcd-io/etcd-play/blob/master/MAINTAINERS)
### etcdlabs
etcd playground
- **Owners:**
  - [etcd-io/etcdlabs/MAINTAINERS](https://github.com/etcd-io/etcdlabs/blob/master/MAINTAINERS)
### gofail
failpoints for go
- **Owners:**
  - [etcd-io/gofail/MAINTAINERS](https://github.com/etcd-io/gofail/blob/master/MAINTAINERS)
### govanityurls
Use a custom domain in your Go import path
- **Owners:**
  - [etcd-io/govanityurls/MAINTAINERS](https://github.com/etcd-io/govanityurls/blob/master/MAINTAINERS)
### jetcd
etcd java client
- **Owners:**
  - [etcd-io/jetcd/MAINTAINERS](https://github.com/etcd-io/jetcd/blob/master/MAINTAINERS)
### maintainers
issue tracking for project wide non-code concerns
- **Owners:**
  - [etcd-io/maintainers/MAINTAINERS](https://github.com/etcd-io/maintainers/blob/master/MAINTAINERS)
### protodoc
protodoc generates Protocol Buffer documentation.
- **Owners:**
  - [etcd-io/protodoc/MAINTAINERS](https://github.com/etcd-io/protodoc/blob/master/MAINTAINERS)
### raft
Raft library for maintaining a replicated state machine
- **Owners:**
  - [etcd-io/raft/MAINTAINERS](https://github.com/etcd-io/raft/blob/master/MAINTAINERS)
### website
etcd-io
- **Owners:**
  - [etcd-io/website/MAINTAINERS](https://github.com/etcd-io/website/blob/master/MAINTAINERS)
### zetcd
Serve the Apache Zookeeper API but back it with an etcd cluster
- **Owners:**
  - [etcd-io/zetcd/MAINTAINERS](https://github.com/etcd-io/zetcd/blob/master/MAINTAINERS)

[subproject-definition]: https://github.com/kubernetes/community/blob/master/governance.md#subprojects
[working-group-definition]: https://github.com/kubernetes/community/blob/master/governance.md#working-groups
<!-- BEGIN CUSTOM CONTENT -->

<!-- END CUSTOM CONTENT -->
63 changes: 63 additions & 0 deletions sig-etcd/charter.md
@@ -0,0 +1,63 @@
# SIG etcd Charter

This charter adheres to the conventions described in the [Kubernetes Charter README] and uses
the Roles and Organization Management outlined in [sig-governance].

[Kubernetes Charter README]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/README.md
[sig-governance]: https://github.com/kubernetes/community/blob/master/committee-steering/governance/sig-governance.md

## Scope

Owns the etcd project and how it is used by Kubernetes.

### In scope

#### Code, Binaries and Services

- Development of [etcd] and other repositories under the [etcd-io organization]
- Maintenance of the [etcd image] packaged with Kubernetes

[etcd]: https://github.com/etcd-io/etcd
[etcd-io organization]: https://github.com/etcd-io
[etcd image]: https://github.com/kubernetes/kubernetes/tree/master/cluster/images/etcd

#### Cross-cutting and Externally Facing Processes

- Specifying, testing and improving the implicit Kubernetes-ETCD Contract, which includes storage requirements, write and delete requirements, read requirements and watch requirements.
- Release process of etcd and other binaries belonging to the [etcd-io organization]

### Out of scope

- Structure of data stored in etcd by Kubernetes components is owned by SIG API Machinery

## Roles and Organization Management

This SIG follows the Roles and Organization Management outlined in [sig-governance]
and opts in to updates and modifications to [sig-governance].

### Additional responsibilities of Tech Leads

- Release of etcd and other binaries belonging to the [etcd-io organization]

### Deviations from [sig-governance]

- SIG etcd's participation in the Kubernetes release cycle is limited by etcd having a different schedule for its releases.
- SIG etcd uses pre-existing forums for communication:
  - Email: [etcd-dev](https://groups.google.com/forum/?hl=en#!forum/etcd-dev).
  - Slack: [#etcd](https://kubernetes.slack.com/messages/C3HD8ARJ5/details/) channel on Kubernetes.
- SIG etcd contributing instructions ([CONTRIBUTING.md]) are defined in the etcd project.

[CONTRIBUTING.md]: https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md

### Deviations from [kubernetes-repositories]

- SIG etcd repositories live in github.com/etcd-io
- SIG etcd repositories should (but are not required to) adopt the merge bot and the Kubernetes PR commands/bot.
- SIG etcd repositories will follow [rules for donated repositories].

[kubernetes-repositories]: https://github.com/kubernetes/community/blob/master/github-management/kubernetes-repositories.md#sig-repositories
[rules for donated repositories]: https://github.com/kubernetes/community/blob/master/github-management/kubernetes-repositories.md#rules-for-donated-repositories

### Subproject Creation

By SIG Technical Leads
100 changes: 100 additions & 0 deletions sig-etcd/vision.md
@@ -0,0 +1,100 @@
# SIG etcd Vision

The long-term success of the etcd project depends on the following:
- Etcd is a reliable key-value storage service
- Etcd is simple to operate
- Etcd is a standalone solution for managing infrastructure configuration
- Etcd scales beyond Kubernetes dimensions

The goals and milestones listed here are for future releases.
The scope of release v3.6 has already been defined and is unlikely to change.

## Etcd is a reliable key-value storage service

Reliability remains the most important property of etcd.
The project cannot allow for another [data inconsistency incident].
If we could only pick one thing from the list of goals above, this would be it.
No matter what features we add in the future,
they must not diminish etcd's reliability.
We must establish processes and safeguards to prevent future incidents.

How?
- Etcd API guarantees are well understood, documented and tested.
- Etcd adopts a production readiness review process for new features, similar to the one used by Kubernetes.
- Robustness tests should cover most of the API and most common failures.
- New features must have accompanying e2e tests and be covered by robustness tests.
- Etcd must be able to immediately detect corruption.
- Etcd must be able to automatically recover from data corruption.

[data inconsistency incident]: https://github.com/etcd-io/etcd/blob/main/Documentation/postmortems/v3.5-data-inconsistency.md

## Etcd is simple to operate

Etcd should be easy to operate.
Currently, there are many steps involved in operating etcd,
and some of these steps require external tools.
For example, Kubernetes provides tools to [downgrade/upgrade etcd].
These tools are not part of etcd,
but they are available as part of the Kubernetes distribution of etcd.

How?
- Etcd should not require users to run periodic defrag
- Etcd officially supports live upgrades and downgrades
- Disaster recovery for Etcd & Kubernetes
- Reliable cluster membership changes via learners with automated promotion
- Two node etcd clusters

## Etcd is a standalone solution for managing infrastructure configuration

Kubernetes is not the only way to manage infrastructure.
It was the first to introduce many concepts that have now become the standard,
but they are not unique to Kubernetes.
The most important design principle of Kubernetes,
the reconciliation protocol, is not something unique to it.

Reconciliation can be implemented solely on etcd,
as has been shown by projects like Cilium and Calico Typha,
which support etcd-based control planes.
The reason why this idea has not propagated further is
the amount of work that was put into making
the reconciliation protocol scale in Kubernetes.
The watch cache is a key part of this scaling,
and it is not part of the etcd project.

If etcd provided a Kubernetes-like storage interface
and primitives for the reconciliation protocol,
it would be a more viable solution for managing infrastructure.
This would allow users to build etcd-based control planes that
could scale to meet the needs of large and complex deployments.

How?
- Introduce a Kubernetes-like storage interface into etcd-client
- Provide etcd primitives for the reconciliation protocol
- Strip out the Kubernetes watch cache and make it part of the etcd client.
- Use the watch cache in the client to build an eventually consistent etcd proxy.
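
The reconciliation idea described in this section can be illustrated with a small, self-contained sketch. It assumes a revisioned key-value store that can replay changes after a given revision, which is the general shape of a watch. The `Store` class, its `put`/`delete`/`watch` methods, and the `reconcile` function are hypothetical names invented for this illustration; they are not the etcd client API or the Kubernetes storage interface.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical in-memory stand-in for a revisioned key-value store."""

    revision: int = 0
    data: dict = field(default_factory=dict)
    # Each event is (revision, key, value); value is None for a delete.
    events: list = field(default_factory=list)

    def put(self, key, value):
        self.revision += 1
        self.data[key] = value
        self.events.append((self.revision, key, value))

    def delete(self, key):
        if key in self.data:
            self.revision += 1
            del self.data[key]
            self.events.append((self.revision, key, None))

    def watch(self, after_revision):
        """Return all events newer than after_revision, in revision order."""
        return [e for e in self.events if e[0] > after_revision]


def reconcile(store, actual, last_seen):
    """Replay pending events so `actual` converges on the stored desired state.

    Returns the last revision that was applied, to be passed to the next call.
    """
    for revision, key, value in store.watch(last_seen):
        if value is None:
            actual.pop(key, None)
        else:
            actual[key] = value
        last_seen = revision
    return last_seen


if __name__ == "__main__":
    store = Store()
    store.put("/config/replicas", "3")
    actual = {}
    seen = reconcile(store, actual, 0)
    print(actual, seen)
```

A controller built this way only needs to remember the last revision it applied. A client-side cache, as proposed in the list above, would serve the same replay from memory instead of asking the store each time.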

[downgrade/upgrade etcd]: https://github.com/kubernetes/kubernetes/tree/master/cluster/images/etcd

## Etcd scales beyond Kubernetes dimensions

Etcd has proven its scalability by enabling Kubernetes clusters of up to 5,000 nodes.
However, as the cloud native ecosystem has evolved, new projects have been built on top of Kubernetes.
These projects, such as [KCP] (a multi-cluster control plane) and [Kueue] (a batch job queuing system),
have different scalability requirements than pure Kubernetes.
For example, they need support for larger storage sizes and higher throughput.

Etcd's strong points are its reliable raft and efficient watch implementation.
However, its storage capabilities are not as strong.
To address this, we should look into growing our storage capabilities and making them more flexible depending on the use case.

How?
- Well-defined and tested scalability dimensions
- Increase raft throughput (async and batch proposal handling)
- Increasing bbolt supported storage size
- Pluggable storage layer
- Hybrid clusters with write and read optimized members


[KCP]: https://cloud.redhat.com/blog/an-introduction-to-kcp
[Kueue]: https://github.com/kubernetes-sigs/kueue
