In today's post, we'll be discussing multi-datacenter Vault clusters that span multiple regions.
Most enterprises follow different replication strategies to provide scalable and highly available services. One common replication/disaster recovery strategy for distributed applications is to have a hot standby replica of the very same deployment already set up in a secondary data center. When a catastrophic event occurs in the primary data center, all traffic is redirected to the secondary data center.
Hot-standby is a method of redundancy in which the primary and secondary (that is, the backup) instances are run simultaneously. All data is mirrored to the secondary instance (using transactional updates) in real-time, so both instances contain identical data. Usually, an edge server (load-balancer) is deployed in front of the two data centers, so clients don't know which instance is serving their request.
Vault recently introduced the Raft storage backend in version 1.2, which makes it possible to create highly available (multi-node) Vault clusters without external storage backends. This simplifies the setup of an HA/replicated Vault cluster and removes the burden of maintaining a separate storage backend. In the cloud, for instance, you no longer depend on provider services like GCS or S3.
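To ground this, here is a minimal sketch of what the Raft storage stanza looks like in a standalone Vault server configuration; the paths, node ID, and addresses are illustrative:

cat > vault-config.hcl <<EOF
# Integrated Raft storage (Vault 1.2+): data lives on local disk,
# so no external storage backend is needed.
storage "raft" {
  path    = "/vault/data"  # directory for the Raft log and snapshots
  node_id = "vault-0"      # unique ID of this node in the Raft cluster
}

listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = true       # fine for local experiments; use TLS in production
}

# HA coordination and Raft replication happen on the cluster port (8201).
api_addr     = "http://vault-0:8200"
cluster_addr = "https://vault-0:8201"
EOF

When you use the Bank-Vaults operator, you don't write this file by hand; the operator renders an equivalent configuration from the Vault custom resource, as shown in the next section.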
We always planned to create a multi-datacenter setup for Vault, and this request also comes up frequently in the Bank-Vaults community. With Raft storage available, we decided to give it a go, and added Vault replication across multiple datacenters to our Bank-Vaults Kubernetes operator.
Multiple datacenters and the Vault operator for Kubernetes
Bank-Vaults has automated the creation of Raft-based Vault clusters since the 0.6.0 release. The work done in PR 820 added support for running Vault across multiple Kubernetes clusters using the Banzai Cloud Vault operator, Bank-Vaults, and is available as of the 0.8.0 release.
This PR introduced CRD changes that let a Vault instance automatically join the Raft cluster of another Kubernetes cluster, and that store the unseal keys in AWS S3, encrypted with KMS. The operator then distributes the unseal keys to multiple buckets that span different AWS regions.
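As a condensed sketch of what such a Vault custom resource can look like (the field names follow the Bank-Vaults CRD, but the bucket name, key ID, and values below are illustrative assumptions; the authoritative manifests live under operator/deploy/multi-dc in the Bank-Vaults repository):

cat > vault-primary.yaml <<EOF
apiVersion: vault.banzaicloud.com/v1alpha1
kind: Vault
metadata:
  name: vault-primary
spec:
  size: 1
  # "self" marks this instance as the Raft leader; on the secondary and
  # tertiary clusters this field points at the primary's external address.
  raftLeaderAddress: self
  # Unseal keys and the root token are encrypted with KMS and stored in S3.
  unsealConfig:
    aws:
      kmsKeyId: "alias/vault-unseal"      # illustrative KMS key
      kmsRegion: "eu-central-1"
      s3Bucket: "bank-vaults-unseal-keys" # illustrative bucket name
      s3Region: "eu-central-1"
  config:
    storage:
      raft:
        path: /vault/raft
    listener:
      tcp:
        address: "0.0.0.0:8200"
EOF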
There is support for this feature on Google Cloud as well. When using Google, however, the operator does not have to distribute the unseal keys itself, since GCS buckets and KMS key rings can be multi-regional or global.
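On Google Cloud, only the unseal section of the custom resource differs; a sketch under the same caveats (the project, key ring, and bucket names are placeholders):

cat > vault-primary-gcp.yaml <<EOF
apiVersion: vault.banzaicloud.com/v1alpha1
kind: Vault
metadata:
  name: vault-primary
spec:
  size: 1
  unsealConfig:
    google:
      kmsProject: "my-gcp-project"       # placeholder project
      kmsLocation: "global"              # a global key ring spans regions
      kmsKeyRing: "vault"
      kmsCryptoKey: "vault-unsealer"
      storageBucket: "vault-unseal-keys" # a multi-region GCS bucket
EOF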
If you are interested in running a multi-region Vault cluster on Azure or Alibaba, let us know.
What you will need
- The examples in this post install Vault into AWS, so you will need an AWS account and a corresponding access key. You will also need aws-cli installed and configured on your computer.
- In our examples, we use the banzai-cli command line tool to create and manage Kubernetes clusters on AWS. This tool is a command-line interface for our Banzai Cloud Pipeline platform, which is available as a free service after registration. Note that using Pipeline and banzai-cli is not strictly necessary; you can create the required clusters yourself and replace the banzai-cli commands with native aws-cli or kubectl commands. Bank-Vaults and the features described in this post do not require Banzai Cloud Pipeline, but we used it for the sake of convenience.
Banzai Cloud Pipeline is a solution-oriented application platform that allows enterprises to develop, deploy, and securely scale container-based applications in multi- and hybrid-cloud environments. Pipeline can create Kubernetes clusters on multiple cloud providers (AWS, Azure, Alibaba, Google) and on-prem (VMware, bare metal).
Showtime - running Vault across different datacenters/regions
The following example demonstrates implementation on AWS.
First, you have to create three clusters, because a Raft cluster needs at least three nodes to keep quorum when one of them fails. These clusters will span three European AWS regions (eu-central-1, eu-west-1, eu-north-1).
- Let's make this easier by creating these Kubernetes clusters through the Banzai CLI. Open a terminal and issue the following command:
for region in "eu-central-1" "eu-west-1" "eu-north-1"; do
banzai cluster create <<EOF
{
"name": "bv-${region}",
"location": "${region}",
"cloud": "amazon",
"secretName": "aws",
"properties": {
"eks": {
"version": "1.14.7",
"nodePools": {
"pool1": {
"spotPrice": "0.101",
"count": 1,
"minCount": 1,
"maxCount": 3,
"autoscaling": true,
"instanceType": "c5.large"
}
}
}
}
}
EOF
done
- Once all the clusters are up and running, you can check the status of the clusters with the banzai cluster list | grep bv- command.
$ banzai cluster list | grep bv-
2862 bv-eu-central-1 eks bonifaido 2020-01-09T08:29:45Z RUNNING
2863 bv-eu-north-1 eks bonifaido 2020-01-09T08:29:47Z RUNNING
2856 bv-eu-west-1 eks bonifaido 2020-01-08T15:05:12Z RUNNING
- After all clusters are ready, you need to get their KUBECONFIGs. You can do that with the following command:
for region in "eu-central-1" "eu-west-1" "eu-north-1"; do
banzai cluster sh --cluster-name "bv-${region}" 'cat $KUBECONFIG' > bv-${region}.yaml
done
- The Bank-Vaults repository contains a helper script and a set of Vault CustomResources for the installation. You also need to create two S3 buckets and the corresponding KMS keys, and update their names in the resource manifests.
operator/deploy/multi-dc/multi-dc-raft.sh install \
bv-eu-central-1.yaml \
bv-eu-west-1.yaml \
bv-eu-north-1.yaml
NOTE: For the sake of simplicity, the script we're using assumes that the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables are set in our shell. The proper way to do this on an EKS cluster would be to create an IAM Role with access to S3 and KMS for your Vault instances, and to bind that IAM Role to the vault Kubernetes ServiceAccount those instances use. For details on how to do that, see the official Amazon EKS documentation.
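A rough sketch of that IAM-role approach, using IAM Roles for Service Accounts (IRSA); the role ARN below is a placeholder, and the role itself must already exist with S3/KMS permissions and a trust policy for the cluster's OIDC provider:

# Bind a pre-created IAM role to the vault ServiceAccount. The EKS pod
# identity webhook then injects temporary credentials into the Vault pods,
# so no static access keys are needed.
kubectl annotate serviceaccount vault \
  eks.amazonaws.com/role-arn=arn:aws:iam::123456789012:role/vault-unseal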
The script installs the Bank-Vaults operator into all three clusters and deploys a federated Vault cluster (using Raft) that is connected in an HA manner across the three Kubernetes clusters. The unseal keys are encrypted with KMS and stored in S3. The script takes a few minutes to complete.
The following steps are performed during provisioning (a short verification sketch follows the list):
- The Bank-Vaults Vault operator is installed on the first cluster.
- The operator deploys the primary Vault instance on this cluster, which becomes the Raft leader.
- The operator initializes the instance.
- The unseal keys are distributed to the two S3 buckets in the eu-central-1 and eu-west-3 regions.
- The Vault cluster port is exposed to the other instances with an ELB (this applies to all three clusters).
- The Bank-Vaults operator is installed on the second cluster.
- The secondary Vault instance is deployed on the second cluster.
- The secondary Vault instance automatically joins the primary instance.
- The Bank-Vaults operator is installed on the tertiary (third) cluster.
- The tertiary Vault instance is deployed on the third cluster.
- The tertiary Vault instance automatically joins the primary instance.
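Once provisioning finishes, you can double-check from inside a pod that all three instances have joined the same Raft cluster. A minimal sketch using the Vault CLI (this assumes a sufficiently privileged VAULT_TOKEN is available in the container and that your Vault version ships the list-peers subcommand):

export KUBECONFIG=bv-eu-central-1.yaml
# All three nodes should be listed, one of them as the leader.
kubectl exec -it vault-primary-0 -c vault -- \
  vault operator raft list-peers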
Test a Vault failover and disaster recovery
- After the setup has successfully finished, check the status of the Vault cluster and find the current leader:
operator/deploy/multi-dc/multi-dc-raft.sh status \
bv-eu-central-1.yaml \
bv-eu-west-1.yaml \
bv-eu-north-1.yaml
- You can emulate a failover by scaling the leader instance down to 0 replicas (in this example, bv-eu-central-1). Use the following command to do that:
# The operator will scale the instance back to 1 in a few seconds,
# but this is enough to simulate a service outage.
export KUBECONFIG=bv-eu-central-1.yaml
kubectl scale statefulset vault-primary --replicas 0
statefulset.apps/vault-primary scaled
- Let's check the logs of the secondary instance:
export KUBECONFIG=bv-eu-west-1.yaml
kubectl logs --tail=6 vault-secondary-0 vault
2020-01-09T12:29:32.954Z [WARN] storage.raft: heartbeat timeout reached, starting election: last-leader=
2020-01-09T12:29:32.954Z [INFO] storage.raft: entering candidate state: node="Node at [::]:8201 [Candidate]" term=5
2020-01-09T12:29:32.958Z [INFO] storage.raft: duplicate requestVote for same term: term=5
2020-01-09T12:29:32.958Z [WARN] storage.raft: duplicate requestVote from: candidate=[91, 58, 58, 93, 58, 56, 50, 48, 49]
2020-01-09T12:29:32.958Z [INFO] storage.raft: election won: tally=2
2020-01-09T12:29:32.958Z [INFO] storage.raft: entering leader state: leader="Node at [::]:8201 [Leader]"
As you can see, the Vault instances noticed the blip in the primary instance and elected a new leader.
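If you would rather not parse Raft logs, a leadership change can also be confirmed with plain vault status; a quick sketch, run against the secondary instance (HA Mode should report active on the node that won the election):

export KUBECONFIG=bv-eu-west-1.yaml
kubectl exec -it vault-secondary-0 -c vault -- vault status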
- Don't forget to remove the whole setup, including the clusters, after you finish experimenting:
operator/deploy/multi-dc/multi-dc-raft.sh remove \
bv-eu-central-1.yaml \
bv-eu-west-1.yaml \
bv-eu-north-1.yaml
for region in "eu-central-1" "eu-west-1" "eu-north-1"; do
banzai cluster delete --no-interactive "bv-${region}"
done
Conclusion
As you can see, it's quite easy to create a geo-distributed Vault cluster capable of surviving region or AZ outages with the Bank-Vaults Vault operator.
If you have questions about this Bank-Vaults feature, or you encounter problems during testing, you can reach us on our community Slack, in the #bank-vaults channel.
Upcoming features
As of now, you can only use one Vault instance per region, but we are already working to push past this limitation.
The current implementation works on AWS and Google Cloud out of the box. If you are interested in running a multi-region Vault cluster on Azure or Alibaba, let us know.
The Bank-Vaults secret injection webhook became available as an integrated service on the Pipeline platform a few months ago. This means that you can install Bank-Vaults on any Kubernetes cluster using Pipeline (with the UI or the CLI), and configure Vault with all the authentication backends and policies necessary for the webhook to work. Secret injection is now a default feature of Pipeline.
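To illustrate what the webhook does, here is a hedged sketch of a Deployment that consumes a Vault secret through an environment variable (the application name, Vault address, and secret path are placeholders, and further annotations such as the Vault role are omitted):

kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                  # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        # Tells the webhook which Vault to talk to (placeholder address).
        vault.security.banzaicloud.io/vault-addr: "https://vault:8200"
    spec:
      containers:
      - name: app
        image: alpine:3.11
        command: ["sh", "-c", "echo the password is set && sleep 3600"]
        env:
          # The vault: prefix is resolved by the webhook at container
          # start-up; the real value is injected into the process and is
          # never stored in Kubernetes objects.
        - name: MYSQL_PASSWORD
          value: "vault:secret/data/mysql#MYSQL_PASSWORD"
EOF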
Also, as you can see, the AWS load balancers serve as gateways between the clusters. They play a role very similar to that of Istio gateways: a load balancer operating at the edge of the mesh, receiving incoming or outgoing HTTP/TCP connections. In a forthcoming blog post we will describe how the whole multi-DC Vault setup can become more Kubernetes-native using Backyards (now Cisco Service Mesh Manager), the Banzai Cloud automated and operationalized service mesh built on Istio.
For more information, or if you're interested in contributing, check out the Bank-Vaults repo - our Vault extension project for Kubernetes - and give us a GitHub star if you think the project deserves it!
About Banzai Cloud Pipeline
Banzai Cloud's Pipeline provides a platform for enterprises to develop, deploy, and scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment for developers and operations teams alike. Strong security measures - multiple authentication backends, fine-grained authorization, dynamic secret management, automated secure communications between components using TLS, vulnerability scans, static code analysis, CI/CD, and so on - are default features of the Pipeline platform.
#multicloud #hybridcloud #BanzaiCloud