Securing Kafka can be difficult. Securing Kafka on Kubernetes can be even more difficult. But using Strimzi’s Kafka Operator makes it easy!

Below you will find a guide on how to use the Vault PKI secrets engine to generate an intermediate CA to use with the Strimzi Kafka Operator.


A brief introduction to Vault

If you’re here, you probably know what Vault is and why it’s useful; but for the uninitiated, here is a brief description from the creators themselves. The HashiCorp Vault project “secures, stores, and tightly controls access to tokens, passwords, certificates, API keys, and other secrets in modern computing […]”.

So how does Vault fit in with Strimzi and Kafka? Well, Vault has the concept of Secrets Engines:

Secrets engines are components which store, generate, or encrypt data. Secrets engines are incredibly flexible, so it is easiest to think about them in terms of their function. Secrets engines are provided some set of data, they take some action on that data, and they return a result. Some secrets engines simply store and read data - like encrypted Redis/Memcached. Other secrets engines connect to other services and generate dynamic credentials on demand. Other secrets engines provide encryption as a service, totp [time-based one-time password] generation, certificates, and much more.

It is this last point, certificates, which we are going to be looking at. Specifically, we will be using the Vault PKI Secrets Engine to create a CA, which we will then install instead of using CA certificates and private keys generated by the Strimzi Cluster Operator.


Securing a Kafka cluster on Kubernetes

The incubator Kafka charts

Like me, you may have tried configuring TLS for Kafka using the incubator Helm charts (you can find them here), and realised that in fact this is far from trivial. I had initially configured an initContainer to get certificates from the Vault PKI backend using the Vault CLI, which did indeed work. However, in addition to the startup scripts in the main broker container and another custom initContainer I had configured to set up rack awareness, this caused the pod startup times to become quite long!

At this point I had configured the brokers and clients with TLS, but what about the Zookeepers? Zookeeper does not support TLS, which is problematic if, for example, you are a financial organisation streaming sensitive data through your Kafka cluster. Even if you’re not a fintech company, but you’re a sensible individual with security concerns, this might be a deal breaker for you!

Strimzi & Security

Straight out of the box, Strimzi provides all of this for you:

Strimzi supports encrypted communication between the Kafka and Strimzi components using the TLS protocol. Communication between Kafka brokers (interbroker communication), between Zookeeper nodes (internodal communication), and between these and the Strimzi operators is always encrypted. Communication between Kafka clients and Kafka brokers is encrypted according to how the cluster is configured. For the Kafka and Strimzi components, TLS certificates are also used for authentication. The Cluster Operator automatically sets up TLS certificates to enable encryption and authentication within your cluster. It also sets up other TLS certificates if you want to enable encryption or TLS authentication between Kafka brokers and clients.

By default, Strimzi generates two CAs, one for the cluster and another for clients, storing the certificates and private keys as Kubernetes secrets.

  • Cluster CA - all internal communication within the cluster and encryption with clients
  • Client CA - all client communication for supporting TLS mutual authentication

Using these secrets, Strimzi generates certificates for all of the components in the cluster, which in turn ensures all communication is encrypted!

N.B. This post only deals with the Cluster CA!

Bringing your own CA

Now, to bring your own CA, you need to generate the certificate and key yourself and add them to Kubernetes as secrets manually, with the following labels:

  strimzi.io/kind="Kafka"
  strimzi.io/cluster="${clusterName}"

Why Vault?

Some reasons to use Vault:

  • Open source
  • Battle tested, documented & secure
  • Plethora of authentication methods
  • Extensible
  • Powerful auditing capabilities

Specifically, in this context of using Vault to create your own CA, it gives you more control over your TLS configuration. It provides a centralised place to manage your CAs, along with the rest of the secrets your environment requires. Having a single secrets store minimises the attack surface by reducing secrets sprawl: you only have to focus on securing a single service!

Vault integrates particularly well with Kubernetes, for example using the Kubernetes Auth Method.
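As a minimal sketch (the mount path and config values below are the defaults shown in the Vault docs; adjust to your environment), enabling and configuring the Kubernetes auth method looks something like this:

# Enable the Kubernetes auth method:
$ vault auth enable kubernetes

# Point it at the Kubernetes API, e.g. using a pod's service account credentials:
$ vault write auth/kubernetes/config \
       kubernetes_host="https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT" \
       token_reviewer_jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
       kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt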

See the Vault documentation for a full list of features.


Configuration of the Vault PKI secrets engine

Assuming you have a Vault cluster running, you can create your own CA by doing the following:

# Creating the root CA:
# First, enable the pki secrets engine at the pki path:
$ vault secrets enable pki

# Tune the pki secrets engine to issue certificates with a maximum time-to-live (TTL) 
#   of 87600 hours (10 years):
$ vault secrets tune -max-lease-ttl=87600h pki

# Generate the root CA, extracting the root CA's certificate to root.crt; the secret
#   key is not exported!
$ vault write -field=certificate pki/root/generate/internal common_name="example.com" \
       ttl=87600h > root.crt

# This generates a new self-signed CA certificate and private key. Vault will automatically
#   revoke the generated root at the end of its lease period (TTL); the CA certificate will
#     sign its own Certificate Revocation List (CRL).

# Configure the CA and CRL URLs:
$ vault write pki/config/urls \
       issuing_certificates="http://127.0.0.1:8200/v1/pki/ca" \
       crl_distribution_points="http://127.0.0.1:8200/v1/pki/crl"

# Creating the intermediate CA:
# First, enable the pki secrets engine at the pki_int path:
$ vault secrets enable -path=pki_int pki

# Tune the pki_int secrets engine to issue certificates with a maximum time-to-live (TTL)
#   of 43800 hours (5 years):
$ vault secrets tune -max-lease-ttl=43800h pki_int

# Execute the following command to generate an intermediate CA, saving the full JSON
#   response (which includes the private key and the CSR) as pki_intermediate:
$ vault write -format=json pki_int/intermediate/generate/exported \
        common_name="example.com Intermediate Authority" ttl="43800h" format="pem" > pki_intermediate

# Extract the private key & certificate signing request from the previous command's output:
$ jq -r '.data.private_key' < pki_intermediate > intermediate.key.pem
$ jq -r '.data.csr' < pki_intermediate > pki_intermediate.csr

# Sign the intermediate certificate with the root certificate and save the generated
#   certificate as intermediate.cert.pem:
$ vault write -format=json pki/root/sign-intermediate csr=@pki_intermediate.csr \
        format="pem" \
        | jq -r '.data.certificate' > intermediate.cert.pem

# Once the CSR is signed and the root CA returns a certificate, it can be imported back 
#   into Vault:
$ vault write pki_int/intermediate/set-signed certificate=@intermediate.cert.pem

You now have all the files required to install your own CA with Strimzi: the root CA certificate from the creation of the root CA, root.crt; the private key from the generation of the intermediate CA, intermediate.key.pem; and the intermediate CA certificate from the signing of the intermediate CSR by the root, intermediate.cert.pem. You will need the two certificates to create your CA chain, intermediate.chain.pem:

$ cat intermediate.cert.pem > intermediate.chain.pem
$ cat root.crt >> intermediate.chain.pem
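Before handing the chain to Strimzi, it’s worth a quick sanity check; one way to do this (assuming you have openssl available) is:

# Verify that the intermediate certificate was indeed signed by the root:
$ openssl verify -CAfile root.crt intermediate.cert.pem
intermediate.cert.pem: OK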

Separate intermediate CA per cluster

Now that you have an intermediate CA, you might be tempted to reuse the same certificate and key for multiple clusters. However, because Strimzi generates its component certificates outside of Vault, the only way to revoke those certificates is to revoke the intermediate CA itself. As such, you should have a separate intermediate CA per cluster, as illustrated below.
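For illustration, this is as simple as mounting the pki secrets engine at one path per cluster (the path names here are hypothetical):

# One intermediate CA mount per Kafka cluster, so each can be revoked independently:
$ vault secrets enable -path=pki_int_cluster_a pki
$ vault secrets enable -path=pki_int_cluster_b pki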

Generating certificates using Vault and the intermediate CA

If you want a client, for example, to be able to generate certificates using the intermediate CA, you’ll need to set up a role for clients to use, as sketched below. More details on next steps here.
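As a sketch, creating such a role and issuing a certificate against it might look like this (the role name, domain, and TTLs are illustrative):

# Create a role that permits certificates for example.com and its subdomains:
$ vault write pki_int/roles/example-dot-com \
       allowed_domains="example.com" \
       allow_subdomains=true \
       max_ttl="720h"

# A client can then request a certificate from the intermediate CA:
$ vault write pki_int/issue/example-dot-com \
       common_name="client.example.com" ttl="24h"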


Launching your Kafka cluster using the Vault CA

Adding the certificate and private key to your Kubernetes cluster

You need to create secrets in your Strimzi namespace manually, one from the private key (intermediate.key.pem) and one from the CA chain (intermediate.chain.pem) generated in the previous steps. Remember to include the labels, otherwise Strimzi will not pick them up.

# Private Key
kubectl create secret -n ${strimziNamespace} generic ${clusterName}-cluster-ca \
  --from-file=ca.key=intermediate.key.pem \
  && kubectl label secret -n ${strimziNamespace} ${clusterName}-cluster-ca \
  strimzi.io/kind="Kafka" \
  strimzi.io/cluster="${clusterName}"

# Certificate
kubectl create secret -n ${strimziNamespace} generic ${clusterName}-cluster-ca-cert \
  --from-file=ca.crt=intermediate.chain.pem \
  && kubectl label secret -n ${strimziNamespace} ${clusterName}-cluster-ca-cert \
  strimzi.io/kind="Kafka" \
  strimzi.io/cluster="${clusterName}"
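To double-check that the labels were applied (Strimzi selects these secrets by label), you can list the secrets by label selector:

# Both secrets should appear, with the strimzi.io labels attached:
kubectl get secrets -n ${strimziNamespace} -l strimzi.io/cluster=${clusterName} --show-labels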

Creating your Kafka cluster resource

You’re almost there! Just a few more steps:

  1. Deploy the Strimzi operator
  2. Apply your Kafka custom resource, remembering to set the following parameter:
...
spec:
  clusterCa:
    generateCertificateAuthority: false
...
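For context, a minimal Kafka resource with that parameter set might look like the sketch below; the apiVersion, listener, and storage settings are illustrative and should match your Strimzi version:

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: ${clusterName}
  namespace: ${strimziNamespace}
spec:
  clusterCa:
    generateCertificateAuthority: false
  kafka:
    replicas: 3
    listeners:
      tls: {}
    storage:
      type: ephemeral
  zookeeper:
    replicas: 3
    storage:
      type: ephemeral

Note that metadata.name must match the ${clusterName} you used when labelling the secrets, so the Cluster Operator can find your CA.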

A few things to note…

  • The common_name cannot contain a wildcard, so set it to something sensible. Otherwise you’ll see something like this in your tls-sidecar logs:
LOG5[1:139683950376704]: Service [zookeeper-2181] accepted connection from 127.0.0.1:33576
LOG5[1:139683950376704]: connect_blocking: connected 10.100.241.50:2181
LOG5[1:139683950376704]: Service [zookeeper-2181] connected remote server from 10.100.214.45:37700
LOG4[1:139683950376704]: CERT: Verification error: permitted subtree violation
LOG4[1:139683950376704]: Certificate check failed: depth=0, /O=io.strimzi/CN=zookeeper
LOG3[1:139683950376704]: SSL_connect: 14090086: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed
  • If you are getting an error like this, make sure you have correctly signed your intermediate CA with the root CA:
CERT: Verification error: unable to get issuer certificate
Certificate check failed: depth=1, /O=${O}/CN=${CN}
SSL_connect: 14090086: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed

When debugging something like this, remember: depth=2 is the root CA certificate, depth=1 is the intermediate CA certificate, and depth=0 is the primary (leaf) certificate.
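If you want to see those depths for yourself, one way (assuming openssl is available and your cluster uses the default TLS listener on port 9093) is to connect with openssl s_client, which prints a depth=N line for each certificate as it walks the chain:

# Substitute your cluster's bootstrap service; the output includes lines like "depth=2 ...":
$ openssl s_client -connect ${clusterName}-kafka-bootstrap:9093 -showcerts </dev/null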

  • When generating your intermediate CA, the type needs to be exported, otherwise the command will not return the private key! More information on that here.

  • If you would prefer to set up Vault PKI using the UI or the API, there is documentation for that here. Gotcha! Remember the importance of the type, which needs to be exported; the HashiCorp docs use internal. See the sketch after this list for the equivalent API call.

  • You might be tempted to just use the root CA… But definitely use an intermediate CA! Why? If an intermediate CA is compromised, you can revoke it and issue a new one from the root, whereas a compromised root means replacing the entire chain (see the resource below on the difference between root and intermediate certificates).
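Regarding the UI/API note above, a sketch of the equivalent API call for generating the intermediate (the token and address come from your environment) might look like:

# POST to the intermediate generate endpoint; note /exported rather than /internal:
$ curl --header "X-Vault-Token: ${VAULT_TOKEN}" \
       --request POST \
       --data '{"common_name": "example.com Intermediate Authority", "ttl": "43800h"}' \
       ${VAULT_ADDR}/v1/pki_int/intermediate/generate/exported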


Conclusion

So we’ve looked at how to use the Vault PKI secrets engine to generate an intermediate CA for use in your Kafka cluster resource. Hopefully you now have a Kafka custom resource applied, with the cluster running and using your Vault-generated CA certificate and private key!


Useful Bits

Deploying Vault

Seth Vargo’s vault-on-gke project gives you everything you need to create a Vault cluster on GKE, backed by GCS. This project was developed with the help of Google’s security team, so it is hopefully quite secure! It contains Kelsey Hightower’s vault-on-google-kubernetes-engine project captured as Terraform.

Resources

The difference between root certificates and intermediate certificates.