When deploying Apache Kafka on Kubernetes, Strimzi makes it easy to configure Kafka listeners with TLS encryption. With the default settings, however, these listeners only use certificates signed by the internal self-signed certification authority (CA) generated by Strimzi. When clients try to connect to a listener secured with such a certificate, they will not trust it by default. You need to get the public key of the CA that signed the server certificates and configure the client to trust it. This adds complexity, since you need to distribute the public key to all clients and configure them to use it. You will also need to update the public key when it changes, such as when the current one expires.

This would be much easier if the listeners could use TLS certificates signed by a certification authority which the clients already trust. You would not need to distribute any certificates; you would just tell the client to use TLS. In this blog post, I will show you how to do it with the help of Let’s Encrypt, cert-manager, and ExternalDNS.

Let’s Encrypt

Let’s Encrypt is a non-profit certification authority which provides an automated way of issuing signed certificates that are trusted by most operating systems and software platforms. Let’s Encrypt currently provides more than 200 million certificates. To get a signed certificate from Let’s Encrypt, all you need is to own a domain and be able to prove that you own it. Let’s Encrypt uses the Automatic Certificate Management Environment (ACME) protocol, an IETF standard (RFC 8555), to verify domain ownership and to request and issue signed certificates.

Using the ACME protocol

Don’t be scared by the specification! We will use a tool called cert-manager to communicate with Let’s Encrypt. All we will need to do is to configure the cert-manager and request certificates by creating custom resources in our Kubernetes cluster.

When proving domain ownership, the ACME protocol supports two different mechanisms:

  • HTTP Challenge
  • DNS Challenge

The HTTP Challenge uses HTTP to verify that you own the domain. The ACME server gives you some specific information and then tries to retrieve it from your domain over HTTP. For example, when requesting a certificate for the domain example.com, it will ask you to place a file with that information at the address http://example.com/.well-known/acme-challenge/.... If you are able to do it, you have proven that you control the domain. This works well for web applications which use HTTP, but it doesn’t work so well for applications like Apache Kafka, which use their own protocol on top of TCP.

Apache Kafka does not use HTTP so it cannot easily provide information on a given HTTP URL. That is why with Apache Kafka, we will need to use the DNS Challenge. The DNS Challenge works similarly, but uses the DNS protocol instead of HTTP. It will ask us to create a DNS record with some specific information to prove our domain ownership. Using the DNS Challenge makes it a bit more complicated, because we will need to give cert-manager access to the DNS management of our domain. But once we configure it, it works well.
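To make the DNS Challenge a little less abstract, here is a minimal sketch of how the expected DNS record is derived according to RFC 8555. cert-manager does all of this for you, so the code is purely illustrative. The function name dns01_record and the token and thumbprint values in the usage note are made up for this example.

```python
import base64
import hashlib


def dns01_record(domain, token, account_thumbprint):
    """Return the (name, value) of the TXT record expected by an ACME DNS-01 challenge."""
    # The key authorization joins the token issued by the ACME server
    # with the thumbprint of the ACME account key.
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # The TXT value is the base64url-encoded SHA-256 digest without padding.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    # A wildcard name such as *.example.com is validated on the base domain.
    if domain.startswith("*."):
        domain = domain[2:]
    return "_acme-challenge." + domain, value
```

For example, dns01_record("example.com", "hypothetical-token", "hypothetical-thumbprint") returns the record name _acme-challenge.example.com together with a 43-character value. This TXT record is what cert-manager creates in Route 53 on our behalf, which is why it needs access to the DNS management of the domain.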

Assigning DNS names to Kafka brokers

The certificates issued by Let’s Encrypt are always bound to one or more domain names. So to make the certificates work, we need to make sure that the clients connecting to the Kafka cluster are able to use the same domain names that will be in the certificates. ExternalDNS is a project that allows you to use annotations to assign DNS names to Kubernetes resources, such as services, ingresses, or nodes. In the examples in this post, we access Kafka using Kubernetes NGINX Ingress Controller, and we use ExternalDNS to assign DNS names to the Ingress resources created by Strimzi and Kubernetes.

DNS management

Both cert-manager and ExternalDNS can work with many different DNS providers. I personally use Amazon Route 53 for most of my domains, and I use it in this example too. If you use another provider, you will need to configure cert-manager and ExternalDNS a bit differently. A full list of DNS providers supported by cert-manager is provided in its documentation. For ExternalDNS, you can find the list of supported providers on its GitHub page.

Configuring AWS access for cert-manager and ExternalDNS

To allow cert-manager and ExternalDNS to communicate with the Route 53 APIs, we need to grant them the necessary permissions. To do that, we create two new policies in AWS IAM:

  • One for the cert-manager
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "route53:GetChange",
                "Resource": "arn:aws:route53:::change/*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "route53:ChangeResourceRecordSets",
                    "route53:ListResourceRecordSets"
                ],
                "Resource": "arn:aws:route53:::hostedzone/ZXXXXXXXXXXXXXXXXXXXX"
            }
        ]
    }
    
  • And one for ExternalDNS
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "route53:ChangeResourceRecordSets"
                ],
                "Resource": [
                    "arn:aws:route53:::hostedzone/ZXXXXXXXXXXXXXXXXXXXX"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "route53:ListHostedZones",
                    "route53:ListResourceRecordSets"
                ],
                "Resource": [
                    "*"
                ]
            }
        ]
    }
    

You will need to replace the hosted zone ID ZXXXXXXXXXXXXXXXXXXXX with the actual ID of the hosted zone of your domain. It can be found in the Route 53 console. Once you have created the policies, you will need to assign them to the cert-manager and ExternalDNS deployments. In my case, I simply created two new AWS users, attached the policies to them, and stored the access key IDs and secret access keys in Kubernetes Secrets.

Installing cert-manager and ExternalDNS

To install ExternalDNS, you can just follow the installation process from its documentation. In addition, I configured the AWS credentials to be passed from the Secret into the ExternalDNS deployment as environment variables:

env:
  - name: AWS_REGION
    value: us-east-1
  - name: AWS_ACCESS_KEY_ID
    valueFrom:
      secretKeyRef:
        name: route53-credentials
        key: access-key
  - name: AWS_SECRET_ACCESS_KEY
    valueFrom:
      secretKeyRef:
        name: route53-credentials
        key: secret-access-key
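
The Secret referenced above has to exist before ExternalDNS starts. As a rough sketch, it could be generated like this. The function name credentials_secret, the default namespace, and the placeholder credentials are assumptions made for this example. The Secret name and keys match the snippet above.

```python
import base64
import json


def credentials_secret(name, access_key_id, secret_access_key, namespace="default"):
    """Build a Kubernetes Secret manifest holding the AWS credentials."""

    def b64(value):
        # Values in the data section of a Secret must be base64-encoded.
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {
            "access-key": b64(access_key_id),
            "secret-access-key": b64(secret_access_key),
        },
    }


# Placeholder credentials only; replace them with the keys of your AWS user.
manifest = credentials_secret("route53-credentials", "AKIAIOSFODNN7EXAMPLE", "REPLACE_ME")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl apply -f, since kubectl accepts JSON manifests as well as YAML. A second Secret for cert-manager can be generated the same way with a different name.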

For cert-manager, we will use the credentials later when creating the certificate. So we just follow the cert-manager documentation and install it in our cluster.

You can use other methods to pass the AWS credentials to cert-manager and ExternalDNS, for example kiam or kube2iam. If you use Amazon EKS, you can also use its built-in support for attaching roles to Kubernetes ServiceAccounts.

The configuration procedure will be similar for other DNS providers as well. Just follow the documentation provided by cert-manager, ExternalDNS and your DNS provider.

Creating the signed certificate

With cert-manager and ExternalDNS installed, we can move to the next step. We will connect cert-manager to Let’s Encrypt and get a signed certificate.

First, we need to create either ClusterIssuer or Issuer resources. ClusterIssuer is cluster-scoped and it can be used from applications running in all namespaces in your cluster. Issuer is namespace-scoped and can be used only from the namespace where you create it. In my case, I use ClusterIssuer:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: example-com-letsencrypt-prod
spec:
  acme:
    email: my-email@example.com
    preferredChain: ""
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: example-com-letsencrypt-prod-account-key
    solvers:
    - dns01:
        route53:
          accessKeyID: AKIAIOSFODNN7EXAMPLE
          hostedZoneID: ZXXXXXXXXXXXXXXXXXXXX
          region: us-east-1
          secretAccessKeySecretRef:
            key: secret-access-key
            name: example-com-route53-credentials
      selector:
        dnsZones:
        - example.com

The ClusterIssuer resource specifies how to obtain our certificate from Let’s Encrypt:

  • acme tells cert-manager that we want to use the ACME protocol.
  • server specifies the address of the ACME server. In the example here, it uses the Let’s Encrypt production server. But you can use the staging server https://acme-staging-v02.api.letsencrypt.org/directory for testing.
  • privateKeySecretRef specifies the Kubernetes Secret where the key for communication with Let’s Encrypt is stored.
  • solvers specifies how the ACME protocol challenges are solved. As explained earlier, we have to use the DNS Challenge type for Apache Kafka. So we need to specify the AWS credentials, the hosted zone ID, and the domain for which this ClusterIssuer will be used. In my case, the domain is example.com.

The Issuer resource would be the same, just with a different kind.

After you create the ClusterIssuer, cert-manager will register with the Let’s Encrypt ACME server. When you query the ClusterIssuers, you should see that it is Ready:

$ kubectl get clusterissuers -o wide
NAME                              READY   STATUS                                                 AGE
example-com-letsencrypt-prod      True    The ACME account was registered with the ACME server   9d

With the ClusterIssuer ready, we can issue the certificate. For that, we will need to create a Certificate resource. Create the Certificate resource in the same namespace as the Kafka cluster. The resource looks like this:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-cluster-lets-encrypt
spec:
  secretName: my-cluster-lets-encrypt
  issuerRef:
    name: example-com-letsencrypt-prod
    kind: ClusterIssuer
    group: cert-manager.io
  subject:
    organizations:
      - my-org
  dnsNames:
    - bootstrap.example.com
    - broker-0.example.com
    - broker-1.example.com
    - broker-2.example.com

The resource provides configuration for the certificate:

  • secretName specifies the Kubernetes Secret created by cert-manager that contains the signed certificate.
  • issuerRef defines the issuer to use. It links it to the ClusterIssuer we just created. In case you decided to use Issuer instead of ClusterIssuer, you can just change the kind.
  • subject configures the fields of the X509 subject.
  • dnsNames configures the Subject Alternative Names used in the certificate. This is important, because it must contain all the DNS names used by the Kafka cluster. Alternatively, you can also use a wildcard certificate:
      dnsNames:
        - "*.example.com"
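
Before requesting the certificate, it can be worth checking that every address the clients will use is covered by the dnsNames list. The following is a simplified sketch of that check. It only handles a wildcard as the complete left-most label, and real TLS libraries apply additional rules. The function names covers and uncovered are made up for this example.

```python
def covers(dns_name, host):
    """Return True if a dnsNames entry matches the given host name."""
    pattern = dns_name.lower().rstrip(".").split(".")
    labels = host.lower().rstrip(".").split(".")
    if len(pattern) != len(labels):
        # A wildcard stands for exactly one label, never for zero or several.
        return False
    if pattern[0] == "*":
        return pattern[1:] == labels[1:]
    return pattern == labels


def uncovered(dns_names, hosts):
    """Return the hosts that no dnsNames entry covers."""
    return [h for h in hosts if not any(covers(n, h) for n in dns_names)]
```

Note that a wildcard such as *.example.com covers bootstrap.example.com and broker-0.example.com, but not example.com itself and not names with more labels such as broker-0.kafka.example.com.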
    

After you have created the Certificate resource, cert-manager generates the private key and has it signed by Let’s Encrypt. It may take a few minutes. Once the signed certificate is ready, you should see something like this:

$ kubectl get certificate -o wide
NAME                      READY   SECRET                    ISSUER                         STATUS                                          AGE
my-cluster-lets-encrypt   True    my-cluster-lets-encrypt   example-com-letsencrypt-prod   Certificate is up to date and has not expired   3h18m

You should also have the secret with the public and private keys:

$ kubectl get secret my-cluster-lets-encrypt -o yaml
apiVersion: v1
kind: Secret
metadata:
  annotations:
    cert-manager.io/alt-names: bootstrap.example.com,broker-0.example.com,broker-1.example.com,broker-2.example.com
    cert-manager.io/certificate-name: my-cluster-lets-encrypt
    cert-manager.io/common-name: bootstrap.example.com
    cert-manager.io/ip-sans: ""
    cert-manager.io/issuer-group: cert-manager.io
    cert-manager.io/issuer-kind: ClusterIssuer
    cert-manager.io/issuer-name: example-com-letsencrypt-prod
    cert-manager.io/uri-sans: ""
  name: my-cluster-lets-encrypt
  namespace: myproject
type: kubernetes.io/tls
data:
  tls.crt: ...
  tls.key: ...

Deploying the Apache Kafka cluster

With the certificate signed, we just need to deploy our Apache Kafka cluster and configure it to use our public and private keys. First, of course, you need to install Strimzi. Then create a Kafka custom resource. In my case, I used the signed certificate for an ingress type listener, but it would work similarly with other types of external listeners. The bootstrap address is bootstrap.example.com, and the brokers use the addresses broker-X.example.com, where X is the ID of the broker.

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
  labels:
    app: my-cluster
spec:
  kafka:
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
      - name: external
        port: 9094
        type: ingress
        tls: true
        configuration:
          bootstrap:
            annotations:
              external-dns.alpha.kubernetes.io/hostname: bootstrap.example.com.
              external-dns.alpha.kubernetes.io/ttl: "60"
            host: bootstrap.example.com
          brokers:
            - broker: 0
              annotations:
                external-dns.alpha.kubernetes.io/hostname: broker-0.example.com.
                external-dns.alpha.kubernetes.io/ttl: "60"
              host: broker-0.example.com
            - broker: 1
              annotations:
                external-dns.alpha.kubernetes.io/hostname: broker-1.example.com.
                external-dns.alpha.kubernetes.io/ttl: "60"
              host: broker-1.example.com
            - broker: 2
              annotations:
                external-dns.alpha.kubernetes.io/hostname: broker-2.example.com.
                external-dns.alpha.kubernetes.io/ttl: "60"
              host: broker-2.example.com
          brokerCertChainAndKey:
            secretName: my-cluster-lets-encrypt
            certificate: tls.crt
            key: tls.key
    config:
      auto.create.topics.enable: "false"
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 100Gi
          deleteClaim: true
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: true
  entityOperator:
    topicOperator: {}
    userOperator: {}

The addresses are specified in two places:

  • In the annotations section, I use the external-dns.alpha.kubernetes.io/hostname annotation to tell ExternalDNS that this Ingress resource should be assigned the given DNS name. ExternalDNS sets up the Route 53 DNS records automatically. You don’t have to use Ingress; you can use a loadbalancer or nodeport type listener instead.
  • The host field configures the Ingress resource to expect the connection to use this hostname when connecting to the broker.
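
Because the same host names appear in the Certificate and in the listener configuration, it can help to generate both from one place, especially for larger clusters. This is a minimal sketch under the naming scheme used in this post (bootstrap.<domain> and broker-X.<domain>). The function names are made up for this example.

```python
def certificate_dns_names(replicas, domain):
    """List the dnsNames needed by the bootstrap address and all brokers."""
    return [f"bootstrap.{domain}"] + [f"broker-{i}.{domain}" for i in range(replicas)]


def broker_overrides(replicas, domain, ttl=60):
    """Build the per-broker entries of the listener configuration."""
    entries = []
    for broker_id in range(replicas):
        host = f"broker-{broker_id}.{domain}"
        entries.append({
            "broker": broker_id,
            "annotations": {
                # Same ExternalDNS annotations as in the Kafka resource above,
                # including the trailing dot on the host name.
                "external-dns.alpha.kubernetes.io/hostname": host + ".",
                "external-dns.alpha.kubernetes.io/ttl": str(ttl),
            },
            "host": host,
        })
    return entries
```

For three replicas and the domain example.com, certificate_dns_names returns the same four names used in the Certificate resource, and broker_overrides returns the three entries of the brokers list shown in the Kafka resource.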

In addition to that, the brokerCertChainAndKey section specifies the listener certificate used for this listener. We just point it to the Kubernetes Secret created by cert-manager that contains our signed certificate.

The Kafka custom resource I’ve shown above is just an example. It doesn’t specify authentication, metrics, and other configuration. But it should help you to get started.

Testing the certificate with OpenSSL

Once the Kafka cluster is deployed, we can use OpenSSL to check that it is using the right certificate. We use the s_client command and check that the listener presents the certificate signed by Let’s Encrypt:

$ openssl s_client -connect bootstrap.example.com:443 -servername bootstrap.example.com
CONNECTED(00000005)
depth=2 O = Digital Signature Trust Co., CN = DST Root CA X3
verify return:1
depth=1 C = US, O = Let's Encrypt, CN = R3
verify return:1
depth=0 CN = bootstrap.example.com
verify return:1
---
Certificate chain
 0 s:/CN=bootstrap.example.com
   i:/C=US/O=Let's Encrypt/CN=R3
 1 s:/C=US/O=Let's Encrypt/CN=R3
   i:/O=Digital Signature Trust Co./CN=DST Root CA X3
---
Server certificate
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
subject=/CN=bootstrap.example.com
issuer=/C=US/O=Let's Encrypt/CN=R3
---
No client certificate CA names sent
Server Temp Key: ECDH, X25519, 253 bits
---
SSL handshake has read 2970 bytes and written 317 bytes
---
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-GCM-SHA384
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : ECDHE-RSA-AES256-GCM-SHA384
    Session-ID: F2CE7100EF5F4E077C69EA3CAE6307F6F90848933FAAEF2F34F1F79C6A35590A
    Session-ID-ctx:
    Master-Key: 46A6C5A04A034E9350D73AC037E52D4CACD49CF6F33103B929372CB7F44816939A5473C9E67DD6AFE03D221ADA6412BB
    Start Time: 1619999991
    Timeout   : 7200 (sec)
    Verify return code: 0 (ok)
---
...

As you can see from the following part of the output:

Certificate chain
 0 s:/CN=bootstrap.example.com
   i:/C=US/O=Let's Encrypt/CN=R3
 1 s:/C=US/O=Let's Encrypt/CN=R3
   i:/O=Digital Signature Trust Co./CN=DST Root CA X3

the server certificate used by bootstrap.example.com is signed by Let’s Encrypt.
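
The same check can be scripted. The sketch below uses only the Python standard library and relies on the default CA bundle of the system, so the connection succeeds only if the certificate chain is already trusted and the host name matches. The function names peer_certificate and name_field are made up for this example.

```python
import socket
import ssl


def peer_certificate(host, port=443, timeout=10.0):
    """Connect using the default CA bundle and return the server certificate."""
    # The default context verifies both the certificate chain and the host name.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def name_field(cert, section, field):
    """Look up one field, such as organizationName, in the subject or issuer."""
    for rdn in cert.get(section, ()):
        for key, value in rdn:
            if key == field:
                return value
    return None
```

Calling peer_certificate("bootstrap.example.com") raises ssl.SSLCertVerificationError if the certificate is not trusted. Otherwise, name_field(cert, "issuer", "organizationName") on the returned certificate should give Let's Encrypt for the setup described in this post.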

Sending or receiving messages

We can now also test that we can produce and consume messages from the Kafka cluster without configuring a certificate in our clients. When using the console consumer or producer from Kafka, we need to specify the --bootstrap-server and --topic options. In addition, we enable TLS encryption with --consumer-property security.protocol=SSL (for the console producer, the equivalent option is --producer-property). We do not need to specify any truststore; the client uses the default CA certificates bundled with Java.

$ bin/kafka-console-consumer.sh --bootstrap-server bootstrap.example.com:443 --topic my-topic --consumer-property security.protocol=SSL
{ "Hello World": "2021/05/03 00:08:44" }
{ "Hello World": "2021/05/03 00:08:45" }
{ "Hello World": "2021/05/03 00:08:46" }
...

We can also do the same, for example, with kafkacat. We configure the bootstrap server as bootstrap.example.com:443. And we enable the TLS encryption with -X security.protocol=ssl. But we do not specify a certificate. We let it use the CA certificates bundled with the operating system.

$ kafkacat -C -b bootstrap.example.com:443 -X security.protocol=ssl -o end -t my-topic
{ "Hello World": "2021/05/03 00:02:29" }
{ "Hello World": "2021/05/03 00:02:30" }
{ "Hello World": "2021/05/03 00:02:31" }
...

Certificate changes or renewals

If you need to change the signed certificate, for example to add more DNS names or change some other settings, don’t worry. You can just update the Certificate resource and cert-manager will get a new, updated certificate for you. Strimzi will automatically detect the change and do a rolling update of the Kafka brokers to load the new certificate. Certificate renewals are handled automatically by cert-manager and Strimzi in the same way.
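
To get a feeling for when a renewal, and therefore a rolling update, will happen, the renewal time can be estimated from the certificate validity. This sketch assumes that cert-manager renews once one third of the certificate lifetime remains unless renewBefore is set on the Certificate resource, and the usage note assumes a 90-day Let’s Encrypt certificate. Both are assumptions about defaults that you should verify for the versions you run. The function name renewal_time is made up for this example.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_before, not_after, renew_before=None):
    """Estimate when a certificate is due for renewal."""
    lifetime = not_after - not_before
    if renew_before is None:
        # Assumed default: renew when one third of the lifetime remains.
        renew_before = lifetime / 3
    return not_after - renew_before
```

Under these assumptions, a certificate issued on 2021-05-01 with a 90-day validity would be renewed around 60 days later, roughly 30 days before it expires.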

Conclusion

This blog post shows that using signed certificates with Strimzi and Apache Kafka is really easy. The most difficult part is configuring cert-manager and ExternalDNS to work with your DNS provider. But trust me, it is worth the effort. You will no longer need to worry about copying the self-signed certificates to your applications. And in many cases, cert-manager and ExternalDNS will also come in handy for other applications, not just for Apache Kafka.