Archive October 2021

Add a self-signed certificat on TKG cluster

If you want to deploy pods on a Kubernetes cluster that does not know the certificate of the registry containing the images (this is generally the case for labs with self-signed certificates that are not known by an authority), you risk to not be able to deploy your images, let’s see an example by deploying a Kuard image from my private Harbor registry:

(Warning, WordPress replaces the two dashes with one: cry:)

# kubectl run kuard –image=harbor.cpod-velocity.az-fkd.cloud-garage.net/library/kuard-amd64:blue
# kubectl get pods
NAME  READY STATUS   RESTARTS  AGE
kuard    0/1         ImagePullBackOff  0                     7s

# kubectl describe pods kuard
….
Type Reason Age From Message
—- —— —- —- ——-
Normal Scheduled 33s default-scheduler Successfully assigned default/kuard to test-md-0-5d6756b7fd-b9kwl
Normal Pulling 19s (x2 over 32s) kubelet Pulling image “harbor.cpod-velocity.az-fkd.cloud-garage.net/library/kuard-amd64:blue”
Warning Failed 19s (x2 over 32s) kubelet Failed to pull image “harbor.cpod-velocity.az-fkd.cloud-garage.net/library/kuard-amd64:blue”: rpc error: code = Unknown desc = failed to pull and unpack image “harbor.cpod-velocity.az-fkd.cloud-garage.net/library/kuard-amd64:blue”: failed to resolve reference “harbor.cpod-velocity.az-fkd.cloud-garage.net/library/kuard-amd64:blue”: failed to do request: Head “https://harbor.cpod-velocity.az-fkd.cloud-garage.net/v2/library/kuard-amd64/manifests/blue”: x509: certificate signed by unknown authority
Warning Failed 19s (x2 over 32s) kubelet Error: ErrImagePull
Normal BackOff 4s (x2 over 31s) kubelet Back-off pulling image “harbor.cpod-velocity.az-fkd.cloud-garage.net/library/kuard-amd64:blue”
Warning Failed 4s (x2 over 31s) kubelet Error: ImagePullBackOff

In this example, I tried from my Tanzu Kubernetes Grid (TKG) cluster to deploy the Kuard image which is located on my private Harbor registry hosted on my lab. This self-signed certificate is not recognized by an authority. For this to work anyway, each worker node in my cluster would need to know this certificate. I can copy it to each of them but the principle of TKG is to have a cluster whose life cycle can evolve easily and automatically, add worker nodes, remove them, replace them during the update , …. This would require each time a node is added to include the certificate.

Jesse Hu wrote a tkg-ytt-overlay-additional-ca-certs · GitHub procedure that I tested in TKG 1.4. It works great, it makes sense for those who use Kubernetes every day but may seem complicated to others. I will try to clarify it. This consists of obtaining the certificate, encoded in base64, of executing the commands to have the certificate taken into account by all existing and future Kubernetes nodes.

Encrypt the certificate with base64 and copy it

# base64 -w0 ca.crt (le résultat a volontairement été modifié)
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURFekNDQWZ1Z0F3SUJBZ0lRREthWmp6NmF4MitZS3RxS0YrUUNKekFOQmdrcWhraUc5dzBCQVFzRkFEQVUKTVJJd0VBWURWUVFERXdsb1lYSmliM0l0WTJFd0hoY05NakV3T0RJek1UVXdNREEwV2hjTk1qSXdPREl6TVRVdwpNREEwV2pBVU1SSXdFQVlEVlFRREV3bG9ZWEppYjNJdFkyRXdnZ0VpTUEwR0NTcUdTSWIzRFFFQkFRVUFBNElCCkR3QXdnZ0VLQW9JQkFRREl0MEhDUEF0ZWxrSVZhOXZtbjJnQWR4VXNaL3lTd2psNi95SzV0eTZ5OW1PbU5peDgKOHAvVFk2SG9MVHlJNUhtNytUYTJ4RGpvUUpmNllWMURsc3Y3d2E2R2pTUXE0WWxQWG1hUUNvVkp5eno5OVRvUgpDbW80VjhRUnJLbE5WL3NFbStrVXhseGFNZTZOMlA3UjB1MHVCV0Q5NW9Ra3RqWC8rS01uT0ErUlZoWkJ1dEFmCjJXSzhIMmRYeWo4bFV3Vk4rWWJqeW83dkdnNERNZlVHMWtDa0hYYURTMGcvWlhyNU1LRTNDWEo4YUVPZFhMNjMKb3BHTXVMQWp4WUZIdng0SnBMN0lNQ2VZMGFCcjcybUUzcy9SMENCMW1zTU5nYWhXTkhNZjhINktsUy9qUVlnUgpQMjV6WmYyOUZ0VFdnOWhZNVJZQngxalcyeXZERGJsMmFGNXBBZ01CQUFHallUQmZNQTRHQTFVZER3RUIvd1FFCkF3SUNwREFkQmdOVkhTVUVGakFVQmdnckJnRUZCUWNESUt3WUJCUVVIQXdJd0R3WURWUjBUQVFIL0JBVXcKQXdFQi96QWRCZ05WSFE0RUZnUVVVbTBvVUtiSXBHbTcxL3JzUlBReUJJbkZYYmt3RFFZSktvWklodmNOQVFFTApCUUFEZ2dFQkFE cVJBamwzWTNqVk5JK1JUbHBYdmwvdUYxbXNZUTNzTnFwVXR6eVNCVlNDWlpDMkRSSnpwYWZGCjhPdXN5QjBMTTNlS3VTd0t4STgrT0o5OTlhZkdOazRWTnpySVhOQURaZ1BxbnRFSWRucXNReGg4eFBuOVY0T2QKQUtsTVJycVI4R3g4ejdRM2EvN01uR0sra1l3VmorZ3BBNkFGUEJxSVJrU3Jscmo5b2dXVzBqWTFzL2tNU21ydgpaVEFZWTJqcFhBaGZrdzcrVDN4OHYwa0NRai9NREo5L3dNTnhxeVNGMEhzNXd6THVvbVJOM0VEME03eUNhWjg0CmJuOFZTN1VUSjBaWnhBRmx3TWxySlRYWmFpQmNOeDRNdm4wNXN4RG5KZktCdFloSkZwbGRwR3hLMDRUSmNXWm0KU0dDemZhc2FIK2M1NklNT0IvRllMdlJlelh4cE5mMD0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo =

 

Find the context of your TKG management cluster and the name of the TKG workload cluster on which you will copy the certificate:

# kubectl get-contexts

CURRENT                                     NAME CLUSTER      AUTHINFO                  NAMESPACE
….
my-cluster-admin@my-cluster                 my-cluster        my-cluster-admin
* test-admin@test                           test              test-admin
tkg-mgmt-vsphere-admin@tkg-mgmt-vsphere     tkg-mgmt-vsphere  tkg-mgmt-vsphere-admin
tool-admin@tool 

Edit the configuration of the controller and worker template to add the content of the certificate and the command to take this certificate into account. So each time a node is added, the certificate will be taken into account. The content of the file is to be added in the files: part and the command in the preKubeadmCommands: part as below (I have a configuration based on Photon OS, if you have another OS you must use another command. Copied / pasted does not work very well it is better to type the words again):

To edit the control plane template : kubectl edit KubeadmControlPlane test-control-plane –context tkg-mgmt-vsphere-admin@tkg-mgmt-vsphere

……

  files:
  – content: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURFekNDQWZ1Z0F3SUJBZ0lRREthWmp6NmF4MitZS3RxS0YrUUNKekFOQmdrcWhraUc5dzBCQVFzRkFEQVUKTVJJd0VBWURWUVFERXdsb1lYSmliM0l0WTJFd0hoY05NakV3T0RJek1UVXdNREEwV2hjTk1qSXdPREl6TVRVdwpNREEwV2pBVU1SSXdFQVlEVlFRREV3bG9ZWEppYjNJdFkyRXdnZ0VpTUEwR0NTcUdTSWIzRFFFQkFRVUFBNElCCkR3QXdnZ0VLQW9JQkFRREl0MEhDUEF0ZWxrSVZhOXZtbjJnQWR4VXNaL3lTd2psNi95SzV0eTZ5OW1PbU5peDgKOHAvVFk2SG9MVHlJNUhtNytUYTJ4RGpvUUpmNllWMURsc3Y3d2E2R2pTUXE0WWxQWG1hUUNvVkp5eno5OVRvUgpDbW80VjhRUnJLbE5WL3NFbStrVXhseGFNZTZOMlA3UjB1MHVCV0Q5NW9Ra3RqWC8rS01uT0ErUlZoWkJ1dEFmCjJXSzhIMmRYeWo4bFV3Vk4rWWJqeW83dkdnNERNZlVHMWtDa0hYYURTMGcvWlhyNU1LRTNDWEo4YUVPZFhMNjMKb3BHTXVMQWp4WUZIdng0SnBMN0lNQ2VZMGFCcjcybUUzcy9SMENCMW1zTU5nYWhXTkhNZjhINktsUy9qUVlnUgpQMjV6WmYyOUZ0VFdnOWhZNVJZQngxalcyeXZERGJsMmFGNXBBZ01CQUFHallUQmZNQTRHQTFVZER3RUIvd1FFCkF3SUNwREFkQmdOVkhTVUVGakFVQmdnckJnRUZCUWNESUt3WUJCUVVIQXdJd0R3WURWUjBUQVFIL0JBVXcKQXdFQi96QWRCZ05WSFE0RUZnUVVVbTBvVUtiSXBHbTcxL3JzUlBReUJJbkZYYmt3RFFZSktvWklodmNOQVFFTApCUUFEZ2dFQkFEcVJBamwzWTNqVk5JK1JUbHBYdmwvdUYxbXNZUTNzTnFwVXR6eVNCVlNDWlpDMkRSSnpwYWZGCjhPdXN5QjBMTTNlS3VTd0t4STgrT0o5OTlhZkdOazRWTnpySVhOQURaZ1BxbnRFSWRucXNReGg4eFBuOVY0T2QKQUtsTVJycVI4R3g4ejdRM2EvN01uR0sra1l3VmorZ3BBNkFGUEJxSVJrU3Jscmo5b2dXVzBqWTFzL2tNU21ydgpaVEFZWTJqcFhBaGZrdzcrVDN4OHYwa0NRai9NREo5L3dNTnhxeVNGMEhzNXd6THVvbVJOM0VEME03eUNhWjg0CmJuOFZTN1VUSjBaWnhBRmx3TWxySlRYWmFpQmNOeDRNdm4wNXN4RG5KZktCdFloSkZwbGRwR3hLMDRUSmNXWm0KU0dDemZhc2FIK2M1NklNT0IvRllMdlJlelh4cE5mMD0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    encoding: base64
    owner: root:root
    permissions: “0644”
    path: /etc/ssl/certs/tkg-custom-ca.pem

……

  preKubeadmCommands:

  – ! which rehash_ca_certificates.sh 2>/dev/null || rehash_ca_certificates.sh

……….

It is taken into account immediately and a new controller is deployed to replace the old one. (or several depending on the plan chosen at the time of creation). Wait while the new controller is deploying and replaces the old one. The same for the worker: kubectl edit KubeadmConfigTemplate test-md-0 –context tkg-mgmt-vsphere-admin@tkg-mgmt-vsphere. The consideration for workers is not immediate, you have to run the following command:

# kubectl patch machinedeployment test-md-0 –type merge -p “{\”spec\”:{\”template\”:{\”metadata\”:{\”annotations\”:{\”date\”:\”`date +’%s’`\”}}}}}” –context tkg-mgmt-vsphere-admin@tkg-mgmt -vsphere machinedeployment.cluster.x-k8s.io/test-md-0 patched Wait a bit for the worker nodes to be replaced by new ones then we can retry deploying a new kuard image

# kubectl get node NAME STATUS ROLES AGE VERSION test-control-plane-gsmqz Ready control-plane,master 34m v1.21.2+vmware.1 test-md-0-698857566f-8pvt7 Ready <none> 118s v1.21.2+vmware.1 Now we can redeploy the Kuard image to check that it will run # kubectl run kuard –image=harbor.cpod-velocity.az-fkd.cloud-garage.net/library/kuard-amd64:blue pod/kuard created

# kg pods
NAME READY STATUS RESTARTS AGE
kuard 1/1 Running 0 17s

This procedure is valid for taking certificates into account if the cluster is already deployed. if it has not yet been created, it is preferable to have the certificate taken into account from the start using the procedure described in the installation documentation.