Two weeks with Kubernetes in production

Recently, the microservices model of designing applications has been getting more popular than the monolithic model for many reasons (which deserve a separate article). Thanks to containerisation, we are able to split our architecture into small, independent services which can be arranged in many ways (think of a container/service as a lightweight, isolated virtual machine). This helps us build highly scalable and robust architectures. With a cloud provider it becomes even easier, as we don’t have to take care of our own dedicated IT infrastructure.

However, there are some downsides for software developers when it comes to maintaining microservices that were previously easy to run as a monolithic application. Now the same app is cut into pieces that must somehow keep communicating with each other. This means that we have to run and deploy all of them, locally or in production, to provide one properly working piece of software. Depending on how the project is prepared, organised and documented, bringing all of those services to life can be far from trivial, particularly in sophisticated systems that need to run as a whole.

Production deployment, versioning and keeping these applications running with zero downtime are challenges as well. We need people with an expert-level understanding of DevOps practices: someone who knows how to design the architecture, communication and processes at each layer to provide a highly available solution that is also automatically scalable and easy to deploy. Nowadays, such people are quite hard to find. Obviously, the product should not depend on the current team members; those processes should be simple and easy to adopt for the new developers who will take care of this in the future.

There are a lot of potentially problematic situations, which brings us to the point of this article: why we decided to use Kubernetes for orchestration.

Why Kubernetes?

At Software Brothers, and I guess in other software houses too, it’s common practice to maintain multiple environments (development, staging and production) for each project we develop. Even though we need all three instances, there is no need to give them identical resource configurations. The development version of the API service doesn’t need as much CPU as the production one, because it only handles requests from our developers and testers. The production version, on the other hand, must provide auto-scaling, more disk space, different configuration, SSL and other important features which make the end product stable and secure.
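
For example, in development a single replica with modest resources is usually enough, while in production we can attach a HorizontalPodAutoscaler to the deployment to get the auto-scaling mentioned above. The names and numbers in this sketch are illustrative only, not taken from our setup:

(production-hpa.yaml, illustrative)
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70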

Our choice for hosting client apps is the cloud rather than dedicated infrastructure. It gives us tons of possibilities with a nice and easy workflow. So far, we’ve mainly deployed software using Amazon ECS. Over the last few years we’ve gained a lot of experience with AWS solutions - most of our projects use Amazon’s container orchestration service. Recently, however, we decided to go a step further and deploy one of our latest projects in a different way, using Kubernetes for orchestration.

One of the key reasons to use k8s is that the configuration doesn’t depend on any cloud provider, so it can be moved to a different cloud without much effort. Besides, it’s now a very popular orchestration tool, well documented, created and supported by Google, although a few good alternatives with plenty of users exist, such as Docker Swarm, Apache Mesos, HashiCorp Nomad or Kontena. It’s an open source project with more than 44k stars on GitHub, written in the Go programming language. It’s also worth mentioning that, thanks to minikube, we can run it locally, and you don’t need to purchase anything to start playing with k8s.

The second reason is that we don’t need to change anything in our application’s source code to adapt it to a different architecture. We can keep the Kubernetes resource configuration in the project’s version control repository and track the changes: it becomes architecture as configuration. Moreover, it’s written in YAML, which improves readability. When you apply your configuration, Kubernetes normalises the formatting and always adds some extra details. There is a command-line tool, kubectl, for applying changes to your clusters directly from the console or from a file.
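
As a sketch of what such a versioned resource file can look like (all names, images and values here are placeholders, not taken from our project):

(example-deployment.yaml, illustrative)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-api
  template:
    metadata:
      labels:
        app: example-api
    spec:
      containers:
      - name: api
        image: registry.example.com/example-api:1.0.0
        ports:
        - containerPort: 8000

Applying it is then a matter of running kubectl apply -f example-deployment.yaml against the cluster.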

Any reasons why you shouldn’t try it? Maybe only one thing may concern people with no prior Kubernetes experience: how to do it right.

How we use Kubernetes

First of all, we’ve written our Dockerfiles using the Builder Pattern to keep image sizes as small as possible. They all follow the Single Responsibility Principle. Using the alpine variants of base images, we decrease the image size, which positively affects the duration of the continuous deployment process.

For production Dockerfiles, we recommend making the file system read-only and running your application as a normal user, to avoid spawning processes with root privileges. Docker containers with user namespace isolation are easier to keep secure, because access to their resources is limited.
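
The same rules can be enforced on the Kubernetes side as well. Here is a minimal sketch of a container securityContext, assuming your image can run as an unprivileged user and only needs to write to mounted volumes:

(part of a container spec, illustrative)
securityContext:
  runAsNonRoot: true            # refuse to start if the image defaults to root
  runAsUser: 1000               # hypothetical UID of a normal user baked into the image
  readOnlyRootFilesystem: true  # mount the container file system read-only
  allowPrivilegeEscalation: false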

Another important rule is running a single process per container. Even though it’s quite obvious, I think it’s worth mentioning as a good practice. It gives you the ability to scale your service horizontally. Moreover, the whole system is more flexible and isolated: you can collect logs more easily or reuse the container in other projects. Imagine you run two different processes in a single container and one of them crashes. The Docker agent should recognise the crashed container, kill it and create a new instance, but instead you are left with a “healthy” container holding a zombie process, and that’s a problem. Each container should serve a single, distinct purpose. Also keep in mind that it’s better to let your application terminate gracefully than to kill and restart it abruptly.

We also create deployments that always represent only one microservice - the same rule as in the paragraph above, but from the Kubernetes point of view. Each deployment contains a single container per pod, if possible. Sometimes you might need, e.g., a Cloud SQL proxy container to connect to the database, and this is fine, as sketched below. Just avoid designing your deployments as a bag of unrelated containers, since there is no scalability in that approach.
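
A rough sketch of such a pod template with one application container and a Cloud SQL proxy sidecar; the image tag, project and instance names are placeholders:

(part of a deployment spec, illustrative)
template:
  spec:
    containers:
    - name: app
      image: registry.example.com/app:1.0.0
      ports:
      - containerPort: 8000
    - name: cloudsql-proxy
      image: gcr.io/cloudsql-docker/gce-proxy:1.11
      command: ["/cloud_sql_proxy",
                "-instances=my-project:europe-west1:my-db=tcp:5432"]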

Use the --record flag whenever you apply changes to your resources, so the command that made each change is kept in an annotation. This helps you track the history of all changes. Try kubectl rollout history deployments to see what has recently been done to your configuration. I used to track my actions in a very crude, inelegant way, and it made me lose the context of my work until I started using this flag with kubectl.

Count your cluster resources carefully. It’s good to think about the requests and limits fields, which you can set for each container within a pod, and to consider them together with the RollingUpdate strategy and its maxSurge and maxUnavailable parameters, as sketched below. Without these configured properly, your app can eat all of the CPU in your cluster if node auto-scaling is turned off. Setting them up increases the chance of a new pod being scheduled in the cluster, and it lets you decide whether the cluster is allowed to run two identical copies of a deployment at the same time during a rollout. Also remember about the kube-reserved CPU and memory, which you must subtract from the resource pool of each node.
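
To make this concrete, here is a sketch of how those fields fit together in a deployment spec; the numbers are examples only, not our production values:

(part of a deployment spec, illustrative)
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    spec:
      containers:
      - name: app
        image: registry.example.com/app:1.0.0
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi

With maxSurge: 1 and maxUnavailable: 0, the cluster temporarily runs one extra copy of the pod during a rollout, so the requested CPU and memory have to be available for it somewhere in the cluster.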

My last recommendation is to use Helm, a manager of charts (Kubernetes packages). It helps you find and install Kubernetes templates that you can tune to your needs by filling in the values.yaml file. With Helm, you can create a template of some part of your application and deploy it more easily. You can also share it with the community.
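
For example, a chart’s values.yaml can be as small as the sketch below; the keys depend entirely on the chart, so treat these as hypothetical:

(values.yaml, illustrative)
replicaCount: 2
image:
  repository: registry.example.com/app
  tag: "1.0.0"
service:
  type: ClusterIP
  port: 80
ingress:
  enabled: true
  host: our.domain

Running helm install with this file renders the chart’s templates using your values and deploys the result to the cluster.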

However, our approach was to start from scratch, step by step, deploying simple yet working pieces. We were completely focused on getting to know every Kubernetes resource configuration structure better. I personally prefer to do thorough research before I start working, and the same applied here: we have tried to follow all the best practices since we started the process of taking our Django application live.

Last steps - push it live

One of the last things to do was to expose our product to customers, point DNS at it and provide a secure connection via SSL. Not a big deal. Let’s see:

First, you must create an Ingress object, which allows external access to exposed cluster services. As the documentation says, an Ingress can provide load balancing, SSL termination and name-based virtual hosting. You also need to declare readinessProbe and livenessProbe probes for each service, so the Ingress (and the load balancer behind it) can tell whether your pods are healthy.

(Part of your deployment configuration, container spec)
livenessProbe:
    failureThreshold: 3
    httpGet:
      path: /
      port: 8000
      scheme: HTTP
    initialDelaySeconds: 60
    periodSeconds: 65
    successThreshold: 1
    timeoutSeconds: 1

readinessProbe:
    failureThreshold: 3
    httpGet:
      path: /
      port: 8000
      scheme: HTTP
    initialDelaySeconds: 50
    periodSeconds: 55
    successThreshold: 1
    timeoutSeconds: 1

This requires your container to answer GET / with a 200 status, or you can prepare a dedicated health-check endpoint for this purpose.

The Ingress has the following structure; see the config file below:

(production-ingress.yaml)
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    certmanager.k8s.io/cluster-issuer: our-cluster-issuer
    kubernetes.io/ingress.global-static-ip-name: our-static-ip
    kubernetes.io/tls-acme: "true"
  name: production-ingress
  namespace: default
spec:
  backend:
    serviceName: our-service
    servicePort: 80
  tls:
  - secretName: our-secret-with-certificates
status:
  loadBalancer:
    ingress:
    - ip: xxx.xxx.xxx.xxx

Take a look at the annotations. We need a static IP created in GCP, a Service pointing to the deployment with the application, a cluster issuer and, as you can see in the spec section above, a secret that holds two values: the certificate and the key. See the listings below:

(our-secret-with-certificates.yaml)
apiVersion: v1
data:
  tls.crt: # crt
  tls.key: # key
kind: Secret
metadata:
  name: our-secret-with-certificates
  namespace: default
type: Opaque

(our-cluster-issuer.yaml)
apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: our-cluster-issuer
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: your-registration-address@email.com
    privateKeySecretRef:
      name: our-cluster-issuer
    http01: {}

Moreover, you need to create a certificate object for this purpose.

(our-certificate.yaml)
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: our-certificate
spec:
  secretName: our-secret-with-certificates
  dnsNames:
  - our.domain
  acme:
    config:
    - http01:
        ingress: production-ingress
      domains:
      - our.domain
  issuerRef:
    name: our-cluster-issuer
    kind: ClusterIssuer

With the above setup in place, I managed to provide SSL with automatically renewed certificates.

Summary

A few people have asked me how hard it is to use Kubernetes orchestration. From this perspective, I can say it is as difficult as trying any other new thing: the more you work with it, the easier it gets. If you really want to use it in your project, think about whether you need clusters at all. If the answer is yes, don’t waste your time and learn the Kubernetes basics.