
Deploy a Highly Available WordPress Instance as a StatefulSet in Kubernetes 1.5

13 Jan 2017 1:00am, by

At The New Stack, we have covered various strategies for running stateful workloads on the Kubernetes container orchestration engine. This article takes a practical, hands-on approach to deploying a highly available WordPress application in Kubernetes based on the strategies and best practices discussed earlier. We will cover everything from setting up the Kubernetes cluster and configuring shared storage to deploying the stateful application and enabling autoscaling.

Deploying and managing traditional workloads such as databases and content management systems in a containerized environment calls for a different approach. While it may be easy to package, deploy, manage, and scale contemporary, cloud-native applications in Kubernetes, managing a MySQL cluster or a fleet of WordPress containers requires an understanding of how storage, networking, and service discovery work in a Kubernetes environment. We will explore these concepts in the context of running a reliable, highly available content management system (CMS) powered by MySQL and WordPress.

Attributes of a Highly Available WordPress Deployment

WordPress is a stateful application that relies on two persistence backends: a file system and a MySQL database. To ensure high availability of the application, we need to maximize the uptime of the core PHP application, the underlying storage layer backing the file system, and the data tier powered by MySQL.

When traffic increases, the core PHP application should elastically scale to meet the load requirements. Each new instance should have read/write access to the file system to access and update the content. The file system capacity should increase proportionally as the content grows. Though not completely elastic, the data tier should also be able to scale in and scale out on demand.

Kubernetes and other container management platforms support scaling the stateless PHP application effortlessly. However, scaling the storage backend and the MySQL data tier is not an easy task. The concept of StatefulSets, introduced in Kubernetes 1.5, precisely addresses the issue of scaling stateful workloads. We will leverage this concept in designing our deployment strategy.

A Peek at The Stack

The data tier of our application will be powered by Percona XtraDB Cluster. It is one of the most popular open source MySQL clustering technologies, powered by Percona Server and Codership Galera. The best thing about Percona's XtraDB Cluster is its support for synchronous, multi-master replication, which delivers high availability. The MySQL Docker images, maintained by Percona, are available on Docker Hub.

The XtraDB cluster relies on etcd for discovering new nodes. The open source etcd is a popular key-value store for shared configuration and service discovery. It powers some of the well-known distributed computing systems such as Kubernetes, Cloud Foundry, locksmith, vulcand, and Doorman. Coming from CoreOS Inc., the company behind the Container Linux distribution, etcd is highly optimized for container scheduling and management.

For shared storage, we will be configuring a Network File System (NFS) backend that is accessible from all the Kubernetes nodes of the cluster. The static content uploaded through WordPress will be stored in the distributed file system. This approach keeps the WordPress container entirely stateless, allowing it to scale rapidly.

We will use the official WordPress image from Docker Hub to create the Kubernetes Pods. The only endpoint exposed to the outside world will be the HTTP/HTTPS service associated with the WordPress Pods.

Readying Kubernetes to Run Stateful Workloads

The Persistent Volumes and Claims created in Kubernetes can be based on NFS. They become the storage backbone for MySQL and WordPress Pods. In production scenarios, it is recommended that the NFS share is based on SSD storage running in a network-optimized instance. For this proof of concept, we will configure NFS on the Kubernetes Master.

For higher availability, we will run a minimum of three instances of etcd. We will use the concept of Node Affinity to schedule exactly one etcd Pod per Node.

The Percona XtraDB Cluster will be configured as a Kubernetes StatefulSet to ensure high availability. StatefulSet mimics the workflow involved in deploying and managing a virtual machine-based cluster. Each Pod gets a stable, unique identifier associated with dedicated persistent storage. It brings the flexibility of ReplicaSet to stateful Pods.

For elastic scaling and rapid scheduling of WordPress Pods, we will configure a ReplicaSet with a minimum of three replicas. Each Pod in the ReplicaSet will be associated with a Volume mounted on NFS. This approach makes WordPress almost stateless. We will also configure Horizontal Pod Autoscaling for the ReplicaSet to enable elasticity.

Setting up the Kubernetes Infrastructure

This walkthrough helps you configure a Kubernetes cluster on a set of Vagrant boxes running locally. With a few modifications, the same can be used with mainstream public cloud providers.

Assuming you have a Mac with at least 8GB RAM and 256GB HDD running VirtualBox and Vagrant, you can easily spin up a fully configured three-node Kubernetes cluster in less than 20 minutes. Just run the following commands to get started.
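
The commands below sketch the standard Vagrant-based workflow for the Kubernetes 1.5 release; the NUM_NODES value and the get.k8s.io bootstrap URL are assumptions based on the kube-up process of that era, so adjust them to your setup.

```shell
# Use the Vagrant provider bundled with the Kubernetes release;
# NUM_NODES controls how many worker nodes are provisioned.
export KUBERNETES_PROVIDER=vagrant
export NUM_NODES=3

# Download and run the Kubernetes bootstrap script, which invokes
# kube-up.sh to create and configure the Vagrant boxes.
curl -sS https://get.k8s.io | bash
```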

This would provision four Fedora virtual machines: a Kubernetes Master and three Nodes. It would also configure kubectl, the Kubernetes command-line interface, to work with the cluster.

Typing kubectl get nodes confirms that the three Nodes have registered with the Master and are in the Ready state.

Once this is done, the next step is to set up shared storage based on NFS. We will configure the Master as the NFS server with the mount point available on all the Nodes.

SSH into the Kubernetes Master (10.245.1.2) and run the commands to configure an NFS share:
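
The following is a sketch of the NFS server setup on a Fedora host; the export options and directory layout are assumptions for this proof of concept.

```shell
# Install the NFS server packages on the Master (10.245.1.2).
sudo dnf install -y nfs-utils

# Create the directories that will back the MySQL Persistent Volumes.
sudo mkdir -p /opt/data/vol/0 /opt/data/vol/1 /opt/data/vol/2

# Export /opt/data to all hosts with read/write access.
echo "/opt/data *(rw,sync,no_root_squash)" | sudo tee -a /etc/exports

# Enable and start the NFS server, then refresh the export table.
sudo systemctl enable nfs-server
sudo systemctl start nfs-server
sudo exportfs -ra
```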

SSH into each Node, run the below commands to automount NFS at boot time:
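
A minimal sketch of the client-side setup, assuming the Master exports /opt/data as shown above; the fstab entry makes the mount survive reboots.

```shell
# Install the NFS client utilities on each Node.
sudo dnf install -y nfs-utils

# Create the local mount point for the shared storage.
sudo mkdir -p /mnt/data

# Add an fstab entry so the share is mounted automatically at boot.
echo "10.245.1.2:/opt/data /mnt/data nfs defaults 0 0" | sudo tee -a /etc/fstab

# Mount everything in fstab now to verify the entry works.
sudo mount -a
```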

This step completes provisioning the cluster with shared storage available at /mnt/data on the Nodes.

Creating Persistent Volumes and Claims

Before we go any further with the setup, let’s create Persistent Volumes (PV) and Persistent Volume Claims (PVC) that will be used by the MySQL cluster.

We will first provision three Persistent Volumes that are based on NFS. Notice that the PV definition contains a pointer to the NFS server. The path /opt/data/vol/0 will be explicitly assigned to the PV called mysql-pv0. The remaining two PVs also have an ordinal index of 1 and 2 attached to them. The significance of this convention will be explained when we create the MySQL StatefulSet:
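
A minimal PV definition along these lines would do the job; the capacity and reclaim policy are assumptions for this proof of concept.

```yaml
# mysql-pv0 pins the NFS path /opt/data/vol/0; repeat this definition
# with ordinal index 1 and 2 for the other two PVs.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv0
spec:
  capacity:
    storage: 10Gi            # assumed size for the PoC
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.245.1.2       # the Kubernetes Master acting as NFS server
    path: /opt/data/vol/0
```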

Each PV will be claimed by a PVC, which will be mapped to the Pod Volume of the StatefulSet:
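
A possible PVC definition is sketched below; the explicit volumeName field is an assumption used here to pin each claim to its matching PV.

```yaml
# db-mysql-0 claims mysql-pv0; the name follows the convention
# <volume-template-name>-<statefulset-name>-<ordinal-index>.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-mysql-0
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeName: mysql-pv0
```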

Execute the following command to provision the storage infrastructure:
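
The file names below are assumptions; point kubectl at wherever you saved the PV and PVC definitions.

```shell
# Create the three Persistent Volumes and their claims.
kubectl create -f mysql-pv.yml
kubectl create -f mysql-pvc.yml
```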

Let’s ensure that the PVs and PVCs are in place:
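
```shell
# Each PV should report a STATUS of Bound,
# and each claim should be bound to its matching PV.
kubectl get pv
kubectl get pvc
```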

With the infrastructure set up, we are all set to deploy the stateful application.

Deploying etcd

We will configure three instances of the distributed key-value store, etcd. Since each instance requires unique configuration, it will be packaged as a Pod with a dedicated Service. The Service endpoint will be used by the Raft consensus protocol for internal communication. We will also expose an internal endpoint for the MySQL cluster to talk to the etcd cluster. To test the etcd deployment, we will also expose a NodePort.

To ensure that no two etcd instances are placed on the same node, we will use node affinity. Before that, we need to add a label to each Kubernetes node. The following commands will assign labels to the nodes.
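
The node names below are assumptions based on the default Vagrant provisioning; substitute the names reported by kubectl get nodes.

```shell
# Attach a unique label to each Node so that each etcd Pod
# can be pinned to exactly one of them via a nodeSelector.
kubectl label nodes kubernetes-node-1 name=node1
kubectl label nodes kubernetes-node-2 name=node2
kubectl label nodes kubernetes-node-3 name=node3
```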

The Pod definition of each etcd instance will have the node selector parameter that enforces node affinity.
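
An excerpt of what such a Pod definition could look like; the image tag and the etcd flags are assumptions, and the cluster bootstrap flags are left out to keep the sketch short.

```yaml
# Hypothetical etcd-0 Pod; the nodeSelector pins it to the Node
# labeled name=node1, enforcing one etcd instance per Node.
apiVersion: v1
kind: Pod
metadata:
  name: etcd-0
  labels:
    app: etcd
spec:
  nodeSelector:
    name: node1
  containers:
    - name: etcd
      image: quay.io/coreos/etcd:v2.3.7
      command:
        - /etcd
        - --name=etcd-0
        - --listen-client-urls=http://0.0.0.0:2379
        - --advertise-client-urls=http://etcd-0:2379
      ports:
        - containerPort: 2379   # client traffic
        - containerPort: 2380   # peer (Raft) traffic
```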

Run the following command to create the etcd cluster:
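
The file name and the app=etcd label are assumptions; use whatever your definitions declare.

```shell
# Create the etcd Pods and Services, then wait for the
# three Pods to reach the Running state.
kubectl create -f etcd.yml
kubectl get pods -l app=etcd
```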

We should now have three Pods and five Services. The etcd-client Service is meant only for testing the etcd cluster from the host machine. It can be safely deleted later.

Let's verify the etcd configuration by storing and retrieving a value:
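
The Node IP and NodePort below are placeholders; read the actual port from kubectl describe svc etcd-client.

```shell
# Write a key through the etcd v2 HTTP API via the NodePort...
curl -s http://10.245.1.3:30100/v2/keys/message -XPUT -d value="hello"

# ...and read it back to confirm the cluster is serving requests.
curl -s http://10.245.1.3:30100/v2/keys/message
```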

This step successfully configured the etcd cluster with node affinity. We will use this to configure the MySQL cluster.

Deploying Percona Cluster

Let's go ahead and deploy three instances of MySQL as a StatefulSet. Before we launch the cluster, take a look at the StatefulSet definition. Notice how the volumeMounts parameter is associated with the claim.
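
A sketch of what the StatefulSet could look like; the image name and the environment variables are assumptions based on Percona's published Docker image, and the hard-coded password is for the proof of concept only.

```yaml
apiVersion: apps/v1beta1          # StatefulSet API group in Kubernetes 1.5
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql              # the headless Service governing the Pods
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: percona/percona-xtradb-cluster:5.6   # assumed image
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "password"   # hard-coded for the PoC; use a Secret in production
            - name: DISCOVERY_SERVICE
              value: "etcd:2379"  # points at the etcd Service created earlier
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: db            # matches the claim prefix db-mysql-N
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    - metadata:
        name: db
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```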

During the scheduling of the StatefulSet, Kubernetes will ensure that each Pod is mapped to a Claim based on the ordinal index. The Pod mysql-0 will be mapped to db-mysql-0 which was created earlier. Take a minute to explore the StatefulSet YAML file.

You will also notice that the Percona cluster relies on the etcd service for discovery, which we created in the previous step.

Let’s now provision a three-node Percona XtraDB Cluster for MySQL:
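
The file name is an assumption; use whatever your StatefulSet definition is saved as.

```shell
# Create the StatefulSet, then watch the Pods come up in order:
# mysql-0 first, then mysql-1, then mysql-2.
kubectl create -f mysql-statefulset.yml
kubectl get pods -l app=mysql -w
```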

After a few minutes, you should see three Pods created.

Inspect one of the Pods in the StatefulSet to see the storage and network configuration. We can also check the logs to see that the replication is configured among the MySQL instances:
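
```shell
# Inspect the first Pod's volumes and network configuration.
kubectl describe pod mysql-0

# Galera-related (wsrep) log lines confirm that replication
# has been established between the MySQL instances.
kubectl logs mysql-0 | grep -i wsrep
```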

This step also involves the creation of internal and external MySQL Services:
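
Possible definitions for the two Services are sketched below; the NodePort value is arbitrary and shown only for illustration.

```yaml
# The headless Service (clusterIP: None) gives the StatefulSet
# Pods stable DNS names and distributes client connections.
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None
  selector:
    app: mysql
  ports:
    - port: 3306
---
# The NodePort Service exposes MySQL to the host for testing only.
apiVersion: v1
kind: Service
metadata:
  name: mysql-client
spec:
  type: NodePort
  selector:
    app: mysql
  ports:
    - port: 3306
      nodePort: 30306   # hypothetical port number
```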

The MySQL endpoint is a headless service used for routing the requests to one of the Pods of the StatefulSet. This will be used by WordPress to talk to the MySQL cluster.

The mysql-client endpoint is created to test the service. It can be deleted after the initial setup. Let’s see the MySQL cluster in action by connecting the CLI to the mysql-client endpoint:
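
The NodePort below is a placeholder; check kubectl get svc mysql-client for the real value.

```shell
# Connect through the NodePort and print the Galera node address of
# whichever Pod served the connection; repeat the command a few times
# to see the requests land on different Pods.
mysql -h 10.245.1.3 -P 30306 -u root -p -e "SELECT @@wsrep_node_address;"
```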

The CLI is routed to one of the Pods by the headless service. Each unique IP represents the address of a stateful Pod.

This verifies that MySQL cluster is up and running. You can also SSH into the Kubernetes Master Vagrant box and check the folders (/opt/data/vol/[0,1,2]) that contain the MySQL data and log files.

Deploying WordPress

Since the state has already been moved to NFS and MySQL, we can configure the WordPress Pods as a ReplicaSet. This will give us flexibility in scaling the application.

Let’s create the WordPress ReplicaSet:
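
A sketch of what the ReplicaSet could look like; the NFS path for wp-content, the Service name for the database host, and the credentials are assumptions for this proof of concept.

```yaml
apiVersion: extensions/v1beta1    # ReplicaSet API group in Kubernetes 1.5
kind: ReplicaSet
metadata:
  name: wordpress
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      containers:
        - name: wordpress
          image: wordpress:4.7-apache
          env:
            - name: WORDPRESS_DB_HOST
              value: "mysql"      # the headless MySQL Service
            - name: WORDPRESS_DB_PASSWORD
              value: "password"   # hard-coded for the PoC only
          ports:
            - containerPort: 80
          volumeMounts:
            - name: wp-content    # shared content directory on NFS
              mountPath: /var/www/html/wp-content
      volumes:
        - name: wp-content
          nfs:
            server: 10.245.1.2
            path: /opt/data/wp-content   # assumed directory on the share
```

Create it with kubectl create -f WordPress.yml.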

Let’s verify the Pod creation:
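
```shell
# All three WordPress Pods should reach the Running state.
kubectl get pods -l app=wordpress
```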

Each Pod of the ReplicaSet mounts the same file system share with read/write access. This will ensure that the content uploaded to WordPress is instantly available to all the Pods.

The WordPress.yml file also has a definition for a Horizontal Pod Autoscaler (HPA) that will automatically scale the Pods.
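
A possible HPA definition is shown below; the CPU target and the replica bounds are assumptions.

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: wordpress
spec:
  scaleTargetRef:                 # the ReplicaSet being scaled
    apiVersion: extensions/v1beta1
    kind: ReplicaSet
    name: wordpress
  minReplicas: 3
  maxReplicas: 6
  targetCPUUtilizationPercentage: 80
```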

With the discovery backend (etcd), database (MySQL), and frontend (WordPress) in place, let's go ahead and access it from the browser. Before that, let's get the NodePort of the service endpoint.
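
The Service name below is an assumption; substitute whatever WordPress.yml defines.

```shell
# Print the NodePort assigned to the WordPress Service,
# then browse to http://<any-node-ip>:<nodePort>.
kubectl get svc wordpress -o jsonpath='{.spec.ports[0].nodePort}'
```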

Disclaimers

  • Since StatefulSet is currently in beta, this architecture is designed as a proof of concept for StatefulSet. It is not production-ready.
  • For deploying etcd, consider the etcd Operator from CoreOS.
  • NFS may not be the ideal distributed storage for I/O-intensive workloads. Gluster or Ceph is recommended for this type of deployment.
  • Redis or Memcached is preferred for storing PHP sessions. Moving the session state out of WordPress will make the deployment more scalable.
  • Usernames and passwords are hard-coded in the YAML definitions. For production deployments, consider using Kubernetes Secrets.

The Cloud Native Computing Foundation, which manages Kubernetes, is a sponsor of The New Stack.

Feature image: The Eiffel Tower, taken by Peter Y. Chuang, via Unsplash.
