Scaling ProxySQL rapidly in Kubernetes


It’s not uncommon these days for us to use a high availability stack for MySQL consisting of Orchestrator, Consul, and ProxySQL. You can read more about this stack in Matthias Crauwels’ blog post How To Autoscale ProxySQL In The Cloud and in Ivan Groenwold’s post on MySQL High Availability With ProxySQL, Consul And Orchestrator. But the high-level concept is simply that Orchestrator will monitor the state of the MySQL replication topology and report changes to Consul, which in turn can update ProxySQL hosts using a tool called consul-template.

Until now we’ve typically implemented the ProxySQL portion of this stack using an autoscaling group of sorts, due to the high levels of CPU usage that can be associated with ProxySQL. Being able to scale up and down as traffic increases and decreases ensures you’re not paying for resources you don’t need; however, it comes with a few disadvantages. The first is the amount of time it takes to scale up. If you are using an autoscaling group and a new instance is launched, the following steps need to be taken:

  1. There will be a request to your cloud service provider for a new VM instance.
  2. Once the instance is up and running as part of the group, it will need to install ProxySQL along with supporting packages such as consul (agent) and consul-template.
  3. Once the packages are installed, they will need to be configured to work with the consul server nodes as well as the ProxySQL nodes that are participating in the ProxySQL cluster.
  4. The new ProxySQL host will announce to Consul that it’s available, which in turn will update all the other participating nodes in the ProxySQL cluster.

This can take time. Provisioning a new VM instance usually happens fairly quickly, normally within a couple minutes, but sometimes there can be unexpected delays. You can speed up package installation by using a custom machine image, but there is an operational overhead with keeping images up to date with the latest versions of the installed packages, so it may be easier to do this using a script that always installs the latest versions. All in all, you can expect a scale up to take more than a minute.

The next issue is how deterministic this solution is. If you’re not using a custom machine image, you’ll need to pull down your config and template files from somewhere, most likely a storage bucket, and there’s a chance that those files could be overwritten – meaning that the next time an instance is launched by the autoscaler it may not necessarily have the same configuration as the rest of the hosts participating in the ProxySQL cluster.

We can take this already impressive stack another step further using Docker containers and Kubernetes.

For those of you who are unfamiliar with containerization: a container is similar to a virtual machine snapshot, but it is not a full snapshot including the OS; instead, it contains just the binaries and files required to run your process. You create this image using a Dockerfile, typically starting from a specified Linux distribution, and then use verbs like RUN, COPY, and USER to specify what should be included in your container “image”. Once this image is built, it can be stored centrally in a repository and made available to machines running a containerization platform like Docker. This method of deployment has become increasingly popular in recent years because containers are lightweight, and because a container that works on one system will work exactly the same way when it’s moved to a different system, reducing common issues like dependency and configuration variations from host to host.

Given that we want to be able to scale up and down, it’s safe to say we’re going to want to run more than one container. That’s where Kubernetes comes into play. Kubernetes is a container management platform that operates on an array of hosts (virtual or physical) and distributes containers on them as specified by your configuration, typically a YAML-format Kubernetes deployment file. If you’re using Google Kubernetes Engine (GKE) on Google Cloud Platform (GCP), this is even easier as the vast majority of the work in creating a Kubernetes deployment (referred to as a ‘workload’ in GKE) YAML is handled for you via a simple UI within the GCP Console.

If you want to learn more about Docker or Kubernetes, I highly recommend Nigel Poulton’s video content on Pluralsight. For now, let’s stick to learning about ProxySQL on this platform.

If we want ProxySQL to run in Kubernetes and operate with our existing stack with Consul and Orchestrator, we’re going to need to keep best practices in mind for our containers.

  1. Each container should run only a single process. We know that we’re working with ProxySQL, consul (agent), and consul-template, so these will all need to be in their own containers.
  2. The primary process running in each container should run as PID 1.
  3. The primary process running in each container should not run as root.
  4. Log output from the primary process in the container should be sent to STDOUT so that it can be collected by Docker logs.
  5. Containers should be as deterministic as possible – meaning they should run the same (or at least as much as possible) regardless of what environment they are deployed in.

The first thing that stands out in the list above is the need to have ProxySQL, consul-template, and consul (agent) isolated within their own containers. These are going to need to work together, given that consul (agent) is acting as our communication conduit back to the consul (server) hosts and consul-template is what updates ProxySQL based on changes to keys and values in Consul. So how can they work together like this if they are in separate containers?

The solution is provided by Kubernetes. When you’re thinking about Docker, the smallest computational unit is the container; when you’re thinking about Kubernetes, the smallest computational unit is the pod, which can contain one or more containers. Any containers operating within the same pod can communicate with one another over localhost ports. So in this case, assuming you’re using default ports, the consul-template container can talk to the consul (agent) container on localhost port 8500 and to the ProxySQL container on localhost port 6032, given that these three containers will be working together in the same pod.
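As a quick illustration (a sketch only, assuming default ports and whatever admin credentials you’ve configured in ProxySQL), from a shell inside the consul-template container its neighbours are reachable like this:

# Reach the consul (agent) container in the same pod via its HTTP API:
curl -s http://localhost:8500/v1/agent/self

# Reach the ProxySQL admin interface in the neighbouring container:
mysql -h 127.0.0.1 -P 6032 -u admin -p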

So let’s start looking at some code, starting with the simplest container and then working our way to the most complex.

Consul (Agent) Container

Below is a generic version of the Dockerfile that I’m using for consul (agent). The objective is to install Consul and then instruct it to connect as an agent to the Consul cluster comprised of the consul (server) nodes.

FROM centos:7
RUN yum install -q -y unzip wget && \
  yum clean all
RUN groupadd consul && \
  useradd -r -g consul -d /var/lib/consul consul
RUN mkdir /opt/consul && \
  mkdir /etc/consul && \
  mkdir /var/log/consul && \
  mkdir /var/lib/consul && \
  chown -R consul:consul /opt/consul && \
  chown -R consul:consul /etc/consul && \
  chown -R consul:consul /var/log/consul && \
  chown -R consul:consul /var/lib/consul
RUN wget -q -O /opt/consul/consul.zip https://releases.hashicorp.com/consul/1.6.1/consul_1.6.1_linux_amd64.zip && \
  unzip /opt/consul/consul.zip -d /opt/consul/ && \
  rm -f /opt/consul/consul.zip && \
  ln -s /opt/consul/consul /usr/local/bin/consul
COPY supportfiles/consul.conf.json /etc/consul/
USER consul
ENTRYPOINT ["/usr/local/bin/consul", "agent", "--config-file=/etc/consul/consul.conf.json"]

Simply put, the code above follows these instructions:

  1. Start from CentOS 7. This is a personal preference of mine. There are probably more lightweight distributions that can be considered, such as Alpine as recommended by Google, but I’m not the best OS nerd out there so I wanted to stick with what I know.
  2. Install our dependencies, which in this case is unzip and wget.
  3. Create our consul user, group, and directory structure.
  4. Install consul.
  5. Copy over the consul config file from the host where the Docker build is being performed.
  6. Switch to the consul user.
  7. Start consul (agent).

Now let’s check the code and see if it matches best practices.

  • Container runs a single process
    • The ENTRYPOINT runs Consul directly, meaning that nothing else is being run. Keep in mind that ENTRYPOINT specifies what should be run when the container starts. This means that when the container starts it won’t have to install anything because the packages come with the image as designated by the Dockerfile, but we still need to launch Consul when the container starts.
  • Process should be PID 1
    • Any process run by an exec-form ENTRYPOINT, as used here, will run as PID 1.
  • Process should not be run as root
    • We switched to the Consul user prior to starting the entrypoint.
  • Log output should go to STDOUT
    • If you run Consul using the command noted in the ENTRYPOINT, you’ll see that log output goes to STDOUT.
  • Should be as deterministic as possible
    • We’ve copied the configuration file into the container, meaning that the container doesn’t have to get support files from anywhere else before Consul starts. The only way the nature of Consul will change is if we recreate the container image with a new configuration file.

There’s really nothing special about the Consul configuration file that gets copied into the container. You can see an example of this by checking the aforementioned blog posts by Matthias or Ivan for this particular HA stack.
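For reference, a minimal agent configuration along those lines might look something like the following sketch. The datacenter name and the Consul server addresses are placeholders for illustration; refer to the linked posts for a complete walk-through:

{
  "datacenter": "us-central1",
  "data_dir": "/var/lib/consul",
  "log_level": "INFO",
  "server": false,
  "client_addr": "0.0.0.0",
  "retry_join": ["consul-server-1.example.com", "consul-server-2.example.com", "consul-server-3.example.com"],
  "ports": {
    "http": 8500
  }
}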

ProxySQL Container

Below is a generic version of the Dockerfile that I’m using for ProxySQL. The objective is to install ProxySQL and make it available to receive traffic on port 6033 for writes, port 6034 for reads, and port 6032 for the admin console, which is how consul-template will interface with ProxySQL.

FROM centos:7
RUN groupadd proxysql && \
  useradd -r -g proxysql proxysql
RUN yum install -q -y https://github.com/sysown/proxysql/releases/download/v2.0.6/proxysql-2.0.6-1-centos67.x86_64.rpm mysql curl && \
  yum clean all
COPY supportfiles/* /opt/supportfiles/
COPY startstop/* /opt/
RUN chmod +x /opt/entrypoint.sh
RUN chown proxysql:proxysql /etc/proxysql.cnf
USER proxysql
ENTRYPOINT ["/opt/entrypoint.sh"]

Simply put, the code above follows these instructions:

  1. Start from CentOS 7.
  2. Create our ProxySQL user and group.
  3. Install ProxySQL and its dependencies, which in this case are the mysql client and curl; curl will be used to poll the GCP API in order to determine what region the ProxySQL cluster is in. We’ll cover this in more detail below.
  4. Move our configuration files and entrypoint script to the container.
  5. Make sure that the ProxySQL config file is owned by the proxysql user, so the entrypoint script can replace it.
  6. Switch to the ProxySQL user.
  7. Start ProxySQL via the entrypoint script that’s provided with the container.

In my use case, I have multiple ProxySQL clusters – one per GCP region. They have to be logically grouped together to ensure they route read traffic to replicas within the local region but send write traffic to the master regardless of what region it’s in. Each region has its own hostgroup for read replicas, so the mysql_query_rules table needs to be configured accordingly: the MySQL hosts are added to different hostgroups per region, but the routing to each hostgroup remains consistent. Given that these rules are highly unlikely to change, I have mysql_query_rules configured in the configuration file. This means that I need to select the correct configuration file based on my region before starting ProxySQL, and this is where my entrypoint script comes into play. Let’s have a look at a simplified and more generic version of my code:

#!/bin/bash
dataCenter=$(curl http://metadata.google.internal/computeMetadata/v1/instance/zone -H "Metadata-Flavor: Google" | awk -F "/" '{print $NF}' | cut -d- -f1,2)
...
case $dataCenter in
  us-central1)
    cp -f /opt/supportfiles/proxysql-us-central1.cnf /etc/proxysql.cnf
    ;;
  us-east1)
    cp -f /opt/supportfiles/proxysql-us-east1.cnf /etc/proxysql.cnf
    ;;
esac
...
exec proxysql -c /etc/proxysql.cnf -f -D /var/lib/proxysql

The script starts by polling the GCP API to determine what region the container has been launched in. Based on the result, it will copy the correct config file to the appropriate location and then start ProxySQL.
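To make that concrete, here is a hedged sketch of what the region-specific portion of one of those config files (say, proxysql-us-central1.cnf) could contain. The hostgroup numbers are illustrative, and routing by incoming port also assumes ProxySQL is listening on both 6033 and 6034:

mysql_variables=
{
  interfaces="0.0.0.0:6033;0.0.0.0:6034"
}
mysql_query_rules=
(
  {
    # Traffic arriving on 6033 goes to the writer hostgroup (the master); identical in every region
    rule_id=1
    active=1
    proxy_port=6033
    destination_hostgroup=0
    apply=1
  },
  {
    # Traffic arriving on 6034 goes to this region's read-replica hostgroup;
    # the us-east1 file would point at its own hostgroup instead
    rule_id=2
    active=1
    proxy_port=6034
    destination_hostgroup=10
    apply=1
  }
)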

Let’s see how the combination of the Dockerfile and the entrypoint script allows us to meet best practices.

  • Container runs a single process
    • ENTRYPOINT calls the entrypoint.sh script, which does some conditional logic based on the regional location of the container and then ends by running ProxySQL. This means that once the script finishes its setup work, ProxySQL is the only process running.
  • Process should be PID 1
    • The “exec” command at the end of the entrypoint script starts ProxySQL as PID 1, replacing the shell that ran the script.
  • Process should not be run as root
    • We switched to the proxysql user prior to starting the entrypoint.
  • Log output should go to STDOUT
    • If you run proxysql using the command noted at the end of the entrypoint script, you’ll see that log output goes to STDOUT.
  • Should be as deterministic as possible
    • We’ve copied the potential configuration files into the container. Unlike Consul, there are multiple configuration files and we need to determine which will be used based on the region that the container lives in, but the configuration files themselves will not change unless the container image itself is updated. This ensures that all containers running within the same region will behave the same.

Consul-template container

Below is a generic version of the Dockerfile that I’m using for consul-template. The objective is to install consul-template and have it act as the bridge between Consul via the consul (agent) container and ProxySQL, updating ProxySQL as needed when keys and values change in Consul.

FROM centos:7
RUN yum install -q -y unzip wget mysql nmap-ncat curl && \
  yum clean all
RUN groupadd consul && \
  useradd -r -g consul -d /var/lib/consul consul
RUN mkdir /opt/consul-template && \
  mkdir /etc/consul-template && \
  mkdir /etc/consul-template/templates && \
  mkdir /etc/consul-template/config && \
  mkdir /opt/supportfiles && \
  mkdir /var/log/consul/ && \
  chown -R consul:consul /etc/consul-template && \
  chown -R consul:consul /etc/consul-template/templates && \
  chown -R consul:consul /etc/consul-template/config && \
  chown -R consul:consul /var/log/consul
RUN wget -q -O /opt/consul-template/consul-template.zip https://releases.hashicorp.com/consul-template/0.22.0/consul-template_0.22.0_linux_amd64.zip && \
  unzip /opt/consul-template/consul-template.zip -d /opt/consul-template/ && \
  rm -f /opt/consul-template/consul-template.zip && \
  ln -s /opt/consul-template/consul-template /usr/local/bin/consul-template
RUN chown -R consul:consul /opt/consul-template
COPY supportfiles/* /opt/supportfiles/
COPY startstop/* /opt/
RUN chmod +x /opt/entrypoint.sh
USER consul
ENTRYPOINT ["/opt/entrypoint.sh"]

Simply put, the code above follows these instructions:

  1. Start from CentOS 7.
  2. Install our dependencies, which are unzip, wget, mysql (client), nmap-ncat, and curl.
  3. Create our Consul user and group.
  4. Create the consul-template directory structure.
  5. Download and install consul-template.
  6. Copy the configuration file, template files, and entrypoint script to the container.
  7. Make the entrypoint script executable.
  8. Switch to the Consul user.
  9. Start consul-template via the entrypoint script that’s provided with the container.

Much like our ProxySQL container, we really need to look at the entrypoint here in order to get the whole story. Remember, this is multi-region so there is additional logic that has to be considered when working with template files.

#!/bin/bash
dataCenter=$(curl http://metadata.google.internal/computeMetadata/v1/instance/zone -H "Metadata-Flavor: Google" | awk -F "/" '{print $NF}' | cut -d- -f1,2)
...
cp /opt/supportfiles/consul-template-config /etc/consul-template/config/consul-template.conf.json
case $dataCenter in
  us-central1)
    cp /opt/supportfiles/template-mysql-servers-us-central1 /etc/consul-template/templates/mysql_servers.tpl
    ;;
  us-east1)
    cp /opt/supportfiles/template-mysql-servers-us-east1 /etc/consul-template/templates/mysql_servers.tpl
    ;;
esac
cp /opt/supportfiles/template-mysql-users /etc/consul-template/templates/mysql_users.tpl
### Ensure that proxysql has started
while ! nc -z localhost 6032; do
  sleep 1;
done
### Ensure that consul agent has started
while ! nc -z localhost 8500; do
  sleep 1;
done
exec /usr/local/bin/consul-template --config=/etc/consul-template/config/consul-template.conf.json

This code is very similar to the entrypoint file that was used for ProxySQL in the sense that it checks which region the container is in and then moves configuration and template files into the appropriate locations, but there is some additional logic here that ensures ProxySQL is up and listening on port 6032 and that consul (agent) is up and listening on port 8500. The reason for this is that consul-template needs to be able to communicate with both of those processes. You really have no assurance as to which container in a pod will start first, so to avoid excessive errors in the consul-template log, I have it wait until it knows that its dependent services are running.
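For context, the consul-template configuration file referenced above would look something along these lines. This is only a sketch: the rendered-file destinations and the admin credentials passed to the mysql client are placeholder assumptions, but the source paths match the entrypoint script and the addresses match the localhost ports discussed earlier:

{
  "consul": {
    "address": "localhost:8500"
  },
  "template": [
    {
      "source": "/etc/consul-template/templates/mysql_servers.tpl",
      "destination": "/etc/consul-template/mysql_servers.sql",
      "command": "mysql -h 127.0.0.1 -P 6032 -u admin -padmin < /etc/consul-template/mysql_servers.sql"
    },
    {
      "source": "/etc/consul-template/templates/mysql_users.tpl",
      "destination": "/etc/consul-template/mysql_users.sql",
      "command": "mysql -h 127.0.0.1 -P 6032 -u admin -padmin < /etc/consul-template/mysql_users.sql"
    }
  ]
}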

Let’s go through our best practices checklist one more time against our consul-template container code.

  • Container runs a single process.
    • ENTRYPOINT calls the entrypoint.sh script, which does some conditional logic based on the regional location of the container and then ends by running consul-template. This means that once the script finishes its setup work, consul-template is the only process running.
  • Process should be PID 1.
    • The command “exec” at the end of the entrypoint script will start consul-template as PID 1.
  • Process should not be run as root.
    • We switched to the consul user prior to starting the entrypoint.
  • Log output should go to STDOUT.
    • If you run consul-template using the command noted at the end of the entrypoint script, you’ll see that log output goes to STDOUT.
  • Should be as deterministic as possible.
    • Just like ProxySQL and consul (agent), all the supporting files are packaged with the container. Yes, there is logic to determine what files should be used, but you have the assurance that the files won’t change unless you create a new version of the container image.

Putting it all together

Okay, we have three containers that represent the three processes that we need to package together so ProxySQL can work as part of our HA stack. Now we need to put it all together in a pod so that Kubernetes can run it on our resources.

In my use case, I’m running this on GCP, meaning that once my containers have been built they are going to need to be pushed up to the Google Container Registry. Once that’s done we can create our workload to run our pod and specify how many pods we want to run.

Getting this up and running can be done with just a few short and simple steps (a rough CLI equivalent is sketched after the list):

  1. Create a Kubernetes cluster if you don’t already have one. This is what provisions the Compute Engine VMs that the pods will run on.
  2. Push your three Docker images to the Google Container Registry. This makes the images available for use by the Kubernetes engine.
  3. Create your Kubernetes workload, which can be done simply via the user interface in the GCP console. All that’s required is selecting the latest version of the three containers that you’ve pushed up to the registry, optionally applying some metadata like an application name, Kubernetes namespace, and labels, then selecting which cluster you want to run the workload on.
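Here is a rough sketch of those steps from the command line. The project ID, cluster name, region, and build-context directories are placeholder assumptions; the image names follow the deployment YAML shown later in this post:

# 1. Create the Kubernetes cluster (provisions the underlying VMs)
gcloud container clusters create proxysql-cluster --region us-central1 --num-nodes 1

# 2. Build and push the three images to the Google Container Registry
docker build -t gcr.io/my-project/pythian-proxysql-proxysql:latest ./proxysql
docker build -t gcr.io/my-project/pythian-proxysql-consul-agent:latest ./consul-agent
docker build -t gcr.io/my-project/pythian-proxysql-consul-template:latest ./consul-template
docker push gcr.io/my-project/pythian-proxysql-proxysql:latest
docker push gcr.io/my-project/pythian-proxysql-consul-agent:latest
docker push gcr.io/my-project/pythian-proxysql-consul-template:latest

# 3. Create the workload, either via the GCP console UI or by applying a deployment manifest
kubectl apply -f pythian-proxysql-deployment.yaml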

Once you click deploy, the containers will spin up and, assuming there are no issues bringing the containers online, you’ll quickly have a functioning ProxySQL pod in Kubernetes that follows these high-level steps:

  1. The pod is started.
  2. The three containers will start. In Kubernetes, pods are atomic: either all of the containers start without error, or the pod does not consider itself started.
  3. The consul-template container will poll consul (agent) and ProxySQL on their respective ports until it’s confirmed that those processes have started and then consul-template will start.
  4. Consul-template will create the new SQL files meant to configure ProxySQL based on the contents of the Consul key/value store (a sketch of such a file follows this list).
  5. Consul-template will run the newly created SQL files against ProxySQL via its admin interface.
  6. The pod is now ready to receive traffic.
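For illustration, a rendered mysql_servers SQL file along the lines described in steps 4 and 5 might look something like this; the hostnames and hostgroup numbers are placeholders:

REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (0, '10.128.0.12', 3306);
REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (10, '10.128.0.21', 3306);
REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (10, '10.128.0.22', 3306);
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;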

The YAML

During the process of creating your workload, or even after the fact, you’ll be able to see the YAML that you’d normally have to create with standard Kubernetes deployments. Let’s have a look at the YAML that was created for my particular deployment.

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2019-10-16T15:41:37Z"
  generation: 64
  labels:
    app: pythian-proxysql
    env: sandbox
  name: pythian-proxysql
  namespace: pythian-proxysql
  resourceVersion: "7516809"
  selfLink: /apis/apps/v1/namespaces/pythian-proxysql/deployments/pythian-proxysql
  uid: 706c6284-f02b-11e9-8f3e-42010a800050
spec:
  minReadySeconds: 10
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: pythian-proxysql
      env: sandbox
  strategy:
    rollingUpdate:
      maxSurge: 100%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: pythian-proxysql
        env: sandbox
    spec:
      containers:
      - image: gcr.io/pythian-proxysql/pythian-proxysql-proxysql@sha256:3ba95101eb7a5aac58523e4c6489956869865452d1cbdbd32b4186a44f2a4500
        imagePullPolicy: IfNotPresent
        name: pythian-proxysql-proxysql-sha256
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - image: gcr.io/pythian-proxysql/pythian-proxysql-consul-agent@sha256:7c66fa5e630c4a0d70d662ec8e9d988c05bd471b43323a47e240294fc00a153d
        imagePullPolicy: IfNotPresent
        name: pythian-proxysql-consul-agent-sha256
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - image: gcr.io/pythian-proxysql/pythian-proxysql-consul-template@sha256:1e70f4b96614dfd865641bf75784d895a794775a6c51ce6b368387591f3f1918
        imagePullPolicy: IfNotPresent
        name: pythian-proxysql-consul-template-sha256
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 2
  collisionCount: 1
  conditions:
  - lastTransitionTime: "2019-10-16T15:41:37Z"
    lastUpdateTime: "2019-11-11T15:56:55Z"
    message: ReplicaSet "pythian-proxysql-8589fdbf54" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2019-11-11T20:41:31Z"
    lastUpdateTime: "2019-11-11T20:41:31Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 64
  readyReplicas: 2
  replicas: 2
  updatedReplicas: 2

The first thing I have to point out is that this is a LOT of YAML that we didn’t have to create ourselves, given that it was all handled by the Google Kubernetes Engine. This is a huge part of what eases the process and allows us to get our solution working so quickly.

However, despite the fact that we have a lot of YAML created for us, there are still some occasions where we may need to modify this manually, such as working with Kubernetes Container Lifecycle Hooks, or working with requests or limits for hardware resources for individual containers in our pod.

How do I access my ProxySQL instance?

One consideration for Kubernetes is that when pods are started and stopped they will get an ephemeral IP address, so you don’t want to have your applications connect to your pods directly. Kubernetes has a feature called a “service” that allows your pods to be exposed via a consistent network interface. This service can also handle load balancing, which is what I’m planning on using with my Kubernetes deployment. Adding a service to your GKE workload is very simple and can be added with a few clicks.
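A sketch of what such a service might look like in YAML form is below. The service name, namespace, and type are assumptions for illustration; the selector matches the deployment labels and the ports match the ProxySQL write/read ports discussed earlier:

apiVersion: v1
kind: Service
metadata:
  name: pythian-proxysql
  namespace: pythian-proxysql
spec:
  type: LoadBalancer
  selector:
    app: pythian-proxysql
  ports:
  - name: mysql-write
    port: 6033
    targetPort: 6033
  - name: mysql-read
    port: 6034
    targetPort: 6034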

Autoscaling

As noted earlier in this post, before the implementation of Kubernetes for this solution, it was recommended to use cloud compute autoscaling groups in order to handle fluctuations in traffic. We’re going to want to include the same strategy with Kubernetes to ensure we have enough pods available to handle traffic demand. Including autoscaling in your workload is also fairly simple and can be done via the console UI.
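If you prefer to do it outside the UI, the same thing can be expressed with a single kubectl command; the pod counts and CPU threshold below are illustrative:

kubectl autoscale deployment pythian-proxysql --namespace pythian-proxysql --min=2 --max=10 --cpu-percent=70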

One important thing to note about scaling with Kubernetes is the time it takes to scale up and down. In the intro section of this post, I noted the process of adding and removing nodes from an autoscaling group and how that can take minutes to achieve depending on how quickly your cloud provider can stand up a new instance and the complexity of your configuration. With Kubernetes, I’ve seen my pods scale up in as little as three seconds and scale down in less than one second. This is part of what makes this solution so powerful.

Considerations for Connections During Scale Up and Down

One important thing to note is that, with pods being added to and removed from the workload, your connections to ProxySQL via the exposed service can be interrupted. It’s noted in the autoscaling documentation that this can cause disruption, and your application needs to handle it in much the same way it would for a cloud compute autoscaling group. You’ll want to ensure that your application has retry-on-database-failure logic built in before incorporating Kubernetes autoscaling (or any autoscaling, for that matter) as part of your data platform.

Considerations for MySQL users in ProxySQL

There are three tables that are replicated when working with ProxySQL cluster: mysql_servers, mysql_query_rules, and mysql_users – meaning that when a change to any of these tables is made on one of the nodes in the cluster, it will be replicated to all the other nodes. 

We really don’t need to worry about this when working with mysql_servers given that all nodes will get their mysql_server information from Consul via consul-template, so I’ve disabled this clustering feature.

With my particular use case I don’t need to worry about mysql_query_rules either because, as noted earlier in this post, my traffic is being routed based on the port that traffic is being sent to. The rules for this are simple and should not change so I have it in the configuration file and I have disabled replicating this table, as well.

The last table to consider is mysql_users and this is where things get interesting. Remember that with Kubernetes it’s possible to have persistent storage, but we really want our containers to be as stateless as possible, so if we were to follow the Docker and Kubernetes philosophy as closely as possible we wouldn’t want to have our data persist. This falls into the whole cattle vs pets discussion when working with containers, but I digress.

Let’s assume we’ve opted NOT to persist our ProxySQL data, typically stored in SQLite, and we lose all of the pods in our Kubernetes cluster. Unlikely, but we always need to be ready for disaster. When the first pod comes up, it starts with a blank slate. That isn’t a problem for mysql_servers, which it will get from Consul via consul-template, or for mysql_query_rules, which comes from the config file. However, there is no source of truth for mysql_users data, so all of that data would be lost.

In this case, we need to incorporate some source of truth for the ProxySQL mysql_users table. It’s possible to use a cloud compute VM with ProxySQL installed as an ever-present member of the cluster that seeds data for newly joining pods, but that breaks our construct of working specifically with containers. Plus, if you have a multi-cluster configuration like I do, with one cluster in each region, then you need one ProxySQL “master host” in each region, which is a bit of a waste considering it’s just acting as a source of truth for mysql_users, which will likely be the same across all clusters.

My solution, in this case, is to leverage the source of truth that we already have in place: Consul. If it’s already acting as a source of truth for mysql_servers, there’s no reason why it can’t act as a source of truth for this as well.

All I need to do is have my MySQL users and password hashes (always stay secure) in Consul and use consul-template to create these on new ProxySQL hosts, or change them as keys and values change. You may have noticed this in the entrypoint script in my consul-template container.
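To illustrate the idea, a sketch of what the mysql_users template could look like is below. The Consul key layout (usernames as keys under a prefix, password hashes as values), the prefix itself, and the column choices are all assumptions for illustration:

{{ range tree "proxysql/mysql_users" }}
REPLACE INTO mysql_users (username, password, default_hostgroup, active)
VALUES ('{{ .Key }}', '{{ .Value }}', 0, 1);
{{ end }}
LOAD MYSQL USERS TO RUNTIME;
SAVE MYSQL USERS TO DISK;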

To Cluster or Not To Cluster?

I mentioned before that ProxySQL cluster handles the replication of three tables: mysql_users, mysql_query_rules, and mysql_servers. Considering that all three of these tables now have their own source of truth, we really don’t need to worry about replicating this data. As changes are reported to Consul, every ProxySQL pod will be updated, since each pod includes its own consul (agent) and consul-template containers.

With this in mind, I’ve opted to rely on my constructed sources of truth and reduce solution complexity by removing ProxySQL clustering; however, this is going to vary from use case to use case.

Conclusion

The solution implemented in this use case has required the inclusion of a lot of new technologies that MySQL DBAs may or may not have familiarity with: ProxySQL, Orchestrator, Consul, GTIDs, etc. We’ve made this solution a little more complex by adding Docker and Kubernetes to the stack, but I personally believe this complexity is worth it considering the higher degree of idempotency that is built into the solution, the lack of need for ProxySQL clustering, and the speed in which scale up and scale down occurs.

One last consideration is the simple need to learn how to incorporate containers into your stack. This is not my first blog post on container philosophy and implementation, and I believe that containers are going to become a greater part of the landscape for all of us – even us, the database professionals with our highly stateful technological challenges. If you have not already started educating yourself on these technologies, I would highly encourage you to do so in order to better prepare yourself for the shift from “Database Administrator” to “Database Reliability Engineer”.


