Scaling ProxySQL rapidly in Kubernetes

Posted in: Cloud, DevOps, Google Cloud Platform, MySQL, Open Source, Renew Refresh Republish, Technical Track

Editor’s Note: Because our bloggers have lots of useful tips, every now and then we update and bring forward a popular post from the past. Today’s post was originally published on November 26, 2019.

It’s not uncommon these days for us to use a high availability stack for MySQL consisting of Orchestrator, Consul and ProxySQL. You can read more details about this stack by reading Matthias Crauwels’ blog post How to Autoscale ProxySQL in the Cloud as well as Ivan Groenwold’s post on MySQL High Availability With ProxySQL, Consul and Orchestrator. The high-level concept is simply that Orchestrator will monitor the state of the MySQL replication topology and report changes to Consul which in turn can update ProxySQL hosts using a tool called consul-template.

Until now we’ve typically implemented the ProxySQL portion of this stack using an autoscaling group of sorts due to the high levels of CPU usage that can be associated with ProxySQL. It’s better to be able to scale up and down as traffic increases and decreases. This ensures you’re not paying for resources you don’t need. This, however, comes with a few disadvantages. The first is the amount of time it takes to scale up. If you’re using an autoscaling group and it launches a new instance it will need to take the following steps:

  1. There will be a request to your cloud service provider for a new VM instance.
  2. Once the instance is up and running as part of the group, it will need to install ProxySQL along with supporting packages such as consul (agent) and consul-template.
  3. Once the packages are installed, they’ll need to be configured to work with the consul server nodes as well as the ProxySQL nodes that are participating in the ProxySQL cluster.
  4. The new ProxySQL host will announce to Consul that it’s available, which in turn will update all the other participating nodes in the ProxySQL cluster.

This can take time. Provisioning a new VM instance usually happens fairly quickly — normally within a couple of minutes — but sometimes there can be unexpected delays. You can speed up package installation by using a custom machine image, but since there’s an operational overhead with keeping images up to date with the latest versions of the installed packages, it may be easier to do this using a script that always installs the latest versions. All in all, you can expect a scale-up to take more than a minute.

The next issue is how deterministic this solution is. If you’re not using a custom machine image, you’ll need to pull down your config and template files from somewhere — most likely a storage bucket — and there’s a chance those files could be overwritten. This means the next time the autoscaler launches an instance it may not necessarily have the same configuration as the rest of the hosts participating in the ProxySQL cluster.

We can take this already impressive stack and go a step further using Docker containers and Kubernetes.

For those unfamiliar with containerization; a container is similar to a virtual machine snapshot but isn’t a full snapshot that would include the OS. Instead, it contains just the binary that’s required to run your process. You create this image using a Dockerfile; typically starting from a specified Linux distribution, then using verbs like RUN, COPY and USER to specify what should be included in your container “image.” Once this image is constructed, it can be centrally located in a repository and made available for usage by machines using a containerization platform like Docker.

This method of deployment has become more and more popular in recent years due to the fact that containers are lightweight, and you know that if the container works on one system it will work exactly the same way when it’s moved to a different system. This reduces common issues like dependencies and configuration variations from host to host.

Given that we want to be able to scale up and down, it’s safe to say we’re going to want to run more than one container. That’s where Kubernetes comes into play. Kubernetes is a container management platform that operates on an array of hosts (virtual or physical) and distributes containers on them as specified by your configuration; typically a YAML-format Kubernetes deployment file. If you’re using Google Kubernetes Engine (GKE) on Google Cloud Platform (GCP), this is even easier as the vast majority of the work in creating a Kubernetes deployment (referred to as a ‘workload’ in GKE) YAML is handled for you via a simple UI within the GCP Console.

If you want to learn more about Docker or Kubernetes, I highly recommend Nigel Poulton’s video content on Pluralsight. For now, let’s stick to learning about ProxySQL on this platform.

If we want ProxySQL to run in Kubernetes and operate with our existing stack with Consul and Orchestrator, we’re going to need to keep best practices in mind for our containers.

  1. Each container should run only a single process. We know we’re working with ProxySQL, consul (agent), and consul-template, so these will all need to be in their own containers.
  2. The primary process running in each container should run as PID 1.
  3. The primary process running in each container should not run as root.
  4. Log output from the primary process in the container should be sent to STDOUT so that it can be collected by Docker logs.
  5. Containers should be as deterministic as possible — meaning they should run the same (or at least as much as possible) regardless of what environment they are deployed in.

The first thing in the list above that popped out is the need to have ProxySQL, consul-template and consul (agent) isolated within their own containers. These are going to need to work together given that consul (agent) is acting as our communication conduit back to consul (server) hosts and consul-template is what updates ProxySQL based on changes to keys and values in Consul. So how can they work together if they’re in separate containers?

Kubernetes provides the solution. When you’re thinking about Docker, the smallest computational unit is the container; however, when you’re thinking about Kubernetes, the smallest computational unit is the pod which can contain one or more containers. Any containers operating within the same pod can communicate with one another using localhost ports. So in this case, assuming you’re using default ports, the consul-template container can communicate to the consul (agent) container using localhost port 8500, and it can communicate to the ProxySQL container using port 6032 given that these three containers will be working together in the same pod.

So let’s start looking at some code, starting with the simplest container and working our way to the most complex.

Consul (Agent) Container

Below is a generic version of the Dockerfile I’m using for consul (agent). The objective is to install Consul then instruct it to connect as an agent to the Consul cluster comprised of the consul (server) nodes.

FROM centos:7
RUN yum install -q -y unzip wget && \
  yum clean all
RUN groupadd consul && \
  useradd -r -g consul -d /var/lib/consul consul
RUN mkdir /opt/consul && \
  mkdir /etc/consul && \
  mkdir /var/log/consul && \
  mkdir /var/lib/consul && \
  chown -R consul:consul /opt/consul && \
  chown -R consul:consul /etc/consul && \
  chown -R consul:consul /var/log/consul && \
  chown -R consul:consul /var/lib/consul
RUN wget -q -O /opt/consul/consul.zip https://releases.hashicorp.com/consul/1.6.1/consul_1.6.1_linux_amd64.zip && \
  unzip /opt/consul/consul.zip -d /opt/consul/ && \
  rm -f /opt/consul/consul.zip && \
  ln -s /opt/consul/consul /usr/local/bin/consul
COPY supportfiles/consul.conf.json /etc/consul/
USER consul
ENTRYPOINT ["/usr/local/bin/consul", "agent", "--config-file=/etc/consul/consul.conf.json"]

Simply put, the code above follows these instructions:

  1. Start from CentOS 7. This is a personal preference of mine. There are probably more lightweight distributions that can be considered, such as Alpine as recommended by Google, but I’m not the best OS nerd out there so I wanted to stick with what I know.
  2. Install our dependencies, which in this case are unzip and wget.
  3. Create our consul user, group and directory structure.
  4. Install consul.
  5. Copy over the consul config file from the host where the Docker build is being performed.
  6. Switch to the consul user.
  7. Start consul (agent).

Now let’s check the code and see if it matches best practices.

  • Container runs a single process:
    • The ENTRYPOINT runs Consul directly, meaning nothing else is being run. Keep in mind that ENTRYPOINT specifies what should be run when the container starts. This means when the container starts it won’t have to install anything because the packages come with the image as designated by the Dockerfile, but we still need to launch Consul when the container starts.
  • Process should be PID 1:
    • Any process run by ENTRYPOINT will run as PID 1.
  • Process should not be run as root:
    • We switched to the Consul user prior to starting the ENTRYPOINT.
  • Log output should go to STDOUT:
    • If you run Consul using the command noted in the ENTRYPOINT, you’ll see log output goes to STDOUT.
  • Should be as deterministic as possible:
    • We’ve copied the configuration file into the container, meaning the container doesn’t have to get support files from anywhere else before Consul starts. The only way the nature of Consul will change is if we recreate the container image with a new configuration file.

There’s really nothing special about the Consul configuration file that gets copied into the container. You can see an example of this by checking the aforementioned blog posts by Matthias or Ivan for this particular HA stack.

ProxySQL Container

Below is a generic version of the Dockerfile I’m using for ProxySQL. The objective is to install ProxySQL and make it available to receive traffic requests on 6033 for write traffic, 6034 for read traffic and 6032 for the admin console which is how consul-template will interface with ProxySQL.

FROM centos:7
RUN groupadd proxysql && \
  useradd -r -g proxysql proxysql
RUN yum install -q -y https://github.com/sysown/proxysql/releases/download/v2.0.6/proxysql-2.0.6-1-centos67.x86_64.rpm mysql curl && \
  yum clean all
COPY supportfiles/* /opt/supportfiles/
COPY startstop/* /opt/
RUN chmod +x /opt/entrypoint.sh
RUN chown proxysql:proxysql /etc/proxysql.cnf
USER proxysql
ENTRYPOINT ["/opt/entrypoint.sh"]

Simply put, the code above follows these instructions:

  1. Start from CentOS 7.
  2. Create our ProxySQL user and group.
  3. Install ProxySQL and dependencies, which in this case is curl, which will be used to poll the GCP API in order to determine what region the ProxySQL cluster is in. We’ll cover this in more detail below.
  4. Move our configuration files and ENTRYPOINT script to the container.
  5. Make sure the ProxySQL config file is readable by ProxySQL. 
  6. Switch to the ProxySQL user.
  7. Start ProxySQL via the ENTRYPOINT script provided with the container.

In my use case, I have multiple ProxySQL clusters — one per GCP region. They have to be logically grouped together to ensure they route read traffic to replicas within the local region but send traffic to the master regardless of what region it’s in. In my solution, a hostgroup is noted for read replicas in each region, so my mysql_query_rules table needs to be configured accordingly. In my solution, the MySQL hosts will be added to different host groups, but the routing to each hostgroup will remain consistent. Given that it’s highly unlikely to change, I have mysql_query_rules configured in the configuration file. This means I need to select the correct configuration file based on my region before starting ProxySQL, and this is where my ENTRYPOINT script comes into play. Let’s have a look at a simplified and more generic version of my code:

#!/bin/bash
dataCenter=$(curl https://metadata.google.internal/computeMetadata/v1/instance/zone -H "Metadata-Flavor: Google" | awk -F "/" '{print $NF}' | cut -d- -f1,2)
...
case $dataCenter in
  us-central1)
    cp -f /opt/supportfiles/proxysql-us-central1.cnf /etc/proxysql.cnf
    ;;
  us-east1)
    cp -f /opt/supportfiles/proxysql-us-east1.cnf /etc/proxysql.cnf
    ;;
esac
...
exec proxysql -c /etc/proxysql.cnf -f -D /var/lib/proxysql

The script starts by polling the GCP API to determine what region the container has been launched in. Based on the result, it will copy the correct config file to the appropriate location, then start ProxySQL.

Let’s see how the combination of the Dockerfile and the ENTRYPOINT script allows us to meet best practices.

  • Container runs a single process:
    • ENTRYPOINT calls the entrypoint.sh script, which does some conditional logic based on the regional location of the container, then ends by running ProxySQL. This means at the end of the process ProxySQL will be the only process running.
  • Process should be PID 1:
    • The command “exec” at the end of the ENTRYPOINT script will start ProxySQL as PID 1.
  • Process should not be run as root:
    • We switched to the ProxySQL user prior to starting the ENTRYPOINT.
  • Log output should go to STDOUT:
    • If you run ProxySQL using the command noted at the end of the ENTRYPOINT script you’ll see that log output goes to STDOUT.
  • Should be as deterministic as possible:
    • We’ve copied the potential configuration files into the container. Unlike Consul, there are multiple configuration files and we need to determine which will be used based on the region the container lives in, but the configuration files themselves will not change unless the container image itself is updated. This ensures that all containers running within the same region will behave the same.

Consul-template container

Below is a generic version of the Dockerfile I’m using for consul-template. The objective is to install consul-template and have it act as the bridge between Consul via the consul (agent) container and ProxySQL; updating ProxySQL as needed when keys and values change in Consul.

FROM centos:7
RUN yum install -q -y unzip wget mysql nmap-ncat curl && \
  yum clean all
RUN groupadd consul && \
  useradd -r -g consul -d /var/lib/consul consul
RUN mkdir /opt/consul-template && \
  mkdir /etc/consul-template && \
  mkdir /etc/consul-template/templates && \
  mkdir /etc/consul-template/config && \
  mkdir /opt/supportfiles && \
  mkdir /var/log/consul/ && \
  chown -R consul:consul /etc/consul-template && \
  chown -R consul:consul /etc/consul-template/templates && \
  chown -R consul:consul /etc/consul-template/config && \
  chown -R consul:consul /var/log/consul
RUN wget -q -O /opt/consul-template/consul-template.zip https://releases.hashicorp.com/consul-template/0.22.0/consul-template_0.22.0_linux_amd64.zip && \
  unzip /opt/consul-template/consul-template.zip -d /opt/consul-template/ && \
  rm -f /opt/consul-template/consul-template.zip && \
  ln -s /opt/consul-template/consul-template /usr/local/bin/consul-template
RUN chown -R consul:consul /opt/consul-template
COPY supportfiles/* /opt/supportfiles/
COPY startstop/* /opt/
RUN chmod +x /opt/entrypoint.sh
USER consul
ENTRYPOINT ["/opt/entrypoint.sh"]

Simply put, the code above follows these instructions:

  1. Start from CentOS 7.
  2. Install our dependencies which are unzip, wget, mysql (client), nmap-ncat and curl.
  3. Create our Consul user and group.
  4. Create the consul-template directory structure.
  5. Download and install consul-template.
  6. Copy the configuration file, template files and ENTRYPOINT script to the container.
  7. Make the ENTRYPOINT script executable.
  8. Switch to the Consul user.
  9. Start consul-template via the ENTRYPOINT script that’s provided with the container.

Much like our ProxySQL container, we really need to look at the ENTRYPOINT here in order to get the whole story. Remember, this is multi-region so there is additional logic that has to be considered when working with template files.

#!/bin/bash
dataCenter=$(curl https://metadata.google.internal/computeMetadata/v1/instance/zone -H "Metadata-Flavor: Google" | awk -F "/" '{print $NF}' | cut -d- -f1,2)
...
cp /opt/supportfiles/consul-template-config /etc/consul-template/config/consul-template.conf.json
case $dataCenter in
  us-central1)
    cp /opt/supportfiles/template-mysql-servers-us-central1 /etc/consul-template/templates/mysql_servers.tpl
    ;;
  us-east1)
    cp /opt/supportfiles/template-mysql-servers-us-east1 /etc/consul-template/templates/mysql_servers.tpl
    ;;
esac
cp /opt/supportfiles/template-mysql-users /etc/consul-template/templates/mysql_users.tpl
### Ensure that proxysql has started
while ! nc -z localhost 6032; do
  sleep 1;
done
### Ensure that consul agent has started
while ! nc -z localhost 8500; do
  sleep 1;
done
exec /usr/local/bin/consul-template --config=/etc/consul-template/config/consul-template.conf.json

This code is very similar to the ENTRYPOINT file used for ProxySQL in the sense that it checks for the region the container is in, then moves configuration and template files into the appropriate location. However, there is some additional logic here that checks to ensure that ProxySQL is up and listening on 6032 and that consul (agent) is up and listening on port 8500. The reason for this is the consul-template needs to be able to communicate with both these hosts. You really have no assurance as to what container is going to load in what order in a pod, so to avoid excessive errors in the consul-template log, I have it wait until it knows that its dependent services are running.

Let’s go through our best practices checklist one more time against our consul-template container code.

  • Container runs a single process:
    • ENTRYPOINT calls the entrypoint.sh script, which does some conditional logic based on the regional location of the container, then ends by running consul-template. This means at the end of the process consul-template will be the only process running.
  • Process should be PID 1:
    • The command “exec” at the end of the ENTRYPOINT script will start consul-template as PID 1.
  • Process should not be run as root:
    • We switched to the consul user prior to starting the ENTRYPOINT.
  • Log output should go to STDOUT:
    • If you run Consul using the command noted at the end of the ENTRYPOINT script, you’ll see log output goes to STDOUT.
  • Should be as deterministic as possible:
    • Just like ProxySQL and consul (agent), all the supporting files are packaged with the container. Yes, there is logic to determine what files should be used, but you have the assurance that the files won’t change unless you create a new version of the container image.

Putting it all together

Okay, we have three containers representing the three processes we need to package together so ProxySQL can work as part of our HA stack. Now we need to put it all together in a pod so Kubernetes can have it run against our resources.

In my use case, I’m running this on GCP, meaning once my containers have been built they’re going to need to be pushed up to the Google Container Registry. After this we can create our workload to run our pod and specify how many pods we want to run.

Getting this up and running can be done with just a few short and simple steps:

  1. Create a Kubernetes cluster if you don’t already have one. This is what provisions the Cloud Compute VMs the pods will run on.
  2. Push your three Docker images to the Google container registry. This makes the images available for use by the Kubernetes engine.
  3. Create your Kubernetes workload, which can be done simply via the user interface in the GCP console. All that’s required is selecting the latest version of the three containers you’ve pushed up to the registry, optionally applying some metadata like an application name, Kubernetes namespace, and labels, then selecting which cluster you want to run the workload on.

Once you click deploy, the containers will spin up and, assuming there are no issues bringing the containers online, you’ll quickly have a functioning ProxySQL pod in Kubernetes that follows these high-level steps:

  1. The pod is started.
  2. The three containers will start. In Kubernetes, pods are fully atomic. All the containers start without error or the pod will not consider itself started.
  3. The consul-template container will poll consul (agent) and ProxySQL on their respective ports until it’s confirmed those processes have started, then consul-template will start.
  4. Consul-template will create the new SQL files meant to configure ProxySQL based on the contents of the Consul key / value store.
  5. Consul-template will run the newly created SQL files against ProxySQL via its admin interface.
  6. The pod is now ready to receive traffic.

The YAML

During the process of creating your workload, or even after the fact, you’ll be able to see the YAML you’d normally have to create with standard Kubernetes deployments. Let’s have a look at the YAML that was created for my particular deployment.

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2019-10-16T15:41:37Z"
  generation: 64
  labels:
    app: pythian-proxysql
    env: sandbox
  name: pythian-proxysql
  namespace: pythian-proxysql
  resourceVersion: "7516809"
  selfLink: /apis/apps/v1/namespaces/pythian-proxysql/deployments/pythian-proxysql
  uid: 706c6284-f02b-11e9-8f3e-42010a800050
spec:
  minReadySeconds: 10
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: pythian-proxysql
      env: sandbox
  strategy:
    rollingUpdate:
      maxSurge: 100%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: pythian-proxysql
        env: sandbox
    spec:
      containers:
      - image: gcr.io/pythian-proxysql/pythian-proxysql-proxysql@sha256:3ba95101eb7a5aac58523e4c6489956869865452d1cbdbd32b4186a44f2a4500
        imagePullPolicy: IfNotPresent
        name: pythian-proxysql-proxysql-sha256
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - image: gcr.io/pythian-proxysql/pythian-proxysql-consul-agent@sha256:7c66fa5e630c4a0d70d662ec8e9d988c05bd471b43323a47e240294fc00a153d
        imagePullPolicy: IfNotPresent
        name: pythian-proxysql-consul-agent-sha256
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      - image: gcr.io/pythian-proxysql/pythian-proxysql-consul-template@sha256:1e70f4b96614dfd865641bf75784d895a794775a6c51ce6b368387591f3f1918
        imagePullPolicy: IfNotPresent
        name: pythian-proxysql-consul-template-sha256
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 2
  collisionCount: 1
  conditions:
  - lastTransitionTime: "2019-10-16T15:41:37Z"
    lastUpdateTime: "2019-11-11T15:56:55Z"
    message: ReplicaSet "pythian-proxysql-8589fdbf54" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2019-11-11T20:41:31Z"
    lastUpdateTime: "2019-11-11T20:41:31Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 64
  readyReplicas: 2
  replicas: 2
  updatedReplicas: 2

The first thing I have to point out is this is a LOT of YAML that we didn’t have to create given the Google Kubernetes Engine handled all of it. This is a huge part of easing the process which allows us to get our solution working so quickly.

However, despite the fact that we have a lot of YAML created for us, there are still some occasions where we may need to modify this manually, such as working with Kubernetes Container Lifecycle Hooks, or working with requests or limits for hardware resources for individual containers in our pod.

How do I access my ProxySQL instance?

One consideration for Kubernetes is when pods are started and stopped they’ll get an ephemeral IP address, so you don’t want to have your applications connect to your pods directly. Kubernetes has a feature called a “service” that allows your pods to be exposed via a consistent network interface. This service can also handle load balancing, which is what I’m planning on using with my Kubernetes deployment. Adding a service to your GKE workload is very simple and can be added with a few clicks.

Autoscaling

As noted earlier in this post, before the implementation of Kubernetes for this solution, it was recommended to use cloud compute autoscaling groups to handle fluctuations in traffic. We want to include the same strategy with Kubernetes to ensure we have enough pods available to handle traffic demand. Including autoscaling in your workload is also fairly simple and can be done via the console UI.

One important thing to note about scaling with Kubernetes is the time it takes to scale up and down. In the intro section of this post, I noted the process of adding and removing nodes from an autoscaling group and how that can take minutes to achieve depending on how quickly your cloud provider can stand up a new instance and the complexity of your configuration. With Kubernetes, I’ve seen my pods scale up in as little as three seconds and scale down in less than one second. This is part of what makes this solution so powerful.

Considerations for Connections During Scale-Up and Down

One important thing to note is, as the workload gains and loses pods, your connections to ProxySQL via the exposed service can be interrupted. The autoscaling documentation notes that this can cause disruption and your application needs to be able to handle this in much the same way it would have to for a cloud compute autoscaling group. You’ll want to ensure that your application has retry on database failure logic built in before incorporating Kubernetes autoscaling (or any autoscaling for that matter) as part of your data platform.

Considerations for MySQL users in ProxySQL

Three tables are replicated when working with ProxySQL cluster: mysql_servers, mysql_query_rules and mysql_users — meaning when a change to any of these tables is made on one of the nodes in the cluster, it will be replicated to all the other nodes. 

We really don’t need to worry about this when working with mysql_servers given that all nodes will get their mysql_server information from Consul via consul-template, so I’ve disabled this clustering feature.

With my particular use case I don’t need to worry about mysql_query_rules either because, as noted earlier in this post, my traffic is being routed based on the port that traffic is being sent to. The rules for this are simple, and should not change, so I have it in the configuration file and I have disabled replicating this table, as well.

The last table to consider is mysql_users and this is where things get interesting. Remember with Kubernetes it’s possible to have persistent storage, but we really want our containers to be as stateless as possible, so if we were to follow the Docker and Kubernetes philosophy as closely as possible we wouldn’t want to have our data persist. This falls into the whole cattle vs pets discussion when working with containers, but I digress.

Let’s assume we’ve opted NOT to persist our ProxySQL data, typically stored in SQLite, and we lose all the pods in our Kubernetes cluster. It’s unlikely, but we always need to be ready for disaster. When the first pod comes up, it’s starting with a blank slate and this isn’t a problem considering it will get its initial set of mysql_server data from Consul via consul-template and its mysql_query_rules data from the config file. However, there is no source of truth for mysql_users data, so all that data would be lost.

In this case, we need to incorporate some source of truth for the ProxySQL mysql_users table. It’s possible to use a cloud compute VM with ProxySQL installed to be an ever-present member of the cluster which could seed data for new joining pods. However, that breaks our construct of working specifically with containers. Plus, if you have a multi-cluster configuration like I do, where there’s one cluster in each region, you need one ProxySQL “master host” in each region. This is a bit of a waste considering it’s just acting as a source of truth for mysql_users, which likely will be the same across all clusters.

My solution, in this case, is to leverage the source of truth we already have in place: Consul. If it’s already acting as a source of truth for mysql_servers, there’s no reason it can’t act as a source of truth for this as well.

All I need is to have my MySQL users and password hashes (always stay secure) in Consul. I can then use consul-template to create these on new ProxySQL host, or change them as keys and values change. You may have noticed this in the ENTRYPOINT script in my consul-template container.

To Cluster or Not To Cluster?

I mentioned before that ProxySQL cluster handles the replication of three tables: mysql_users, mysql_query_rules and mysql_servers. Considering all three of these tables now have their own source of truth, we really don’t need to worry about replicating this data. As Consul receives change reports, it will update all the ProxySQL pods considering that all of them have consul (agent) and consul-template containers as part of the pod.

With this in mind, I’ve opted to rely on my constructed sources of truth and reduce solution complexity by removing ProxySQL clustering; however, this is going to vary from use case to use case.

Conclusion

The solution implemented in this use case has required the inclusion of a lot of new technologies that MySQL DBAs may or may not have familiarity with: ProxySQL, Orchestrator, Consul, GTIDs, etc. We’ve made this solution a little more complex by adding Docker and Kubernetes to the stack, but I personally believe this complexity is worth it considering the higher degree of idempotency that is built into the solution, the lack of need for ProxySQL clustering and the speed in which scale-up and scale-down occurs.

One last consideration is the simple need for learning how to incorporate containers into your stack. This is not my first blog post on container philosophy and implementation. I believe containers are going to become a greater part of the landscape for all of us — even us, the database professionals with our highly stateful technological challenges. If you haven’t already started educating yourself on these technologies, I would highly encourage you to do so to better prepare yourself for the shift from “Database Administrator” to “Database Reliability Engineer.”

email

Interested in working with Peter? Schedule a tech call.

About the Author

Internal Principal Consultant
Peter Sylvester is one of the Internal Principal Consultants in the Open Source Database Consulting Group at Pythian. He has been with Pythian since January of 2015 and has been working with MySQL since 2008. Apart from work, Peter is an avid ice hockey player to stay in keeping with the stereotypical Canadian lifestyle, playing typically no less than twice a week!

2 Comments. Leave new

johnsonsophia016
November 27, 2019 7:31 am

thank you for the article.

Reply
Peter Sylvester
November 27, 2019 8:09 am

You are quite welcome!

Reply

Leave a Reply

Your email address will not be published. Required fields are marked *