This Log Buffer Edition covers Cloud, Oracle, MySQL and PostgreSQL.
New Amazon EC2 spot pricing model: simplified purchasing without bidding and fewer interruptions
The confluence of cloud, data and AI is driving unprecedented change. The ability to utilize data and turn it into breakthrough insights is foundational to innovation today
Announcing general availability of Azure database services for MySQL and PostgreSQL
Hash Join overflow to disk summary
Dbvisit standby upgrade
ADF on Docker – Java memory limit tuning for JVM
Oracle ORA-02287: sequence number not allowed here
Oracle Integration Cloud: customer managed & patching
Safely purging binary logs from master
ANALYZE TABLE is no longer a blocking operation
MariaDB Server 10.2.14 and 10.1.32 now available
Comparing database proxy failover times – ProxySQL, MaxScale and HAProxy
No one has ever imagined dbForge would do it to Google Sheets!
Thanks to Jeremy Schneider.
As usual, I’ll start off with what seemed (to me) like the biggest headlines. Let’s warm up with a quick article from our friend Matt Asay at Adobe, who’s had more than one attention-grabbing PostgreSQL article. He stole the bacon again on March 20 with an article that got lots of retweets… titled “How PostgreSQL just might replace your Oracle database” which largely explored SalesForce’s internal investment in PostgreSQL. There’s been a larger theme this month that PostgreSQL is “having a moment”… (is it a moment or a movement?)
By the way, did you see Matt‘s article last month with the data-driven ranked list of corporate contributors to Open Source software? It was published on Feb 7 and I had missed it until today – nice!
Anyway, back to headlines. This next article is interesting for a different reason than most people realized. Evidently, the PostgreSQL database superuser has more access the underlying operating system than many people assumed. But YOU knew that already, right? :) Most of the articles I saw were calling this Scarlett Johansson thing a “hack” of PostgreSQL. But let’s be honest… scanning the internet for open ports with obvious guessable admin passwords isn’t really hacking… right?
Imperva broke the story, and their article itself was actually quite interesting. (I just didn’t like most of the press coverage.) It walked through some of the more clever ways a superuser can interact with the operating system – outside of what you might have assumed PostgreSQL was constrained to – while obfuscating itself to make detection harder.
But honestly, none of these were the BIG news. The biggest splash was from Microsoft. On Mar 20, they announced that PostgreSQL on Azure is officially GA! I’m still seeing retweets today! And maybe this explains some of the changes they announced last month.
Yay! That’s a big announcement. I agree with what Craig Kerstiens wrote on March 20: this will mean more choices and more quality for PostgreSQL and for its users.
But wait… as if that weren’t enough, the very next day, Amazon announced the availability of Aurora with PostgreSQL 9.6.6 Compatibility! And I actually think this is a bigger deal than many people realize just yet.
Why do I say the Aurora announcement is a big deal? (Besides that I currently work for Amazon, often with the team that’s building it. ) Let’s take a look at a seemingly unrelated blog post. We are two weeks away from one of the biggest PostgreSQL conferences in the United States: PostgresConf US in Jersey City. On Mar 26, Joshua Drake published an interview with the head of data at Pivotal talking about the extensive Greenplum content at PostgresConf US this year.
Jacque mentions that Greenplum 5.0 is based on PostgreSQL 8.3 and the open-source version of Greenplum database is based on PostgreSQL 9.0. The next major release is aiming to be compatible with PostgreSQL 9.4 …still three major releases behind the current PostgreSQL community release.
Amazon Aurora with PostgreSQL Compatibility only became GA last year, and many people are still working out exactly how to position it in the market. I think it’s noteworthy that we have now shipped a new GA release incorporating a new community PostgreSQL minor. The timeline for this release wasn’t as fast as Citus or 2ndQuadrant or EDB, but it’s also nowhere near Greenplum either. This is something new, and (IMO) worth watching to see what happens next!
Now you might ask, why is it such a big deal to keep up with the latest community releases of PostgreSQL? And actually that’s a fair question. Couldn’t the value added by greenplum and other postgres-compatible databases (or forks) give them enough momentum in their own right that they can go their own way? Develop their own features, manage their own bug fixes, etc?
It seems like this has been the predominant business model to date. The thing that will make this increasingly difficult moving forward is the surprising velocity of new features arriving in community PostgreSQL. Lets look at a few recent articles.
Well, not a new feature per se, but on Mar 27 Stephen O’Grady from RedMonk wrote a fascinating analysis about how support from major cloud vendors over the last 10 years has completely changed the picture for Enterprise Adoption of Open Source databases including PostgreSQL.
I usually like to wait until the release to talk about new features… but since we’re talking about feature velocity, I’ll point out just a few exciting updates to the PostgreSQL 11 code base. Personally I’m most excited about LLVM integration. This has already shown up to 20% performance improvement on some queries by JIT compiling expression evaluation into native machine code. An announcement from Michael Larabel on Mar 22:
I’ve been surprised by the strong and sustained development effort behind community PostgreSQL. As a result, it’s not surprising that we continue to see a steady stream of articles about PostgreSQL 10 new features and upgrades.
On March 21, Ashutosh Bapat from EnterpriseDB published results from some test results that used PostgreSQL 10 native partitions to control table bloat.
On March 6, Samay Sharma from Citus published an article about the new PostgreSQL 10 column group statistics feature which can significantly improve the accuracy of query plan costs when there are multiple columns with correlated values. Similar features exist in Oracle, DB2 and SQL Server.
I’ve seen a good number of retweets of the March 14 article written by Reinier Haasjes from payments platform Adyen. He talked about upgrading a large PostgreSQL database. The most striking things to me about this article were two: (1) that you can evolve your operational strategies as you scale, and (2) the numbers – Adyen’s largest PG cluster is 74 TB, and the [in-place] pg_upgrade run on their 50TB database took only 3-5 minutes!
But Bricklen Anderson wins with the most detailed article about upgrading. On March 27 he published his 35-step process along with notes for upgrading 1500 postgres clusters in 6 data centers to PG10 with total size 55TB and downtime windows of 15 minutes. This kind of article is astounding.
So… I feel like I’m just getting started. But this is a ton of stuff and I’ve already spent too much time on this week’s update! So I’m just going to throw in a few quick technical articles and then ship this thing.
First, a little cluster of articles about monitoring PostgreSQL:
Tomas Barton‘s slide deck about “The Sad State of PostgreSQL Monitoring” from the Prague PostgreSQL meetup on March 26.
And I’ll close up with two quick tips:
On Mar 1, Hans-jürgen Schönig from Cybertec published an article about knowing how the optimizer works and avoiding unneeded sorting by writing queries well.
Lastly, this month at work I referenced an article from Dec 5 last year that gives an excellent description of pg_stat_statements – no idea how I missed this in my roundup at the end of the year. This was published by Pavel Trukhanov at okmeter.
That’s a wrap for today.
Have a great week and keep learning!