Going PaaS? Think Google Cloud Platform for big data.


While a lot of companies still think of cloud in terms of infrastructure-as-a-service—virtualized, distributed hardware that brings agility and cuts costs—more and more are branching out to explore other “as-a-service” flavors.

For enterprises that want to simplify and speed up application development, platform-as-a-service (PaaS) offerings are an increasingly compelling choice—despite potential concerns about “black box” solutions or vendor lock-in.

So if you know you want to go the PaaS route, when would Google Cloud Platform be a good option?

The logical time to start looking at Google Cloud Platform would be when you know you’re going to be working with big data. Google’s storage is inexpensive and available completely on demand, so you don’t have to plan ahead. You can get as much or as little space as you need and pay only for what you use. And it’s easy to transfer data to Google Cloud Platform from other storage, whether in another cloud or on premises.

Once your data is in the Google cloud, Google—like other providers—has a fair array of tools to help you process it. But here again, where Google really stands out is on scale. You don’t have to provision clusters; you just start them up when you need them and shut them down when you’re done to handle any volume of data. This on-demand processing power can go a long way toward minimizing costs, particularly since Google charges by the minute instead of by the hour.
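To make the billing point concrete, here is a minimal sketch of the arithmetic. It compares the cost of the same job when usage is rounded up to the next minute versus the next hour. The hourly rate and the 70-minute runtime are hypothetical values chosen for illustration, not real prices from any provider, and the sketch ignores any minimum charge a provider may apply.

```python
import math


def billed_cost(runtime_minutes, hourly_rate, increment_minutes):
    """Return the cost when usage is rounded up to the next billing increment."""
    increments = math.ceil(runtime_minutes / increment_minutes)
    billed_minutes = increments * increment_minutes
    return billed_minutes * hourly_rate / 60


# Hypothetical figures for illustration only (not a real price quote).
HYPOTHETICAL_HOURLY_RATE = 6.00  # assumed cost of a cluster per hour
RUNTIME_MINUTES = 70             # assumed length of one processing job

per_minute = billed_cost(RUNTIME_MINUTES, HYPOTHETICAL_HOURLY_RATE, 1)
per_hour = billed_cost(RUNTIME_MINUTES, HYPOTHETICAL_HOURLY_RATE, 60)

print(f"Billed by the minute: {per_minute:.2f}")
print(f"Billed by the hour:   {per_hour:.2f}")
```

Under these assumptions, the job billed by the hour pays for two full hours, while the job billed by the minute pays for only the 70 minutes it ran. The gap matters most for short, bursty workloads that start and stop clusters often.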

So if you’re working with big and variable volumes of data, Google Cloud Platform has a lot of value to offer. At Pythian, we advise clients to stage their Google migrations—to keep them manageable, to allow for a bit of “tire-kicking” to get a sense of what the Google PaaS can do and how it behaves, and most importantly to ensure a seamless and successful transfer of data.

I’ll write about that further in my next post. In the meantime, if you’d like to learn how Pythian can help you leverage the power of Google Cloud Platform to manage, mine, analyze, and utilize your data, and improve business outcomes, visit our Google Technology page.


About the Author

Big Data Principal Consultant
Vladimir is currently a Big Data Principal Consultant at Pythian, and is well known for his expertise in a variety of big data and machine learning technologies including Hadoop, Kafka, Spark, Flink, HBase, and Cassandra. As a big data expert with over 20 years of global experience, he has worked on projects for enterprise clients across five continents while being part of professional services teams for Apple Computers Inc., Sun Microsystems Inc., and Blackboard Inc. Throughout his career in IT, Vladimir has been involved in a number of startups. He was Director of Application Services for Fusepoint (formerly known as RoundHeaven Communications), which grew by over 1,400% in 5 years, and was recently acquired by CenturyLink. He also founded AlmaLOGIC Solutions Incorporated, an e-Learning Analytics company.

1 Comment

100% agree that Big Data PaaS is absolutely the way to go for net-new developments (where possible), whereas IaaS deployments of, say, Hadoop are best for lift/shift or when quite simply you just need a VM (for some other specific software reasons).

However unfortunately the rest of this article is misleading at best, and Google advertising at worst.

“Google’s storage is inexpensive and available completely on demand, so you don’t have to plan ahead.”
Yes, all public cloud storage vendors have incredibly cheap storage (in many ways this is the whole driver of public cloud) and all of them are on demand.

“You don’t have to provision clusters; you just start them up when you need them and shut them down when you’re done to handle any volume of data. ”
Huh? Other public providers do this too! For example Azure has Data Lake, and HDInsight (which includes HBase, Storm, Spark and R-Server). Both are PaaS. Both on demand. Both pay per use only. For example – if I am using Azure I can spin up / down entire clusters in minutes!

“Google charges by the minute instead of by the hour”
Yes, well, only AWS charges by the nearest hour. If using Azure it’s also by the minute.

And how about regional support?
Here in Australia, Azure spans multiple regions, and AWS has multiple DC “zones” in the same region (however, this non-geo-located design recently had major downtime).
Google runs a similar design to AWS and could suffer the same single-region issues.

