Apache Beam pipelines with Scala: part 2 – side input

In the second part of this series we will develop a pipeline that transforms messages from a “data” Pub/Sub topic, with the ability to control the process via a “control” topic. How to effectively pass mutable input into a DoFn is not obvious,…

Apache Beam pipelines with Scala: part 1 – template

In this 3-part series I’ll show you how to build and run Apache Beam pipelines using the Java API in Scala. In the first part we will develop the simplest streaming pipeline that reads JSON messages from Google Cloud Pub/Sub, converts them…

When to use Amazon Athena

Amazon Athena enables you to query data in flat files stored in S3 (Simple Storage Service) as if it were in a database table. And you don’t have to set up a server or any other software…

Cosmos DB Geo-replication – SQL on the edge episode 14

As Microsoft Azure’s NoSQL service offering, Cosmos DB has received a lot of investment and development effort. Microsoft considers Cosmos DB a “ring zero” service, which means that it is available by default from all regions as soon as they…

Tips for preparing for the AWS Cloud Architect Associate Exam

Networking is a pivotal concept in cloud computing, and knowing it is a must for a successful Cloud Architect. Of course, you won’t be physically stripping cables to fit RJ45 connectors, but you must know various facets of…

A serverless audio transcription pipeline

Having attended Serverlessconf NYC a few weeks ago, and given my undying excitement around all things serverless, I’ve lately been quite immersed in the numerous possibilities of serverless architectures. Also, I’ve been itching to build something, as I haven’t had…

Understanding Cosmos DB Request Units – SQL On The Edge Episode 13

As part of their cloud offerings, all major providers (Microsoft, Amazon, Google) have developed very exciting and interesting NoSQL/NewSQL offerings. Microsoft Azure’s is called Cosmos DB, a rebranding of the product initially known as DocumentDB. Cosmos DB is no longer…

Architecting a Modern Data Warehouse – Live Webinar

Join Pythian and DBTA for a live roundtable webinar, Architecting a Modern Data Warehouse, on Thursday, November 16, 2017, at 11:00 am PT / 2:00 pm ET. Today, the world of decision-making, along with the data sources…

Datascape podcast episode 15 – machine learning primer for enterprise IT with Paul Spiegelhalter

Joining us today is my esteemed colleague Paul Spiegelhalter. Paul is a data scientist and machine learning specialist with expertise in predictive analytics and algorithmic modeling across a number of industries, including computer vision, online advertising and user analysis,…

Pythian at All Things Open 2017

All Things Open 2017 is coming up fast, and Pythian is proud to be supporting one of the largest open source-focused conferences in the US! All Things Open takes place in Raleigh, North Carolina, on October 23-24, 2017. ATO…
