Why Achieving Quick Business Wins Should Be Built Into Your D&A

Having worked in the Data & Analytics (D&A) space for decades, I’ve found that the struggle to gain business insights through data is constant. I’ve used many approaches; some have worked, while others looked better on paper. Over time, however, the path to…

Migrate RDB to Cloud SQL Using Google’s Dataflow

Most corporations have huge amounts of data in relational database management systems (RDBMSs). When you’re considering an RDBMS data transfer and only need to migrate a subset of the data to the cloud, follow this very efficient and easy data ingestion…

Orchestrating dbt Pipelines with Google Cloud: Part 2

In part 1, we defined and deployed two data services to Cloud Run. Each service provides endpoints that perform specific tasks, such as loading a file to BigQuery or running dbt models. In this post, we’ll define and deploy some…

Orchestrating dbt Pipelines With Google Cloud: Part 1

In my previous post, I showed you how to use dbt to expedite data preparation tasks on Google BigQuery. This time, I’ll show you how to integrate those dbt pipelines into workflows that load, validate and transform data.…

Python: Using Dataclasses to Model Your Data

Here at Pythian, we love our data. Our code is no exception (pun sort of intended), so I’ll be covering dataclasses in Python today. The problem: As a Python developer, you’ve almost certainly run into code that looks like the…

Caching Alternatives in Google Dataflow: Avoiding Quota Limits and Improving Performance

The problem: When building data pipelines, it’s very common to require an external API call to enrich, validate or obfuscate data using external services. This can happen in either a streaming or a batch pipeline. The situation is the same: call external services…

Data preparation with dbt and BigQuery

Raw incoming data needs to go through a series of data preparation steps before it can be used for analysis. These steps include tasks such as type casting, renaming columns, cleaning values and identifying duplicates. Writing code to perform these…

Snowflake System Function Error: Argument 0 to Function SYSTEM$PIPE_STATUS Needs to Be Constant

I recently encountered the above issue, which prompted me to write this blog post so I can easily reference the solution whenever I need it. However, I also hope it might help anyone out there who hits a similar issue…

Replicating MySQL to Snowflake with Kafka and Debezium—Part Two: Data Ingestion

Here we go again. Hello, and welcome to this second part of my “Replicating MySQL to Snowflake” series. If you landed here from a web search and missed part one, you can take a look here: part one. What’s up?…

How to Deploy Machine Learning on Google Cloud Platform

Editor’s Note: Because our bloggers have lots of useful tips, every now and then we update and bring forward a popular post from the past. Today’s post was originally published on August 15, 2019. In this post, I’ll describe a…
