Author: Enrique Lopez de Lara

Data Streaming with Kafka and Flink on AWS – Part 1

Apache Kafka and Apache Flink are popular platforms for data streaming applications. However, provisioning and managing your own clusters can be challenging and incur operational overhead. Amazon Web Services (AWS) provides a fully managed, highly available version of these platforms that…

Read More >

Orchestrating dbt Pipelines with Google Cloud: Part 2

In Part 1, we defined and deployed two data services to Cloud Run. Each service provides endpoints that perform specific tasks, such as loading a file into BigQuery or running dbt models. In this post, we’ll define and deploy some…

Read More >

Orchestrating dbt Pipelines With Google Cloud: Part 1

In my previous post, I showed you how to use dbt to expedite data preparation tasks on Google BigQuery. This time, I’ll show you how to integrate those dbt pipelines into workflows that load, validate and transform data. …

Read More >

Data preparation with dbt and BigQuery

Raw incoming data needs to go through a series of data preparation steps before it can be used for analysis. These steps include tasks such as type casting, renaming columns, cleaning values and identifying duplicates. Writing code to perform these…

Read More >