Data governance has become an essential agenda item for many organizations. The reasons are varied, but two of the most compelling ones are: (1) as the number of data assets grows, it becomes harder to properly control access to them;…
Preparing for the Snowflake SnowPro Advanced: Architect Certification can be challenging. You need to have a deep understanding of the core concepts as well as specific feature capabilities. Snowflake’s excellent documentation provides detailed information. However, it is difficult to determine…
Apache Kafka and Apache Flink are popular platforms for data streaming applications. However, provisioning and managing your own clusters can be challenging and incur operational overhead. Amazon Web Services (AWS) provides a fully managed, highly available version of these platforms…
Apache Kafka and Apache Flink are popular data streaming application platforms. However, provisioning and managing your own clusters can be challenging and incur operational overhead. Amazon Web Services (AWS) provides a fully managed, highly available version of these platforms that…
In part 1, we defined and deployed two data services to Cloud Run. Each service provides endpoints that perform specific tasks, such as loading a file to BigQuery or running dbt models. In this post, we’ll define and deploy some…
In my previous post, I showed you how to use dbt to expedite data preparation tasks on Google BigQuery. This time, I’ll show you how to integrate those dbt pipelines into workflows that load, validate and transform data. …
Raw incoming data needs to go through a series of data preparation steps before it can be used for analysis. These steps include tasks such as type casting, renaming columns, cleaning values and identifying duplicates. Writing code to perform these…