Tag: Google Cloud Platform (GCP)

Consuming Tweets Using Apache Beam on Dataflow

Apache Beam is an SDK (software development kit) available for Java, Python, and Go that allows for a streamlined ETL programming experience for both batch and streaming jobs. It’s the SDK that GCP Dataflow jobs use and it comes with…

Near Real-Time Data Processing for BigQuery: Part Two

This post is part two of describing (near) real-time data processing for BigQuery. In this post, I will use Dataform to implement transforms as well as ASSERTS on the data and unit testing of BigQuery code and SQL statements. Part…

Near Real-Time Data Processing for BigQuery: Part One

This post describes (near) real-time data processing for BigQuery with unique and other check constraints, and unit testing. This is part one of two, and describes the real-time ingestion of the data. Part two will describe how to implement ASSERTS…

How to Deploy Machine Learning on Google Cloud Platform

Editor’s Note: Because our bloggers have lots of useful tips, every now and then we update and bring forward a popular post from the past. Today’s post was originally published on August 15, 2019. In this post, I’ll describe a…

How to Implement Airflow Best Practices From a Data Scientist’s Perspective

Editor’s Note: Because our bloggers have lots of useful tips, every now and then we update and bring forward a popular post from the past. Today’s post was originally published on August 8, 2019. This blog post is a compilation…

Embracing the New Normal With Google Workspace – Part Three

“Companies that change may survive, but companies that transform thrive.” (The 4 Dos of Change Management, Nick Candito, Forbes.com)

For many people and companies, the first part of the COVID-19 pandemic was all about holding on. Survival alone was…

Dipping Your Toes Into Building an Analytics Platform on Google Cloud Platform

“We have many disparate data sources and we’re having a hard time getting a global view of all our data across our organization.” “Our data is currently all in <enter data warehouse name here> and we want to migrate it…

How to Connect from Cloud Functions to the Private IP Address of Cloud SQL in Google Cloud

Cloud Functions allow you to run single-purpose functions without having to manage instances in Google Cloud. Cloud SQL is Google Cloud’s managed SQL service. For better security, it’s best practice to disable public IP in Cloud SQL. In Terraform, the…

How Pythian and Google helped reduce pesticide use in high-value crops

Semios is a Vancouver-based data analytics company for growers of high-value crops such as almonds and apples. It uses a combination of machine learning, in-crop wireless networks and half a million IoT sensors across 80,000 acres to offer real-time monitoring…

Join, Group By, and Aggregate in Cloud Data Fusion

Good news! Cloud Data Fusion is now GA. Announced at Google Next ’19 UK on November 21, 2019, Cloud Data Fusion is a fully managed, cloud-native, enterprise data integration service for quickly building and managing data pipelines. Cloud Data Fusion…
