Tag: airflow

How to Deploy Machine Learning on Google Cloud Platform

Editor’s Note: Because our bloggers have lots of useful tips, every now and then we update and bring forward a popular post from the past. Today’s post was originally published on August 15, 2019. In this post, I’ll describe a…

Read More >

How to Implement Airflow Best Practices From a Data Scientist’s Perspective

Editor’s Note: Because our bloggers have lots of useful tips, every now and then we update and bring forward a popular post from the past. Today’s post was originally published on August 8, 2019. This blog post is a compilation…

Read More >

How to schedule weekdays only on Airflow

Consider the following situation: You have a data ingestion pipeline where the data comes in real-time on weekdays and is stored in a dated folder.  The day’s data needs to be ingested within four hours. An instant response may be…

Read More >

Airflow Scheduler: the very basics

Apache Airflow is a great tool for scheduling jobs. It has a nice UI out of the box. It allows you to create a directed acyclic graph (DAG) of tasks and their dependencies. You can easily look at how the…

Read More >

Creating dynamic tasks using Apache Airflow

What’s Airflow? Apache Airflow is an open source scheduler built on Python. It uses a topological sorting mechanism, called a DAG (Directed Acyclic Graph) to generate dynamic tasks for execution according to dependency, schedule, dependency task completion, data partition and/or…

Read More >