Data Pipeline Frameworks: The Dream and the Reality

Published on January 23, 2019


One of the core problems in data engineering is defining and orchestrating scheduled ETL pipelines. Because aspects of the problem are general, the dream is to adopt a framework that does “everything but write your query.”

In reality, frameworks are useful but do less than they promise. Your team still has to do the rest of the work required to fit the framework to your business, existing code, and ops practices.

In my talk for Data Council NYC 2018, I share what I learned about these tradeoffs during my first phase of working with Airflow, the open source pipeline framework.

 
