
https://youtu.be/2g1nBbHgZbY?si=_zVw2PPifVqkHolN

 

Airflow Summit 2024

What's coming in Airflow 3 and beyond?

 

Airflow History

 

2014: Airflow appears!
2015: Apache top-level project
2019: Enterprise production-ready, including the HA scheduler and a fully specified REST API
2020 ~ 2024: Better ease of use and efficiency, including asynchronous workers, dynamic tasks, setup and teardown, and Airflow ObjectStore
2021 ~ 2024: Data-aware features added, including data-driven scheduling, conditional datasets, combined dataset and time scheduling, and the dataset event API
2024 ~ 2025: Airflow 3 is scheduled for release!

 

 

DAG Versioning added

Airflow DAG versioning

 

Airflow offers a lot of convenience, but it cannot track when a DAG script started running or which version of the script a given run used. One workaround is to record the script explicitly with the dbt tool, but having to depend on another tool is inconvenient in its own way. In a survey, 52.2% of Airflow users likewise reported experiencing this inconvenience.
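The tracking gap described above can be illustrated with a minimal sketch. This is not how Airflow 3 implements DAG versioning; it only shows the general idea of tagging each run with a fingerprint of the script that produced it. The names `dag_source_fingerprint`, `record_run`, and `run_log` are hypothetical and exist only for this illustration.

```python
import hashlib
from datetime import datetime, timezone


def dag_source_fingerprint(source: str) -> str:
    """Return a short, stable fingerprint of a DAG script's text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()[:12]


def record_run(run_log: list, dag_id: str, source: str) -> dict:
    """Append a run entry that stores which script version was used and when."""
    entry = {
        "dag_id": dag_id,
        "version": dag_source_fingerprint(source),
        "started_at": datetime.now(timezone.utc).isoformat(),
    }
    run_log.append(entry)
    return entry


# Hypothetical usage: two runs of the same DAG with different script contents.
run_log = []
first = record_run(run_log, "example_dag", "print('v1')")
second = record_run(run_log, "example_dag", "print('v2')")
```

Because the fingerprint depends only on the script text, two runs of an unchanged script share a version, while any edit to the script yields a new one.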

 

Airflow improvements

 

 

Stronger Security

Beyond the DAG Versioning introduced above, Stronger Security covers planned improvements to how the Airflow metadata database is managed. The idea is that tasks will no longer be able to access the Airflow metadata database at the task level.

Airflow 2 vs Airflow 3 : Task Execution

When a task running on an Airflow worker accesses the metadata database directly, the frequent I/O causes many problems. For example, the amount of data an XCom can hold is limited by the backing database, and isolation between tasks is not always guaranteed.

 

Run at any time

Airflow : Run at anytime

 

Event-driven scheduling

Pipelines react automatically to events from external sources, so they respond and run in real time. This is built on top of data assets.
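As a rough illustration of the event-driven idea, the sketch below routes an external event to every pipeline registered for the affected data asset. It is plain Python, not the Airflow API; `EventRouter`, `on`, `emit`, and the asset URI are hypothetical names assumed for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventRouter:
    """Maps a data asset URI to the pipelines that should run when it changes."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], str]]] = defaultdict(list)

    def on(self, asset_uri: str, pipeline: Callable[[dict], str]) -> None:
        """Register a pipeline to run whenever an event arrives for asset_uri."""
        self._subscribers[asset_uri].append(pipeline)

    def emit(self, asset_uri: str, event: dict) -> List[str]:
        """Run every pipeline subscribed to asset_uri and collect the results."""
        return [pipeline(event) for pipeline in self._subscribers.get(asset_uri, [])]


# Hypothetical usage: an upload event triggers the pipeline without any schedule.
router = EventRouter()
router.on("s3://bucket/orders", lambda event: f"load_orders ran for {event['key']}")

results = router.emit("s3://bucket/orders", {"key": "2025-01-01.csv"})
ignored = router.emit("s3://bucket/unknown", {"key": "x"})
```

The point of the design is that the trigger is the event itself rather than a clock: an event for an asset with no subscribers simply runs nothing.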

 

Data partitioning

Time-based and segment-based partitions let specific slices of data be processed independently. This improves performance and reduces processing time and resource usage.
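The partitioning idea can be sketched in a few lines of plain Python, assuming records that carry a day and a segment. This is only an illustration of processing slices independently, not Airflow's partitioning feature; the record fields and the function names `partition` and `process_partition` are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Tuple

PartitionKey = Tuple[date, str]


def partition(records: List[dict]) -> Dict[PartitionKey, List[dict]]:
    """Group records by (day, segment) so each slice can be handled on its own."""
    parts: Dict[PartitionKey, List[dict]] = defaultdict(list)
    for record in records:
        parts[(record["day"], record["segment"])].append(record)
    return dict(parts)


def process_partition(rows: List[dict]) -> int:
    """Stand-in for real work: total the amounts inside one partition."""
    return sum(row["amount"] for row in rows)


records = [
    {"day": date(2025, 1, 1), "segment": "kr", "amount": 10},
    {"day": date(2025, 1, 1), "segment": "us", "amount": 5},
    {"day": date(2025, 1, 2), "segment": "kr", "amount": 7},
    {"day": date(2025, 1, 1), "segment": "kr", "amount": 3},
]

parts = partition(records)
totals = {key: process_partition(rows) for key, rows in parts.items()}

# Reprocess a single slice without touching the others.
kr_jan_1 = process_partition(parts[(date(2025, 1, 1), "kr")])
```

Since each partition is self-contained, a late or corrected slice can be rerun alone, which is where the savings in processing time and resources come from.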

 

Inference Execution: concurrent runs of the same DAG. Synchronized DAG execution.

 

In addition, JavaScript (JS) and TypeScript (TS) are to be supported.

 

 

 

As for release dates, the Dev version is scheduled for January 1, 2025, and the Beta RC for March 31. The milestone and the Confluence page give the detailed plan and development progress.

 

Milestone

https://github.com/apache/airflow/milestone/35

 


 

https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+3.0

 


 

 

๋ฐ˜์‘ํ˜•