Kafka 7

Building a Real-Time Analytics Pipeline - 05. Setting Up a Delta Lake Real-Time Analytics Environment

In this post we connect Kafka to a Delta Lake environment to set up real-time analytics, using Spark Structured Streaming. 1. Kafka → Structured Streaming → Delta Lake integration. 💿 Generating Kafka data: this is an example Producer that sends JSON data to a Kafka topic in real time. It uses Python's Faker package to generate User data. pip install faker kafka-python from kafka import KafkaProducer from faker import Faker import random import json import time def generate_user(): fake = Faker() return { ..
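
The excerpt cuts off before the body of generate_user(), so the record layout is not visible here. As a rough sketch of the same idea, the block below generates JSON user records and hands them to a send callable. It is an illustration under assumptions, not the post's code: every field name is hypothetical, the standard-library random module stands in for Faker so the block runs without dependencies, and produce() and serialize() are names invented for this example.

```python
import json
import random
import time


def generate_user(rng: random.Random) -> dict:
    """Build one fake user record. Every field name here is hypothetical."""
    return {
        "user_id": rng.randint(1, 1_000_000),
        "name": rng.choice(["alice", "bob", "carol"]),
        "age": rng.randint(18, 80),
        "created_at": int(time.time() * 1000),  # epoch milliseconds
    }


def serialize(record: dict) -> bytes:
    """Encode a record as UTF-8 JSON bytes, the form a Kafka message value takes."""
    return json.dumps(record, ensure_ascii=False).encode("utf-8")


def produce(send, topic: str, count: int, seed: int = 0) -> int:
    """Call send(topic, payload_bytes) once per generated user; return how many were sent."""
    rng = random.Random(seed)
    for _ in range(count):
        send(topic, serialize(generate_user(rng)))
    return count


if __name__ == "__main__":
    # Collect messages in a list instead of talking to a real broker.
    sent = []
    produce(lambda topic, payload: sent.append((topic, payload)), "users", 3)
    for topic, payload in sent:
        print(topic, payload.decode("utf-8"))
```

With kafka-python installed, the send method of a KafkaProducer instance can be passed as the send argument, since it takes the topic and the value bytes as its first two parameters. The broker address and the topic name "users" would be assumptions to adapt to your setup.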

Building a Real-Time Analytics Pipeline - 03. Kafka-Iceberg Data Pipeline

In this post we build a simple pipeline that consumes data from Kafka and stores it in Iceberg, then run an experiment to understand how Iceberg stores data when some of the Kafka partitions fail to be processed. The pipeline is built on the Iceberg installation from the previous post (https://jongwho.tistory.com/37). Setup: Kafka → Spark → Iceberg. Experiment scenario: Kafka has 3 partitions, and data arrives in each partition sequentially; an error occurs while processing partition 1; partitions 2 and 3 are processed normally. How does Iceberg handle the commit in this situation? 🔧 Step 1: Basic Kafka & Iceberg pipeline setup. Kafka setup: 3 partitions ..
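
The scenario above turns on commit atomicity. The block below is a toy model in plain Python. It uses neither the Spark nor the Iceberg API, and ToyTable, process_partition, and run_micro_batch are names made up for illustration. It assumes that a micro-batch is written as a single snapshot only when every partition's task succeeds, which is how an atomic table commit is generally expected to behave. The excerpt does not show what the author's experiment actually found.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Stand-in for a table with atomic, snapshot-based commits (not the Iceberg API)."""
    snapshots: list = field(default_factory=list)

    def commit(self, rows: list) -> int:
        """Append one snapshot holding all rows of a batch; return its id."""
        self.snapshots.append(list(rows))
        return len(self.snapshots)

    def rows(self) -> list:
        """Return every row visible across all committed snapshots."""
        return [row for snapshot in self.snapshots for row in snapshot]


def process_partition(partition: int, records: list, fail_on) -> list:
    """Tag one partition's records; raise if this partition is marked as failing."""
    if partition in fail_on:
        raise RuntimeError(f"partition {partition} failed")
    return [(partition, record) for record in records]


def run_micro_batch(table: ToyTable, batch: dict, fail_on=frozenset()) -> bool:
    """Stage the output of every partition, then commit once. Any failure means no commit."""
    staged = []
    try:
        for partition, records in sorted(batch.items()):
            staged.extend(process_partition(partition, records, fail_on))
    except RuntimeError:
        return False  # staged rows are discarded; the table is unchanged
    table.commit(staged)
    return True


if __name__ == "__main__":
    table = ToyTable()
    batch = {1: ["a", "b"], 2: ["c"], 3: ["d"]}
    # Partition 1 fails: nothing from partitions 2 and 3 becomes visible either.
    print(run_micro_batch(table, batch, fail_on={1}), table.rows())
    # Retry without the failure: the whole batch lands as one snapshot.
    print(run_micro_batch(table, batch), table.rows())
```

In this model the rows from partitions 2 and 3 are never visible on their own. They appear only once a retry of the whole batch commits.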

Building a Real-Time Analytics Pipeline - 01. Installing Kafka and Iceberg

Setting up the basics of a real-time pipeline with RedPanda and Iceberg on Docker Compose. In this post we set up Redpanda, Iceberg, and Minio to build a real-time data lake environment. 📌 Goals: install Redpanda; install Apache Iceberg; install Minio; implement the installations above with Docker Compose. ❄️ What is Iceberg? Apache Iceberg is an open-source data lake format for large-scale tables. It was designed to overcome the slow, inefficient queries of the older Hive metastore-based approach, and it integrates easily with analytics tools such as Spark, Trino, and Flink. ✅ Iceberg features: ACID transaction support; Schema Evolution (schema changes); Part..

Building a Real-Time Analytics Pipeline - 00. Architecture Overview

We build a real-time analytics pipeline that collects data with Kafka, stores it in Iceberg and Delta Lake, and processes it with Spark. 🔥 Goals of this series: using Docker, you can set up and test for yourself Kafka and data lakes (Iceberg, Delta Lake), a technology stack that comes up often in data engineering. On that basis, the series covers: building a Kafka real-time analytics pipeline; building a Docker-based big data analytics environment; comparing Iceberg vs Delta Lake. Architecture overview: for Kafka, RedPanda is used so that the data can be inspected directly. Iceberg and Delta Lake use the images provided by their official sites and official GitHub repositories...

Building a Real-Time Data Analytics Environment - 11. Implementing a Real-Time Dashboard with Streamlit and ClickHouse 🔨 (part. 2)

The previous post covered the basics of connecting Streamlit to ClickHouse to build a real-time dashboard. This post builds the full flow: Streamlit sends data to Kafka → ClickHouse ingests it → real-time visualization. Previous post: Building a Data Analytics Environment - 10. Implementing a Real-Time Dashboard with Streamlit and ClickHouse 🔨 (part. 1). 🗂️ Setup summary: the data flow of the Streamlit demo application is as follows. [User] → Streamlit (button click) → JSON message sent to Kafka → in the ClickHouse Kafka engine table ..

Building a Real-Time Data Analytics Environment - 05. Creating ClickHouse Tables for Real-Time Data Analysis

🚀 In this post: 1️⃣ Creating a ClickHouse table connected to Kafka (including configuration) 2️⃣ Producing a practice dataset to Kafka 3️⃣ How to check the dataset in the Kafka table. 1. Creating a ClickHouse table connected to Kafka. 🔹 How ClickHouse processes Kafka data: ClickHouse provides a Kafka engine that connects directly to Kafka and pulls in data in real time. It is typically run as Kafka → Buffer table → MergeTree table. 🔹 Changing ClickHouse settings (for Kafka use): first, the clickhouse-server configuration must be modified. In the ClickHouse configuration file (config.xml), Ka..

Building a Real-Time Data Analytics Environment - 04. Setting Up a Kafka UI (Redpanda Console)

After the earlier Apache Kafka installation, errors when creating and listing topics made Kafka hard to set up. This post therefore sets up a Console UI so that Kafka topics can be created easily. 💡 In this post: What is Redpanda Console?; deploying Redpanda Console with Helm; connecting to Kafka and creating a topic in the UI; message Produce test. 1. What is Redpanda Console? Redpanda Console is a web UI for easily managing Kafka clusters. It makes creating topics, viewing messages, and monitoring consumer groups more convenient than Kafka's default CLI. ❗ Redpanda Console is fully compatible with Kafka, so with a Kafka cluster..