Fluentd and Kafka
Hadoop / Spark Conference Japan 2016

Feb 8, 2016
Who are you?
• Masahiro Nakagawa
• github: @repeatedly
• Treasure Data Inc.
• Fluentd / td-agent developer
• Fluentd Enterprise support
• I love OSS :)
• D language, MessagePack, organizer of several meetups, etc.
Fluentd
• Pluggable streaming event collector
• Lightweight, robust and flexible
• Lots of plugins on RubyGems
• Used by AWS, GCP, Microsoft, and other companies
• Resources
• http://www.fluentd.org/
• Webinar: https://www.youtube.com/watch?v=6uPB_M7cbYk
Popular case
[Diagram: App, Forwarder, Aggregator, Destination; arrows labeled Push]
Apache Kafka
• Distributed messaging system
• Producer - Broker - Consumer pattern
• Pull model, replication, etc.
[Diagram: App, Producer, Broker, Consumer, Destination; arrows labeled Push and Pull]
Push vs Pull
• Push:
  • Easy to transfer data to multiple destinations
  • Hard to control the stream ratio across multiple streams
• Pull:
  • Easy to control stream flow / ratio
  • Consumers must be managed correctly
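The trade-off above can be illustrated with a small in-memory sketch. It is a toy model, not Fluentd or Kafka code: `PushSource`, `PullBroker`, and the consumer names are hypothetical and exist only for this illustration. It shows that under push every destination receives events at the sender's pace, while under pull each consumer decides how much to read and the broker has to track a per-consumer offset.

```python
class PushSource:
    """Push model: the sender decides when each destination receives data."""

    def __init__(self, destinations):
        self.destinations = destinations

    def emit(self, event):
        # Every destination gets every event immediately, ready or not.
        for dest in self.destinations:
            dest.append(event)


class PullBroker:
    """Pull model: the broker keeps a log; each consumer has its own offset."""

    def __init__(self):
        self.log = []
        self.offsets = {}

    def publish(self, event):
        self.log.append(event)

    def poll(self, consumer, max_events):
        # The consumer chooses how many events to take per call.
        start = self.offsets.get(consumer, 0)
        batch = self.log[start:start + max_events]
        self.offsets[consumer] = start + len(batch)
        return batch


events = [f"event-{i}" for i in range(6)]

# Push: both destinations receive all six events at the sender's pace.
fast, slow = [], []
source = PushSource([fast, slow])
for e in events:
    source.emit(e)

# Pull: the same six events, but each consumer reads at its own rate.
broker = PullBroker()
for e in events:
    broker.publish(e)
fast_batch = broker.poll("fast", max_events=4)
slow_batch = broker.poll("slow", max_events=1)
```

The offset table in `PullBroker` is the part that "must be managed correctly": if a consumer's offset is lost or wrong, events are skipped or read twice.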
There are 2 ways
• fluent-plugin-kafka
• kafka-fluentd-consumer
fluent-plugin-kafka
• Input / Output plugin for Kafka
• https://github.com/htgc/fluent-plugin-kafka
• in_kafka, in_kafka_group, out_kafka, out_kafka_buffered
• Pros
  • Easy to use; supports output as well as input
• Cons
  • Performance is not the primary goal
Configuration example
<source>
  @type kafka
  topics web,system
  format json
  add_prefix kafka.
  # more options
</source>

<match kafka.**>
  @type kafka_buffered
  output_data_type msgpack
  default_topic metrics
  compression_codec gzip
  required_acks 1
</match>
https://github.com/htgc/fluent-plugin-kafka#usage
kafka-fluentd-consumer
• Stand-alone Kafka consumer for Fluentd
• https://github.com/treasure-data/kafka-fluentd-consumer
• Sends consumed events to Fluentd's in_forward
• Pros
  • High performance and Java API features
• Cons
  • Requires a Java runtime
Run consumer
• Edit log4j and fluentd-consumer properties
• Run the following command:

$ java \
    -Dlog4j.configuration=file:///path/to/log4j.properties \
    -jar path/to/kafka-fluentd-consumer-0.2.1-all.jar \
    path/to/fluentd-consumer.properties
Properties example
fluentd.tag.prefix=kafka.event.
# default is json
fluentd.record.format=regexp
# for regexp format
fluentd.record.pattern=(?<text>.*)
# can use Java Regex
fluentd.consumer.topics=app.*
# default is whitelist
fluentd.consumer.topics.pattern=blacklist
fluentd.consumer.threads=5
https://github.com/treasure-data/kafka-fluentd-consumer/blob/master/config/fluentd-consumer.properties
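To make the topic and record settings more concrete, here is a rough Python sketch of what such options could mean. It is not the consumer's implementation: `select_topics` and `parse_record` are hypothetical helpers, and treating the topic pattern as a full-string match is an assumption made for this illustration.

```python
import re


def select_topics(all_topics, pattern, mode="whitelist"):
    """Keep topics that fully match `pattern` (whitelist) or that do not (blacklist)."""
    regex = re.compile(pattern)
    if mode == "whitelist":
        return [t for t in all_topics if regex.fullmatch(t)]
    if mode == "blacklist":
        return [t for t in all_topics if not regex.fullmatch(t)]
    raise ValueError(f"unknown mode: {mode}")


def parse_record(message, pattern):
    """Turn a raw message into a record using the named groups of `pattern`."""
    match = re.match(pattern, message)
    return match.groupdict() if match else None


# Hypothetical topic names, chosen only for this example.
topics = ["app.web", "app.db", "system"]
whitelisted = select_topics(topics, r"app.*", mode="whitelist")
blacklisted = select_topics(topics, r"app.*", mode="blacklist")

# Python spells named groups (?P<name>...), whereas the Java syntax used in
# the properties file is (?<name>...).
record = parse_record("GET /index.html 200", r"(?P<text>.*)")
```

With the pattern `app.*`, whitelist mode would select the `app.`-prefixed topics and blacklist mode would select everything else, so switching the mode inverts which topics are consumed.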
With Fluentd example
<source>
  @type forward
</source>

<source>
  @type exec
  command java -Dlog4j.configuration=file:///path/to/log4j.properties -jar /path/to/kafka-fluentd-consumer-0.2.1-all.jar /path/to/config/fluentd-consumer.properties
  tag dummy
  format json
</source>
https://github.com/treasure-data/kafka-fluentd-consumer#run-kafka-consumer-for-fluentd-via-in_exec
Conclusion
• Kafka has become an important component of data platforms
• Fluentd can communicate with Kafka
  • Fluentd plugin and Kafka consumer
• Build reliable and flexible data pipelines with Fluentd and Kafka
