SlideShare una empresa de Scribd logo
1 de 37
Descargar para leer sin conexión
Apache NiFi Deep Dive 300
Timothy Spann
Developer Advocate / StreamNative
Agenda
Tuesday 17:10 UTC
Apache NIFi Deep Dive 300
For Data Engineers who have flows already in production, I will dive deep into best practices, advanced use cases, performance
optimizations, tips, tricks, edge cases, and interesting examples. This is a master class for those looking to learn quickly things I have
picked up after years in the field with Apache NiFi in production. This will be interactive and I encourage questions and discussions.You
will take away examples and tips in slides, github, and articles.
This talk will cover:
● Load Balancing
● Parameters and Parameter Contexts
● Stateless vs Stateful NiFi
● Reporting Tasks
● NiFi CLI
● NiFi REST Interface
● DevOps
● Advanced Record Processing
● Schemas
● RetryFlowFile
● Lookup Services
● Record Path
● Expression Language
● Advanced Error Handling Techniques
My Other Talks & Apache Pulsar Talks
● Tuesday 17:10 UTC - Apache NIFi Deep Dive 300 by Tim Spann
● Tuesday 18:00 UTC - Apache Deep Learning 302 by Tim Spann
● Wednesday 15:00 UTC - Smart Transit: Real-Time Transit Information with FLiP by David Kjerrumgaard & Tim Spann
● Wednesday 15:50 UTC - Replicated Subscriptions: taking Apache Pulsar Geo-Replication to next level by Matteo Merli
● Wednesday 17:10 UTC - Cracking the Nut, Solving Edge AI... by David Kjerrumgaard & Tim Span
● Wednesday 17:10 UTC - Exclusive Producer: Using Apache Pulsar to build distributed applications by Matteo Merli
● Thursday 14:10 UTC - Apache NiFi 101: Introduction and Best Practices - Tim Spann
© 2020 Cloudera, Inc. All rights reserved. 4
https://github.com/tspannhw https://www.datainmotion.dev/
streamnative.io
Timothy Spann
Developer Advocate, StreamNative
Tim Spann is a Developer Advocate for StreamNative.
He works with StreamNative Cloud, Apache Pulsar,
Apache Flink, Flink SQL, Apache NiFi, MiniFi, Apache
MXNet, TensorFlow, Apache Spark, big data, the IoT,
machine learning, and deep learning. Tim has over a
decade of experience with the IoT, big data,
distributed computing, streaming technologies, and
Java programming.
FLaNK and FLiP Stacks
● Apache Flink
● Apache NiFi
● Apache Kafka
● Apache Flink
● Apache Pulsar
● StreamNative's Flink Connector for Pulsar
● Apache +++
Apache projects are the way for all streaming
use cases.
StreamNative Solution
Application Messaging Data Pipelines Real-time Contextual Analytics
Tiered Storage
APP Layer
Computing
Layer
Storage
Layer
StreamNative
Platform
IaaS Layer
Micro
Service
Notification Dashboard Risk Control Auditing
Payment ETL
All Data - Anytime - Every Cloud - Multi-Protocol
Multi-
inges
t
Multi-
inges
t
Multi-ingest Merge
Priority
Pick Data Source(s) Validate and Convert Aggregate or Eliminate Send to Data Sink(s)
NiFi Flows
Apache Pulsar, Databases,
Files, REST Endpoints, Cloud
Resources, NoSQL, JMS,
MQTT,TCP/IP, etc...
Validate types, nulls,
convert types and check
against schemas.
build up key data with
multiple sources or
removing unneeded via
Queries on records
Apache Pulsar,
Databases, Files, REST
Endpoints, Cloud
Resources, NoSQL,
JMS, MQTT,TCP/IP,
etc...
Architecture
https://nifi.apache.org/docs/nifi-docs/html/overview.html
Version Control (Github and Beyond)
Repositories
https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#repositories
Record Processors
https://www.datainmotion.dev/2019/03/advanced-xml-processing-with-apache.html
● XML, CSV, JSON, AVRO and more
● Schemas or Inferred Schemas
● Easily convert between them
● Support SQL with Apache Calcite
Backpressure & Prioritizers
https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html
System Diagnostics
Scheduling
SOLR Connectors
● XML, CSV, JSON, AVRO and more
● Schemas or Inferred Schemas
● Use Records or Raw Text
● Support SQL with Apache Calcite
Simple CDC
https://dev.to/tspannhw/simple-change-data-capture-cdc-with-s
ql-selects-via-apache-nifi-flank-19m4
https://www.datainmotion.dev/2019/04/simple-apache-nifi-operations-dashboard.html
Bulletin Board
DevOps: Automate All The Things
https://www.datainmotion.dev/2021/01/automating-starting-services-in-apache.html
nifi-toolkit/bin/cli.sh nifi pg-enable-services -u http://server:8080
--processGroupId root
nifi pg-status -u http://server:8080 -verbose -pgid a8e909d9f-adf-asdsdf
curl 'http://server:8080/nifi-api/flow/parameter-contexts'
nifi pg-list -u http://server:8080
No More Spaghetti Flows - DO NOT
https://dev.to/tspannhw/no-more-spaghetti-flows-2emd
Do Not
● Do not Put 1,000 Flows on one workspace.
● If your flow has hundreds of steps, this is a Flow Smell. Investigate why.
● Do not Use ExecuteProcess, ExecuteScripts or a lot of Groovy scripts
as a default, look for existing processors
● Do not Use Random Custom Processors you find that have no
documentation or are unknown.
● Do not forget to upgrade, if you are running anything before Apache NiFi
1.14, upgrade now!
● Do not run on default 512M RAM.
● Do not run one node and think you have a highly available cluster.
● Do not split a file with millions of records to individual records in one
shot without checking available space/memory and back pressure.
● Use Split processors only as an absolute last resort. Many processors
are designed to work on FlowFiles that contain many records or many
lines of text. Keeping the FlowFiles together instead of splitting them
apart can often yield performance that is improved by 1-2 orders of
magnitude.
No More Spaghetti Flows - DO
https://dev.to/tspannhw/no-more-spaghetti-flows-2emd
Do
● Reduce, Reuse, Recycle. Use Parameters to reuse common modules.
● Put flows, reusable chunks (write to Slack, Database, Kafka) into
separate Process Groups.
● Write custom processors if you need new or specialized features
● Use Record Processors everywhere
● Read the Docs!
● Use the NiFi Registry for version control.
● Use NiFi CLI and DevOps for Migrations.
● Walk through your flow and make sure you understand every step and
it’s easy to read and follow. Is every processor used? Are there dead
ends?
● Do run Zookeeper on different nodes from Apache NiFi.
● Use routing based on content and attributes to allow one flow to
handle multiple nearly identical flows is better than deploying the same
flow many times with tweaks to parameters in same cluster.
● Use the correct driver for your database. There's usually a couple
different JDBC drivers.
Split JSON and EvaluateJSONPath
https://dzone.com/articles/lets-build-a-simple-ingest-to-cloud-data-warehouse
Query Record - SQL
https://dzone.com/articles/lets-build-a-simple-ingest-to-cloud-data-warehouse
MergeRecord
https://dzone.com/articles/lets-build-a-simple-ingest-to-cloud-data-warehouse
Parameters
https://dzone.com/articles/exploring-apache-nifi-110-parameters-and-stateless
Parameters
https://dzone.com/articles/exploring-apache-nifi-110-parameters-and-stateless
Load Balancing
https://dev.to/tspannhw/apache-nifi-load-balancing-via-load-balanced-connections-593m
Load Balancing
https://dev.to/tspannhw/apache-nifi-load-balancing-via-load-balanced-connections-593m
https://www.datainmotion.dev/2019/03/advanced-xml-p
rocessing-with-apache.html
Apache Pulsar Overview
● Pub-Sub
● Geo-Replication
● Pulsar Functions
● Horizontal Scalability
● Multi-tenancy
● Tiered Persistent Storage
● Pulsar Connectors
● REST API
● CLI
● Many clients available
● Four Different Subscription Types
● Multi-Protocol Support
○ MQTT
○ AMQP
○ JMS
○ Kafka
○ ...
What are the Benefits of Pulsar with NiFi?
Data Durability
Scalability Geo-Replication
Multi-Tenancy
Unified Messaging
Model
https://github.com/david-streamlio/pulsar-nifi-bundle
● https://www.datainmotion.dev/2020/06/no-more-spaghetti-flows.html
● https://github.com/tspannhw/EverythingApacheNiFi
● https://www.datainmotion.dev/2019/03/apache-nifi-101.html
● https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html
● https://pierrevillard.com/best-of-nifi/
● https://blogs.apache.org/nifi/
● https://www.nifi.rocks/documents/nifi-expression-language-cheat-sheet.pdf
● https://dev.to/tspannhw/new-features-of-apache-nifi-1-13-0-45ln
● https://dev.to/tspannhw/tracking-satellites-with-apache-nifi-44o7
● https://www.datainmotion.dev/2020/07/sizing-your-apache-nifi-cluster-for.html
● https://www.datainmotion.dev/2021/01/flank-using-apache-kudu-as-cache-for.html
● https://www.datainmotion.dev/2020/12/basic-understanding-of-cloudera-flow.html
Deeper Content
@PaasDev
https://datainmotion.dev/
timothyspann

Más contenido relacionado

La actualidad más candente

Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureTimothy Spann
 
Using the FLiPN stack for edge ai (flink, nifi, pulsar)
Using the FLiPN stack for edge ai (flink, nifi, pulsar)Using the FLiPN stack for edge ai (flink, nifi, pulsar)
Using the FLiPN stack for edge ai (flink, nifi, pulsar)Timothy Spann
 
DBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data LakesDBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data LakesTimothy Spann
 
Spark optimization
Spark optimizationSpark optimization
Spark optimizationAnkit Beohar
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeTimothy Spann
 
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaReal time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaTimothy Spann
 
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...Timothy Spann
 
Pass data community summit - 2021 - Real-Time Streaming in Azure with Apache ...
Pass data community summit - 2021 - Real-Time Streaming in Azure with Apache ...Pass data community summit - 2021 - Real-Time Streaming in Azure with Apache ...
Pass data community summit - 2021 - Real-Time Streaming in Azure with Apache ...Timothy Spann
 
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Big mountain data and dev conference   apache pulsar with mqtt for edge compu...Big mountain data and dev conference   apache pulsar with mqtt for edge compu...
Big mountain data and dev conference apache pulsar with mqtt for edge compu...Timothy Spann
 
Spark summit-east-dowling-feb2017-full
Spark summit-east-dowling-feb2017-fullSpark summit-east-dowling-feb2017-full
Spark summit-east-dowling-feb2017-fullJim Dowling
 
Cracking the nut, solving edge ai with apache tools and frameworks
Cracking the nut, solving edge ai with apache tools and frameworksCracking the nut, solving edge ai with apache tools and frameworks
Cracking the nut, solving edge ai with apache tools and frameworksTimothy Spann
 
Data minutes #2 Apache Pulsar with MQTT for Edge Computing Lightning - 2022
Data minutes #2   Apache Pulsar with MQTT for Edge Computing Lightning - 2022Data minutes #2   Apache Pulsar with MQTT for Edge Computing Lightning - 2022
Data minutes #2 Apache Pulsar with MQTT for Edge Computing Lightning - 2022Timothy Spann
 
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...Matt Stubbs
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsTimothy Spann
 
Learning the basics of Apache NiFi for iot OSS Europe 2020
Learning the basics of Apache NiFi for iot OSS Europe 2020Learning the basics of Apache NiFi for iot OSS Europe 2020
Learning the basics of Apache NiFi for iot OSS Europe 2020Timothy Spann
 
Hail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open sourceHail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open sourceTimothy Spann
 
Cracking the nut, solving edge ai with apache tools and frameworks
Cracking the nut, solving edge ai with apache tools and frameworksCracking the nut, solving edge ai with apache tools and frameworks
Cracking the nut, solving edge ai with apache tools and frameworksTimothy Spann
 
ApacheCon 2021: Cracking the nut with Apache Pulsar (FLiP)
ApacheCon 2021:  Cracking the nut with Apache Pulsar (FLiP)ApacheCon 2021:  Cracking the nut with Apache Pulsar (FLiP)
ApacheCon 2021: Cracking the nut with Apache Pulsar (FLiP)Timothy Spann
 
Flink sql for continuous sql etl apps & Apache NiFi devops
Flink sql for continuous sql etl apps & Apache NiFi devopsFlink sql for continuous sql etl apps & Apache NiFi devops
Flink sql for continuous sql etl apps & Apache NiFi devopsTimothy Spann
 
Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...
Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...
Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...Timothy Spann
 

La actualidad más candente (20)

Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
 
Using the FLiPN stack for edge ai (flink, nifi, pulsar)
Using the FLiPN stack for edge ai (flink, nifi, pulsar)Using the FLiPN stack for edge ai (flink, nifi, pulsar)
Using the FLiPN stack for edge ai (flink, nifi, pulsar)
 
DBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data LakesDBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data Lakes
 
Spark optimization
Spark optimizationSpark optimization
Spark optimization
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
 
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaReal time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafka
 
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
 
Pass data community summit - 2021 - Real-Time Streaming in Azure with Apache ...
Pass data community summit - 2021 - Real-Time Streaming in Azure with Apache ...Pass data community summit - 2021 - Real-Time Streaming in Azure with Apache ...
Pass data community summit - 2021 - Real-Time Streaming in Azure with Apache ...
 
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Big mountain data and dev conference   apache pulsar with mqtt for edge compu...Big mountain data and dev conference   apache pulsar with mqtt for edge compu...
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
 
Spark summit-east-dowling-feb2017-full
Spark summit-east-dowling-feb2017-fullSpark summit-east-dowling-feb2017-full
Spark summit-east-dowling-feb2017-full
 
Cracking the nut, solving edge ai with apache tools and frameworks
Cracking the nut, solving edge ai with apache tools and frameworksCracking the nut, solving edge ai with apache tools and frameworks
Cracking the nut, solving edge ai with apache tools and frameworks
 
Data minutes #2 Apache Pulsar with MQTT for Edge Computing Lightning - 2022
Data minutes #2   Apache Pulsar with MQTT for Edge Computing Lightning - 2022Data minutes #2   Apache Pulsar with MQTT for Edge Computing Lightning - 2022
Data minutes #2 Apache Pulsar with MQTT for Edge Computing Lightning - 2022
 
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
 
Learning the basics of Apache NiFi for iot OSS Europe 2020
Learning the basics of Apache NiFi for iot OSS Europe 2020Learning the basics of Apache NiFi for iot OSS Europe 2020
Learning the basics of Apache NiFi for iot OSS Europe 2020
 
Hail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open sourceHail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open source
 
Cracking the nut, solving edge ai with apache tools and frameworks
Cracking the nut, solving edge ai with apache tools and frameworksCracking the nut, solving edge ai with apache tools and frameworks
Cracking the nut, solving edge ai with apache tools and frameworks
 
ApacheCon 2021: Cracking the nut with Apache Pulsar (FLiP)
ApacheCon 2021:  Cracking the nut with Apache Pulsar (FLiP)ApacheCon 2021:  Cracking the nut with Apache Pulsar (FLiP)
ApacheCon 2021: Cracking the nut with Apache Pulsar (FLiP)
 
Flink sql for continuous sql etl apps & Apache NiFi devops
Flink sql for continuous sql etl apps & Apache NiFi devopsFlink sql for continuous sql etl apps & Apache NiFi devops
Flink sql for continuous sql etl apps & Apache NiFi devops
 
Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...
Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...
Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Ka...
 

Similar a ApacheCon 2021 - Apache NiFi Deep Dive 300

AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101Timothy Spann
 
Sql bits apache nifi 101 Introduction and best practices
Sql bits    apache nifi 101 Introduction and best practicesSql bits    apache nifi 101 Introduction and best practices
Sql bits apache nifi 101 Introduction and best practicesTimothy Spann
 
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) - Pulsar Summit Asia ...
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) - Pulsar Summit Asia ...Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) - Pulsar Summit Asia ...
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) - Pulsar Summit Asia ...StreamNative
 
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...InfluxData
 
Using FLiP with influxdb for EdgeAI IoT at Scale
Using FLiP with influxdb for EdgeAI IoT at ScaleUsing FLiP with influxdb for EdgeAI IoT at Scale
Using FLiP with influxdb for EdgeAI IoT at ScaleTimothy Spann
 
Real time cloud native open source streaming of any data to apache solr
Real time cloud native open source streaming of any data to apache solrReal time cloud native open source streaming of any data to apache solr
Real time cloud native open source streaming of any data to apache solrTimothy Spann
 
E2E Data Pipeline - Apache Spark/Airflow/Livy
E2E Data Pipeline - Apache Spark/Airflow/LivyE2E Data Pipeline - Apache Spark/Airflow/Livy
E2E Data Pipeline - Apache Spark/Airflow/LivyRikin Tanna
 
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Timothy Spann
 
Codeless pipelines with pulsar and flink
Codeless pipelines with pulsar and flinkCodeless pipelines with pulsar and flink
Codeless pipelines with pulsar and flinkTimothy Spann
 
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-RampUsing Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-RampTimothy Spann
 
Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31
Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31
Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31Timothy Spann
 
Apache Deep Learning 201 - Philly Open Source
Apache Deep Learning 201 - Philly Open SourceApache Deep Learning 201 - Philly Open Source
Apache Deep Learning 201 - Philly Open SourceTimothy Spann
 
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...DevOps_Fest
 
Introduction to data flow management using apache nifi
Introduction to data flow management using apache nifiIntroduction to data flow management using apache nifi
Introduction to data flow management using apache nifiAnshuman Ghosh
 
Mule soft meetup_chandigarh_#7_25_sept_2021
Mule soft meetup_chandigarh_#7_25_sept_2021Mule soft meetup_chandigarh_#7_25_sept_2021
Mule soft meetup_chandigarh_#7_25_sept_2021Lalit Panwar
 
OSMC 2010 | Monitoring mit Icinga by Icinga Team
OSMC 2010 | Monitoring mit Icinga by Icinga TeamOSMC 2010 | Monitoring mit Icinga by Icinga Team
OSMC 2010 | Monitoring mit Icinga by Icinga TeamNETWAYS
 
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Timothy Spann
 
[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event StreamingTimothy Spann
 
Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar Timothy Spann
 

Similar a ApacheCon 2021 - Apache NiFi Deep Dive 300 (20)

AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101
 
Sql bits apache nifi 101 Introduction and best practices
Sql bits    apache nifi 101 Introduction and best practicesSql bits    apache nifi 101 Introduction and best practices
Sql bits apache nifi 101 Introduction and best practices
 
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) - Pulsar Summit Asia ...
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) - Pulsar Summit Asia ...Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) - Pulsar Summit Asia ...
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) - Pulsar Summit Asia ...
 
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...
Timothy Spann [StreamNative] | Using FLaNK with InfluxDB for EdgeAI IoT at Sc...
 
Using FLiP with influxdb for EdgeAI IoT at Scale
Using FLiP with influxdb for EdgeAI IoT at ScaleUsing FLiP with influxdb for EdgeAI IoT at Scale
Using FLiP with influxdb for EdgeAI IoT at Scale
 
Real time cloud native open source streaming of any data to apache solr
Real time cloud native open source streaming of any data to apache solrReal time cloud native open source streaming of any data to apache solr
Real time cloud native open source streaming of any data to apache solr
 
E2E Data Pipeline - Apache Spark/Airflow/Livy
E2E Data Pipeline - Apache Spark/Airflow/LivyE2E Data Pipeline - Apache Spark/Airflow/Livy
E2E Data Pipeline - Apache Spark/Airflow/Livy
 
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
 
Codeless pipelines with pulsar and flink
Codeless pipelines with pulsar and flinkCodeless pipelines with pulsar and flink
Codeless pipelines with pulsar and flink
 
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-RampUsing Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
 
Apache Deep Learning 201
Apache Deep Learning 201Apache Deep Learning 201
Apache Deep Learning 201
 
Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31
Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31
Apache Deep Learning 101 - ApacheCon Montreal 2018 v0.31
 
Apache Deep Learning 201 - Philly Open Source
Apache Deep Learning 201 - Philly Open SourceApache Deep Learning 201 - Philly Open Source
Apache Deep Learning 201 - Philly Open Source
 
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
DevOps Fest 2020. Даніель Яворович. Data pipelines: building an efficient ins...
 
Introduction to data flow management using apache nifi
Introduction to data flow management using apache nifiIntroduction to data flow management using apache nifi
Introduction to data flow management using apache nifi
 
Mule soft meetup_chandigarh_#7_25_sept_2021
Mule soft meetup_chandigarh_#7_25_sept_2021Mule soft meetup_chandigarh_#7_25_sept_2021
Mule soft meetup_chandigarh_#7_25_sept_2021
 
OSMC 2010 | Monitoring mit Icinga by Icinga Team
OSMC 2010 | Monitoring mit Icinga by Icinga TeamOSMC 2010 | Monitoring mit Icinga by Icinga Team
OSMC 2010 | Monitoring mit Icinga by Icinga Team
 
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
Designing Event-Driven Applications with Apache NiFi, Apache Flink, Apache Sp...
 
[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming
 
Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar
 

Más de Timothy Spann

April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024Timothy Spann
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...Timothy Spann
 
28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-PipelinesTimothy Spann
 
TCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTimothy Spann
 
2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-ProfitsTimothy Spann
 
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...Timothy Spann
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsTimothy Spann
 
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Timothy Spann
 
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI PipelinesTimothy Spann
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkTimothy Spann
 
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...Timothy Spann
 
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesOSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesTimothy Spann
 
Building Real-Time Travel Alerts
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel AlertsTimothy Spann
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkTimothy Spann
 
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data PipelinesTimothy Spann
 
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoEvolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoTimothy Spann
 
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC MeetupTimothy Spann
 
CoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFiCoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFiTimothy Spann
 

Más de Timothy Spann (20)

April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...2024 XTREMEJ_  Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
2024 XTREMEJ_ Building Real-time Pipelines with FLaNK_ A Case Study with Tra...
 
28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines28March2024-Codeless-Generative-AI-Pipelines
28March2024-Codeless-Generative-AI-Pipelines
 
TCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI PipelinesTCFPro24 Building Real-Time Generative AI Pipelines
TCFPro24 Building Real-Time Generative AI Pipelines
 
2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits2024 Build Generative AI for Non-Profits
2024 Build Generative AI for Non-Profits
 
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
2024 February 28 - NYC - Meetup Unlocking Financial Data with Real-Time Pipel...
 
Conf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python ProcessorsConf42-Python-Building Apache NiFi 2.0 Python Processors
Conf42-Python-Building Apache NiFi 2.0 Python Processors
 
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
Conf42Python -Using Apache NiFi, Apache Kafka, RisingWave, and Apache Iceberg...
 
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
2024 Feb AI Meetup NYC GenAI_LLMs_ML_Data Codeless Generative AI Pipelines
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
 
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
NY Open Source Data Meetup Feb 8 2024 Building Real-time Pipelines with FLaNK...
 
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time PipelinesOSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
OSACon 2023_ Unlocking Financial Data with Real-Time Pipelines
 
Building Real-Time Travel Alerts
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel Alerts
 
JConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and Flink
 
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
 
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoEvolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
 
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
26Oct2023_Adding Generative AI to Real-Time Streaming Pipelines_ NYC Meetup
 
CoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFiCoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFi
 

Último

A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfYashikaSharma391629
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Natan Silnitsky
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 

Último (20)

A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 

ApacheCon 2021 - Apache NiFi Deep Dive 300

  • 1. Apache NiFi Deep Dive 300 Timothy Spann Developer Advocate / StreamNative
  • 2. Agenda Tuesday 17:10 UTC Apache NIFi Deep Dive 300 For Data Engineers who have flows already in production, I will dive deep into best practices, advanced use cases, performance optimizations, tips, tricks, edge cases, and interesting examples. This is a master class for those looking to learn quickly things I have picked up after years in the field with Apache NiFi in production. This will be interactive and I encourage questions and discussions.You will take away examples and tips in slides, github, and articles. This talk will cover: ● Load Balancing ● Parameters and Parameter Contexts ● Stateless vs Stateful NiFi ● Reporting Tasks ● NiFi CLI ● NiFi REST Interface ● DevOps ● Advanced Record Processing ● Schemas ● RetryFlowFile ● Lookup Services ● Record Path ● Expression Language ● Advanced Error Handling Techniques
  • 3. My Other Talks & Apache Pulsar Talks ● Tuesday 17:10 UTC - Apache NIFi Deep Dive 300 by Tim Spann ● Tuesday 18:00 UTC - Apache Deep Learning 302 by Tim Spann ● Wednesday 15:00 UTC - Smart Transit: Real-Time Transit Information with FLiP by David Kjerrumgaard & Tim Spann ● Wednesday 15:50 UTC - Replicated Subscriptions: taking Apache Pulsar Geo-Replication to next level by Matteo Merli ● Wednesday 17:10 UTC - Cracking the Nut, Solving Edge AI... by David Kjerrumgaard & Tim Span ● Wednesday 17:10 UTC - Exclusive Producer: Using Apache Pulsar to build distributed applications by Matteo Merli ● Thursday 14:10 UTC - Apache NiFi 101: Introduction and Best Practices - Tim Spann
  • 4. © 2020 Cloudera, Inc. All rights reserved. 4 https://github.com/tspannhw https://www.datainmotion.dev/
  • 6. Timothy Spann Developer Advocate, StreamNative Tim Spann is a Developer Advocate for StreamNative. He works with StreamNative Cloud, Apache Pulsar, Apache Flink, Flink SQL, Apache NiFi, MiniFi, Apache MXNet, TensorFlow, Apache Spark, big data, the IoT, machine learning, and deep learning. Tim has over a decade of experience with the IoT, big data, distributed computing, streaming technologies, and Java programming.
  • 7. FLaNK and FLiP Stacks ● Apache Flink ● Apache NiFi ● Apache Kafka ● Apache Flink ● Apache Pulsar ● StreamNative's Flink Connector for Pulsar ● Apache +++ Apache projects are the way for all streaming use cases.
  • 8. StreamNative Solution Application Messaging Data Pipelines Real-time Contextual Analytics Tiered Storage APP Layer Computing Layer Storage Layer StreamNative Platform IaaS Layer Micro Service Notification Dashboard Risk Control Auditing Payment ETL
  • 9. All Data - Anytime - Every Cloud - Multi-Protocol Multi- inges t Multi- inges t Multi-ingest Merge Priority
  • 10. Pick Data Source(s) Validate and Convert Aggregate or Eliminate Send to Data Sink(s) NiFi Flows Apache Pulsar, Databases, Files, REST Endpoints, Cloud Resources, NoSQL, JMS, MQTT,TCP/IP, etc... Validate types, nulls, convert types and check against schemas. build up key data with multiple sources or removing unneeded via Queries on records Apache Pulsar, Databases, Files, REST Endpoints, Cloud Resources, NoSQL, JMS, MQTT,TCP/IP, etc...
  • 12. Version Control (Github and Beyond)
  • 14. Record Processors https://www.datainmotion.dev/2019/03/advanced-xml-processing-with-apache.html ● XML, CSV, JSON, AVRO and more ● Schemas or Inferred Schemas ● Easily convert between them ● Support SQL with Apache Calcite
  • 17.
  • 19. SOLR Connectors ● XML, CSV, JSON, AVRO and more ● Schemas or Inferred Schemas ● Use Records or Raw Text ● Support SQL with Apache Calcite
  • 22. DevOps: Automate All The Things https://www.datainmotion.dev/2021/01/automating-starting-services-in-apache.html nifi-toolkit/bin/cli.sh nifi pg-enable-services -u http://server:8080 --processGroupId root nifi pg-status -u http://server:8080 -verbose -pgid a8e909d9f-adf-asdsdf curl 'http://server:8080/nifi-api/flow/parameter-contexts' nifi pg-list -u http://server:8080
  • 23. No More Spaghetti Flows - DO NOT https://dev.to/tspannhw/no-more-spaghetti-flows-2emd Do Not ● Do not Put 1,000 Flows on one workspace. ● If your flow has hundreds of steps, this is a Flow Smell. Investigate why. ● Do not Use ExecuteProcess, ExecuteScripts or a lot of Groovy scripts as a default, look for existing processors ● Do not Use Random Custom Processors you find that have no documentation or are unknown. ● Do not forget to upgrade, if you are running anything before Apache NiFi 1.14, upgrade now! ● Do not run on default 512M RAM. ● Do not run one node and think you have a highly available cluster. ● Do not split a file with millions of records to individual records in one shot without checking available space/memory and back pressure. ● Use Split processors only as an absolute last resort. Many processors are designed to work on FlowFiles that contain many records or many lines of text. Keeping the FlowFiles together instead of splitting them apart can often yield performance that is improved by 1-2 orders of magnitude.
  • 24. No More Spaghetti Flows - DO https://dev.to/tspannhw/no-more-spaghetti-flows-2emd Do ● Reduce, Reuse, Recycle. Use Parameters to reuse common modules. ● Put flows, reusable chunks (write to Slack, Database, Kafka) into separate Process Groups. ● Write custom processors if you need new or specialized features ● Use Record Processors everywhere ● Read the Docs! ● Use the NiFi Registry for version control. ● Use NiFi CLI and DevOps for Migrations. ● Walk through your flow and make sure you understand every step and it’s easy to read and follow. Is every processor used? Are there dead ends? ● Do run Zookeeper on different nodes from Apache NiFi. ● Use routing based on content and attributes to allow one flow to handle multiple nearly identical flows is better than deploying the same flow many times with tweaks to parameters in same cluster. ● Use the correct driver for your database. There's usually a couple different JDBC drivers.
  • 25. Split JSON and EvaluateJSONPath https://dzone.com/articles/lets-build-a-simple-ingest-to-cloud-data-warehouse
  • 26. Query Record - SQL https://dzone.com/articles/lets-build-a-simple-ingest-to-cloud-data-warehouse
  • 32.
  • 34. Apache Pulsar Overview ● Pub-Sub ● Geo-Replication ● Pulsar Functions ● Horizontal Scalability ● Multi-tenancy ● Tiered Persistent Storage ● Pulsar Connectors ● REST API ● CLI ● Many clients available ● Four Different Subscription Types ● Multi-Protocol Support ○ MQTT ○ AMQP ○ JMS ○ Kafka ○ ...
  • 35. What are the Benefits of Pulsar with NiFi? Data Durability Scalability Geo-Replication Multi-Tenancy Unified Messaging Model
  • 37. ● https://www.datainmotion.dev/2020/06/no-more-spaghetti-flows.html ● https://github.com/tspannhw/EverythingApacheNiFi ● https://www.datainmotion.dev/2019/03/apache-nifi-101.html ● https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html ● https://pierrevillard.com/best-of-nifi/ ● https://blogs.apache.org/nifi/ ● https://www.nifi.rocks/documents/nifi-expression-language-cheat-sheet.pdf ● https://dev.to/tspannhw/new-features-of-apache-nifi-1-13-0-45ln ● https://dev.to/tspannhw/tracking-satellites-with-apache-nifi-44o7 ● https://www.datainmotion.dev/2020/07/sizing-your-apache-nifi-cluster-for.html ● https://www.datainmotion.dev/2021/01/flank-using-apache-kudu-as-cache-for.html ● https://www.datainmotion.dev/2020/12/basic-understanding-of-cloudera-flow.html Deeper Content @PaasDev https://datainmotion.dev/ timothyspann