Navigating the Data Landscape: Exploring Key Elements of Superset and Kafka for a Real-Time Analytics Platform
I changed the nature of my personal sprint objective, moving slightly away from AI concerns to reconnect with more general considerations about data and its processing: from collection to visualization. Ultimately, of course, this processing could be injected with AI or ML. As always, when you start thinking about such a vast subject, the first question is where to land to define the scope. In this case, I wondered how to approach the question of data.
So, after compiling several resources, here are the two things that seem most important to me for now:
- How can I improve the visualisation of existing data, e.g. a CSV file?
- What are the ways to improve data collection?
For this post, you can find all files for each project on my GitHub account. See https://github.com/bflaven/BlogArticlesExamples/tree/master/how_to_use_superset_kafka_agile2
Key ideas that should drive a POC
As always, I read some material that I found inspiring and noted everything down. Sometimes I find consolation and advice that act like mantras when you start a POC, helping to avoid WIP hell.
Learning is an active process. We learn by doing. Only knowledge that is used sticks in your mind. – Dale Carnegie
For the record, some good stuff from Agile2
Apparently, Agile is dead; long live Agile2.
I just read this article: https://medium.com/developer-rants/agile-has-failed-officially-8136b0522c49
Funny, this reminds me of several things:
Like any belief or ideology, it is healthy to question and criticize any “dominant” model of thought. The article is quite provocative and, indeed, sensible for two reasons:
- Agile without leadership is worthless, it amounts to submitting to stupid and pointless formalism 🙂
- Agile 2 is also a clever way to sell even more, it’s a bit like Persil laundry detergent, a new formula. We sell you the same product but with extra soul.
A good reminder to meditate on.
At some point, a project must produce a final product.
Some other excerpts from the Agile2 core values on some specific aspects for project management.
(i) Planning, Transition & Transformation
- Any initiative requires both a vision or goal, and a flexible, steerable, outcome-oriented plan.
- Any significant transformation is mostly a learning journey – not merely a process change.
- Product development is mostly a learning journey – not merely an “implementation.”
(ii) Product, Portfolio & Stakeholders
- Obtain feedback from the market and stakeholders continuously.
- Work iteratively in small batches.
- The only proof of value is a business outcome.
- Organizations need an “inception framework” tailored to their needs.
- Create documentation to share and deepen understanding.
(iii) Continuous Improvement
- Place limits on things that cause drag.
- Integrate early and often.
Streamline the data analytics flow…
Anyway, let’s get back to the main course. I was about to run a complete benchmark of the market solutions: Superset, Redash and even ClickHouse. But in the end, Superset is enough for my product discovery.
I was looking for some ideas on how to improve data analytics flow. Indeed, it is always useful “to set up a system that allows you to get a deeper understanding of the behaviour of your customers”.
Source: https://xebia.com/blog/real-time-analytics-divolte-kafka-druid-superset/
Having an alternative pipeline that could be called a “real-time analytics platform” can help to better perform:
- Descriptive analysis: analyzing the data and describing what they say at a given time;
- Predictive analysis: predicting a potential result based on data extracted from past or current activities.
Based on these two posts, I decided to go for:
- Improve visualisation of existing data with Superset.
- Explore quickly Kafka to improve data collection.
1. Superset with Docker
Superset is the most intuitive tool that I found to improve visualisation. Here is a quick way to install and manage Superset with Docker.
# go to path
cd /Users/brunoflaven/Documents/01_work/blog_articles/how_to_use_superset/

# command to open docker
open -a docker

# clone the dir
git clone https://github.com/apache/superset.git superset

# get into the dir
cd superset

# get the superset stuff
docker compose -f docker-compose-non-dev.yml pull

# start the superset stuff
docker compose -f docker-compose-non-dev.yml up

# start using Superset
# http://localhost:8088
# username: admin
# password: admin
2. Connecting Superset to Databases
You need to have some databases, e.g. mariadb, mongodb, mysql or postgresql, installed on your machine for Superset to have something to query. To make sure Superset, which runs inside Docker, can connect to your local database, you need to use the hostname docker.for.mac.host.internal instead of localhost.
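For instance, when filling in Superset’s database connection form, the SQLAlchemy URIs look like this (a sketch assuming the users, passwords and database names created later in this post):

# MariaDB/MySQL, reachable from inside the Superset containers
mysql://root:root@docker.for.mac.host.internal:3306/mydatabase_try_mariadb

# PostgreSQL
postgresql://postgres:password@docker.for.mac.host.internal:5432/mydatabase_try_postgresql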
On a Mac, the best way to install these databases is to use Homebrew. Here are the commands.
# list services for database
brew services list

# classical commands before install
brew update
brew upgrade
brew doctor
To install Homebrew if you haven’t already: https://brew.sh/
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/main/install.sh)"

# Update the Homebrew formulae:
brew update
2.1 MARIADB
MariaDB is an open-source, community-developed relational database management system (RDBMS) that serves as a drop-in replacement for MySQL. It offers a robust and flexible SQL engine with features such as stored procedures, views, subqueries, and triggers.
Advantages of MariaDB:
- Compatibility: MariaDB is highly compatible with MySQL, making it easy to migrate existing databases without having to rewrite code or modify applications.
- Performance: MariaDB offers improved performance compared to MySQL through optimizer enhancements and asynchronous replication. This results in faster query processing and reduced latency in real-time environments.
- Scalability: MariaDB is designed for scalability, with features such as partitioning that can help handle large datasets efficiently while ensuring optimal performance.
# Managing the mariadb database
brew search mariadb
brew install mariadb
brew services start mariadb
brew services stop mariadb

# connect to mariadb (no password)
mysql
mysql -u brunoflaven

# create a root user with all privileges
CREATE USER 'root'@'localhost' IDENTIFIED BY 'root';
# CREATE USER 'root'@'%' IDENTIFIED BY 'root';
SELECT user, is_role, default_role FROM mysql.user;
GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY 'root';
FLUSH PRIVILEGES;
SHOW GRANTS FOR 'root'@'localhost';

# connection infos for Superset
# select mysql
# add port 3306
# add host docker.for.mac.host.internal
# db_name: mydatabase_try_mariadb
# user: root
# pwd: root

# useful commands
# create a database
CREATE DATABASE mydatabase_try_mariadb;
USE mydatabase_try_mariadb;
CREATE TABLE testtable (
  id int not null primary key,
  name varchar(20) not null,
  lastupdate timestamp not null
);

# insert
INSERT INTO testtable (id, name, lastupdate) VALUES (1, 'Sample name', '2022-09-22 18:53');
INSERT INTO testtable (id, name, lastupdate) VALUES (2, 'Sample name 2', '2022-09-22 18:54');

# update
UPDATE testtable SET name = 'updated name' WHERE id = 1;

# delete the record with the id equal to 4
DELETE FROM testtable WHERE id = 4;

# select all content from the table testtable
SELECT * FROM testtable;

# drop the table
DROP TABLE testtable;

# empty the table
TRUNCATE testtable;
2.2 POSTGRES
PostgreSQL (Postgres) is a powerful, open-source object-relational database system with a strong emphasis on reliability, data integrity, and correctness. It supports a wide range of data types, including geographical data, large objects such as images, JSON, and XML documents, and advanced features like stored procedures, triggers, and rules.
Advantages of PostgreSQL:
- Robustness: Postgres is known for its robustness in handling complex queries, concurrency, and reliability. It follows the ACID (Atomicity, Consistency, Isolation, Durability) principles to ensure data integrity.
- Extensibility: Postgres offers a rich ecosystem with built-in support for various languages and data types. Its plugin architecture allows for seamless integration of new features and functionality without altering the core system.
- Compatibility: PostgreSQL is highly compatible with many popular database systems, including SQL Server, Oracle, MySQL, and DB2. This makes it easy to migrate existing applications.
# install and start postgresql with homebrew
brew search postgresql
brew install postgresql
brew services start postgresql
brew services stop postgresql

# connect to postgres
psql postgres

# way_1 to connect to postgresql
# in the console
createuser -s postgres
# in the postgres client
ALTER USER postgres WITH PASSWORD 'password';

# way_2 to connect to postgresql
# in the postgres client
CREATE ROLE root WITH LOGIN PASSWORD 'root';
ALTER ROLE root CREATEDB;

# connect to postgres in a terminal
psql postgres

# your username should be listed
postgres=# \du

# quit the client
postgres=# \q

# and then:
psql -U brunoflaven postgres
psql -U root postgres

# list all databases
postgres=# \list
postgres=# \l

# connect to a certain database
postgres=# \c <db_name>
# examples with real postgres databases
postgres=# \c postgres
postgres=# \c mydatabase_try_postgresql
postgres=# \c template1

# list all tables in the current database using your search_path
postgres=# \dt

# create the tables for a database named "mydatabase_try_postgresql"
# create the database first, in the console
createdb mydatabase_try_postgresql
# connect to the newly created database
\c mydatabase_try_postgresql

# Create 'users' table
CREATE TABLE users (
  user_id SERIAL PRIMARY KEY,
  username VARCHAR(50) NOT NULL,
  email VARCHAR(100) NOT NULL
);

# Create 'orders' table
CREATE TABLE orders (
  order_id SERIAL PRIMARY KEY,
  user_id INT REFERENCES users(user_id),
  order_date DATE,
  total_amount DECIMAL(10, 2) NOT NULL
);

# configure postgresql db in superset
# Not working: localhost or 127.0.0.1 on Mac
# Working: docker.for.mac.host.internal on Mac
# select postgresql
# add port 5432
# add host docker.for.mac.host.internal
# db_name: mydatabase_try_postgresql
# user: postgres
# pwd: password

# create/drop the database from the console
# createdb mydatabase_try_postgresql
# dropdb mydatabase_try_postgresql
This time, for this POC, I did not install mysql and mongodb, as I just want a “quick and dirty” CSV conversion into Superset.
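As a sketch of that “quick and dirty” route, pandas can push a CSV straight into the local Postgres database, which Superset can then register as a dataset. This assumes pandas, SQLAlchemy and a Postgres driver are installed; the CSV path and table name are placeholders:

# Push a CSV into local Postgres so Superset can chart it.
# Assumes: pip install pandas sqlalchemy psycopg2-binary
import pandas as pd
from sqlalchemy import create_engine

# placeholder file; any sample CSV works (see datablist/sample-csv-files in the links below)
df = pd.read_csv("customers-100.csv")

# same credentials as the Postgres setup above
engine = create_engine(
    "postgresql://postgres:password@localhost:5432/mydatabase_try_postgresql"
)

# write the dataframe as a table that Superset can register as a dataset
df.to_sql("customers", engine, if_exists="replace", index=False)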
Customization with .env for Superset
Modify the .env files for environment-specific configurations and install additional Python packages by adding them to requirements-local.txt.
You can use a combination of a docker-compose.yml file and a .env file to install Superset with docker-compose.
# You can create some_random_base64_string using this command in a shell
openssl rand -base64 42
# OUTPUT: uKqlflwJGdDH/+NpwuRhJh8mZrNsTGu45OMT7akZhGhaBlOqkkOR0xMP

# in the mac terminal, define the SUPERSET_SECRET_KEY
export SUPERSET_SECRET_KEY="uKqlflwJGdDH/+NpwuRhJh8mZrNsTGu45OMT7akZhGhaBlOqkkOR0xMP"
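As a sketch, the same key can also live in the Docker .env file, and extra drivers go into requirements-local.txt, assuming the docker/.env and docker/requirements-local.txt locations used by Superset’s docker-compose setup; the packages listed are only examples:

# docker/.env — persist the secret key instead of exporting it each time
SUPERSET_SECRET_KEY=uKqlflwJGdDH/+NpwuRhJh8mZrNsTGu45OMT7akZhGhaBlOqkkOR0xMP

# docker/requirements-local.txt — one extra Python package per line
psycopg2-binary
mysqlclient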
# Lists containers (and tells you which images they are spun from)
docker ps -a

# Lists images
docker images

# Removes a stopped container
docker rm <container_id>

# Forces the removal of a running container (uses SIGKILL)
docker rm -f <container_id>

# Removes an image
# Will fail if there is a running instance of that image i.e. container
docker rmi <image_id>

# Forces removal of image even if it is referenced in multiple repositories,
# i.e. same image id given multiple names/tags
# Will still fail if there is a docker container referencing image
docker rmi -f <image_id>

# housekeeping commands for docker
docker info
docker container prune
docker image prune
docker volume prune
docker network prune
docker system prune
docker system prune -a
3. Collecting data: Using Kafka, a cornerstone
Apache Kafka is a distributed streaming platform that is widely used for building real-time data pipelines and streaming applications. In the context of an analytics application, Kafka plays a crucial role in handling the flow of data between different components of the application. It provides a scalable, fault-tolerant, and high-throughput messaging system that allows seamless communication between various modules of the analytics application.
Here are some key purposes of Kafka within an analytics application:
- Data Ingestion: Kafka acts as a central hub for ingesting data from various sources such as databases, logs, sensors, and other systems. It enables the application to handle large volumes of incoming data in a scalable and efficient manner.
- Event Streaming: Kafka allows the streaming of events in real-time. This is beneficial for analytics applications that require continuous processing of data, enabling real-time insights and analytics.
- Decoupling of Components: Kafka helps in decoupling different components of the analytics application. Producers can publish data to Kafka topics without worrying about who will consume it, and consumers can subscribe to the topics they are interested in.
- Fault Tolerance and Durability: Kafka ensures fault tolerance by replicating data across multiple nodes. This makes it a reliable and durable solution, ensuring that data is not lost in case of failures.
Source: https://kafka.apache.org/
We are going to take the most straightforward way on a Mac, meaning using Homebrew like we did previously for the databases, so the commands will look familiar.
Again, if you need to install Homebrew, type the following command in the console.
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
To install Kafka with Homebrew:
brew install kafka
To list the services from Homebrew
brew services list
If you’ve installed Kafka and Zookeeper using Homebrew on your macOS system, you can use the following commands to interact with Kafka and gain an understanding of its general principles.
1. Start Zookeeper:
Zookeeper is a prerequisite for Kafka, and it manages distributed configurations and synchronization between nodes. Open a terminal and start Zookeeper:
brew services start zookeeper
2. Start Kafka Server:
Now, start the Kafka server using Homebrew:
brew services start kafka
This command will start the Kafka server as a background service.
3. Create a Topic:
Kafka organizes data into topics. Create a Kafka topic to publish and subscribe messages:
kafka-topics --create --topic brunotopic1 --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
Replace `brunotopic1` with the desired topic name.
4. List Topics:
List the existing Kafka topics:
kafka-topics --list --bootstrap-server localhost:9092
5. Produce Messages:
Produce some messages to the topic:
kafka-console-producer --topic brunotopic1 --bootstrap-server localhost:9092
This command opens a console where you can type messages. Press `Ctrl + D` to exit.
6. Consume Messages:
Open a new terminal and consume messages from the topic:
kafka-console-consumer --topic brunotopic1 --bootstrap-server localhost:9092 --from-beginning
This command subscribes to the topic and prints incoming messages.
7. Describe a Topic:
Describe the properties of a Kafka topic:
kafka-topics --describe --topic brunotopic1 --bootstrap-server localhost:9092
8. Kafka Commands Documentation:
Explore additional Kafka commands and options by checking the official documentation:
kafka-topics --help
kafka-console-producer --help
kafka-console-consumer --help
9. Stop Kafka:
When you’re done, you can stop the Kafka server:
brew services stop kafka
brew services stop zookeeper
This will stop the Kafka and Zookeeper services running in the background.
These commands provide a basic overview of Kafka’s functionalities. You can experiment further and refer to the [official documentation](https://kafka.apache.org/documentation/) for more in-depth understanding and configuration options.
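Before reaching for a framework, the same produce/consume round trip can be scripted in plain Python. A minimal sketch, assuming the kafka-python package (pip install kafka-python) and the broker and topic created in the walkthrough above:

from kafka import KafkaConsumer, KafkaProducer

# publish two messages to the topic created earlier
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("brunotopic1", b"hello from python")
producer.send("brunotopic1", b"a second message")
producer.flush()  # block until the messages are actually sent

# read everything back from the beginning of the topic
consumer = KafkaConsumer(
    "brunotopic1",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",  # same effect as --from-beginning
    consumer_timeout_ms=5000,      # stop iterating after 5 seconds of silence
)
for message in consumer:
    print(message.value.decode("utf-8"))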
Using FastStream
To go further, you can leverage FastStream, a kind of FastAPI for Kafka. Indeed, FastStream simplifies the process of writing producers and consumers for message queues, handling all the parsing, networking and documentation generation automatically.
Source: https://faststream.airt.ai/latest/faststream/
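Here is a minimal sketch adapted from the FastStream getting-started documentation (pip install "faststream[kafka]"); the topic name reuses the one created in the walkthrough above:

from faststream import FastStream
from faststream.kafka import KafkaBroker

broker = KafkaBroker("localhost:9092")
app = FastStream(broker)

@broker.subscriber("brunotopic1")
async def on_message(body: str) -> None:
    # FastStream decodes the raw Kafka bytes into the annotated type
    print(f"received: {body}")

# run it with: faststream run your_module:app

The decorator-based API is what makes it feel like FastAPI: you declare what you consume, and FastStream handles the connection, the deserialization and an AsyncAPI schema for you.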
More info
Superset
- Welcome | Superset
https://superset.apache.org/
- Apache Superset Tutorial · Start Data Engineering
https://www.startdataengineering.com/post/apache-superset-tutorial/
- Introduction | Superset
https://superset.apache.org/docs/intro/
- Setting up Superset GitHub Integration: 3 Easy Methods
https://hevodata.com/learn/superset-github/#l12
- Course Bytes – YouTube
https://www.youtube.com/@coursebytes
- Preset Cloud – Modern Open Source BI Platform | Preset
https://preset.io/product/
- Get started with Apache Superset and PostgreSQL®
https://aiven.io/blog/get-started-with-apache-superset-and-postgresql
- Tutorial – Creating your first dashboard — Apache Superset documentation
https://apache-superset.readthedocs.io/en/0.28.1/tutorial.html
- Introduction à Apache Superset – datacorner par Benoit Cayla
https://datacorner.fr/introduction-a-apache-superset/
- Documentation Superset — Restack
https://www.restack.io/docs/superset
- GitHub – apache/superset: Apache Superset is a Data Visualization and Data Exploration Platform
https://github.com/apache/superset
- Apache Superset Tutorial | Censius Blog
https://censius.ai/blogs/apache-superset-tutorial
- Fully managed Redash — Restack
https://www.restack.io/store/redash
- Apache Superset Overview Video – YouTube
https://www.youtube.com/watch?v=kGfUIOK87V8
- Redash vs Superset: Which data visualization tool should you select? – YouTube
https://www.youtube.com/watch?v=U33wA0gW01M
- Comment utiliser l’IA pour l’analyse des données ? Expliqué avec plusieurs cas d’utilisation
https://www.edrawsoft.com/fr/ai-tools-tips/ai-for-data-analysis.html
- Le rôle de l’IA et du machine learning – EDHEC Online
https://online.edhec.edu/fr/blog/le-role-de-lia-et-du-machine-learning/
- AI Data Analysis Tools – Geekflare
https://geekflare.com/fr/ai-data-analysis-tools/
- 6 Best AI Tools for Data Analysts (January 2024) – Unite.AI
https://www.unite.ai/ai-tools-data-analysts/
- AutoML and AutoAI – IBM Watson Studio
https://www.ibm.com/products/watson-studio/autoai
- How to Perform Data Analysis in Python Using the OpenAI API — SitePoint
https://www.sitepoint.com/openai-api-python-data-analysis/
- GitHub – Apress/python-data-analytics-2e: Source Code for ‘Python Data Analytics, 2nd Edition’ by Fabio Nelli
https://github.com/Apress/python-data-analytics-2e
- Python for Data Analysis, 3E
https://wesmckinney.com/book/
- 100+ AI Use Cases & Applications: In-Depth Guide for 2024
https://research.aimultiple.com/ai-usecases/#ai-use-cases-for-marketing
- Devinterview.io – Ace your next tech interview with confidence in 2024
https://devinterview.io/
- GitHub – DerekKane/Use-Cases-Data-Science: A list of working examples of Data Science Use Cases and Applications by Industry
https://github.com/DerekKane/Use-Cases-Data-Science
- Fast Open-Source OLAP DBMS – ClickHouse
https://clickhouse.com/
- GitHub – ClickHouse/ClickHouse: ClickHouse® is a free analytics DBMS for big data
https://github.com/ClickHouse/ClickHouse
- GitHub – apache/zeppelin: Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more
https://github.com/apache/zeppelin
- GitHub – datafuselabs/databend: Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics
https://github.com/datafuselabs/databend
- GitHub – apache/spark: Apache Spark – A unified analytics engine for large-scale data processing
https://github.com/apache/spark
- GitHub – metabase/metabase: The simplest, fastest way to get business intelligence and analytics to everyone in your company
https://github.com/metabase/metabase
- GitHub – getredash/redash: Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data
https://github.com/getredash/redash
- GitHub – microsoft/ML-For-Beginners: 12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
https://github.com/microsoft/ML-For-Beginners
- How to setup PostgreSQL on MacOS
https://www.robinwieruch.de/postgres-sql-macos-setup/
- Apache Superset on Mac M1 Guide — Restack
https://www.restack.io/docs/superset-knowledge-apache-superset-mac-m1-guide
- Advanced Apache Superset for Data Engineers — Restack
https://www.restack.io/docs/superset-advanced-apache-superset-data-engineers
- GitHub – kkiaune/emails-classification: Emails classification template using python. Technology stack: docker-compose, jupyter notebooks, fastapi, airflow, posgreSQL, superset, minio, portainer, pgAdmin
https://github.com/kkiaune/emails-classification
- GitHub – JDiego199/superset-docker-compose
https://github.com/JDiego199/superset-docker-compose/
- GitHub – mauricioobgo/StreamingProject: This Docker Compose project establishes a data pipeline with Apache Spark, Kafka, Cassandra, MySQL, and Superset. Engineered for real-time processing of streaming data, it stores results in distributed databases and offers visualization through Superset dashboards
https://github.com/mauricioobgo/StreamingProject
- GitHub – insight-infrastructure/superset-docker-compose: Superset deployed with docker-compose and features
https://github.com/insight-infrastructure/superset-docker-compose
- Supercharging Apache Superset | by Airbnb | The Airbnb Tech Blog
https://medium.com/airbnb-engineering/supercharging-apache-superset-b1a2393278bd
- Running Apache Superset at Scale | by Mahdi Karabiben | Towards Data Science
https://towardsdatascience.com/running-apache-superset-at-scale-1539e3945093
- How Airbnb Achieved Metric Consistency at Scale | by Robert Chang | The Airbnb Tech Blog | Medium
https://medium.com/airbnb-engineering/how-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70
- Pricing | Preset
https://preset.io/pricing/
- Connecting Your Data
https://docs.preset.io/v1/docs/connecting-your-data
- How to build a real-time analytics platform using Kafka, ksqlDB and ClickHouse? | by Florian Hussonnois | StreamThoughts | Medium
https://medium.com/streamthoughts/how-to-build-a-real-time-analytical-platform-using-kafka-ksqldb-and-clickhouse-bfabd65d05e4
- Build a Real-Time Event Streaming Pipeline with Kafka, BigQuery & Looker Studio | by Tobi Sam | Towards Data Science
https://towardsdatascience.com/real-time-event-streaming-with-kafka-bigquery-69c3baebb51e
- Stream data with open source Kafka by Aiven, analyze with BigQuery | Google Cloud Blog
https://cloud.google.com/blog/products/data-analytics/stream-data-with-open-source-kafka-by-aiven-analyze-with-bigquery
- Restack – YouTube
https://www.youtube.com/@Restackio
- Better Data Science – YouTube
https://www.youtube.com/@BetterDataScience/videos
- Overview of Real-Time Analytics – Microsoft Fabric | Microsoft Learn
https://learn.microsoft.com/en-us/fabric/real-time-analytics/overview#what-makes-real-time-analytics-unique
- Build a real-time analytics pipeline in less time than your morning bus ride
https://aiven.io/blog/build-a-real-time-analytics-pipeline
- What is real-time analytics? – DEV Community
https://dev.to/tinybirdco/what-is-real-time-analytics-5ah3
- Building real-time analytics into your next project – DEV Community
https://dev.to/tinybirdco/building-real-time-analytics-into-your-next-project-3b6n
- Real-time analytics with stream processing and OLAP | CNCF
https://www.cncf.io/blog/2023/08/08/real-time-analytics-with-stream-processing-and-olap/
- Real-time analytics on big data architecture – Azure Solution Ideas | Microsoft Learn
https://learn.microsoft.com/en-us/azure/architecture/solution-ideas/articles/real-time-analytics
- Real-time Databases: What developers need to know
https://www.tinybird.co/blog-posts/real-time-databases-what-developers-need-to-know
- Real time analytics: Airflow + Kafka + Druid + Superset – [Eng] | Duy Nguyen
https://duynguyenngoc.com/posts/real-time-analytics-airflow-kafka-druid-superset/
- Realtime data streaming with Apache Kafka, Apache Pinot, Apache Druid and Apache Superset | by Bruno Cardoso Farias | Medium
https://medium.com/@emergeit/realtime-data-streaming-with-apache-kafka-apache-pinot-apache-druid-and-apache-superset-e67161eb9666
- Unlock Advanced Data Visualization: The Complete Guide to Installing and Using Apache Superset on Linux | by Rathish Kumar B | Level Up Coding
https://levelup.gitconnected.com/unlock-advanced-data-visualization-the-complete-guide-to-installing-and-using-apache-superset-on-afecb3c63889
- Kafka in 100 Seconds – YouTube
https://www.youtube.com/watch?v=uvb00oaa3k8
- Apache Superset: Installing locally is easy using the makefile – DEV Community
https://dev.to/lyndsiwilliams/apache-superset-installing-locally-is-easy-using-the-makefile-4ofi
- Setting up a local Apache Kafka instance for testing – Sahan Serasinghe – Engineering Blog
https://sahansera.dev/setting-up-kafka-locally-for-testing/
- Postico 2
https://eggerapps.at/postico2/
- DBeaver Community | Free Universal Database Tool
https://dbeaver.io/
- Manage MySQL, MongoDB and PostgreSQL using Homebrew Services
https://www.chrisjmendez.com/2017/02/04/homebrew-services/
- Install MongoDB, MySQL, and Postgres using Homebrew
https://www.chrisjmendez.com/2016/05/09/easy-mongodb-and-mysql-management-on-a-mac/
- GitHub – datablist/sample-csv-files
https://github.com/datablist/sample-csv-files
- Tweets Sample | Kaggle
https://www.kaggle.com/datasets/ahmedshahriarsakib/tweet-sample
- Expanding Visibility With Apache Kafka – Salesforce Engineering Blog
https://engineering.salesforce.com/expanding-visibility-with-apache-kafka-e305b12c4aba/
- How to setup Apache SuperSet – YouTube
https://www.youtube.com/watch?v=08jK2FbPMNI
Kafka
- Building an Order Delivery Analytics Application With FastAPI, Kafka, Apache Pinot, and Dash, Part 1 | by Valerio Uberti | Better Programming
https://betterprogramming.pub/building-an-order-delivery-analytics-application-with-fastapi-kafka-apache-pinot-and-dash-part-ca276a3ee631
- Building an Order Delivery Analytics Application with FastAPI, Kafka, Apache Pinot, and Dash, Part 2 | by Valerio Uberti | Better Programming
https://betterprogramming.pub/building-an-order-delivery-analytics-application-with-fastapi-kafka-apache-pinot-and-dash-part-f98202296d64
- Generating production-level streaming microservices using AI – DEV Community
https://dev.to/airtai/generating-production-level-streaming-microservices-using-ai-41ji
- An asynchronous Consumer and Producer API for Kafka with FastAPI in Python
https://vinybrasil.github.io/portfolio/kafkafastapiasync/
- GitHub – pedrodeoliveira/fastapi-kafka-consumer: A Python RESTful API using FastAPI with a Kafka Consumer
https://github.com/pedrodeoliveira/fastapi-kafka-consumer
- FastStream: Python’s framework for Efficient Message Queue Handling – DEV Community
https://dev.to/airtai/faststream-pythons-framework-for-efficient-message-queue-handling-3pd2
- Getting Started – FastStream
https://faststream.airt.ai/latest/getting-started/
- Streamlining Asynchronous Services with FastStream | NATS blog
https://nats.io/blog/nats-supported-by-faststream/
- Kafka – DEV Community
https://dev.to/t/kafka
- Demystifying Apache Kafka: An exploratory journey for newcomers – DEV Community
https://dev.to/ladmerc/demystifying-apache-kafka-an-exploratory-journey-for-newcomers-18k9
- Install Apache Kafka on macOS using Homebrew – DEV Community
https://dev.to/andremare/install-apache-kafka-on-macos-using-homebrew-5gno
- GitHub – Aiven-Labs/python-fake-data-producer-for-apache-kafka: The Python fake data producer for Apache Kafka® is a complete demo app allowing you to quickly produce JSON fake streaming datasets and push it to an Apache Kafka topic
https://github.com/Aiven-Labs/python-fake-data-producer-for-apache-kafka
- Getting started with Apache Kafka using Python – DEV Community
https://dev.to/rubnsbarbosa/getting-started-with-apache-kafka-using-python-36ko
- How to Start Using Apache Kafka in Python
https://kafkaide.com/learn/how-to-start-using-apache-kafka-in-python/
- Installing and running Apache Kafka on MacOS with Apple Silicon | by Taapas Agrawal | Medium
https://medium.com/@taapasagrawal/installing-and-running-apache-kafka-on-macos-with-m1-processor-5238dda81d51
- GitHub – BenasB/kafka-faker: User friendly and convenient Apache Kafka JSON message faking
https://github.com/BenasB/kafka-faker
- Kickstart your Kafka with Faker Data – Speaker Deck
https://speakerdeck.com/ftisiot/kickstart-your-kafka-with-faker-data
- Intro to Kafka using Docker and Python – DEV Community
https://dev.to/boyu1997/intro-to-kafka-4hn2