Kafka Tutorial - Quick Start Demo
- Published: 26 Nov 2016
- Spark Programming and Azure Databricks ILT Master Class by Prashant Kumar Pandey - Fill out the Google form for course inquiries.
forms.gle/Nxk8dQUPq4o4XsA47
-------------------------------------------------------------------
Data Engineering is one of the highest-paid jobs today.
It is likely to remain among the top IT skills for years to come.
Are you in database development, data warehousing, ETL tools, data analysis, SQL, or PL/SQL development?
I have a well-crafted success path for you.
I will help you get prepared for the data engineer and solution architect role depending on your profile and experience.
We created a course that takes you deep into core data engineering technology and helps you master it.
This course is for working professionals who want to:
1. Become a data engineer.
2. Move their career into data engineering.
3. Grow their data engineering career.
4. Earn the Databricks Spark Certification.
5. Crack Spark data engineering interviews.
ScholarNest is offering a one-stop integrated Learning Path.
The course is open for registration.
The course delivers an example-driven approach and project-based learning.
You will practice the skills using MCQs, coding exercises, and capstone projects.
The course comes with the following integrated services.
1. Technical support and Doubt Clarification
2. Live Project Discussion
3. Resume Building
4. Interview Preparation
5. Mock Interviews
Course Duration: 6 Months
Course Prerequisite: Programming and SQL Knowledge
Target Audience: Working Professionals
Batch start: Registration Started
Fill out the form below for more details and course inquiries.
forms.gle/Nxk8dQUPq4o4XsA47
--------------------------------------------------------------------------
Learn more at www.scholarnest.com/
The best place to learn Data Engineering, Big Data, Apache Spark, Databricks, Apache Kafka, Confluent Cloud, AWS Cloud Computing, Azure Cloud, Google Cloud - self-paced, instructor-led, certification courses, and practice tests.
========================================================
SPARK COURSES
-----------------------------
www.scholarnest.com/courses/s...
www.scholarnest.com/courses/s...
www.scholarnest.com/courses/s...
www.scholarnest.com/courses/s...
www.scholarnest.com/courses/d...
KAFKA COURSES
--------------------------------
www.scholarnest.com/courses/a...
www.scholarnest.com/courses/k...
www.scholarnest.com/courses/s...
AWS CLOUD
------------------------
www.scholarnest.com/courses/a...
www.scholarnest.com/courses/a...
PYTHON
------------------
www.scholarnest.com/courses/p...
========================================
We are also available on the Udemy platform.
Check out the link below for our courses on Udemy.
www.learningjournal.guru/cour...
=======================================
You can also find us on Oreilly Learning
www.oreilly.com/library/view/...
www.oreilly.com/videos/apache...
www.oreilly.com/videos/kafka-...
www.oreilly.com/videos/spark-...
www.oreilly.com/videos/spark-...
www.oreilly.com/videos/apache...
www.oreilly.com/videos/real-t...
www.oreilly.com/videos/real-t...
=========================================
Follow us on Social Media
github.com/ScholarNest
github.com/learningJournal/
========================================
Want to learn more Big Data technology courses? You can get lifetime access to our courses on the Udemy platform. Visit the link below for discounts and coupon codes.
www.learningjournal.guru/courses/
Thanks for the tutorial. Works like a charm. I typed up the commands being used so others can copy-paste:
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic MyFirstTopic1 --partitions 2 --replication-factor 1
bin/kafka-topics.sh --zookeeper localhost:2181 --list
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic MyFirstTopic1
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic MyFirstTopic1
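A hedged note for readers on newer Kafka releases: from Kafka 2.2 onward, kafka-topics.sh accepts --bootstrap-server (a broker address) in place of --zookeeper, and the --zookeeper flag was removed entirely in Kafka 3.0. The equivalent topic commands would look roughly like this (requires a running broker, so it is a sketch rather than a standalone script):

```shell
# Kafka 2.2+: address the broker directly instead of ZooKeeper
bin/kafka-topics.sh --bootstrap-server localhost:9092 --create \
  --topic MyFirstTopic1 --partitions 2 --replication-factor 1
bin/kafka-topics.sh --bootstrap-server localhost:9092 --list
```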
Thanks; however, all of those are already available for copy-paste at www.learningjournal.guru
@@ScholarNest Your course is very useful, thanks. I found it on your site but had to click around and search to locate the page. I will do your full tutorial from your site this week.
www.learningjournal.guru/courses/kafka/kafka-foundation-training/quick-start-demo/
It's a really nice tutorial series with clear explanations for anybody starting to learn Kafka for the first time. I have gone through many tutorials, and I can say this is the best available today for beginners with a little Java background. Kudos for your work. Keep posting on advanced topics.
This is a great tutorial series. Thanks a lot. Keep up the good work Sir.
Very good series of videos, thanks.
The tutorial is awesome!
Really helpful. Good for starting out, but after a few weeks of diving in, I hope you could give some in-depth video tutorials about Kafka, especially Kafka Connect and Kafka Streams. Thanks.
Fantastic tutorial. I like it. I hope more videos will come in the future.
I already have 20 videos on Kafka. They cover Kafka core. However, streams and connect will be covered in future videos.
Excellent, Very very clear
Thank you sir great lectures
It's simply superb.
Hi Sir, could you please share the Kafka tutorial document? Thank you.
Good tutorial for all levels,
Is it possible to transfer files from one location to another through Kafka, or to pull data from a DB? If so, please let me know or make a video on it.
Great explanation, and thanks for sharing it. Can you please also share something on Kerberos authentication with Kafka?
Very good tutorial. Thanks :)
Very nice, good job!
It's a really good series on Kafka. I have a query: as per my understanding, the consumer-side mechanism in Kafka is pull-based, meaning the consumer pulls data from Kafka. But in the console example, the data goes directly to the consumer as soon as the producer sends it to the broker.
Hello Sir,
Thank you for the valuable guidance!
I have one doubt. In my use case, sensors are generating real-time data, and I want to load this data into a Kafka topic over TCP/IP. How do I do it?
Hello sir, could you please share a video explaining the implementation of data extraction from MySQL and feeding it to Kafka. TIA
Awesome. Thanks.
Dear Sir, it's a really nice tutorial, but now I am facing an issue with logs. Where are the Kafka logs saved on Linux?
Thanks again for this Tutorial.
While creating the consumer, it says the command must include exactly one action: --list, --describe, --delete.
I have a question about partitions and brokers. In your demo at 6:20, you created 2 partitions, and Kafka had no choice but to put both partitions on the same broker. What if later on a new broker joins the Kafka cluster? Will Kafka reassign one of the partitions to the new broker? How do I reassign partitions to newly joined brokers manually?
That's an operations question which I haven't covered yet. It doesn't happen automatically. You have to rebalance the load manually.
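For readers who want to try the manual rebalance mentioned above, Kafka ships a kafka-reassign-partitions.sh tool. A minimal sketch, assuming the MyFirstTopic1 topic from the video and a newly joined broker with id 1 (the JSON file name and broker ids here are made up for illustration, and the commands need a running cluster):

```shell
# Describe the desired placement in a JSON plan
cat > reassign.json <<'EOF'
{
  "version": 1,
  "partitions": [
    { "topic": "MyFirstTopic1", "partition": 1, "replicas": [1] }
  ]
}
EOF

# Apply the plan, then check progress (older releases take --zookeeper,
# newer ones take --bootstrap-server)
bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
  --reassignment-json-file reassign.json --execute
bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
  --reassignment-json-file reassign.json --verify
```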
Thanks!
You are just the best.
Easy to understand, easy English.👍
you are just amazing
Hi sir,
The videos you have are very informative. However, it would be helpful if you explained how to write the commands for creating a producer and consumer.
I have other videos explaining how to create a producer and consumer. Check the playlist or get it at www.learningjournal.guru
I liked your tutorial so far, but you didn't give an overview of how it differs from existing message brokers like ActiveMQ and RabbitMQ.
Would NiFi be an alternative to Producer/Consumer without the need for APIs? What's the common practice that you see being adopted instead of API approach?
+Deepak Sharma Yes, you can think of NiFi as a pickup vehicle and Kafka as a central bus.
Tried a simple DataFlow using NiFi and Kafka to transfer a local file from/to the NiFi server (Kafka running on a different server, btw). I was able to do the transfer using GetFile->PublishKafka and ConsumeKafka->PutFile, but the original filename got lost in the final output (of PutFile). A simple GetFile->PutFile in NiFi preserves the name, though. Any thoughts?
While trying to start ZooKeeper, I'm getting an error:
-Xloggc is deprecated.
And: Could not create the Java Virtual Machine.
How do I solve it?
Thanks for this tutorial Sir.
Hi Learning Journal,
Appreciate your channel to learn kafka basics.
Please share Spark Streaming + Kafka tutorials using Scala as soon as possible.
Thanks
Thanks
Thanks
Please note the corrected command (or probably a newer version only accepts this):
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --bootstrap-server localhost:9092 --topic myfirsttopic
Now send a message from the producer.
What do you mean by corrected command? Do you notice any problem in the command shown in the video?
very good and easy to understand lesson.
@Learning Journal Really nice videos and good explanation. I have a question: I just finished Flume, and in Flume I also established a connection between 2 terminals for messages and data streaming. So what is the difference between Flume and Kafka?
Apache Flume was originally designed to move log files into Hadoop. It does that well. However, Kafka has wider use cases: it can transport data from virtually anything to anything. Kafka also has wider adoption and a more active community, and is backed by a startup. The Kafka ecosystem is growing, but we don't see Flume getting the same focus and active development. Hope that helps.
Learning Journal Thank you so much
I watched until video 10 in this series and gave up because it was getting too technical. But I got a good idea about the whole infrastructure and process flow, thanks!
There are still a few basic things that I am not clear on, and I think these should be explained up-front or early on:
- In your example, you started the Producer and Consumer on the same Linux VM, but in the real world, where exactly do you 'start' the 'Producers'?
- If I just want some CSV files from a few remote servers, I assume there will need to be a Producer running on each of those servers, but is there an example of how to get those files to local storage?
- In a Hadoop environment, I assume the consumers would place the messages in a Data Lake and not in HDFS, and some other process would do that later on. Is that correct?
- In the real world, you embed the producer in your application itself. So, while the application is writing data to a local store, it can also send it to a Kafka broker.
- If you just want to bring a file to Kafka, you would simply import it using a Kafka connector.
- What do you mean by a data lake in a Hadoop environment but not HDFS? Do you think it's something other than HDFS?
I am new to this area, so pardon my ignorance. My thinking was that we get data from sources and stage it first into a storage area that's logically termed the data lake. From there, you would use scripts and tools to copy the data to HDFS. That's why I thought that Kafka consumers 'stage' the data that they get from the brokers, and then another process/connector would load it into HDFS. Please clarify.
It all depends upon your design. You can use the Kafka broker as staging storage, but remember that the Kafka broker is short-term storage. We can get the data into HDFS and use HDFS as long-term storage. Then we can have another process that does the necessary transformations and pushes the final data into Hive/Cassandra/Mongo etc. for consumption from the frontend. You can also have Spark read it directly from the Kafka broker and push it to multiple storages in parallel.
Good explanation, thanks
How does the broker connect to ZooKeeper here? Are we giving any ZooKeeper port details in the broker properties before starting it?
Nice to watch! Can you please cover broker storage, i.e. where messages sit physically?
That's covered in my Kafka Course available at Udemy.
Hi,
I am using a MacBook Air. Can I install Kafka directly in the Mac terminal, following the steps demonstrated in your video?
Regards,
There are many ways to install Kafka. You can use the tar or zip file on a Mac. It's just a matter of uncompressing the tar/zip, and it should work.
Say there are 2 brokers for a particular topic, each broker handling a particular set of partitions. When producer A produces something, in which broker will the produced message be saved?
By default, it is based on the message key, or round-robin to balance the load. I have covered it in other videos.
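To see the key-based routing from the console, the console producer can send keyed messages via its parse.key and key.separator properties. A sketch, assuming the MyFirstTopic1 topic from the video and a running broker:

```shell
# Messages are typed as key:value; records with the same key land in
# the same partition
bin/kafka-console-producer.sh --broker-list localhost:9092 \
  --topic MyFirstTopic1 \
  --property parse.key=true --property key.separator=:
# then type, e.g.:
#   user42:first event
#   user42:second event   (goes to the same partition as the first)
```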
Thanks for posting this Kafka tutorial series. I installed Kafka, and when I try to start ZooKeeper, it prompts the error message "Java not found". Do we need to install Java separately, or does it come along with the product?
You should have a JDK. It doesn't come along with Kafka.
Excellent explanation. I have a small request: can you please paste the commands that you used during this video in the description box?
That may not make sense. However, I have plans to add an associated blog entry for each video on the website and keep the blog link in the description. A Blog will fulfill several purposes.
Thanks a lot..
For a new machine, please install Java ("yum install java-1.8.0-openjdk.x86_64") and then start Kafka.
I am using macOS Mojave. After I tried to start ZooKeeper, there was no "started" message. Moving forward, when I try to create a topic, I get the error "Replication factor: 2 larger than available brokers: 0". I believe this means that ZooKeeper has not started, but I am not able to understand why. I am using Kafka_2.11-2.0.0.
I published a new video for Kafka 2. Check the URL below. It explains a multi-node cluster. However, if you want a single-node Kafka on your Mac, change the ZooKeeper and Kafka data directories and also set all topic parameters to 1. It should work.
ruclips.net/video/m8aEVx0gCEI/видео.html
And
www.learningjournal.guru/article/kafka/installing-multi-node-kafka-cluster/
Can we alter the number of partitions after topic creation?
There are ways to do that, but you should plan to avoid that situation.
Will the data be saved in Kafka broker before reaching consumer?
+nikhil reddy Yes. The broker saves data in a log file. You will learn all that in the next videos.
Can someone help with the following error?
Error: Could not find or load main class org.apache.zookeeper.server.quorum.QuorumPeerMain
Run "java --version"; if it shows a version, you probably didn't untar the Kafka archive correctly.
It's a really nice tutorial; it helped me a lot in understanding Kafka. But I have one question: is it necessary to install Kafka on the producer side?
No, a producer is just an application that should have connectivity to the Kafka cluster. You might need dependent jars to build your producer application, but you certainly don't need to install Kafka there.
Thanks
I get this error for both producer and consumer:
[2019-07-08 15:02:47,321] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 1249 : {MyFirstTopic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
I followed all your steps.
When I start ZooKeeper, I get the error "Address already in use". Does anyone know how to solve this? How do I kill the process that is already running?
I changed the port from 2181 to 2180 in the ZooKeeper and server properties. It's working now, thanks. Receiving msgs. Feeling great :)
@@gobiviswa7848 How did you change it?
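Two hedged options for the "Address already in use" problem in this thread: free port 2181 (the culprit is usually a leftover ZooKeeper process), or change the port consistently in both config files so the broker can still find ZooKeeper. A sketch (lsof/fuser must be installed, and <PID> is a placeholder for whatever number lsof prints):

```shell
# Option 1: find and stop whatever holds port 2181
lsof -i :2181        # note the PID in the output
kill <PID>           # replace <PID>; use kill -9 only as a last resort
# or in one step:
fuser -k 2181/tcp

# Option 2: change the port in both places
#   config/zookeeper.properties ->  clientPort=2180
#   config/server.properties    ->  zookeeper.connect=localhost:2180
```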
Kindly also explain how to install and configure Kafka on Windows.
Why on Windows? Linux is the ideal platform, and that's what you are going to get in a real-life project.
Sir, do I need to install ZooKeeper first?
No. It comes with Kafka. You just need to start it as explained in the video.
What is the Linux flavor/version you are using? I am going to download kafka_2.11-2.0.0.tgz
CentOS
Learning Journal I am unable to start Kafka, as ZooKeeper is not getting started. As it is a dirty ZK, I am unable to debug. Is installing a separate ZK a solution? Please let me know if I can have your (@learning journal) email id or Instagram id, as sharing the error log here is not possible.
If anyone has faced the Kafka server not coming up due to a ZK failure, please help.
Hey sir,
when I am trying to run the command "bin/kafka-server-start.sh config/server.properties", it shows me these error messages:
usage: dirname path
Java HotSpot(TM) 64-Bit Server VM warning: Cannot open file /../logs/kafkaServer-gc.log due to Permission denied
Error: Could not find or load main class config.server.properties
Can you help?
Thanks in advance!
As your error says "Permission denied", make sure the user executing the command has permission to do so. You might need "chown" and/or "chmod" if using a Unix-based system.
I am getting an error like this while trying to start the Kafka server: kafka-run-class.sh: line 270: exec: java: not found
I couldn't find a solution to this online. Could you help? Thank you
+Siddhant Aggarwal I think you need to install the JDK.
Can Kafka not be installed on Windows?
+arunbm123 Why Windows? None of your real-life projects would need it on Windows. If you want it on Windows just because you don't have a Linux machine to try it on, check out my Google Cloud playlist. You can get a free Linux VM in Google Cloud.
👍
If anyone gets the following error while starting the consumer
'bootstrap-server' is not a recognized option
Then try using
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic MyFirstTopic
Can you explain or give me an example of offline mode in Kafka using Spring, please? Or a GitHub repo...
What do you mean by kafka offline mode? You may want to explain a little bit.
@Learning Journal i'm waiting for your opinion ....
When the Kafka broker is down and you still want to send a message, your producer keeps retrying, up to 2147483647 times (the default number of retries). If the broker comes back in that time, your message is sent successfully; otherwise, it fails.
Other than this, there is no such thing as an offline mode in Kafka.
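The retry behaviour described above is controlled by producer configs such as retries and retry.backoff.ms (and, in Kafka 2.0+, an overall delivery.timeout.ms cap). With the console producer you can pass them via --producer-property; the values below are illustrative, not recommendations, and the command needs a running broker:

```shell
# Give up after 5 attempts, waiting 500 ms between them
bin/kafka-console-producer.sh --broker-list localhost:9092 \
  --topic MyFirstTopic1 \
  --producer-property retries=5 \
  --producer-property retry.backoff.ms=500
```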
How do I display client.id on the Kafka console consumer?
Can you let me know what you are trying to do with the client id? I will test it once and share details.
Learning Journal Hello, my use case is basically to extract client.id for source identification. I was able to display the key on the console using --property print.key=true. I just want to know how to display all types of message metadata on the console.
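A note on the question above: client.id is a client-side setting and is not stored in the record itself, so the consumer cannot print it; source identification usually goes into the message key or a header instead. The console consumer can print some record metadata via --property flags; print.key and print.timestamp are long-standing ones (newer releases add more, such as partition and offset printing). A sketch against the demo topic:

```shell
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic MyFirstTopic1 --from-beginning \
  --property print.key=true \
  --property print.timestamp=true
```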
Can we send data from a GPS device to a Kafka broker?
+Pushkar Kumar Can you elaborate a bit? What exactly do you want to achieve?
I messaged you on Facebook.
+Pushkar Kumar thanks for reaching out to me. I hope you got your answer. Let me know if you need anything else.
Hey, how do I get back to root@sandbox after the Kafka server is started at 4:58?
Press Ctrl+C to get back, but remember it will terminate the service.
Learning Journal yes, it shuts it down again.
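To keep the terminal usable without killing the server, the usual options are to run each server in the background or to use the start script's daemon mode (kafka-server-start.sh has shipped with a -daemon flag for a long time; the log file names below are arbitrary):

```shell
nohup bin/zookeeper-server-start.sh config/zookeeper.properties > zk.log 2>&1 &
nohup bin/kafka-server-start.sh config/server.properties > kafka.log 2>&1 &

# or, using the bundled daemon mode:
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
bin/kafka-server-start.sh -daemon config/server.properties
```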
When I start ZooKeeper, it gives this error:
ERROR Unexpected exception, exiting abnormally (org.apache.zookeeper.server.ZooKeeperServerMain)
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
Thank you very much for this tutorial. I really appreciate the effort you put in to create such a wonderful tutorial on Kafka. I have the below doubts:
1. What if I don't start a consumer and push lots of messages using the producer? Will those messages be available until the consumer wakes up?
2. Let's assume we have a replication factor of 5; does it mean the message will be replicated on 5 different partitions? So when we initiate a consumer group that starts consuming messages from different partitions, how does Kafka avoid the same message getting consumed twice?
Thanks in advance,
Sachin
+sachinkumar Nagarare
1. By default, Kafka stores messages for 7 days.
2. Replication factor and partitions are different things. You can have one partition and 5 replicas. You will get a better idea of it when you reach the fault-tolerance video. I have covered it in detail.
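The 7-day default mentioned above comes from the broker's retention settings in config/server.properties; a sketch of the relevant lines (168 hours is the stock default, and retention can also be capped by size):

```shell
# config/server.properties (excerpt)
#   log.retention.hours=168    # keep messages for 7 days
#   log.retention.bytes=-1     # no size-based limit by default
```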
Thank you very much for the answers. That explains everything.
I am getting the following error while typing a message in the producer console:
WARN Error while fetching metadata with correlation id (int number) : {gvtopic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
Please explain this.
Have you created the topic? Is your Kafka broker running?
Yes I created the topic
How do I check whether the broker is running?
If you were able to create a topic without any error, then it should work. Try it again.
Hi, how do I know that both the broker and the topic are created and running?
kafka-topics.sh --list --zookeeper localhost:2181
This command should give you a list of all topics.
getting an error while running this:
MAC_XYZ:kafka-2.3.0-src user1$ bin/zookeeper-server-start.sh config/zookeeper.properties
Classpath is empty. Please build the project first e.g. by running './gradlew jar -PscalaVersion=2.12.8'
You are in source code directory.
@@ScholarNest yeah Sorry !!! thanks for the reply. Fixed the issue and it worked like a charm. Keep up the good work :)
While starting zookeeper I got the following error :
Classpath is empty. Please build the project first e.g. by running './gradlew jar -Pscala_version=2.11.11'
Can help me on this? Thank you.
+Chaitanya Shahane I don't think you are following the steps from the video.
command line from your video:
bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic TEST --partitions 2 --replication-factor 3
command line from apache.kafka.org:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
As per the above 2 commands, both are scripted to create a topic, but the order of the fields differs. Will it still work if the fields appear in a different order after the .sh?
Please correct me if I'm wrong.
Yes, it will. I showed it working in the video.
Thanks a lot !
Hi, I am getting "permission denied" while starting the Kafka server. Can you assist? Rakeshs-MacBook-Air:kafka_2.11-0.10.2.0 rakeshpattanayak$ > bin/kafka-server-start.sh config/server.properties
-bash: config/server.properties: Permission denied
Rakeshs-MacBook-Air:kafka_2.11-0.10.2.0 rakeshpattanayak$
I haven't tried it on a Mac, but you should be able to fix the permission issue. Is it that difficult?
Learning Journal
Not sure; I tried some sudo commands for permissions, but that did not solve the issue. I got rid of this issue by NOT moving my tar file from Downloads to a local folder. Weird, but it worked.
Learning Journal
Thank you for your quick response
anyone here who searched for kafka demo from honkai? 💀
meeee lmfao
Hello Sir,
First of all, thank you for such a wonderful series on Kafka. However, I am encountering a problem while starting ZooKeeper.
zakaria@zakaria-Inspiron-14-3467:~/Downloads/kafka_2.12-1.0.0$ bin/zookeeper-server-start.sh config/zookeeper.properties
Error: Could not find or load main class org.apache.zookeeper.server.quorum.QuorumPeerMain
Can you please help me out?
+MD.ZAKARIA Barbhuiya You downloaded Kafka 1.0 for Scala 2.12. There are some changes in the startup shell scripts. Check out the Apache Kafka documentation quick start for your version. Instead of using zookeeper-server-start.sh, you should be using kafka-server-start.sh for this version.
Hi,
I went through the documentation, and even there it mentions starting ZooKeeper with the command that I used and then starting the server. I tried starting the server first this time, but I got the following error:
zakaria@zakaria-Inspiron-14-3467:~/Downloads/kafka_2.12-1.0.0$ bin/kafka-server-start.sh config/server.properties
Error: Could not find or load main class kafka.Kafka
Do I have to make changes anywhere in the properties or something?