Distributed Computing using a Redis Queue
- Published: 13 Sep 2024
- My approach to using a queue (Redis in this example) to create a highly scalable distributed computing system, and why I prefer this approach to Python's multiprocessing functionality.
#mathbyteacademy #python
Code for this Video
================
Available in GitHub blog repo: github.com/fba...
Direct link: tinyurl.com/2c...
My Python Courses
=================
- Python 3 Fundamentals (introduction to Python)
www.udemy.com/...
- Python 3 Deep Dive (Part 1 - Functional)
www.udemy.com/...
- Python 3 Deep Dive (Part 2 - Iteration, Generators)
www.udemy.com/...
- Python 3 Deep Dive (Part 3 - Hash Maps)
www.udemy.com/...
- Python 3 Deep Dive (Part 4 - OOP)
www.udemy.com/...
Man, I've been struggling with multiprocessing in Python, and this solution is a game changer! Thanks a lot for that!
Glad it helped!
Very good video, bravo to the creator! Very clear explanation, not boring. Would love to see more and more content from you.
Thank you! I add content to this channel regularly, and there are already quite a number of videos, plus I have my Udemy courses with even more content.
+ for video on RabbitMQ!
Thanks, Fred! That's the kind of content I've been waiting for from you!
You're welcome!
I like these more project-based/conceptual videos a lot, please make more!
Please make a Python deep dive about asyncio; I couldn't find a good course on the matter.
I believe many people are looking for one too.
Hey, there's one great book on asyncio: Matthew Fowler's "Python Concurrency with asyncio". It's difficult, but very well written and structured; it once saved me from the vagueness of other resources. If you go this extra mile and study it over a few days/weeks, you'll likely end up in a near-expert position on async :) Good luck.
I stumbled across this video and recognized your voice from your Udemy videos. I've watched all of them (except Pydantic V2, which is on my list now) and have grown considerably thanks to your efforts.
I'm glad my courses could help!
Really great tutorial! Thanks for all your videos and courses on Udemy, I'm learning a lot.
Your examples are always very clear and understandable
Thank you, glad you find them useful!
Thanks for the great and to-the-point tutorial. I too would like to see some strategies for dealing with failures, e.g., when a worker crashes.
Please cover resilience to message loss in the case of worker failures. Thanks a lot in advance.
Sir, that was the best Python tutorial of my life, thank you very much.
You're welcome!
Thank you very much for this great tutorial. Your Deep Dive tutorials at Udemy are legendary! It would be great to see more of Redis, Reliable Queue, and load balancing with Python.
Thanks, glad you like them! I have no experience with ReliableQueue, but I am planning a video on ElasticMQ / SQS at some point.
Thank you very much for this great video.
All your videos are a guarantee of quality.
I would be very happy if you made more videos on this great topic.
Thank you!
Brilliant video.
Glad you enjoyed it
Your explanation is very clear. Thanks a lot
You are welcome!
Thank you, Fred! I would appreciate a follow-up video on how to auto-scale the number of workers up and down across different computers. Could you kindly provide any references on that topic?
The way I'm used to doing it is with Kubernetes, but DevOps is not my thing. I have a colleague who's an absolute wizard at it, and he's the one who sets up the cluster with the workers. If you're looking to do this yourself, you might want to look at learning Kubernetes (you'll also want to add readiness and liveness probes to the workers).
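For the Kubernetes route mentioned above, the probes would be declared on the worker container in its Deployment spec. A minimal, hypothetical fragment; the image name, ports, and probe endpoints are all made-up placeholders:

```yaml
# fragment of a Deployment's pod template; names and paths are illustrative
containers:
  - name: worker
    image: registry.example.com/worker:latest
    livenessProbe:            # restart the container if this starts failing
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
    readinessProbe:           # keep the pod out of rotation until it's ready
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
```

For queue workers that don't serve HTTP at all, an `exec` probe (running a small health-check command inside the container) may be a better fit than `httpGet`.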
Great video, nice code structure, very scalable and easy to maintain.
Thanks! That's what I really like about this approach: it gives good scalability for a variety of problems, without any code complexity. And the queue can be anything you already have in your environment, RabbitMQ, SQS, etc., or even Redis in a pinch.
Thank you for your work! This tutorial is amazing, very thorough and detailed 🙂
Glad you like it!
Thank you very much, Fred! This is a very good video and I have learned a lot from you.
You are very welcome!
Thanks, Fred! A follow-up Redis video would be great.
This was super helpful! Thanks for making this video.
Glad it was helpful!
Amazing video. Thanks!!! It looks like exactly what I needed!!
Glad it was helpful!
Great as always Fred! Thanks a lot.
My pleasure!
Thanks a lot!
You're welcome!
Excellent material!
Glad you enjoyed it
Thank you, Fred. Maybe you could add some content about testing? pytest, mocking, etc.
Anyway, thanks a lot.
Thanks! Good suggestion, and it's already on my to-do list at some point.
Fault tolerance with Redis: a worker simultaneously RPOPs one job from the queue and PUSHes it to a 'processing' set with a time-to-live and worker ID. Another process monitors that set to requeue jobs whose TTL has expired and to check/restart the worker; the worker deletes the job from the processing set as its last step before consuming a new job.
There are all kinds of custom rules or workflows that can be built with Redis. I was just looking into 'priority weight' and couldn't find a way to handle it in a single AMQP queue, but a Redis queue had no problem filtering by priority.
Thanks for that!
I assume you can make the RPOP and PUSH a single atomic transaction? (I've never tried it, so I'm curious if that is the case.)
@mathbyteacademy Shoot, you can't perform atomic ops on multiple keys. In that case the consuming worker must retrieve and flag the next non-flagged job as processing, with a TTL and worker ID, which probably requires a script to combine atomically... but apparently my first suggestion could use a script too.
Edit: a bonus of this approach is that we don't need a failure manager; each worker can check the TTL on active jobs and grab the oldest expired one from a single stack.
@petemoss3160 Good to know. I haven't really touched Redis for this in a long time because of that issue, so I wasn't sure if something had changed. I ended up using a simpler solution with SQS (or ElasticMQ as the local equivalent), although I still use Redis, but only to de-duplicate SQS messages (since they are only guaranteed at-least-once delivery, unless they are FIFO queues, which are more expensive).
@mathbyteacademy It definitely feels like unnecessary boilerplate to reinvent a process implemented better in existing software, like you mentioned... of which we have too many to choose from. DDS (used by ROS 2) just came to mind, and it apparently supports ordering and prioritizing, so maybe that's what I was looking for.
@petemoss3160 Lots to choose from indeed. I was not aware of ROS 2/DDS; it looks interesting and is something I'll definitely read up more on. Thanks for mentioning it!
Thank you for the video. It was really nice. Please, can you help with the problem of data loss in the event of worker failure?
I would probably opt for a different queue. Something as simple as SQS (or ElasticMQ, if you prefer to run things locally or without AWS) works well and is easy to set up. There seem to be too many hoops to jump through to use Redis as a reliable and durable queue, but if you have to have Redis, here's an article from the Redis team on how to achieve that:
redis.com/glossary/redis-queue/
Perfect as always.
Hi Fred, what about a video on dependency injection in Python? I guess it could be nicely applicable to this video, to inject different message brokers.
👍🏻👍🏻👍🏻👍🏻👍🏻
Thanks!
Great tutorial! But what about redis-queue? Isn't it an option of the same kind as RabbitMQ or the others? What do you think about it?
Any tips on using Janus queues in the context of event-driven architecture with ZeroMQ would be much appreciated.
Sorry, I don't know Janus and only worked with ZeroMQ briefly some years ago.
Great as always. You could try tmux 😜. Thanks for the great content, keep going 🙂.
Thanks! Ah, tmux, I haven't used it in such a long time, I don't think I remember the shortcuts anymore 😀
+ for RabbitMQ
Why do we need Celery then? 😕
Celery? You mean the vegetable that leaves a bad taste in my mouth?
or... the software library, that also leaves a bad taste in my mouth? 😀
Celery is OK, but not my first choice when writing a microservice type of architecture; I'd rather have complete, granular control over everything. Past experiences with Celery made me look elsewhere. It's been a while, so maybe things have improved, but I did run across this recent blog post that indicates they may not have: docs.hatchet.run/blog/problems-with-celery