Thanks for the great content Jordan. One minor thing: we might need to separate video and audio streams.
Thank you buddy.
I am thinking about making some sort of zoom clone for my master's thesis. You definitely cleared my mind a little. Once more - thank you.
Thank you Jordan! Awesome breakdown as always.
Thanks Jordan for the amazing content. One question: why does the chat server need to use UDP to reach the recording server? This has to be reliable, right? We could use a queue here without needing UDP.
I think the recording is best effort, and my concern is that using TCP and caching packets until they are acknowledged could add quite a bit of extra load here.
As I understand it, the main hope is that the recording server is fast enough to receive nearly all of the UDP packets and "figure it out".
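A minimal sketch of what "best effort over UDP" could look like on the chat server side. The header layout (sequence number plus timestamp), the function names, and the idea of prefixing each frame this way are assumptions for illustration, not something the video specifies. The point is that the sender hands the datagram to the OS and moves on, so nothing is cached while waiting for an acknowledgment.

```python
import socket
import struct

# Hypothetical header: 8-byte sequence number, 8-byte timestamp in ms.
HEADER = struct.Struct("!QQ")


def build_packet(seq: int, timestamp_ms: int, payload: bytes) -> bytes:
    """Prefix the frame bytes with the (assumed) header."""
    return HEADER.pack(seq, timestamp_ms) + payload


def parse_packet(packet: bytes):
    """Split a datagram back into (seq, timestamp_ms, payload)."""
    seq, timestamp_ms = HEADER.unpack_from(packet)
    return seq, timestamp_ms, packet[HEADER.size:]


def send_best_effort(sock: socket.socket, addr, packet: bytes) -> None:
    """Fire and forget: no ack is awaited and nothing is kept for retransmission."""
    sock.sendto(packet, addr)
```

With TCP the kernel would keep unacknowledged bytes buffered per connection; here a lost datagram is simply gone unless the application adds its own recovery.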
really good content, keep it up bro !!
Love the content Jordan! Any place you'd recommend for mock interviews?
I've never used any of the sites personally, but Stefan Mai who I just had on the channel runs Hello Interview. I do think you could perhaps just do this with your friends but you know what works best for you
Great context! One question - how are you so good? How'd you learn?
I'm not so good, and nothing that I'm covering is in any way invented or novel to me :) I'm just regurgitating things I google and information I aggregated
Hey Jordan! Thanks for the video. Question about the Kafka part. How is the video represented in the Kafka queue? Bytes of the video and audio streams? Can Kafka handle larger payloads like that? Thanks
Yeah basically, and it should be able to handle big payloads! This would just be for like a frame at a time anyways
Thanks, Jordan! One question regarding Kafka part. How to store frame images and audio parts in Kafka messages?
You can probably encode in base64 or something
I assume @jordanhasnolife5163 was joking about base64. The correct approach would be to compress using HEVC or something then serialize to kafka as byte arrays.
@@htm332 I wasn't, I just don't know better. Anything that encodes the image to a string/byte array.
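A rough sketch of the byte-array approach discussed in this thread. Kafka treats message values as opaque bytes, so an already-compressed frame can be sent as-is with a small binary header; base64 is not required and grows the payload by about a third. The header fields and function names below are hypothetical, and the 3000-byte buffer is only a stand-in for one compressed frame.

```python
import base64
import struct

# Hypothetical header: 4-byte user id, 8-byte timestamp in ms.
FRAME_HEADER = struct.Struct("!IQ")


def serialize_frame(user_id: int, timestamp_ms: int, encoded_frame: bytes) -> bytes:
    """Build the bytes that would be used as the Kafka message value."""
    return FRAME_HEADER.pack(user_id, timestamp_ms) + encoded_frame


def deserialize_frame(value: bytes):
    """Recover (user_id, timestamp_ms, encoded_frame) on the consumer side."""
    user_id, timestamp_ms = FRAME_HEADER.unpack_from(value)
    return user_id, timestamp_ms, value[FRAME_HEADER.size:]


frame = bytes(3000)  # stand-in for one compressed video frame
raw_value = serialize_frame(42, 1_700_000_000_000, frame)
b64_value = base64.b64encode(raw_value)

# Raw bytes stay at header + frame size; base64 is roughly 4/3 of that.
print(len(raw_value), len(b64_value))
```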
Good content.
Can you cover some examples of cache invalidation?
Can you elaborate on this?
At 20:48, you used a timestamp provided by the central server to combine the frames in the stateful consumer. The central server is also partitioned, right? So for expensive chats where multiple partitions are involved in a single chat, multiple partitions would each assign a timestamp field, leading to clock skew. Let me know if I missed something.
Chats themselves are partitioned, but there will always be one central server to a chat, that's why it's "central".
@@jordanhasnolife5163 Do you mean there is a one central server that provides timestamps to the partitions?
@@jyotsnamadhavi6203 Sorry, I mean that each partition of the central server (keep in mind each chat is on just one of them) uses its own timestamps, so the frames of each video call can safely use those timestamps since they all come from one machine.
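One way to picture the stateful consumer: because every frame of a call is stamped by the same central server, frames that carry the same timestamp can be grouped without worrying about clock skew between machines. The class below is a hypothetical sketch that assumes the participant list is known up front and that frames for the same tick share an identical timestamp value.

```python
from collections import defaultdict


class FrameCombiner:
    """Buffers frames per timestamp until every participant has one."""

    def __init__(self, participants):
        self.participants = set(participants)
        self.pending = defaultdict(dict)  # timestamp -> {user: frame}

    def add(self, timestamp, user, frame):
        """Return the combined frames once the timestamp is complete, else None."""
        bucket = self.pending[timestamp]
        bucket[user] = frame
        if set(bucket) >= self.participants:
            del self.pending[timestamp]
            # Stable order so the combined output is deterministic.
            return [bucket[u] for u in sorted(self.participants)]
        return None
```

A real consumer would also need a timeout for timestamps that never complete, since UDP frames can be dropped.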
Hey Jordan, we deserve a further reading list as well, no? Like you used to do it in old videos.
And I deserve a model wife
@@jordanhasnolife5163 No Life, No Wife
Kafka shard on recording id? Did you mean that all messages with the same recording id go to the same topic partition?
Yep!
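A small sketch of why keying on recording id keeps a recording on one partition: the producer hashes the message key and takes it modulo the partition count, so equal keys always map to the same partition. The hash below (CRC32) is a stand-in chosen for illustration; Kafka's default partitioner uses a different hash function, and `partition_for` is a hypothetical helper.

```python
import zlib


def partition_for(recording_id: str, num_partitions: int) -> int:
    """Deterministically map a recording id to a partition index."""
    return zlib.crc32(recording_id.encode("utf-8")) % num_partitions


# Every frame of the same recording lands on the same partition,
# so a single consumer sees that recording's frames in order.
first = partition_for("recording-123", 12)
second = partition_for("recording-123", 12)
print(first == second)
```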
If the chat server is going to add a timestamp to each frame across members of the same chat, why can't it be a Kafka producer and post to Kafka directly? Timestamping implies serialization, and I don't get how the subsequent fan-out to different stream servers would help performance.
I don't really think timestamping implies serialization, it just implies sending a timestamp with each UDP packet. I guess the bigger thing here is that sending messages to kafka probably has to be done over TCP, meaning that they're blocking, meaning we could have to cache all of those frames on our kafka producer. In practice though, maybe it could just send directly to kafka. We'd have to try it out!
Yoo I just got an IC4 offer from Meta primarily due to your videos. 🙏 Appreciate your work it makes a real impact
Unsubscribed
This is like your girlfriend telling you how much better you made her life right before she ends the relationship.
Congrats though dude, well deserved!!
What if some UDP packets get dropped? Is there any way we can guarantee the video is completely recorded?
Only thing I can think of would be to have the central server cache some data, recording server requests data again based on packet sequence numbers.
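The idea above can be sketched roughly as follows: the sender keeps a bounded cache of recent packets, and the receiver tracks which sequence numbers it has seen so it can ask for the gaps. Both class names and the cache size are assumptions for illustration. Anything that has already aged out of the cache cannot be recovered, so this narrows the loss window rather than guaranteeing a complete recording.

```python
from collections import OrderedDict


class SenderCache:
    """Keeps the most recent `capacity` packets so they can be resent."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.packets = OrderedDict()  # seq -> packet, oldest first

    def remember(self, seq: int, packet: bytes) -> None:
        self.packets[seq] = packet
        while len(self.packets) > self.capacity:
            self.packets.popitem(last=False)  # evict the oldest packet

    def resend(self, seqs):
        """Return whichever requested packets are still cached."""
        return {s: self.packets[s] for s in seqs if s in self.packets}


class GapTracker:
    """Receiver side: records sequence numbers and reports the gaps."""

    def __init__(self):
        self.received = set()
        self.highest = -1

    def on_packet(self, seq: int) -> None:
        self.received.add(seq)
        self.highest = max(self.highest, seq)

    def missing(self):
        return [s for s in range(self.highest + 1) if s not in self.received]
```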
how can we guarantee the stream is completely recorded? For UDP the packets may be lost.
Responded to your duplicate question, use sequence numbers and see stock exchange video
I watched your videos and other resources but still did badly in the interview, not because of the technical difficulty, but because I didn't fully understand or clarify the problem, which led to wrong assumptions and missed cases.
Sounds like you may want to aim to improve on your communication skills then. Any other takeaways?
@jordanhasnolife5163 Another takeaway is patience. I usually rush to start drawing or designing tables without fully understanding the business problem, because I am always concerned about time and want to quickly talk about the fancy distributed systems stuff. I should spend more time understanding the case carefully and then start solving.
@@hazemabdelalim5432 Sounds reasonable to me! Sometimes the questions you ask are a more important signal than the design you put out!
can you make a search engine design?
I suppose so but gist is:
1) webcrawler (we did this)
2) page rank (algorithm in spark)
3) order websites in search index by page rank and distribute search index
4) lots of caching
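For step 2, a tiny power-iteration sketch of PageRank in plain Python rather than Spark, just to show the shape of the computation. The graph format (a dict mapping each page to the pages it links to, with every target also present as a key), the damping factor of 0.85, and the fixed iteration count are assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]}; every target must also be a key."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


ranks = pagerank({"a": ["b"], "b": ["a"], "c": ["a"]})
# "c" has no inbound links, so it ends up with the lowest rank.
print(sorted(ranks, key=ranks.get, reverse=True))
```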
@@jordanhasnolife5163 oh yeah mb XD
Why not use multicast?
we do?
Thanks much!
watched. --