Low Latency Stock Exchange Design Deep Dive with Google SWE! | Systems Design Interview Question 15

Поделиться
HTML-код
  • Опубликовано: 16 июн 2024
  • While I don't exchange stocks, I do a lot of exchanging cocks
    00:00 Introduction
    01:44 Functional Requirements
    02:42 Capacity Estimates
    03:07 API Design
    04:03 Architectural Overview
  • НаукаНаука

Комментарии • 80

  • @ThisIsntmyrealnameGoogle
    @ThisIsntmyrealnameGoogle Год назад +17

    Love these videos! Not even necessarily to pass interviews, it's just really fun learning this stuff and you make it easy to digest

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +2

      I appreciate it man! Lmk if there's anything else I can help you digest rawr

  • @jh0720
    @jh0720 Год назад +22

    I know the subs may be coming in slowly, but your content is way better than most on RUclips. Keep grinding my man 👉👈

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +6

      Thanks dawg! Will do my best! The subs don't really matter to me too much, but I guess the more of them that I have the less I can be afraid to tell people about this channel haha

  • @yangsong6025
    @yangsong6025 10 месяцев назад +9

    Excellent video! I believe the idea comes from the video of Jane Street "how to build exchange". However, this is way more clear, and your graph helps me a lot in understanding how re-transmitter works. Will recommend to anyone who is preparing for a HFT interview🥺

    • @jordanhasnolife5163
      @jordanhasnolife5163  10 месяцев назад +2

      Haha you've reverse engineered my sources! But glad I could add something, thanks!

  • @immanuelifere3654
    @immanuelifere3654 8 дней назад +1

    I'm watching how casually this guy is breaking down my bugs and I'm like who is this guy?

  • @drakezen
    @drakezen Год назад +1

    Good content, appreciate the work you put in. Keep at it!

  • @LeDebut15
    @LeDebut15 Год назад +3

    your sense of humor is fantastic lmao

  • @blazkowicz666
    @blazkowicz666 Год назад +4

    Such an underrated channel!

  • @2005kpboy
    @2005kpboy Год назад +5

    Stock exchange design is not just low latency, but also high throughput and high degree of consistency.

  • @bumq5514
    @bumq5514 Месяц назад +1

    Just found this channel and the intros are crazy lmao did not have to call me out like that off rip

  • @art4eigen93
    @art4eigen93 Год назад +2

    Subbed after you surfaced on Gaurav's channel. Doordashing SYSD never seen that fast.

  • @nibanna-ai
    @nibanna-ai Год назад +2

    The part about the matching engine was very interesting!

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +2

      Thanks! I wanted to provide a bit more insight into what goes on in there than some of the other videos have.

  • @nikenuke
    @nikenuke 2 месяца назад +1

    Nicely done! Thank you

  • @niranjhankantharaj6161
    @niranjhankantharaj6161 11 месяцев назад +3

    Thanks for the great video! we use udp multicast to send the results of matching engine (that has in-memory data structure for buy and sell prices for each stock ticker) so its fast. But, after the matching engine matches, the execution engine should still persist in SQL DB (and may be publish it to. queue for history service). Can we have architectural diagram that describes these along with order of the flow. I assume the order is as follows:
    1. Matching engine identifies a match and calls trade executor service
    2. trade executor service executes the trade and stores it in SQL DB as a transaction and responds back to matching engine
    3. Trade executor then returns to matching engine that returns response to client
    4. If trade executor fails, matching engine should retry ?
    or
    matching engine sends a mutlicast message, and we have executor service (Clearing service ), TRF (trade report service), as the recipients of matching engine

    • @jordanhasnolife5163
      @jordanhasnolife5163  11 месяцев назад +1

      I think in practice you'd probably have a clearing service listening to the feed and storing which trades have occurred in a database, the whole intention is to bog down the matching engine as little as possible

  • @aniruddhkhera510
    @aniruddhkhera510 Год назад +2

    Hey Jordan, amazing videos! I recently started following you and I have learned a lot from your videos.
    Just a small request/suggestion, can you add the resources which you looked up in the video description, or maybe if you are talking about something just show some visuals in the video.
    For example, in this video, the limit order book algorithm was an interesting topic, it would have been amazing if you had a small pseudo code of what you talked about in the video itself, or if not, just add some resources which you looked up related to it!

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +3

      Totally understand! Truthfully I could do it it's just quite a bit of effort and I'm short on time haha! Happy to link more resources though

  • @user-sz3nn2lb6u
    @user-sz3nn2lb6u 4 месяца назад +1

    nice work! thanks

  • @raysdev
    @raysdev Год назад +2

    "so yeah, that's about all i got there, it's kind of a shame how many times i've had to say that sentence to girls" 😅
    i'm getting lots out of these & assuredly others as well, so keep churning out these informative vids! 🙌

  • @danielvega-myhre4201
    @danielvega-myhre4201 7 месяцев назад +3

    If the matching engine is broadcasting completed trades via UDP multi-casting to the retransmitter (among other services), how can we know the retransmitter actually receives all the completed trades properly since UDP has no sequence IDs or error recovery mechanism? The retransmitter service is supposed to provide a solution to the fact that UDP has no idea if any packets were dropped, no error recovery, etc. but if the retransmitter itself is receiving the data from the matching engine via UDP how do we know the retransmitter has the complete history of all the trades that day?

    • @jordanhasnolife5163
      @jordanhasnolife5163  7 месяцев назад +2

      I think that even though UDP itself doesn't build sequence numbers into the messages we do still end up attaching trade sequence numbers to each individual message, so the retransmitter should be able to see if it misses some and request them from the main matching engine again.
      There are certainly some race conditions that come out of this design that could lead to weird states, and at least afaik exchanges will kinda deal with this post hoc since it doesn't happen often

    • @danielvega-myhre4201
      @danielvega-myhre4201 7 месяцев назад +1

      @@jordanhasnolife5163 "so the retransmitter should be able to see if it misses some and request them from the main matching engine again" -> so does this mean the matching engine is also keeping a historical log of matched orders, so that it can retransmit it to the retransmitter if necessary?
      How long does the matching engine keep the records of matched trades in memory (I assume it doesn't write them to disk) before deciding the retransmitter must have successfully received them, and deleting them? It seems like we would need to either decide on an arbitrary threshold and accept some occasional/rare data loss, or have the retransmitter acknowledge the receipt of each of the matched order - however, between adding sequence IDs to the messages and now acknowledgement of the messages, it seems we have re-implemented key parts of TCP here, which make me think we're on the wrong track.
      Also, the matching engine storing previously matched trades in memory would use substantial memory resources that could otherwise be used by the main matching engine data structures storing limit orders.
      Am I missing something? What do you think?

    • @jordanhasnolife5163
      @jordanhasnolife5163  7 месяцев назад +1

      @@danielvega-myhre4201 Don't see any reason why it couldn't write them to disk asynchronously, but it could keep them in memory too. I still think what you're describing is much faster than TCP as there's no flow/congestion control. Nonetheless, I don't actually work for an exchange haha, but you've done a good job describing the tradeoffs of the different design choices that we can make here.

  • @mehtubbhai9709
    @mehtubbhai9709 2 месяца назад +1

    Awesome content! You definitely deserve more subs.
    Perhaps you could also add some programming language tutorials. I would love to see the best practices of how a Google SE writes code.

    • @jordanhasnolife5163
      @jordanhasnolife5163  2 месяца назад

      I'm a sub, I deserve more doms
      I appreciate the suggestion! At least for the time being, I'm not sure that I'm going to do something like this, as I think there are many more qualified people than I to do language tutorials haha. And if I'm being honest with you, I write garbage code, as I'm super lazy

    • @mehtubbhai9709
      @mehtubbhai9709 2 месяца назад

      @@jordanhasnolife5163 Lol!

  • @sarthakuiit
    @sarthakuiit Год назад +1

    Great content Jordan! Already watched 5 of your videos back to back.
    AFAIK Order book is maintained per stock , so does it make sense to shard the matching engine based on stocks?

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +1

      Hi Sarthak I touched upon this a bit in the video - in theory yes, but in reality some traders will want to conditionally trade multiple stocks, and if so having different matching engines for those stocks is very complicated because then you need to do distributed transactions which are super slow

    • @sarthakuiit
      @sarthakuiit Год назад

      Ok So just to understand the scenario well… The condition on which trader trades is somehow related across various stocks (at broker level I guess).
      So, Even in that case if broker places say 5 orders for different stocks to the stock exchange , can we not maintain these orders in separate machines? In other words, why distributed transaction is needed?
      May be I am missing something.

  • @Snehilw
    @Snehilw Год назад +1

    @Jordan Do you have any favorite resources that you often go to to learn about System Design stuff? I know you have mentioned a couple of times that you read technical papers on new approaches. I am curios to know what leads you to them or if you have any favorite resources/blogs, etc. in this area.

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +3

      TBH I mostly just google and go from there - anytime you see a term you don't understand make sure to Google that too! And write down what you learn so you don't forget it!

  • @payammesgari2011
    @payammesgari2011 10 месяцев назад +1

    Thanks for your great content. Could you explain what communication protocols and methods are supposed to be used between the Order Service and the Matching Engine? As well as, between the Matching Engine and the Retransmitter?

    • @jordanhasnolife5163
      @jordanhasnolife5163  10 месяцев назад +2

      Typically you'll send orders to the matching engine via TCP but the matching engine and retransmitter communicate via UDP (that's the whole point for the retransmitter existing)

  • @varshard0
    @varshard0 4 месяца назад +2

    Instead of UDP. Why primary and secondary matching engines don't subscribe to the same Kafka topic so that both engines get the same information?
    Is that because Kafka is too slow?

    • @jordanhasnolife5163
      @jordanhasnolife5163  4 месяца назад +2

      Yep lol, would be much more fault tolerant to use Kafka here but it's too slow

  • @andreiriakhovskii7555
    @andreiriakhovskii7555 Год назад +2

    Hey Jordan, thank you for your video!
    The thing I didn't get. How given design (assuming Order service is a part of stock exchange infrastructure) guarantees "fairness" of an execution order for different clients. For example, if client1 sends LimitOrder1 to OrderService1 before client2 sends LimitOrder2 to OrderService2 for the same equity, can it be that because of congestion or other reasons OrderService1 reaches matching engine after OrderService2 and as a result handling order is reversed? Is there a single entry point like router after which the order of execution orders is preserved and identical to order of receiving IP packets?

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +1

      Hi Andrei! As far as I'm aware, the true definition of which order is "first" is which is first executed by the order book, not which one first reaches an order service. Speed is the name of the game here haha but I see your point.

  • @nishilvy
    @nishilvy 8 месяцев назад +1

    Very methodical process, and detailed explanation. Thanks for sharing such a gem of knowledge here. Have a Q about design interview preparation: Since there are so many systems, is the way to go over as many designs as possible and remember all algorithms, or is it possible to have a pool of Design Algos (like use of heap in matching engine), and use it in similar scenarios? Thanks so much, in advance!

    • @jordanhasnolife5163
      @jordanhasnolife5163  8 месяцев назад +1

      Thanks!! I think that for your question, you don't have to remember everything, and in fact most of these systems design problems don't involve any algorithms.
      The reason that I put a lot of effort into making "concepts" videos on my channel in addition to just practice problems is that I think once you really understand those well, you have the building blocks to tackle like 95% of systems design problems on the fly. Of course it's always possible that one does come along where you'll need an algorithm, and if so, tough luck, but there comes a point where you just get diminishing returns from trying to remember everything.

    • @nishilvy
      @nishilvy 8 месяцев назад

      @@jordanhasnolife5163 Got it. Thank you for prompt help!

  • @dsinkey
    @dsinkey 4 месяца назад +1

    You mentioned a Jane Street video that you watched to get some ideas for this video, do you have a link to it? Also, where did you get the information about the data structures for the order book?

    • @jordanhasnolife5163
      @jordanhasnolife5163  4 месяца назад +1

      ruclips.net/video/b1e4t2k2KJY/видео.html&ab_channel=JaneStreet
      gist.github.com/halfelf/db1ae032dc34278968f8bf31ee999a25

  • @Ms452123
    @Ms452123 Год назад +3

    In the long run the channel will be like neetcode but for system design. So keep going and it will pay. Might even generate enough money to not work for big tech and just do your thing in the future lol

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +7

      Ha you're not wrong regarding the money - truthfully though while I like studying these systems there are only around 25ish questions that actually get asked in these interviews. As a result, I could go around and make videos that are redundant (e.g. just do every single variation of a service slightly similar to Facebook messenger) but I don't really wanna waste other people's time (or my own for that matter). I think it's more likely that I eventually pivot to studying different material for my own sake or just become a shill like all these other channels and make crap videos just for views. We'll see though!

  • @franklinyao3833
    @franklinyao3833 Год назад +1

    why do you use zookeeper at 15:14 to determine if the primary goes down while earlier you said coordination service is too slow?

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +2

      Coordination is too slow - zookeeper is an example of what you could use but in reality they'd probably have some sort of custom consensus solution, just used zookeeper to get the point across :). Also, for the purposes of figuring out whether a node is down, zookeeper needs to agree on far fewer things than if it had to hold the distributed log of every single trade message.

  • @franklinyao3833
    @franklinyao3833 Год назад +1

    I am confused at 16:49 when you mixed sharding and replication. Sharding and replication are different, why do you mix them?

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +1

      Could you elaborate on this? My point is that you need replication in order to back up the matching engine if it goes down (state machine replication), and that if you want to potentially increase throughput you could shard instances of the matching engine by ticker, at the cost of decreased functionality in cross ticker trading. Thanks for the question!

  • @dibll
    @dibll Год назад +1

    Please spend more time on explaining the limit order book data structure in V2 of this video( I am assuming you are creating a brand new series of the entries set of design questions).

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад

      Point noted and I appreciate the feedback, it'll take me a while to get to this but will do

    • @dibll
      @dibll Год назад +2

      @@jordanhasnolife5163 Thanks. I feel your course content is the best out there in terms of technical depth. spending more time on explaining things and adding few illustrations along the way would make it much more better.

  • @y2k898
    @y2k898 Год назад +1

    quick question, all these make sense. but on the apigateway side, how do you get response after place an order? e.g
    request { symbol: APPL , amount: 100, price: 200}. , given your description, all these happened in a event driven manner, so question is : how do you get the response for that particiar request? Specifically how do you get a partially filled order response?
    Does it mean OrderService will. have to wait for the event (filledEvent) to arrive before response to the request?
    Thanks.

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +1

      I'm a bit confused by this question but basically the exchange will have multiple output feeds. One of which will have trade acknowledgements, in which you can poll to see what volume was filled in your order

    • @y2k898
      @y2k898 Год назад +1

      @@jordanhasnolife5163 thanks I think it might also make sense to implement CQRS on order management service which consume the order filled event so that its own order status can be updated. Of course direct subscription to events stream for notifications purposes on order updates would be much faster

  • @davendrasingh518
    @davendrasingh518 5 месяцев назад +1

    So you are saying primary and secondary will have to synchronous communication to be fault tolerant and sync to be each other ?.

    • @jordanhasnolife5163
      @jordanhasnolife5163  5 месяцев назад

      Yeah that's a good question, I guess the idea would be that if the matching engine sends a message and the backup is down, it could ideally come up and query the re-transmitters in order to get back up to date on its state.
      I'm actually not sure in reality the guarantees that exchanges offer in terms of whether a message that is sent out can be erroneous, as to ensure that this wouldn't happen we would need some sort of consensus or two phase commit or synchronous replication as you've mentioned. That would obviously slow things down. It's possible that in real life, the exchange might just message participants and say "shoot we messed up".

  • @user-zn8dm8cp5t
    @user-zn8dm8cp5t 4 месяца назад +1

    Does the exchange in any way verify/validate the request sent by the broker? If so the verification has to be done before the Order is added to the OrderBook right? Just eager to know cause doing these extra checks would prolly slow down the req

    • @jordanhasnolife5163
      @jordanhasnolife5163  4 месяца назад +1

      I'd imagine that any authorization validations would be performed by the order service - otherwise yes they would slow down the matching engine.

  • @adityakrverma612
    @adityakrverma612 Год назад +1

    Can you bit elaborate how DS will look like with two hashmap. What key-value for both hashmap at 7:24, why do we need two hapmap.
    It would be great if you can provide link to resources for your thought process.

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +1

      Sorry the link is in the low level design for this one, but it's two heaps:
      gist.github.com/halfelf/db1ae032dc34278968f8bf31ee999a25

    • @adityakrverma612
      @adityakrverma612 Год назад +2

      @@jordanhasnolife5163 : Perfect, I was just reading same article before posting my question. Thank you once again

  • @tofahub
    @tofahub Год назад +2

    Loving this channel. It would've blown up if it weren't for your obscene jokes

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +2

      After all the McDonalds I ate last night that's not the only thing that would be blowing up

    • @tofahub
      @tofahub Год назад

      I truly love your channel but I gotta defer your jokes. They’re not funny

  • @utkarshgupta2909
    @utkarshgupta2909 Год назад +1

    This doesn't seems to me a tree but min and Max heap.. what u say?

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад +1

      I think either a min and max heap works, or better yet you can use a binary search tree (in which case I drew the diagram wrong, and the points would be at the top left and bottom right of trees respectively).

    • @jordanhasnolife5163
      @jordanhasnolife5163  Год назад

      gist.github.com/halfelf/db1ae032dc34278968f8bf31ee999a25

  • @hanmuzuo3264
    @hanmuzuo3264 2 месяца назад +1

    how can we make sure retransmitters don't lost any matched exchange via udp?

    • @jordanhasnolife5163
      @jordanhasnolife5163  2 месяца назад +1

      To be honest, I'm not sure that we can. In a distributed system, we basically require consensus in order to not lose things, and that's just going to be too slow here.

    • @hanmuzuo3264
      @hanmuzuo3264 2 месяца назад

      Alex Xu mentioned reliable udp in his book. i think that might be the solution. Thank you@@jordanhasnolife5163 for bring this amazing video.

  • @kaushikkomanduri4126
    @kaushikkomanduri4126 7 месяцев назад +1

    did you stop doing these videos tho?😂

  • @mollusk_musk2467
    @mollusk_musk2467 11 месяцев назад +2

    I just subscribed so you don't go do Only Fans