thank you jodron. Best content in the internet for distributed systems. hoping to pass the interview tomorrow. I covered most of your videos :). Passing Good karma to you
Best of luck, I believe you'll pass if you lern too spel bettur
@@jordanhasnolife5163 hello jodron. Jokes apart, amazing content!
@@jordanhasnolife5163 jodron sounds transformerish name
Bro is smart, funny and humble. W !
" i am a happy guy" .. i am repeating this to myself.
At 2:45 you said writes do not block reads. Reads and writes do block each other when using locks. Please correct me if I am missing something.
You're right that a write would block reads, thanks
hey Jordan, your videos are super helpful
Great stuff, man 🐐
Can anyone enlighten me about the last example? If we read all values from snapshot 15 and before, wouldn't the total value be different than 1 Million, and we will have an inconsistent read?
The point here is just that when we make a read at a specific timestamp, and read many rows at once, we will see the DB in a consistent state.
If we read many rows at once using different timestamps, then we can expect to see things in an inconsistent state.
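A minimal sketch of that point, assuming a toy in-memory store where every row keeps a list of (transaction number, value) pairs. The row names, amounts, and the `read_at` helper are made up for illustration and are not from the video. Transaction 16 moves 100 from one account to the other, so the total should always be 1 million:

```python
# Hypothetical versioned store: each row holds (txn_id, value) pairs
# in increasing txn_id order. All names and numbers are illustrative.
rows = {
    "alice": [(10, 600_000), (16, 599_900)],
    "bob":   [(12, 400_000), (16, 400_100)],
}

def read_at(row, snapshot):
    """Return the newest value written by a transaction with id <= snapshot."""
    visible = [value for txn_id, value in rows[row] if txn_id <= snapshot]
    return visible[-1]

# Every row read at the same snapshot: the transfer is either fully
# visible or fully invisible, so the total is preserved.
consistent_total = read_at("alice", 15) + read_at("bob", 15)

# Rows read at different snapshots: we see only half of transaction 16.
mixed_total = read_at("alice", 15) + read_at("bob", 16)

print(consistent_total)  # 1000000
print(mixed_total)       # 1000100
```

Reading both rows at snapshot 16 would also give 1 million; only the mixed read is off.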
Feels like a lot of stuff that the database needs to store per row then. I'm hoping there would be ways to trim older transactions per row, otherwise running the find operation at time T -- even if logarithmic using binary search, could be expensive both in memory requirements + time complexity.
Very interesting though! Thanks for your content man.
Yep! You can drop old versions of data as needed/do compaction
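One way this trimming could look, as a rough sketch: assume the database knows the oldest snapshot any running transaction is still reading from (called `oldest_active_snapshot` here, a hypothetical name). Anything older than the newest version visible to that snapshot can never be read again, so it can be dropped. The lookup uses binary search via the standard `bisect` module.

```python
from bisect import bisect_right

def compact(versions, oldest_active_snapshot):
    """Drop versions that no active snapshot can still read.

    versions: list of (txn_id, value) pairs sorted by txn_id.
    Keeps the newest version with txn_id <= oldest_active_snapshot
    (that snapshot still needs it) plus everything newer.
    """
    txn_ids = [txn_id for txn_id, _ in versions]
    # Number of versions with txn_id <= oldest_active_snapshot.
    idx = bisect_right(txn_ids, oldest_active_snapshot)
    keep_from = max(idx - 1, 0)
    return versions[keep_from:]

history = [(3, "a"), (7, "b"), (12, "c"), (20, "d")]
compacted = compact(history, oldest_active_snapshot=15)
print(compacted)  # [(12, 'c'), (20, 'd')]
```

Real databases do this in the background (vacuuming or compaction), and the details vary by engine; this only shows the idea for a single row.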
Amazing content as usual
So it seems to be a tradeoff of never locking on reads, but reads may be slower since we now have to look at transaction history per row?
Reads should still be pretty fast; the main cost is that we now have to store more data, basically all the transaction history of each row. Plus, snapshot isolation is still not perfectly serializable.
The trick for snapshot isolation is storing the old values, which looks the same as what we did for read committed, which also stores old values. What's the difference?
In snapshot isolation, we store many old values, as opposed to just the most recent one in read committed isolation.
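To make the contrast concrete, here is a toy sketch of the two row layouts. Both classes are hypothetical simplifications (single writer, commits arriving in increasing transaction order), not how any particular database implements it:

```python
class ReadCommittedRow:
    """Keeps only the latest committed value plus one pending write."""

    def __init__(self, value):
        self.committed = value
        self.pending = None  # uncommitted value, invisible to readers

    def write(self, value):
        self.pending = value

    def commit(self):
        # The previous committed value is overwritten and lost.
        self.committed, self.pending = self.pending, None

    def read(self):
        return self.committed


class SnapshotRow:
    """Keeps every committed version, tagged with its transaction number."""

    def __init__(self, txn_id, value):
        self.versions = [(txn_id, value)]

    def commit(self, txn_id, value):
        self.versions.append((txn_id, value))

    def read(self, snapshot):
        # Newest version written at or before the reader's snapshot.
        return [v for t, v in self.versions if t <= snapshot][-1]


rc = ReadCommittedRow(100)
rc.write(200)
before_commit = rc.read()  # still 100, the write is uncommitted
rc.commit()
after_commit = rc.read()   # 200, and the old 100 is gone for good

si = SnapshotRow(1, 100)
si.commit(5, 200)
old_view = si.read(snapshot=3)  # 100, a reader at snapshot 3 still sees it
new_view = si.read(snapshot=5)  # 200
```

The read committed row only needs the old value until the pending write commits, while the snapshot row has to keep it around for as long as some reader might still be on an older snapshot.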
Hi Jordan, so I was doing some reading about the differences between read committed isolation and snapshot isolation, and there was something that I couldn't fully understand.
It said that read committed isolation doesn't check for conflicts at commit time, whereas snapshot isolation does. A bit unclear to me, hope you could help!
I'm not really sure what you mean by conflicts in this sense.
Read committed isolation will just ensure you aren't going to read any uncommitted data.
Snapshot isolation will just ensure that if you read multiple rows, you see them all as of one distinct transaction time, and you wouldn't ever see the database in a state where a transaction was only partially completed.
I can’t believe you left out Kim
I'll be honest with you I've never found her attractive
amazing content so easy to understand by the kardashian reference, i have msft interview in 2 days
Good luck! Let us know how it goes
How did it go?
Is the snapshot taking the old values or the new values? Does t1:100 mean 100 is the old value before the transaction or the new value after the transaction? thx
The new values. t1:100 means transaction 1 is writing the value 100.
So this means to read a value, we should not just read the row, but also the write ahead logs.
How bad is this overhead?
How do we know for which row we should even look into the write ahead logs?
Not sure what you mean by this. In snapshot isolation, we store multiple values with their transaction number where we would typically store just one, and read from there.
@@jordanhasnolife5163 I see, got it
Description 🗿
I let the intrusive thoughts get to me
for Snapshot Isolation to be effective the read part needs to happen as part of some transaction, right? I can't just use multiple read statements without enclosing them in begin and end transactions block.
Yep!
Is this a feature in normal eventually consistent databases like dynamodb or does it have to be implemented by the user? I hope the answer is the first thing.
This is one of those things that would be built into the database. Not sure if it's in dynamo truthfully
Thanks for clearing that up. I just wanted to suggest that you should share your videos on linkedin in a “how would you solve this problem” kinda format with a link to the video, it might help a lot.
@@hardikmenger4275 I appreciate the suggestion! While I may one day do this, I don't want to share on Linkedin just yet as I don't want people I know IRL finding out about these haha - I say too much dumb stuff
Thank you!
So when do transactions run concurrently here? And when do transactions need to wait for a different transaction to complete?
Reads don't need locks in snapshot isolation, you're basically just guaranteed to be able to make reads from a consistent snapshot.
@@jordanhasnolife5163 I really appreciate you answering questions on here. What about when other isolation level transactions interact with a snapshot isolation transaction?
@@Spreadlove5683 Not quite sure what you mean by this truthfully - typically databases will use one style of isolation at a time
@@jordanhasnolife5163 Well, I'm not speaking from experience, but everything I read online says multiple transactions from more than one isolation level can run concurrently.
Curious, in distributed systems how do we generate a monotonically increasing number for the snapshot? Via a Zookeeper type system?
Eventually that's how you'd do it yeah. But for the sake of a single node, just using a write ahead log is sufficient.
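For the single node case, a rough sketch of why the log is enough: every write gets appended to one log, so its position in the log is already a monotonically increasing number. The class below is a hypothetical in-memory stand-in (a real write ahead log is persisted to disk), with a lock so concurrent appends never receive the same number:

```python
import threading

class WriteAheadLog:
    """Toy append-only log; an entry's position doubles as its txn number."""

    def __init__(self):
        self._entries = []
        self._lock = threading.Lock()

    def append(self, record):
        # The lock serializes appends, so each record gets a unique,
        # strictly increasing sequence number.
        with self._lock:
            self._entries.append(record)
            return len(self._entries)  # 1-based sequence number

wal = WriteAheadLog()
first = wal.append("set x=100")
second = wal.append("set y=200")
print(first, second)  # 1 2
```

Across multiple nodes there is no single log to lean on, which is where a coordination service would come in.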
Do all databases which avoid this race condition use Write Ahead Log?
I guess they wouldn't have to but the write ahead log is very important for achieving atomicity and general fault tolerance
@@jordanhasnolife5163 If they don't do it then do they store all transactions corresponding to individual entities/users because it seemed like that's how snapshots were being taken?
haha, Keeping up with the Kardashians 😂
subbing, so you better beat the tiktok girl .
Legend! She's past 100k, we've got a bit to go
lol holy shit that israel palestine joke will age worse year by year
In my defense it was pre 10/7 and I'm Jewish
@@jordanhasnolife5163 oh i knew both of those things
@@scottyjacobson4655 lol
That joke was incredible
Kim fell off
Based