Andy has a great team. Dana's an awesome lecturer!
A little unclear on 35:20 about why MySQL and Postgres differ in read/write performance. I understand why writes are much faster using Delta Storage: we don't need to record a whole new tuple if we've only modified a few of its attributes.
Then Dana mentions something like "The disadvantage [of Delta Storage] is that you have to replay deltas to put the tuple back together to its correct value (?)". It looks to me like we _are_ storing the current version of a record in our Main Table though, right? There isn't any reconstruction/replay work that we'd need to do to just serve that latest version to a client?
OhHhHhHh I forgot that we're talking about Multiversion Concurrency Control here :) So to be able to serve a transaction some past version of a record that it should be privy to, we have to replay in reverse the operations in the delta storage segment onto the latest version of the tuple. Gotttchhaaaaa.
And this is slower than doing either Append Only Storage or Time Travel Storage as a Version Control scheme for MVCC because for those two options, past versions of a tuple exist in their entirety (either in the Main Table or in the TimeTravel table, respectively).
So when we use Delta Storage we're getting (+) better write perf, lower storage overhead, but some (-) reconstruction overhead for reads by transactions, because in delta storage we aren't storing full tuples for past versions of tuples, and have to reconstruct those past versions through replay.
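To make the replay idea concrete, here's a minimal sketch (hypothetical names and structures, not the actual MySQL implementation) of reconstructing a past version under delta storage: the main table holds only the latest tuple, and each delta records the old values that an update overwrote, newest first.

```python
# Latest version of the tuple lives in the main table (here, ts=30).
latest = {"id": 1, "balance": 900, "name": "alice"}

# Delta segment, newest first: (version_ts, {attribute: value before that update}).
# Update at ts=30 changed balance 800 -> 900; update at ts=20 changed 500 -> 800.
deltas = [
    (30, {"balance": 800}),
    (20, {"balance": 500}),
]

def reconstruct(latest, deltas, read_ts):
    """Replay deltas newest-to-oldest, undoing each update that the
    reading transaction is not allowed to see, until the visible
    version of the tuple is rebuilt."""
    tuple_ = dict(latest)
    for version_ts, old_values in deltas:
        if read_ts >= version_ts:
            break                      # this version is already visible
        tuple_.update(old_values)      # undo that update's changes
    return tuple_

print(reconstruct(latest, deltas, read_ts=25)["balance"])  # 800 (ts=20 version)
print(reconstruct(latest, deltas, read_ts=10)["balance"])  # 500 (original)
```

This replay loop is exactly the reconstruction overhead that append-only and time-travel storage avoid by keeping full copies of every version.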
I love this class = )
Nice observations. A few points I could think of:
1. One reason MySQL may be faster on writes is that it doesn't have to update all the secondary indexes on key k1 when we update k1's attributes by inserting a new version of the record. This is because its secondary indexes either point to the primary index (a logical pointer) or point to the oldest version of the record, so the secondary indexes don't need to change. Otherwise, one update on key k1 leads to multiple secondary index updates.
2. Replaying deltas is required when a query wants to read past data.
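A toy sketch of that first point (simplified, hypothetical structures — not the real InnoDB or Postgres internals), contrasting logical pointers with append-only physical pointers:

```python
# Logical pointers (MySQL/InnoDB-style): a secondary index maps the
# indexed value to the *primary key*, so installing a new version of a
# row never touches the secondary index.
primary = {"k1": {"email": "a@x.com", "balance": 900}}
sec_logical = {"a@x.com": "k1"}          # value -> primary key

def update_logical(pk, new_balance):
    primary[pk] = {**primary[pk], "balance": new_balance}
    # sec_logical untouched; lookups still resolve value -> pk -> primary

# Physical pointers (append-only, Postgres-style): each index entry
# points at a tuple's physical slot, so every new version appends an
# entry to *every* index on the table.
heap = [{"pk": "k1", "email": "a@x.com", "balance": 900}]
sec_physical = {"a@x.com": [0]}          # value -> heap slots

def update_physical(slot, new_balance):
    heap.append({**heap[slot], "balance": new_balance})        # new version
    sec_physical[heap[slot]["email"]].append(len(heap) - 1)    # extra index write

update_logical("k1", 1000)    # one write, secondary index unchanged
update_physical(0, 1000)      # new heap tuple *and* a new index entry
```

One logical update touches only the primary record; one physical update multiplies into index maintenance, which is the write amplification being described.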
At 17:44, Dana mentions reviewing the lecture on isolation levels from last year, but I couldn't find it. If it's uploaded on YouTube, can someone share the link please?
ruclips.net/video/tHlIxfTCMoY/видео.html
Carnegie Mellon? Are you kidding me? She was just reading something... I suggest watching this instead: ruclips.net/video/GILqZvxD6_g/видео.html, also from CMU, one year earlier.