This instructor is so cool. He makes database course fun to learn.
I love how he curses, and has a DJ! 🤣
So much more down to earth than I'm used to.
This is the ideal db lectures every school should offer but not every student deserves... only in CMU... just awesome.. Prof. Pavlo, knows db, luvs db...
Not every student deserves???
@@jaffreyjoy cuz u have to get admitted by school like CMU first.. most of their students have been working hard on the admission. Yep not everyone deserves.
I have been DBA for years and I did not know these intimate details. Great thanks to AP. You are simply awesome.
This course is awesome! This guy is like sensei of database systems.
Amazing. I was reading a totally different db book and wondering why we weren't using virtual memory. This is exactly the answer I needed!
1:06: In one statement from Oracle: insert into r (select 101,'aaa' from dual union select 102,'bbb' from dual union select 103,'ccc' from dual)
loll this class is sick. I wish my profs were this cool.
This course is so good. Andy is awesome
1:04:48 - I'm not sure I understand one thing. If the compaction caused the slot number to change for a tuple "ccc", that means that the upper parts of the system (e.g. indexes) HAVE TO get notified that the way they used to refer to that tuple (page:74608, slot:2) is not correct anymore.
I had exactly this question and was scrolling through the comments to check that I got it right.
Invaluable content!! I have been looking for this for a long time
best db course
notes to self:
we have a page directory to help find the exact location of a page,
and inside each page we have a header + slot array that help locate a tuple
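A minimal sketch of the two lookups in the note above, not the layout of any particular DBMS. The page size, the header and slot formats, and the names `SlottedPage`, `page_directory` and `fetch` are all assumptions made for illustration. In a real system the page directory would map a page ID to a location in a file on disk rather than to an in-memory object.

```python
import struct

PAGE_SIZE = 4096                 # assumed page size
HEADER = struct.Struct("<H")     # assumed header: just the number of slots
SLOT = struct.Struct("<HH")      # assumed slot entry: (offset, length)


class SlottedPage:
    """Toy slotted page: the slot array grows from the front, tuple data from the back."""

    def __init__(self):
        self.buf = bytearray(PAGE_SIZE)
        self.free_end = PAGE_SIZE  # tuple bytes are written just below this offset

    def num_slots(self) -> int:
        return HEADER.unpack_from(self.buf, 0)[0]

    def insert(self, data: bytes) -> int:
        n = self.num_slots()
        slot_pos = HEADER.size + n * SLOT.size
        start = self.free_end - len(data)
        if start < slot_pos + SLOT.size:
            raise ValueError("page full")
        self.buf[start:self.free_end] = data
        SLOT.pack_into(self.buf, slot_pos, start, len(data))
        HEADER.pack_into(self.buf, 0, n + 1)
        self.free_end = start
        return n  # the slot number becomes part of the record id

    def read(self, slot: int) -> bytes:
        if not 0 <= slot < self.num_slots():
            raise IndexError(slot)
        off, length = SLOT.unpack_from(self.buf, HEADER.size + slot * SLOT.size)
        return bytes(self.buf[off:off + length])


# Hypothetical page directory: page id -> page (stand-in for "where the page lives").
page_directory = {74608: SlottedPage()}


def fetch(page_id: int, slot: int) -> bytes:
    """Resolve a record id (page_id, slot) in two steps: directory, then slot array."""
    return page_directory[page_id].read(slot)
```

A record id here is just the pair `(page_id, slot)`: the directory finds the page, and the slot array finds the tuple's bytes inside it.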
Love this song at the end of each lecture.
kudos to the video auditor that took the time to beep the sh*t out of the video
For a moment I thought that the slot (or offset/slot) part of the record identifier never changed and that the only thing that changed was what that slot pointed to. Like, in case tuples get re-organized/de-fragmented, slots would update their pointers.
But it seems, based on this lecture, that slots that the system exposes as part of the record identifier can change.
What is the book they use in this course?
From within Oracle 'set history on' gets you command recall.
Thanks for sharing this great course.
You said that rowid(s) are useful because we don't have to update indexes, but when we inserted into SQL Server, and it changed ids, does it mean that it would have to update indexes as well? Wouldn't it be better to keep ids as is, even if we had to move tuples inside the page (to compact data), and fill the empty slot in the middle with a new tuple using the address after existing tuples? Or do slots have to have strictly increasing offsets for data at the page? Well, I guess, the answer is "it depends" 🙂
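A rough sketch of the design the question above describes: compaction moves tuple bytes inside the page but only rewrites the offsets in the slot array, so existing slot numbers stay valid and a freed slot can be reused by a later insert. This is one possible design, not a claim about what SQL Server or any other system does. The class `Page` and its methods are hypothetical, and the toy ignores the space the slot array itself would take in the page.

```python
class Page:
    """Toy page: slots[i] is (offset, length), or None for a deleted tuple."""

    def __init__(self, size: int = 64):
        self.data = bytearray(size)
        self.slots = []
        self.free_end = size  # tuple bytes grow backward from the end of the page

    def insert(self, payload: bytes) -> int:
        start = self.free_end - len(payload)
        if start < 0:
            raise ValueError("page full")
        self.data[start:self.free_end] = payload
        self.free_end = start
        # Reuse a freed slot if there is one; offsets need not be ordered by slot number.
        for i, entry in enumerate(self.slots):
            if entry is None:
                self.slots[i] = (start, len(payload))
                return i
        self.slots.append((start, len(payload)))
        return len(self.slots) - 1

    def delete(self, slot: int) -> None:
        self.slots[slot] = None

    def read(self, slot: int) -> bytes:
        entry = self.slots[slot]
        if entry is None:
            raise KeyError(slot)
        off, length = entry
        return bytes(self.data[off:off + length])

    def compact(self) -> None:
        """Slide live tuples to the end of the page; slot numbers do not change."""
        size = len(self.data)
        new_data = bytearray(size)
        end = size
        for i, entry in enumerate(self.slots):
            if entry is None:
                continue
            off, length = entry
            end -= length
            new_data[end:end + length] = self.data[off:off + length]
            self.slots[i] = (end, length)  # only the offset is updated
        self.data = new_data
        self.free_end = end
```

With this layout an index that stored (page, slot 2) for "ccc" would not need to be touched after `compact()`. The trade-off is that the slot array can contain holes and never shrinks in the middle.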
He is just awesome!! Can you please upload the lectures un-beeped (maybe by age-restricting them)? The beeps are not good! Thanks a lot Andy!!
Thank you so much for sharing this invaluable content.
Amazing video
can someone explain the "WHY NOT USE THE OS?" part at 18:42? can someone demonstrate with an example?
See this paper: db.cs.cmu.edu/mmap-cidr2022/
is this still relevant? i thought everything is stored in memory nowadays with spark etc
Can a tuple be larger than the page size and be split across two pages?
Very well explained. Many parts became clearer after discussing them with my classmates.
The best part starts at 22:08
Here lies one who hated mmap! xD
Does the page here have any relation to the term "page" in operating systems?
TDD and CI/CD for DBs/Data is the neglected frontier.
This is an amazing class
great course!
I don't understand why we need the slotted pages. If a page is full and we do something like update all the tuples so that they are a bit bigger, won't we then have too much data to store in the page, and we'll need to deal with invalid references anyway? Or can the size of tuples not change? Or do we need to deal with invalid references in that case but we just prefer not to do that all the time for efficiency reasons, and the slots just help us do it less?
i think it might be the 3rd one. Not sure though. Have to see next lectures to figure this out.
Bro is a straight g
Like the beginning~
is doing "vacuum full" on prod databases from time to time a good idea? since it appears to save space
Vacuum full locks tables when it rewrites them, so you need to be careful when you run it.
does this mean that the maximum row size can never be greater than 16 KB? I'm using postgres at work and I think some rows easily exceed 16 KB (with jsonb data).
No. For Postgres, they store larger values in separate TOAST storage tables.
a very important question
does this course feel like an in-depth course?
i mean, do software developers who aren't gonna specialize in DB administration have to know all of this stuff, like data storage in DBs?
You should have a good idea how to use databases, but if you are not going to specialise in this then maybe you don’t need to know everything.
How does one create record id's for external tables ?
how to get the H.W Q's please ? thanks for the best instructor.
15445.courses.cs.cmu.edu/fall2021/assignments.html
did you get it yet ?
@@saifmohamed1776 did anyone get it?
So does the OS use virtual memory for everything except the I/O when running the database server? Won't that be a bottleneck?
Can you elaborate with an example?
So Andy is also called Andrew ...
what is the difference between files and pages?
A page is a small chunk of data inside a file. Like a cluster. It's a unit of storage.
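To make the reply above concrete, here is a small sketch that treats a file as a sequence of fixed-size pages, so page N starts at byte offset N * PAGE_SIZE. The 4096-byte page size and the helper names `write_page` and `read_page` are assumptions for illustration. An in-memory `io.BytesIO` stands in for a real database file.

```python
import io

PAGE_SIZE = 4096  # assumed page size in bytes


def write_page(f, page_id: int, page: bytes) -> None:
    """Write one full page at its fixed offset in the file."""
    if len(page) != PAGE_SIZE:
        raise ValueError("a page must be exactly PAGE_SIZE bytes")
    f.seek(page_id * PAGE_SIZE)
    f.write(page)


def read_page(f, page_id: int) -> bytes:
    """Read one full page; the file is only ever accessed in page-sized units."""
    f.seek(page_id * PAGE_SIZE)
    page = f.read(PAGE_SIZE)
    if len(page) != PAGE_SIZE:
        raise ValueError(f"page {page_id} is not in the file")
    return page


# Usage: a "file" holding three pages, each filled with its own page number.
db_file = io.BytesIO()
for pid in range(3):
    write_page(db_file, pid, bytes([pid]) * PAGE_SIZE)
page_1 = read_page(db_file, 1)
```

The file is the container the OS knows about. The page is the unit the database reads and writes inside it.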
This was so cool
Who is Tim Kraska? And how did he betray you?
TBH I don’t really like the beep muting the original words; it interrupts the lecture and loses the original feel of the course. I think we should honor what the professor said, unchanged.
1:00:00 Tim Kraska betrayed me.. lol
This course requires me to have some SQL basics.
Thanks so much for telling your students you hate Trump... Super helpful!!!!
Will teach LOG-STRUCTURED FILE ORGANIZATION in next lecture
you are talking way too fast, or i think i'm too slow. Excellent course!
50:00
Did he just say shit in a lecture and bleep it out?
23:58
58:40
why are you bleeping the shit out of andrew?
it would be better to just leave the vid as what it was, right?
he thought it was okay to use profanity during live lecture (school seems okay with him) and then dozens of youtubers use profanity here as well ....
personally i think it would be better to not mute it.. the beeeeeeeeep sound really hits my eardrum & i'm getting annoyed by that
I hope the part about not talking to his family because of a voting choice was a joke :(. I know politics are important in America, but your family should be even more important.
course dj...🤣
fuck he talks so fast