The video was good right up until you decided to start speaking in a very hard to parse way right as things got interesting and complicated. What are you, a code obfuscation library?
confuse,,, no... info great.... alignments,,, must get documentation... think better hijack jump destination for branch+and+back than alter jump source... bye.. .. bye... ;-)
Did we just watch Amos devolve into a caveman?
lol I had to rewatch caveman part a couple times haha but this video was awesome
At exactly the part where a coherent explanation would've been really helpful, yes.
caveman smarter
caveman big better not caveman
Why use many word when few do trick
He turned Vietnamese
"A program is a miserable little pile of functions." subbed.
I mean, let's be honest, the generate shitty c code button isn't *great*, but it's still a whooole lot better than trying to actually read assembly directly.
I have done a fair bit of game reversing (specifically for the pc monster hunter games), and I would have dropped that very quickly if not for the ghidra decompiler.
15:13 He starts doing the "explain it to me like I'm 5 years old" thing.
Why use many word, when few word do trick.
Why many word when few word do trick? 15:50
©️Kevin Malone
Love the improved video and audio quality! ❤
Thanks for all your help!
also the script quality. especially at the end!
Oh, those small words; I was wondering if I had suffered a stroke or you...
Well done!
I have to comment before even watching:
1. The video/audio quality is amazing!
2. On your self-hosted video platform the quality seems to be _even better_ than on youtube. I feel like the 720p video on your site looks as good as 1080p on youtube.
the 720p probably had better compression than YouTube's
15:42 this is how assembly should be explained
Video Quality good a lot better, I like it!
Your decompilation of the script had some issues on the last section.
"we hard together" deaded me
Hey y’all, can someone read my notes below and help me understand what is going on towards the end? I tried my best to replay and write stuff down but I still don’t understand.
0x7ffff7f -> prefix all below addresses with this
0x90d50 -> glxSwapBuffers
0xa8000 -> glxSwapBuffersCopy
0x90da0 -> relative jump within glxSwapBuffers
16:32
0xa8012,+14 -> presumably inside glxSwapBuffers copy
16:34
0xa801a points to address in original glxSwapBuffers, specifically 0x90d56
0x90d56 -> part in glxSwapBuffers that occurs after a jump, but I think this is a mistake. I think he meant to jump to 0x90d80, right? 0x90d80 is the address after the jump occurred…. 0x90d56 is the address where “mov r12,rdi” occurs and before that there is no jump, so why 0x90d56??
16:42
decode at least five…WHAT (size of rel32)? I am so dumb and do not get :(
16:47
if at least five WHAT? I am guessing instructions, right? Or does he mean NOPs? Or bytes? I am so confused.
so if there at least five instructions and no rel32 jumps, we can rel32 to our glxSwapBuffersCopy?
16:48
rel8 jumping for plan b when we do not have at least five somethings and NOP sled into a rel32 jump into glxSwapBuffersCopy
17:00
“if no five [???] without relative, look before function”
0x555555562480 -> detour_sample::glxSwapBuffers -> what is this function? I thought 0xa8000 was glxSwapBuffers copy?
0x555555562480-5 = 0x55555556247b-> exactly five NOPs because alignment stuff I don’t understand, but works for me
17:18
I can understand rel8 jump is smaller (only two bytes) and that we jump to NOPs that have been overwritten to a rel32 jump (5 bytes large), but why do we rel8 jump in the first place? why not just rel32 jump right away?
17:30
I give up.
15:46 0x90d50 is the original glxSwapBuffers function: it starts with an endbr64 (4 bytes), then a push (2 bytes)
15:48 hooked glxSwapBuffers function: the detour crate wrote a rel32 jump (5 bytes) to a function I declared myself, where I can do anything (like reading the screen with glReadPixels) before calling the original glxSwapBuffers. At this point in the execution, the call has been intercepted, we could return without swapping buffers if we wanted. Note that the rest of the disassembly is wrong, because we replaced a 4-byte instruction with a 5-byte instruction, so GDB is confused.
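To make the "rel32 jump (5 bytes)" concrete, here is a minimal sketch of how such a jump is encoded. It is not the detour crate's code: the function name `encode_jmp_rel32` and both addresses are made up for illustration. What is standard x86-64 is the encoding itself: opcode `e9` followed by a signed 32-bit little-endian displacement measured from the end of the instruction.

```python
import struct

def encode_jmp_rel32(source: int, target: int) -> bytes:
    """Encode a `jmp rel32` placed at `source` that lands on `target`."""
    # The displacement is relative to the END of the 5-byte instruction.
    disp = target - (source + 5)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target is out of rel32 range (about +/- 2 GiB)")
    return b"\xe9" + struct.pack("<i", disp)

# Hypothetical addresses, chosen only for the example.
hooked = 0x7ffff7f90d50          # start of the function being hooked
replacement = hooked + 0x1000    # assumed location of our own function

patch = encode_jmp_rel32(hooked, replacement)
print(patch.hex(" "))  # e9 fb 0f 00 00
```

The patch is always 5 bytes, which is why the first 5 bytes of the hooked function have to be safe to overwrite.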
16:33 0xa8012 is a bit of executable memory allocated by the detour crate to hold what's called a "trampoline". It contains the instructions that were overwritten (push r12; the endbr64 doesn't matter I guess, it's counted as padding?) and then jumps to the rest of the original (0x90d56, starting with the first instruction that is still intact).
Here's the tricky bit: why (or rather how?) does it jump to 0x90d56? The jmp at 0xa8014 doesn't jump to "the next instruction" (rip+0x0). It jumps to the value written at address rip+0x0, also written as [rip+0x0]. (The square brackets here are important, they "dereference"). That's what the rest of the screenshot shows: 0xa801a isn't code (the "push rsi; or eax, ..." disassembly is a red herring), it's data: look at the bytes, not the instruction. It's the address it jumps to, in little endian. The next GDB command I do reads 1 "Giant word" (8 bytes) as heXadecimal.
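The indirect jump can be replayed outside GDB. The sketch below rebuilds trampoline bytes that match the description (they are an assumption, not a dump from the video), using the full address 0x7ffff7f90d56 from the notes' prefix, and then follows `jmp [rip+0x0]` by hand. The helper name `indirect_jump_target` is hypothetical.

```python
import struct

# Assumed trampoline layout, reconstructed from the description:
#   41 54               push r12          (the relocated instruction)
#   ff 25 00 00 00 00   jmp  [rip+0x0]    (indirect jump through memory)
#   <8 bytes>           absolute target, little endian (data, not code)
target = 0x7ffff7f90d56
trampoline = bytes.fromhex("4154" "ff2500000000") + struct.pack("<Q", target)

def indirect_jump_target(code: bytes, jmp_offset: int) -> int:
    """Follow a `jmp [rip+disp32]` located at `jmp_offset` inside `code`."""
    if code[jmp_offset:jmp_offset + 2] != b"\xff\x25":
        raise ValueError("not a jmp [rip+disp32]")
    (disp,) = struct.unpack_from("<i", code, jmp_offset + 2)
    # rip already points past the 6-byte jmp when the address is computed.
    slot = jmp_offset + 6 + disp
    # Dereference: read 8 bytes at the slot as a little-endian address.
    (address,) = struct.unpack_from("<Q", code, slot)
    return address

print(trampoline[8:].hex(" "))                   # 56 0d f9 f7 ff 7f 00 00
print(hex(indirect_jump_target(trampoline, 2)))  # 0x7ffff7f90d56
```

The first printed line is what a disassembler would try, wrongly, to decode as instructions; the second is what `x/1gx` shows for the same 8 bytes.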
Hopefully that clarifies the first strategy. As for the rel8 strategy: it's 5 bytes (the size of a rel32 instruction), or 5 1-byte NOPs, not "5 instructions". The addresses shown in the disassembly in this part don't match up because the "let's just rel32" approach works with the actual glxSwapBuffers. So in that last part I show what we could've done if the function we hooked didn't have room for a rel32 at the beginning.
17:01 shows what such a function might look like: it's two bytes, and then it returns immediately. We can't fit a rel32 in there. But we can fit a rel8 to /before/ the function, if there's padding before the function. And in that padding we can fit a rel32, and do as before.
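As a rough sketch of the rel8 strategy with the demo function's address from the notes (0x555555562480, with 5 bytes of padding right before it): the two patches could be computed like this. The encoder names and the location of the hook function are assumptions for illustration; the opcodes `eb` (jmp rel8) and `e9` (jmp rel32) are standard x86-64.

```python
import struct

def encode_jmp_rel8(source: int, target: int) -> bytes:
    """2-byte short jump: opcode eb + signed 8-bit displacement."""
    disp = target - (source + 2)   # relative to the end of the 2-byte jump
    if not -128 <= disp <= 127:
        raise ValueError("target is out of rel8 range")
    return b"\xeb" + struct.pack("<b", disp)

def encode_jmp_rel32(source: int, target: int) -> bytes:
    """5-byte near jump: opcode e9 + signed 32-bit displacement."""
    disp = target - (source + 5)   # relative to the end of the 5-byte jump
    if not -2**31 <= disp < 2**31:
        raise ValueError("target is out of rel32 range")
    return b"\xe9" + struct.pack("<i", disp)

func = 0x555555562480      # demo function from the notes
padding = func - 5         # the five NOPs right before it
hook = func + 0x2000       # hypothetical address of our own function

padding_patch = encode_jmp_rel32(padding, hook)  # goes over the NOPs
entry_patch = encode_jmp_rel8(func, padding)     # goes at the function start

print(entry_patch.hex(" "))    # eb f9
print(padding_patch.hex(" "))  # e9 00 20 00 00
```

Only 2 bytes of the function itself are touched; the 5-byte jump lives in the padding, which nothing executes otherwise.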
(The demo function for the rel8 strategy is also called glxSwapBuffers but is actually written in assembly and has nothing to do with the libGLX.so.1 library).
In summary: when we want to hook a function, if it starts with at least 5 bytes' worth of instructions that aren't relative jumps (of any kind), we can overwrite it with a rel32 jump to our code, create a trampoline elsewhere that allows executing the original prologue then jumping to the rest of the original function.
If the function is smaller than 5 bytes, or if it immediately has a relative jump (of any kind), our only hope of hooking it (while retaining the ability to call the original code!) is to find padding within rel8 range of the start of the function (a displacement of -128 to +127 bytes, measured from the end of the 2-byte jump), where we can write a rel32 jump, which we'll reach with a rel8 jump written at the start of the function, overwriting whatever original code was there.
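The decision in this summary can be written down as a small sketch. It assumes the prologue has already been decoded into (length in bytes, is it a relative jump) pairs, which a real hooking library would get from a disassembler; `choose_strategy` and its inputs are hypothetical and simplified (padding is assumed to sit directly before the function).

```python
from typing import List, Optional, Tuple

REL32_LEN = 5  # size of a `jmp rel32`

def bytes_to_relocate(prologue: List[Tuple[int, bool]]) -> Optional[int]:
    """Whole-instruction byte count covering REL32_LEN, or None if impossible."""
    covered = 0
    for length, is_relative in prologue:
        if is_relative:
            return None          # relative jumps cannot simply be moved
        covered += length
        if covered >= REL32_LEN:
            return covered
    return None                  # function is shorter than 5 bytes

def choose_strategy(prologue: List[Tuple[int, bool]], padding_before: int) -> str:
    if bytes_to_relocate(prologue) is not None:
        return "rel32"           # overwrite the start, build a trampoline
    if padding_before >= REL32_LEN:
        return "rel8+rel32"      # short jump back into the padding
    return "cannot hook"

# endbr64 (4 bytes) + push r12 (2 bytes): 6 bytes get relocated, which is
# why the trampoline resumes at 0x90d50 + 6 = 0x90d56.
print(bytes_to_relocate([(4, False), (2, False)]))       # 6
print(choose_strategy([(4, False), (2, False)], 0))      # rel32
# A tiny function (2-byte instruction, then a 1-byte ret) with 5 NOPs before it.
print(choose_strategy([(2, False), (1, False)], 5))      # rel8+rel32
```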
Hey man great video! Sorry for the shallow question, but would you share the font and colour scheme you're using at 5:58 please?
The font is Iosevka (it's in my Twitter bio because everyone keeps asking) and the theme is GitHub Dark, most likely
@@fasterthanlime thanks
amazing
Detouring from SNES injection to talk about detour, nice.
i can't wait :)
"We hard together.. Uhh, video end now"
15:50 cursed
I... can't... leave... stuck here by whatever force
He speak perfect Nopon, meh
Where my TASBot at?
CPUs interpret machine code _by_ translating them into micro-ops (-: