@@deschia_ We did testing on it back in college, comparing hand-coded assembly, C, Fortran 77, PL/1, and last (and least) Cobol. C and Fortran compilers did a reasonable job of producing something pretty close to what we did in assembly. PL/1 threw in some extra overhead which I think was related to memory management. And Cobol created a scary pile of machine code that we decided not to look into too deeply. I think it was summoning something from Cthulhu.
I don't care what other viewers say. Keep using paper! Sometimes you have to go the extra mile to make a point. I like your teaching style. Thanks for the videos. Very good info here!
+JP Aldama I agree. There is something I love in making notes on printed text. Plus, you can explain something so much quicker on paper because drawing and organising information is quick and intuitive, whereas doing such on a computer takes time to plan out.
They don't teach us that because the last forty years of computing history has been all about NOT reinventing the wheel. People got tired of having to start over every time a new computer came around, so we standardized our hardware, and operating systems (most notably, Unix) became portable between CPU architectures. Developers (the vast majority of them, at least) stopped caring about the low level stuff because they didn't need to anymore, and the computer science world progressed towards higher level things. They don't teach us how to actually do it because to go from nothing to even just a bare bones, functional shell environment by yourself would take years and years of development. So they just teach us the theory behind how it works and leave it up to you to do that stuff, if you want to. I feel where you're coming from, though. I used to feel the same way and I tried to learn things from the bottom up, but trust me; you'll be a lot better off if you start with the higher level systems and work your way down. It gives you kind of a bigger picture to see where the little things fit into.
It's not a problem, though. Nobody teaches 8-bit assembly because nobody uses 8-bit assembly anymore except hobbyists, and hobbyists already have many resources available them to learn from. Not to mention that most people interested in 8-bit assembly grew up with computers that ran it, and thus already know it! In fact, we have access to all the resources they did and more with the help of the internet. We can't expect the world to cater to our extremely niche interests. That's why we're all so grateful to Ben for sharing his knowledge and guiding us through the process
Not just hobbyists. Assembly language can also be useful for hacking. I'd imagine it be really useful for reverse engineering, finding certain exploits, and malware development.
The instruction at 0x10000f63 is moving the result of the printf function (the number of characters written) to a location in memory (even though it isn't used)
I never figured out what the printf() was supposed to be; It is implemented in 16-bit code that has to keep two registers pointed to the same address; It runs much much slower than what makes sense to me; A data-block like 1024 or whatever shroud be alloc at init; Like above while ( int ) I found much established C/S to be Horror Code of the Damned written by relatives of the Munsters to prevent use of sanity checks like if do while which works much much better due to zero based indexing
When I first started programming in C (mid 80s) I wanted to make sure the compiler was doing a good job and would always check the assembly for timing critical code. After doing this for a while I realized I could write the C code in such a way to influence the compiler to output very efficient assembly. Nowadays, the few times I do this, I'm amazed at how good modern compilers have gotten at optimizing for speed.
@@random-user-s theres not much you can do now in days lol. Also not really worth it imho because of how good compilers have gotten. But there are some reserved keywords in C and C++ that can tell the compiler certain things. All i really know about is marking functions inline can speed up the compilation process sometimes and can boost performance. Again, its not really worth doing that because the compiler should do all of that for you when necessary (if you mark the compiling command with -O3) its pretty easy to look up and youll eventually get the hang of it when you code more
@@random-user-s Why would you? What's wrong with your compiler that you want to upgrade it already as a beginner? Maybe you should just try another one? Msvc, mingw or clang.
@@squizex7463 Maybe just due to curiosity? Or because a man wants to understand things better or just keen to do hard things? By your logic, one doesn't have to do anything because all the good things, by which you can do your software, are already written. So all you left to do is use them, which's boring af
I guess it's tightly related to how memory and cpu work internally. And it is very limiting due to the binary nature as well as a frequency ceiling of the transistors. Dead end if you will in my opinion. Invention of multiple cpu cores bought us some time I suppose but the future is somewhere else.
I always thought assembly is useless and just a waste of time and money to take that class in uni but after I finished the class I realized how important it is, this might seem like an exaggeration but Assembly made me finally understand how Computers actually work and its diff one of the most important classes in CS . also its really useful for reverse engineering a TA in my uni showed me how to crack a program just by understanding assembly
@Adam Richard lol so true, I tried making more elaborated programs and instantly gave up. The fact it might be very different for each processor one might have makes it very discouraging. Or just raging, don't even need the "disco"
Ah, I thought that the memory adresses were just chosen "ranndomly" by the compiler". But this makes me wonder though... how does the computer know how much space a variable takes up? Nothing in the machine code in the video shows that. What if the variable took up more than 4 bytes?
@@aurelia8028 in many languages you determine the datatype right? in "int x = 2;" an "int" is for example always 4 bytes and "double y = 5.4;" would make it 8 bytes etc *edit:* the size also depends on your platform... as mentioned by another commenter below, an int may be 2 bytes as well
The eax register will contain the return value of the printf function. Evidently it is being stored on the stack in the expectation that it will be needed later. Presumably you had the optimiser turned off when you compiled it.
I'm genuinely surprised C makes so much use of the hardware stack, since if you looked at the C2 compiler in Java for example it absolutely hates using stacks and almost always does everything in registers unless it has no other choice
@@theshermantanker7043 If you compile on any level of optimization, it usually doesn't make as much use of the stack. By default, GCC compiles with absolutely no optimizations on, though. I find it's easier to make sense of the compiler's assembly on -O1 (the lowest level for GCC), because it puts things in registers a lot more, like a human would.
@@theshermantanker7043Originally that is what the register keyword was for. It told the compiler you wanted it to store variables in registers if possible, but it was just a request and not a given.
I see, thanks for pointing that out, its interesting that the compiler still consider that [printf] would need to going back to where it come from even when it see that the loop is infinite
One of my college profs was in the Navy and needed to write assembly for the Navy to optimize COBOL code. He wrote it in FORTRAN and turned in the assembly. They had strict goals on lines of assembly to be written and debugged per day. He always met his goals. His reasoning was that FORTRAN was a pretty efficient language, and so he probably couldn't do much better. The Navy never knew they were converting their COBOL to FORTRAN.
You’ve reminded me of a talk I gave this year showing how some fortran code appeared in assembly. Fortran is still widely used in my field (supercomputing) and understanding the impact of such things like compiler optimisation is very helpful.
I don't know why, but this video is very satsfying to watch as a programmer. It's very logical and makes sense. Like if you'd suddenly have a partial look into a womans brain and actually start understanding something.
Why, I think everyone learns backwards, If they would start at low level which is cold hard logic memory movement and work up the chain I believe they would learn how to program much faster. Lang like basic trigger bad habits that become hard to break such as never clearing your memory or initializing variables and things like C++ have turned into a cluster fuck due to the Total Over use of OPP everyone seems hell bent on these days. I would suggest if someone wants to learn to code go back to DOS, Get Turbo C and use that, It was a great lang with great documentation to help you telling what every single command did ect.
If he tried to learn Java before ASM hes going to be crying like everyone else on this video is about how hard ASM is to understand when its WAYYYYYYY easier to understand then any lang I have ever used including Basic. I think the Fail comes with most people because they don't comment their code and lose track of whats what but its simple top down programming that can be traced with ease.
I know I will catch a mess load of flak for saying it because I still get a lot of flak for using it from time to time but I honestly believe DarkBasic is one of the better things for a programmer to start in.... Hear me out before yall hate on me. Starting off a programmer wants results, ASAP. With darkbasic its as simple as Sync On Make Object Cube(1,10) Position object (1,0,0,0) Position Camera (0,100,0) Point Camera (0,0,0) do control camera using arrow keys 0,1,1 loop wait key That code above will draw a cube on the screen and point the camera at as well as allow you to look around with the arrow keys, it which is a great starting point for most hobby programmer since the will feel the excitement going right away with a 3d object they can manipulate. This same code in say C++ for instance would literally take hundreds or thousands of boiler plate code just to setup the engine to draw the cube and accept the input. Look into darkbasic. Its old but its effective and its fun as all hell to toy with.
I started on a TRS 80 Model 1 with 2k ram and a cassette tape player. Basic. Then a Commodore 64. Commodore Basic. Then C on my BSD systems at home, took online local community college courses for Visual basic .net and C - grew tiresome. Right about then it became evident that code monkeys had to compete with $3/hr dev teams in India. Writing on the wall was that the money would be in Java. I stuck with sys admin needs; Perl and C. FEAR of Java, FEAR of having to think about this stuff, FEAR of actually applying what I've learned in school... NEVER learned these basics. (been TAUGHT it many times!) Never formed this solid foundation. In other words; I can't code to save my life...but I have worked for years making money doing it. Flying by the seat of your pants every day...making it work, doing the seemingly impossible. There is reward in that, at least. It feels good to actually DO this stuff in the real world for real world paying client needs. I can't even last in a programming conversation for two minutes. My point? - Just *do* *it*.
Just a little remark for people wondering why the code generated by the compiler contains strange and unuseful constructs. It is simply because the code was generated with the -O0 parameter which means, no optimization whatsoever. This means that the compiler basically does a nearly 1 to 1 translation of the C code to the assembly, without considering if the operation are redundant, unused or stupid. It is only when optimization is enabled that the compiler will generate better code. In this example, for example, it is stupid to read & write x, y, z continuously from memory. An optimizing compiler will assign register in the inner loop and will never write their values to memory. The spilling of the printf return value 'movq eax, 0x14'bp)' will of course not be emitted/
Interesting that clang -O2 results in the output values (1, 1, 2, ... 144, 233) being hardcoded into the binary. The clang compiler is evaluating the result of the loop at compile time.
@@zoomosis Hahaha, thats very interesting. I've always thought compiler does so complicated stuff that Im not gonna even try to understand it. So I always assume that they can do pretty much anything. I wish to write my own compiler one day, very simple though.
Can you define what you mean by 'spilling'? I mean, yeah, the return value of printf is loaded into this memory location, but it is never checked for success anyways, so why isn't it redundant?
@@lukasseifriedsberger3208 That 'redundant' store of the printf() result _IS_ the 'spilling'. The 'prototype' of the printf() function shows that it returns an int so, by default, the compiler will SAVE that value somewhere (even though the value is never used!) If the source code is compiled with some degree of optimisation (eg: -O1, -O2 etc), then it will remove this redundant store of the printf() result since it's never USED! For further reading, what does the returned value of the printf() function actually mean!!! (Not many people have ever USED this printf() return value, so they don't know what it actually signifies - It's probably more relevant for sprintf() or fprintf())
I like how the compiler optimised the while(1) into an unconditional jump instead of actually evaluating the expression "1". I know compilers have been doing that for decades, also it's a very basic optimisation, but I enjoyed seeing it on paper :D
@@_yakumo420 I think it is pretty interesting that even without any optimization, it became an unconditional jump, rather than test whether the int 1 evaluated to 1 (I'm pretty sure that's how while(..) works in C). I guess it's common enough that the GCC developers just hard coded that optimization construct into the compiler?
@@splashhhhhhhhhh Yes indeed but that wasn’t the point. The compiler detected that it’s a tautology and optimised it even without the optimisation flags set.
Even without optimizations on there are some optimizations that will always take place, such as not using hardware multiply/divide/modulus on powers of 2 etc
Regarding, moving eax onto the stack. eax contains the return value of the printf call. It's not actually needed by this example. It's probably saved to help a C debugger display what was returned and is likely a nuance of the compiler.
So basically, it's almost like the compiler turned "printf ("%d ", x);" into "int oX14 /* I chose the name as a mock of the memory location shown in the above assembly */ = printf ("%d ", x);"?
This makes sense, but I was wondering why this instruction only occurs after the prior 7 lines instead of right after the call instruction? I'm guessing this might be because the cmpl instruction will actually overwrite the value of eax to store the comparison result. Does this have to do with the compiler not being able to look ahead to see if the value will be referenced and just postponing storing the value for future reference until it absolutely has to? Also, this would mean the instruction wouldn't be there if the routine wouldn't reuse eax and just returned instead, correct? What code could have followed and still use this value at this point, without explicitly assigning it to a variable right away? Can you give an example?
Thanks for the explanation, but I'm still unclear on part of it. I understand that eax/rax contains the return value of the printf function and by the time "movl %eax, -0x14(%rbp)" gets executed, that's still the value of eax. From what you're saying, I get that trying to access -0x14 from assembler code would be a mistake, and I get that, but I don't see why the value needs to be kept around at all - it's clearly not referenced anywhere in the source code? What use is the return value of the printf function at that point? And why does it only get moved to that address at that point in time, instead of sooner?
Yes, I suppose so, in that I agree with you: it's really a question about the compiler and not so much about the program either in C or assembler. I'm a software engineer myself, and having written compilers, as well as tinkered with command interpreters in the age of DOS on an 8086, I can strongly relate to what you're saying. My curiosity was raised by the question raised in the video, about the meaning of that particular instruction - which was answered above by +Dameon Smith: it's the return value of the printf function that's being saved for whatever reason, independently of the program under consideration. I suppose I could look into the inner workings of the GCC compiler to find out, I more or less hoped someone might have an intuitive (and therefore short) reason off the top of their heads. But I agree with you, that's likely not the case - and certainly not the topic of the video, as the author rightfully stepped over the problem and seems to have taken some care to write their C code in such a way that the assembler would be as clean as possible for demonstration purposes.
Bvic3 Notice that something is being already saved to the position 0x04 at the top. And the number is basically an offset to the base pointer (%rbp) so 0x00 would be the base(?) of the stack frame. I don't know, maybe something is stored there
The compiler emits movb $0,%al because printf() takes a variable number of arguments. The ABI specifies that when calling such functions, %al must contain the number of floating point arguments. There are no floating point arguments passed to printf() in your example, so %al is set to zero.
Which ABI are you referencing? I tried to look for an appropriate OS X ABI that would cover the cdecl calling convention, but nothing I found mentioned this approach to counting floating point arguments.
Too late in this discussion, but the zero inside "movb $0,%al" is just an information, that the printed value should go into stdout stream (in normal circumstances it means that it will be printed on the screen). Anyway, this video and discussion have returned back a lot of memories... And last but not least, If anybody would like to, source codes for printf() are available, but be warned this function is really complicated one, because of a posibility to use variable list of of arguments with all kinds of types, formats and architectures.
Yeah, you're correct; machine code is literally just binary. Otool seems to be a disassembler; it tries to format the machine code into something a little easier for a person to read Trying to read an executable written for an operating system through a hex editor or something would leave all the header information and such in the output; making it a little more difficult to see what's going on
I could be wrong, but the actual machine code would be 1s and 0s of the low level language the CPU uses. The code shown in the video is that code translated into a kind of assembly.
Machine Code is binary ... The nemonic we use LDA ... etc is assmebly language and in hex because 255 ones and zeros take up a ton space on a line ... while ffff doesnt converting from binary to assembly you run an ASSEMBLER and to convert a langauge like C++ you compile it into assembly language then assemble it in to machine code ... because sending ffff is easier to handle than 255 ones and zeros in a line
frozen_dude - Yeah, I was hopping to have "otool" installed, but I didn't. I looked around and found this: stackoverflow.com/questions/137038/how-do-you-get-assembler-output-from-c-c-source-in-gcc There are lots and lots of ways to get gcc to output the intermittent stages of compilation. I love gcc! If people have never walked through the stages of compilation, I highly recommend doing it.
I thought I'd throw an example of the complete compilation stages out there... I guess because I find it interesting and informative. So when you compile a C source file, the process goes through 4 stages: Preprocessing, Compiling, Assembling, and Linking. 1. Preprocessing: 'gcc -E example.c -o test.i' < The example.c file is preprocessed with the include files, and other directives, #ifdef, #include, and #define. 2. Compiling: 'gcc -S example.i -o example.s' < The source file is compiled into assembly. 3. Assembling: 'gcc -c example.s -o example.o' < The assembly file is converted into an object file, a machine code file. 4. Linking: 'gcc example.o -o example' < The machine code file is linked together with other machine code objects and/or object libraries into an executable binary file. The *.i and *.s files can be examined in your favorite text editor. The *.o file and the final binary file are both binaries, so you'll need a hex editor to view their contents.
This is a good example on why learning coding without understanding how computer technologies layer on each other seems so daunting. Just learning a coding language is not really that difficult. But coding is complexity built on complexity, and each layer down it become exponentially more complex. From an outside perspective, like when when I first started learning code, it feels like you don't just need to know the top layer of knowledge, be it python or c++, but you need to understand what makes that work and how something else makes that work. At the end of the day Id have the impression I was going to have to learn how electricity works to understand the chipsets or ram to understand the next layer to understand the next layer all the way up to my code. The great thing is that these languages were made so we don't have to do that. OOP and modern tech has almost made everything so independent and modular that you can learn the end result without knowing fuck all about how it works. You don't even need to know to code to write games anymore.
You are right but there is one thing.. I dont think learning OOP or coding language is easy .. They are also difficult because if you want to learn really well they steal a loot of time from you :(
I was thinking this exactly today! I was wondering how much do I need to know about this stuff and how may It help me. Although I know I don't need to know all of this stuff is so interesting to me and I think It can give me a better understanding of computer science as a whole, so I'm planning on at least do some research. It's only been 8 months since I started learning Web Development but I am fascinated with everything related to computer science.
AKA "dry running", in the days when computer time was horribly expensive. It's still the best way to understand what's going on in code, and uncovering places for code optimisation, if performance is a problem. Don't optimise code before you've considered the algorithm, though.
@@thewhitedragon4184 they give you a block of code with a lot of unusual stuff and you have to answer what it outputs, or what are some elements of an array or something similar
I taught myself BASIC then Pascal then C++. Learning was actually fun with some of the books they had in the '80s. I got a C64 for my 8th birthday, and I got the C64 Programmer's Reference Guide. It's just amazing the things that were in that book. It went from teaching you BASIC to showing you the memory maps, the pinouts for all the chips, and how to do graphics and sound. But it also had a 100-page chapter teaching assembly! It confused me because it made cryptic references to an assembler called 64MON which I had no idea how to get, but that made it more intriguing. The assembler class I took in college was also one of the only interesting classes I ever took. But I'm pretty weird. I was such a nerdy kid that in middle school I wrote letters to Brian Fargo and John Carmack asking for career advice.
@@captaincaption Brian Fargo actually wrote me back! That would've been about 1990 or 1991. I don't know what happened to the letter. I really loved Bard's Tale III and Wasteland. And today, 21 years later, I'm doing 2nd round interviews for L5 (senior dev) at Google ... but I just wanted to see if they offered anything interesting.
Is it just me or you are feeling excited as well when you see machine language? I was learning python and working on stuff with for like everyday in 8 months. I started learning C and now it just feels a lot of fun language to work with! I even gave a break to python for the time being. Watching assembly feels interesting as well.
@@unknownguywholovespizzaTo me it is eye-opening to see the true atoms of computation. It bridges the understanding of high-level programming and the understanding of how hardware fundamentally operates on the values stored in memory. I am a beginning game developer. I have heard stories of how developers have written their games in C or even directly in assembly to maximize performance while keeping the size of the games very low. While most of my projects use existing engines and much higher-level languages for the ease they provide, I wish to pursue skill in C and assembly so that I may be able to write games that perform as well as humanly possible.
This is so cool, and I think this would be a way more fun/efficient way to learn Assembly than what's taught in colleges. It's way easier to see where these commands come from and what they mean if they're being directly compared to an actual C program. Much harder if a bunch of Assembly terms you've never heard are tossed at you and all of a sudden you're expected to code a program like this.
Try coding in machine language, now that was a chore. Assembly is just a higher level language that is converted/compiled into machine code. I originally started out studying electronics, so we had a course in machine code and had to write a program using it.
I appreciate your effort to make this teaching video to share what you know and honestly say don't know to things you don't know. Well done. I'm not sure either what's the point of moving the contents of eax register on to stack.
so it can be formatted loaded and printed ... it has to strip the format out of the print ... the the data pointer then the data then print it ... and a stack is the best place to do that from as you can shift left and grab the format ... and then shift left and set format up then load the next chunk and shift left ... read data pointer ...and shift left ... load data .. shift left and finally print ...
Remembering my first programming. You looked up the op codes and entered them on a keypad in Hexadecimal. This literally was writing the cpu instructions directly. I miss the 6502.
Way back when I was in school, we had a lab course working with the M6800 (6800, not 68000). I used to write my programs in C then hand-compile them into M6800 assembler. And of course, hand convert that into machine code, which then had to get toggled into the machine.
Hey me too man! Learned Basic on my Apple II and when I wanted to include some heavy-duty math subroutines, I'd POKE the hex code into a memory location then call it when needed. Even on that old 8-bit processor it ran blazingly fast!
The 6502 instruction set was very nice and clear, as was the Z80's to some extent. The intel instruction set was ugly in comparison. ARM assembly language is even worse, it's not meant for humans. Every instruction can do something and can also do something completely different, depending on some weird prefixes. I hope no human being was ever forced to write ARM assembly code.
This actually makes sense. As mainly a C# dev, C isn't actually hard, first off. Pointers and such can get a bit complex, but they make sense. This code is certainly simple. The assembly makes sense too. It is beautiful how simple it is and how it uses such sinple functionality in order to create more complex end results. This helped my understanding of Assembly and it might be one of the things that help me finally make a PS2 game one day.
It seems simple, until you have to do implement data structures in C; then you find yourself crying for days on end, because you can't seem to resolve the clobbered memory errors that keep popping up on you!
@@IM-qy7mf AddressSanitizer makes this significantly easier to debug, though. It's like a plugin for compilers that instruments code using the compilers' own semantic information. You should also get in the habit of writing asserts for potentially incorrect or dangerous code.
Since it's always true, checking it is a waste of time. Even with optimizations "off" some optimizations are always done. Such as bit shifting instead of MUL/DIV by powers of 2.
this brings me back to my assembly class at university, in 2002. i liked that class a lot, but i've never used it again since i didn't go into a career in embedded
@@tamny9963 - So you think that because someone can't remember something from 20-years ago that they're automatically lying? Or, are you just looking for attention?
Really enjoy your videos, started my programming journey, if you will, about 5 years ago with the idea of wanting to make video games. i later found assembly programming and electronics engineering FAR more interesting than game design. I have been learning 8086 ASM on DosBox lately hoping i can get enough experience to understand how computers work entirely, i am currently in the process of learning how different IC's work on a breadboard and hope to build my own 8bit computer soon. Thanks for getting me started on such a fun hobby i hope to make my job someday, keep up the excellent videos! Hope to see your channel continue to grow :)
Maybe he figured out that using machine language in software makes your product un-portable. There are many reasons *not* to write in assembler. And there are distinct instruction sets for different CPU architectures, so you can learn one ISA (inst set architecture) or you can learn all of them; compilers *do* have their advantages. All digital computers work the same way (registers, storage, interrupts, etc) but the devil's in the detail level you can't avoid in assembler. Everybody should *know* what compilers do and appreciate that today's compilers (I've been doing this for 40 years) are very, very good. You should also understand the overhead of interpreted languages like Java & Python (and the list goes on) before you make an implementation/design decision. Knowing the heart of how most of your customers' machines work (x86_64 for {lap,desk}tops, ARM ISAs for phones/tablets) is a valuable datum, should motivate us all to write code that's as efficient as possible. I still check my assembler output most of the time, but I'm about ready to retire ... probably an "old skool" type. But today's typical bloatware sucks. *Fight it.* Take pride in your work, know what you're delivering :-) _and good luck on your autodidactic journey!_
Minor correction, because I used to program in 8080 and Z-80 Assembly: Those instructions from the disassembly are more properly referred to as assembly code instructions. Machine code would be represented by nice hex numbers for the opcodes and operands.
Actually Z80 machine language is relatively easy to program by hand, for each opcode there are few bits of prefix and then register addressing etc. Then you convert all the bits in a hex number and done
Early textbooks used to make a distinction between assembly mnemonics and machine code. Looks like those days are long gone and the terms are used interchangeably.
Z80... My computer life started programming a TK-82C at 1982... Good times... 15 Minutes to load a 15 KB program from a cassette tape (after many attempts)...
Since we're being pedantic here about the difference between assembly code and machine code, it doesn't HAVE to use 'nice hex numbers'. Some CPU architectures were more suited to OCTAL representations, and technically, binary would be equally valid! Footnote: Check out the MODR/M byte in x86 code and you'll see how well-suited it is to use octal in this specific case! Having said that, I willingly admit that I'm predominantly a binary and hex man... LOL
The mnemonics directly represent those hex numbers. If he did print out the instructions in hex, you may as well then complain that it's not really machine code because it's not stored electrically in a computer, but printed with ink. It doesn't matter how you represent something, it's the same thing.
Man, well done to you, you perfectly explained in 10 minutes what a professor in University had 6 months to demonstrate and still wasn't able to. Really interesting.
I'm a programmer but i don't consider myself one because there is just so much abstraction in high level and middle level languages...... Sometimes I feel like normal who can use PowerPoint . According to me, there is only one or two more level of abstraction in PowerPoint
Earliest versions of Pokemon series games were completely programmed in assembly language. Just think for a moment how much time and focus it would have taken for those programmers.😉😉
8:30 after googling around for a bit im about 60% sure its just an assembly representation of "while" because the %eax(Extended Accumulator" - used for arithmetic and logical operations, as well as for storing return values from functions.) is preceding the cmpl of x to hex of 255. I dont really have much experience in assembly but this is my best amateur educated guess but if theres any assembly experts pls explain what that line would mean
It's storing the return value of printf into the stack, this is because the code was compiled without optimizations so GCC included the superfluous store in the final code too
Just fantastic to see how efficient the code produced by the C compiler is. I spent years writing assembler as a kid and used to have competitions with other on how fast and small we could make our code..
Spilling every value (including even the unused printf return value) on the stack isn't exactly the most efficient thing to do-however, that's exactly the thing to expect when compiling with optimization disabled.
More than awesome video bro! :D ... and I have a guess for for movl %eax, -0x14(%rbp): CPU Register -------------------------------------- EAX = 4 bytes -------------------------------------- | AX = 2 bytes | AH | AL = 1 byte -------------------------------------- Since the printf block played around with al ... and we have stuff (x and y) on -0x8(%rbp) and on -0xc(%rbp), respectivelly ... it seems really suspicious that line playing with -0x14(%rbp), which has an offset 12 bytes away in memory from our -0x8(%rbp). If I remember correctly, the bus actually aligns the data before sending it to the CPU from memory to improve performance, and this means including some bytes that might be used soon like 0xc(%rbp) ( cache y :D ); for instance, or even send garbage bytes so we don't have to create circuitry to get the exact byte from memory. What this means is that even though our data to be printed is on 0x8(%rbp), it will be also sent to the CPU 0xc(%rbp), 0x10(%rbp) and -0x14(%rbp). Therefore, I am going to guess this is actually the flush of buffer call for printing... and this the exact time when the printf is actually displaying the values for x on the screen... I guess more information could be given if you compiled with -g -O0 ... however, this video is an awesome explanation. A+!
+Desnes Augusto Nunes do Rosário Right, it seems specific to the author's platform, I compiled the same program with Ubuntu 14.04 and don't see the same spurious instruction when using any of the -O options, but I do see changes in the assembler to optimize z = x + y, so yeah, a good debugger run would help interpret who's responsible for that out of place instruction.
its the compiler he is using actually and the version of the language and the system he is on ... the eax is his usable side of the c language stdio.h ... and it is used to allow formating ... as his printf statement wants to print a %D data bit then do a carriage return ... with the data pointed to by the value x .... . eax is a formating stack alu and program controller in itself ... because he sent a format command the language has to strip the format out of the print command ... and the data pointer and then load the data ... . printf ("%d%/n",x) .... prrintf is in stdio.h ... so the first thing is to push it onto a stack to pull the format info out .. then advance and find the data pointer ... then advance and place the data into the formatted array and advance ... then send it off to the default display device .... just like when you step from 0000 0001 and have to fetch the first code line and strip it apart then find what it means and do it ... youre doing the exact same thing here just with software
@@0623kaboom Dude, no. Stop. That line is a spill to cache the value of eax on the stack because it will be clobbered by the return value of the next printf call. The only purpose of eax within this stack frame is to hold the return value of printf. Literally nothing more. WIth even the smallest level of optimization turned on you see the line disappear as it isn't even remotely needed.
imagine randomly stepping on an anonymous programming language, then you try to do some classic "print (hello world)" but it actually printed it into a paper
A disassembly might be coherent enough to understand, but I wouldn't call it "machine language", since a human wouldn't write it like that, that is to say it isn't a "language".
I miss programming in assembly. The first code I ever wrote was 6502 Assembly on an Atari 600xl. I also programmed in the following assembly languages over the years: 8088, 80286, IBM 360, R10000 and MIPS. After 20+ other languages over the years, assembly is still the one I liked best. It just felt natural. When I first learned C and was using the Turbo C compiler, I often wrote the function headers and variable declarations in C, and just inlined the guts in assembly. Those were the days...
I don't. At all. I wrote Railsounds II in Assembly because the processor (Microchip 17C42) had 2k code space and 160 bytes of ram. It ran at 4MIPS and at the time (93) was the fastest micro on the market. I couldn't wait until I could rewrite in C. Which we did. The hardest part was convincing Neil Young, my client, that we needed to do that. The rest is history. Over a million units sold.
Agreed. Very creative, very obedient. CPU does exactly what you tell it; nothing more, nothing less. If errors exist nobody to blame but yourself; and maybe the standard libraries which for assembly are minimal and usually just the startup code. I also wrote assembly for Honeywell DPS 8 mainframe; now THAT was programming!
@@thomasmaughan4798 Not so much on the obedient part. I remember seeing in a presentation that intel's 486 was the last x86 processor to simply run the instructions, in their order. After that came the out-of-order execution optimisations. And things like processing both outcomes of a check in the time the required value is fetch from memory and then simply using the correct outcome. So, nowadays, you don't really know what and how are things actually executing inside of a processor. Sometimes a less optimized code can be better optimized by the CPU optimzer.
rax is a 64-bit register eax is a 32-bit register which refers to the lower 32-bits of rax ax is a 16-bit registers which refers to the lower 16-bits of eax ah is an 8-bit register which refers to the upper 8-bits of ax al is an 8-bit register which refers to the lower 8-bits of ax gcc -S -masm=intel program.c ATT syntax is ok, but I prefer Intel personally... you’re welcome and thanks for the good video!
@@wh7988 Pick a processor, read the documentation, the documentation will tell you what commands there are and what they do. You can look up youtube videos or books for the processor and how to program in assembly for the processor. The class I am taking right now has us using code-warrior (ide) for programming the HCS12 (mircro-controller). I am assuming going with an arms processor would be a better idea though, they are more popular.
School! A good (but expensive) Assembly book is "Assembly Language" by Kip Irvine. You can use Visual Studio, admittedly a "long" process to set up, to write, run, and debug MASM. Give it a go.
Almost a year late... On x86-based computers, eax is usually for return values. Don't forget that printf is not void, it returns a length. The compiler is a macro-assembler so it stores it on the stack anyway. What you can do is ignore the stack & use only the registers ebx, ecx & edx to store x, y & z, so in theory, it should execute faster. If i remember well, if you only want 8 bits, you can use even bx, cx & dx, or even b, c ,d
When doing: x = y, why isn't the assembly code just: movl -0xc(%rbp), -0x8(%rbp) ? Why do we need to do: movl -0xc(%rbp), $esi then movl %esi, -0x8(%rbp) ?
Because you typically have one register that is your memory address register. That is what tells the memory which data the program wants to look at. This register can only be set to one value at a time, so you first point it to where you want to read data from, put that data somewhere (like another register, say, %esi), then you point the memory to where you want to write to, then move the data in. Cheers
@@nakitumizajashi4047 I reckon there could have been a set of microcode instructions that would make up a memory to memory move command, but IDK if any architectures implement that at all. It would use more steps than your average instruction so might as well leave it to the program.
My 14 year old self back in 2003 would be extremely excited and thankful if someone would explain machine naguage in such a clear way. Thank you and well done!
Simple and interesting explanation, I have experience with assembler, and C ++ is my main language, but I tried to watch this like i'm a beginner. And in my opinion, that was very easy to understanding. Big respect!) Sry for my bad eng)))0
I've always regarded C as a sorts of macro generator. You can almost see the result in asm when you write C. Although with any level above O1, things get totally too much for a human to read, unless you wrote the compiler.
If you have access to the original source code you can use: clang -S -masm=intel prog_name.c which will generate prog_name.s with Intel assembly syntax.
So at college I have learnt C and Assembly Language for 8085 mp. So I can say that we can program 8085 using assembly language to do specific tasks. With C we can actually write any code and hence it is being converted to Machine Language by the compiler for the mp. So can we say that C language provides us a greater flexibility of "programming the mp with ease as we are writing code in HLL and that being converted to Machine code"?
8:15 I believe that line puts tge x value to the aex, where it can set a flag. The next line sets the flag, and the next line uses it to determine wether to jump or not.
I see your reference there. But I got to say, most professional coders don't do stuff this hard for work. Not that I think journalists could learn low level or high level languages to proficiency.
Thanks for the video! Glad to find others who think this is super cool. I just finished my assembly course and I'm sad its over. I'm pretty sure I'm the only student who actually did my assignments and didn't just find code to poach on stack exchange. I'm even more sure I was the only one who really enjoyed the class and preffered it over C++ and way more than Visual Basic. My C++ teacher has been giving me a hard time. Assembly is "neat" he says, but VB can make "real world programs" Humph. I figure if I love something that most people dislike, even if I don't do it directly, there's a market for doing that kind of thinking....???????
Tell your C++ teacher he is an idiot (you can quote me). VB is the worst for making real world programs. Create a Hello WOrld program in VB and compile it. You get a program that is >10K. Do it in assembly and it is 128 bytes..... He must have stock in storage manufacturers.... I'm CIO that used to teach machine code/Assembly when the first PC's came out. Wrote games on C64's until the C compiler couldn't comile them anymore and switched to (macro) assembler. You don't know programming until you have done that at least once for a larger project.
I have just started to learn assembly in school with an msp430 processor, won't the compiler optimize the code so it uses registers instead of ram? Isn't it a lot faster?
Yes, the compiler will optimize the code to use registers instead of ram. You just have to turn optimization on. As you can see here: godbolt.org/g/UndH1q The first output is with no optimization flags, and the second is with -O2. There are no risks to doing this. Shodan doesn't know what he's talking about.
My father, who has been working on microcomputer programming for almost three decades, converted his programming language from the machine language to the C language at his forties, as"Machine language is too hard to comprehend and debug."
The C code is based on a “higher-order” modeling of the DEC PDP -11 assembly language. Many of the C-code has a direct relationship to the assembly code. I’m old enough to have talked with the Bell Lab guys!
I remember spending hours upon hours typing almost endless lines of hexadecimal code into the computer's RAM and then compiling it overnight and recording it onto DAT cassettes so I could play computer games. Intel 4004 processor, 4k of RAM, with a 12" amber CRT... Good times... Good times...
I made a hello world program in C then edited the output in the binary using VSCode Hex Editor on (line?) 00002000. I compiled the program on Linux x86_64 with gcc 12.2.0. edit: edited some empty lines and nothing changed, does this mean I can encode stuff in executables lol
If you fully know the file structure, address values, and you can change them if your dimension increases, then yes. But not with all the bytes in the structure, this will work, but with many it will work.
cant believe if you change the executable it will change what it does, that's so unexpected! there's a joke: "for someone who knows assembly very well, every program is open source"
I see people online saying "Recursion is easier to read, faster". Whilst the last one may be true, I don't know nearly enough lol, recursive functions have always been pretty much impossible for me to read.
@@psun256 recursion definitely shouldn't be faster, as a general rule all the repeated function calls that have to be allocated on the stack make the recursive version of a function either slower or at least more resource intensive, the only case i've ever seen recursive recommended for is when it makes code easier to read (and the only example of this i've personally experienced was with binary trees)
@@jake3736 Not necessarily. Some languages (Scala comes to my mind straight away) have tail recursion optimisation, so effectively the compiler is translating recursive code into iterative one. Of course the problem of stack allocation (and eventually stack overflow) is another reason to stay away from recursion if the trade off are not very well understood (and usually young university student don't understand those at all).
I was curious, so I dug up what movb $0x0, %al was doing. printf is a variadic function and when such functions are called, the AMD64 ABI requires that %al contain the number of floating point arguments being passed. In this case, there are none. Therefore, %al gets the value 0 prior to the call to printf.
That's not machine code it's assembly language. Machine code is the hexadecimal or octal output from compiled assembler or manually written.. Just saying
5:56 "Not sure what this other thing is. It writes 0 to the lower byte of the eax register (rax on 64bit but you seem to have a 32 bit machine). The other line is just setting the value of eax into the stack. Eax will hold the return of the last printf function.
"It writes 0 to the lower byte of the eax register " so what... you didn't push the envelope. It specifies "0 floating point arguments in registers passed in to variadic function".
*I am going to bed, but this looks like a nice video. Thanks! If made a video about writing a synthesizer on a discrete computer, reply with a link to that video. Thanks in advance!.*
I think you've made a mistake when you told about the stack frame. Actually it was already set up one line higher and "movl $0x0, -0x4($rbp)" just sets up one of your variables (=
I'm studying IT, and coursing a few subjects that include C, C++, Assembler and Pentium processors architecture. And this is one of the best, and more interesting video that I've seen. Great work!
Is there a reason why, when setting x to y or y to z, an addition register is used? Would it not be faster if 0x8 would be set to 0xc, or is this limited by the way the hardware is designed?
+Kwin van der Veen CPUs rely on registers to do any computations on data, think of it this way, if you wanted to copy a memory address to another, you would have to know the length and type of the values you are copying, it's simpler to work with 32 bits registers that are fixed size.
I think it might have to do with persistence- just because I want x to be y doesn't mean I want y to be 0. It might also have to do with the fact that = could, theoretically, fail. There might be an overflow or the command might not make sense at all. These are just guesses, though, and I, too, would like to hear an authoritative answer.
those arent registers they are memory locations ... the numbers 0x04 08 10 0c 14 ... are all locations that come AFTER the final jmp that starts the program over ... ... the 0x04 offset adds 4 to the last instruction (gets passed its area of use) and defines This is where i am going to put integer values .... the 0x08 is actually making a place for the X variable .. 10 is Y an 0c is Z .... if he had chars or boolean or string ... there would be another offset when he initializes it to an offset that falls in after the last amount of stuff from any previously set variables ... if he added a string variable .. say... its offset would fall at 0x18 for the defined space ... and then the first string would be either 1c or 21 for 4 bit and 8 bit lengths
K van der Veen 2 Years later; The answer to your question is the CPU has address and data lines to memory but memory does not have address and data lines to other memory. So to move any data the CPU must take the data into a register and then from the register to the desired other data location.
Learning little by little. This is a great explanation! Note: Everytime I see your name, I can't help myself but remember the song "Maneater" by Hall & Oates. It would be a perfect fit if you change the chorus to "He is Ben Eater" 😆
You have to explain it very well especially for beginner. Explain why u start at that of line 0x0 and -0x8(%rbp) "How do I know where I start to see the variable, what is this 0000f2f, $, %"? Explain cleared and basic , provide hexadecimal. Indian demo and rus was better than this. Just to give u and advise.
Probably start by reading about fibonacci series. You'll find interesting videos explaining how it appears in nature. Then read some basics of how C programming language can be used to perform certain operations like printing something on the standard output, like in this case we are printing the fibonacci series
As a web dev, watching this makes me feel like I just swallowed the red pill and saw the real world for the first time.
Yeah I know how this feeling too. It just kicks in like "Oh we evolved all the way to here, jeez"
As an electrical engineer this makes me say "here we go again".
@@hattrickster33 you could say that c is one of the closest to the metal in the high language class.
@@hattrickster33 well compared to other languages, C is probably the closest thing to machine code, but C itself is still a high level language
@@deschia_ We did testing on it back in college, comparing hand-coded assembly, C, Fortran 77, PL/1, and last (and least) Cobol. C and Fortran compilers did a reasonable job of producing something pretty close to what we did in assembly. PL/1 threw in some extra overhead which I think was related to memory management. And Cobol created a scary pile of machine code that we decided not to look into too deeply. I think it was summoning something from Cthulhu.
I don't care what other viewers say. Keep using paper! Sometimes you have to go the extra mile to make a point. I like your teaching style. Thanks for the videos. Very good info here!
+JP Aldama I agree. There is something I love in making notes on printed text. Plus, you can explain something so much quicker on paper because drawing and organising information is quick and intuitive, whereas doing such on a computer takes time to plan out.
They don't teach us that because the last forty years of computing history has been all about NOT reinventing the wheel. People got tired of having to start over every time a new computer came around, so we standardized our hardware, and operating systems (most notably, Unix) became portable between CPU architectures. Developers (the vast majority of them, at least) stopped caring about the low level stuff because they didn't need to anymore, and the computer science world progressed towards higher level things.
They don't teach us how to actually do it because to go from nothing to even just a bare bones, functional shell environment by yourself would take years and years of development. So they just teach us the theory behind how it works and leave it up to you to do that stuff, if you want to.
I feel where you're coming from, though. I used to feel the same way and I tried to learn things from the bottom up, but trust me; you'll be a lot better off if you start with the higher level systems and work your way down. It gives you kind of a bigger picture to see where the little things fit into.
It's not a problem, though. Nobody teaches 8-bit assembly because nobody uses 8-bit assembly anymore except hobbyists, and hobbyists already have many resources available them to learn from. Not to mention that most people interested in 8-bit assembly grew up with computers that ran it, and thus already know it! In fact, we have access to all the resources they did and more with the help of the internet.
We can't expect the world to cater to our extremely niche interests. That's why we're all so grateful to Ben for sharing his knowledge and guiding us through the process
In the UK assembler is taught as part of A-level electronics. The kids love it
Not just hobbyists. Assembly language can also be useful for hacking. I'd imagine it be really useful for reverse engineering, finding certain exploits, and malware development.
The instruction at 0x10000f63 is moving the result of the printf function (the number of characters written) to a location in memory (even though it isn't used)
Thank you! This comment should be pinned.
I never figured out what the printf() was supposed to be;
It is implemented in 16-bit code that has to keep two registers pointed to the same address; It runs much much slower than what makes sense to me; A data-block like 1024 or whatever shroud be alloc at init; Like above while ( int ) I found much established C/S to be Horror Code of the Damned written by relatives of the Munsters to prevent use of sanity checks like if do while which works much much better due to zero based indexing
@@opus_X And I get paid well for it 🤣
@@craig1231 how much time u spend in learning machine code, i want to learn too!! Its cool
So you're saying the code was suboptimal in execution time?
When I first started programming in C (mid 80s) I wanted to make sure the compiler was doing a good job and would always check the assembly for timing critical code. After doing this for a while I realized I could write the C code in such a way to influence the compiler to output very efficient assembly. Nowadays, the few times I do this, I'm amazed at how good modern compilers have gotten at optimizing for speed.
this guy is the real dela
I would like to learn to make the compiler more efficient. But I just started whit c and c++.
@@random-user-s theres not much you can do now in days lol. Also not really worth it imho because of how good compilers have gotten. But there are some reserved keywords in C and C++ that can tell the compiler certain things. All i really know about is marking functions inline can speed up the compilation process sometimes and can boost performance. Again, its not really worth doing that because the compiler should do all of that for you when necessary (if you mark the compiling command with -O3) its pretty easy to look up and youll eventually get the hang of it when you code more
@@random-user-s Why would you? What's wrong with your compiler that you want to upgrade it already as a beginner? Maybe you should just try another one? Msvc, mingw or clang.
@@squizex7463 Maybe just due to curiosity? Or because a man wants to understand things better or just keen to do hard things?
By your logic, one doesn't have to do anything because all the good things, by which you can do your software, are already written. So all you left to do is use them, which's boring af
Idk why but there's something so satisfying about seeing terminal output on paper. Especially C code and disassembled code. Mmmmmm.....
Too bad its at&t syntax though. Eww.
yea lol intel 4ever
random offspring Ikr
random offspring , you deserve a stack of tractor-feed paper with alternating green & white lines :)
IKR can't explain it either, but it just looks so satisfying and perfectly organized. something like asmr
0:20 how did you get that infinitely long paper?
it's a vector h ah ahaha
iam bad , iam going to commit a suicide,bye world, sorry people that were actually hurt by this joke
Coz while(1) is an infinite loop
I would rather call it "indefinitely long" :P
its still being printed out, he just cut out a part of it
I was wondering the same thing. Wizardry?
At Uni I made a Snake game in Assembly IA-32 for a course. Never again, thanks.
Github? :P
I wrote the a-star pathfinding algo in x86-64. Just for fun...
I feel terrible for you. I tried messing with assembly once but i couldnt get anything working
Play shenzen io
@@xaiano794 i dont need to buy shenzen i/o to experience the pain of assembly
The type of video that makes you ask "How did people come up with this?"
The type of video that makes you ask "about the type of people that came up with this?!"
@@hiotis75 Ελληνάρα
The crash course yt channel has a series on computer science. Clears a lot of things up.
Of course aliens taught these people lol
I guess it's tightly related to how memory and cpu work internally.
And it is very limiting due to the binary nature as well as a frequency ceiling of the transistors. Dead end if you will in my opinion. Invention of multiple cpu cores bought us some time I suppose but the future is somewhere else.
In only 10 minutes, you made me want to learn assembly language. Il looks so simple when it's explained so well. You did a great job, Ben Eater.
Hahaha......
Go for it. Sure a fun language, you start seeing everything the compiler or interpreter does in background for your happiness
I always thought assembly is useless and just a waste of time and money to take that class in uni but after I finished the class I realized how important it is, this might seem like an exaggeration but Assembly made me finally understand how Computers actually work and its diff one of the most important classes in CS .
also its really useful for reverse engineering a TA in my uni showed me how to crack a program just by understanding assembly
@Adam Richard lol so true, I tried making more elaborated programs and instantly gave up. The fact it might be very different for each processor one might have makes it very discouraging. Or just raging, don't even need the "disco"
The real question is which flavor? Arm? Intel? 68000? PIC?
7:05 you can actually notice how each variable takes 4 bytes of memory from the way they are located always 0x4 apart from each other
same thought!
Ah, I thought that the memory adresses were just chosen "ranndomly" by the compiler". But this makes me wonder though... how does the computer know how much space a variable takes up? Nothing in the machine code in the video shows that. What if the variable took up more than 4 bytes?
@@aurelia8028 in many languages you determine the datatype right?
in "int x = 2;" an "int" is for example always 4 bytes
and "double y = 5.4;" would make it 8 bytes
etc
*edit:* the size also depends on your platform... as mentioned by another commenter below, an int may be 2 bytes as well
@@SreenikethanI Sort of, it depends on hardware and/or compiler. `int` can be 2 bytes as well.
@@TeoTN oh right yeah
Compilers were invented in 1952. People in 1951:
pretty much, yeah
The eax register will contain the return value of the printf function. Evidently it is being stored on the stack in the expectation that it will be needed later. Presumably you had the optimiser turned off when you compiled it.
I'm genuinely surprised C makes so much use of the hardware stack, since if you looked at the C2 compiler in Java for example it absolutely hates using stacks and almost always does everything in registers unless it has no other choice
@@theshermantanker7043 If you compile on any level of optimization, it usually doesn't make as much use of the stack. By default, GCC compiles with absolutely no optimizations on, though. I find it's easier to make sense of the compiler's assembly on -O1 (the lowest level for GCC), because it puts things in registers a lot more, like a human would.
@@theshermantanker7043Originally that is what the register keyword was for. It told the compiler you wanted it to store variables in registers if possible, but it was just a request and not a given.
@@theshermantanker7043 THIS.
THIS is a comment I like.
I wish I had a save button like Reddit here...
I'm replying instead. Thanks!
I see, thanks for pointing that out, its interesting that the compiler still consider that [printf] would need to going back to where it come from even when it see that the loop is infinite
One of my college profs was in the Navy and needed to write assembly for the Navy to optimize COBOL code. He wrote it in FORTRAN and turned in the assembly. They had strict goals on lines of assembly to be written and debugged per day. He always met his goals. His reasoning was that FORTRAN was a pretty efficient language, and so he probably couldn't do much better. The Navy never knew they were converting their COBOL to FORTRAN.
You’ve reminded me of a talk I gave this year showing how some fortran code appeared in assembly. Fortran is still widely used in my field (supercomputing) and understanding the impact of such things like compiler optimisation is very helpful.
I think we had the same college professor
@@mohamedrh4093 of what college?
@@18890426 aui ?
@@18890426 Al Akhawayn
wtf am I doing here, I can't even code
I don't know why, but this video is very satsfying to watch as a programmer. It's very logical and makes sense. Like if you'd suddenly have a partial look into a womans brain and actually start understanding something.
Why, I think everyone learns backwards, If they would start at low level which is cold hard logic memory movement and work up the chain I believe they would learn how to program much faster. Lang like basic trigger bad habits that become hard to break such as never clearing your memory or initializing variables and things like C++ have turned into a cluster fuck due to the Total Over use of OPP everyone seems hell bent on these days. I would suggest if someone wants to learn to code go back to DOS, Get Turbo C and use that, It was a great lang with great documentation to help you telling what every single command did ect.
If he tried to learn Java before ASM hes going to be crying like everyone else on this video is about how hard ASM is to understand when its WAYYYYYYY easier to understand then any lang I have ever used including Basic. I think the Fail comes with most people because they don't comment their code and lose track of whats what but its simple top down programming that can be traced with ease.
I know I will catch a mess load of flak for saying it because I still get a lot of flak for using it from time to time but I honestly believe DarkBasic is one of the better things for a programmer to start in.... Hear me out before yall hate on me. Starting off a programmer wants results, ASAP. With darkbasic its as simple as
Sync On
Make Object Cube(1,10)
Position object (1,0,0,0)
Position Camera (0,100,0)
Point Camera (0,0,0)
do
control camera using arrow keys 0,1,1
loop
wait key
That code above will draw a cube on the screen and point the camera at as well as allow you to look around with the arrow keys, it which is a great starting point for most hobby programmer since the will feel the excitement going right away with a 3d object they can manipulate. This same code in say C++ for instance would literally take hundreds or thousands of boiler plate code just to setup the engine to draw the cube and accept the input. Look into darkbasic. Its old but its effective and its fun as all hell to toy with.
I started on a TRS 80 Model 1 with 2k ram and a cassette tape player. Basic. Then a Commodore 64. Commodore Basic. Then C on my BSD systems at home, took online local community college courses for Visual basic .net and C - grew tiresome. Right about then it became evident that code monkeys had to compete with $3/hr dev teams in India. Writing on the wall was that the money would be in Java. I stuck with sys admin needs; Perl and C.
FEAR of Java, FEAR of having to think about this stuff, FEAR of actually applying what I've learned in school...
NEVER learned these basics. (been TAUGHT it many times!) Never formed this solid foundation. In other words; I can't code to save my life...but I have worked for years making money doing it. Flying by the seat of your pants every day...making it work, doing the seemingly impossible. There is reward in that, at least. It feels good to actually DO this stuff in the real world for real world paying client needs. I can't even last in a programming conversation for two minutes. My point? - Just *do* *it*.
Just a little remark for people wondering why the code generated by the compiler contains strange and unuseful constructs. It is simply because the code was generated with the -O0 parameter which means, no optimization whatsoever. This means that the compiler basically does a nearly 1 to 1 translation of the C code to the assembly, without considering if the operation are redundant, unused or stupid.
It is only when optimization is enabled that the compiler will generate better code.
In this example, for example, it is stupid to read & write x, y, z continuously from memory. An optimizing compiler will assign register in the inner loop and will never write their values to memory. The spilling of the printf return value 'movq eax, 0x14'bp)' will of course not be emitted/
Interesting that clang -O2 results in the output values (1, 1, 2, ... 144, 233) being hardcoded into the binary. The clang compiler is evaluating the result of the loop at compile time.
@@zoomosis Hahaha, thats very interesting.
I've always thought compiler does so complicated stuff that Im not gonna even try to understand it. So I always assume that they can do pretty much anything. I wish to write my own compiler one day, very simple though.
Can you define what you mean by 'spilling'? I mean, yeah, the return value of printf is loaded into this memory location, but it is never checked for success anyways, so why isn't it redundant?
@@lukasseifriedsberger3208 That 'redundant' store of the printf() result _IS_ the 'spilling'.
The 'prototype' of the printf() function shows that it returns an int so, by default, the compiler will SAVE that value somewhere (even though the value is never used!)
If the source code is compiled with some degree of optimisation (eg: -O1, -O2 etc), then it will remove this redundant store of the printf() result since it's never USED!
For further reading, what does the returned value of the printf() function actually mean!!! (Not many people have ever USED this printf() return value, so they don't know what it actually signifies - It's probably more relevant for sprintf() or fprintf())
Thanks for that info... This excellent video's inspired lots of useful comments!
I like how the compiler optimised the while(1) into an unconditional jump instead of actually evaluating the expression "1".
I know compilers have been doing that for decades, also it's a very basic optimisation, but I enjoyed seeing it on paper :D
Except it didn't optimise anything. This was without any optimisations
@@_yakumo420 I think it is pretty interesting that even without any optimization, it became an unconditional jump, rather than test whether the int 1 evaluated to 1 (I'm pretty sure that's how while(..) works in C). I guess it's common enough that the GCC developers just hard coded that optimization construct into the compiler?
C doesn’t have booleans… so 1 == True
@@splashhhhhhhhhh Yes indeed but that wasn’t the point. The compiler detected that it’s a tautology and optimised it even without the optimisation flags set.
Even without optimizations on there are some optimizations that will always take place, such as not using hardware multiply/divide/modulus on powers of 2 etc
Regarding, moving eax onto the stack. eax contains the return value of the printf call. It's not actually needed by this example. It's probably saved to help a C debugger display what was returned and is likely a nuance of the compiler.
So basically, it's almost like the compiler turned "printf ("%d
", x);" into "int oX14 /* I chose the name as a mock of the memory location shown in the above assembly */ = printf ("%d
", x);"?
Dameon Smith this was going to be my guess
This makes sense, but I was wondering why this instruction only occurs after the prior 7 lines instead of right after the call instruction? I'm guessing this might be because the cmpl instruction will actually overwrite the value of eax to store the comparison result. Does this have to do with the compiler not being able to look ahead to see if the value will be referenced and just postponing storing the value for future reference until it absolutely has to? Also, this would mean the instruction wouldn't be there if the routine wouldn't reuse eax and just returned instead, correct?
What code could have followed and still use this value at this point, without explicitly assigning it to a variable right away? Can you give an example?
Thanks for the explanation, but I'm still unclear on part of it. I understand that eax/rax contains the return value of the printf function and by the time "movl %eax, -0x14(%rbp)" gets executed, that's still the value of eax. From what you're saying, I get that trying to access -0x14 from assembler code would be a mistake, and I get that, but I don't see why the value needs to be kept around at all - it's clearly not referenced anywhere in the source code? What use is the return value of the printf function at that point? And why does it only get moved to that address at that point in time, instead of sooner?
Yes, I suppose so, in that I agree with you: it's really a question about the compiler and not so much about the program either in C or assembler. I'm a software engineer myself, and having written compilers, as well as tinkered with command interpreters in the age of DOS on an 8086, I can strongly relate to what you're saying.
My curiosity was raised by the question raised in the video, about the meaning of that particular instruction - which was answered above by +Dameon Smith: it's the return value of the printf function that's being saved for whatever reason, independently of the program under consideration. I suppose I could look into the inner workings of the GCC compiler to find out, I more or less hoped someone might have an intuitive (and therefore short) reason off the top of their heads. But I agree with you, that's likely not the case - and certainly not the topic of the video, as the author rightfully stepped over the problem and seems to have taken some care to write their C code in such a way that the assembler would be as clean as possible for demonstration purposes.
you didnt mention why y is allocated in 0xC. That is because integers have a sizeof 4 bytes so 0x8 + 4 = 0xC
Technically a long. :P
that's only in C definitions
Same for z: 0x0C + 4 bytes => 0x10
Zupprezed And why does it starts at 8 instead of 0 ?
Bvic3 Notice that something is being already saved to the position 0x04 at the top. And the number is basically an offset to the base pointer (%rbp) so 0x00 would be the base(?) of the stack frame. I don't know, maybe something is stored there
The compiler emits movb $0,%al because printf() takes a variable number of arguments. The ABI specifies that when calling such functions, %al must contain the number of floating point arguments. There are no floating point arguments passed to printf() in your example, so %al is set to zero.
Which ABI are you referencing? I tried to look for an appropriate OS X ABI that would cover the cdecl calling convention, but nothing I found mentioned this approach to counting floating point arguments.
@@Hamled Personally, I just assumed that the string has to be null terminated. But I have no idea what that %al stands for.
@@Hamled The System V ABI, I'm pretty sure
Too late in this discussion, but the zero inside "movb $0,%al" is just an information, that the printed value should go into stdout stream (in normal circumstances it means that it will be printed on the screen).
Anyway, this video and discussion have returned back a lot of memories...
And last but not least, If anybody would like to, source codes for printf() are available, but be warned this function is really complicated one, because of a posibility to use variable list of of arguments with all kinds of types, formats and architectures.
I understood in theory how C went up to other languages. Now I understand how C goes down to bits. Awesome work.
Sorry for the noob question, but isn't this actually assembly? I thought machine language was basically just ones and zeros?
Yeah, you're correct; machine code is literally just binary. Otool seems to be a disassembler; it tries to format the machine code into something a little easier for a person to read
Trying to read an executable written for an operating system through a hex editor or something would leave all the header information and such in the output; making it a little more difficult to see what's going on
I could be wrong, but the actual machine code would be 1s and 0s of the low level language the CPU uses. The code shown in the video is that code translated into a kind of assembly.
Assembler code is human readable, the assembler program turns it into machine code.
Machine Code is binary ... The nemonic we use LDA ... etc is assmebly language and in hex because 255 ones and zeros take up a ton space on a line ... while ffff doesnt
converting from binary to assembly you run an ASSEMBLER and to convert a langauge like C++ you compile it into assembly language then assemble it in to machine code ... because sending ffff is easier to handle than 255 ones and zeros in a line
It is x86 (-64) assembly. Machine code is literally just bytes.
Old video, but I still want to remark that you can add the "-S" switch to make GCC output assembly directly into the output file.
Nice tip, thanks!
frozen_dude - Yeah, I was hopping to have "otool" installed, but I didn't. I looked around and found this: stackoverflow.com/questions/137038/how-do-you-get-assembler-output-from-c-c-source-in-gcc
There are lots and lots of ways to get gcc to output the intermittent stages of compilation. I love gcc! If people have never walked through the stages of compilation, I highly recommend doing it.
or
> otool -tv main > main.s
I thought I'd throw an example of the complete compilation stages out there... I guess because I find it interesting and informative.
So when you compile a C source file, the process goes through 4 stages: Preprocessing, Compiling, Assembling, and Linking.
1. Preprocessing: 'gcc -E example.c -o test.i' < The example.c file is preprocessed with the include files, and other directives, #ifdef, #include, and #define.
2. Compiling: 'gcc -S example.i -o example.s' < The source file is compiled into assembly.
3. Assembling: 'gcc -c example.s -o example.o' < The assembly file is converted into an object file, a machine code file.
4. Linking: 'gcc example.o -o example' < The machine code file is linked together with other machine code objects and/or object libraries into an executable binary file.
The *.i and *.s files can be examined in your favorite text editor. The *.o file and the final binary file are both binaries, so you'll need a hex editor to view their contents.
otool is an OS X program. On Linux, use objdump -d.
I feel calm when people use paper to explain :) very educational and relaxing
This is a good example on why learning coding without understanding how computer technologies layer on each other seems so daunting. Just learning a coding language is not really that difficult. But coding is complexity built on complexity, and each layer down it become exponentially more complex. From an outside perspective, like when when I first started learning code, it feels like you don't just need to know the top layer of knowledge, be it python or c++, but you need to understand what makes that work and how something else makes that work. At the end of the day Id have the impression I was going to have to learn how electricity works to understand the chipsets or ram to understand the next layer to understand the next layer all the way up to my code.
The great thing is that these languages were made so we don't have to do that. OOP and modern tech has almost made everything so independent and modular that you can learn the end result without knowing fuck all about how it works.
You don't even need to know to code to write games anymore.
if you want to know what hardware is doing, learn Computer architecture
Like the Techmen from Foundation. They knew how to work on nuclear power plants but had no idea how that shite worked
wrg, no such thin gas dauntingx
You are right but there is one thing.. I dont think learning OOP or coding language is easy .. They are also difficult because if you want to learn really well they steal a loot of time from you :(
I was thinking this exactly today! I was wondering how much do I need to know about this stuff and how may It help me. Although I know I don't need to know all of this stuff is so interesting to me and I think It can give me a better understanding of computer science as a whole, so I'm planning on at least do some research. It's only been 8 months since I started learning Web Development but I am fascinated with everything related to computer science.
to any one doing c++ exams on paper.. do a table with all variables and update their values like the code says.. this way you keep track of everything
AKA "dry running", in the days when computer time was horribly expensive. It's still the best way to understand what's going on in code, and uncovering places for code optimisation, if performance is a problem. Don't optimise code before you've considered the algorithm, though.
I had to do both courses of programming (Pascal and C) on paper, and there's no time to do that (if you want the highest grade)
@@dowrow6898 When writing code or answering what a block of code gives as an answer?
@@thewhitedragon4184 they give you a block of code with a lot of unusual stuff and you have to answer what it outputs, or what are some elements of an array or something similar
@@dowrow6898 I have the feeling we attended the same collage because it's the same garbage here 😂
Nice video bro!
Chris!
why your not verified
I like it too!!!!!!
I taught myself BASIC then Pascal then C++. Learning was actually fun with some of the books they had in the '80s. I got a C64 for my 8th birthday, and I got the C64 Programmer's Reference Guide. It's just amazing the things that were in that book. It went from teaching you BASIC to showing you the memory maps, the pinouts for all the chips, and how to do graphics and sound. But it also had a 100-page chapter teaching assembly! It confused me because it made cryptic references to an assembler called 64MON which I had no idea how to get, but that made it more intriguing. The assembler class I took in college was also one of the only interesting classes I ever took. But I'm pretty weird. I was such a nerdy kid that in middle school I wrote letters to Brian Fargo and John Carmack asking for career advice.
That is seriously awesome.
@@captaincaption Brian Fargo actually wrote me back! That would've been about 1990 or 1991. I don't know what happened to the letter. I really loved Bard's Tale III and Wasteland. And today, 21 years later, I'm doing 2nd round interviews for L5 (senior dev) at Google ... but I just wanted to see if they offered anything interesting.
@@RaquelFoster great stories! It sounds like you’re doing well in your career and interest. That’s always good to read!
Is it just me or you are feeling excited as well
when you see machine language?
I was learning python and working on stuff with for like everyday in 8 months.
I started learning C and now it just feels a lot of fun language to work with!
I even gave a break to python for the time being.
Watching assembly feels interesting as well.
Yeah same I'm having more fun learning assembly than the high level languages maybe that's because I'm a computers' nerd lol
@@unknownguywholovespizzaTo me it is eye-opening to see the true atoms of computation. It bridges the understanding of high-level programming and the understanding of how hardware fundamentally operates on the values stored in memory.
I am a beginning game developer. I have heard stories of how developers have written their games in C or even directly in assembly to maximize performance while keeping the size of the games very low. While most of my projects use existing engines and much higher-level languages for the ease they provide, I wish to pursue skill in C and assembly so that I may be able to write games that perform as well as humanly possible.
Using paper. I've gotta give you a thumbs up.
@LoveLiveKillBillLife Paper is a technology bruh
Your video is still helpful in 2020 and I'm sure other people would also understand concepts from it in coming years. Subscribed!
"Back in my day we had to compile code by hand"
This is so cool, and I think this would be a way more fun/efficient way to learn Assembly than what's taught in colleges. It's way easier to see where these commands come from and what they mean if they're being directly compared to an actual C program. Much harder if a bunch of Assembly terms you've never heard are tossed at you and all of a sudden you're expected to code a program like this.
Try coding in machine language, now that was a chore. Assembly is just a higher level language that is converted/compiled into machine code. I originally started out studying electronics, so we had a course in machine code and had to write a program using it.
@@johnshaw6702assembly is machine code put in a readable way for a human.
Fr 💀
@@johnshaw6702 That might be interesting to know how those 0s and 1s run your Processor right ?
@@johnshaw6702where to start
I appreciate your effort to make this teaching video to share what you know and honestly say don't know to things you don't know. Well done. I'm not sure either what's the point of moving the contents of eax register on to stack.
so it can be formatted loaded and printed ... it has to strip the format out of the print ... the the data pointer then the data then print it ... and a stack is the best place to do that from as you can shift left and grab the format ... and then shift left and set format up then load the next chunk and shift left ... read data pointer ...and shift left ... load data .. shift left and finally print ...
Remembering my first programming. You looked up the op codes and entered them on a keypad in Hexadecimal. This literally was writing the cpu instructions directly. I miss the 6502.
Way back when I was in school, we had a lab course working with the M6800 (6800, not 68000). I used to write my programs in C then hand-compile them into M6800 assembler. And of course, hand convert that into machine code, which then had to get toggled into the machine.
Hey me too man! Learned Basic on my Apple II and when I wanted to include some heavy-duty math subroutines, I'd POKE the hex code into a memory location then call it when needed. Even on that old 8-bit processor it ran blazingly fast!
The 6502 instruction set was very nice and clear, as was the Z80's to some extent. The intel instruction set was ugly in comparison.
ARM assembly language is even worse, it's not meant for humans. Every instruction can do something and can also do something completely different, depending on some weird prefixes. I hope no human being was ever forced to write ARM assembly code.
This actually makes sense. As mainly a C# dev, C isn't actually hard, first off. Pointers and such can get a bit complex, but they make sense. This code is certainly simple. The assembly makes sense too. It is beautiful how simple it is and how it uses such sinple functionality in order to create more complex end results. This helped my understanding of Assembly and it might be one of the things that help me finally make a PS2 game one day.
Not to be the party stopper but ps2 is a dead thing of the past
Sure it is all simple. But it takes a genius to appreciate the simplicity. Shamelessly paraphrased.
It seems simple, until you have to do implement data structures in C; then you find yourself crying for days on end, because you can't seem to resolve the clobbered memory errors that keep popping up on you!
@@IM-qy7mf structs are very trivial.......
if you have massive experience
@@IM-qy7mf AddressSanitizer makes this significantly easier to debug, though. It's like a plugin for compilers that instruments code using the compilers' own semantic information. You should also get in the habit of writing asserts for potentially incorrect or dangerous code.
Interesting how the "clever" compiler converts an infinite loop while(1) in absolute jump
it has a no optimization parameter in this specific case
computers are stupid. we just give them instructions,
Since it's always true, checking it is a waste of time. Even with optimizations "off" some optimizations are always done. Such as bit shifting instead of MUL/DIV by powers of 2.
0x0f's where given on that day
@strontiumXnitrate It was a joke referring to "0 fucks given"
@strontiumXnitrate ok booomer
@@افاداتواستفادات why you gotta do em like that
inxane有害な wooooshhhh
@@افاداتواستفادات
Where the hell did that come from?
this brings me back to my assembly class at university, in 2002. i liked that class a lot, but i've never used it again since i didn't go into a career in embedded
Can you tell me what does the 0000000100000f2e under _main: means
@@tamny9963 - So you think that because someone can't remember something from 20-years ago that they're automatically lying? Or, are you just looking for attention?
@@deepkarmakar5346 virtual address (image base + VA = full address) of the instruction ?
@@radon-sp thinku
@Jonathan Dahan you okay?
Really enjoy your videos, started my programming journey, if you will, about 5 years ago with the idea of wanting to make video games. i later found assembly programming and electronics engineering FAR more interesting than game design. I have been learning 8086 ASM on DosBox lately hoping i can get enough experience to understand how computers work entirely, i am currently in the process of learning how different IC's work on a breadboard and hope to build my own 8bit computer soon. Thanks for getting me started on such a fun hobby i hope to make my job someday, keep up the
excellent videos! Hope to see your channel continue to grow :)
Redxone Gaming How is your progress if you don't mind asking?
Yes I am interested to know too. I would like to build 8 bit computer too.
Please answer us bro!
Maybe he figured out that using machine language in software makes your product un-portable. There are many reasons *not* to write in assembler. And there are distinct instruction sets for different CPU architectures, so you can learn one ISA (inst set architecture) or you can learn all of them; compilers *do* have their advantages. All digital computers work the same way (registers, storage, interrupts, etc) but the devil's in the detail level you can't avoid in assembler. Everybody should *know* what compilers do and appreciate that today's compilers (I've been doing this for 40 years) are very, very good. You should also understand the overhead of interpreted languages like Java & Python (and the list goes on) before you make an implementation/design decision. Knowing the heart of how most of your customers' machines work (x86_64 for {lap,desk}tops, ARM ISAs for phones/tablets) is a valuable datum, should motivate us all to write code that's as efficient as possible. I still check my assembler output most of the time, but I'm about ready to retire ... probably an "old skool" type. But today's typical bloatware sucks. *Fight it.* Take pride in your work, know what you're delivering :-) _and good luck on your autodidactic journey!_
7:24 why doesnt the compiler just do:
movl 0xc, 0x8
Instead of
movl 0xc, esi
movl esi, 0x8
It need ti store value
Registers store the data
Wonder if he tried deleting that "eax" line or replacing it with a no-op or something to see if it mattered, or if it was eronious compiler overhead.
Minor correction, because I used to program in 8080 and Z-80 Assembly: Those instructions from the disassembly are more properly referred to as assembly code instructions. Machine code would be represented by nice hex numbers for the opcodes and operands.
Actually Z80 machine language is relatively easy to program by hand, for each opcode there are few bits of prefix and then register addressing etc. Then you convert all the bits in a hex number and done
Early textbooks used to make a distinction between assembly mnemonics and machine code. Looks like those days are long gone and the terms are used interchangeably.
Z80... My computer life started programming a TK-82C at 1982... Good times... 15 Minutes to load a 15 KB program from a cassette tape (after many attempts)...
Since we're being pedantic here about the difference between assembly code and machine code, it doesn't HAVE to use 'nice hex numbers'. Some CPU architectures were more suited to OCTAL representations, and technically, binary would be equally valid!
Footnote: Check out the MODR/M byte in x86 code and you'll see how well-suited it is to use octal in this specific case!
Having said that, I willingly admit that I'm predominantly a binary and hex man... LOL
The mnemonics directly represent those hex numbers. If he did print out the instructions in hex, you may as well then complain that it's not really machine code because it's not stored electrically in a computer, but printed with ink.
It doesn't matter how you represent something, it's the same thing.
Man, well done to you, you perfectly explained in 10 minutes what a professor in University had 6 months to demonstrate and still wasn't able to. Really interesting.
I'm a programmer but i don't consider myself one because there is just so much abstraction in high level and middle level languages...... Sometimes I feel like normal who can use PowerPoint . According to me, there is only one or two more level of abstraction in PowerPoint
Earliest versions of Pokemon series games were completely programmed in assembly language.
Just think for a moment how much time and focus it would have taken for those programmers.😉😉
Android pinball
@Cristi wow! I couldn't imagine how hard that must have been...
I don't know C nor assembly but I watched this from start to finish with my mouth hanging open. So interesting.
Scrolling through your videos i can see the depth of your knowledge , its brilliant and inspiring. I just subscribed.
I want to be knowledgeable like him about computers one day 😍
8:30 after googling around for a bit im about 60% sure its just an assembly representation of "while" because the %eax(Extended Accumulator" - used for arithmetic and logical operations, as well as for storing return values from functions.) is preceding the cmpl of x to hex of 255. I dont really have much experience in assembly but this is my best amateur educated guess but if theres any assembly experts pls explain what that line would mean
It's storing the return value of printf into the stack, this is because the code was compiled without optimizations so GCC included the superfluous store in the final code too
Just fantastic to see how efficient the code produced by the C compiler is. I spent years writing assembler as a kid and used to have competitions with other on how fast and small we could make our code..
Spilling every value (including even the unused printf return value) on the stack isn't exactly the most efficient thing to do-however, that's exactly the thing to expect when compiling with optimization disabled.
More than awesome video bro! :D ... and I have a guess for for movl %eax, -0x14(%rbp):
CPU Register
--------------------------------------
EAX = 4 bytes
--------------------------------------
| AX = 2 bytes
| AH | AL = 1 byte
--------------------------------------
Since the printf block played around with al ... and we have stuff (x and y) on -0x8(%rbp) and on -0xc(%rbp), respectivelly ... it seems really suspicious that line playing with -0x14(%rbp), which has an offset 12 bytes away in memory from our -0x8(%rbp).
If I remember correctly, the bus actually aligns the data before sending it to the CPU from memory to improve performance, and this means including some bytes that might be used soon like 0xc(%rbp) ( cache y :D ); for instance, or even send garbage bytes so we don't have to create circuitry to get the exact byte from memory. What this means is that even though our data to be printed is on 0x8(%rbp), it will be also sent to the CPU 0xc(%rbp), 0x10(%rbp) and -0x14(%rbp).
Therefore, I am going to guess this is actually the flush of buffer call for printing... and this the exact time when the printf is actually displaying the values for x on the screen...
I guess more information could be given if you compiled with -g -O0 ... however, this video is an awesome explanation. A+!
Yeah man I agree.
+Desnes Augusto Nunes do Rosário Right, it seems specific to the author's platform, I compiled the same program with Ubuntu 14.04 and don't see the same spurious instruction when using any of the -O options, but I do see changes in the assembler to optimize z = x + y, so yeah, a good debugger run would help interpret who's responsible for that out of place instruction.
its the compiler he is using actually and the version of the language and the system he is on ... the eax is his usable side of the c language stdio.h ... and it is used to allow formating ... as his printf statement wants to print a %D data bit then do a carriage return ... with the data pointed to by the value x ....
.
eax is a formating stack alu and program controller in itself ... because he sent a format command the language has to strip the format out of the print command ... and the data pointer and then load the data ...
.
printf ("%d%/n",x) ....
prrintf is in stdio.h ... so the first thing is to push it onto a stack to pull the format info out .. then advance and find the data pointer ... then advance and place the data into the formatted array and advance ... then send it off to the default display device .... just like when you step from 0000 0001 and have to fetch the first code line and strip it apart then find what it means and do it ... youre doing the exact same thing here just with software
@@0623kaboom Dude, no. Stop. That line is a spill to cache the value of eax on the stack because it will be clobbered by the return value of the next printf call. The only purpose of eax within this stack frame is to hold the return value of printf. Literally nothing more. WIth even the smallest level of optimization turned on you see the line disappear as it isn't even remotely needed.
The moment you realise that he actually printed it.
imagine randomly stepping on an anonymous programming language,
then you try to do some classic "print (hello world)" but it actually printed it into a paper
A disassembly might be coherent enough to understand, but I wouldn't call it "machine language", since a human wouldn't write it like that, that is to say it isn't a "language".
I miss programming in assembly. The first code I ever wrote was 6502 Assembly on an Atari 600xl. I also programmed in the following assembly languages over the years: 8088, 80286, IBM 360, R10000 and MIPS. After 20+ other languages over the years, assembly is still the one I liked best. It just felt natural. When I first learned C and was using the Turbo C compiler, I often wrote the function headers and variable declarations in C, and just inlined the guts in assembly. Those were the days...
I don't. At all. I wrote Railsounds II in Assembly because the processor (Microchip 17C42) had 2k code space and 160 bytes of ram. It ran at 4MIPS and at the time (93) was the fastest micro on the market. I couldn't wait until I could rewrite in C. Which we did. The hardest part was convincing Neil Young, my client, that we needed to do that. The rest is history. Over a million units sold.
Agreed. Very creative, very obedient. CPU does exactly what you tell it; nothing more, nothing less. If errors exist nobody to blame but yourself; and maybe the standard libraries which for assembly are minimal and usually just the startup code. I also wrote assembly for Honeywell DPS 8 mainframe; now THAT was programming!
@@thomasmaughan4798 Not so much on the obedient part. I remember seeing in a presentation that intel's 486 was the last x86 processor to simply run the instructions, in their order. After that came the out-of-order execution optimisations. And things like processing both outcomes of a check in the time the required value is fetch from memory and then simply using the correct outcome. So, nowadays, you don't really know what and how are things actually executing inside of a processor. Sometimes a less optimized code can be better optimized by the CPU optimzer.
rax is a 64-bit register
eax is a 32-bit register which refers to the lower 32-bits of rax
ax is a 16-bit registers which refers to the lower 16-bits of eax
ah is an 8-bit register which refers to the upper 8-bits of ax
al is an 8-bit register which refers to the lower 8-bits of ax
gcc -S -masm=intel program.c
ATT syntax is ok, but I prefer Intel personally... you’re welcome and thanks for the good video!
where do u learn all of this? any good books or websites as I want to understand how the machine runs c programs better
AH, ok
@@wh7988 Pick a processor, read the documentation, the documentation will tell you what commands there are and what they do. You can look up youtube videos or books for the processor and how to program in assembly for the processor. The class I am taking right now has us using code-warrior (ide) for programming the HCS12 (mircro-controller). I am assuming going with an arms processor would be a better idea though, they are more popular.
School!
A good (but expensive) Assembly book is "Assembly Language" by Kip Irvine.
You can use Visual Studio, admittedly a "long" process to set up, to write, run, and debug MASM. Give it a go.
T-rex is a dinousaur-bit
I like how he can explain this so well and is barely able to write :)
No need to write when you can type :)
nice... Can we pass value on the memory position(%rbp) direct to another whithout use the %esi?
for exemple: move1 -0xc(%rbp), -0x8(%rbp)
This the piece of the puzzle I was looking for years, thank you.
Almost a year late... On x86-based computers, eax is usually for return values. Don't forget that printf is not void, it returns a length. The compiler is a macro-assembler so it stores it on the stack anyway. What you can do is ignore the stack & use only the registers ebx, ecx & edx to store x, y & z, so in theory, it should execute faster. If i remember well, if you only want 8 bits, you can use even bx, cx & dx, or even b, c ,d
ax, bx, cx, dx are 16 bit, the lower and upper half registers al, ah, bl, bh, cl, ... are actually 8bit. Obscure knowlege FTW!
Came in the comments to find out what this line did. Thank you sir.
you can use the register keyword in C then it will compile like that
As far as I know, the "move" instructions are not "mov1" but "movl" - Move Long - where long means 4 bytes.
Ain't that exactly what he had on paper?!?
@@motsgar On paper "l" and "1" look very similar. The first time I heard the video I understood "move one".
@@fnunnari actually mee too but started to question that so concluded that it must be l
Why does the machine need to have a tmp registary, %esi in this case. Why can't it just move -0xc to -0x8? 7:50
Because the computer can't directly set values. Esi is a register, and the computer can directly set that
When doing: x = y, why isn't the assembly code just: movl -0xc(%rbp), -0x8(%rbp) ? Why do we need to do: movl -0xc(%rbp), $esi then movl %esi, -0x8(%rbp) ?
The reason is because there is no such assembler command, i.e. no direct memory to memory move.
In assembly, you a value from a block of memory is movable only to a register, not to another block of memory.
Because you typically have one register that is your memory address register. That is what tells the memory which data the program wants to look at. This register can only be set to one value at a time, so you first point it to where you want to read data from, put that data somewhere (like another register, say, %esi), then you point the memory to where you want to write to, then move the data in.
Cheers
@@nakitumizajashi4047 I reckon there could have been a set of microcode instructions that would make up a memory to memory move command, but IDK if any architectures implement that at all. It would use more steps than your average instruction so might as well leave it to the program.
because you can either read or write from the memory at the same time, can't do both simultaneously. So you need an intermediate register.
My 14 year old self back in 2003 would be extremely excited and thankful if someone would explain machine naguage in such a clear way. Thank you and well done!
Simple and interesting explanation, I have experience with assembler, and C ++ is my main language, but I tried to watch this like i'm a beginner.
And in my opinion, that was very easy to understanding. Big respect!)
Sry for my bad eng)))0
could you repeat that again plz?
I didn't know otool existed so I tabbed over to a shell on my Mac and typed 'man otool' ... this quickly prompted me to alias man to 'peter' 😏
And was your reply in fractured French?
I've always regarded C as a sorts of macro generator. You can almost see the result in asm when you write C. Although with any level above O1, things get totally too much for a human to read, unless you wrote the compiler.
Linux and Mac uses AT&T assembly which is so difficult to read.
I prefer intel notation.
If you have access to the original source code you can use:
clang -S -masm=intel prog_name.c
which will generate prog_name.s with Intel assembly syntax.
you don't think source should come before destination?
me, I think I'd say a = b to mean b gets into a, hence mov rax, rbx
We used ARM assembly in school and it was pretty much identical to this.
great video thanks to share
So at college I have learnt C and Assembly Language for 8085 mp. So I can say that we can program 8085 using assembly language to do specific tasks.
With C we can actually write any code and hence it is being converted to Machine Language by the compiler for the mp.
So can we say that C language provides us a greater flexibility of "programming the mp with ease as we are writing code in HLL and that being converted to Machine code"?
Also note that compilers are to be created for each architectures. GCC supports around 30 architectures
8:15 I believe that line puts tge x value to the aex, where it can set a flag. The next line sets the flag, and the next line uses it to determine wether to jump or not.
Ohhhhh this helped me for my malware and reverse engineering final. THANK YOU!
Some “journalist” need this video, for sure :)
I see your reference there. But I got to say, most professional coders don't do stuff this hard for work.
Not that I think journalists could learn low level or high level languages to proficiency.
^ It depends on whom you call "professional coders", buddy
@@chillappreciator885 professional coders= people who won't jump out of the window if their code doesn't work
The most inconsistent writting of '1' I've ever seen 😂
Thanks for the video! Glad to find others who think this is super cool. I just finished my assembly course and I'm sad its over. I'm pretty sure I'm the only student who actually did my assignments and didn't just find code to poach on stack exchange. I'm even more sure I was the only one who really enjoyed the class and preffered it over C++ and way more than Visual Basic. My C++ teacher has been giving me a hard time. Assembly is "neat" he says, but VB can make "real world programs" Humph. I figure if I love something that most people dislike, even if I don't do it directly, there's a market for doing that kind of thinking....???????
Visual Basic, ewww!! :)) Yes there is a big market for assembler and c programmers - think hardware controllers and other fancy things.
Tell your C++ teacher he is an idiot (you can quote me). VB is the worst for making real world programs. Create a Hello WOrld program in VB and compile it. You get a program that is >10K. Do it in assembly and it is 128 bytes..... He must have stock in storage manufacturers.... I'm CIO that used to teach machine code/Assembly when the first PC's came out. Wrote games on C64's until the C compiler couldn't comile them anymore and switched to (macro) assembler.
You don't know programming until you have done that at least once for a larger project.
visual basic is dead. it hasn't had a real application in literally decades
I have just started to learn assembly in school with an msp430 processor, won't the compiler optimize the code so it uses registers instead of ram? Isn't it a lot faster?
***** so the program doesn't have access to the registers?
Yes, the compiler will optimize the code to use registers instead of ram. You just have to turn optimization on. As you can see here: godbolt.org/g/UndH1q
The first output is with no optimization flags, and the second is with -O2. There are no risks to doing this. Shodan doesn't know what he's talking about.
My father, who has been working on microcomputer programming for almost three decades, converted his programming language from the machine language to the C language at his forties, as"Machine language is too hard to comprehend and debug."
What a Chad. I feel his pain. 😅
The C code is based on a “higher-order” modeling of the DEC PDP -11 assembly language. Many of the C-code has a direct relationship to the assembly code. I’m old enough to have talked with the Bell Lab guys!
I remember spending hours upon hours typing almost endless lines of hexadecimal code into the computer's RAM and then compiling it overnight and recording it onto DAT cassettes so I could play computer games. Intel 4004 processor, 4k of RAM, with a 12" amber CRT... Good times... Good times...
How old are you?
@@CamaradaArdi 150 y.o at least
@@MrKidori How old do you think digital computers are?
I made a hello world program in C then edited the output in the binary using VSCode Hex Editor on (line?) 00002000. I compiled the program on Linux x86_64 with gcc 12.2.0.
edit: edited some empty lines and nothing changed, does this mean I can encode stuff in executables lol
If you fully know the file structure, address values, and you can change them if your dimension increases, then yes. But not with all the bytes in the structure, this will work, but with many it will work.
cant believe if you change the executable it will change what it does, that's so unexpected!
there's a joke: "for someone who knows assembly very well, every program is open source"
Refreshing to see Fibonacci being implemented with a loop, instead of the usual (and very terrible) recursion solution.
I see people online saying "Recursion is easier to read, faster". Whilst the last one may be true, I don't know nearly enough lol, recursive functions have always been pretty much impossible for me to read.
@@psun256 Faster, I don't know. As fast, if written properly. I too find recursive functions hard to read.
@@psun256 recursion definitely shouldn't be faster, as a general rule all the repeated function calls that have to be allocated on the stack make the recursive version of a function either slower or at least more resource intensive, the only case i've ever seen recursive recommended for is when it makes code easier to read (and the only example of this i've personally experienced was with binary trees)
@@DavideAnastasia recursion if I'm not wrong takes up far more memory, so I don't see how it could be faster.
@@jake3736 Not necessarily. Some languages (Scala comes to my mind straight away) have tail recursion optimisation, so effectively the compiler is translating recursive code into iterative one. Of course the problem of stack allocation (and eventually stack overflow) is another reason to stay away from recursion if the trade off are not very well understood (and usually young university student don't understand those at all).
I was curious, so I dug up what movb $0x0, %al was doing. printf is a variadic function and when such functions are called, the AMD64 ABI requires that %al contain the number of floating point arguments being passed. In this case, there are none. Therefore, %al gets the value 0 prior to the call to printf.
That's not machine code it's assembly language. Machine code is the hexadecimal or octal output from compiled assembler or manually written.. Just saying
Although IIRC machine code is translatable 1-1 to assembly and vice versa, no compilation necessary.
5:56 "Not sure what this other thing is. It writes 0 to the lower byte of the eax register (rax on 64bit but you seem to have a 32 bit machine). The other line is just setting the value of eax into the stack. Eax will hold the return of the last printf function.
"It writes 0 to the lower byte of the eax register " so what... you didn't push the envelope. It specifies "0 floating point arguments in registers passed in to variadic function".
THATS SO COOL, i always programmed in C and was thinking about how it worked inside of the processor
*I am going to bed, but this looks like a nice video. Thanks! If made a video about writing a synthesizer on a discrete computer, reply with a link to that video. Thanks in advance!.*
"subq $0x20, %rsp" will reserve 32 bytes of space on stack for the function to use x,y and z.
and other variables used by the compiler, as well
I think you've made a mistake when you told about the stack frame. Actually it was already set up one line higher and "movl $0x0, -0x4($rbp)" just sets up one of your variables (=
I think it's a result of stack alignment to 16 bytes, and gcc is zero-initializing the unused data
I'm studying IT, and coursing a few subjects that include C, C++, Assembler and Pentium processors architecture. And this is one of the best, and more interesting video that I've seen. Great work!
This example stays mostly true with current compilers (only that GCC likes to compare x
Is there a reason why, when setting x to y or y to z, an addition register is used? Would it not be faster if 0x8 would be set to 0xc, or is this limited by the way the hardware is designed?
+Kwin van der Veen CPUs rely on registers to do any computations on data, think of it this way, if you wanted to copy a memory address to another, you would have to know the length and type of the values you are copying, it's simpler to work with 32 bits registers that are fixed size.
I think it might have to do with persistence- just because I want x to be y doesn't mean I want y to be 0. It might also have to do with the fact that = could, theoretically, fail. There might be an overflow or the command might not make sense at all.
These are just guesses, though, and I, too, would like to hear an authoritative answer.
OH! Another thought: ACC is much closer/ faster to get to during runtime. 0xc might be on very slow RAM or something.
those arent registers they are memory locations ... the numbers 0x04 08 10 0c 14 ... are all locations that come AFTER the final jmp that starts the program over ...
... the 0x04 offset adds 4 to the last instruction (gets passed its area of use) and defines This is where i am going to put integer values ....
the 0x08 is actually making a place for the X variable .. 10 is Y an 0c is Z .... if he had chars or boolean or string ... there would be another offset when he initializes it to an offset that falls in after the last amount of stuff from any previously set variables ...
if he added a string variable .. say... its offset would fall at 0x18 for the defined space ... and then the first string would be either 1c or 21 for 4 bit and 8 bit lengths
K van der Veen
2 Years later; The answer to your question is the CPU has address and data lines to memory but memory does not have address and data lines to other memory.
So to move any data the CPU must take the data into a register and then from the register to the desired other data location.
Your teaching is great, informative and esthetic. I loved watching it. Thanks!
holy cow he literally printed his print outputs 0:11
Bro it aint that hard. 0 1 1 2 3 5 8 13 21 34 55 89 144 so on so forth
i dont even remember commenting this, also i probably meant that it was pointless to print it but whatever
Learning little by little. This is a great explanation!
Note: Everytime I see your name, I can't help myself but remember the song "Maneater" by Hall & Oates. It would be a perfect fit if you change the chorus to "He is Ben Eater" 😆
I feel like I got little more smarter after watching this video.
Bruh this aint machine language.... this is assembly.. how can u not know that?
that at&t syntax tho, intel syntax ftw
You have to explain it very well especially for beginner. Explain why u start at that of line 0x0 and -0x8(%rbp) "How do I know where I start to see the variable, what is this 0000f2f, $, %"? Explain cleared and basic , provide hexadecimal. Indian demo and rus was better than this. Just to give u and advise.
I have no freaking idea what the hell did this men said but still satisfied
@blvckmetxl it was just an expression and I have my right to express.
And frankly I was expecting someone to explain this to me.
Probably start by reading about fibonacci series. You'll find interesting videos explaining how it appears in nature. Then read some basics of how C programming language can be used to perform certain operations like printing something on the standard output, like in this case we are printing the fibonacci series
Your channel is awesome.
Ta chaîne est géniale.
Merci
How can you tell if a software engineer is an extrovert?
When he talks he looks at YOUR shoes.
*HILARIOUS!!!*
*PS: By the way, told one of the original Apple garage employees about the channel.*