The Beauty of Assembly

Hussein Nasser

26K views

Published: Jan 25, 2025

Comments • 67

@hnasr 3 months ago ⁺⁴⁰
Let us keep peeling that onion...
@ravikumarmistry 3 months ago ⁺⁸
Until we reach the electrons and the channel turns into a physics channel
@lakhveerchahal 3 months ago ⁺¹
@ravikumarmistry the ultimate goal
@hanifa205 3 months ago ⁺¹
@ravikumarmistry 🤣🤣🤣 Indeed
@shivansh2301 3 months ago
merch!
@randyt700 3 months ago
Ahh yes, so we can cry/tear up even more. And I like to chop my onions instead, so some days I'll code a React component and then, an hour later, glue together some relay switches to make some gnarly XOR gates.
@jameshunt1822 3 months ago ⁺⁹⁵
Everyone who works in software should take an 8085 microprocessor course and physically code the programs in hex using the training kits. Trust me, it unlocks something in the brain.
@harrytsang1501 3 months ago ⁺⁸
I do DSP assembly, and nothing can replace writing assembly and seeing the signal being generated on an oscilloscope
@saiphaneeshk.h.5482 3 months ago
@harrytsang1501 I really envy you guys, compared to working on Flutter app development
@ivanjermakov 3 months ago ⁺³
I can also recommend completing the game SIC-1, where you write code using only one instruction (subleq). The second half of the challenges is mind-blowing because of self-modifying code.
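To give a feel for how a single instruction can be enough, here is a minimal sketch of a generic subleq ("subtract and branch if less than or equal to zero") interpreter. The memory layout, the halt-on-negative-address convention, and the function name `run_subleq` are assumptions for illustration; SIC-1 itself uses its own 8-bit conventions, which are not reproduced here.

```python
def run_subleq(mem, max_steps=10_000):
    """Run a subleq program in place.

    Each instruction is three cells (a, b, c):
        mem[b] -= mem[a]; if the result is <= 0, jump to c, else fall through.
    Assumed convention: jumping to a negative address halts the machine.
    """
    pc = 0
    steps = 0
    while 0 <= pc <= len(mem) - 3 and steps < max_steps:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]                      # the only operation there is
        pc = c if mem[b] <= 0 else pc + 3     # conditional branch
        steps += 1
    return mem


# Hypothetical program: Y = Y + X, built from three subleq instructions.
# Cells 0..8 hold code; X is at 9, Y at 10, scratch Z at 11.
program = [
    9, 11, 3,     # Z = Z - X   (Z becomes -X)
    11, 10, 6,    # Y = Y - Z   (Y becomes Y + X)
    11, 11, -1,   # Z = Z - Z   (Z becomes 0, result <= 0, so jump to -1: halt)
    7, 5, 0,      # data: X = 7, Y = 5, Z = 0
]
result = run_subleq(list(program))
print(result[10])  # value of Y after the run
```

Because code and data share one memory, an instruction can just as easily subtract into another instruction's operand cells, which is where the self-modifying tricks come from.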
@mitaskeledzija6269 3 months ago
How do I get this course?
@sh0k0nes 3 months ago
Any recommendations?
@AhmadAlMutawa_abunoor 2 months ago ⁺⁴
I love the Japanese accent at 06:21!! Very nice.
@mrrolandlawrence 3 months ago ⁺¹⁵
I used to code ARM assembler back in the 1990s. It was such a joy to code compared with x86. The conditional execution bits were just awesome: no need to jump over multiple bits of code.
@nonae9419 3 months ago ⁺¹⁴
"gcc is the same": no, by default gcc is a symbolic link to clang on macOS.
Can you use "-masm=intel" on x86_64 next time? AT&T syntax is quite ugly.
"-O3 can break things sometimes": no, it is not allowed to break things. If "-O3" breaks things, then it is either a compiler bug (which is unlikely) or you simply did something that is not allowed by ISO/IEC 14882.
@atiedebee1020 3 months ago ⁺¹
I think he confused O3 with Ofast
@nmprecious 3 months ago ⁺¹
I'll be happy to see you talk about the instruction set architecture (ISA), then different microarchitectures, then the logical building blocks, MOSFETs, then boom, silicon wafers, etc. as you dive deeper.
@rijojohn85 3 months ago ⁺³
gcc on Mac is an alias for clang if you have the Xcode utilities installed. You need to install gcc separately with Homebrew and run it under its versioned name (e.g. gcc-14).
@Quacky_Batak 3 months ago ⁺¹²
Building a NES emulator and debugging games made me appreciate Assembly and 80s game dev so much more!!
@chrishabgood8900 3 months ago ⁺⁵
I did a bucketload of assembly in college 25 years ago, not sure about it being beautiful.
@sreedeepcv866 3 months ago ⁺²
A course on assembly would be a game changer 😊
@Minolrx 3 months ago
Man, you made me want to learn the language. I learned the basics while being entertained.
@marsovac 3 months ago ⁺⁹
x86 registers also inherit the previous, smaller registers: AH/AL (8-bit) -> AX (16-bit) -> EAX (32-bit) -> RAX (64-bit).
It's just that GCC chose to do it another way, so you did not get the same code.
But you can move AX if you want to move half the RAX register.
GCC probably did it differently because its way is faster.
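The nesting of these registers can be modelled with plain bit masks. The sketch below is only an illustration in Python, with hypothetical helper names (`sub_registers`, `write_ax`, `write_eax`); it encodes two documented x86-64 behaviours: EAX is the low 32 bits of RAX and AX the low 16, and a write to a 32-bit register zero-extends into the full 64-bit register, while 8-bit and 16-bit writes leave the upper bits untouched.

```python
MASK64 = (1 << 64) - 1


def sub_registers(rax):
    """Return the views that alias the same 64-bit register."""
    rax &= MASK64
    return {
        "RAX": rax,
        "EAX": rax & 0xFFFFFFFF,   # low 32 bits
        "AX": rax & 0xFFFF,        # low 16 bits
        "AH": (rax >> 8) & 0xFF,   # bits 8..15
        "AL": rax & 0xFF,          # bits 0..7
    }


def write_ax(rax, value):
    """A 16-bit write replaces only the low 16 bits of RAX."""
    return (rax & ~0xFFFF & MASK64) | (value & 0xFFFF)


def write_eax(rax, value):
    """A 32-bit write clears the upper 32 bits of RAX (zero extension)."""
    return value & 0xFFFFFFFF


rax = 0x1122334455667788
views = sub_registers(rax)
print({name: hex(v) for name, v in views.items()})
print(hex(write_ax(rax, 0xABCD)))   # upper 48 bits preserved
print(hex(write_eax(rax, 0xABCD)))  # upper 32 bits cleared
```

The preserved upper bits on 8-bit and 16-bit writes create a dependency on the old register value, which is one commonly cited reason compilers tend to prefer 32-bit and 64-bit moves.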
@MaulikParmar210 3 months ago
There's a difference in the number of cycles required to move data, and GCC or any other compiler will always choose the fastest path whenever possible. Microcode optimizations are so integrated into the toolchain that they are hardly discussed or taught unless you're doing this work yourself. The fact that very few people work at the metal level limits exposure to this knowledge, and it shows on the software side as well.
P.S. The mapping from instructions to opcodes is usually hidden in CISC machines; different architecture implementations will take different approaches to the same problem, varying in the number of clock ticks needed to perform the same operation.
@parvagrawal1043 3 months ago
Fun watch! Would love to see a video on compiler optimizations, they can be very spooky at times.
@roz1 3 months ago ⁺¹
Hi @HusseinNaser, I beg to disagree... In ARM there are also similar instructions like adds, addc, movs, and several others added to the main mov instruction. The difference between ARM and x86 is that ARM has no in-memory operations, unlike x86, so in ARM you have to bring the value from memory into a register using a load, operate on it, and then store it back. This helps keep the pipeline clear, since most of the time an instruction will not take more than 1 cycle; that's why ARM is so efficient. A 6-stage pipeline in ARM is comparable to a 20-stage pipeline in x86.
@oleksiizubko 4 months ago ⁺²
Hey Hussein :)
Thanks for your amazing videos and courses.
Off-topic question: why did you set it up so that your videos can't be played in the background? I would like to listen to your spoken videos while doing something else (cycling, for instance), without having to keep the app open on my phone all the time.
@cheebadigga4092 4 months ago ⁺²
Isn't that a YouTube problem? Do videos in general play in the background?
@oleksiizubko 4 months ago
@cheebadigga4092 yes, videos from other channels play in the background
@lakhveerchahal 3 months ago
He has a podcast channel (Backend Engineering Show) on Spotify as well, you can try that.
@lakhveerchahal 3 months ago
@cheebadigga4092 it requires YouTube Premium for background play.
@lakhveerchahal 3 months ago ⁺¹
and why is your comment shown as 3 wk ago if the video was published only 15 hr ago 😅
@Nathan00at78Uuiu 3 months ago
Subscribed. This was awesome. I will have to check out the OS course.
@Praveenstein 3 months ago ⁺¹
How about RISC-V processors?
@O...Maiden...O 3 months ago ⁺¹
Is this AT&T syntax? 🤔
@maddada 3 months ago ⁺²
Can you do a video explaining why ARM is better than x86 in terms of performance per watt, heat, etc.? (Explaining from the code or the in-depth hardware side.)
I'm very fascinated by this, but the videos I've seen all speak at a high level and don't show examples from the code or CPU behavior.
@erkintek 3 months ago ⁺¹
Every instruction corresponds to circuitry, and a lot of that circuitry sits waiting, e.g. 10 different movs on x86 versus 1-2 on ARM. Ten circuits burn a lot more power.
@theairaccumulator7144 3 months ago
@erkintek it's also because ARM has less complexity and can use smaller transistor nodes, which are inherently more energy efficient. x86 is always a couple of generations behind ARM in terms of fabrication process; Intel being stuck on 14nm for like 5 gens was memed to hell and back lol.
@energy-tunes 3 months ago ⁺⁴
logic gates coming up next
@parlor3115 3 months ago
That mov keyword, is it related to move semantics by any chance? Asking for a friend.
@vfjpl1 3 months ago ⁺²
Nope, just the same name. By the way, in assembly it actually copies the operand, but the name stuck.
@AK-vx4dy 3 months ago ⁺¹
Can someone elaborate on this strange alignment in the ARM version?
@brice.rhodes 3 months ago
The compiler was probably aligning the stack for cache purposes.
"The ARMv7-M architecture guarantees that stack pointer values are at least 4-byte aligned." That comes from the reference manual as well.
@pikachulovesketchup666 3 months ago
There's no strange alignment. The ARM64 compiler will try to allocate at least 64 bits per register (the stack is a virtual register for the compiler, and ARM64 uses 64-bit registers) and align the stack to 128 bits, as potential 128-bit loads and stores (ldp, stp) require at least 128-bit alignment. You can probably use "-fconserve-stack" on GCC; however, you should never expect that if you write "a = b + 1" the compiler will emit an addition instruction, as the optimizer will perform various tricks.
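The rounding that produces this kind of padding can be sketched in a few lines. This is an illustrative model only: the helper name `align_up` and the example sizes are assumptions, not values taken from the video. The one fixed point is the 16-byte (128-bit) stack alignment mentioned above.

```python
def align_up(size, alignment=16):
    """Round size up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (size + alignment - 1) & ~(alignment - 1)


# Hypothetical frame sizes in bytes, before and after rounding to 16.
for locals_size in (4, 8, 20, 32):
    frame = align_up(locals_size)
    print(locals_size, "->", frame, "padding:", frame - locals_size)
```

So a function that needs only 20 bytes of locals would still reserve 32 bytes under this rule, which is why the offsets in unoptimized output can look more spread out than the variables themselves require.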
@AK-vx4dy 3 months ago
@brice.rhodes Cache can explain some of it. I know that the alignment may be 4, but the first or last long (64-bit) variable was at 8 and the next variable was at 20, so 16 bytes for an 8-byte variable seems strange
@AK-vx4dy 3 months ago
@pikachulovesketchup666 That's what I'm asking about; potential 128-bit loads explain the alignment somewhat. Do those instructions load or store multiple registers at once? I k
@pikachulovesketchup666 3 months ago
@AK-vx4dy 32-bit ARM has stm, ldm, stmia, ldmia, etc. with multiple registers (this can be used to implement e.g. a memset operation). AArch64 has only ldr/str with a single register and ldp/stp with a register pair, an optional offset, and increment/decrement of the address. AFAIR 32-bit ARM still requires 64-bit alignment if you load and store multiple registers (I used it mostly in kernel-mode exception handler entry/exit code). The compiler will usually emit e.g. "stp x0, x1, [sp]" as a "push" of 2 registers; however, the stack must be aligned to 128 bits, otherwise an alignment exception may occur. The code in the video is unoptimized; otherwise the entire code would just be "mov x0, #something" and then "ret".
@andrewm4894 3 months ago
Some assembly and a smattering of philosophy :)
@blaisofotso3439 3 months ago ⁺¹
Where can one learn assembly?
@OCTAGRAM 3 months ago
Donald E. Knuth, The Art of Computer Programming.
The sample programs and exercises are in the fictional MIX and later MMIX assembly languages for ideological reasons, so this is a high-quality programming course with everything in assembly. Actually, a big part of Volume 1 is the math needed for performance estimation, e.g. generating functions. You can hardly find a better introduction to assembly programming.
MMIX is fictional, but nowadays that brings the bonus of simulators with performance measurements. Each operation has a declared cost, in oops and mems, and the simulator can measure that.
Notes on the MIX to MMIX migration: MIX is quite an old machine. Each byte is 6 bits, and each word contains one sign bit and 5 bytes, 31 bits in total. Memory is 4000 MIX words for both program and data, which is roughly 16 KB in normal bytes. MMIX is less esoteric: bytes are 8 bits, memory addressing is 64-bit, and registers are 64-bit. The migration from MIX to MMIX is happening slowly, and the most recent editions still have not incorporated all the changes. To work with MMIX, you need TAoCP itself, then Volume 1 Fascicle 1, MMIX, "A RISC computer for the new millennium", which is supposed to be incorporated into Volume 1 but has not been yet. Then you need Martin Ruckert's «The MMIX Supplement: Supplement to The Art of Computer Programming Volumes 1, 2, 3 by Donald E. Knuth». Fascicle 1 only contains the description of MMIX. The MMIX Supplement contains sample programs that are meant to replace the MIX sample programs in Volumes 1-3. Volume 4A and later ones use MMIX from the beginning.
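The memory-size estimate can be checked with a little arithmetic. The sketch below uses only the figures quoted in the comment (6-bit bytes, 5 bytes plus a sign bit per word, 4000 words); the function name `mix_memory_size` is a hypothetical helper for illustration.

```python
def mix_memory_size(words=4000, bytes_per_word=5, bits_per_byte=6, sign_bits=1):
    """Return (bits per word, total bits, total 8-bit bytes) for a MIX-like memory."""
    bits_per_word = bytes_per_word * bits_per_byte + sign_bits
    total_bits = words * bits_per_word
    total_octets = total_bits / 8
    return bits_per_word, total_bits, total_octets


bits_per_word, total_bits, total_octets = mix_memory_size()
print(bits_per_word)         # bits in one MIX word
print(total_bits)            # bits in the whole memory
print(total_octets / 1024)   # size in KiB of 8-bit bytes
```

With these inputs the total comes to 15,500 eight-bit bytes, a little over 15 KiB, which is consistent with the "roughly 16 KB" figure.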
@prenomnom5637 3 months ago ⁺¹
Assembly Language for x86 Processors by Kip Irvine
@zedzpan 3 months ago
Prison Break is unreal! Great video getting under the bonnet.
@azizul1975 3 months ago
I learned Motorola 68k processor assembly language...
@jackgame8841 3 months ago
6:26 I thought I had the ability to translate into Japanese
@esra_erimez 3 months ago ⁺¹
x86 is built on decades of cruft. ARM was designed by Sophie Wilson and Steve Furber, with a design philosophy based on simplicity and efficiency
@your_skyfall 3 months ago
Interesting
@omd_0 3 months ago
حيو
@seephor 3 months ago ⁺²
Real programmers translate asm into binary and read it that way.
@pprocacci 3 months ago ⁺⁴
ruclips.net/video/cOYK3nbpa2w/видео.htmlsi=r4vNzksLHyhNWbMX&t=619
You said you don't understand why the immediate is first moved to the register only to then be moved into the stack address.
x86-64 supports a 64-bit absolute addressing mode only for load/store operations. It's why loading the immediate into a register first is required.
@vfjpl1 3 months ago ⁺⁴
Omg, there are two different syntaxes for x86 assembly. If you choose Intel syntax, you get a single mov keyword. Please, please do more research before making a video. I would love to watch you, but this makes me angry :(
@vgololobov 3 months ago ⁺¹
1 - compiler explorer
2 - meaningless video, you should compile with -O2 optimisations
@Serjgap 3 months ago ⁺¹
a total waste of my time...
@chandantalreja08 3 months ago
Apart from the fake accent he tries to use, the knowledge you share is amazing. I kindly request that you stop using the fake accent and focus on better communication, as it's extremely annoying.
@nou4605 3 months ago ⁺⁵
What fake accent lmao. That's just how he talks. He has the typical accent of people of his ethnicity/nationality. Don't tell me you thought he's Indian.
