C and Assembly Language: How To!

  • Published: Apr 25, 2023
  • Dave shows you how to combine C and ASM in the same project and same binary, while keeping them separate and avoiding inline assembly. Featuring 6502 C and ASM on the Commodore/MOS KIM-1
  • Science

Comments • 205

  • @Clank-j6w
    @Clank-j6w 1 year ago +86

    "Hey it's Dave, operating system developer for the Kim-1"

  • @SquallSf
    @SquallSf 1 year ago +5

    ClearScreen is very unoptimized!
    1. You need to clear 320x200/8 = 8000 bytes, but your program clears 8192. That is 192 extra passes through the inner loop, which is a lot of wasted work on a 6502.
    2. Handling the outer loop is very slow! If you do the math (it is obvious in the code too), the outer loop runs 32 ($20) times. So you could optimize that with:
    LDX #32
    ...
    DEX
    BNE :-
    This way you halve the loop-control instructions (from 2 to 1), and the generated code for them will be just 1 byte (instead of 4 as it is now).
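
The arithmetic in this comment can be checked with a rough C model. This is only a sketch: it is not the code from the video, and the names `clear_overshoot` and `clear_pages` are made up for illustration. The outer loop counts down the way LDX #32 / DEX / BNE would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    SCREEN_W     = 320,
    SCREEN_H     = 200,
    BITMAP_BYTES = SCREEN_W * SCREEN_H / 8,  /* 8000 bytes at 1 bit per pixel */
    PAGE_BYTES   = 256,
    PAGES        = 32                        /* $20 pages, as in the comment */
};

/* Bytes written past the visible bitmap when clearing whole pages. */
static size_t clear_overshoot(void)
{
    return (size_t)PAGES * PAGE_BYTES - BITMAP_BYTES;
}

/* Clears PAGES pages with a counted-down outer loop (the DEX / BNE shape).
 * buf must hold at least PAGES * PAGE_BYTES bytes. Returns bytes written. */
static size_t clear_pages(uint8_t *buf)
{
    assert(buf != NULL);
    size_t written = 0;
    uint8_t x = PAGES;                       /* LDX #32 */
    do {
        uint8_t y = 0;
        do {
            buf[written++] = 0;              /* one store per byte */
        } while (++y != 0);                  /* Y wraps from 255 back to 0 */
    } while (--x != 0);                      /* DEX / BNE */
    return written;
}
```

The outer counter never has to be compared against a constant, because the decrement itself produces the zero test the branch needs.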

  • @spazda_mx5
    @spazda_mx5 1 year ago +38

    I understood about 5% of this, but still enjoyed it and look forward to more 😊

    • @milk-it
      @milk-it 1 year ago +3

      Keep watching and reading up on it. It's like learning a foreign language. Once you've seen, heard and practiced it enough, you'll be fluent in it!

    • @lucasgerosa4177
      @lucasgerosa4177 1 year ago +3

      Just watch it 19 more times then

    • @NoOne-ev3jn
      @NoOne-ev3jn 1 year ago

      I understood 0.5% of your 5% 😅

    • @kamehameha38
      @kamehameha38 1 year ago

      Same over here

    • @kcvinu
      @kcvinu 1 month ago

      I also had a keen interest in assembly language. Finally got my hands on Microsoft's MASM Assembler. Created a DLL and used it in Python. It had some nice functions to create a window. But it ran 4 times slower than DLLs I made in other languages like C. With that, I just stopped my MASM endeavor.

  • @topperdude2007
    @topperdude2007 1 year ago +13

    Fun project! Reminds me of when we did something similar back in the early 90's (our undergrad days) - buddy and I started by reverse engineering and writing an application that did the same thing as Norton Disk Doctor in C. This was in the pre-Windows days and on the old (state of the art back then) 80286 based computer with floppy disks since PCs were not as widely available in our country back then and we had to do all the development after school hours by reserving computers in our school's lab (unless the seniors wanted it for their projects).
    Anyways, once we re-wrote the entire thing in Assembly, boy was it fast! Much faster than Norton's version and about 2.x times faster than our own C based version. Love watching these videos - brings back some nostalgic memories. Thank you, Sir! 👍

    • @sjococo
      @sjococo 1 year ago

      Yes, good old times. A 60KB ramdisk in video card memory, to make use of otherwise unused video RAM while in text mode, to name just one remarkable program.

    • @milk-it
      @milk-it 1 year ago

      Awesome stuff, dude!

  • @d.jensen5153
    @d.jensen5153 1 year ago

    I liked this a lot! 6502 assembly on the Apple II is where my computer education began. Assembly and C are still what I enjoy the most. It's the proliferation of platforms that has kept me busy.

  • @milk-it
    @milk-it 1 year ago +4

    Love it. These small projects in C and Assembly on the older hardware along with the makefile overview are short and sharp enough to practice with, even if I have to convert it to 680xx Assembly and C on the Amiga. The structure and procedures are essentially the same. Thanks, Dave!

  • @rjy8960
    @rjy8960 1 year ago +1

    I had a Commodore 16 when I was a kid and spent a lot of time messing with the monitor and writing assembly with it and began playing with self modifying code. It was instrumental in me building a love for coding in assembly which I did professionally for quite a few years albeit with 4-bit and other 8-bit families primarily.
    I've spent time with higher level languages but never developed the same affection that I have for assembly.

  • @stepannovotny4291
    @stepannovotny4291 1 year ago

    Thank you! I have done this in a hacky way in SDCC +SDAS so I am delighted to see some additional perspectives on this sort of thing.

  • @naukowiec
    @naukowiec 1 year ago +1

    Brings back good memories of writing hardware-specific code, and lots of days debugging interpreted languages by looking at machine code.

  • @jacoblf
    @jacoblf 1 year ago

    This is great. Thanks. I love how you don't waste time getting into a project.

  • @mcmaddie
    @mcmaddie 1 year ago +1

    Haven't mixed C and asm since mid 90's. Brings back memories.

  • @RonZuckerman
    @RonZuckerman 1 year ago +2

    Good stuff, Dave! I never worked in 6502 assembly, but it looks close enough to old 8-bit micros that I programmed in the past that I was able to follow the code pretty well.

  • @morganskinner3863
    @morganskinner3863 1 year ago +50

    The Z80 has an instruction that makes it easy to set a load of bytes to the same value, so the CLS function in Z80 is effectively (once the registers have been set up) a single opcode. I used this to amaze my computing teacher in 1981. Happy days!

    • @JohnnieWalkerGreen
      @JohnnieWalkerGreen 1 year ago +4

      Forty years after using MPF-1, I remember that the "LD HL" opcode is "21", and "LD DE" is "31".

    • @SerBallister
      @SerBallister 1 year ago +2

      How do interrupts work with such a long instruction? Is the whole op-code un-interruptable ? Or does the Z80 have special handling to resume the opcode half way ?

    • @morganskinner3863
      @morganskinner3863 1 year ago +5

      To zero out 1K at 0x4000, you would do this…
      LD HL, 0x4000
      LD DE, 0x4001
      LD BC, 0x03FF
      LD (HL), 0
      LDIR
      I think LDIR is interruptible, but don't know for sure; it's 40-odd years since I used it in anger!
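
LDIR is a block copy: it copies the byte at HL to DE, increments both, decrements BC, and repeats until BC reaches zero. The usual way to fill memory with it is to seed one byte and then copy the region onto itself shifted by one. A minimal C model of that idea follows, assuming a flat 64 KiB array stands in for Z80 memory (`ldir` and `ldir_fill` are illustrative names, not part of any toolchain):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of Z80 LDIR over a flat 64 KiB address space:
 * (DE) <- (HL), HL++, DE++, BC--, repeated until BC is 0. */
static void ldir(uint8_t *mem, uint16_t hl, uint16_t de, uint16_t bc)
{
    assert(mem != NULL);
    do {
        mem[de++] = mem[hl++];
    } while (--bc != 0);
}

/* Fill len bytes (len >= 2) starting at start: seed the first byte, then
 * let the overlapping copy smear it forward one byte at a time. */
static void ldir_fill(uint8_t *mem, uint16_t start, uint16_t len, uint8_t value)
{
    assert(mem != NULL && len >= 2);
    mem[start] = value;                                  /* LD (HL), value */
    ldir(mem, start, (uint16_t)(start + 1), (uint16_t)(len - 1));
}
```

Calling `ldir_fill(mem, 0x4000, 0x0400, 0)` clears exactly 1 KiB, which is why BC is loaded with one less than the region size.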

    • @JohnnieWalkerGreen
      @JohnnieWalkerGreen 1 year ago +4

      @@SerBallister Z80 completes the current instruction before servicing the interrupt, even if it is interrupted in the middle of its execution.

    • @milk-it
      @milk-it 1 year ago +1

      Nice going.

  • @RonaldvanderPutten
    @RonaldvanderPutten 1 year ago +2

    I'm just smiling... old school stuff... back to the good ol' days!

  • @alphabasic1759
    @alphabasic1759 1 year ago +5

    Personal opinion… multi-language development (in particular C and assembly) is very powerful. I used this combo heavily back in the early '80s.

  • @stonedhackerman
    @stonedhackerman 1 year ago +2

    Awesome! I still think stories/explanations of Windows internals were the best and most interesting content on this channel, but this is awesome too.

  • @fbodirector7464
    @fbodirector7464 1 year ago +1

    This is exactly the type of content I subbed for.

  • @jemdeweare6432
    @jemdeweare6432 8 months ago

    Nice to see the combo, thank you Dave

  • @muddyexport5639
    @muddyexport5639 1 year ago +2

    Cool code and explanation. I like the conciseness. Efficient...

  • @toast_on_toast1270
    @toast_on_toast1270 1 year ago +1

    Very nice, only looked at assembly briefly in 1st year cs when I wasn't paying attention - now I understand the beauty of it

    • @toby9999
      @toby9999 1 year ago

      There is beauty to low level coding with asm and C that I miss in these days of bloatware.

  • @Davemte34108
    @Davemte34108 1 year ago

    Brings back memories of the coding I did during the 80's and 90's in a steel mill in Northwest Indiana.

  • @VoidloniXaarii
    @VoidloniXaarii 8 months ago

    Beautiful engineering! Thank you for sharing that! ❤

  • @chrisdixon5241
    @chrisdixon5241 1 year ago

    Great video Dave, I'm really enjoying this series.
    In the pursuit of performance, I'd probably have unrolled those loops from the start at the expense of a little more memory, but seeing it run it looked fast enough!
    Great work!

    • @DavesGarage
      @DavesGarage 1 year ago

      Thanks! The C version was much slower, so it was worth the work!

  • @carltone
    @carltone 1 year ago

    Thanks for creating this well-done video. It was a trip down memory lane for me. My programming career began in the late '70s and early '80s writing 8080, then 8085 assembler (better ICE) using an Intel MDS210. I was building industrial apps on single-board processors. I vaguely remember using the linker/locator to place code at specific addresses (read: EPROMs). I had one EPROM that I could interchange with variable data. The blurry good old days. 😊

    • @rwatson2609
      @rwatson2609 1 year ago +1

      Ha, I remember doing this as well back about then on the intel 8039 microcontroller series. Built my own eraser from a bug zapper and made a programmer from a ton of switches and discrete components, but just like you, they are pretty fuzzy memories. I had lots of time on my hands back then.

  • @Fbiman93
    @Fbiman93 1 year ago

    Love your shirt. I also love your explanation of C and assembly, two things I really want to learn.👍

  • @ronaldroe4548
    @ronaldroe4548 1 year ago +1

    Can't speak for anyone else, but I love these videos.

  • @kencreten7308
    @kencreten7308 1 year ago +2

    Lots of fun. Thanks, Sir.

  • @TheRojo387
    @TheRojo387 7 months ago

    Bing Chat taught me a handy trick for even faster compilation: inlining machine code directly instead. Hand-assembling code might seem tedious at first, but it's far more rewarding once you get the hang of how it's done, and the memory mapping of your hardware. The technique is no secret, and involves setting a function pointer to your hand-assembled snippet, causing it to be executed as machine code whenever that function is called.

  • @An.Individual
    @An.Individual 1 year ago

    In the meantime and in-between time, I'm waiting for your next video.😁

  • @treeturtle9378
    @treeturtle9378 1 year ago +2

    Love the shirt Dave 👍

  • @michaelguerrero7232
    @michaelguerrero7232 1 year ago

    Just awesome!!! Retro to the steel awesome!!! Thank you

  • @rickmellor
    @rickmellor 1 year ago

    My favorite channel. I click “like” before “play”. 😂

  • @deevs3973
    @deevs3973 1 year ago

    Love the video. Takes me back to DOS programming, especially with the make files. I did Clipper (Compiled DBase/Foxbase) database programming with C and ASM function libraries I coded to provide things missing in the language that I needed.
    BTW.. Be aware of replies from imposters posing as Dave. I've seen a few on here.

  • @gerakore8948
    @gerakore8948 1 year ago

    i remember using assembly to speed up a 3d engine i wrote in qbasic. worked really well actually. fun times. the assembly basically pushed memory straight to the video memory. this allowed me to stack different layers together beforehand so there was no need to clear the screen.

  • @BleuSquid
    @BleuSquid 1 year ago

    You take back what you said about Makefiles! I frikken love them.

    • @DavesGarage
      @DavesGarage 1 year ago +1

      They're like a redheaded stepchild. You love them, but...

  • @mhoover
    @mhoover 1 year ago

    Back in the 80s I did this a lot with IBM BASIC and 8088 assembly. Man, I'm old!

  • @SassyToll
    @SassyToll 1 year ago

    This is brilliant Dave Thank you

  • @thisisreallyme3130
    @thisisreallyme3130 1 year ago

    Yessss.. more CC65… TY!! 🎉

  • @dougpark1025
    @dougpark1025 1 year ago

    I have only very rarely written any assembly. In a case like this I would probably start by looking at the code the C compiler generated and then start counting clock cycles to see if there are any obvious optimizations that could be made.
    It would be useful to provide an example of how C passes arguments into a function in assembly as well.

  • @keptil
    @keptil 1 year ago

    It's been like 2 months since I watched one of your videos, only to come back and find that you've grown a Jedi Master beard.
    I'm okay with this, Obi Wan Dave..obi...
    I'll figure out a jedi name for you eventually.

  • @rbolo29
    @rbolo29 1 year ago

    I don't understand too much of this, but appreciate the effort.

  • @andrewdunbar828
    @andrewdunbar828 1 year ago

    Good stuff Dave. Might've been good to mention why name mangling is even needed/done for C at all.

  • @xredhead7135x
    @xredhead7135x 1 year ago

    awesome work

  • @wkjagt
    @wkjagt 9 months ago

    Really cool video! I've never tried mixing C with assembly, but now I want to :-). About local labels in ca65 (I think what you're using are actually called unnamed labels), you can also branch two (or more) colons ahead with `:++`, `:+++` etc. Same thing when branching back multiple labels, you add multiple minuses. Local labels are also a thing by the way. They're labels that start with a @ (at sign), and they're only visible between two non-local labels. This is handy when you want to reuse common label names like @loop, @done, etc.

  • @JohnnieWalkerGreen
    @JohnnieWalkerGreen 1 year ago

    My second C compiler (after Unix V6) was Mark Williams C for MS/PC-Dos.

  • @SEEMERIDECOM
    @SEEMERIDECOM 1 year ago +43

    Would love to see this ASM version running next to the C only version for comparison.
    When I rewrite something in ASM I always keep timings. Even when I'm optimizing I start with the slow code and as I make each improvement, I verify if it's actually an improvement. Sometimes, you make it slower.

    • @Ittiz
      @Ittiz 1 year ago +3

      seconded

    • @greatwolf.
      @greatwolf. 1 year ago +2

      third. Would also be interesting to compare the assembly generated by the compiler vs the hand-written version. Is it still substantially faster even with `-O3`? Where would the speedup come from?

    • @TAP7a
      @TAP7a 1 year ago +3

      Turns out modern compilers are wicked smaht

    • @kayakMike1000
      @kayakMike1000 1 year ago

      Well, try to do count down loops instead of count up.
      Depends on which compiler.

    • @kayakMike1000
      @kayakMike1000 1 year ago +2

      @@greatwolf. godbolt!

  • @JPEaglesandKatz
    @JPEaglesandKatz 1 year ago

    Think I only used the MAC65+ assembler on the 6502 (atari) back in the days but that sure looks familiar :) Nice to see a follow up video on this

  • @moshehalevihalemo1604
    @moshehalevihalemo1604 1 year ago

    I love your shirt with 1s and 0s all over it 🙂

  • @NoX-512
    @NoX-512 1 year ago +1

    Yes, it's a bug. If my memory serves, STA doesn't affect the status flags, so you could move DEY to the start of the loop. Otherwise, you can subtract 1 from your address calculation (SCREEN + $1E9F, SCREEN + $1DFF).

  • @patrickmcginnis7
    @patrickmcginnis7 1 year ago

    wow, ok. I never could run C on a 65 series cpu ... so props for that. We burned eproms and had to run our machine lang. from there. I still have my 6502E in a box, I'm not inspired to re-invent the wheel as you are. I'm happy that there's experts out there that can appreciate unbloated code, I find the compilers and tools today (and even the next couple generations of coders that have come after us) are lazy AF. We had to fit 300KB on a floppy. I still run smallish programs that are

  • @jacobpalm
    @jacobpalm 1 year ago +1

    Very interesting! I’ve always been fascinated by these kind of low-level optimizations, and how programmers can squeeze more performance out of the hardware.
    Do you have any timings or similar to show how much faster the ASM routines were compared to the C ones?

  • @DanielFSmith
    @DanielFSmith 1 year ago +2

    You can save 1 cycle/byte by using absolute addressing for STA instead of indirect. (Assuming you don't object to self-modifying code, and that your code's not going into ROM.)

    • @nkronert
      @nkronert 1 year ago +3

      As a kid doing assembly programming on the C64, I was somehow "scared" of these indirect addressing modes, so I would always do those self-modifying code loops. Never spent time to realize that in ROM this wouldn't work 😊

  • @RobertLBarnard
    @RobertLBarnard 1 year ago +3

    Thanks for going through this demo.
    Curious how you became familiar with MOS & the KIM kit?
    I worked on the Motorola 680x line for an industrial process control company (we also used DEC VAX & PDP on top of it all), fun times chasing bits. But I never played with the MOS stuff, though I remember it was well regarded.
    Oh, the funnest part of that job was bootstrapping up an HP 3060a automated test system (bed of nails, HP-IB, HP "calculator", the whole she-bang!). Nearly 40 years later... I "think" I miss it. No, probably not.

    • @DavesGarage
      @DavesGarage 1 year ago

      I never saw one in the day, but they're so similar to the PET and C64 in general architecture that a KIM is like a "mini PET" in a way!

  • @mirror1766
    @mirror1766 1 year ago

    Always enjoy the programming videos. Wondered if the 1 to 127 count of bits in the last video should have been 0 to 127 or if 0 was left out for some special reason. Would be interesting to know what was going on in the C copy vs your assembly copy to compare the steps vs performance. Ironic that this was updating the display faster than Microsoft would on my Tandy 1000 DOS prompt but figured I ran into a compatibility fallback of sorts having updated the OS.

  • @davidtrzil
    @davidtrzil 4 months ago

    so cool

  • @nathantron
    @nathantron 1 year ago +1

    Do you know if there's a way to use the compiler to merge a bunch of cpp files into one, letting it do the header and cpp merging for us so we don't have duplicate code?

  • @JeremyNasmith
    @JeremyNasmith 1 year ago +6

    I love the work you're up to for the KIM-1. Is your end goal to give it most or all of the functionality we expect from other Commodore machines' Kernals? That would be a great series indeed.

    • @DavesGarage
      @DavesGarage 1 year ago +6

      That's sort of what I've been tinkering with, but with no keyboard interface, not sure how far I'll take it!

    • @lwilton
      @lwilton 1 year ago +1

      @@DavesGarage I've got an old KIM-1 single board computer, stock except that I increased the memory by something like 8K, if I recall. I also have a wide-carriage IBM Executive typewriter with a pin-feed platen that GE hacked with solenoids on the type bars, and leaf switches under the keys. It was originally used on a computer monitoring the tin plating line at Kaiser Steel. I kludged up a board to plug onto the PIA outputs that would monitor the switches and drive the solenoids, and had myself a teleprinter. It was pretty much useless, but I could do keyboard/printer I/O.

    • @chinesepopsongs00
      @chinesepopsongs00 1 year ago +1

      @@DavesGarage I have a 6502-based board that was used in the late '80s and early '90s for home automation, before the term was invented. It was mainly used in hospitals and prisons to automate and secure things. The board has a ton of I/O, both serial and parallel. It does not have normal video out, but a header for an LCD display; I have a 2-line model attached to it. It has 32 KB RAM, 16 KB ROM and 16 KB reserved for I/O space. I started to write an operating system for it somewhere in the '90s; it was never completed as I lost interest. I feel like I should dig the thing out of the attic and finish it. The reason I have it is that it is the second revision of that board and I was the one who redesigned it. The first generation had 16 KB RAM and 8 KB ROM and some other minor differences.

    • @Davemte34108
      @Davemte34108 1 year ago

      @@lwilton Sounds about right, while working at LTV's Indiana Harbor Works (former Youngstown), GE was the main contractor for the rolling mills, similar things happened.

  • @PrimalNaCl
    @PrimalNaCl 1 year ago +1

    That is a sweet shirt!

  • @igot64problems42
    @igot64problems42 1 year ago

    7:37 Yes, that's a bug - it will not store 0 at location SCREEN+$1E00. It will store 0 at SCREEN+$1EA0 on the first iteration but it will also store 0 at SCREEN+$1F40 which is the first byte just off the screen. There's a couple of ways you could fix this. One way would be to start Y at 0 then do INY and CPY #$A0. Executing the compare on every iteration will slow things down slightly.
    Another way would be to use the Negative flag. To do this, you'll have to unwind the loop into 4 instead of 2 so that Y can start at a positive number; a value less than 128:
    ldy #$4F
    : sta SCREEN+$1EF0,y
    sta SCREEN+$1EA0,y
    sta SCREEN+$1E50,y
    sta SCREEN+$1E00,y
    dey
    bpl :-
    When Y rolls around from 0 to $FF, it's a negative number and it won't branch.
    A quick fix to the problem would be to just do STA SCREEN+$1E00 straight after the loop.
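
The off-by-one is easy to see in a small C model. The sketch below assumes the loop in the video has the shape LDY #$A0 / two STAs / DEY / BNE, as this comment describes; the function names are invented for illustration and offsets are relative to SCREEN. The buffer needs one byte beyond the 8000-byte bitmap because the suspected loop writes there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    LAST_ROW = 0x1E00,   /* offset of the final 320-byte row */
    HALF_ROW = 0xA0      /* 160 bytes */
};

/* Suspected loop: Y runs $A0 down to 1, so offset 0 is never written
 * and offset $A0 of the second half lands one byte past the bitmap. */
static void clear_last_row_buggy(uint8_t *screen)
{
    assert(screen != NULL);
    uint8_t y = HALF_ROW;                          /* LDY #$A0 */
    do {
        screen[LAST_ROW + y] = 0;                  /* STA SCREEN+$1E00,Y */
        screen[LAST_ROW + HALF_ROW + y] = 0;       /* STA SCREEN+$1EA0,Y */
    } while (--y != 0);                            /* DEY / BNE */
}

/* Four-way variant with BPL: Y runs $4F down to 0 and the loop ends
 * when Y wraps to $FF, which has bit 7 set ("negative"). */
static void clear_last_row_bpl(uint8_t *screen)
{
    assert(screen != NULL);
    uint8_t y = 0x4F;                              /* LDY #$4F */
    do {
        screen[LAST_ROW + 0xF0 + y] = 0;
        screen[LAST_ROW + 0xA0 + y] = 0;
        screen[LAST_ROW + 0x50 + y] = 0;
        screen[LAST_ROW + 0x00 + y] = 0;
        --y;                                       /* DEY */
    } while ((y & 0x80) == 0);                     /* BPL */
}
```

The BPL form only works because the start value is below 128; starting at $A0 would already look negative on the first pass.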

  • @greg4367
    @greg4367 1 year ago

    This brought back all the reasons I prayed in the z80/8080 world and HATED the 6502.

    • @toby9999
      @toby9999 1 year ago +2

      I loved the 6502. I wanted to learn Z80 but didn't have a Z80 system. My favourite was the 68000.

  • @pekahon
    @pekahon 1 year ago

    Z80 chips have two sets of instructions: the common, documented and tested ones, and an outer layer of instructions that may work only on some chips.

  • @OggVorbis69
    @OggVorbis69 1 year ago

    Ah, that brings me back to the time I was writing exactly that for the Apple ][. Thank you so much for the memories. For scrolling text, though, isn't it going to be faster to scroll up by one row of text instead of a single line?

  • @georgecop9538
    @georgecop9538 1 year ago

    This is similar to linking the bootloader written in asm and the kernel to create the binary. Great video anyways!

  • @sukivirus
    @sukivirus 1 year ago +2

    I see C programming, I just hit like :)

  • @nelsonamador5412
    @nelsonamador5412 1 year ago

    Hi Dave, I liked your video so much, thanks for sharing. Do you have a video on working in Real Mode and Protected Mode on a PC without an OS?

  • @kermitdafrog8
    @kermitdafrog8 1 year ago

    What optimization do you use during the build process?

  • @Peter-House-Jr
    @Peter-House-Jr 1 year ago

    I would really like to see how you put together your development environment for this project. Looks like you are using VSCode. What compiler toolchain are you using, and how are you transferring your final bits to the KIM-1?
    Miss the Gentle Giant - where has he gone?

  • @enablerrelbane
    @enablerrelbane 1 year ago

    Where did you get the shirt? Do you have a link to it?

  • @NullPointer
    @NullPointer 1 year ago

    When I wrote my tiny kernel in x86, to scroll the screen I just moved the entire block of memory up except for the first row, instead of doing it row by row, it has the disadvantage that you lose whatever was there, but it's blazing fast if you don't care about that

  • @H2Obsession
    @H2Obsession 1 year ago

    Great video. Regarding 07:30, the scroll screen will fail to clear $BE00 (= SCREEN+$1E00) like you suspected. A fixed, easy-to-understand version is:
    lda #0
    ldy #0
    : sta SCREEN+$1E00,y
    sta SCREEN+$1EA0,y
    iny
    cpy #$A0
    bne :-
    However, this adds an extra instruction (cpy #$A0) inside the loop, which is bad for speed. You wrote it in assembly for speed, right? A better but harder-to-understand version is:
    lda #0
    ldy #$A0
    : dey
    sta SCREEN+$1E00,y
    sta SCREEN+$1EA0,y
    bne :-
    This runs at the same speed as the original and doesn't miss clearing address $BE00. It looks weird because the DEY instruction appears 3 lines above the BNE instruction... but it works because the two STA instructions do not affect any flags.
    Oh the joys of assembly language!
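
Both fixes touch exactly the 320 bytes of the last row, which a C model can confirm. This is a sketch under the same assumptions as the comment (offsets relative to SCREEN, A already holding 0); the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    LAST_ROW = 0x1E00,   /* offset of the final 320-byte row */
    HALF_ROW = 0xA0      /* 160 bytes, one store per half per pass */
};

/* Count-up form: INY / CPY #$A0 / BNE. Y takes the values 0..$9F. */
static void clear_last_row_count_up(uint8_t *screen)
{
    assert(screen != NULL);
    uint8_t y = 0;                                 /* LDY #0 */
    do {
        screen[LAST_ROW + y] = 0;
        screen[LAST_ROW + HALF_ROW + y] = 0;
        ++y;                                       /* INY */
    } while (y != HALF_ROW);                       /* CPY #$A0 / BNE */
}

/* DEY-first form: decrement, store twice, then branch on the zero flag
 * that DEY left behind. Y takes the values $9F..0. */
static void clear_last_row_dey_first(uint8_t *screen)
{
    assert(screen != NULL);
    uint8_t y = HALF_ROW;                          /* LDY #$A0 */
    do {
        --y;                                       /* DEY */
        screen[LAST_ROW + y] = 0;                  /* STA leaves flags alone */
        screen[LAST_ROW + HALF_ROW + y] = 0;
    } while (y != 0);                              /* BNE */
}
```

In C the two loops cost about the same; the difference only matters on the 6502, where the count-up form pays for an extra CPY on every pass.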

  • @techsteph
    @techsteph 10 months ago

    Hi Dave, I was wondering what is the shortest, though most useful, program you ever wrote?
    Using old MS-DOS Debug I got down to 5 bytes of code, and it is the fastest program I ever wrote :) Any idea what it is??

  • @stebbygamer508
    @stebbygamer508 1 year ago

    Very nice sh1rt!! ❤

  • @benetelrae
    @benetelrae 1 year ago

    The Dan Flashes sponsorship finally hit.

  • @joshhiner729
    @joshhiner729 1 year ago

    Great video!! Do you have a github page with the examples you use in your videos so we can download and get hands on with them? Thanks for the great videos either way.

  • @__hannibaal__
    @__hannibaal__ 1 year ago

    Hello, what is the name of this assembly editor, and where can I get it?

  • @godfreypoon5148
    @godfreypoon5148 1 year ago

    With only a few chips, you could have a row offset register for the video RAM address...

  • @shawnj4545
    @shawnj4545 1 year ago

    I guess you can do what you want, but typically you'd want to set your variables in assembly as zpsym instead of just hard-coding a zero page address like you did (with CA65/CC65). I'm assuming you did that to simplify things for viewers.
    PS. Will you write a full conio implementation for the KIM-1 for cc65? :)

  • @colinmaharaj
    @colinmaharaj 1 year ago

    Since C++Builder went Clang, their assembler uses AT&T syntax. Kinda tricky. I started programming in 1991, using Borland C. I liked how easy it was to do the assembler. I used it to write high-performance text screen I/O, editors, COM port VT200 terminals (competed with Procomm Plus), all that nice stuff.
    Never used Microsoft compilers, so don't ask me about visual studio lol😂

    • @toby9999
      @toby9999 1 year ago

      I have only used Microsoft compilers, so don't ask me about the others :)
      Actually, not quite true. I did dabble with other stuff back in the 70s and 80s but MS for the past 30 years.

  • @Felice_Enellen
    @Felice_Enellen 1 year ago

    Is cc65 not smart enough to turn certain kinds of loops into register-indexed loops, rather than stack-variable-indexed loops?
    When I was writing code as a professional game dev and needed asm-speed code, I would try to write out a good implementation in asm and then create C or C++ code that generated something as close as possible to it, with comments on why certain things were arranged _just so._ This allowed the code to be portable, to be read and even debugged by asm-unaware programmers, and yet be highly performant.
    Another option is to have a reference version in C/C++ and a per-platform version that overrides it if present.

  • @thisisreallyme3130
    @thisisreallyme3130 1 year ago

    @Dave's Garage Is that binary shirt from Geek Tropical? I need this. Please post if you have an affiliate link to order this AWESOME shirt..

  • @guilherme5094
    @guilherme5094 1 year ago +1

    👍👍!!

  • @JonBailey
    @JonBailey 1 year ago

    @davepl - is the scroll routine something that could see a performance benefit from loop unrolling?

    • @SquallSf
      @SquallSf 1 year ago

      Yes. In general, everything that runs multiple times in a loop benefits from unrolling, because you reduce the overhead per iteration. However, in the early days memory was very expensive and scarce. That is why size was the preferred "optimization". For example, some of these early machines were sold with 4K of RAM.
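
The size-versus-speed trade-off of unrolling can be sketched in C. This is a generic illustration rather than the scroll routine from the video, and `fill_simple` and `fill_unrolled4` are made-up names. The unrolled version does four stores per loop test, so the counter update and branch are paid a quarter as often, at the cost of more code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Straightforward loop: one store plus one loop test per byte. */
static void fill_simple(uint8_t *dst, size_t n, uint8_t v)
{
    assert(dst != NULL);
    for (size_t i = 0; i < n; ++i) {
        dst[i] = v;
    }
}

/* Unrolled by four: one loop test per four stores, with a short
 * tail loop for any bytes left over when n is not a multiple of 4. */
static void fill_unrolled4(uint8_t *dst, size_t n, uint8_t v)
{
    assert(dst != NULL);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        dst[i]     = v;
        dst[i + 1] = v;
        dst[i + 2] = v;
        dst[i + 3] = v;
    }
    for (; i < n; ++i) {
        dst[i] = v;
    }
}
```

On a modern optimizing compiler the two may well compile to the same thing; the distinction matters most for hand-written assembly or simple compilers.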

  • @BinaryAdventure
    @BinaryAdventure 8 days ago

    Dave, where do I get that shirt?

  • @Zeitgeistpionier
    @Zeitgeistpionier 1 year ago

  • @user-st3nu2to7e
    @user-st3nu2to7e 7 months ago

    Really Nice Shirt

  • @GrindAlchemyTech
    @GrindAlchemyTech 9 months ago

    🙌🏽💗

  • @nuketheswamp7774
    @nuketheswamp7774 1 year ago

    Good stuff. Takes me back to when I tried to optimize graphics routines in perhaps my favorite MS product ever, QuickC. If the C version was too slow you could just mix in a block of assembly right in the c source. Really made those blits fast when you could combine unrolling loops and banging in words at a time instead of bytes! I'm a little bit sad that my next machine will probably be linux running win in VM's since MS seems to regard my PC as theirs for data mining :(

  • @KurtSchwind
    @KurtSchwind 1 year ago

    It was probably for the 286, but I'm almost certain my C compiler let me inline the ASM code. Or am I mis-remembering? Like literally ASM { mov AX ..... }

    • @SquallSf
      @SquallSf 1 year ago

      Yeah, many high-level languages (C, Pascal, ...) allow you to inline asm. The C that Dave uses (CC65) has terrible syntax: each instruction requires asm(..), which makes for ugly code. On top of that, you can't use some tricks that increase the readability of the code (like macros) or tricks like local unnamed labels. So using separate asm code is much easier and more readable.

  • @drganesh108
    @drganesh108 1 year ago

    Sir, there is a new language known as Rust. Is it better than C for performance and for using multi-core processors?

    • @KC9UDX
      @KC9UDX 10 months ago

      I've heard there are multi-core 65xx, but you actually program them?

  • @michaelraasch5496
    @michaelraasch5496 1 year ago

    Dave, I have not read through the 100+ comments, so someone may have asked this already.
    In the _ClearScreen code: Is there a particular reason why you do
    inc dest_hi
    ldx dest_hi
    cpx #>SCREEN + $20
    bne :-
    instead of loading the X-register with ldx #$20 after ldy #0 and then change the logic to
    dex
    bne :-
    It would save you quite some cycles.

    • @DavesGarage
      @DavesGarage 1 year ago

      I don't have the code in front of me, but that sounds like a good idea!

  • @stephenelliott7071
    @stephenelliott7071 1 year ago

    Loving the low level coding episodes.

  • @dand4485
    @dand4485 1 year ago

    Curious: rather than iterating line by line, copying row 1 to row 0, row 2 to row 1 and so on, why not just move all 8K - 320 bytes from row 1's address to row 0's address, then fill the last line with zeros starting at 8K - 320? That is only two operations, and it seems the CPU would copy the data faster than with multiple successive 320-byte copies...

    • @0LoneTech
      @0LoneTech 1 year ago

      Yes. Also, due to an 8bit quirk in 6502, it's faster to do page (256 byte) aligned loops. Each time an indexed operation has to cross a page boundary costs an extra cycle. Slightly complicated by the two pointers crossing pages at different times.
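
In C, the "one big move plus one fill" idea from this thread comes down to two library calls. This is a minimal sketch assuming an 8000-byte bitmap with 320-byte text rows; `scroll_up_one_row` is an illustrative name. Because the source and destination regions overlap, `memmove` is the right call rather than `memcpy`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    ROW_BYTES    = 320,    /* one text row: 8 pixel lines of 40 bytes */
    BITMAP_BYTES = 8000    /* 320x200 at 1 bit per pixel */
};

/* Scrolls the bitmap up by one text row and blanks the last row. */
static void scroll_up_one_row(uint8_t *screen)
{
    assert(screen != NULL);
    /* Source and destination overlap, so memmove (not memcpy) is needed. */
    memmove(screen, screen + ROW_BYTES, BITMAP_BYTES - ROW_BYTES);
    memset(screen + BITMAP_BYTES - ROW_BYTES, 0, ROW_BYTES);
}
```

How a C library for the 6502 implements those two calls is a separate question; the sketch only shows the shape of the operation.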

  • @michaelterrell
    @michaelterrell 1 year ago

    I had a KIM-1 that I salvaged from an old audiometer, but it disappeared a few years back when I was moving.

    • @DavesGarage
      @DavesGarage 1 year ago

      I had a brand new VT200A in the original box, it got lost in my last move. No idea how or why!

    • @michaelterrell
      @michaelterrell 1 year ago

      @@DavesGarage I lost two 10' by 20' warehouses full of computer and test equipment not long after 9/11. I was laid off the Friday before, and I was moving stuff from them to my house. I ended up bedridden for a little over two years. I had over a dozen PET computers, at least ten disk drives and a pile of 4023 printers. I also lost an 8023 printer, a wide-carriage dot matrix that you could send a lot of formatting commands to.
      I still have one of the original LASER printers, with the same print engine that HP made famous.
      There is a SWTPC computer out in my shop, as well. I have an Altos 586 computer that I've promised to a computer museum. I'm in my 70s. I started working in a TV shop at 13. I was building telemetry equipment for the aerospace industry, until I ended up disabled.
      Some of the oddest computers that I've had were made by National Semiconductor under their 'Datachecker' name. They were early POS systems that filled an entire enclosed relay rack. Both systems were dead, and after way too long of not finding any information, I scrapped them for parts. I recovered ten of the large LASER scanners. I sold the LASER tubes and power supplies, along with the front surface mirrors. I got more from the scrap aluminum housings than I paid for both systems.

  • @theoriginalrecycler
    @theoriginalrecycler 1 year ago +1

    I’m interested

  • @SRG-Learn-Code
    @SRG-Learn-Code 1 year ago

    Cool shirt! What does it say?

  • @jumeldipancaputra87
    @jumeldipancaputra87 1 year ago

    Sir, with the rise of the Zig programming language, which is claimed to be faster than assembly language for system programming, will assembly be replaced?

    • @DavesGarage
      @DavesGarage 1 year ago +1

      For very complex problems, yes. But for interfacing with hardware, assembly is still sometimes necessary, I suppose. But the number of places you need ASM is incredibly small these days!

    • @jumeldipancaputra87
      @jumeldipancaputra87 1 year ago

      @@DavesGarage OK, thanks for your answer, Sir.

  • @Elvin_Black
    @Elvin_Black 1 year ago

    @DavesGarage, are you interested or involved with the Microsoft Volterra Dev community? We all (the whole World) need you for ARM porting, buddy! You wrote a lot of it in the first place. I hope you do, or will someday have the ambition, to help set your creations free.

  • @okaro6595
    @okaro6595 1 year ago

    6:04 I fail to see how it works when it increments both src_hi and dest_hi at the same time even though the 320 is not divisible by 256.

    • @bill3143
      @bill3143 1 year ago

      Other than the initialization of the src & dest pointers and the final row detection, there is no actual concept of rows in the copy. You are correct that copying 256 bytes wouldn't copy an entire row, but since it's not row-centric it doesn't matter. The first time through the loop, 256 bytes are copied from row 1 to row 0; then in the second iteration, bytes 256->319 of row 1 are copied to row 0 and bytes 0->191 of row 2 are copied to row 1, and so on. The check for "BE" is how you know you're at the last row.
      Sorry if it's not clear, a diagram would work much better.

      @jerry-p 1 year ago
      @jerry-p Год назад

      The screen buffer is 8000 bytes, and he's copying 8000-320 bytes which is 0x1e00 (7680) bytes. So, he's copying 30 (0x1e) 256-byte pages, a page at a time with the inner loop. Then he blasts (most of) the last line (320 bytes) to zeroes. As he suggested, he's not actually zeroing the 0th byte on the last line because he does the "dey", "bne :-". The fix is pretty simple for 6502 programmers; maybe he'll show us next time. I think he's also zeroing the byte at 0xbf40 because of the way he indexes, but that's probably not a problem either.
      To get the zeroth byte cleared, change the STA instructions to:
      : sta screen+$1ea0-1,y
      sta screen+$1e00-1,y
      then it actually clears just the bytes desired. Off-by-one bug.
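
The page-at-a-time copy described in this thread can be modelled in C to show why the row width never enters into it. This is a sketch, not the routine from the video: offsets are relative to SCREEN, `scroll_by_pages` is an invented name, and the end test on the destination offset stands in for the check of the high byte against $BE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    ROW_BYTES  = 320,
    PAGE_BYTES = 256,
    COPY_BYTES = 8000 - 320      /* $1E00 bytes, i.e. 30 whole pages */
};

/* Copies everything below the first row up by ROW_BYTES, one 256-byte
 * page per outer pass. Returns the number of pages copied. The last row
 * is left untouched here; clearing it is a separate step. */
static unsigned scroll_by_pages(uint8_t *screen)
{
    assert(screen != NULL);
    unsigned pages = 0;
    uint16_t src = ROW_BYTES;
    uint16_t dst = 0;
    while (dst != COPY_BYTES) {                  /* dest high byte hit $1E */
        uint8_t y = 0;
        do {
            screen[dst + y] = screen[src + y];   /* LDA (src),Y / STA (dest),Y */
        } while (++y != 0);                      /* INY / BNE */
        src += PAGE_BYTES;                       /* INC src_hi */
        dst += PAGE_BYTES;                       /* INC dest_hi */
        ++pages;
    }
    return pages;
}
```

Both pointers advance by the same 256 bytes each pass, so the fixed 320-byte distance between them is preserved, and the forward copy is safe because the source always lies ahead of the destination.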

  • @pixelfingers
    @pixelfingers 1 year ago

    😊👍👍

  • @ChairmanHehe
    @ChairmanHehe 1 year ago

    cool shirt