RISC vs CISC - Is it Still a Thing?

  • Published: 21 Dec 2024

Comments • 541

  • @paulk314
    @paulk314 5 years ago +167

    I'm an engineer at ARM (actually just about to end my work day and clicked on this video) and this was a great explanation of all these concepts. I actually didn't know about delayed branch instructions, cool! I was also surprised to learn that branch prediction didn't become standard practice until a while after it was thought of. Neat!

    • @BruceHoult
      @BruceHoult 5 years ago +15

      Hi from SiFive :-) In CISC processors, branch prediction started with the 80486 and 68040, and heated up a bit in original Pentium and PowerPC, but really wasn't very good then -- maybe something like a 20% or 30% misprediction rate. Intel cracked the problem with the Pentium MMX and Pentium Pro (SUPER SECRET SAUCE back then) with essentially what we use today with 2% or 3% misprediction rate.

    • @boriscat1999
      @boriscat1999 4 years ago +1

      SH had branch registers: you could manually load your branch destination in advance before jumping to it, giving you some of the advantages of a delay slot and much shorter encodings (less duplication) for conditional branches.

    • @paulk314
      @paulk314 4 years ago +9

      @Dr ROLFCOPTER! Knowing about branch delay slots is rather arcane knowledge, given that it only existed on some RISC architectures and definitely isn't how modern processors work. The majority of my ARM knowledge was for AArch64, not 90s era technology. Anyway, on second viewing, the concept does sound familiar, though I think what I am remembering is possibly the (closely related) concept of load delay slots.
      At Arm I was a verification engineer and I had to possess a detailed understanding of the architecture, including concepts like virtualization, multiple stages of address translation, exception handling, and about a thousand other things specified in an architectural reference manual that was over 7,000 pages, not to mention the extensions. I had to understand how all these features interact and to design tests that worked across a variety of implementations in order to stress the microarchitectural features including branch prediction, speculative execution, caching, etc. I had to carry around a lot of knowledge in my head, so I guess a few details like load delay slots that haven't been used for decades slipped my mind.
      And the reason I'm speaking in the past tense about ARM is because I decided to accept an offer at SiFive, and am now working on learning the ins and outs of yet another architecture.

    •  4 years ago +1

      @@paulk314 Nice one Paul, and extremely well done getting a job at SiFive - an incredibly innovative company that's, quite literally, "Leading the RISC-V revolution".

    • @pnachtwey
      @pnachtwey 3 years ago

      I programmed a TI DSP C30. It had delayed branches, where a few instructions could be placed after a jump. This could get tricky if the jump was conditional. In some ways it was like the conditional execution part of the ARM machine codes. The C30 could use 3 registers in an instruction. For a CISC type of DSP I think it was pretty good for its time.

  • @laustudie
    @laustudie 5 years ago +165

    First time I actually understood the difference between CISC and RISC, thanks mate

    • @shirshanyaroy287
      @shirshanyaroy287 5 years ago

      @Z3U5 Off-topic but I feel like I've seen you on Quora XD

    • @kpsayyed84
      @kpsayyed84 4 years ago

      Same

    • @abstractapproach634
      @abstractapproach634 3 years ago

      @Z3U5 that's sad, but common. It's because the students dream of leaving academia (thus the brightest aren't teaching). It may have been part of why I studied Mathematics; the students' passion for the subject is reflected more brightly in the instructors (at least when you get higher up; pro tip for kids: go to a community college first, then to uni, and you end up with very few student teachers).

  • @green4free
    @green4free 5 years ago +133

    As you said, x86 is moving more towards RISC with things like micro-ops.
    But it goes the other way too.
    With things like vector instructions (NEON) and other more complex instructions, ARM is moving towards CISC as well.
    I think everyone is just aiming for that sweet spot.

    • @Waccoon
      @Waccoon 5 years ago +24

      Microcode and nanocode have been used since the beginning, and that's not what was introduced with the Pentium Pro or Pentium M. What's changed in modern processors is the idea that nanocode can be reordered and cached independently of the main ISA, so the processor is effectively translating the main ISA into a new ISA before execution. It's way more advanced than what any RISC processor is doing, and the idea was probably inspired by (or stolen from) the now defunct Transmeta line of processors.

    • @nextlifeonearth
      @nextlifeonearth 5 years ago +15

      Vector instructions aren't necessarily CISC though. If it has to be divided into individual instructions (fetch, op, store etc.) it is not really CISC.
      SIMD is RISC compatible.

    • @pwnmeisterage
      @pwnmeisterage 5 years ago +3

      Most instruction extensions are primarily intended to expand the advertised feature sets which sell more processors ... they might technically be categorized as RISC (or RISC-based or RISC-compatible, whatever) ... but their use in non-synthetic applications is infrequent and specialized enough that most of the time they're little more than inert silicon and inflated transistor counts ... which basically undermines all the advantages offered by RISC philosophy.

    • @gazlink1
      @gazlink1 5 years ago +3

      @@nextlifeonearth yup.. vector operations make the CPU GPU-like, not CISC-like.
      And no-one is going to call GPUs outdated, or inefficient, quite the opposite.
      Vectorisation of instructions is great.. for parallelisable operations. "Parallelisable" is just the same benefit multi-core CPUs have, but this is even more efficient; it's just a x4 or x8 of floating point or integer units, to be better (faster/more efficient) at processing larger numbers.
      If anything it's somewhat more RISC-like than Intel's "just make one core faster no matter the power needed" approach.

    • @user78405
      @user78405 5 years ago +2

      But everyone is going to do both. Intel cores are wide enough for the full set, while AMD's are not wide enough to do a RISC instruction in one clock cycle, though they can do it in 2 clock cycles per core. That's going to hurt AMD in the future when AR and next-gen Windows arrive around 2021; that would be the end of AMD's dominance on x86 again, because Intel played this right from the beginning by making sure RISC instructions are the way forward. With ARM competing hugely, Intel will end up winning the war from both sides, while AMD will be the victim of its own selfish small mistake. 128-bit is very important, not for memory but for programs; future instructions will no longer depend on the GPU, and that's when Nvidia is going to see this as a huge threat.

  • @amiralavi6599
    @amiralavi6599 5 years ago +31

    Only you can explain such complex stuff in such a simplified manner.

  • @hoberdansilva2894
    @hoberdansilva2894 3 years ago +6

    I used to program microcontrollers in the 90s in assembly language; the RISC architecture was really fast and excellent for simple applications. But when things were a bit complicated I preferred the x86 family, as the instruction set really simplified things for me. I'm really happy to see the evolution of those architectures through the years....

  • @antonnym214
    @antonnym214 4 years ago +5

    I designed a Minimal Instruction Set architecture with only 16 instructions (4-bit opcode): ADD, AND, NOT, OR, SHR, SUB, XOR, LDA, PSH, POP, STA, RDM, JC, JN, JV, JZ . If you are familiar with an 8080 or Z-80, it's like a cut-down version of that. All ALU operations result onto the stack, which is convenient because PSH and POP to/from the stack are great for transferring between registers. No CMP (compare) is needed because a compare is nothing but a sub where you don't care about the result, only the flags. All good wishes!
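
    A minimal C sketch of the "CMP is just a SUB that discards the result" idea above. The 8-bit width and the exact flag set (Z, N, C, V) are illustrative assumptions, not details of the original 16-instruction design:

    ```c
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical flags for a tiny accumulator/stack machine. */
    typedef struct {
        bool z;  /* zero                        */
        bool n;  /* negative (bit 7 of result)  */
        bool c;  /* borrow occurred on subtract */
        bool v;  /* signed overflow             */
    } flags_t;

    /* SUB that updates flags; CMP is this with the result thrown away. */
    static uint8_t sub_set_flags(uint8_t a, uint8_t b, flags_t *f) {
        uint8_t r = (uint8_t)(a - b);
        f->z = (r == 0);
        f->n = (r & 0x80) != 0;
        f->c = (a < b);
        f->v = ((a ^ b) & (a ^ r) & 0x80) != 0;
        return r;
    }

    int main(void) {
        flags_t f;
        (void)sub_set_flags(3, 7, &f);   /* "CMP 3, 7": keep only the flags */
        printf("z=%d n=%d c=%d v=%d\n", f.z, f.n, f.c, f.v);
        return 0;
    }
    ```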

  • @davejoubert3349
    @davejoubert3349 5 years ago +1

    I appreciate that you are hinting to your viewers the beautiful layers that sit between the instruction set and the silicon.

  • @thekakan
    @thekakan 5 years ago +2

    I love the fact that you included the microcode part.
    A lot of people talking about this topic entirely forget that fact. Thanks :)

    • @Arthur-qv8np
      @Arthur-qv8np 5 years ago

      ARM processors (which are a RISC ISA) also use micro-ops.
      Micro-ops are not related to RISC vs CISC ISA but to whether or not the architecture is superscalar.

    • @thekakan
      @thekakan 5 years ago

      @@Arthur-qv8np IIRC, the only microcode that ARM processors have is related to THUMB instructions.
      And yeah, I was talking about micro codes :x

    • @thekakan
      @thekakan 5 years ago

      Interesting.
      Well, if RISC processors start including microcode to "simplify" instructions, wouldn't that make them the same as CISC ones?
      Anyway, I hope it doesn't happen. Compilers can do amazing stuff, and the smaller the instruction set is, the easier it is for the compiler to optimize, or so I think it should be.

    • @Arthur-qv8np
      @Arthur-qv8np 5 years ago +2

      @@thekakan From my point of view, RISC and CISC only define the philosophy of the instruction set.
      This does not define the characteristics of the architecture.
      Obviously, the instruction set and the architecture are closely related, but it's orthogonal.
      You can design both CISC and RISC processors with the same kind of architecture, for example with out-of-order execution, speculative execution, branch prediction, register renaming, simultaneous multithreading, vector instructions, micro-ops, microcode, pipelining, ..

  • @NexuJin
    @NexuJin 5 years ago +42

    Interesting video. Kinda brings me back to when the Pentium came out and back then lots of computer magazines were writing about "Is RISC dead?". 20 years later the ARM processor is what made the majority of the population actually use a computer without calling it a computer!

    • @jasonknight1085
      @jasonknight1085 5 years ago +5

      Yes, but with SIMD extensions, NEON, virtualization instructions, the new execution level, MMU, LSE, VFP, etc, etc, can ARM really be called RISC anymore?
      Of course it still pisses me off that they keep adding all that stuff but still won't provide a simple flipping set of string operations. 12 clocks even with NEON to move 32 bits from one memory address to another, or 18 clocks just to send from memory to a port (when looping), is ridiculous; hence something like a 180 MHz M4 being barely on par with a DX2/66 in actual delivered computing power unless you sit there clock counting at the ASM level. (Don't expect GCC to produce anything worth a damn...)
      Just like how a 1 GHz A7 is about on par with a 450 MHz P2 in compute per clock. It wouldn't even be useful if it didn't run circles around CISC in compute per watt... though that is the real point of it.
      But what's the old joke? RISC is for people who write compilers, CISC is for people who write programs...

    • @Waccoon
      @Waccoon 5 years ago

      PowerPC was a major disappointment, and is what really sealed the fate of RISC on the desktop. It wasn't nearly as fast as promised, IBM and Motorola kept fighting each other with incompatible extensions, and I remember just how damn HOT they ran, too.
      ARM is okay for mobile stuff, but not really competitive on the desktop (and by that, I mean workstation). There's a reason why ARM has helped to take the "computer" out of computer.
      But, hey, ARM can do what they always do... make a new alternate ISA and continue to make their processors even more complicated and less RISC-y.

    • @lookoutforchris
      @lookoutforchris 2 years ago +2

      @@Waccoon the distinctions you’re using are 20+ years out of date. RISC v CISC is a meaningless phrase today. Read the Arstechnica article on this from 1999.

    • @Waccoon
      @Waccoon 2 years ago +1

      @@lookoutforchris It's not meaningless. The differences between RISC and CISC are almost entirely related to instruction encoding, not microarchitecture, so at the low level most RISC and CISC processors are designed and perform much the same way these days. However, the differences do matter, as the way compilers generate code for each design has to be fundamentally different if you want good performance and efficient use of the caches.
      I've read whitepapers and studied ISA encodings for more than 20 CPUs, so trust me, I'm not going to learn anything new from some watered-down article from 20 years ago.

  • @RafaelKarosuo
    @RafaelKarosuo 4 years ago +1

    Thank you for referencing the RISC I: A REDUCED INSTRUCTION SET VLSI COMPUTER paper here; your quotes and the sharp descriptions really helped too.
    Great work on distilling this comparison.

  • @dlwatib
    @dlwatib 5 years ago +7

    Excellent explanation.
    RISC was a very elegant solution, but I think CISC has inherent advantages that will win out in the end. CISC programs can be shorter because the instructions can more closely express the programmer's (or at least the compiler's) intentions. That's valuable information. Techniques like branch prediction depend on having that information available. A shorter program also means fewer instruction fetches from memory, so less load on the memory bandwidth. In other words, CISC programs have more dense and less distorted information content than RISC.
    RISC represents a premature optimization technique. It distorts and bloats the information content of the program for the purpose of making it easier for a specific machine implementation to understand and process. But machine instruction architectures last longer than a specific machine implementation. Historical artifacts like delayed branch instructions become needless complexity later on. Of course, the x86_64 architecture, a CISC architecture, has plenty of historical artifacts of its own that also add needless complexity, even bugs. But there's no reason to encourage the accumulation of cruft.

    • @sheelpriyagautam8333
      @sheelpriyagautam8333 5 years ago +1

      CISC programs have fewer instruction fetches but require extra logic to parse them, as they have variable length. Plus, these variable-length instructions are split into multiple instructions, and there is no optimal way of doing it except brute force, which requires even more extra logic. This makes CISC machines spend time doing something which is not calculation, which makes them more power hungry and less efficient. At this point, I really can't see any benefit to x86 except compatibility. I wonder how ARM processors became so fast in the last couple of years. Perhaps it's because the hardware limitations for which RISC was designed are no longer there.
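
      A toy C sketch of why variable-length decode costs extra logic: you cannot locate instruction N+1 until you have at least length-decoded instruction N, so a wide decoder either serialises or speculatively decodes at many byte offsets. The 2-bit length field below is an invented toy encoding, nothing like real x86:

      ```c
      #include <stdint.h>
      #include <stdio.h>

      /* Toy ISA: low 2 bits of the first byte give the instruction length (1-4 bytes). */
      static size_t insn_length(uint8_t first_byte) {
          return (size_t)(first_byte & 0x3u) + 1;
      }

      int main(void) {
          const uint8_t code[] = {0x02, 0xAA, 0xBB, 0x00, 0x01, 0x10, 0x03, 1, 2, 3};
          size_t pc = 0;
          while (pc < sizeof code) {
              size_t len = insn_length(code[pc]);  /* must finish before we know...  */
              printf("insn at offset %zu, %zu byte(s)\n", pc, len);
              pc += len;                           /* ...where the next one starts   */
          }
          return 0;
      }
      ```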

    • @erikengheim1106
      @erikengheim1106 4 years ago +1

      If CISC will win out in the end, then why are no new CPU designs CISC? Why is RISC taking over supercomputers and servers, and now entering the desktop market? It seems to me CISC is gradually painting itself into a corner.
      > CISC programs can be shorter because the instructions can more closely express the programmer's (or at least the compiler's) intentions. That's valuable information.
      The experience seems to be the opposite. One of the reasons RISC began to rise was that they discovered compiler writers were not very good at picking and utilizing these more complex instructions. Meanwhile, juggling registers, of which RISC CPUs have many, is something compilers have gotten a lot better at.
      Contrary to your representation of reality, CISC was to a larger degree made to facilitate people hand-writing assembly code. RISC, OTOH, is designed with the assumption that code will be compiled.
      > RISC represents a premature optimization technique.
      I would say it is the opposite. CISC prematurely optimizes instructions in hardware, an area not easily or quickly changed. RISC programs, in contrast, can be optimized simply by getting better compilers that arrange the instructions in a more optimal fashion.
      > A shorter program also means fewer instruction fetches from memory, so less load on the memory bandwidth.
      RISC has tricks to get around this. E.g. ARM uses the Thumb format, which makes the most common 32-bit instructions take 16 bits. That means you double the number of instructions you can fit in memory.
      In addition, the large number of registers in RISC processors allows them to reduce the number of load and store instructions, further reducing the memory required for instructions.
      > In other words, CISC programs have more dense and less distorted information content than RISC.
      Very odd way of putting it. RISC instructions are not distorted information. They tend to be simple, orthogonal instructions. CISC introduces the bloat by creating complex instructions requiring complex silicon to decode. For a compiler writer it is easier to compose things out of simple building blocks than to hunt down specialized instructions.
      I think the CISC approach really only benefits people doing compilation by hand, not real compilers. If I were writing assembly code by hand I would likely prefer CISC. Having played with different assembly languages, I would say the Motorola 68000 was hands down the easiest one for me to ever use. I kind of like AVR, but being RISC it is of course a bit cumbersome having to do so many things with multiple instructions. But there is a certain beauty in knowing every single instruction takes 1 cycle: I can easily count how long a particular segment of code will take to execute. No such luck with CISC. For real-time systems that is pretty nice.
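
      A small C illustration of that cycle-counting point: on a simple in-order core with documented, fixed per-instruction timings, the worst-case time of straight-line code is just a sum. The opcodes and cycle costs below are made up for the sketch, not taken from any real AVR or ARM datasheet:

      ```c
      #include <stdio.h>

      enum op { OP_LOAD, OP_ADD, OP_STORE, OP_BRANCH_TAKEN };

      /* Invented cost table for a toy in-order MCU. */
      static int cycles(enum op o) {
          switch (o) {
          case OP_LOAD:         return 2;
          case OP_ADD:          return 1;
          case OP_STORE:        return 2;
          case OP_BRANCH_TAKEN: return 3;  /* pipeline refill */
          }
          return 0;
      }

      int main(void) {
          enum op body[] = {OP_LOAD, OP_ADD, OP_STORE, OP_BRANCH_TAKEN};
          int total = 0;
          for (size_t i = 0; i < sizeof body / sizeof body[0]; i++)
              total += cycles(body[i]);
          printf("one loop iteration: %d cycles\n", total);
          return 0;
      }
      ```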

  • @JayanandSupali
    @JayanandSupali 5 years ago +17

    I just felt like my brain was fed with very soft baby food of Info. My dear friend Gary, you did an exceptionally good job at simplifying this for anyone to understand. Again, ThanQ so much :-) #BigFan

  • @KingsPhotographySolutions
    @KingsPhotographySolutions 5 years ago +53

    Got to be honest, I knew nothing about this before today. 😊 Now much more informed, thanks professor. 😁 Found this fascinating and you did an amazing job of breaking it down in a way that would be easily understood. 😊
    P.s. truly this is something every computer fan should be aware of. Thanks so much for making me much more informed than I was before. I really learn a lot from your channel. 😁

    • @NexuJin
      @NexuJin 5 years ago +1

      People from the Macintosh/IBM-compatible era should still know the differences... or at least be aware there is that difference.

  • @Handskemager
    @Handskemager 3 years ago

    So refreshing that someone actually explains it and explains the x86 splitting CISC instructions down to RISC instructions to be put down the pipeline.. ty! Will be referring to this when one of my friends doesn’t get it.

  • @MarsorryIckuatuna
    @MarsorryIckuatuna 4 years ago +3

    Wow, that was a lot of information right there. I lived through that period and only had the basics covered. Awesome video.

  • @cedartop
    @cedartop 4 years ago +1

    Your explanation of RISC reminded me of my study time, when we had to program assembler on an 80C31. There you really do all that stuff, including writing the exact address to write into or read from RAM.

  • @Sunshrine2
    @Sunshrine2 4 years ago +133

    Well, this just aged like wine.

    • @hotamohit
      @hotamohit 4 years ago +5

      just like Gary

    • @jcdentonunatco
      @jcdentonunatco 3 years ago +3

      Why? Everything he said is still pretty relevant. The RISC vs CISC battle will continue for decades

    • @Sunshrine2
      @Sunshrine2 3 years ago +10

      @@jcdentonunatco Well, that is precisely what “aged like wine” means. = good became better.
      The other expression would be “aged like milk”.

    • @jcdentonunatco
      @jcdentonunatco 3 years ago +9

      @@Sunshrine2 lol sorry thought you were being sarcastic

    • @Sunshrine2
      @Sunshrine2 3 years ago +8

      @@jcdentonunatco No, no, I insert the needed /s if I do that on the internet :D

  • @MegaLazygamer
    @MegaLazygamer 5 years ago +21

    7:49 x86 (Intel specifically) has an overwhelming dominance in the server market. The Power architecture used by IBM in servers and such is an edge case.

    • @denvera1g1
      @denvera1g1 5 years ago +1

      IA64/Itanium might be more common than PowerPC (now); I know many banks, financial institutions, and even EDU (where I work) used those, even though the only things they would support were a special version of Server 08 and then some specially kerneled versions of Linux/BSD. It isn't directly related to CISC or RISC, and during development it was thought it would eventually displace both RISC and CISC for servers, workstations and desktops. The main feature of this ISA was that it could execute multiple instructions per cycle, per thread. From what I understand the Intel-HP EPIC/IA64/Itanium/VLIW was sort of like hyperthreading on top of hyperthreading, which is what started in the Itanium 9500 series and later, where the processors came with hyperthreading so up to 4 instructions could be run on each core in a single cycle, and in theory you could have even more instructions per thread. Imagine this architecture being used on that demonstration-purposes Intel processor that had, what was it, 8 threads per core on a 4-core daughter board. I think they canned it around the time Xeon Phi started getting into the 50-core mark, probably a year or two before the launch of the x100 series, because more cores is always going to outperform more threads per core.

    • @skilletpan5674
      @skilletpan5674 5 years ago

      @@denvera1g1 Yes. A CPU core needs to dump whatever is being done by the other threads when it executes a jmp. So unless your code is highly optimized it'll mean that 2 or however many threads need to be stopped and restarted (as I understand hyperthreading) every time a jmp needs to happen. This is why jump prediction etc. has been so heavily worked on by Intel and AMD over the last few decades. I wonder how much the neural network thing AMD uses now really helps?

    • @boriscat1999
      @boriscat1999 4 years ago +1

      Two of the top 3 supercomputers are POWER, and x86 only just makes it into the top 5 of supercomputers. I think x86's dominance in the server market doesn't translate to dominance in all markets, especially if the requirements become very specialized, like a supercomputer or a mobile device.
      Ultimately it's not the architecture of x86 that drives this, it's the licensing and available software. Building an x86 server means you can optionally sell a Windows license with it, have a broader market, and make more profit.
      It's hard for a company to make a modern x86 without stepping on patents, and there is no way to license it from Intel (or AMD). ARM can be licensed by anyone and adapted to the special needs of a product. And IBM works very closely with vendors to adapt their POWER chips to meet the specialized requirements of supercomputers.

    • @squirlmy
      @squirlmy 4 years ago

      Are you responding to 7:07? His point was that it was silly to even base "CISC won the war vs RISC" on the server market. So, you are defending this in spite of it just being an indicator of an ignorant viewpoint? Are you trying to confirm this: "I really was making a really dumb argument but I was right about that particular point?" Is that what you're trying to say?

    • @squirlmy
      @squirlmy 4 years ago

      @@denvera1g1 also many years ago PowerPC dominated the automobile computer market. And that's several chips per auto. I learned this at the time Apple was leaving PowerPC for Intel, and they may still be dominant in autos. If your argument (whatever is being argued!) doesn't include cars, trucks, airplanes, large appliances, etc. you're not making a good argument about the CPU "multiprocessor" market.

  • @lastmiles
    @lastmiles 5 years ago +1

    Always a pleasure to listen to someone that knows what they are talking about. All the way down to the wires.

  • @sennabullet
    @sennabullet 3 years ago +1

    Awesome!!! Thank you Gary Sims!!!

  • @JeremyChone
    @JeremyChone 4 years ago +1

    Wow, what a great explanation. Love the last bits about the first instruction splitters and how it relates to heat/power.

  • @DemiImp
    @DemiImp 5 years ago +14

    "ARM V8 consists of 3 ISAs: 64Bit AAarch64 and 32 bit ARM, which is further divided up into A32 and Thumb (16 and 32).
    32-bit ARM is clearly CISC-y: variable length instructions, instructions that read/write multiple registers (push/pop), and a variety of odd instructions in Neon (floating point), just to name a few. These complex instruction crack into a variable number of ops, which is no-no in RISC.
    Aarch-64 cleaned up much of the ISA, but left in plenty of things that are CISC-y: loads/store pair, load/store with auto increment, arithmetic/logic with shifts, vector ld/st instructions in Neon to do strided reads/writes, etc. Again, fairly CISC-y. ARM instructions encode more information than say your typical DEC Alpha instruction; it’s closer to x86 than Alpha/SPARC in that sense.
    I think the RISC vs CISC lines have been blurred for over a decade, ever since out-of-order execution went mainstream. The advantage of RISC is clear in in-order machines. In OoO machines, not so much, with the sole exception of fixed-length instructions. ARM came from the embedded system world where lots of assembly code is handwritten. Accordingly, their ISA reflects the common usage patterns. At any rate, Cisc-y instructions are preferable to some of the oddball ISA choices made by early RISC ISAs (register windows, branch delay slots, load delay slots, reciprocal step instructions, etc)."
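
    A plain C sketch of where those "CISC-y" addressing modes tend to show up; this is not ARM assembly, and whether a given compiler actually folds the pointer bumps into post-increment loads/stores or uses load/store pair is an assumption about code generation, not a guarantee:

    ```c
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* A copy loop like this is the classic candidate: on an ISA with post-increment
     * addressing the pointer updates can be folded into the load and store, and with
     * load/store pair two words can move per instruction. On a stricter load/store
     * ISA each pointer bump is a separate add. */
    static void copy_words(uint64_t *dst, const uint64_t *src, size_t n) {
        for (size_t i = 0; i < n; i++)
            *dst++ = *src++;
    }

    int main(void) {
        uint64_t a[4] = {1, 2, 3, 4}, b[4] = {0};
        copy_words(b, a, 4);
        printf("%llu %llu\n", (unsigned long long)b[0], (unsigned long long)b[3]);
        return 0;
    }
    ```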

    • @erikengheim1106
      @erikengheim1106 4 years ago

      Interesting take but based on my reading this also seems a bit pedantic. E.g. Thumb follows a very RISC like philosophy in reducing cache usage. Rather than adding complex instructions to reduce memory usage, they came up with Thumb, which is just a compressed version of a subset of their 32 bit instruction set. It is not a new instruction set as such.
      As far as I understand RISC is based on the 80/20 kind of rule, that 20% of the instructions are used 80% of the time. Hence they try to keep these 20% instructions equal in length and execution time to make the pipelines work effectively.
      Maybe I am wrong about this, but I would bet that the specialized longer clock cycle ARM instructions are used in special contexts, where one might use a lot of them. As far as I understand, you want to keep feeding as many equal sized instructions in a row as possible. They may pull that off still even with variable length instructions if these instructions are not usually mingled a lot with other instructions.
      Obviously there must be something very RISC-like about a lot of the ARM instruction sets when their ARM Cortex-M0 can be implemented in a mere 12,000 transistors, which is HALF that of an Intel 8086, despite being a 32-bit processor with a way higher clock frequency.

    • @DemiImp
      @DemiImp 4 years ago +1

      @@erikengheim1106 The quote I pasted was talking about how ARM really isn't RISC anymore. It once was, but it has bloated a lot in the past decade.

    • @erikengheim1106
      @erikengheim1106 4 years ago

      @@DemiImp Let me rephrase my point since I don't think it got across.
      If you add a bunch of functional programming constructs to an object-oriented programming language, does that language then become functional?
      Or if you add a bunch of object-oriented features to a functional programming language, does it become object-oriented?
      There are many ways of answering such a question. You have the pedantic who operate on one-drop rules: e.g. if there is only a hint of object-oriented features, the whole thing must be categorized as object-oriented.
      Then there are the pedantic who say that now both languages are multi-paradigm. The OOP and functional divide no longer makes sense!
      Yet at this point the pedantic has lost track of why humans engage in taxonomy in the first place. If you end up categorizing every single programming language as, say, multi-paradigm, then your categorization criteria are worthless. Your taxonomy adds no value to people looking at a landscape of different technologies and trying to reason about them.
      Hence I believe taxonomies should be pragmatic. They should be based on heuristics rather than hard rules.
      Just because you add a bunch of non-RISC like instructions doesn't mean that the core of the instruction set wasn't design around the RISC philosophy.
      It is kind of like adding OO features to a functional language. It does not change the fact that the core has been centered around functional thinking.

    • @DemiImp
      @DemiImp 4 years ago +1

      @@erikengheim1106 I would disagree. C++ is C but with more OO design. Are you suggesting that C++ is not actually an OO language?
      The defining characteristic of a RISC architecture is that it has close to the minimum set of instructions to be Turing complete. The moment you start adding in more instructions, the less RISC it becomes and the more CISC it is. Modern ARM ISAs are not very RISC-y.

    • @erikengheim1106
      @erikengheim1106 4 years ago

      DemiImp Definitions should be useful. Defining RISC as being about as few instructions as possible isn't a useful definition. The PowerPC G3 had more instructions than the Pentium Pro, e.g. Yet it was a very RISC-like architecture because it used instructions of equal size, requiring the same number of clock cycles, etc., all to make pipelining easier. It was designed around reducing load and store instructions by using many registers. This is a very RISC-like philosophy.

  • @CommandLineCowboy
    @CommandLineCowboy 5 years ago +4

    No mention of microcode and larger register sets. When RISC designs were first envisioned in the early 80's these two things were often talked of in the RISC vs CISC conversation. Microcode was a bank of ROM memory in the processor whose stored bits matched gates that controlled the flow of bits, bytes and words between the various registers, logic, address and data buses. The add-one-to-a-memory-location example might have several stages: connect the two bytes of the location from the current instruction register to the address bus, strobe the address bus read, connect the data bus to the accumulator, connect the accumulator to the increment logic, connect the accumulator to the data bus, strobe the write to store. The microcode ROM could have an arbitrarily long sequence, and this enabled complex instructions. The first ARM processor eschewed microcode; its decoding was all logic. By not spending transistors on microcode they could spend the transistors on more registers. The 32-bit x86 is a register-poor device; x86 code is full of instructions that push and pop data off the stack, because there are few registers to store intermediate results of calculations. The ARM had 16 32-bit registers, of which up to 14 were available to store intermediate results: much less accessing of the by-then-slower memory. The first microprocessors were only as fast as memory; by the time of the 80286, and certainly the 80386, PC motherboards could contain fast static RAM to be used as processor cache because the CPUs were losing performance waiting for slower memory. Keeping data in registers is the ultimate cache: avoiding the need to load data from external memory saves at least a couple of clock cycles, more if the memory isn't cached.
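
    A toy C model of that "add one to a memory location" sequence: the microcode ROM is just a table of register-transfer steps that the control unit walks through. The step names and the machine model are invented for illustration, not a description of any real micro-engine:

    ```c
    #include <stdint.h>
    #include <stdio.h>

    enum ustep { U_ADDR_FROM_IR, U_MEM_READ, U_ALU_INC, U_MEM_WRITE, U_END };

    /* "Microcode ROM" entry for an increment-memory instruction. */
    static const enum ustep inc_mem_ucode[] = {
        U_ADDR_FROM_IR,  /* drive the operand address onto the address bus  */
        U_MEM_READ,      /* strobe read, latch the data bus into a temp reg */
        U_ALU_INC,       /* run the temp through the increment logic        */
        U_MEM_WRITE,     /* drive the temp onto the data bus, strobe write  */
        U_END
    };

    int main(void) {
        uint8_t mem[16] = {0};
        uint8_t addr = 3, temp = 0;

        for (const enum ustep *u = inc_mem_ucode; *u != U_END; u++) {
            switch (*u) {
            case U_ADDR_FROM_IR: /* address already held in 'addr' here */ break;
            case U_MEM_READ:     temp = mem[addr];                       break;
            case U_ALU_INC:      temp = (uint8_t)(temp + 1);             break;
            case U_MEM_WRITE:    mem[addr] = temp;                       break;
            case U_END:                                                  break;
            }
        }
        printf("mem[%u] = %u\n", addr, mem[addr]);
        return 0;
    }
    ```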

    • @dlwatib
      @dlwatib 5 years ago +1

      Machines with 16 registers were the norm for CISC machines even before RISC was invented. Even machines that didn't have a 32-bit word size still usually had 16 general purpose registers of whatever word size they did have (though register 0 was often hardwired to hold a 0 value). The register-poor x86 architecture was the exception, not the rule.

    • @CommandLineCowboy
      @CommandLineCowboy 5 years ago

      @@dlwatib Been doing a bit of Google research. Probably the most common 16-register processor was the 68000. Also the IBM 360, VAX and NS32016. I've only worked as a programmer on 68000 and x86 machines. Any other CISC processor types with 16 registers I've missed? I would argue 'the norm' for most people was an x86 machine or an 8-bit micro at the time of RISC's introduction. Having a little trouble finding processor production numbers to justify my assumption. Any graph of the number of processors built would skew to 8-bit types because of the huge number of embedded processors. By the time the Macintosh, Amiga and Atari ST had popularised the 68000, the PC was dominant. A list of CISC processor types might show many 16-register types, but in the actual computers people used, 9 out of 10 would be the register-poor x86, Z80 and 6502.

  • @RonnieBeck
    @RonnieBeck 5 years ago +5

    Concise, informative and well spoken. Thanks for awesome explanation!

  • @BobDiaz123
    @BobDiaz123 5 years ago +1

    I like how the very simple 8-bit PICs deal with a jump. Most instructions take 1 cycle, but any branch or jump clears the pipeline and takes 2 cycles. Microchip is going for very low-cost chips, so the delay of an extra cycle for a jump helps to keep the chip cost down. PICs are embedded in many products and are used a lot.

  • @patrickdaxboeck4056
    @patrickdaxboeck4056 5 years ago +34

    In fact modern x86 CPUs are internally RISC machines with a translating layer to the outside. AMD was the first to do so for its 64-bit instructions and later Intel followed the same path. On the other hand, the classic RISC CPUs have so many instructions now, and special additions for e.g. math, that they are not really RISC anymore.

    • @GaryExplains
      @GaryExplains  5 years ago +22

      The way you say "in fact" makes me wonder if you watched the video because I talk about this subject in the video.

    • @patrickdaxboeck4056
      @patrickdaxboeck4056 5 years ago +12

      Dear Gary, you are right; just before the end of the video you talked about the micro-ops of Intel CPUs, and that was just after I sent the comment.

    • @PEGuyMadison
      @PEGuyMadison 5 years ago +1

      Actually... Intel was... sort of.. they purchased this technology from Digital Equipment which was used on the DEC Alpha chips. This was introduced into the 2nd generation of Pentium Processors which outperformed the P52C in so many ways.
      So there is the history... long before AMD there was DEC Alpha... which is now owned by Intel.

    • @gazlink1
      @gazlink1 5 years ago +2

      .. and they still have CISC instructions going into them.
      If they have a RISC-like inner core, surrounded with a (pointless?) CISC to RISC converter, then that just changed the definition of what a modern CISC architecture does with CISC instructions, redefining what CISC architectures.. are.
      ... But it's still a CISC architecture.
      And all that stuff surrounding the RISC-like core - that's why iPads are so damn capable and smooth at such low power.

    • @PEGuyMadison
      @PEGuyMadison 5 years ago +1

      @@gazlink1 CISC instructions are far more packed and lower power compared to RISC instructions... plus with CISC you get a much higher dispatch rate of instructions achieving higher parallelism and usage of independent functional units within a CPU.

  • @takshpatel8109
    @takshpatel8109 2 years ago +1

    One of the best teachers on hardware stuff.

  •  5 years ago

    Thank you Gary for the historical refresher and current usage of RISC and CISC processors. Both continue to serve useful purposes.

  • @datasailor8132
    @datasailor8132 5 years ago +5

    Typically, RISC computers are hard-coded whereas CISC computers are microcoded. As the presenter alluded to, the individual steps of a CISC instruction are read from the microcode memory. There was one aside that I have a quibble with: he kept referring to completing one instruction per clock cycle whereas the real objective is to start one instruction per cycle. You just need to add more pipelines.
    Part of Patterson's rationale for all this was that very few compilers used the complex addressing modes. Well, he was using Dennis Ritchie's C compiler and Dennis didn't use a lot of the instructions in the VAX. Remember Berkeley was a big UNIX shop with its BSD, Berkeley Software Distribution, operation. I was a developer at Bell Labs in New Jersey in those days and a lot of people were aghast at the amount of resources we'd throw at a project. These included people like Dennis, Ken Thompson, Brian Kernighan, and especially John "Small is Beautiful" Mashey. We'd talk in staffing levels of man-millennia.
    Old Bell Labs joke. "Why does Dennis Ritchie use a text editor?" Answer: "Because the compiler won't accept code from the standard input."
    Interesting fact. In the very early days the UNIX source code was found in /usr/dmr.

  • @BruceHoult
    @BruceHoult 5 years ago +2

    A pretty fair video. One interesting point is that x86 with instructions from 1 to 15 bytes long does not actually have more compact code than modern RISC such as ARM Thumb or RISC-V which have both gone radically away from having a single 4 byte instruction length by ... having both 2 and 4 byte instructions. Radical! That's enough to let both of those have more compact code than i686. Interestingly, the very first RISCs, the IBM 801 project and RISC-I also had both 2 and 4 byte instructions.

    • @Arthur-qv8np
      @Arthur-qv8np 5 years ago

      "does not actually have more compact code than modern RISC such as ARM Thumb or RISC-V"
      you mean "RISC-V C extension", right?

    • @BruceHoult
      @BruceHoult 5 years ago +1

      @@Arthur-qv8np Yes. It's pretty much only student projects or tiny FPGA cores that don't implement the C extension. Once you get to even a couple of KB of code you get an overall savings in implementing C.

    • @erikengheim1106
      @erikengheim1106 4 years ago

      Bruce what is the average length of executed x86 instructions though? I have been trying to understand how much thumb matters. But it is hard to assess without knowing the length of the typical x86-64 instruction. And also how much does a 15 byte long instruction matter if it does the job of 30 RISC instructions?
      I am a RISC fan, but I want to make sure I understand the tradeoffs properly.

    • @BruceHoult
      @BruceHoult 4 years ago

      @@erikengheim1106 this talk (and referenced paper) discusses this ruclips.net/video/Ii_pEXKKYUg/видео.html

    • @erikengheim1106
      @erikengheim1106 4 years ago

      @@BruceHoult Thanks, interesting video.
      From what I could gather x86 would have 3.7 bytes per instruction on average. Their RISC-V compressed instruction set got down to 3 bytes per instruction on average.
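
      Back-of-the-envelope arithmetic on those figures, assuming (as a simplification) that both ISAs need a similar number of instructions for the same program; the instruction count below is an arbitrary example, not a measurement:

      ```c
      #include <stdio.h>

      int main(void) {
          const double x86_avg_bytes = 3.7;      /* average quoted above         */
          const double rvc_avg_bytes = 3.0;      /* RISC-V with the C extension  */
          const long   n_insns       = 1000000;  /* hypothetical program size    */

          printf("x86-64 : ~%.0f KiB\n", n_insns * x86_avg_bytes / 1024.0);
          printf("RISC-V : ~%.0f KiB\n", n_insns * rvc_avg_bytes / 1024.0);
          printf("ratio  : %.2f\n", x86_avg_bytes / rvc_avg_bytes);
          return 0;
      }
      ```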

  • @jimreynolds2399
    @jimreynolds2399 4 years ago +2

    Worth mentioning that while modern processors have billions of transistors, compared to back in the 80s, the vast majority of those are for L2 and L3 cache rather than CPU implementation. I think they also include graphics functions now as well. I remember the 6502 - it had only about 3,500 transistors.

  • @IslandHermit
    @IslandHermit 3 years ago +1

    Another motivation behind RISC was that the CPU real estate freed up by using a less complex instruction set could be used to add more registers - a LOT more registers - which would speed up computation by reducing memory accesses and would also allow compilers to do much heavier optimization.

    • @fredrikbergquist5734
      @fredrikbergquist5734 1 year ago +1

      That is in my opinion the real reason that RISC was so successful: it was in some way implementing a cache with very few logic elements! Implementing a cache algorithm in hardware is difficult; here the "cache" is actually managed by the compiler, which can optimize it in a better way and might analyze how the program will most likely run. Today, with billions of transistors and a cache in three layers, that might not be so important, but still, giving the compiler a lot of control is a good idea.

  • @44r0n-9
    @44r0n-9 3 years ago

    What a great video! So easy to follow and explains stuff I didn't even know I wanted to know.

  • @feedmyintellect
    @feedmyintellect 4 years ago +1

    Thank You!!!!
    You are great at explaining complex things!!!

  • @nimrodlevy
    @nimrodlevy 5 years ago +3

    Your lectures are always a delight! Many thanks super interesting!!!

  • @tunahankaratay1523
    @tunahankaratay1523 3 years ago +1

    The thing is that nowadays you are probably better off focusing on instruction-level parallelism rather than crazy complex instructions. Most complex stuff can be offloaded to a dedicated coprocessor anyway (things like cryptography, video encode/decode, AI, etc.). And those coprocessors can be completely powered down when not in use to save tons of power.

  • @stevenmeiklejohn4501
    @stevenmeiklejohn4501 4 years ago

    Brilliant explanation. You have yourself a new subscriber/fan.

  • @sin3r6y98
    @sin3r6y98 5 years ago +18

    x86 today is largely moving towards more and more RISC concepts, with complex instructions decoded in microcode. If Intel really wanted to make a low-power phone chip they could, by simply removing all the backwards-compatible CISC instructions they emulate from the 80s and 90s. AMD did this a long time ago; if anyone was more poised to make low-power chips it'd be them. But the reality is that ARM has largely already won this front. There's no reason for Intel or AMD to try to compete in that market alongside ARM, as it would require a lot of porting effort, and for that to be convincing x86 would not only have to be power comparable but also provide significant advantages. With how long ARM has spent in the past focusing on performance-per-watt I'm not entirely sure that's really even possible.

    • @RonLaws
      @RonLaws 5 years ago +4

      People forget though that ARM don't manufacture the processors. They design the specifications for other companies to make and implement as they see fit. Intel have in the past produced ARM CPUs; all HP PocketPCs from around 2003, for example, shipped with an ARMv5TE chip which was an Intel part (the Intel XScale) using the ARM-licensed specifications.

    • @mrrolandlawrence
      @mrrolandlawrence 5 years ago

      As soon as Apple get their laptops/desktops on ARM processors and Windows 10 support for ARM improves... x64 will be a dying breed.

  • @Luredreier
    @Luredreier 5 years ago

    Nice video. =)
    While I knew all of this already I don't think I'd be able to really express myself as well and explain this as clearly as you did in this video.

  • @robinbanerjee3829
    @robinbanerjee3829 5 years ago +1

    Excellent video! Thanks a lot. Keep it up!

  • @profounddevices
    @profounddevices 1 year ago

    i like the explanation of risc vs cisc in this video. risc has come a long way, in part from faster ram and greater cache, or even tightly coupling ram on the soc. i know first hand that intel cisc specifically handles loading of registers for avx and simd faster. intel might be doing a wrapper to convert cisc to risc in microcode, but the simd steps and avx loading are streamlined. risc on arm loading registers is complex and slow. when this is solved for risc, cisc may not be needed anymore. it is these specific high performance operations keeping it alive.

  • @JJDShrimpton
    @JJDShrimpton 5 years ago +2

    An excellent video, thank you Gary.

  • @Raul_Gajadhar
    @Raul_Gajadhar 4 years ago

    Towards the end you are right, because in 2000 the Pentium 4 had 42,000,000 transistors, but something changed somewhere, because the next year, in 2001, Intel started to make the Itanium processor with 25,000,000 transistors.
    I really enjoyed this presentation.

  • @boriscat1999
    @boriscat1999 4 years ago

    To me, the critical difference between CISC and RISC is that on CISC you have a fairly complex set of possible state transitions for bus access (on the 8080/Z80 this was formalized as T-states). The point that RISC doesn't access memory to do operations is how this plays out. In CISC you might have a complex operation that loads a value, adds a number to it, and writes it back: that means the bus access will have to follow that read/wait/write sequence. Versus another instruction that simply reads a value and stores it in a register, which only has a read cycle. Two different sequences of state. As you get more complicated instructions you end up with even more possible sequences.
    RISC generally doesn't need to pick from a broad set of possible sequences and does every operation (roughly) the same, or in a more asynchronous fashion.

  • @BILLYZWB
    @BILLYZWB 2 years ago

    cheers man, really helpful description!

  • @BruceHoult
    @BruceHoult 5 years ago +7

    The x86 "instruction decode tax" hasn't mattered on the desktop for a long time with just one or even four or six cores. It's very noticeable (as you allude to) as you go smaller. Small 32 bit microcontrollers such as ARM Cortex M0 and SiFive E20 and PULP Zero RISCY have a similar number of transistors to a 16 bit 8086 (29000), as did the first few generations of ARM chips. The smallest modern x86, the Atom, apparently has 17,000,000. This matters both when you want to put some teeny tiny CPU in the corner of another chip, and also when you want to have thousands or tens of thousands of CPUs and need to supply them with electricity and cooling.

    • @allmycircuits8850
      @allmycircuits8850 4 years ago

      I'm pretty sure most of 17,000,000 transistors are used for cache of various levels. But instruction decode tax exists nonetheless, just not as huge as one could think :)

    • @BruceHoult
      @BruceHoult 4 years ago

      @@allmycircuits8850 no, 17 million is just the core. Including cache etc. the Silverthorne CPUs had 47 million, and Lincroft 140 million after the GPU and DDR controller were moved on-die.

  • @gazlink1
    @gazlink1 5 years ago +20

    From my understanding there's another downside to CISC.. sort of second-order effects from the ones mentioned. Having to turn CISC into RISC on the fly, to get at the RISC-like micro-ops, is also the reason x86 has a much harder time (complexity to design, number of transistors, amount of silicon and power consumption) with branch prediction and out-of-order execution. Each incoming CISC instruction can take any amount of time to execute once decoded, and can jump to anywhere else in the list of CISC instructions that make up the program - each of which again needs to be decoded into micro-ops - whereas RISC knows what all the "micro-ops" will be; they're already written as RISC instructions. I guess with an equally complex compiler you can create programs that are just as friendly to a modern x86 core as to a RISC core, but you can get more unoptimised "dogs" of programs on CISC than on RISC, where optimisation will be easier and more dependable. Microcode updates may change things on CISC, and make the optimisations needed change over time.
    This is probably part of the reason that Intel (and AMD to some extent) suffers from so many security leaks in regard to their branch prediction and speculative execution - it's a much more complex job to implement than with RISC architectures.
    All that spare microcode silicon will always be a power hog that ARM doesn't need, even when some of it is sitting there doing nothing most of the time because it's for legacy operations that are used infrequently.

    • @AykevanLaethem
      @AykevanLaethem 5 years ago +1

      ARM also suffers from these processor security issues. It's just that only their very high end processors are affected, because only those processors match a typical x86 processor in complexity and performance. So in a sense, it's the complexity and performance optimizations that led to the security issues, not ARM vs x86 (or RISC vs CISC).

    • @Arthur-qv8np
      @Arthur-qv8np 5 years ago +3

      "This is probably part of the reason that Intel (and AMD to some extent) suffers from so many security leaks in regard to their branch prediction and speculative execution - its a much more complex job to implement than with RISC architectures. "
      Not really; speculative & out-of-order execution and branch prediction are performed on the micro-ops (which are in the out-of-order world). CISC instructions are issued in order.
      And ARM is also affected by vulnerabilities that exploit speculative and out-of-order execution.

    • @avid0g
      @avid0g 5 years ago +1

      The vulnerability that this thread is referring to is the lack of housekeeping on abandoned register and cache data; An error in thinking that hidden or obscure is the same as secure. The patches applied to this problem are at the software level. The insecurity has been built into the hardware for years...

    • @squirlmy
      @squirlmy 4 years ago

      @@Arthur-qv8np I think you may have written that without reading the comment above. They're probably vulnerable at the level of smartphones and some Chromebooks, but not much else. Really, it seems like any of this wouldn't be an issue today except Intel has so much IP built on old decisions about CISC vs RISC "approaches". And ARM vice versa.

    • @Arthur-qv8np
      @Arthur-qv8np 4 years ago

      @@squirlmy I wrote this comment a while ago, and I haven't developed it much.
      These vulnerabilities (like Spectre/Meltdown) are really not a problem of CISC vs RISC; it's a problem with the concept of superscalar processors (processors that read a scalar instruction flow but manage to execute these instructions in parallel by exploiting instruction-level parallelism (ILP)).
      The problem is that this sort of processor, which uses OoO & speculative execution, makes the isolation between threads more complex than we thought.
      All superscalar processors have been affected to some degree, depending on their microarchitecture design.
      The parts of the micro-architecture that are responsible for the issues are not in the instruction decoder or even in the execution units of the processor, but rather in the memory system (in a very large meaning) and how the transient instructions will affect this memory system. So it's really not because of the CISC. The CISC only makes the instruction decoder more complex, not the rest of the micro-architecture.
      The more optimization the processor contains in this memory system, the more it is exposed to potential vulnerabilities.
      Intel processors contained a lot of this optimization (especially bypasses to access data faster).
      We can blame Intel's security section for not figuring out these problems, but we can't really blame the designers who created these optimizations for not having thought of security flaws that didn't exist in the first place (the meltdown flaw actually exploits a behavior described in an Intel patent).
      Superscalar processors from ARM, IBM or AMD are much less advanced in this type of optimization and have therefore been less affected.
      The processors have been affected at their "optimization" level. (can we still talk about optimization when we talk about a behavior that adds a flaw?)
      For these reasons Intel's CISC ISA is not involved, a RISC ISA would have caused the same problem to Intel.
      It is important to understand that ISA is only a very small part of a processor, its memory system is much larger and causes much more problems (processors have a computing power limited by the memory system - memory is the bottleneck).

  • @BillCipher1337
    @BillCipher1337 4 years ago +1

    Wow, you have explained it perfectly; now I finally understand the whole CISC vs RISC thing :)

  • @kshitijvengurlekar1192
    @kshitijvengurlekar1192 5 years ago +4

    Hey there Gary!
    Young as always

  • @richo13
    @richo13 5 years ago +1

    Great video Gary, I learnt a lot

  • @cmilkau
    @cmilkau 5 years ago +1

    It is probably worth noting that IA64 (not x86_64) is actually kind-of a modern VLIW arch. Kind-of as it still hides implementation details that a pure VLIW would not, like the real number of ALU units.

  • @JNCressey
    @JNCressey 5 years ago +7

    9:15 although, when the prediction isn't done exactly right, you get vulnerabilities like spectre and meltdown.

  • @alvinmjensen
    @alvinmjensen 3 years ago +1

    You forget all the little microcontrollers in e.g. washing machines are also RISC.

  • @kobolds638
    @kobolds638 4 months ago

    With people pushing ARM to desktop and server, which means adding more instructions, will that mean that in the future RISC will grow to become CISC?
    Take the latest ARM for desktop: the power consumption increases a lot. By the time ARM catches up with x86 in performance, can ARM still be considered RISC?

  • @Jorge-xf9gs
    @Jorge-xf9gs 3 years ago

    What do you think of ZISC as a general purpose "architecture"?

  • @rndompersn3426
    @rndompersn3426 5 years ago

    I remember years ago reading on a tech website that the Atom CPUs in mobile phones were actually more power efficient than the ARM CPUs. This was a while ago.

    • @GaryExplains
      @GaryExplains  5 years ago +5

      Since you mentioned it I went looking and found this: www.tomshardware.com/reviews/atom-z2760-power-consumption-arm,3387.html from 2012. I would take such a report with about 3 metric tons of salt.

  • @1MarkKeller
    @1MarkKeller 5 years ago +2

    *GARY!!!*
    *Good Morning Professor!*
    *Good Morning Fellow Classmates!*

  • @virtualinfinity6280
    @virtualinfinity6280 5 years ago +1

    Quite a good video. However, the micro-op sequencing done today in all modern x86 chips has far more dire consequences than the video explains. Conceptually it is explained right: you take a complex CISC instruction, break it down into simple, RISC-like instructions and send those down your internal pipelines to the various execution units.
    However, when those have been executed, you have to "fuse back" the results leaving the CPU as if it were just one CISC instruction that executed. For simple CPUs that is not a problem. However, all modern CPUs are multiple-issue, out-of-order execution designs. That means the CPU fetches multiple CISC instructions from cache, breaks all of them down into RISC-like instructions (micro-op sequencing), sends those down to the execution units to be executed in parallel, and after execution it has to keep track of which RISC micro-op "belongs" to which CISC instruction, fuse them all together in the right way, then reorder the results back into the exact original CISC instruction sequence and write them to cache in order (micro-op fusion).
    Sounds complex? Just wait, it gets worse.
    Interrupts, which cause the CPU to stop whatever it is doing and jump to some different code (the interrupt handler), are delivered precisely. That means an interrupt can even cause an instruction to be halted WHILE it is executing, have the CPU do something else (the interrupt handler), and then resume the instruction after the interrupt has been serviced.
    Keeping track of all of this is very hard. Doing it across the CISC->RISC micro-op translation is way harder still. Today, modern x86 CPUs have a hard time keeping more than 6 CISC instructions "in flight" internally. And with micro-ops you pay a hefty price anyway: it adds at least two pipeline stages to the whole CPU design, which essentially means you need two more clock cycles to execute an instruction. Going above the current "6 instructions in flight" limit would mean adding more pipeline stages to sort out the added complexity of micro-op sequencing and micro-op fusion.
    This is, by the way, one of the key reasons why CPUs haven't gotten significantly faster per core at the same clock frequency.
    If one applied all the manufacturing bells and whistles of the Intels and TSMCs to a massive out-of-order RISC design, it would arguably still be faster than the current high-end CISC designs.
    Finally, the reduced instruction set of a RISC architecture allows for more internal registers. RISC typically has 32 (one of which is hard-wired to 0, for good reasons), while x86-64 has 16.
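
    To make the micro-op discussion above a bit more concrete, here is a tiny, purely illustrative C sketch (a toy model, not any real CPU's decoder): one CISC-style "add register to memory" operation split into the load/add/store micro-ops a front end might emit, each tagged with the parent instruction it has to be retired against.

        /* Toy illustration: a CISC-style "ADD [mem], reg" expressed as three
           RISC-like micro-ops, each tagged with its parent instruction. */
        #include <stdio.h>
        #include <stdint.h>

        typedef enum { UOP_LOAD, UOP_ADD, UOP_STORE } uop_kind;

        typedef struct {
            uop_kind kind;
            int      parent;  /* which original CISC instruction this uop belongs to */
        } uop;

        int main(void) {
            uint32_t mem[1] = { 40 };
            uint32_t reg = 2;

            /* "ADD [mem], reg" broken into micro-ops, all tagged with parent 0 */
            uop program[] = { { UOP_LOAD, 0 }, { UOP_ADD, 0 }, { UOP_STORE, 0 } };

            uint32_t tmp = 0;
            for (size_t i = 0; i < sizeof program / sizeof program[0]; i++) {
                switch (program[i].kind) {
                case UOP_LOAD:  tmp = mem[0];    break;
                case UOP_ADD:   tmp = tmp + reg; break;
                case UOP_STORE: mem[0] = tmp;    break;
                }
                printf("uop %zu (parent %d) retired\n", i, program[i].parent);
            }
            printf("mem[0] = %u\n", mem[0]); /* 42, same result as the single CISC instruction */
            return 0;
        }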

  • @sir.sonnyprof.ridwandscdba227
    @sir.sonnyprof.ridwandscdba227 4 года назад

    What kind of chip architecture do they use for a quantum computer? For example, Samsung will release the Samsung A Quantum phone. What kind of chip architecture is that quantum CPU? Thanks.

    • @GaryExplains
      @GaryExplains  4 года назад

      The Samsung A Quantum isn't a smartphone with a quantum computer, it just has a built-in hardware based random number generator: www.androidauthority.com/samsung-galaxy-a-quantum-1118992/

  • @glitchysoup6322
    @glitchysoup6322 5 лет назад +2

    Can you cover RISC-V CPUs?

    • @GaryExplains
      @GaryExplains  5 лет назад +1

      Yes, I plan to cover RISC-V soon.

  • @soylentgreenb
    @soylentgreenb 5 лет назад

    CISC + RISC-like micro-ops won single-threaded performance on the desktop. CISC is like memory compression and saves a lot of external bandwidth. The failure of Dennard scaling makes this choice less obvious now; sometimes you’d rather have more cores than power-wasting front ends. Especially for portable stuff.

  • @shikhanshu
    @shikhanshu 5 лет назад +1

    such an amazing video!

  • @pixannaai
    @pixannaai 4 года назад

    The best explanation ever. Thanks! keep going!

  • @centuriomacro9787
    @centuriomacro9787 5 лет назад

    I have a question regarding the transistors of a CPU. You said that there are billions on a die. What I want to know is the following: are the transistors already wired up as a fixed number of AND, NAND, OR, XOR gates, or can they be put together to perform a specific logic operation at any time? And how would they do that?

    • @Arthur-qv8np
      @Arthur-qv8np 5 лет назад

      In a conventional CPU the transistors are fixed and their interconnections are fixed too.
      In contrast, in an FPGA you can configure the connections between logic blocks, each built around a Look-Up Table (LUT). But once configured, it's set for the run.

    • @centuriomacro9787
      @centuriomacro9787 5 лет назад

      @@Arthur-qv8np Thanks for your answer. What does "fixed in their interconnections" mean? That they are printed together as logic gates?

    • @Arthur-qv8np
      @Arthur-qv8np 5 лет назад

      @@centuriomacro9787 Exactly :p
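
      As a small illustration of the LUT idea mentioned above (a toy model, not a real FPGA toolflow): a 2-input logic block is just a 4-entry truth table, and "configuring" it simply means filling in that table, after which the same structure behaves as AND, XOR, or any other 2-input gate.

          /* A toy 2-input LUT: a 4-entry truth table indexed by (a<<1)|b.
             "Configuring" the FPGA means filling in tables like this one;
             the same structure then acts as AND, XOR, or any other gate. */
          #include <stdio.h>

          typedef struct { unsigned char table[4]; } lut2;

          static unsigned char lut2_eval(const lut2 *l, unsigned a, unsigned b) {
              return l->table[((a & 1) << 1) | (b & 1)];
          }

          int main(void) {
              lut2 as_and = { { 0, 0, 0, 1 } };  /* configured as AND */
              lut2 as_xor = { { 0, 1, 1, 0 } };  /* same block, reconfigured as XOR */

              for (unsigned a = 0; a < 2; a++)
                  for (unsigned b = 0; b < 2; b++)
                      printf("a=%u b=%u  AND=%u  XOR=%u\n",
                             a, b, lut2_eval(&as_and, a, b), lut2_eval(&as_xor, a, b));
              return 0;
          }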

  • @23wjam
    @23wjam 5 лет назад +1

    A lot of so-called RISC chips have multi-cycle instructions. I think RISC is more of a concept now; a lot of real-world implementations aren't so pure, as the lines blur between RISC and CISC from the RISC side.
    Soon comp sci will redefine what RISC is, like they did with minicomputer. IMO.

    • @AlexEvans1
      @AlexEvans1 5 лет назад

      Yep, pretty much no such thing as a pure RISC or pure CISC architecture. When did they redefine what a minicomputer is? I am aware of a lot of idiots that call things which are smaller than microcomputers minicomputers, but in the field 1) minicomputer was a term only used by part of the field even in the days of say the PDP-11 and 2) has essentially been abandoned.

    • @23wjam
      @23wjam 5 лет назад

      @@AlexEvans1 Well, now "minicomputer" means something between a mainframe and a micro, but in the past a minicomputer was a computer with minimal features.
      The closest thing to the old definition of minicomputer is probably MISC; however, it's not a perfect replacement because apparently a MISC ISA can't have microcode. DEC's PDP-8, a minicomputer with something like 8 instructions, isn't MISC because it has microcode.

    • @AlexEvans1
      @AlexEvans1 5 лет назад

      @@23wjam That seems like a difference that isn't a difference, since you are talking about a time when there simply weren't microcomputers, a definition that *may* have existed before 1970. Most of DEC's PDP series (notably not the PDP-10) were considered minicomputers when they came out. I know of RISC and CISC implementations that didn't use microcode.

    • @23wjam
      @23wjam 5 лет назад

      @@AlexEvans1 The difference is that they aren't minimal anymore, which was previously a fundamental facet of the criteria. I guess how and why that changed is probably down to it being more of a marketing term (maybe? I'm not sure, because it was before my time).
      As regards the no-microcode rule, there are obviously other criteria involved, but in my example the PDP-8 is disqualified from the MISC classification due to microcode, which is idiotic as it's definitely MISC in spirit.

    • @AlexEvans1
      @AlexEvans1 5 лет назад +1

      @@23wjam That, and MISC is a term that wasn't in use at the time, just like CISC architectures weren't referred to that way until the introduction of the term RISC. MISC has generally referred to certain academic architectures, like the one-instruction architectures (for example, subtract and branch if not zero). In any case, RISC and CISC are really abstract notions that few computers fit perfectly.

  • @eterusilvers3919
    @eterusilvers3919 4 года назад

    Wow you are really good at explaining complex stuff! :)

  • @mas921
    @mas921 5 лет назад +4

    I still have to deal with low-level optimizations from time to time, but on GPUs. Which made me think, Professor: in mobile, how dominant (or not) are SIMD instructions vs the usual "logic" RISC instructions? Because when I was watching the video on my Note 8 I thought "RISC instructions are showing the professor on my screen right now lol... oh wait..." ...then I realized there is actually a SIMD video decoder taking care of that! And then there is the GPU for the UI, the DSP for the camera... etc. So I was really intrigued by how much SIMD is actually "running our multimedia-rich mobile world", hence it might not be RISC vs CISC now in 2019 as much as the dominance of SIMD/co-processors ;)

    • @GaryExplains
      @GaryExplains  5 лет назад +3

      Yes I agree, a smartphone has a processor with not only a RISC CPU, but also an FPU, SIMD, cryptography extensions, a DSP, an NPU, plus the GPU. And there is also a separate video processor (decoder) and a separate display processor!
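
      For anyone curious what the SIMD part looks like in practice, here is a minimal sketch in C, assuming an ARM target with NEON and the standard <arm_neon.h> intrinsics: the scalar loop does one add per element, while the NEON loop adds four floats per instruction.

          /* Scalar vs NEON SIMD add of two float arrays. Assumes an ARM
             target where <arm_neon.h> is available. */
          #include <arm_neon.h>
          #include <stdio.h>

          void add_scalar(const float *a, const float *b, float *out, int n) {
              for (int i = 0; i < n; i++)
                  out[i] = a[i] + b[i];                  /* one element per add */
          }

          void add_neon(const float *a, const float *b, float *out, int n) {
              int i = 0;
              for (; i + 4 <= n; i += 4) {
                  float32x4_t va = vld1q_f32(a + i);     /* load 4 floats at once */
                  float32x4_t vb = vld1q_f32(b + i);
                  vst1q_f32(out + i, vaddq_f32(va, vb)); /* 4 adds in one instruction */
              }
              for (; i < n; i++)                         /* scalar tail for leftovers */
                  out[i] = a[i] + b[i];
          }

          int main(void) {
              float a[8] = {1,2,3,4,5,6,7,8}, b[8] = {8,7,6,5,4,3,2,1}, out[8];
              add_neon(a, b, out, 8);
              printf("%.1f %.1f\n", out[0], out[7]);     /* both print 9.0 */
              return 0;
          }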

  • @RayDrouillard
    @RayDrouillard 5 лет назад

    Since modern processors have an internal speed that is several times quicker than their ability to fetch memory, the CISC method of running several processes per instruction has a distinct advantage.

    • @Arthur-qv8np
      @Arthur-qv8np 5 лет назад

      "running several processes per instruction has a distinct"
      I think you're describing a superscalar architecture, not a CISC ISA.
      Some ARM processors are superscalar too (with a RISC ISA).

  • @idesigncentral7855
    @idesigncentral7855 4 года назад

    Thanks for the explanation. Now I know why RISC always seems to perform better than a CISC-based computer. The old Apple PowerPCs were smoother in performance than their current Intel counterparts. It's good to hear that Apple is ditching the Intel chips in favour of the new RISC-based ARM processors. Now maybe we'll get some of the old performance back. I often found the Intel Macs would freeze in the middle of the simplest tasks, whereas this was never an issue with the PowerPC.

  • @vaibhav6982
    @vaibhav6982 3 года назад

    People say a RISC instruction takes only one cycle to execute, but what does the term 'cycle' actually mean here? Is it the clock cycle or the instruction cycle? Please answer, sir.

    • @GaryExplains
      @GaryExplains  3 года назад

      Clock cycle. I have a video about it here: ruclips.net/video/gLsdS0zQ82c/видео.html PS. The whole one instruction per cycle thing is no longer true for CISC or RISC.

    • @vaibhav6982
      @vaibhav6982 3 года назад

      @@GaryExplains But sir, how can an instruction be executed in just a single clock cycle? It has to go through various steps like fetching, decoding and executing, which each require one clock pulse (or clock cycle).

    • @vaibhav6982
      @vaibhav6982 3 года назад

      I mean, if an instruction is to be executed it first has to be fetched, which requires a clock pulse to enable the Program Counter (PC) and set the contents of the Memory Address Register (MAR), and then fetching the operands will require another clock pulse. So how can an instruction be executed in just one clock pulse?

    • @GaryExplains
      @GaryExplains  3 года назад

      I suggest you watch the video, that is why I made it and why I gave you the link.

    • @vaibhav6982
      @vaibhav6982 3 года назад

      @@GaryExplains Sure sir, as you say😊
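
      A rough sketch of the point being discussed above, in C (an idealised model with no stalls or mispredictions): each instruction still passes through fetch, decode and execute, but because the stages overlap in a pipeline, throughput approaches one instruction per clock even though the latency of each instruction is several clocks.

          /* Idealised pipeline model: no stalls, no branch mispredictions.
             Each instruction takes 'stages' cycles of latency, but because
             the stages overlap, n instructions finish in stages + n - 1
             cycles, i.e. roughly one per clock. */
          #include <stdio.h>

          int main(void) {
              const long stages = 3;        /* fetch, decode, execute */
              const long n = 1000000;       /* number of instructions */

              long unpipelined = n * stages;        /* one instruction at a time */
              long pipelined   = stages + (n - 1);  /* stages overlap */

              printf("unpipelined: %ld cycles (%.2f cycles/instruction)\n",
                     unpipelined, (double)unpipelined / n);
              printf("pipelined:   %ld cycles (%.2f cycles/instruction)\n",
                     pipelined, (double)pipelined / n);
              return 0;
          }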

  • @Blue.star1
    @Blue.star1 4 года назад

    Gary, do you know what sort of RAM and L1/L2 cache these RISC processors consume? I don't think CISC will wait a few clock cycles to execute an x86 operand; there are methods to bypass waiting a few clocks...

    • @GaryExplains
      @GaryExplains  4 года назад

      I am not clear on exactly what you are asking. The problem of fetching instructions from memory and the use of caches is the same for RISC and CISC, but there is a disadvantage for systems that use variable-length instructions.

    • @Blue.star1
      @Blue.star1 4 года назад

      @@GaryExplains I meant multiple instructions for RISC instead of using one...

    • @GaryExplains
      @GaryExplains  4 года назад

      So the idea is this. On RISC the instructions are guaranteed to be of a certain size, which means the fetch and decode phase is simpler compared to variable instruction lengths, where the first part has to be fetched and partially decoded to see if more needs to be fetched. As for execution time (in cycles) vs time to fetch the next instruction, these mechanisms are basically decoupled nowadays on both CISC and RISC due to caching, wide pipelines, instruction-level parallelism, etc.
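
      A small sketch of the fixed- vs variable-length point above, using made-up toy encodings (not real x86 or ARM formats): with fixed 4-byte instructions the next instruction's address is known immediately, while with variable-length instructions the first byte has to be fetched and partially decoded just to find the instruction boundary.

          /* Toy instruction formats. Fixed width: every instruction is 4
             bytes, so the next PC is known immediately. Variable width: the
             low 2 bits of the first byte give the number of extra bytes, so
             that byte has to be fetched and examined before the boundary is
             known. */
          #include <stdio.h>
          #include <stdint.h>

          static size_t next_pc_fixed(size_t pc) { return pc + 4; }

          static size_t next_pc_variable(const uint8_t *code, size_t pc) {
              uint8_t first = code[pc];          /* must peek before fetching more */
              return pc + 1 + (first & 0x3);
          }

          int main(void) {
              uint8_t code[] = { 0x03, 0xAA, 0xBB, 0xCC, 0x01, 0xDD, 0x00 };
              size_t pc = 0;
              while (pc < sizeof code) {
                  size_t next = next_pc_variable(code, pc);
                  printf("variable-length instruction at %zu is %zu bytes\n", pc, next - pc);
                  pc = next;
              }
              printf("fixed-length: instruction at 0 ends at %zu, no peeking needed\n",
                     next_pc_fixed(0));
              return 0;
          }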

    • @Blue.star1
      @Blue.star1 4 года назад

      @@GaryExplains Instead of running code line by line we should use multiple FPGAs and CPUs to decode instructions and run basic OS files. We have maturity in hardware and software; we should implement common OS system files inside the chip... the problem is it heats up.

    • @GaryExplains
      @GaryExplains  4 года назад

      I don't really know what you mean by "instead of running code line by line" or what you mean by using multiple CPUs to decode instructions. Of course, heat (which equals energy spent) is THE problem.

  • @sureshkhanal3801
    @sureshkhanal3801 5 лет назад +3

    Knowledgeable ❤.

  • @Ko_kB
    @Ko_kB 2 года назад

    Great explanation.

  • @C.Zacarias-Main
    @C.Zacarias-Main 3 года назад

    I'm glad that RISC processing is now being used in tablets and smartphones. What will happen when desktop and laptop PCs use RISC processing?

  • @xCaleb
    @xCaleb 3 года назад +1

    So is there any benefit for Intel to still be technically using CISC in their x86 chips?

    • @GaryExplains
      @GaryExplains  3 года назад +1

      x86 is CISC, so Intel and AMD have no choice but to use CISC.

  • @dogman2387
    @dogman2387 5 лет назад

    Great video!

  • @kristeinsalmath1959
    @kristeinsalmath1959 5 лет назад

    If I understood correctly, the heat generated by CISC chips is caused by the complexity of the microcode, while RISC chips take less effort to perform one simple instruction?

    • @Waccoon
      @Waccoon 5 лет назад

      Microcode and nanocode have been around forever. Modern CISC can re-order the nanocode before caching and execution, while RISC tends to still be hard-coded. Ultimately, RISC vs CISC is a checklist of design features, and there's no hard line of separation between the two. Hence, why there's so much confusion.
      Both styles of chips have huge amounts of redundancy and parallelism. The only "real" difference between CISC and RISC these days is that CISC can combine memory access with a computation (orthogonal instructions), while RISC strictly separates the two (load/store instructions).

  • @pallavprabhakar
    @pallavprabhakar 4 года назад

    Amazing video!!!

  • @DaywaIker
    @DaywaIker 5 лет назад

    Thankyou Gary!

  • @fk319fk
    @fk319fk 5 лет назад

    For RISC, the best example is video cards...
    My question is: what happened to VLIW? It seems you compile once and have plenty of time to do so, so that seems like the best time to optimize.

    • @Arthur-qv8np
      @Arthur-qv8np 5 лет назад

      VLIW (Very Long Instruction Word) just groups together a fixed number of "small" instructions into a "big" one (a "very long" one), also called a bundle. When you execute that "big" instruction, you execute all the "small" instructions in parallel, each of them assigned to a different compute unit. The unit assignment must be done at compile time; this process is called "static scheduling".
      That's the opposite of "dynamic scheduling", where the units are assigned at runtime within the CPU, which is what we call a superscalar processor (like x86 CPUs, ARM, RISC-V, ...).
      "Static scheduling" reduces CPU complexity by giving the compiler the workload of scheduling.
      But building an efficient compiler for static scheduling is really challenging for a general purpose processor (like the one you use to browse YouTube).
      This is why most general purpose processors are superscalar and not VLIW. But the VLIW architecture is still used in DSPs (Digital Signal Processors), for example.
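
      To illustrate the bundle idea described above, here is a rough C sketch of a made-up 3-slot VLIW format (not any real ISA): the compiler fills the slots statically, each slot goes to its own execution unit, and any slot the compiler couldn't fill is simply wasted, which is part of why efficient VLIW compilers are hard to write.

          /* A made-up 3-slot VLIW bundle: one slot per execution unit.
             The compiler fills the slots statically; at run time each slot
             goes straight to its unit with no dynamic scheduling hardware. */
          #include <stdio.h>

          typedef enum { ALU_NOP, ALU_ADD } alu_op;
          typedef enum { MEM_NOP, MEM_LOAD } mem_op;

          typedef struct {
              alu_op alu;      /* slot for the ALU unit */
              mem_op mem;      /* slot for the load/store unit */
              int    branch;   /* slot for the branch unit: 0 = fall through */
          } bundle;

          int main(void) {
              /* "Compile time": the static scheduler packed these bundles. */
              bundle program[] = {
                  { ALU_ADD, MEM_LOAD, 0 },  /* both units busy this cycle */
                  { ALU_ADD, MEM_NOP,  0 },  /* nothing for the memory unit: wasted slot */
                  { ALU_NOP, MEM_NOP,  1 },  /* only the branch unit has work */
              };

              for (size_t i = 0; i < sizeof program / sizeof program[0]; i++) {
                  /* One cycle: every slot issues to its own unit in parallel. */
                  printf("cycle %zu: alu=%s mem=%s branch=%d\n", i,
                         program[i].alu ? "ADD" : "nop",
                         program[i].mem ? "LOAD" : "nop",
                         program[i].branch);
              }
              return 0;
          }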

  • @onisarb
    @onisarb 5 лет назад

    We find new information yet again!

  • @Faraz-cse
    @Faraz-cse 3 года назад +2

    I am coming back here after ARM V 9 😅

  • @Dsnsnssnsnsjej
    @Dsnsnssnsnsjej Год назад

    Thank you. 👍🏻

  • @keiyakins
    @keiyakins 4 года назад

    I honestly kind of hate the thing modern x86 processors do. It being impossible to know what the computer is *actually* doing... it just bothers me. And then there's the security problems introduced by stuff like branch prediction...
    Then again I'm a huge fan of the approach Microsoft research was taking in Singularity and Midori of keeping code in a level where it can be reasoned about reliably until delivery to the computer it'll run on, which does the same thing but is more explicit about it because it's at the OS layer.

  • @madmotorcyclist
    @madmotorcyclist 4 года назад

    Thermals will be the deciding factor in the end as the scale of production shrinks (RISC has the advantage here). Also, the fastest supercomputer in the world is RISC-based.

  • @BruceHoult
    @BruceHoult 5 лет назад

    While original MIPS and SPARC had branch delay slots, it was quickly realised this is a bad idea as it optimises for just one microarchitecture and doesn't help later machines at all. So MIPS has quietly dropped it and Power(PC) and Alpha and ARM and RISC-V have never had it. Once branch prediction got good there was absolutely no point to it anyway.

    • @Arthur-qv8np
      @Arthur-qv8np 5 лет назад

      Instead of branch prediction you can use hardware loop buffering (for loops) and predicated instructions (for if statements) to get time predictability and low power features.
      But obviously that's not so great for general purpose computing ^^

  • @bpark10001
    @bpark10001 5 лет назад

    I think you may be confusing CISC/RISC with Harvard/von Neumann. In order for instructions to operate in 1 cycle (with no branch), there must be parallel memory buses (for program and data, along with a separate internal stack as a minimum). For example, to push 1 byte onto the stack, the op-code must be read from memory and the byte written to the same memory. That can't happen in a von Neumann architecture. In a Harvard architecture, the stack has a dedicated memory that can be read/written at the same time an op-code is fetched from program memory (and the operand from data memory).
    The problem with the Harvard architecture is that the sizes of the various memory banks (stack, program memory, data memory) are limited by the number of bits allocated to them in the (usually very wide) instruction, and can't be changed on the fly to suit an application. So if an application needs more stack memory but little data memory, the entire architecture must be demolished and re-designed. On the other hand, in a von Neumann architecture all the memory is contiguous, and the only limit on its size is the number of bits in the program counter and the address bus width. The only ARCHITECTURE changes needed to accommodate growth would perhaps be more addressing modes for some of the instructions (along with some heinous "paging" modes), but the fundamental architecture remains unchanged. This is the reason the earlier x86 architecture (originally 8-bit data width and 24-bit address bus width) survives: it was easy to extend. (If you want to see the nightmares the Harvard architecture causes, look at the history of Microchip's PIC processor line: stacks only 2 levels deep, subroutine calls that could only go to "stripes" of memory, etc.)
    An interesting architecture to look at is the TI430 processor. The design ingeniously provides complex "instructions" by changing parameters on a more fundamental instruction (leaning in the direction of your "one instruction" CPU).

    • @GaryExplains
      @GaryExplains  5 лет назад

      No, I am not making that confusion. To get around the problem you describe, RISC processors use a multi-stage pipeline. For RISC-1 it was a two stage pipeline, hence the delayed branch thing.

    • @bpark10001
      @bpark10001 5 лет назад

      @@GaryExplains ...but there is no way that any instruction that accesses memory (which ALL of yours do by definition) can execute in 1 cycle, with a pipeline OF ANY LENGTH. An instruction can never be faster than the number of memory accesses required (double that for any read-modify-write accesses). Your instructions require 3 memory reads to get the "instruction" (3 addresses) unless your data width is 3 times the address-bus size (that would be a pretty crazy machine with an oversize data width and an undersize address bus width). Even if that were true, it would "just" get the instruction into the CPU in 1 cycle. Then at least 1 more cycle would be needed to write out the result (you have no instruction that is register-only, since there are no registers other than the PC). For a machine with a 2-byte address bus, the data (and memory) bus would need to be 6 bytes wide. All this could "guarantee" 2 cycles MINIMUM per instruction.
      In your simulations, what are the data/address widths?

  • @WinterCharmVT
    @WinterCharmVT 5 лет назад

    ARM is the future. People just don't really see it yet but the writing is on the wall. Apple's SoCs are ridiculously powerful, and they have an amazing architecture on RISC... It's nearing the speed of desktop CISC processors and using teeny amounts of power.

    • @dlwatib
      @dlwatib 5 лет назад

      Intel has an amazing catalog full of low power x86_64 chips. My new 4 core 3.2 GHz i3-8300T with 8 MB SmartCache is rated at only 35 watts TDP. It doesn't need a fan, and the case is barely warm to the touch.

  • @gordonlawrence4749
    @gordonlawrence4749 5 лет назад

    One other advantage of RISC is how many gates it takes up on an FPGA. OK, there are FPGAs with a bazillion gates, but there are still FPGAs with less than 50k gates too. For an SoC on something that small you have to go CISC and stuff a bit of RAM and ROM around the outside.

  • @infopackrat
    @infopackrat 4 года назад

    Why can't the CPU be both RISC and CISC today? Then it's up to the compiler or programmer to choose how to use those instructions to best optimize the program.

    • @erikengheim1106
      @erikengheim1106 4 года назад

      Because silicon costs money. If you try to be both at the same time, that will cost silicon, which means there will be less space for, say, cache or other things. It will also push up the cost.
      And in a way, RISC and CISC processors are already both in many ways. ARM has a whole bunch of CISC-like instructions. High-performance ARM CPUs also actually turn their instructions into microcode, just like a CISC processor. And of course CISC processors are RISC-like, since they turn their instructions into microcode, which is RISC-like.
      The difference is that they both use techniques from the other one where it makes sense for how they already operate.

  • @Alan_Skywalker
    @Alan_Skywalker 2 года назад

    The problem now is that RISC needs a lot more instructions to do the same thing, which means it leans harder on the memory subsystem, in both latency and bandwidth. It's no problem inside the CPU though; that's why CPUs always break CISC instructions down into micro-ops in the pipeline.

  • @TheRojo387
    @TheRojo387 2 года назад

    I heard that CISC still has its use as a way to compress programs into a format that can be inflated back into a RISC format for simpler hardware to execute. I believe VLIW would do this even more so.

  • @louiscouture9139
    @louiscouture9139 5 лет назад

    Good explanations, I like it.

  • @PsychoticusRex
    @PsychoticusRex 5 лет назад +2

    Is operable memory still a thing? Where instead of reading the memory first, you just send the number to add or subtract to the memory and it does the simplest math on it, since the CPU doesn't care what the result is at that point. It would introduce a lot of streamlining to CPU-memory communication and inter-thread communication, preventing unnecessary blocks.

  • @sybaseguru
    @sybaseguru 4 года назад

    Great explanation, thanks. Not sure about your conclusion though. The Ryzen 3600 proves that you can get fantastic performance without massive power requirements. RISC keeps sticking its head up and gets it chopped off very quickly. It's probably cheaper and quicker to design and fabricate, so it uses the latest technology, whilst CISC is a generation behind but so much quicker. DEC, IBM, Intel, MIPS, Sun, HP all tried in their heyday and got trashed. The only time RISC wins is when very low power is the first, second and third priority.

  • @WizardNumberNext
    @WizardNumberNext 5 лет назад

    There was a single architecture which did every single instruction in a single clock - MIPS (but it lacked division and possibly multiplication too).
    All the other RISCs did not do all instructions in a single clock.

    • @AlexEvans1
      @AlexEvans1 5 лет назад

      Early versions of SPARC also did this. They multiplied using the mulscc instruction. For a 32 x n multiply, you would execute the instruction n times.

  • @teh_hunterer
    @teh_hunterer 4 года назад

    Naive and basic question: why is CISC still used for new software, then? Why wouldn't newly developed x86 software and the x86 versions of Windows/macOS be written so they bypass the CISC-to-RISC translation layer and just execute directly as RISC code? Is there something stopping this? It seems like you'd keep the CISC hardware on Intel and AMD chips for legacy purposes but have all new software skip all that CISC stuff... hope someone can explain this to me.

    • @GaryExplains
      @GaryExplains  4 года назад

      Because x86 is the legacy and backwards compatibility is essential. That is why the Itanium failed. For Intel to invent a whole new instruction set (a RISC one) that has to work side by side with the old one would be a disaster.

  • @centuriomacro9787
    @centuriomacro9787 5 лет назад

    RISC vs CISC is so exciting. I would never have thought that there are different types of instruction sets and that they have such a big impact on the device.
    I'm especially looking forward to Apple's ARM designs and how Intel is able to respond.

    • @hermanstokbrood
      @hermanstokbrood 5 лет назад +1

      ARM is designed by... ARM. Apple only customizes it, like Qualcomm, Huawei, and Samsung do.
      Qualcomm and Huawei have closed the gap since they also build on 7nm, like Apple already did.

    • @wanmaziah9835
      @wanmaziah9835 5 лет назад

      @@hermanstokbrood lol, Huawei did not customize their ARM CPU...

    • @hermanstokbrood
      @hermanstokbrood 5 лет назад +1

      @@wanmaziah9835 ROFL Yes they are. ARM delivers the building blocks. They all do their own stuff with it.

    • @wanmaziah9835
      @wanmaziah9835 5 лет назад

      @@hermanstokbrood What did you mean by ROFL?

    • @wanmaziah9835
      @wanmaziah9835 5 лет назад

      @@hermanstokbrood What is the difference between semi-custom and full custom? As we can see with Qualcomm chips, the SD 835 rather than the SD 820/821, because Qualcomm made a full custom core in the SD 820 and 821... and I think Mongoose was also a full custom core, not semi, like Samsung does... Is that true?? 🤔🤔

  • @Joe-ij6of
    @Joe-ij6of 4 года назад +3

    10:50 So... x86 takes compiled code and uses a hardware-implemented interpreter? Just so it can run on some sort of underlying RISC ISA? No wonder Apple is jumping ship!

  • @Tapajara
    @Tapajara 2 месяца назад

    In the early days all processors were RISC. All developers were beginners. It was only later when designers got good enough that they could develop CISC. But those developers were few and they were closed source. Now the world of processor development has opened up to the general public. They are beginners like the few were in the early days. So it will take some time before the open source people are good enough to start developing CISC processors. They require more sophisticated engineering but have better theoretical performance and are much more efficient with memory. It is inevitable unless the industry collapses.