Decoding ALU Micro-Ops - Superscalar 8-Bit CPU #33

Fabian Schuiki

1.8K views

Published: 18 Sep 2024

Comments • 33

@Artentus 6 months ago ⁺⁶
Negating with carry is actually a useful, although probably rare, operation. It allows chaining negation like addition and subtraction.
@fabianschuiki 6 months ago ⁺⁶
Oh that's an excellent point. That would be a useful thing to have, for wide negations. Luckily, with the ALU op decoder, it's pretty easy to go and add such an op 🙂. I'll probably have to upgrade to 5-bit ALU opcodes soon enough 😁
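To make the "wide negation" idea concrete, here is a minimal Python sketch of how a negate-with-carry op could chain across bytes. The op name `neg_with_carry` and its exact definition (`~byte + carry_in`) are assumptions for illustration, not the ALU op from the video.

```python
def neg_with_carry(byte, carry_in):
    """Hypothetical 8-bit op: computes ~byte + carry_in and returns (result, carry_out)."""
    total = (~byte & 0xFF) + carry_in
    return total & 0xFF, total >> 8


def negate_wide(value_bytes):
    """Negate a little-endian multi-byte two's-complement number, one byte at a time."""
    carry = 1  # a plain NEG on the lowest byte is ~x + 1
    out = []
    for b in value_bytes:
        result, carry = neg_with_carry(b, carry)
        out.append(result)
    return out


# 16-bit example: 0x0100 (256), stored low byte first
print([hex(b) for b in negate_wide([0x00, 0x01])])
```

The carry out of each byte feeds the next one, exactly like add-with-carry does for wide additions. For the input above, the low byte stays 0x00 and the high byte becomes 0xFF, which is 0xFF00, or -256 in 16 bits.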
@dmoisset 2 months ago ⁺¹
A cool detail about ATF16V8s is that they have internal pull-ups on pins, so it's actually correct to leave inputs unconnected. Pins will float up and won't suffer from noise randomly toggling your FETs and eating power. Those are described in section 7 of the datasheet.
@fabianschuiki 2 months ago
Great point! Feels a lot like the 74LS series of logic with all the high-side pullups 🙂. I'm still not entirely sure whether I want to have the operand data buses pulled low by default, which would mean that I'd have to add a buffer in front of the ALU inputs. Otherwise I could also let the ALU pull RD1 and RD2 high through the ATF16V8's pullups.
@OscarSommerbo 6 months ago ⁺³
The strange/useless operations that were eliminated are what were called "illegal op-codes" in many 80s 8-bit micros. Most were useless, but some were useful and could shave off a clock cycle when used creatively. Fabian's approach is of course valid and "more correct", but the inclusion of the illegal op-codes in early home computers is an interesting anecdote.
@fabianschuiki 6 months ago ⁺³
Yes we definitely miss out on those quirky but useful undocumented instructions like this 🙁. You might still get them if you don't use a ROM for decoding, but some logic instead, because these ops often hide in optimizations to the decoder logic. The problem with my approach is that I'm using an ALU decoder, plus an instruction decoder later. So the illegal ops would have to survive two steps of decoding, which gets very unlikely. I was planning to have the instruction decoder throw an Illegal Instruction exception for every unknown bit pattern, but you're making me reconsider that 😃
@OscarSommerbo 6 months ago ⁺²
@fabianschuiki Stopping the user from accessing undefined op-codes is a very modern idea, and is a way to ensure uniformity. I believe that the home computers simply skipped that logic to save on logic gates. Either way you go will be interesting, but I would probably include the function to block access to the undefined op-codes.
@fabianschuiki 6 months ago ⁺¹
👍 It would be cool if a small operating system would abort your process if it encounters any illegal instructions 😏
@eryksoowiej4427 5 months ago ⁺²
@fabianschuiki I have an interesting idea that solves this problem quite gracefully. What if you were to add a special instruction that saves a byte of data from program memory to an internal (read-only) register (or a FIFO queue of them) to store one or more ALU "user functions"? The programmer could then use another new instruction to bypass the ALU decoder and use the contents of that register to control the ALU instead. That way, the program has minimal overhead when executing these functions, but the programmer has access to *ALL* of the ALU functionality (even if not to all of it at the same time).
@fabianschuiki 5 months ago ⁺¹
Oh that is a clever idea. So you'd essentially make the programmer create their own equivalent of instruction prefixes in x86: they push ALU state into that queue, and then issue a generic opcode to pop from that queue and execute any user-defined function. That's a cool idea.
@TheMason76 6 months ago ⁺¹
Great video. Can't wait to see the next one(s)
@fabianschuiki 6 months ago
Thanks! 🙂
@newklear2k 6 months ago ⁺¹
You have big Ben Eater vibes. That's a compliment.
@fabianschuiki 6 months ago ⁺¹
Thanks 🙂
@schrodingerscat1863 6 months ago ⁺³
In simple mode the PLD is similar to an EPROM, and you could have used an EPROM for the implementation. I remember using similar PLDs back in the '80s when I was at university; back then they were very new and rarely used.
@fabianschuiki 6 months ago ⁺¹
Yes I agree, an EEPROM would have definitely worked here. And it would be strictly more powerful, because the EEPROM can store any distinct bit pattern for every input. That allows it to represent *any* boolean function, not just the ones which have a DNF with a limited number of terms.
The reasons I went with the PLD were that they are a lot smaller (I can't find any DIP EEPROMs that aren't huge, 30+ pin wide DIPs), and I wanted to try this particular breed of programmable logic 🙂. Going to use an EEPROM for the instruction decoding for sure though 😀
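The difference between a lookup table and a DNF with a limited number of terms can be illustrated with parity (XOR of all inputs), a classic worst case for sum-of-products logic. The sketch below is in Python; the budget of 8 product terms per output is an assumption about the PLD, so check the datasheet for the actual figure in each mode.

```python
from itertools import product

PRODUCT_TERMS_PER_OUTPUT = 8  # assumed PLD budget, not taken from the video


def minterms(func, n_inputs):
    """All input combinations for which func is 1 (the ROM rows that hold a 1)."""
    return [bits for bits in product((0, 1), repeat=n_inputs) if func(*bits)]


def parity(*bits):
    return sum(bits) % 2


def mergeable(terms):
    """True if any two minterms differ in exactly one bit and could share a product term."""
    return any(
        sum(x != y for x, y in zip(a, b)) == 1
        for i, a in enumerate(terms)
        for b in terms[i + 1:]
    )


for n in (3, 4, 5):
    terms = minterms(parity, n)
    fits = len(terms) <= PRODUCT_TERMS_PER_OUTPUT
    print(f"{n} inputs: {len(terms)} minterms, mergeable={mergeable(terms)}, fits={fits}")
```

Because no two parity minterms are adjacent, none of them can be merged, so the DNF needs one product term per minterm: 2^(n-1) terms for n inputs. Under the assumed budget, 4-input parity just fits and 5-input parity does not, while a ROM only needs 2^n rows regardless of which function is stored.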
@schrodingerscat1863 6 months ago ⁺²
@fabianschuiki Yes, EPROMs are much larger packages for sure, and much slower too. I have used write-once PROMs in the past for very fast operation, but they are of course not reprogrammable, so not ideal for tinkering. PLDs are worth experimenting with: they can replace a lot of discrete logic, and are cheap and fast.
@fabianschuiki 6 months ago
One downside they have is a lack of miniaturization. They do exist in some SMD packages, but I don't think you can do proper in-system programming with them 😕. So you're stuck with the DIP or maybe the square-ish bent-pin package.
@schrodingerscat1863 6 months ago ⁺²
@fabianschuiki They are not programmable in system at all, much like EPROMs. They are from an era when the idea of in situ programming or remote updates wasn't a thing at all. For more modern packaging, CPLDs are your only option, but these are way more complex devices that are much more difficult to program and way more expensive.
@fabianschuiki 6 months ago
Yeah... At that point you could just build a big CPU in an FPGA and call it a day. But where's the fun and blinking LEDs in that? 😉
@mrengstad 6 months ago ⁺²
Arithmetic right shift of negative numbers isn't exactly like division by 2. Try -1 and you get -1, and the same goes for -3: you get -2, not -1. It always rounds down, so -0.5 -> -1, -1.5 -> -2. This could be fine, and often is, but it is something to be aware of.
@fabianschuiki 6 months ago ⁺¹
Yes, that is a great point! It always rounds down towards negative infinity, while you likely would expect it to round towards zero, seeing that 1/2 goes to 0 as you mention. Thanks! 🙂
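The rounding difference is easy to check with a small Python model of an 8-bit arithmetic shift right. The helper names below are made up for this sketch.

```python
def to_signed8(x):
    """Interpret the low 8 bits of x as a two's-complement value."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def asr8(x):
    """Arithmetic shift right by one on 8 bits: the sign bit (bit 7) is kept in place."""
    x &= 0xFF
    return (x >> 1) | (x & 0x80)


def div2_toward_zero(value):
    """Division by 2 that truncates toward zero, like C's integer division."""
    return int(value / 2)


for v in (-3, -1, 1, 3):
    print(f"{v:3d}: asr -> {to_signed8(asr8(v)):3d}, div by 2 -> {div2_toward_zero(v):3d}")
```

For -3 the shift gives -2 while truncating division gives -1, and for -1 the shift gives -1 while division gives 0. For non-negative inputs the two agree.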
@JaenEngineering 6 months ago ⁺²
Cool little device. Definitely more elegant than using an EEPROM. I'm guessing the plan next will be to replace the XOR based inverter circuit with a full multiplexer based Logic Unit?
@fabianschuiki 6 months ago ⁺¹
Yes exactly! I haven't really found a better approach than the implementation with muxes. Which is sad, because the muxes are very wasteful due to them having two 4-way muxes sharing one set of control lines. You could use more of these 16V8 PLDs, and handle maybe 2 bits per package, but you'd still need 4 chips, and they are about 1.5x the size of a mux chip. Doesn't feel like a real improvement.
@JaenEngineering 6 months ago ⁺²
Not to mention cost. Those PALs are quite expensive compared to a cheap muxer. It does also open up the possibility of performing two bitwise logic operations simultaneously by feeding each of the 2 sets of inputs a different "truth table". Not sure if that could be of any use...?
@fabianschuiki 6 months ago
Haha great point, you could derive two separate results from the same input bits. I can't think of any use of this on the spot, but maybe there is one!
@eryksoowiej4427 6 months ago ⁺²
@fabianschuiki I might be a bit late to say this, but you mustn't have read that datasheet too carefully, as you totally *CAN* fit *4* *BITS* worth of muxing (for the logic unit) inside of *ONE* ATF16V8B! In "Simple Mode" the IC treats pins 1 (I/CLK) and 11 (I9/*OE) as regular inputs, and pins 12-14 and 17-19 are *BIDIRECTIONAL*. So for example you could use pins 1-4 for 4 bits of LHS input, pins 5-8 for 4 bits of RHS input, pins 11-14 for 4 bits of control signals (since they are shared between muxes in the original design anyway), and last but not least pins 15-18 for the 4-bit result. You would still have pins 9 (input only) and 19 (input/output) free for expansion, with all bits of each 4-bit nibble neatly arranged in order, next to one another!
@fabianschuiki 6 months ago
@eryksoowiej4427 This is a brilliant idea. 🎉 It hadn't crossed my mind that I could trade some of the output pins for additional input pins. As you suggest, I could handle 4 bits with a single PLD chip, such that 2 chips could do it all. And the best part is that I don't really need all 16 possible logic functions, but only 6-8 of them. So instead of accepting a 4-input truth table, I could make the PLDs accept a 3-bit opcode instead. That would save me an output at the ALU decoder PLD, which simplifies a few things further down the road. Thanks for the exciting idea! 🎉🥳😃
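As a rough behavioral model of the mux-based logic unit discussed in this thread, the Python sketch below selects each result bit from a 4-bit truth table using the two operand bits as the select lines. The bit ordering of the table and the 3-bit opcode encoding are assumptions for illustration, not the encoding used in the actual design.

```python
def logic_unit(lhs, rhs, table, width=4):
    """Bitwise op chosen by a 4-bit truth table.

    Bit (2*a + b) of `table` is the output for operand bits a (lhs) and b (rhs),
    which is what a 4-way mux with a and b on its select lines computes.
    """
    result = 0
    for i in range(width):
        a = (lhs >> i) & 1
        b = (rhs >> i) & 1
        result |= ((table >> (2 * a + b)) & 1) << i
    return result


# Truth tables for a few of the 16 possible two-input functions
AND, OR, XOR, NOT_LHS = 0b1000, 0b1110, 0b0110, 0b0011

# Hypothetical 3-bit opcode encoding that picks one of up to 8 truth tables
OPCODE_TO_TABLE = {0b000: AND, 0b001: OR, 0b010: XOR, 0b011: NOT_LHS}


def logic_unit_op(lhs, rhs, opcode, width=4):
    """Same unit, but driven by a 3-bit opcode instead of a raw truth table."""
    return logic_unit(lhs, rhs, OPCODE_TO_TABLE[opcode], width)


print(format(logic_unit_op(0b1100, 0b1010, 0b010), "04b"))  # XOR of the two nibbles
```

With the opcode variant, the decoder only has to drive 3 control lines instead of 4, and the mapping from opcode to truth table moves inside the PLD equations.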
@tmbarral664 6 months ago ⁺¹
tiny thing I saw, which I was doing before I was caught by a prof :)
it's an op with 4 bits. And it's a 4-bit bus. Key is the hyphen here, making a word thus no need for plural ;)
@fabianschuiki 6 months ago
😀👍
