Creating Your Own Programming Language - Computerphile

Поделиться
HTML-код
  • Опубликовано: 20 дек 2024

Комментарии • 577

  • @sundhaug92
    @sundhaug92 Месяц назад +1038

    Made a small one in my teens, a commenter wrote on sourceforge I had no idea what I was doing ... they were right

    • @xlerb2286
      @xlerb2286 Месяц назад +100

      Don't let that stop you. The next one you'll know more, and the 3rd one even more :)

    • @GorgioFernen
      @GorgioFernen Месяц назад

      @@xlerb2286 its not like hes a teen right now.

    • @KT-dj4iy
      @KT-dj4iy Месяц назад +62

      You probably knew more than 99% of other teens. Or other people for that matter. But the commenter? Well he (oh, it was a he, for sure) sounds like an absolute genius.

    • @gabef9538
      @gabef9538 Месяц назад +31

      Make a worse one

    • @anotheruser2527
      @anotheruser2527 Месяц назад +12

      Was it Python?

  • @firaskallel5848
    @firaskallel5848 Месяц назад +807

    Proving the Turing completeness of this programming language is left as an exercise to the viewer.

    • @brandonmack111
      @brandonmack111 Месяц назад +24

      It might be. It is definitely surprising what you can do with just that while loop, for sure 😁
      For example it can be used to make a rather clunky sort of if:
      ...
      if = x

    • @TheMaginor
      @TheMaginor Месяц назад +21

      I'm pretty sure it isn't. You can't allocate arbitrarily sized working space, and it doesn't have scope local variables or recursive function calls, so it would be difficult to get around that limitation.

    • @TheMaginor
      @TheMaginor Месяц назад +19

      @@brandonmack111 The problem is that it can only store data in a finite set of variable names that are determined at program start. For a general Turing machine, it needs to be able to grow its working space dynamically depending on the program input (and how much that is can't in general be determined without running the program on that input). I think that may be the only thing that is missing though.

    • @TheOriginalJohnDoe
      @TheOriginalJohnDoe Месяц назад +3

      Just proved it, easy.

    • @NStripleseven
      @NStripleseven Месяц назад +10

      @@TheOriginalJohnDoeproof left as an exercise to the commenter

  • @acquite
    @acquite Месяц назад +252

    i wrote a compiled imperative language in rust :) writing a compiled language is probably the most educational project that exists ever, because you learn (on a deep level):
    - how memory works (including pointers, allocator kinds aka stack, region, heap, temp, gc, etc, how stack frames work)
    - how compilers work (lexer -> parser -> pre-IR for semanalysis -> compilation to IR -> asm -> object code -> linked into executable)
    - how to maintain a larger scale project (with potentially 10k+ lines of code)
    - how to structure a larger scale project (with project file names and splitting of code into different files/functions etc)
    - best practices in terms of data structures to use (Array vs HashMap vs HashSet, null terminated string vs sized string, etc)
    - abstract syntax trees and/or grammar (including different parsing algorithms for operator precedence if you write your own parser (as you should) such as recursive descent or shunting yard)
    - how to debug potentially thousands of lines of code,
    - how modern language features actually work under the hood (for example, loops and conditionals being compiled into just jumps, monomorphization vs dynamic dispatch for generics on structs and/or functions, capturing lambdas and their environments, etc etc)
    - how to use tooling to your advantage
    - how to effectively test your code
    among others. i think there is not a project that ticks more boxes at once than writing a compiled language. plus its really fun seeing it slowly expand and being able to do more things over time :3

    • @HumaniNihil-c8k
      @HumaniNihil-c8k Месяц назад +11

      Do you have any recommended resources for a project like this? I'm currently going through the 'Writing An Interpreter In Go' book (which I will follow with its sequel 'Writing A Compiler In Go') and I was wondering if you had some more insights. Thanks! :)

    • @MarshalLeigh1911
      @MarshalLeigh1911 Месяц назад +1

      I'm just commenting in case you answer the fella above me's question

    • @ANT-jm4qx
      @ANT-jm4qx 29 дней назад

      @@HumaniNihil-c8k "Crafting Interpreters" is free online and it goes writing an implementation of the Lox language in both Java and C.

    • @rosuav
      @rosuav 29 дней назад

      @@HumaniNihil-c8k I'd recommend just looking into the parser part first. Start by poking around with existing languages (eg Python's "ast" module) and learn about how a stream of source code gets tokenized, then converted into a syntax tree, and finally executed. Most languages "execute" by converting into a series of instructions (machine code, or a higher level bytecode), but for learning purposes, directly interpreting the Abstract Syntax Tree is a lot easier to get your head around.
      Design your own language with a very simple grammar. Parse it into its corresponding source tree. Then run it by taking the root node of the source tree and recursively doing what it says! Mastering that will give you a great insight into how programming languages work.

    • @brockbrumley95
      @brockbrumley95 29 дней назад

      ^

  • @sebastiantomasalvarez
    @sebastiantomasalvarez Месяц назад +91

    I like to follow, and implement when possible, this way of sharing knowledge when introducing a topic:
    - No frameworks
    - No add ons
    - Code something practical, pause, ask questions, implement the answers.
    Very nice explanation Dr. Tratt

  • @regiondeltas
    @regiondeltas Месяц назад +133

    A fun excercise - I always thought such things would be purely academic, but I wound up writing a custom sort of query language for a work project. It was to allow non or less technical staff to write their own custom tests for some tooling I'd created. Devilishly hard to actually get right end to end - huge amounts of considerations around data types, parsing, things like brackets, ands ors. But I got there and it works well.

    • @jakistam1000
      @jakistam1000 Месяц назад +4

      I mean... that's nice, but wouldn't it be easier to actually teach them the very basics of a high level language? Maybe write a custom library that's easy to use, but still most of heavy lifting done by the tried and tested language?

    • @SimonBuchanNz
      @SimonBuchanNz Месяц назад +22

      ​@@jakistam1000in the experience of everyone that's tried, no, absolutely not.

    • @Chex_Mex
      @Chex_Mex Месяц назад +16

      ​@@jakistam1000Nope, not at all. There's a reason DSLs or Domain Specific Languages are very useful.
      Even for someone who programs, that can be very useful. If you've ever used SQL or jq to query a database or JSON, you've used a DSL

    • @tharsisharmonia9316
      @tharsisharmonia9316 Месяц назад +8

      @@jakistam1000 Even if that were the case ... why not snaffle some $$$ to do a fun project? Gotta live a little in this life.

    • @jakistam1000
      @jakistam1000 Месяц назад +1

      @@Chex_Mex While I have some VERY basic knowledge of SQL, I mostly employ the approach "there's always a Python library for this". Maybe it's a habit, maybe a bad one (?), maybe just the consequence of no need to optimize for speed - but it's kinda working for me so far. (I'm still learning, though - I used to write everything myself. At work I transitioned somewhat out of necessity, but in my off time I still prefer that, even if I get worse code.)
      The point is, it's difficult for me to believe that writing a programming language is a better solution than teaching an existing one. You're always going to have more tools and flexibility at your disposal, at relatively low cost (imo). I mean, if you say that DSLs are "better" for some applications, I believe it, but I don't really understand it.

  • @Ice_Karma
    @Ice_Karma Месяц назад +201

    Q: Why do programmers have such big egos?
    A: So there's something left after the compiler gets done telling them how bad they are at their job!

    • @muzzletov
      @muzzletov Месяц назад +21

      well, i write the compiler. and since i do, it only tells me nice things.

    • @theewizaard
      @theewizaard Месяц назад +6

      @@muzzletov motivated to start working on mine.

    • @TheMaginor
      @TheMaginor Месяц назад +26

      @@muzzletov "There is a wonderful happy little accident at line 102. I am sure you meant to put a close parenthesis instead of a closed square bracket, didn't you now? Don't worry, this sort of mistake happens to everybody! Let's not even call it a mistake, it was just a twitch of the finger. You are such a great programmer, I really mean it!".

  • @benoitb.3679
    @benoitb.3679 29 дней назад +41

    6:40 "Programming is just a continual reminder that I can't do anything correctly"

  • @vertical3life
    @vertical3life Месяц назад +102

    Small piece of advice: write out full variable names. For somebody just starting out in programming, toks and ev don't mean anything. Tokens and eval is more expressive and it's just a few more keystrokes.

    • @ZT1ST
      @ZT1ST Месяц назад +9

      Sad response to that: you're going to see variable names like that in some languages - IIRC, the function to tokenize strings in C is `strtok'.

    • @RobertFisher1969
      @RobertFisher1969 Месяц назад +23

      @@ZT1ST Of course the early C standard library functions names were limited by what the early linkers could handle. These days we don't have to live with such limitations.

    • @michaelsommers2356
      @michaelsommers2356 Месяц назад +2

      It is short-sighted to limit your programming style to what is easily understood by beginners. Those people will either quickly become experienced, or they will leave the field. Either way, there is no sense in catering to them.

    • @michaelsommers2356
      @michaelsommers2356 Месяц назад +4

      Short variable names make programs much, much easier to read.

    • @sebastiantomasalvarez
      @sebastiantomasalvarez Месяц назад +3

      Just starting out programming and checking a video on how to implement a small interpreter with RPN?

  • @pointinpolyhedron
    @pointinpolyhedron 29 дней назад +9

    The comment section makes me so happy. So many people sharing their own language design experiences :)

  • @BruceGrembowski
    @BruceGrembowski 21 день назад +6

    Nice quick introduction to writing a programming language! Although it has an inception problem, kinda like the one I wrote for a project in 1978. It was called SIMPLE (Student-Implemented Programming Language Experiment), and it implemented a subset of the BASIC language in BASIC, along with a bare-bones operating environment to save and load files.

  • @IceMetalPunk
    @IceMetalPunk Месяц назад +51

    A long time ago, I made a simple stack-based esolang (esoteric programming language) called "Noy", because it used every letter of the English alphabet except Y. (Alternatively, I also called it Alphabet Soup.) It had two stacks: a main stack and a temp/scratch stack that you could transfer to and from. Operations were a single letter each, attempting to be mnemonic (all I remember was "k" for "kick to temp stack" and "u" for "unkick from temp stack"). Digits were also a single letter each ("a" through "i"), and to get bigger numbers you'd have to just operate on the digits you pushed. Every program, therefore, was just a string of letters, without any symbols, numbers, whitespace, or line breaks.
    I made an interpreter for it in JavaScript, C++, and PHP (JS and PHP were my go-to languages at the time).
    Totally useless, but also really fun to make 😁

    • @olbluelips
      @olbluelips 29 дней назад +1

      that sounds super fun! I definitely want to make some esolangs

  •  Месяц назад +42

    20:35 "No one uses reverse polish notation in a real language"... Forth begs to differ! A quirky but really fun and powerful language once you get your head around it 😆

    • @Acorn_Anomaly
      @Acorn_Anomaly Месяц назад +2

      "real language" 😜

    • @prosfilaes
      @prosfilaes Месяц назад

      @@Acorn_Anomaly It's on TIOBE's top 100 languages, it comes with Debian, EA released several games written in it (Worms? and Lords of Conquest, among others). What more do you want?

    • @fredrikkilander4044
      @fredrikkilander4044 Месяц назад +3

      PostScript comes to mind

    • @deadmarshal
      @deadmarshal 27 дней назад

      @@Acorn_Anomaly It is Turing complete, so ....

    • @Yehor-v7y
      @Yehor-v7y 24 дня назад

      ​@@deadmarshal bf is turing complete

  • @MisterFanwank
    @MisterFanwank 29 дней назад +11

    This is a lot closer to real, practical language development than the more academic treatment you'll usually see online that spends forever talking about parsing and leaves doing anything interesting as an exercise for the reader. There is so much you can do quick and easy with this kind of approach that will give you wonderful results. Even something like Java is way more complicated than what people actually need.

    • @lylerolleman1564
      @lylerolleman1564 26 дней назад +2

      In my experience I actually never really see this kind of programming.
      This is fine for simple stuff, but for most cases, a standard regex will work better (albeit a little slower). If you want something "serious", you're going to want something more purpose built, like ANTLR, for you lexing/parsing

    • @theshermantanker7043
      @theshermantanker7043 25 дней назад +1

      Crafting Interpreters is an excellent book on how to implement an Interpreter for an example language, if you're looking for one

    • @MisterFanwank
      @MisterFanwank 24 дня назад

      @@lylerolleman1564 Parser generators are the worst way to handle this, and tend to indicate a very small amount of thought was put into a language's design. Write more complex parsers by hand if you really need them(you don't). They're trivial.

    • @MisterFanwank
      @MisterFanwank 24 дня назад

      @@theshermantanker7043 Crafting Interpreters is one of the best resources for learning the conventional wisdom around basic compiler construction, but that's a low bar as far as I'm concerned. You'll learn a lot more practical knowledge from building a FORTH and playing with macro assemblers.

    • @lylerolleman1564
      @lylerolleman1564 24 дня назад

      @@MisterFanwank I've done them by hand. It can work fine for simple stuff, or for stuff that really needs to perform well but for complex but not THAT complex stuff, a parser generator works very well, provided of course you understand the pitfalls (same as everything else)
      That all said, you also seem to be assuming that the developer is the one in charge of the language design. I have never actually seen this be the case in the real world. Far more likely you got a designer or client who wants all sorts of shinies and will change their mind every 4 hours. In this scenario, I very much appreciate having to maintain 50-100 lines of grammar, not 2k lines of code.

  • @dk6024
    @dk6024 Месяц назад +45

    Forth you love if honk then.

  • @JamesD2957
    @JamesD2957 Месяц назад +4

    "let's try this and see how I've gone wrong"
    my entire programming career, right there

  • @SimGunther
    @SimGunther Месяц назад +76

    The language should've been called Splits because it's an excellent demo on how far split carries software design 😊

    • @jmckinney0040
      @jmckinney0040 Месяц назад +21

      😂 love it! Oh! How about SplitSplat! Because every time it threw an error he called it going "splat".

    • @vadrif-draco
      @vadrif-draco Месяц назад +3

      Split++

    • @muzzletov
      @muzzletov Месяц назад +1

      he only used it to implement a quick demo. dont be naive.

    • @hemmper
      @hemmper 29 дней назад +1

      Splat-- might be more appropriate.

  • @nekrosis4431
    @nekrosis4431 Месяц назад +179

    Interpreter being interpreted by an interpreter...
    - C/C++ programmers fuming
    - integrated circuits screaming in fear
    - memory controller in shambles
    - LLVM crying

    • @mahdoosh1907
      @mahdoosh1907 Месяц назад +1

      exactly

    • @Bunny99s
      @Bunny99s Месяц назад +9

      :) I actually made an expression evaluator in C# quite some time back but it's an actual infix evaluator. And I parse it in infix and don't convert it to post or prefix :) It's not really "that" efficient, but it actually creates a custom expression tree which can be executed / evaluated as often as you like with changed variables. The actual insight and main idea was to look at the issue in reverse. Instead of thinking about the highest priority first, I was looking for the lowest priority first and simply split on that. The only issue was brackets. So I made a bracket substitution step first and the brackets were evaluated seperately. The whole thing simply ran recursively, split on the lowest priority first and evaluated the individual parts again. At the end where numbers, variables and functions. It's quite extensible as you can add custom functions as well. Also once the expression tree was parsed, the expression could be evaluated quickly. Each operation was just a class instance.
      In the end the only fundamental operations were: addition, multiplication and power. There were additional unary operations like negate and reciprocal. So a subtraction is actually an addition with the second argument negated. A division is a multiplication with the reciprocal of the second argument. Later I rewrote the whole thing to include boolean operators and boolean logic and it turned into my LogicExpressionParser. It all came out of a question that was asked on Unity Answers years ago about a parametric Lindenmayer system. The L-System is actually way simpler than the parsing of the parametric expression :P It's all on github (MIT license). Just look for "Bunny83 LogicExpressionParser". The whole parser is a single C# file in just under 1000 lines of code.
      The only thing that it does not handle well is an actual unary minus. It needs to have parentheses. So you can not do "5 * -3" but you have to do "5 * (-3)"
      I made some WebGL examples, one expression parser example that essentially draws a line sequence of I think 2000 segments according to the expression entered in realtime and a somewhat graphical calculator that manipulates the height of a square mesh.

    • @ognjenjakovljevic494
      @ognjenjakovljevic494 Месяц назад +3

      I just wanted to comment like now create a programming lang from scratch

    • @qwfp
      @qwfp Месяц назад +12

      @@ognjenjakovljevic494 First you have to create a universe

    • @quintrankid8045
      @quintrankid8045 Месяц назад +1

      @@qwfp Wasn't Virtual Universe an IBM product?

  • @ewerybody
    @ewerybody Месяц назад +1

    3:42: You might want to use .isdecimal() instead of .isdigit()!! The latter actually will tell True on "¹", "²" and "³" although you cannot cast them via int() 🤷‍♀ and the former actually only tells you True on 0-9!

  • @hoi-polloi1863
    @hoi-polloi1863 18 дней назад

    I love this! One thing to watch out for... I think that interpreter only allows you to have one while statement in the program. If there are two (even not nested), the "end" handler for the second while loop will jump back to the top of the first while loop.

  • @PauxloE
    @PauxloE 28 дней назад +2

    7:45 Reverse polish expressions for arithmetic, but then infix notation for assignments? Not quite consistent. (But I guess otherwise you'll need some differentiation for variables to assign from the ones you evaluate.)
    8:34 Throwing away the `=`sign looks strange. So »x = 2 4 +« is equivalent to »x + 2 4 +« or even »x y 2 4 +«. Maybe better use (name, expr) = split(" = ") here?

    • @ruslikaici
      @ruslikaici 25 дней назад

      my thoughts exactly, consistent left-to-right evaluation and assignment, like "2 3 + -> x" would have been better

  • @Jon4as
    @Jon4as 28 дней назад +2

    I strongly recommend the two books by Thorsten Ball, "Writing an interpreter in Go" and "Writing a compiler in Go"!
    These two make interpreters and compilers understandable, while implementing a full language.

  • @SirusStarTV
    @SirusStarTV 29 дней назад +2

    Even if no one ever uses the programming language you create, the process of building it is worth it. Writing an interpreter or compiler forces you to really understand the language you’re working in-not just in theory, but in practice.

  • @GabrielAcosta00
    @GabrielAcosta00 13 дней назад +1

    Excellent explanation. Thank you.
    I also recommend Ruslan Spivak's serie “let's build a simple interpreter”.

  • @P-39_Airacobra
    @P-39_Airacobra Месяц назад +51

    Reverse Polish Notation is awesome. I love the consistency of not needing parentheses. Begone, operator precedence.

    • @JohannaMueller57
      @JohannaMueller57 Месяц назад +5

      i guess it's awesome for implementing something like this, but is it awesome to use it?

    • @stefanalecu9532
      @stefanalecu9532 Месяц назад +6

      ​@@JohannaMueller57 ask all Forth developers and they'll give you their opinion, you won't find many negative complaints

    • @P-39_Airacobra
      @P-39_Airacobra Месяц назад +3

      @@JohannaMueller57 ​ I don't see why not. It's conceptually simpler than normal notation in every way. The majority of the world's languages even use a subject-object-verb word order, so it's not even unnatural. It takes some relearning to instinctively understand it, sure, but at least you can always understand it by applying very simple evaluation rules, unlike infix notation, which requires some very complicated evaluation rules, to the point that developers often have to look up operator precedence, and proactively use parentheses to avoid confusion. I've read a fair amount of Forth code and it's not difficult to grasp, it's pretty intuitive as far as source code goes.

    • @JohannaMueller57
      @JohannaMueller57 Месяц назад +8

      @@P-39_Airacobra so you're saying a b * c d * + is easier and more intuitive than a*b + c*d?

    • @P-39_Airacobra
      @P-39_Airacobra Месяц назад +11

      @@JohannaMueller57 Which one requires more knowledge to process? You have to separate what you're used to and what's simplest. Of course you're used to the second. Does that mean the second is simpler? No. Does it mean that the first will still be difficult to process even when you're used to it? No. I can understand a b * c d * + just fine because I'm used to Forth and so I know how the operators and and operands relate. I still process a*b + c*d faster, but that's only because I've seen that exact form a million times. If I had seen the Forth form a million times, I would be able to process it just as fast if not faster.

  • @k98killer
    @k98killer Месяц назад +3

    I wrote a stack machine called tapescript for embedding access controls in distributed applications/data types. It compiles to a byte code that is then interpreted, and it has a bunch of useful tools and advanced cryptography implementations.

  • @JumboFPS
    @JumboFPS 12 дней назад +4

    Thought that was a hair on my phone screen at the start lol

  • @refactorear
    @refactorear Месяц назад +2

    As many here I tried a few times, however I used Bison/YACC instead because I was at university and we were learning compilers and I decided to go a step ahead and start learning Bison and all that. I also uploaded it at Sourceforge. This was for a strategy game which where you would write script, feed it to the game so that the game would execute them would play the game until the end, then you would have to rewrite the rules to handle events, attacks and invasions and continue improving the script. Writing the parser itself was the easiest part, though, once Bison clicked. Unfortunately (and with most things) once I reached the point to start writing graphics code I gave up.

  • @RPrice_OG
    @RPrice_OG Месяц назад +2

    A very long time ago I wrote a language for fun. It wasn't very good but I enjoyed figuring out how to make it work.

  • @sanderbos4243
    @sanderbos4243 Месяц назад +4

    Great intro to writing your own programming language!

  • @KSPAtlas
    @KSPAtlas Месяц назад +3

    Good to see you also have the Great I Key Press that starts every good vim session

  • @paulojcavalcanti
    @paulojcavalcanti Месяц назад

    officially one of my favorite videos in this channel!

  • @Andrew-rc3vh
    @Andrew-rc3vh 24 дня назад

    I've done this to create my own functional programming language. I wrote it in C to run fast. Rather than use dirty functions like split I just spin the file in a loop so the loop gives me one character at a time. This then goes to a select statement where I can do something for each character, or for groups of them in some instances. I use this to handle the basic syntax like separating out words that mean something, your string literals, your various symbols and in some instances branch off according to the character before. The idea of doing it like this is you can accomplish quite a lot in just one loop of the file. It gets a bit more complicated as one needs to process brackets and ensure execution is done in the right order. The system then goes on to parse the data several times for the various things one has to do, e.g. like handling higher level issues like function calls in the language.
    Anyway, it was great fun to do and my new language has been put to useful work in running programs on a micro controller and using OTA updates. Compile times are measured in milliseconds.

    • @vladimirnicolescu1342
      @vladimirnicolescu1342 23 дня назад

      That's crazy! Nice work! I'm really curious though:
      How long did it take you?
      Do you feel like you learned a lot of things and the experience was worth it?
      I'm now considering writing one in C too since I'm already learning C at school

    • @Andrew-rc3vh
      @Andrew-rc3vh 22 дня назад

      @@vladimirnicolescu1342 It's about 5000 lines for the compiler and about the same again for the rest of it. I built it in stages and had some experience of trying it once before. My first attempt was to use a split function, but soon realised it was a bad idea.

  • @alexaneals8194
    @alexaneals8194 Месяц назад +1

    One of my C tutorial books had a problem where you created SML (simple machine language). It was a fun project to work on and extend. It reminded me of programming the TI-58 when I was a kid. I have nothing against interpreted languages, I learned how to code in one, BASIC.

  • @trevinbeattie4888
    @trevinbeattie4888 Месяц назад +2

    I was hoping to see something written in Bison / YACC.
    Many years ago I decided to write my own BASIC interpreter in Bison for Linux. I got it to the point where it’s able to run several of the programs in the book of “BASIC Computer Games” either as-is or with minimal changes; the features I hadn’t got to yet include handling arrays and graphics commands. Adding X11 graphics is hard though, so I put the project on indefinite hold.

  • @TheStevenWhiting
    @TheStevenWhiting Месяц назад +77

    Making programming languages has always fascinated me as I always thought "But don't you already need a programming language to make one. And how is your programming language a language if it needs the original language to work?"

    • @TheMohawkNinja
      @TheMohawkNinja Месяц назад +22

      And that's where paper tape comes in.

    • @sufianhaq
      @sufianhaq Месяц назад

      The base programming language is basically your interface with the CPU/ALU.
      Then you create wrappers or abstraction using programming languages for each layer.
      If you want to learn more, search for Nand2Tetris. Its an amazing project that allows you to see how NAND gate and clever abstraction can be used to create a OS with tetris game...

    • @pmmeurcatpics
      @pmmeurcatpics Месяц назад +41

      Ikr! And then you read about language bootstrapping and your mind is blown

    • @xlerb2286
      @xlerb2286 Месяц назад +51

      The language is written in itself. Oh, the first iteration of it will have to be written in some other language, you can use assembly if you want the bragging rights. But then the first goal is to get a compiler that can compile itself. After that it's just adding features. :)

    • @qwertyTRiG
      @qwertyTRiG Месяц назад +5

      Sometimes. PHP, for example, is mostly written in C. But read the essay "Reflections on Trusting Trust" for more about self-hosting.

  • @kodaklen
    @kodaklen Месяц назад +10

    His enthusiasm is really infectious :D

  • @timstevens3361
    @timstevens3361 29 дней назад +2

    i made a programming language out of english words and numbers.
    any words it doesnt know, it just ignores.
    its purpose is to make 2d drawings, or 3d models
    of things you describe, with or without motion,
    with or without text and or audio captions.
    it has dolists, loops, paths, sequences,
    and transforms. i recently added sound
    and conditional flags with switches.

  • @ecavero1
    @ecavero1 Месяц назад +3

    20:30 Bitcoin script kind-of uses reverse polish notation, because it uses the stack for evaluating expressions, too!

  • @svecs132
    @svecs132 Месяц назад +3

    finally the next Porth video after years

    • @DuskyDaily
      @DuskyDaily 29 дней назад +1

      Welcome to yet another recreational programming session by Mr. Zozin

    • @halfsourlizard9319
      @halfsourlizard9319 10 дней назад

      Can your assembly language do that!?

  • @peruibeloko
    @peruibeloko 28 дней назад

    Very happy to know Prof. Laurie is a man of great taste! Gotta love Gruvbox

  • @Skuiggly
    @Skuiggly Месяц назад +5

    for viewers that want to get their hands dirty i HIGHLY recommend Crafting Interpreters
    its a hands on book teaching main concepts of compilers

  • @ericmintz8305
    @ericmintz8305 29 дней назад

    I once wrote a DSL between 9 AM Friday morning and 6 PM Sunday. I was in a fury from beginning to end because a delivery failure put my project at risk. My boss and I eventually got a patent for the interpreter.
    I'm emulating a Control Data 160-A for a computer museum and wrote a simple assembler to support testing. The assembler is a dictionary mapping instruction names to tiny code generators. It works a treat.

  • @kevincozens6837
    @kevincozens6837 18 дней назад

    Years ago I wrote a stack based expression evaluator as part of a data analysis program used by others. For that same system of software I used Yacc and Lex to create a simple language that was used to guide the installation of the software. I am currently implementing a version of FORTH that will run on some older microprocessors.

  • @stephenelliott7071
    @stephenelliott7071 Месяц назад +1

    Great stuff! And yes that split function was a really useful addition to a language.

  • @jacobi2393
    @jacobi2393 День назад

    My favorite roy project i ever sid was defining a paeudo assemply language with like 2 instructions, and then inplementing a few sinple functions in it.
    I wrote an _incredibly_ slow recursive factorial function with it

  • @Nors2Ka
    @Nors2Ka 29 дней назад +1

    Just a FYI for anyone who might want to make a programming language: parsing and evaluating expressions are actually the easiest parts of a compiler/interpreter, academia tries its hardest to make it seem like it's not.

    • @halfsourlizard9319
      @halfsourlizard9319 10 дней назад

      Wat? Clearly defining the semantics and proving soundness of the type system are the interesting and challenging bits.

  • @ShaunCKennedyAuthor
    @ShaunCKennedyAuthor Месяц назад +6

    Look at you implementing a PostScript interpreter right on screen!

  • @sundhaug92
    @sundhaug92 Месяц назад +5

    Why not use a stack for the while-loop? Then you could just pop the base pc/ip off the stack when you hit the end of an iteration

    • @SimonBuchanNz
      @SimonBuchanNz Месяц назад

      They would be the "assume it's not nested" part, you still need to handle finding the end

  • @nirajabcd
    @nirajabcd 23 дня назад

    “Programming is just a basic continual reminder that I can’t do anything properly..” - couldn’t agree more!

  • @TheMohawkNinja
    @TheMohawkNinja Месяц назад

    Just finished a PEMDAS algorithm for the text parser for a shell I am working on, and it is definitely an interesting problem to solve. Since some operators hold equal precedence, I ended up using a 2D array to hold the operator strings, with the 'Y' axis being the precedence and the 'X' axis being each operator at a given precedence level.

  • @eugeneplay9416
    @eugeneplay9416 10 дней назад

    Very entertaining video. I coded along and learnt something.

  • @kenhaley4
    @kenhaley4 Месяц назад

    Quickly coded, but very clearly explained. Well done!

  • @xlerb2286
    @xlerb2286 Месяц назад +20

    I've done 3 small programming languages. The first one was in college for the compiler class final project - but I went way over and beyond what was needed. The school used it as a teaching language for years after that. the second was a scripting language for controlling the process of building an application, running the unit tests, and building the distribution media way back before there were such tools available online or as open source. That language and tool was used for many years by several companies in the area and I think there's one small company still using it 30 years later (it was written for Win95, it still works today - go figure). The third one was just recently and it is a domain specific language for an expert system I built at work. Off and on I've dabbled with another language that I think is an embodiment of the saying "just because you can doesn't mean you should". It has some interesting traits but I've never really seen where they'd be that useful.

  • @Yehor-v7y
    @Yehor-v7y 24 дня назад

    I'm following along him and implementing features he does! Currently at branches

  • @mr.k4039
    @mr.k4039 28 дней назад +1

    Imagine, for a second, that this is the first Computerphile video you've ever seen.

    • @jespensonson7351
      @jespensonson7351 26 дней назад

      this is my first video ever seen by this guy I understand zero

    • @Yehor-v7y
      @Yehor-v7y 24 дня назад

      ​@@jespensonson7351why?

  • @olbluelips
    @olbluelips 29 дней назад

    Nice video! Seeing a nice little interpreter in Python is super refreshing and fun bc my brain is so fried rn. I'm currently trying to make a lang with a tiny syntax and a really algebraic type system... it's insanely hard but I do have a roadmap of things that I need to write so maybe in like 3 more years it will be real

  • @MMarcuzzo
    @MMarcuzzo 27 дней назад

    I use entr (eradman software) for instant feedback from terminal. It's great for tdd-like development or leetcode/beecrowd checking.
    It would make the video even smoother. Just a ctrl+s on the software would not need to re-type the python3 command

  • @quintencabo
    @quintencabo Месяц назад +1

    For reading the file lines you can just iterate over the result of ppen directly. People dont know this often for some reason

  • @simpletongeek
    @simpletongeek Месяц назад +2

    For a programming language to be useful, you need:
    1. PC
    2. Conditional Branching
    3. Indirection (pointer)
    With those, you can implement loops, variables, and arrays. Then, move on to stacks, queues, and lists.
    Operator precedent isn't that hard to do. Well, Dijkstra algorithm one is hard to understand. I personally used 3 stacks algo. 2 stacks if you don't mind right to left parsing.
    I don't understand why Tiny BASIC has FOR loops but not WHILE loops. From experience, I can say that implementing WHILE/REPEAT loops are easy. FOR loops may be easy to compile, but not interpreted!
    TinyBASIC is surprisingly close to assembly language, imo, that I'm surprised that not more people are working on it!

    • @Kobold666
      @Kobold666 Месяц назад +3

      WHILE/WEND was never part of the specification (ECMA-55, Minimal BASIC, 1978). I guess it was introduced with Microsoft Quick BASIC. There is a DO/WHILE loop in ECMA-116 (1986). Tiny BASIC (as the name suggests) uses only a subset of the full language.
      Take a look at how Microsoft BASIC implements FOR loops on a Commodore 64, for example. It breaks the specification in many ways. It's pretty easy to interpret, and you would have to emulate the interpreter's behaviour when compiling it.

  • @andythedishwasher1117
    @andythedishwasher1117 Месяц назад +2

    You aren't programming until you're speaking English and typing reverse Polish simultaneously.

  • @collin4555
    @collin4555 Месяц назад

    I do love the elegance of reverse Polish notation, even if it's not the most intuitive as a human code author

  • @SweDennis
    @SweDennis Месяц назад +1

    Using RPN is not cheating, it's doing it right. :-D Just saying. Loved my HP48sx. Everything is so much more natural and simple, and clear, with RPN, it's not cheating. 😀

  • @tahaAFK
    @tahaAFK Месяц назад +1

    This is exactly what i was looking for !!

  • @smithadmin
    @smithadmin 14 дней назад

    Forth uses reverse polish notation, so i would imagine Forth programmers might take issue with the "no serious programmers use reverse polish notation" quip. 😂
    This video is actually very interesting. Thank you for doing it!

  • @as-qh1qq
    @as-qh1qq Месяц назад +1

    This should have been computerphile's first video :)

  • @marksusskind1260
    @marksusskind1260 17 дней назад

    I didn't call it reverse Polish. It was postfix Polish. I also noted that function notation func1(arg1,arg2, arg3) is like a prefix Polish, but Polish notation would not have parenthesis.

  • @zamf
    @zamf Месяц назад +7

    I am actually currently in the middle of defining a general-purpose programming language and implementing a compiler for it as a proof-of-concept. Writing the parser/interpreter is the easy part. Writing the evaluator is the real nightmare. The hardest part seems to be evaluating function calls.

    • @Clank-j6w
      @Clank-j6w Месяц назад

      What kind of problems are you running into? I remember it being a little bit of a puzzle but nothing too hard. Implementing the call stack and stack frames helped facilitate it all in the end a lot.

    • @zamf
      @zamf Месяц назад +2

      @@Clank-j6w If you only pass values in and out of functions then a stack-based approach is quite straightforward. The problem is that in my case I have the concept of value ownership (similar to Rust) and functions can take full ownership of a value (a.k.a. sinking a value) or temporary ownership (a.k.a. borrowing a value) which is returned after the function finishes. This complicates things a lot. Also, the language I'm developing is compiled. So only things that can be evaluated at compile-time are evaluated. For the rest of the code I have to generate actual machine code, which I haven't even started. I'm planning on using C++ as an IR, since it suits my needs and it's the language I know the best.

    • @muesique
      @muesique Месяц назад

      Also want to do a little programming language. In my case it should be a transpiler cause I can't do the low level stuff as I am not a computer scientist just a hobby programmer.

    • @stefanalecu9532
      @stefanalecu9532 Месяц назад

      ​@@muesique don't believe a transpiler is much easier, since you still have to go through all of the steps of making a regular compiler but you also have to worry about how to map your semantics to the target language and also how to desugar (if you've got syntax sugar) constructs in your language. You are doing 95% of the work of a real compiler, except the code generation is different.

    • @muesique
      @muesique Месяц назад

      @stefanalecu9532 worth thinking about it... 🤔
      But at the moment it's much easier for me to do it that way.
      If you wanna know: I fell in love with LDPL which is a much cleaner subset of COBOL. But there are issues with errors. The developer has to go some way to catch all the rough edges. Because my C++ is... basic I want to translate with Tcl to C or even Pascal (which is much more readable and to think in).

  • @fussyboy2000
    @fussyboy2000 Месяц назад +3

    Adobe Postscript uses RPN.

  • @cerulity32k
    @cerulity32k 27 дней назад

    I very recently made a stack-based reverse Polish notation "language" for creating bytebeat, where instructions are single characters. I'm still working on it, but it has functions, labels, conditionals, and embedding.

  • @buzzz241
    @buzzz241 Месяц назад +2

    Where is the source available? 😊

  • @CarlWilde-v6d
    @CarlWilde-v6d Месяц назад

    Nice vid. My SwissMicros DM42n calculator is RPN just like my old HP's. Good enough for CERN, bonkers good for me. It's great when someone asks to borrow it for a minute... and then the grey clouds of confusion drift over their face

    • @darrendrapkin4508
      @darrendrapkin4508 26 дней назад

      some people, It seems, know the difference between RPN and a stack😂

  • @_zelatrix
    @_zelatrix Месяц назад +1

    I wrote a compiler as a project in university. I'm a bit embarrassed to say I used libraries to make my lexer and parser and to do code generation. But I don't think I would have had the time to hand roll them for a language I managed to make Turing-complete

  • @ShorlanTanzo
    @ShorlanTanzo 29 дней назад +1

    "This is never going to work the first time."
    That's how you know he's an experienced programmer, and not just a theoretical teacher.

  • @AmeanAbdelfattah
    @AmeanAbdelfattah Месяц назад +5

    Can you make a video on creating your own Database Management System? I dont mean downloading sqlite or postrges and create a database. I mean actually coding your own database technology from scratch. I want to know the learning path, the recommended languages, its something im trying to look into but it is really hard to find resources. I used databases for years and its a project i like to try to work on.

    • @olbluelips
      @olbluelips 29 дней назад

      Writing your own database and management software sounds like one of the hardest things ever tbh

  • @arushford
    @arushford Месяц назад

    Reverse Polish Notation Made My Year!

  • @MichaelDoornbos
    @MichaelDoornbos Месяц назад +2

    One thing that I know from learning Forth is that Forth is best at creating your own Forth.

  •  Месяц назад +1

    The way that guy is enjoying what he's doing, makes me wanting to create a programming language myself 😅
    Really nice and motivating video!

  • @GodofWar1515
    @GodofWar1515 Месяц назад +3

    This is a field I've been really interested in for a long time. Enjoyed the video, keep it up! 👍

  • @dianekivi5349
    @dianekivi5349 Месяц назад +4

    Hooray for Reverse Polish Notation!

    • @AloisMahdal
      @AloisMahdal Месяц назад +6

      "Polish Reverse Notation" "Hooray" for

    • @bradyjamesdesign
      @bradyjamesdesign Месяц назад +1

      I started using the HP48g back in 99, to this day I still have trouble if I have to use a regular calculator.

    • @bradyjamesdesign
      @bradyjamesdesign Месяц назад

      @@AloisMahdal 😂

    • @BaronVonTacocat
      @BaronVonTacocat Месяц назад

      hsiloP all day ✊

  • @MladenMijatov
    @MladenMijatov Месяц назад +61

    Why is it that academics always seem to write least readable code. It's no longer 1983, we have space for variable names longer than two characters.

    • @G5rry
      @G5rry Месяц назад +9

      I agree. As soon as I saw the function name was "ev", I lost interest and saw where this was going.
      The interesting part comes from defining the language grammar and how does that grammar get implemented.

    • @landsgevaer
      @landsgevaer Месяц назад

      Also, two-space indents...
      (I would like them, but it ain't convention.)

    • @maximinus1972
      @maximinus1972 Месяц назад +4

      Also seems scared of using vertical white-space!

    • @ZT1ST
      @ZT1ST Месяц назад +3

      To be fair, one reason to do this, especially if you are planning to write globally accessible functions, is to ensure that you don't have conflict with user defined functions and classes.
      Though yeah - there is plenty of room to improve in what they wrote, it's understandable why they would do it that way.

    • @EnriqueSalceda-k4v
      @EnriqueSalceda-k4v Месяц назад +8

      Its totally readable to me.

  • @Tomyb15
    @Tomyb15 Месяц назад

    More videos with Dr. Tratt please!

  • @jasonyesmarc309
    @jasonyesmarc309 Месяц назад +1

    1. This guy is great.
    2. Fellow `split()` fans rise up.

  • @pierreabbat6157
    @pierreabbat6157 Месяц назад

    I looked up "it's all Greek to me" in Wikipedia; one of the German versions is "rückwärts polnisch" (backwards Polish). So I looked up "reverse Polish" in Wiktionary, and it's "umgekehrte polnische" (turned-around Polish).
    Forth and PostScript both use reverse Polish.

  • @j7ndominica051
    @j7ndominica051 29 дней назад

    Reverse polish notation with longer formulas that would have parentheses is a mindf..k. I can understand how you dig into the inner parens and break them down into two terms on each line, like in simple scripting languages. But when you condense them down all on the same line, I can't read it. There was an AviSynth plugin requiring RPN, I think it was for applying curves.

  • @PrimordialOracleOfManyWorlds
    @PrimordialOracleOfManyWorlds 11 дней назад

    in my compiler college course, the professor wanted the class to use recursive algorithms for the token reader as well as other parts of the compiler.

  • @jeffspaulding9834
    @jeffspaulding9834 Месяц назад +7

    "No one uses reverse Polish notation in a real programming language."
    Picking a fight with all the Forth fans, I see!

  • @addcoding8150
    @addcoding8150 Месяц назад +6

    NEOVIM BTW
    Nice to see that Tratt is a cultured person

  • @florinmarin8662
    @florinmarin8662 Месяц назад

    Aaaaa, now one of my favourite topics, i will take a pit stop here for a while..

  • @as-qh1qq
    @as-qh1qq Месяц назад +3

    Next assignment: write that split function using this language

  • @aimem6315
    @aimem6315 26 дней назад

    Не так давно написал свой простенький язык на расте, было в меру сложно (я боролся с бороу чекером 90% времени), но весь компилятор и интерпритатор уместились в, вроде, строках 700. Сам язык стек-ореинтированный и максимально прост в реализации, но тьюринг полноту доказать я смог, добавив в примеры rule110.
    Я даже добавил крайне убогое подобие системы шаблонов, но только для базовых команд.
    Я даже добавил встраивание функций(только в режиме компиляции и на уровне трансляции в ассемблер), ведь вызов функций в этом языке, из-за его стек-ореинтированной природы требует некоторых извращений.
    Кому интересно: на гитхабе лежит "jalgo", ссылки не будет, ищи сам.
    Кому это вообще интересно?

  • @sirynka
    @sirynka Месяц назад +1

    What would be the shortest/simplest programming language that can interpret itself?

    • @gdclemo
      @gdclemo Месяц назад +3

      Depends how you define simplest. Any Turing-complete language should be able to interpret itself. There are languages like Brainf**k (name censored) which is incredibly simple, it uses only eight different tokens including input and output, but it has self-hosted interpreters and even compilers.

  • @BenMakesGames
    @BenMakesGames Месяц назад +1

    I do love building little engines! it would help readability to have more human-readable names for the classes, methods, and variable names, though!

  • @PeterVerhas
    @PeterVerhas 24 дня назад

    I created a similar interprer in Java just for fun, but it reads ahead only the tokens it needs at the moment, and then I created api to get the rest of input by the code, so I could define an expression with operators with priorities and parentheses already in the language itself.

  •  Месяц назад

    If someone actually wants to create a language I recommend using the language workbench JetBrains MPS. It works very differently from other language creation methods and I think it's very nice.

    • @xlerb2286
      @xlerb2286 Месяц назад

      For so many years the standard tools were LEX and YACC (I used slightly improved versions called FLEX and Bison). They were powerful but I wouldn't call them user friendly. We have such better tools today.

    • @SimonBuchanNz
      @SimonBuchanNz Месяц назад

      Honestly it's pretty straightforward to parse relative to the rest of creating a language that I never got the desire to automate that part.
      You switch on the next character to see what type of token you're parsing, then skip while it's valid for that kind, you've got a lexer. Write a function that returns a value and the remaining input for each syntax rule that calls the functions for other rules and you have a recursive descent parser. Use a Pratt parser, which is just a couple of stacks and a loop, and you can parse infix expressions. It's simple to read, lets you provide much nicer errors, and you have a real language if you need to do something fancier at every point.
      Compared to type analysis, optimization, emit, parsing is the easiest bit by far.

    •  Месяц назад

      ​@@SimonBuchanNz In MPS, you don't create a parser at all, you do it the opposite way.

    • @borealis75
      @borealis75 Месяц назад +1

      The parser creates you?

    • @SimonBuchanNz
      @SimonBuchanNz Месяц назад

      More replying to @xlerb2286, but MPS is also solving the same issue, from what can tell: that "parsers are hard". Possibly true for small DSLs, but if you're talking about implementing a full language, they're a trivial part of the effort.

  • @PeterfoxUwU
    @PeterfoxUwU Месяц назад +1

    This should even be Turing Complete, right?
    It supports while loops (WHILE is Turing complete), well no nested loops directly, but you should be able to emulate them eg. a while loops until break != 0 and inside the body every arithmetic is x = x * (1-nested) + (changed value from nested loop body) * nested. In this case nested would be set to one, when the inner loop should do its work and it could also set itself back to zero when it's done effectively skipping all the changes it would do while running. This process has no limit, maybe someone should build a Transpiler, which translates nested loops into this form. You also still have if-conditions by using the same multiply by {1,0} Trick. With this many nested loops it would be no problem to define a power function as well as a floored log, which allows you to create a stack or even a tree on an arbitrarily sized Integer (which I'm definitely going to allow, since python supports them by default and being limited by hardware is no requirement for Turing Completeness). With stacks you could then simulate a Turing Machine or do some Game of life or whatever you like, would be very tedious tho.

  • @OmarQunsul
    @OmarQunsul 27 дней назад

    This code will have some problem with nested loops. Not sure what's the best way to handle this. Probably using a Stack data structure for keeping tracks of how many While/End(s) are there

  • @TheJaguar1983
    @TheJaguar1983 25 дней назад

    What's interesting to see here is how he starts with doing things a cheaty way, and as features are added to the language, things get more and more complicated, until you see the need for a better way of doing the whole thing. It makes you realise why lexers and parsers are so complex.

  • @kenchilton
    @kenchilton Месяц назад

    When designing a computer language, it might be helpful to first determine the programming paradigm. You chose an imperative paradigm, and proceeded to describe a procedural language, but did not mention that. There are, of course, several other possibilities.
    However, it was a great introduction to the simplicity of Forth! 🤭

  • @ralvarezb78
    @ralvarezb78 11 дней назад

    Since I use HP48 yet, I use reverse polish notation and love stack machines

  • @ifcoltransg2
    @ifcoltransg2 Месяц назад

    If anyone has played around with this and is looking for a bigger more complicated version of it as a project to try out, 'Crafting Interpreters' is a beginner-friendly guide.