Code Review: C

Поделиться
HTML-код
  • Опубликовано: 23 ноя 2024

Комментарии • 97

  • @Kknewkles
    @Kknewkles Год назад +188

    "It's good. Because it's not C++"
    "I feel like this is where C++ shines pretty well"
    I also feel like it shines pretty well everywhere that it isn't

    • @MrAbrazildo
      @MrAbrazildo Год назад +12

      C++ is better and safer.

    • @Kknewkles
      @Kknewkles Год назад +11

      @@MrAbrazildo see you in 10 years, silly goose. Hope it doesn't take you much longer to figure things out.

    • @MrAbrazildo
      @MrAbrazildo Год назад +8

      @@Kknewkles I take entire projects in C++. It's the best, by test.

    • @friedrichmyers
      @friedrichmyers Год назад +8

      @@MrAbrazildo For me, I can never code in C++; It is not at all simple, and there is already enough gibberish to process in C because of pointers syntax. C++ goes nuclear with shitty syntax. I personally prefer C because of simple, functional and fast code. But no problem with C++ either.

    • @MrAbrazildo
      @MrAbrazildo Год назад

      ​@@friedrichmyersDo you mean it's needed to call C f()s because of pointers? As long as I know, C++ rewrote everything to work with iterators, or even higher level.
      Shitty syntax? It's pretty elegant to me. It inherited the "ambiguity" from C, which is good, leading to more productive work, and added OO features, which allow to configure things to work your way. Miss the ^ from BASIC? No problem: make a class, put your integer in it, and bingo: x ^ y. It could be even more elegant though, if macros receive a strong upgrade.

  • @PhatPazzo
    @PhatPazzo Год назад +91

    Oh, I hate the hidden pointer. That feels like a huge foot gun when it comes to ownership…

    • @TheVimeagen
      @TheVimeagen  Год назад +12

      yepper, super confusing there

    • @zivlakmilos
      @zivlakmilos Год назад +18

      ​@@TheVimeagen This was for encapsulation. Lexer struct is defined in .c file. So .h file cannot know what fields are inside, so it could be only pointer.
      That way user of interface only needs to call create and cleanup functions at the end. He cannot access any fields. He mus call functions inside lexer module to do operations on lexer struct.

    • @zivlakmilos
      @zivlakmilos Год назад

      Token is available to user of interface, so no need to be pointer. But I like consistency.

    • @laughingvampire7555
      @laughingvampire7555 Год назад +2

      @@TheVimeagen its not confusing at all if you use a real IDE.

    • @LARathbone
      @LARathbone Год назад +2

      Agreed, typedeffing away a pointer is pure evil

  • @TestTost-j4d
    @TestTost-j4d Год назад +120

    If you use calloc() instead of malloc() you don't have to memset().

    • @MrChickenpoulet
      @MrChickenpoulet Год назад +9

      yea was wondering about why the author did this, maybe there are reasons I'm not aware of where doing this way has some advantages? also no check on `malloc` returns value being potentially `NULL` :D

    • @xBiggs
      @xBiggs Год назад +16

      ​@@MrChickenpoulet I always check if malloc fails, but I don't think it can happen on Linux unless you initially ask for a large amount of memory

    • @whoopsimsorry2546
      @whoopsimsorry2546 Год назад +3

      I still find it more readable doing it explicitly.

    • @xBiggs
      @xBiggs Год назад +7

      @@whoopsimsorry2546 I generally use calloc more because the interface is better

    • @angelcaru
      @angelcaru 6 месяцев назад +4

      @@MrChickenpoulet it's usually unnecessary to check the return value of malloc() because it can only fail if you're out of memory. and if you're out of memory you're fucked anyway so who cares

  • @baguettedad
    @baguettedad Год назад +25

    "Why do I have a more human color?"
    Lmao

    • @TheVimeagen
      @TheVimeagen  Год назад +8

      classic tj, looking like a lizard

    • @Jabberwockybird
      @Jabberwockybird 5 месяцев назад +1

      There's a lot of dog whistles in this video.
      Going up 3 lines in vim
      Talking about whitespace
      Lexar is kind of like Texas

  • @ykhi
    @ykhi Год назад +32

    Now let's see Paul Allen's code

    • @Jabberwockybird
      @Jabberwockybird 5 месяцев назад +1

      👨‍🍳🤌
      Perfect use of the meme

  • @andylyounger74
    @andylyounger74 Год назад +26

    with regards to pointer placement, due to being able to make multiple variable declarations in a single line (and for lack of a better term pointers not carrying), the asterisks has to be part of the variable name rather than the 'type' to make sense. Think of 'int* a, b;" only a is a pointer here. Not saying this is a a good thing, (C is a bit of an evil step-kid in syntax) but it exists. As such "int *a, b;" makes more logical sense.

    • @TheVimeagen
      @TheVimeagen  Год назад +16

      i get it , but i think that there should be some special kind of wrath set aside for those that write this type of code :)

    • @andylyounger74
      @andylyounger74 Год назад +1

      @@TheVimeagen Ha, don't disagree. Just sometimes a new companies coding standards will drive you mad. The newer trend of tooling such as clang-format or gofmt (is it still applicable to call it new?) - "Gofmt's style is nobody's favourite, but gofmt is everybody's favourite.". has a lot of benefit.

  • @abcdefg-nu4xj
    @abcdefg-nu4xj Год назад +34

    when i saw the C++ code review vid i immediately started looking for the C one and sadly didn't find it. well, problem solved

  • @thetastefultoastie6077
    @thetastefultoastie6077 Год назад +21

    Am I right in thinking the purpose of this is to assess the performance of the C language?
    Because this could be sped up significantly by not calling malloc() for every token.
    Also this is not a great specimen for idiomatic C and has a strong C++ flavour:
    - Hiding pointer behind a typedef makes all code using that typedef misleading and is widely discouraged
    - Identifiers starting with underscore are not permitted (both the include guards and private functions violate this)
    - Name of struct typedef doesn't end in "_t", as is idiomatic for C but not C++
    - extern "C" is C++ syntax, and should be in a C++ file, should not be in a C language file
    - strndup() is not portable and you either need to define the appropriate feature macros to request it from the compiler, or a static implementation in the C file

    • @zachmanifold
      @zachmanifold Год назад +1

      What do you think is a good alternative to malloc()ing each token, a larger up-front memory pool (that maybe needs resizing later) that you manage for the tokens?

    • @zivlakmilos
      @zivlakmilos Год назад +3

      You are right about _t convetion. I completly forget about that.
      Hiding pointer can be missliding, but I don't think that is big problem here. Lexer struct is defined in .c file, so other users of lexer module do not have direct access to fields of struct. He can only interact with that struct using functions made for that. It's done for encapsulation. Hi should not allocate or free any memory. Just cal appropriate functions and cleanup function at the end. It's not really low level C, and I would not use this approch for embedded systems, but I think it's nice and convenient interface.
      You are correct about memorry allocation. I will optimize code more when complete parser. Probably one memorry allocation will be better, but implementaion of parser in book holds referect to every token in AST. I don't like to do too much optimisation in advance, that can hurt you latter.
      You are alo correct about strndup.
      I don't really see the problem with underscore. I don't think that would impact performance in any way. Linux kernel uses it also.
      Same about extern C. It's behind preprocessor macro, so only C++ compiler would include that after preprocessing. I think it's nice to have it. That way you can reuse same code in C++ if you need to. It's convenient for user of library. I know that this is not library, but it does not hurt.

    • @zivlakmilos
      @zivlakmilos Год назад +3

      @@zachmanifold Better approche would be to havly ony one instance of token if you do not need to hold it. Just change data inside, and that all. Alternative would be memory pooling, but I think that is too early for detailed optimisation. It's really depends on how you use it latter. Implementation in book holds reference to token in AST, so one instance maybe it's not that great idea. If you optimise it too early can hurt you in long run. But definitely can and should be optimized.

    • @zachmanifold
      @zachmanifold Год назад +1

      @@zivlakmilos yeah, with the memory pool I was assuming we actually needed the individual tokens, but depending on the application if we could get away with reusing the token then that's definitely preferred

    • @xBiggs
      @xBiggs Год назад +4

      Identifiers starting with underscore are allowed if it isn't underscore underscore or underscore capital letter. Only underscore lowercase letter was used here so it's fine

  • @pskocik
    @pskocik Год назад +5

    Global scope identifiers starting with an underscore are reserved in C => instant undefined behavior.

  • @Jmcgee1125
    @Jmcgee1125 Год назад +35

    5:35 Correct, it makes `Token` a `SToken*`. This is commonly done when making a type that is meant to be used from an interface where the implementer will decide how big the item is and provide allocation/deallocation features (as here), and especially when the type is opaque.
    Nice C code, I like it (though I agree that `type* name` is the One True Way). Good use of setting the pointer to NULL in the cleanup - that is 100% a method to avoid use-after-free errors without having to rely on checking something like Valgrind.

    • @xBiggs
      @xBiggs Год назад +7

      Microsoft does that pointer definition a lot in their C Windows API. Personally, I hate when people declare opaque types this way. I think its better to forward declare the typedef in a header and give the implementation in the c file.

    • @Jmcgee1125
      @Jmcgee1125 Год назад

      That's what I tend to do. But I can see cases where it's helpful, like when the implementation varies by platform. If only this was the most annoying part of the Windows API.

    • @InfiniteQuest86
      @InfiniteQuest86 Год назад +1

      Well to be accurate 'Token' is a 'struct SToken *'. I'd prefer seeing it named pToken or something similar. It's pretty helpful to name things that way.

    • @xBiggs
      @xBiggs Год назад +9

      @@InfiniteQuest86 Or you could just not typedef pointers

    • @zivlakmilos
      @zivlakmilos Год назад +2

      You are correct.
      This was for encapsulation. Lexer struct is defined in .c file. So .h file cannot know what fields are inside, so it could be only pointer.
      That way user of interface only needs to call create and cleanup functions at the end. He cannot access any fields. He mus call functions inside lexer module to do operations on lexer struct.
      Token is available to user of interface, but I just like consistency.

  • @delicious_seabass
    @delicious_seabass Год назад +7

    I'm not a big fan of the use of underscores in the function names here. I get that they're marked static, so its just limited to the scope of the file, which technically is the correct use as stated by the C standard, but libraries (std, posix) and compilers still use and expose such named identifiers at the global level and assume anything with one or more underscored is reserved for them. It's probably best to just stay away from that convention in C.

    • @JJCUBER
      @JJCUBER 4 месяца назад

      C compilers only consider starting with double underscores (or single underscore when external I believe) and starting with a single underscore followed by an uppercase letter as reserved.

  • @real1cytv
    @real1cytv Год назад +6

    The classic "Start C (end)" marker

  • @Zzznmop
    @Zzznmop Год назад +2

    Shoutouts to prime and teej flying out for in person pair code reviews :p

  • @Jabberwockybird
    @Jabberwockybird 5 месяцев назад +1

    Lexar is short for Lex Luthor. He has an mallocious plan hidden in their somewhere

  • @mrbonono2951
    @mrbonono2951 6 месяцев назад +5

    C++ really shines when it's actually c.

  • @HelloThere-xs8ss
    @HelloThere-xs8ss Год назад +3

    I could watch at least an hour of these two running through this code with gdb

  • @caio757
    @caio757 Год назад +1

    This is a super simple code, the developers do they job here

  • @HenrikBgelundLavstsen
    @HenrikBgelundLavstsen Год назад +4

    With all the JDSL and Tom's a genius i think you need to make a T-shirt

  • @soyitiel
    @soyitiel 6 месяцев назад

    Bruh, the whole time I thought they were just sitting together like really good friends

  • @laughingvampire7555
    @laughingvampire7555 Год назад

    5:13 SToken is the name of the type for that struct, the *Token is the type of the pointer to a struct SToken, so you will be able to do this
    "SToken a_instance_of_stoken, another_instance_of_stoken;
    Token a_pointer_to_stoken, another_pointer_to_stoken;"
    so is just syntactic sugar to do type you wanted with int* but is illegal with multiple variable names.
    That can also be declared
    typedef struct {
    TokenType type;
    char *literal;
    } SToken, *Token;
    other people to avoid the "hidden pointer" stuff do the other way of this, Token for the struct and PtrToken for the pointer type, or some other naming convention that is explicit about being a pointer. If you have a good IDE you can know what it is by hovering on it, this is why Cormac disses on non-IDEs like vim & emacs. I guess you can use a "go to definition" keyboard shortcut on vim.

  • @MACAYCZ
    @MACAYCZ Год назад +17

    For me personally, C is just the best language of all time.

  • @Sayan_Shankhari
    @Sayan_Shankhari Год назад

    size_t => unsigned int 32/64 bit
    in-place object declaration

  • @AlexandreJasmin
    @AlexandreJasmin Год назад +1

    Put that underscore at the end of the name!

  • @cheebadigga4092
    @cheebadigga4092 Год назад +3

    The Slexer tho XDDD I'd do "struct Lexer_t"

  • @MysteriousFoxy87
    @MysteriousFoxy87 4 месяца назад

    1:49 I disagree on this -- I would've written it as "const char * _lexerReadInt" (with spaces on each end) for example, because:
    - char is a type
    - * is a specifier/qualifier
    - char* is not a type per-se
    - pointer specifier/qualifiers can be applied to any C type
    - char* a, b, c; leads to programming errors

  • @jarvenpaajani8105
    @jarvenpaajani8105 Год назад +1

    2:49 it's hungarian notation

  • @buku69420
    @buku69420 Год назад +2

    I should study more

  • @AdamS-lo9mr
    @AdamS-lo9mr 4 месяца назад

    Real ones know the star goes at the function/variable name because when you declare more then one pointer in a single line you need a star for every variable name. Yes, its stupid and pedantic but thats the way it must be and REAL ONES WILL KNOW...
    edit: damn I wrote this befor prime explained why they do it that way and now I feel stupid.

  • @james4-u3o
    @james4-u3o 3 месяца назад

    why would you call malloc() then call memset()? Couldn't they have achieved the same outcome with calloc()? time: 3:11

  • @ryanlockhart5328
    @ryanlockhart5328 4 месяца назад

    Is malloc a macro that wraps actual malloc that checks for a failed allocation?

  • @wolfgangsanyer3544
    @wolfgangsanyer3544 Год назад

    * to search for the word under your cursor

  • @AsbestosSoup
    @AsbestosSoup Год назад

    Charlie Puth: *perfect pitch*
    TJ: *puts on Tom's extract* *perfect debug skllz*
    Also, TJ kind of looks like charlie too (?), just came to mind. Any ops degens...?

  • @PinakiGupta82Appu
    @PinakiGupta82Appu Год назад

    You write industry-standard code.

  • @PinakiGupta82Appu
    @PinakiGupta82Appu Год назад

    Looks perfect.

  • @MrAbrazildo
    @MrAbrazildo Год назад

    1:00, size_t is safer in 2 ways: 1) it's the biggest possible integer type, thus holding possible higher values, less chance to get overflowed. 2) If that happens, the behavior is entirely defined, you know what will happen, much different than signed types, which are not specified, allowing different implementations, equivalent to arbitrary behavior from the programmer's perspective! But signed types are faster.
    3:45, much better would be: while (strchr ("
    \t
    ", lexer->character)). //From .
    4:21, it's so ugly to see a switch like that for C/C++. It's possible to reduce this to 3-5 lines of code, instead of this ugliness.
    5:00, I guess it means that when reading the content of pointer Token, the content is the struct SToken. So, whenever you define Token as the type of some variable, that means it'll be a pointer to SToken. So it's not needed to use pointer operator for pointers to this type. i.e.: Token my_ptr_to_SToken = NULL; //It doesn't require *.
    5:25, in C++, it's not needed to make a typedef here: once a class/struct is defined, it's name means its type. But bizarre C may be still demanding that nowadays.
    7:19, the double pointer means a collection of pointers. You could check "nullness" with only a pointer, but it'd be it alone. They want several tokens together.
    7:45, much better: return ch == '_' || isalpha (ch); //From .

    • @aaronfleisher4694
      @aaronfleisher4694 2 месяца назад

      Why not use uint16_t (or uint32_t) instead of size_t? Or, if you want to build in some sort of error handling, int16_t (or int32_t)? A 16 bit int should be sufficient for a lexer. It might be best to chunk up input into a well defined buffer of fewer than 65k characters in the first place. That makes uint16_t (or int16_t) sufficient. That will reduce the struct size from the current (at least) 35+padding bytes to a more precise 21+padding bytes. Or, does size_t offer advantages that I’m not aware of (I’m still learning C)?

    • @MrAbrazildo
      @MrAbrazildo 2 месяца назад

      ​@@aaronfleisher4694 If 16 bits are safe enough, then I would use it - but I would make a typedef for it (it might needs to change in the future). But when the idea is to have more space on a variable, size_t looks towards the future ("future-proof"), growing automatically according to the platform size.

  • @hyto
    @hyto Год назад

    this videos are awesome

  • @xBiggs
    @xBiggs Год назад

    I like using (void*)0 instead of NULL

  • @CrazyMineCuber
    @CrazyMineCuber Год назад

    Review nix next!

  • @sub-harmonik
    @sub-harmonik Год назад +5

    c is underrated

  • @bool2max
    @bool2max Год назад +1

    2:17 that's wrong, both foo and a are pointers.

  • @darkarie
    @darkarie Год назад +1

    Search word under cursor:
    ```
    local tbi = require "telescope.builtin"
    vim.keymap.set("n", "rw", function()
    tbi.grep_string {
    search = vim.fn.expand "",
    }
    end, { silent = true})
    ```

  • @orustammanapov
    @orustammanapov Год назад +1

    it's called Hungarian notation en.wikipedia.org/wiki/Hungarian_notation

  • @nicwhites
    @nicwhites Год назад +1

    I’m sorry, but did you really just find . | grep compile?