Python AST Parsing and Custom Linting

  • Published: Dec 21, 2024

Comments • 67

  • @tokiomutex4148
    @tokiomutex4148 2 years ago +174

    Solving this problem with regular expressions (the Unix way) is a brilliant idea, you no longer have to worry about the original problem because you're too preoccupied with finding the right pattern.

    • @Alche_mist
      @Alche_mist 2 years ago +58

      Bill has a problem.
      Bill attempts to use regex to solve his problem.
      Bill has two problems.

    • @cryp0g00n4
      @cryp0g00n4 2 years ago +6

      @Alche_mist tbh regex makes me faster in a lot of ways. I keep handy examples around after figuring them out.

    • @fcantil
      @fcantil 2 years ago +4

      When you're first starting out with regex, it definitely feels like adding another problem but once you start to get the hang of the basics and save those old expressions you've made, it gets easier imo

    • @casper64
      @casper64 1 year ago

      @cryp0g00n4 for some languages/syntaxes a regex can't be used. HTML, for example.

    • @Raja-jo5dm
      @Raja-jo5dm 7 months ago

      He said that was a nightmare!

  • @UnFallenRain20
    @UnFallenRain20 2 years ago +25

    This video is such perfect timing for me, I could watch a whole series on static analysis for python.

  • @Mutual_Information
    @Mutual_Information 2 years ago +14

    2:01 Wow!! Whenever I've applied python code to other python code, I've treated it as a string. That's awesome. I've been meaning to create a tool that'll crawl over github repos and extract code quality metrics (probably to learn which repos are worth checking out)... this would make that task *much* easier.

    • @mCoding
      @mCoding 2 years ago +6

      This might also make your task easier!! pypi.org/project/wily/

    • @Mutual_Information
      @Mutual_Information 2 years ago

      @mCoding Never heard of this but looks helpful, thanks again!

    • @DilettanteProjects
      @DilettanteProjects 6 months ago

      I like how the use of "probably" implies that you yourself don't even know why you're doing it

  • @ren200758
    @ren200758 2 years ago +7

    wow I didn't expect to see homework assignments in a youtube video. making me feel I should pay for this lesson!

  • @6Sloth9
    @6Sloth9 2 years ago +5

    Entertaining and informative as always. Thank you for your work.

  • @victornoagbodji
    @victornoagbodji 2 years ago +1

    This is a great example for the ast module 😊

  • @hoang-himself
    @hoang-himself 2 years ago

    Very good timing
    I am learning the same stuff but with antlr

    • @markcuello5
      @markcuello5 2 years ago

      Yeah, thanks; I never heard of ANTLR.

  • @chongyewchang1705
    @chongyewchang1705 2 years ago +1

    Very well explained and exercises are cool!

    • @mCoding
      @mCoding 2 years ago

      Glad you like them!

  • @Khushpich
    @Khushpich 2 years ago +1

    Very informative. Thanks James.

  • @MrSteini124
    @MrSteini124 2 years ago +4

    Missed ya!

    • @mCoding
      @mCoding 2 years ago +4

      Glad to be back doing Python!

  • @JohnZakaria
    @JohnZakaria 2 years ago +5

    Is it possible to replace the ast?
    For example rename all the functions in a file.
    Or AST to code.

    • @mCoding
      @mCoding 2 years ago +9

      Yes you can use ast.NodeTransformer to modify the AST, combined with ast.unparse() to get back to text. Note that ast.unparse(ast.parse(...)) does not necessarily give back the original code though, and you may want to run it through black afterwards to normalize the code style. You can also use a library like rope.
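
A minimal sketch of the ast.NodeTransformer plus ast.unparse() approach described in the reply above, assuming Python 3.9+ (where ast.unparse exists). The class name RenameFunction and the sample source are hypothetical, chosen only for illustration; the sketch renames one function definition and the plain-name references to it.

```python
import ast


class RenameFunction(ast.NodeTransformer):
    """Rename one function and plain-name references to it (hypothetical helper)."""

    def __init__(self, old: str, new: str) -> None:
        self.old = old
        self.new = new

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
        if node.name == self.old:
            node.name = self.new
        self.generic_visit(node)  # keep transforming the function body
        return node

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == self.old:
            node.id = self.new
        return node


source = "def greet(name):\n    return 'hi ' + name\n\nresult = greet('x')\n"
tree = RenameFunction("greet", "welcome").visit(ast.parse(source))
new_source = ast.unparse(tree)  # text again, but not the original formatting
print(new_source)
```

This does not track scopes, so a local variable that happens to share the old name would be renamed too; a refactoring library such as rope handles that kind of case.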

    • @lawrencedoliveiro9104
      @lawrencedoliveiro9104 1 year ago +1

      I have used this to implement a syntax-level macro facility. I was looking for a way to provide both blocking versions of API calls and nonblocking ones that used async/await. The code ends up being 90% identical, but forcing non-async clients to call async functions is fiddly, and trying to use “await” in a non-coroutine function is a syntax error. ASTs provided the answer.
      For a nontrivial example of this in action, see my “Seaskirt” wrapper for the Asterisk telephony engine.

    • @JohnZakaria
      @JohnZakaria 1 year ago

      @lawrencedoliveiro9104 I wanted to do exactly the same as you.
      Would you share your code (even if it is broken)?

  • @lemna9138
    @lemna9138 2 years ago +3

    Is there any reason beyond style to put each check function in its own class instead of making them methods of the class that walks the AST?

    • @mCoding
      @mCoding 2 years ago +6

      This was just a logical separation of concerns. Imagine you have 10s or 100s of custom checks. You won't want 100s of functions in your single NodeVisitor class, you will want to break it up into individual units that each do one check. Some checks may also be more complex and require their own helper functions, so using a class makes it easy to tell which helper functions go with which checks.
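
A rough sketch of that separation, assuming Python 3.9+. The names LocalImportsNotAllowed, check, and Flake8ASTErrorInfo appear in a traceback elsewhere in this thread; everything else here (the fields of Flake8ASTErrorInfo, the error codes, MutableDefaultNotAllowed, walk_same_scope, LintVisitor) is a hypothetical stand-in, not the video's actual code.

```python
import ast
from typing import Iterator, NamedTuple


class Flake8ASTErrorInfo(NamedTuple):
    # Assumed shape; the real class may carry different fields.
    line: int
    col: int
    msg: str


def walk_same_scope(node: ast.AST) -> Iterator[ast.AST]:
    """Yield descendants of node without entering nested function definitions."""
    stack = list(ast.iter_child_nodes(node))
    while stack:
        child = stack.pop()
        yield child
        if not isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
            stack.extend(ast.iter_child_nodes(child))


class LocalImportsNotAllowed:
    msg = "MC100 local import"  # hypothetical error code

    @classmethod
    def check(cls, node: ast.FunctionDef, errors: list[Flake8ASTErrorInfo]) -> None:
        for child in walk_same_scope(node):
            if isinstance(child, (ast.Import, ast.ImportFrom)):
                errors.append(Flake8ASTErrorInfo(child.lineno, child.col_offset, cls.msg))


class MutableDefaultNotAllowed:  # hypothetical second check
    msg = "MC101 mutable default argument"

    @classmethod
    def check(cls, node: ast.FunctionDef, errors: list[Flake8ASTErrorInfo]) -> None:
        defaults = node.args.defaults + [d for d in node.args.kw_defaults if d is not None]
        for default in defaults:
            if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                errors.append(Flake8ASTErrorInfo(default.lineno, default.col_offset, cls.msg))


class LintVisitor(ast.NodeVisitor):
    """Only dispatches; each rule lives in its own class."""

    function_checks = [LocalImportsNotAllowed, MutableDefaultNotAllowed]

    def __init__(self) -> None:
        self.errors: list[Flake8ASTErrorInfo] = []

    def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
        for check in self.function_checks:
            check.check(node, self.errors)
        self.generic_visit(node)  # nested functions get their own visit


SAMPLE = """\
import os

def f(x=[]):
    import json
    def g():
        from sys import path
    return x
"""

visitor = LintVisitor()
visitor.visit(ast.parse(SAMPLE))
errors = sorted(visitor.errors)
```

Adding a rule then means adding one class and one list entry, and any helper that a rule needs can sit next to it rather than in the visitor.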

  • @Graham_Wideman
    @Graham_Wideman 2 years ago +4

    Another great topic choice James! Judging by comments, I am probably not alone in wishing for a sophisticated examination into how far you can get with a static analysis of Python code, where construction of the structure of objects is dynamic. Python does not force you to declare class or object members before you introduce them wherever you like. And the type of a variable is often not easy to determine statically, unless the code relies on some heuristics, which could be quite tricky. So how does flake8 (and its underlying tools) deal with that, or does it even attempt to? Does it have any chance of determining whether a mention of myvar.myfeld = 1 is legit, or a typo for myvar.myfield?

    • @mCoding
      @mCoding 2 years ago +2

      This is one of those times where the dynamism of python makes it hard for tools to give good feedback. If you are adding attributes at runtime, there is no way to know with certainty whether it was a typo or not (something something halting problem...). What you can do is hold yourself to a self-imposed static requirement that you will _not_ add any attribute that is not named e.g. within dunder init of the class. Then if you see a name set outside of init you know it's wrong if it isn't listed in init. Although this kind of thing requires type information in addition to the ast if you want to handle cases outside of class methods, so you may need to make a mypy plugin to achieve something like that.
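
A minimal sketch of that self-imposed rule, restricted to the case the AST alone can handle: assignments to self.<name> inside methods of the same class. The function names and the sample class are hypothetical, and the sketch assumes the first parameter is conventionally named self.

```python
import ast


def self_attrs_assigned(func: ast.FunctionDef) -> dict[str, int]:
    """Map attribute name -> earliest line where `self.<name>` is assigned in func."""
    found: dict[str, int] = {}
    for node in ast.walk(func):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.ctx, ast.Store)
            and isinstance(node.value, ast.Name)
            and node.value.id == "self"  # assumption: conventional receiver name
        ):
            found[node.attr] = min(found.get(node.attr, node.lineno), node.lineno)
    return found


def attrs_not_in_init(source: str) -> list[tuple[int, str]]:
    """Report (line, name) for self attributes set in a method but never in __init__."""
    problems = []
    for cls in ast.walk(ast.parse(source)):
        if not isinstance(cls, ast.ClassDef):
            continue
        methods = [n for n in cls.body if isinstance(n, ast.FunctionDef)]
        declared: set[str] = set()
        for method in methods:
            if method.name == "__init__":
                declared |= set(self_attrs_assigned(method))
        for method in methods:
            if method.name == "__init__":
                continue
            for name, line in self_attrs_assigned(method).items():
                if name not in declared:
                    problems.append((line, name))
    return sorted(problems)


EXAMPLE = """\
class Point:
    def __init__(self):
        self.myfield = 0

    def reset(self):
        self.myfield = 0
        self.myfeld = 1
"""

problems = attrs_not_in_init(EXAMPLE)
print(problems)
```

Base classes, attributes set through other names (for example obj.myfeld outside the class), and setattr calls are all ignored here, which is where type information would be needed.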

    • @Graham_Wideman
      @Graham_Wideman 2 years ago +1

      @mCoding Thanks for the reply. I had long ago adopted a policy of adding all attributes in dunder init, as that enables IDE to provide lookup and completion of them, and to generally maintain sanity. It seems to me that we should enable whatever static validation is feasible, and focus runtime testing on runtime (logical) errors. Indeed, type info is the next level up and Python's runtime type resolution can make that difficult, but in a large proportion of cases the lateness of type resolution is not necessary, and it could be deduced or specified in a static manner (especially with type hints).

  • @stiffer_do
    @stiffer_do 2 years ago

    Amazing! Thanks James.

  • @etopowertwon
    @etopowertwon 2 years ago +5

    Honestly, I'd use '\s+.*import' to find local imports (though I never write function-level imports to begin with)
    Another possibility: AST is fun for writing DSLs. You can write an extremely basic transpiler from a subset of python to c#/c++ in just an evening (especially if you are not afraid of littering parentheses to a degree that would make even a lisper blush).

    • @KASANITEJ
      @KASANITEJ 2 years ago

      '\s+.*import' this is not a good idea as this will match commented imports or a string that has import in it.

    • @etopowertwon
      @etopowertwon 2 years ago

      @KASANITEJ Oh no, I will have to manually skip whole 0-2 results (most likely zero).

  • @vsolyomi
    @vsolyomi 2 years ago

    It makes me think - is it possible to bind an ast to runtime? Like - my dream is a function that prints its source file name and line number when executed. Maybe it's not to do with ASTs... I'm not sure how to approach it.
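
One way to get a file name and line number at runtime without touching the AST is to read them from the calling frame. This is only a sketch under assumptions: where_am_i and example are hypothetical names, and inspect.currentframe() can return None on Python implementations without frame support.

```python
import inspect


def where_am_i() -> str:
    """Return 'file:line in function' for the caller (hypothetical helper)."""
    frame = inspect.currentframe()
    if frame is None or frame.f_back is None:
        return "<unknown>"  # no frame support on this interpreter
    caller = frame.f_back
    code = caller.f_code
    return f"{code.co_filename}:{caller.f_lineno} in {code.co_name}"


def example() -> str:
    return where_am_i()


print(example())
```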

  • @kurtmayer2041
    @kurtmayer2041 2 years ago +1

    funnily enough, i wasn't able to figure out how to do the eval in a way that would *not* find the "sneakily assign eval to something else and use that"
    (i actually just visit all the names and if they're eval, i yield an error)

    • @sadhlife
      @sadhlife 2 years ago

      that wasn't the edge case, it was globals()['eval'](...)
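
A small sketch of the two cases in this exchange. Flagging every Name node called eval catches the "assign eval to something else" trick at the point of assignment, but it does not see the string in globals()['eval'](...). The extra string-constant test below is a heuristic added here for illustration, not something from the video, and it still misses computed strings such as 'ev' + 'al'. The function name find_eval_uses is hypothetical.

```python
import ast


def find_eval_uses(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, kind) pairs for suspicious mentions of eval."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id == "eval":
            hits.append((node.lineno, "name"))
        elif isinstance(node, ast.Constant) and node.value == "eval":
            # Heuristic: the literal string 'eval', e.g. as a dict key
            hits.append((node.lineno, "string"))
    return sorted(hits)


code = "f = eval\nf('1 + 1')\nglobals()['eval']('2 + 2')\n"
hits = find_eval_uses(code)  # the code is only parsed, never executed
print(hits)
```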

  • @sagarbhatia7598
    @sagarbhatia7598 1 year ago

    Hello. I am working on research in Change Impact Analysis (CIA). If I compare the ASTs of two versions of a Python file, will this tool be useful for CIA?

  • @bigsmoke6414
    @bigsmoke6414 2 years ago

    tbh, i would have just allowed imports to only be at the very beginning of the files (excluding all comments). That way local imports are illegal, because no non comments are allowed before them and all of this is solved, but the ast is still very interesting

  • @Mertly
    @Mertly 2 years ago

    I feel smarter with every video and dumber with everything I still don't know. 1010/1010

  • @РашидАлимов-з1в
    @РашидАлимов-з1в 2 years ago

    Thanks! This is really helpful information.

  • @andidomi4335
    @andidomi4335 2 years ago +1

    Would it be possible to set up PyCharm to also use these custom linting checks as part of its IntelliSense?

    • @mCoding
      @mCoding 2 years ago

      Not that I'm aware of, but maybe by writing a plugin.

  • @zarifatai
    @zarifatai 2 years ago +1

    Hi James, thanks for the video. I'm having troubles running the flake8_mcoding.py file. I get the following error. Does this have to do with my Python version? I have 3.8.10 installed.
    line 20, in LocalImportsNotAllowed
    def check(cls, node: ast.FunctionDef, errors: list[Flake8ASTErrorInfo]) -> None:
    TypeError: 'type' object is not subscriptable

    • @mCoding
      @mCoding 2 years ago +2

      Yes 3.9 is required for the type hints. You can delete all the type hints that use [], or upgrade to 3.9 (or even 3.10 since that is the current version). Sorry for this inconvenience!

    • @MrRyanroberson1
      @MrRyanroberson1 2 years ago +1

      you can also do: "from typing import List" and replace "list[...]" with "List[...]"; 3.9 was when python decided to integrate that kind of hinting into the base code, without the need for any imported helpers
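
A sketch of the typing.List variant suggested above, which also runs on Python 3.8. The method signature follows the traceback in the question; the fields of Flake8ASTErrorInfo and the body of check are assumptions made so the snippet is self-contained.

```python
import ast
from typing import List, NamedTuple


class Flake8ASTErrorInfo(NamedTuple):
    # Assumed shape; the real class may differ.
    line: int
    col: int
    msg: str


class LocalImportsNotAllowed:
    @classmethod
    def check(cls, node: ast.FunctionDef, errors: List[Flake8ASTErrorInfo]) -> None:
        # typing.List[...] is subscriptable on 3.8; the built-in list[...] needs 3.9+.
        for child in ast.walk(node):
            if isinstance(child, (ast.Import, ast.ImportFrom)):
                errors.append(Flake8ASTErrorInfo(child.lineno, child.col_offset, "local import"))


errors: List[Flake8ASTErrorInfo] = []
func = ast.parse("def f():\n    import x\n").body[0]
LocalImportsNotAllowed.check(func, errors)
```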

  • @VectorAmrit2
    @VectorAmrit2 1 year ago

    This is what I have been looking for for the past 3 days. 😊

  • @94grzech
    @94grzech 2 years ago

    good shit my man

  • @kellymoses8566
    @kellymoses8566 2 years ago

    I really wish version control systems like Git worked on the AST of the code instead of the text. It would have a lot of advantages.

    • @mthf5839
      @mthf5839 2 years ago

      The problem with AST is that you lose all formatting, like whitespace, comments, etc.
      You could use the CST, but I do not see the advantage there (and actually, I think it should be possible to do that with git hooks and some hacks).
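
A short illustration of that loss, assuming Python 3.9+ for ast.unparse. The sample line is made up; the round trip keeps the meaning of the code but drops the comment, the extra spaces, and the redundant parentheses.

```python
import ast

original = "x = (1 +   2)  # keep me\n"
round_tripped = ast.unparse(ast.parse(original))

# Same tree (ast.dump ignores line/column info by default), different text
same_tree = ast.dump(ast.parse(original)) == ast.dump(ast.parse(round_tripped))
print(repr(round_tripped), same_tree)
```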

  • @lphillis1
    @lphillis1 2 years ago +1

    YO this is great

  • @MrRyanroberson1
    @MrRyanroberson1 2 years ago +1

    for imports specifically, i think it would be easy enough to just...
    "any time 'from' or 'import' are indented", but i'm sure advanced code has things like conditional imports and whatnot that are indented, so yeah seems you're right overall

  • @tetraxile
    @tetraxile 2 years ago +3

    discord gang

  • @sadhlife
    @sadhlife 2 years ago +2

    I just wrote a 10k word article on writing your own ast based linter from scratch, coincidence? :P

    • @mCoding
      @mCoding 2 years ago +1

      I'm not familiar with your article, but feel free to post a link, I'm sure some may find it a useful resource.

    • @sadhlife
      @sadhlife 2 years ago

      the link disappears when I post

    • @arisweedler4703
      @arisweedler4703 2 years ago

      @sadhlife what can I google to find your article?

    • @sadhlife
      @sadhlife 2 years ago +2

      @arisweedler4703 learn Python ASTs by building your own linter

    • @niazhimselfangels
      @niazhimselfangels 2 years ago +1

      Fabulous article! Thanks for going so thoroughly, and please keep the posts coming! 😍

  • @DJStompZone
    @DJStompZone 6 months ago

    r"^\s+?(?!#)(import\s+\w+|from\s+\w+\s+import\s+\w+)"
    I get what you're saying, but finding local imports with regex might not have been the best example to use, since that one would actually be pretty easy. You would just need to match "(from ...) import ...", preceded by at least one whitespace character. EZPZ

    • @mCoding
      @mCoding 5 months ago

      Your regex matches a multiline string that contains a local import! Like """
      def f():
          import x
      """
      Try again!
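
A sketch comparing the regex quoted above with an AST walk on that kind of input. Applying the pattern with re.MULTILINE is an assumption (the comment does not say how it would be run), and local_imports is a hypothetical helper name.

```python
import ast
import re

# Pattern copied from the comment above; the MULTILINE flag is an assumption.
PATTERN = re.compile(
    r"^\s+?(?!#)(import\s+\w+|from\s+\w+\s+import\s+\w+)", re.MULTILINE
)


def local_imports(source: str) -> list[int]:
    """Line numbers of import statements that sit inside a function."""
    found = set()
    for func in ast.walk(ast.parse(source)):
        if isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for node in ast.walk(func):
                if isinstance(node, (ast.Import, ast.ImportFrom)):
                    found.add(node.lineno)
    return sorted(found)


# The "import" below is only text inside a multiline string, not code.
tricky = 'DOC = """\ndef f():\n    import x\n"""\n'
# Here it is a real local import.
real = "def f():\n    import x\n"

regex_hits_on_tricky = PATTERN.findall(tricky)
ast_hits_on_tricky = local_imports(tricky)
ast_hits_on_real = local_imports(real)
```

The regex reports a hit inside the string, while the AST walk reports nothing for the string and line 2 for the real function.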

  • @angryman9333
    @angryman9333 1 year ago

    Do Javascript AST baby

  • @markcuello5
    @markcuello5 2 years ago

    Help me

  • @TNeulaender
    @TNeulaender 2 years ago

    Local import regex :P
    / +from\s+\w+\s+import.*/

  • @vazaubaev
    @vazaubaev 6 months ago

    It's better to use a CST: ruclips.net/video/ASRqxDGutpA/видео.html

  • @uuu12343
    @uuu12343 2 years ago

    False, trying to work with *any form* of Regular Expression will be an absolute nightmare