Parsing Java Bytecode with Python (JelloVM Ep.01)

  • Published: 17 Oct 2022
  • References:
    - Specs: docs.oracle.com/javase/specs/...
    - WASM Learning Website: kolumb.github.io/learning-was...
    - Source Code: github.com/tsoding/pjython
  • Science

Comments • 148

  • @vivekascoder
    @vivekascoder 1 year ago +434

    "Money is not the most important thing in your life, what's important is knowing how to parse JVM" -Tsoding

  • @calebharper9567
    @calebharper9567 1 year ago +148

    I like how he reflexively keeps typing semicolons at the end of lines and correcting it

    • @EDFHLFLFF
      @EDFHLFLFF 1 year ago +5

      I always do that in matlab too, mostly because sometimes you're meant to and sometimes not 🤷

    • @raidensama1511
      @raidensama1511 1 year ago +1

      He could leave them in, it would still work.

  • @desertfish74
    @desertfish74 1 year ago +59

    We had Jython in the past (java running python programs) now Pava (?) Python running Java. Letsgoo

    • @reinhold1616
      @reinhold1616 1 year ago +12

      java running python running java running python running java when?

    • @ndrechtseiter
      @ndrechtseiter 1 year ago +5

      @reinhold1616 milk inside a bag of milk inside a bag of milk

    • @ratofthecity6351
      @ratofthecity6351 5 months ago

      @ndrechtseiter REAL

    • @ratofthecity6351
      @ratofthecity6351 4 months ago

      @ndrechtseiter holy shit milk game

    • @angelcaru
      @angelcaru 3 months ago

      @ratofthecity6351 milk mentioned

  • @mandrak87
    @mandrak87 1 year ago +64

    I just wanted to say how much I enjoy watching your videos. It is the perfect balance between top quality educational content and really funny/entertaining jokes mixed in. You truly are a one of a kind engineer. Keep up the fantastic work and I hope things are not too bad for you in Russia. Tsoding rocks 👍

    • @craigcraig6248
      @craigcraig6248 1 year ago +6

      Yeah this is a really underrated channel for a cs nerd like me

  • @smergibblegibberish
    @smergibblegibberish 1 year ago +18

    He hadn't uploaded since the Russian mobilization started. I had started to worry for him. Glad to see he is alright.

  • @simonetii
    @simonetii 1 year ago +35

    in python3 with f strings you can do
    foo = "a"
    print(f"{foo=}")
    which will print foo = "a"
    basically if you put an equal sign at the end of the block it will evaluate the expression and return its value and the expression itself as string
    great video as always bro

    • @NathanChambers
      @NathanChambers 1 year ago +5

      wrong, print(f"{foo=}") will print foo='a'. If you want the spaces, for example print(f"{foo= }"), that would print foo= 'a'. Point being, it doesn't automatically add a space before and after the = like your example :)
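
A minimal sketch of the f-string `=` specifier discussed in this thread, assuming Python 3.8 or newer. The variable names `plain` and `spaced` are only for illustration. It shows that the value is rendered with `repr()` and that whitespace around `=` is kept exactly as written:

```python
foo = "a"

# A trailing "=" inside the braces emits the expression text, "=", then repr(value).
plain = f"{foo=}"

# Whitespace around "=" is copied through as written; none is added automatically.
spaced = f"{foo = }"

print(plain)   # foo='a'
print(spaced)  # foo = 'a'
```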

  • @jozef_kascak
    @jozef_kascak 1 year ago +11

    Glad to see you back. I hope you are safe from everything.

  • @eboubaker3722
    @eboubaker3722 1 year ago +7

    Ooh man my favorite channel makes a video about my favorite language java best day ever

  • @mrmaniac9905
    @mrmaniac9905 1 year ago +4

    Glad to see you back keep up the content, I love watching it in the background at work!

  • @accountname1047
    @accountname1047 1 year ago +7

    You are a wizard, love watching you work

  • @lucifer-5ybtn
    @lucifer-5ybtn 1 year ago +7

    good to see you’re back🎉

  • @1vader
    @1vader 1 year ago +47

    Some reasons why the binaries contain the class names:
    - debugging output, e.g. exception stacktraces
    - reflection
    - methods that print the class name, including default toString implementations and getClass().getName()

    • @DeathSugar
      @DeathSugar 1 year ago +2

      is there a way to strip them? or the only way to do it is via JIT/AOT?

    • @laurensweyn
      @laurensweyn 1 year ago +5

      @DeathSugar You can strip class names via an obfuscator. This effectively renames all your classes to 'A', 'B', 'C'... 'AA', 'AB' and so on. Great if you want to avoid reverse engineering, not so great if you want to understand a stack trace or any errors/debug output

    • @DeathSugar
      @DeathSugar 1 year ago +1

      @laurensweyn is it a built-in thing in Java? Can this data be moved into some dSYM file, like CXX does?

    • @SnackLive
      @SnackLive 1 year ago +1

      @DeathSugar Don't remember if it's actually built into the compiler, but there's a lot of tooling dedicated to obfuscation and code reduction

    • @pozdroszejset4460
      @pozdroszejset4460 1 year ago

      @DeathSugar ProGuard is a popular one

  • @jasonkary8431
    @jasonkary8431 6 months ago +3

    Being an old guy, I know why there are \r and \n... It's from the teletype machine days.
    \r is a carriage return, where the print head is physically returned to the start of the line.
    \n is a line feed, which physically scrolled the paper up one line. No collusion involved. ;)

    • @VojtaJavora
      @VojtaJavora 6 months ago

      Right, but once nobody used teletypes, why did each of them choose a different line ending?

  • @mattcoley
    @mattcoley 1 year ago +8

    35:00 - Lol welcome to the spec. It literally says on that page they regret some of these decisions. But yeah, it's not a horribly complicated spec, it just has some quirks.
    43:30 - Arrays of 'constants' of each type would actually be kinda cool.

  • @bassguitarbill
    @bassguitarbill 1 year ago +3

    Really entertaining video, looking forward to the rest of this!

  • @TheBigLou13
    @TheBigLou13 1 year ago +1

    This was interesting to watch and I learned a lot. Thank you! :)
    You're clever and a quick thinker.

  • @BulletHeadPL
    @BulletHeadPL 1 year ago +3

    im so happy i can listen to u again

  • @Fikerus2
    @Fikerus2 1 year ago +23

    2:52 He checked the date to say what year it is

  • @LordMardur
    @LordMardur 1 year ago +5

    18:30 I think the "intention" of Windows (or DOS) having both line feed and carriage return together is a simpler printer driver, and laziness. Early matrix printers had different commands for line feed (moving the paper) and carriage return (returning the print head to the home position, this is where the name comes from). With having both commands directly in the text document, a printer driver can simply send the file byte by byte to the printer and it does the right thing. Unix and Mac preferred a single command inside text files, since in a text document there is not really a concept of carriage return or line feed, just new line.
    For parsing binary data from bytes, I recommend struct.unpack. It is a standard package, which allows for easier reading (and writing) of entire structures and can handle endianness as well.

    • @5omebody
      @5omebody 1 year ago +1

      i feel like it's less laziness, and more... DOS probably cared more about compatibility with existing standards, whereas unix and mac (also unix) *probably* just figured, you're probably not printing your text files, let alone your code, so why not keep it simple
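
The `struct.unpack` suggestion earlier in this thread could look roughly like the sketch below. It reads the first four fields of a class file (magic, minor version, major version, constant pool count), which are big-endian. The helper name `read_header` and the sample values are made up for illustration and are not taken from the stream's code:

```python
import io
import struct

def read_header(f):
    """Read the fixed-size start of a class file from a binary stream."""
    # ">" = big-endian, I = u4, H = u2:
    # magic, minor_version, major_version, constant_pool_count
    magic, minor, major, cp_count = struct.unpack(">IHHH", f.read(10))
    if magic != 0xCAFEBABE:
        raise ValueError(f"not a class file: magic={magic:#x}")
    return {"minor": minor, "major": major, "constant_pool_count": cp_count}

# Hand-built sample bytes (hypothetical values) so the sketch runs without a .class file.
sample = struct.pack(">IHHH", 0xCAFEBABE, 0, 52, 29)
header = read_header(io.BytesIO(sample))
print(header)
```

With a real file, the same function would be called on `open(path, "rb")`. One format string replaces four separate `int.from_bytes` calls and handles endianness in a single place.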

  • @coder4937
    @coder4937 1 year ago +3

    I was waiting for your video

  • @morgengabe1
    @morgengabe1 1 year ago +1

    dude, great to see you're safe! was worried when you went quiet after the draft was announced!

  • @Yash42189
    @Yash42189 1 year ago

    ure the best programming related youtube content creator. keep up the good work

  • @rogo7330
    @rogo7330 1 year ago +21

    By the way, try to disable "Turbo" (also called p-state). You can do that with `echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo`. This will stop your CPU from running at extra sShpPpiIieeed, and it will no longer give you heatstroke while you're using your computer.

  •  1 year ago +2

    I love that there is the development process captured in this!
    Building custom JVM is cool, but I like to see you adding new features and suddenly saying "this can go to a separate function".
    It is such a great learning material! I wish something like this was done 15 years ago when I started to code :)

  • @filipmajetic1174
    @filipmajetic1174 1 year ago +23

    Such a shame we had to give up Turing completeness to get Unicode in python 3...
    Btw, Java 18 or 19 finally got "script" files, where you can just start writing code without a public static void main and it just works™

    • @Rene-tu3fc
      @Rene-tu3fc 1 year ago +8

      maybe by java 23443 we'll have static function files, where we don't need to encapsulate public static functions within classes

    • @sagnikc4
      @sagnikc4 1 year ago

      jshell ?

    • @keineangabe8993
      @keineangabe8993 1 year ago

      @Rene-tu3fc I actually don't have a problem with that, we need some way of accessing functions like that anyway, I see utility classes as some kind of namespace.

    • @orizach01
      @orizach01 1 year ago +3

      JavaScript in real life

  • @claudiusraphael9423
    @claudiusraphael9423 8 months ago +1

    Just "reading" that title alone made me want to comment: You really are the Sado-Masochistic Dominatrix of Programming. G-sus Krajst: This is the epitome of perversion, lol - and i thought generating 6502-assembly from inside a BASIC V2 program and poking it live to replace the running kernel by triggering a soft-reset would be nasty - this is a whole nother level of dark-arts. Razpackt!

  • @alpers.2123
    @alpers.2123 1 year ago +6

    import json
    def pprint(obj): print(json.dumps(obj, indent=4, sort_keys=False))

  • @ShotgunLlama
    @ShotgunLlama 1 year ago

    Well this definitely got me down a deep rabbit hole

  • @JanBebendorf
    @JanBebendorf 1 year ago +2

    We built a transpiler that can target multiple languages instead (we made backends for Lua and PHP but built it to be extensible). The subset of the stdlib that we implemented is very small as it stands, but it is also built to be extended using Java classes with native methods and annotations that provide the code for the different languages. The produced code also doesn't look great because it replicates the behavior instruction by instruction, but it actually works quite well.

  • @jonasls
    @jonasls 1 year ago +11

    Super cool, can't wait for more! Maybe an interpreter?

  • @thegate8985
    @thegate8985 1 year ago +2

    Hi (or privet, as a Russian I hear russian accent in your speech :D)! Thanks for the video :)

  • @salamemilano
    @salamemilano 1 year ago +8

    1:23:11 the best part of the entire live

  • @TenderBug
    @TenderBug 1 year ago +1

    Someone is funnier and more sarcastic than usual in this stream 🤣
    Nice vid Tsoding. Learned a lot.

  • @L0wPressure
    @L0wPressure 1 year ago

    My man, every time i watch your videos i feel stupid, but at the same time i always learn something new :) Let's hope all that shit ends soon, i wish us all peace.

  • @superscatboy
    @superscatboy 1 year ago

    You're a lunatic, man. Never change :)

  • @paulfragemann3333
    @paulfragemann3333 1 year ago +13

    There are a lot more incompatible changes between Python 2 and 3, but print is the most noticeable syntactically. The way strings work changed completely: the str type of Python 2, which was basically just a list of bytes, was split into 3 different types: str (Unicode string), bytes (immutable list of bytes), and bytearray (mutable list of bytes), breaking a lot of programs that mess around with binary files. And there is a lot more that is even less noticeable than that.

    • @laurensweyn
      @laurensweyn 1 year ago +3

      Exactly, the differences are far bigger than print() -- most libraries wouldn't take nearly so long to port to Python 3 if that was the only change, especially since most use proper logging instead of prints which were unchanged.
      The division operator '/' always returning a floating point number in Python 3 instead of also doing integer division like in Python 2 (which is now '//') is a particularly nasty one I remember for example.

    • @CraftMine1000
      @CraftMine1000 1 year ago +1

      Let's not forget import pathing changed slightly from 2 to 3 also

    • @TsodingDaily
      @TsodingDaily 1 year ago +12

      @Elsevar Asadov I mean, to realize that this was a joke you need to watch that bit until the end. You can't expect such a huge attention span from an average YouTube commenter, especially after the introduction of YouTube Shorts.

    • @laurensweyn
      @laurensweyn 1 year ago +2

      @TsodingDaily I had this 2.5 hour video in my recommendations, never seen this channel before, and first impressions matter. I got the impression you didn't know what you were talking about (and I've heard plenty of beginners complain about this very thing during early Python 3 days), and if that were a sign of what's to come for the rest of the video, I'm not sitting through 2.5 hours of this, especially if I have many more creators to choose from.
      I'm sorry my judgement was incorrect, but I'm sure I wasn't the only one to come to this conclusion. Maybe something to consider for future videos. Or don't, I won't stop you.

    • @sandworm9528
      @sandworm9528 1 year ago +4

      @laurensweyn you only have to watch 4m 15s to hear him say he's joking. But go ahead and explain how valuable your opinions are
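
Two of the Python 2 to 3 changes listed earlier in this thread, the `str`/`bytes`/`bytearray` split and `/` versus `//`, can be seen in a short Python 3 sketch. The variable names are only for illustration:

```python
text = "héllo"                # str: a sequence of Unicode code points
raw = text.encode("utf-8")    # bytes: an immutable sequence of integers 0..255
buf = bytearray(raw)          # bytearray: the mutable counterpart of bytes
buf[0] = ord("H")             # in-place edit is allowed on bytearray only

lengths = (len(text), len(raw))          # "é" takes two bytes in UTF-8
roundtrip = bytes(buf).decode("utf-8")   # back to str

true_div = 7 / 2     # "/" always yields a float in Python 3
floor_div = 7 // 2   # "//" is floor division

print(lengths, roundtrip, true_div, floor_div)
```

This matters for a class file parser because a file opened with `"rb"` yields `bytes`, and indexing `bytes` gives integers rather than one-character strings.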

  • @amosaidoo5741
    @amosaidoo5741 10 months ago

    That was fun to watch

  • @ecampo123
    @ecampo123 1 year ago +1

    The real OOP were the friends we made along the way

  • @alexandrohdez3982
    @alexandrohdez3982 1 year ago

    Again GREAT VIDEO 👏👏👏👏

  • @BalintCsala
    @BalintCsala 1 year ago +6

    I admittedly haven't watched through the whole video, but since you are using python 3 I recommend using format strings in prints, you can do a lot of the stuff you did effortlessly
    e.g.
    print(f"{foo = }") results in "foo = "

    • @zohnannor
      @zohnannor 1 year ago

      but without pretty-printing, although he could've used `pprint.pformat`.

    • @BalintCsala
      @BalintCsala 1 year ago +1

      @zohnannor I was mostly talking about stuff around 20 minutes
      Also, for pretty printing as others mentioned json _is_ better

    • @zohnannor
      @zohnannor 1 year ago

      @BalintCsala yes i laughed when he tried `json.dump` without the `indent` parameter, saw that it didn't print like he thought and immediately threw that code away xD so close yet so far

  • @annybodykila
    @annybodykila 5 months ago

    Knowledge of reflection and injection would be helpful for a project like this 😉

  •  1 year ago

    1:47:28 When you said "holy fucken shit"... I felt it in my bones.

  • @brvtalcake
    @brvtalcake 1 year ago

    I really like your spying videos

  • @marceloxsweet1358
    @marceloxsweet1358 11 months ago

    Jython, perfect name dude ❤

  • @devqbasic2384
    @devqbasic2384 1 year ago +2

    can you write assembly on an os level ? It's fun and easy if you stay 16bit because you still have the bios.

  • @rafagd
    @rafagd 1 year ago +7

    I know it's too late to suggest a name, but if Jython is Python on JVM, JVM in Python needs to be Pava or PVM

  • @1495978707
    @1495978707 1 year ago +2

    6:00 for the thing about open source as a means to exploit young enthusiastic programmers

  • @berndeckenfels
    @berndeckenfels 3 months ago

    1:38:10 Descriptor is the ()V (void) return signature with no args
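
To illustrate the `()V` descriptor mentioned above, here is a small sketch of a method descriptor parser. It assumes the standard JVM descriptor letters (`B`, `C`, `D`, `F`, `I`, `J`, `S`, `Z`, `V`, `L...;`, `[`). The function names `parse_type` and `parse_method_descriptor` are hypothetical and not from the stream's source:

```python
BASE_TYPES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}

def parse_type(desc, pos):
    """Parse one type starting at desc[pos]; return (readable_name, next_pos)."""
    ch = desc[pos]
    if ch == "[":                      # array: one "[" per dimension
        inner, pos = parse_type(desc, pos + 1)
        return inner + "[]", pos
    if ch == "L":                      # object type: Lpkg/Class;
        end = desc.index(";", pos)
        return desc[pos + 1:end].replace("/", "."), end + 1
    return BASE_TYPES[ch], pos + 1     # primitive or void

def parse_method_descriptor(desc):
    """Split "(params)ret" into (list_of_param_types, return_type)."""
    if not desc.startswith("("):
        raise ValueError(f"not a method descriptor: {desc!r}")
    pos, params = 1, []
    while desc[pos] != ")":
        name, pos = parse_type(desc, pos)
        params.append(name)
    ret, _ = parse_type(desc, pos + 1)
    return params, ret

print(parse_method_descriptor("()V"))
print(parse_method_descriptor("([Ljava/lang/String;)V"))
```

The second call uses the descriptor of a typical `main(String[] args)` method.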

  • @white_145
    @white_145 8 months ago

    2:08:58
    could use the ,= operator (which is just weird tuple syntax)
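
A short sketch of the `,=` idea from the comment above. `struct.unpack` always returns a tuple, even for a single field, so one-element tuple unpacking pulls the value out directly. The sample bytes are made up:

```python
import struct

data = b"\x00\x2a"  # one big-endian u2

as_tuple = struct.unpack(">H", data)    # a one-element tuple: (42,)
value, = struct.unpack(">H", data)      # the ",=" spelling: unpack a 1-tuple
(same,) = struct.unpack(">H", data)     # equivalent, with explicit parentheses

print(as_tuple, value, same)
```

The parenthesized form does the same thing and is harder to misread as a plain assignment.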

  •  1 year ago

    2:03:50 Yes! This is me programming in PHP 15 years ago.
    "Why does the function always return null? Ah, because I haven't included `return result` at the end..."

  • @kvikende
    @kvikende 1 year ago

    Tsoding adding all those parentheses might be his unconscious telling him to learn LISP

  • @CarterColeisInfamous
    @CarterColeisInfamous 1 year ago

    3:26 that is correct

  • @hatkidchan_
    @hatkidchan_ 1 year ago +2

    Hehe... Пива (beer)... Ehehehhehhehhh...

  • @samholland209
    @samholland209 1 month ago

    Since you did this, why not also write a program in Python that can decompile Java bytecode?

  • @arthurlokhov6856
    @arthurlokhov6856 1 year ago +1

    what font are you using?

  • @hwstar9416
    @hwstar9416 1 year ago

    I think this is the third time you've changed the thumbnail now 😂

  • @aciddev_
    @aciddev_ 1 year ago +2

    imagine running this through jython

  • @jerssh
    @jerssh 1 year ago

    19:35
    Once a bunch of files I was reading in were getting cut off at random points. It turned out the binary data coincidentally contained EOF markers, and reading it in non-binary mode would just make it quit reading the file when it ran into them.
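
A minimal sketch of reading binary data safely, as the comment above recommends. Which byte tripped the commenter is not stated. `0x1A` (Ctrl-Z) is used here as an assumed example, since some text-mode readers on DOS/Windows treat it as end-of-file. The payload is made up:

```python
import os
import tempfile

# Bytes that a text-mode reader might mangle: CRLF, a Ctrl-Z (0x1A), a NUL.
payload = b"\xca\xfe\xba\xbe\r\n\x1a\x00rest"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # "rb" disables newline translation and any text decoding,
    # so every byte comes back exactly as stored.
    with open(path, "rb") as f:
        data = f.read()
finally:
    os.remove(path)

print(data == payload, len(data))
```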

  • @Lars-ce4rd
    @Lars-ce4rd 2 months ago

    I guess tsoding was so afraid of enabling JVM developers that he decided to use python for this task.

  • @i007c
    @i007c 1 year ago

    python struct module is also a good option

  •  1 year ago

    1:23:10 As a CS graduate, I confirm.

  • @ilikegeorgiabutiveonlybeen6705
    @ilikegeorgiabutiveonlybeen6705 3 months ago

    hahaha, beer

  • @theodorealenas3171
    @theodorealenas3171 2 months ago

    1:04:00 it's hard to convince my peers to do this versus blop chunks of code and debug little by little

  • @skr-kute1677
    @skr-kute1677 1 year ago

    i love the jokes here n there

    • @skr-kute1677
      @skr-kute1677 1 year ago

      bruh, like really as im watchin the vid, it do be extra funny and interesting to watch

  • @fullstack_journey
    @fullstack_journey 1 year ago +2

    Oh the inexplicably fantastic horror. I cannot look at it but i cannot look away from it

  • @remrevo3944
    @remrevo3944 1 year ago +1

    Java's interpretation of UTF-8 is going to be a pain, because if you start using emojis in Java strings there are going to be problems.
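
The problem hinted at above is that class files store strings in "modified UTF-8": NUL is written as the two bytes `C0 80`, and characters outside the Basic Multilingual Plane (such as most emoji) are written as a surrogate pair, each half encoded as three bytes. A hedged sketch of a decoder follows. The function name `decode_modified_utf8` is hypothetical, and the input is assumed to be well-formed:

```python
def decode_modified_utf8(data: bytes) -> str:
    """Decode the JVM's modified UTF-8 into a normal Python str."""
    # NUL is stored as C0 80; the byte C0 never occurs in standard UTF-8,
    # so a plain replace is safe on well-formed input.
    data = data.replace(b"\xc0\x80", b"\x00")
    # "surrogatepass" lets the 3-byte encoded surrogate halves through
    # as lone surrogate code points instead of raising an error.
    halves = data.decode("utf-8", "surrogatepass")
    # Round-tripping through UTF-16 merges each high/low pair into one character.
    return halves.encode("utf-16", "surrogatepass").decode("utf-16")

# U+1F600 as the JVM stores it: surrogates D83D and DE00, three bytes each.
emoji = decode_modified_utf8(b"\xed\xa0\xbd\xed\xb8\x80")
print(emoji == "\U0001F600", len(emoji))
```

A plain `data.decode("utf-8")` would raise `UnicodeDecodeError` on these six bytes, which is the pain the comment predicts.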

  • @pishax3056
    @pishax3056 1 year ago +1

    beer))

  • @replikvltyoutube3727
    @replikvltyoutube3727 1 year ago +3

    Make porth compile to jvm

  • @rogo7330
    @rogo7330 1 year ago +4

    How on Earth does Java take so long to compile a simple write to stdout?

  • @donovanvanderlinde3478
    @donovanvanderlinde3478 1 year ago

    Hmm
    Feels like Porth all over again
    You say it’s just for hello world but I bet this will end up being a lot more 😂😂

  • @Richarddesk
    @Richarddesk 1 year ago +1

    oracle jdk😄

  • @jebarchives
    @jebarchives 1 year ago

    :D

  • @cycomkid
    @cycomkid 1 year ago

    I liked the video but dislike the fact that you are not getting the money. If there is anything I can do, let me know; I will be happy to support. I am from India, and Russia is a good friend of India

  • @TheBasyx
    @TheBasyx 1 year ago

    print without () had security issues

  • @D0Samp
    @D0Samp 1 year ago

    Next step, writing a Java library to parse CPython byte code.

  • @rodelias9378
    @rodelias9378 1 year ago

    "It's doable"

  • @akam9919
    @akam9919 1 year ago

    My insides feel weird reading the title.

  • @gargleblasta
    @gargleblasta 1 year ago

    I want to code something simple today he says 😂

  • @Un0rdin4rYPr0gr4mmeR
    @Un0rdin4rYPr0gr4mmeR 1 year ago

    Has nobody commented on "P0rn folder size at 14:45" - Too smol PepeHands :D

  • @frechjo
    @frechjo 1 year ago

    (in Esperanto) I see comments in English, I see comments in Russian, but where are the comments in the best language?
    It seems I have to take care of that myself! Also, algorithmic engagement.

  • @ErikOrjehag
    @ErikOrjehag 1 year ago +2

    Did you move apartment?

    • @mbarrio
      @mbarrio 1 year ago

      Maybe he got back to parents house, Kemerovo, or Novokuznetsk.

  • @joaomendoncayt
    @joaomendoncayt 1 year ago

    are we not going to talk about 2:52?
    Edit: Ok, I should've just kept watching...

  • @meaningfulname9437
    @meaningfulname9437 1 year ago

    So you were in Russia that time?😀

  • @valshaped
    @valshaped 1 year ago

    The most important thing we can do to combat AI code generator agents is write massive piles of the sh*ttiest code on the planet, like bash scripts that generate bash scripts

  • @ThePoke151
    @ThePoke151 1 year ago

    23:45 I came, but only because I like pain or something

  • @MultiSuperUnicorn
    @MultiSuperUnicorn 1 year ago

    open source and hackathons are one and the same :D

  • @Akhulud
    @Akhulud 1 year ago

    21:10 it's because return is a keyword and print is a function but was a keyword in py2

  • @l.iwakura6553
    @l.iwakura6553 1 year ago

    sorry, but how doesn't your back hurt when you stay seated for such a long time?

  • @thehackr258
    @thehackr258 3 months ago

    14:07 is really hot 🔥 😂😂😂!!
    in Russia 😂

  • @skeleton_craftGaming
    @skeleton_craftGaming 1 year ago

    Yeah, the whole point of Java is that you don't need to know how Java works internally.

  • @MrOboema
    @MrOboema 1 year ago +3

    Java Bytecode in Python. Wow. How's the performance? 🤣

    • @objectobject5889
      @objectobject5889 1 year ago +4

      3x faster. In the development time domain ;)

    • @MrOboema
      @MrOboema 1 year ago +2

      @objectobject5889 but 10x slower in execution and the next Python version won't be able to run it without major changes. Got it, like usual Python code...got it 😁

  • @replikvltyoutube3727
    @replikvltyoutube3727 1 year ago

    It would be rly cool if you could help free pascal with jvm backend

    • @superscatboy
      @superscatboy 1 year ago

      I think Free Pascal can already target the JVM.

    • @replikvltyoutube3727
      @replikvltyoutube3727 1 year ago

      @superscatboy it does, but it was not upstream last time I checked, and it is based on the really old Jasmin library

    • @superscatboy
      @superscatboy 1 year ago

      @replikvltyoutube3727 Fair enough, I just remember hearing that it was a thing it could do.

  • @nibrobb
    @nibrobb 9 months ago

    1:41:00 incoming call

  • @replikvltyoutube3727
    @replikvltyoutube3727 1 year ago

    Maybe yer laptop needs to be cleaned from dust ?

  • @zgliu8018
    @zgliu8018 1 year ago

    Un-staticize your interpreted language 😈

  • @sossupummi
    @sossupummi 1 year ago

    first!

  • @aciddev_
    @aciddev_ 1 year ago

    Maybe you should create a Russian channel and upload Russian voice-overs there, or add subtitles here?

  • @aqqq4097
    @aqqq4097 1 year ago

    Whyyyyy? Just whyyy

  • @toastedgralic4868
    @toastedgralic4868 4 months ago

    8:50
    en.wikipedia.org/wiki/Java_class_file#:~:text=the%20sourcefile%2C%20etc.)-,Magic%20Number,-%5Bedit%5D