These are interesting beyond any reason, thank you so much for still making them. I've been here for over 4 years, since I had just started uni, and your videos of you in the car inspired me a lot. I adopted your C++ style and a lot of the commit etiquette, and of course changed it to fit my needs. I'm happy in a job now, finished uni, and am married, and I think these videos are one of the few things that keep bringing me joy and give me new things to learn, even if they're small things sometimes.
Thanks for making these. I doubt you remember my comments from 4 years ago, you responded to them, I always had a good time here :)
Hi Lion! Glad to see you're still here, all grown up with a job and wife! Congratulations on making it
Thank you!!! I appreciate that - and you came so far, too!
"And the sausage stops growing" Never heard that one
"Implement parts of our JS engine in JS and gain performance benefit from it"
...writing down to my notebook of sentences that have not been uttered ever
😂
Thanks for all you do Andreas and also thanks for uploading to youtube, I enjoy it. I wish you a good weekend.
Thanks for stopping by! I wish you a good weekend as well
\x1b[0;30mIs this how you do it?
color now?.
How did you color :^) and how did you do it that when I copy it I only get "caret"
@@theevilcottonball You have to become a channel member to use that emoji.
Yes! Finally got an hour long video. There is no other channel that I can just sit and watch someone for an hour straight and still wish that the video was longer. Thanks for posting!
Careful with the `typeof R === 'object'` because in javascript `null` is also an object!
interestingly enough, this is actually more of a misfeature than a bug. in the earliest javascript engines, there was a feature called “liveconnect” which allowed for direct interoperability with java classes. unfortunately, because all java objects are nullable & in spite of javascript already having undefined (in practice if not necessarily defined as a keyword until quite a bit later), javascript ended up with an empty, nonexistent object called null which, entirely for long-defunct interoperability reasons, has a typeof object
@@REALsnstruthers fantastic comment - thanks for sharing that info!
@@REALsnstruthers this doesn't appear to be accurate. liveconnect is a library rather than a language feature. the explanation I saw from the creator of JS was that null meant an object with a pointer of 0, akin to NULL in C in which it was written
@@kreuner11 browsing the netscape archives of the internet archive, liveconnect was a feature that was used as part of their browser integration with java, including with applets
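To make the `typeof` caution at the top of this thread concrete, here is a minimal sketch of an object check that also rules out `null`. The helper name `isObject` is made up for illustration; it is not taken from the video or from any engine.

```javascript
// Hypothetical helper (the name is an assumption): true only for values that
// are objects in the spec's sense, which includes functions but not null.
function isObject(value) {
  // typeof null is "object", so null has to be excluded explicitly.
  return value !== null && (typeof value === "object" || typeof value === "function");
}

console.log(typeof null);      // "object"
console.log(isObject(null));   // false
console.log(isObject(/foo/g)); // true
```

A plain `typeof R === 'object'` test would accept `null` and reject functions, which is why the extra comparison is usually needed.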
very cool experiment! I remember recommending this some time ago so I am really interested to see how it would go
My initial thought when seeing the JS was to use an array and join them instead of doing the string manipulation for every flag, you then mentioned the string concat exactly when I was about to comment so you were ahead of me 🤠
I mean, you could also compile those to byte code while compiling lb, so you don't have to compile them to byte code on execution. :)
I'm sure he knows. This sounded a whole lot more like a "I wonder if this would work" type of thing.
Thanks that was very amusing! i wrote a lot of js when node was new and you still had to use brains to make it run fast in the browser.
Very good stuff to learn something interesting. Thanks for it!
Just curious, why not make object_get from C++ benefit from the cache instead? It’s probably even faster, and speeds up every call rather than just the ones on this specific built-in you’ve ported.
That was my thought too. But still, performance isn't really the current top priority, and if it turns out in the future that this code ends up as a candidate for optimization in the future maybe it will eventually be rewritten in C++ with better optimizations.
I think this was really just intended as an experiment (and something that would make for a good video).
The bitmask optimization should go into LB regardless of the other tests you did.
Certainly yeah, just gotta clean it up!
@@awesomekling That optimization was not that hacky, should be easy to polish up.
@@awesomekling you can also use popcount (number of high bits) to allocate the right amount of bytes
Don't those flags need to be left-shifted on creation, or in the enum? It's been ages since I did C++, so maybe I'm misunderstanding...?
Oh yes, you are absolutely right! So the code was buggy. With that fixed, performance doesn’t change though :) Great observation dude!
@@awesomekling Woohoo! Thanks! For everything! I love these videos, thank you so much, tussentack!
I also noticed that, seems like I wasn't the first one though 😸
Good thing he has a copilot to copilot 😅
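The two points raised in this thread (flags needing a left shift, and popcount giving the size of the result) can be sketched as follows. The flag names and bit positions are assumptions for illustration, not the engine's actual enum, and the sketch is in JavaScript rather than C++.

```javascript
// Each flag gets its own bit. Names and bit order are assumptions.
const Flag = Object.freeze({
  HasIndices:  1 << 0,
  Global:      1 << 1,
  IgnoreCase:  1 << 2,
  Multiline:   1 << 3,
  DotAll:      1 << 4,
  Unicode:     1 << 5,
  UnicodeSets: 1 << 6,
  Sticky:      1 << 7,
});

// Without the shifts (values 0, 1, 2, 3, ...) flags would overlap:
// 1 | 2 === 3, which looks the same as a single flag whose value is 3.

// Count set bits (popcount) by clearing the lowest set bit each iteration.
function popcount(mask) {
  let count = 0;
  while (mask !== 0) {
    mask &= mask - 1;
    count++;
  }
  return count;
}

const mask = Flag.Global | Flag.IgnoreCase | Flag.Sticky;
console.log((mask & Flag.Global) !== 0);    // true
console.log((mask & Flag.Multiline) !== 0); // false
console.log(popcount(mask));                // 3
```

Since every flag contributes exactly one character to the flags string, the popcount of the mask is also the length of that string, which is what makes it usable for sizing a buffer up front.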
i may be missing something here, but if those flags are basically immutable after the regexp object is created, why dont you build that string on object creation and just ask for it later?
you could even build a lookup table with all possible strings and use that bitfield to index it. then you just can return a reference to the string and save on memory big time
Thank you, I was missing your longer videos
Very interesting, thank you for the video prototyping this
Nice video, AK!
Why did you change your OS from Linux to macOS? Just curious.
I really missed the Instruments profiler, so I bought a MacBook Pro and started using it :)
@@awesomekling I also noticed that you started using a debugger instead of printf() debugging. What changed?
Trying to use some new tools! But I still use printf debugging every day :)
@@awesomekling I've noticed you specifically used "old" Instruments. Where did you get the old version?
It makes perfect sense that avoiding the contains() should make best and worst case closer in speed.
Unless I'm mistaken checking every flag would be O(nm) where n is the number of flags on the Regex and m is the number of flags to check against. Turning the flags into a bitmask would flatten this into O(m) since checking the bitmask is O(1) rather than contains() which is O(n).
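A rough sketch of the comparison made in this comment, assuming the flags arrive as a string of valid flag characters. The `BIT` table and the function names are invented for illustration.

```javascript
// One bit per flag character (assumed layout, spec order "dgimsuvy").
const BIT = {
  d: 1 << 0, g: 1 << 1, i: 1 << 2, m: 1 << 3,
  s: 1 << 4, u: 1 << 5, v: 1 << 6, y: 1 << 7,
};

// String-based check: includes() scans the flags string on every query.
function hasFlagInString(flags, ch) {
  return flags.includes(ch);
}

// Convert once: a single pass over the flags string, setting one bit per flag.
// Assumes `flags` only contains valid flag characters.
function toMask(flags) {
  let mask = 0;
  for (const ch of flags) mask |= BIT[ch];
  return mask;
}

// Bitmask-based check: one AND, independent of how many flags are set.
function hasFlagInMask(mask, bit) {
  return (mask & bit) !== 0;
}

const flags = "gi";
const mask = toMask(flags);
console.log(hasFlagInString(flags, "i"), hasFlagInMask(mask, BIT.i)); // true true
console.log(hasFlagInString(flags, "y"), hasFlagInMask(mask, BIT.y)); // false false
```

With at most eight flags the asymptotic difference is tiny in absolute terms; the practical gain is more likely to come from replacing a string search with a single integer operation per flag.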
Since there are only 8 flags, that means there are 256 possible string values, so can't you precompute those and use a LUT?
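A minimal sketch of the lookup table suggested here, assuming bit `i` of the mask corresponds to the `i`-th character of `"dgimsuvy"`. That bit layout and the names `FLAG_CHARS` and `FLAGS_TABLE` are assumptions.

```javascript
const FLAG_CHARS = "dgimsuvy"; // assumed: bit i of the mask means FLAG_CHARS[i]

// Precompute the flags string for every one of the 2^8 = 256 possible masks.
const FLAGS_TABLE = Array.from({ length: 1 << FLAG_CHARS.length }, (_, mask) => {
  let s = "";
  for (let bit = 0; bit < FLAG_CHARS.length; bit++) {
    if (mask & (1 << bit)) s += FLAG_CHARS[bit];
  }
  return s;
});

console.log(FLAGS_TABLE.length);      // 256
console.log(FLAGS_TABLE[0b00000110]); // "gi"
console.log(FLAGS_TABLE[0b11111111]); // "dgimsuvy"
```

As other comments in this section point out, the flag getters can be overridden, so a table like this could probably only back a fast path for unmodified RegExp objects.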
IIRC, both Mozilla and Google implement some of the JS functions in JavaScript.
Yeah, WebKit as well. It does make a lot of sense :)
Curious to know if the c++ version could also cache its work and be faster still
There’s definitely ways to achieve that, but then we’d lose out on the comfy vibes of writing JS in JS! 😎
I was wondering that too! That seems like it would have a more profound impact to me, but maybe there's some limitation as to why that's not easy to do. I might spend some time this weekend messing around with it though.
@@awesomekling Hah, well then by all means :)
100% my thought too. Why not bring that caching to C++? But hey, if they're moving to swift, may as well rewrite it in that eventually once the limits of the JS engine have been reached. C++ can def move faster than JS - this is an implementation thing not a language thing.
WHF! Very interesting and hackish, love it!
Are you constructing all those built-in objects in native code, or is there already some JavaScript shim code which gets installed in the runtime? Then you could add the accessor there and wouldn't need extra parse runs (and it could also better optimize the whole call chain if the compiler sees all the code on the JS stack, I imagine).
I was anxiously waiting to see how much performance gain, if any, you could've gotten by removing the "silly jumps" from the bytecode emitter. It probably wouldn't be much, but still, they indeed were some "silly jumps".
I enjoyed that very much!
I'm sorry for the off topic comment, but what tool are you using to be able to drag a window from anywhere on a Mac?
mmazzarolo.com/blog/2022-04-16-drag-window-by-clicking-anywhere-on-macos/
very interesting seeing you using the debugger 😛
When I wanna "cache" something in my C++ game engine I just write it to an unordered map and load it back from there later. Not sure how accepted that is as a solution, since I'm not the most proficient C++ dev.
Well Hello Andreas!
Well Hello Michal!
Huzzah! A Stag beetle and a Scarab beetle?
I might have totally missed this, but why couldn't you just use the flags substring and return that? And after you extracted the flags to a bit mask, couldn't you use that to make the native C++ accessor faster?
The JavaScript spec specifies a flag order so if you say /foo/gi or /foo/ig then flags will return the same value in both cases. This is useful for checking in JS if two RegExp objects have the same flags, etc.
Because javascript is optimisation's worst enemy; that behaviour is required by the spec (also) because the user is allowed to subclass RegExp and override...almost everything, even the pattern string itself.
@@cxboog True, but it's fairly safe to assume that, most of the time, it will be invoked on a native RegEx object without overrides. Maybe there could be a fast path in that case?
@@msclrhd ah, makes sense. That's what I meant by "I might have totally missed this" :)
@@nordern1 yeah there's tons of these optimisation opportunities!
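The canonical flag order mentioned in this thread is easy to observe from script. This short sketch only uses standard RegExp behaviour.

```javascript
const a = /foo/gi;
const b = /foo/ig;

// The flags getter always reports flags in the spec's fixed order,
// regardless of the order they were written in.
console.log(a.flags);             // "gi"
console.log(b.flags);             // "gi"
console.log(a.flags === b.flags); // true

// Same for flags passed as a string in arbitrary order.
console.log(new RegExp("foo", "ymg").flags); // "gmy"
```

This is why the engine cannot simply hand back the flags text it was given: the stored input order and the reported order can differ.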
You can use JavaScript to write a more powerful compiler, so, technically, it's possible.
But it would be practically hard, because GCC and Clang are leaders backed by professionals.
So glad you're doing videos again. Very inspirational!
here's another crazy idea you might consider: rewrite the UI (e.g the address bar area) in HTML. that would reduce dependence on Qt and would help with porting (could also make a desktop/tablet HTML UI and a mobile/vertical UI)
Yes, Firefox and Edge do it and they are relatively successful
@spiritofstar Firefox uses a home-grown UI framework which describes layouts with XML
@@mariocamspam72 Edge also uses the React framework to build UI elements, but that does not change the fact that the UI is built with web standards such as HTML, or in Mozilla Firefox's case XUL/XHTML
@@mariocamspam72 Edge uses React to build its UI; they are still web-based UIs. What is your point?
@spiritofstar Firefox's UI framework is fully native, that's my point 😁. You're comparing two different things.
Sorry for the off-topic question, but do you see yourself returning to Serenity sometime?
Not anytime soon. I have a browser to build!
@@awesomekling ❤
Maybe a silly question. Can the original get() that you wanted to optimize reuse inline caches somehow? By running in the JS context it was called from vs running from the perspective of C++ attempting to access JS context? So kind of a JS -> JS trampoline by way of the C++? Obviously something no JIT can do... but if you're sticking to interpreter land anyway... why not?
The PIC would need to be associated with the C++ itself. Maybe some sort of upfront declaration in the C++ as to what might be accessed.
just curious, in macos, how are you dragging the window from anywhere?
mmazzarolo.com/blog/2022-04-16-drag-window-by-clicking-anywhere-on-macos/
Wouldn't optimising the c++ code be more beneficial in the long run?
In a future of never JIT yes, in a future of JIT, no.
WHF!
Well Hello Fred!
How has your experience with Copilot been so far?
Love it! Don’t think I’ll ever go back to programming without LLM autocomplete. So much less friction, especially when doing less interesting tasks. :)
It makes so much more sense to use LLMs as autocomplete helpers instead of fully writing entire files/classes/functions with them. After all, they're basically super good word/token predictors behind the curtains.
@@awesomekling hope you have an interview / talk with ThePrimeagen some day, know you two had a small interaction already. He's had an interesting time with Copilot, hope that topic comes up during the talk.
Where would I even start to learn this kind of wizardry? As an iOS dev, it all feels so foreign. I barely touch the terminal, and mixing languages, especially the way you did here, seems impossible to me.
Try doing new things just outside your comfort zone every week :)
And touching the terminal is quite useful. You could see how he forced himself away from lldb.
Swift vs Rust solved = just use JavaScript
Couldn't you use the flags enum to create the flags string from C++?
Catchy title hehe
Hey! Haven't watched in a bit. When/why did you switch to Mac?
I believe he was always on Mac, running SerenityOS in a VM. But he's put creating Serenity on hold, since he's focusing on the Ladybird browser exclusively right now.
@@TheRealFallingFist Nah, he worked on Ubuntu for years. The latest Macs just compile so much faster than anything else so why wouldn't you use that.
@@TBasianeyes Also, he's an ex-apple employee. So I think he's also very comfortable with Macs
I would really appreciate it if somebody tells me the *font face* kling is using
It's in the video description :)
Berkeley Mono: berkeleygraphics.com/typefaces/berkeley-mono/
Js with good usage of sqlite is faster than c++ with poor usage of a db (especially a more bloated one)
Would the length of the regexp have an impact on the speed? I could see this being where JS gets slower. Your regexp was very short and quite sane. What people create with regexp isn't sane.
The regular expression itself will still be evaluated by the same C++ regex engine we have. We're not implementing a regex engine in JS :)
This is mad hacking video!
Good one
Doesn't V8 even have a "custom language" to do this?
Well that's debatable
I only watched until 7:00 but as soon as you said it was a problem with C not caching previous gets, I stopped the video. In those cases I do something similar to this, and problem solved. I have a bit field to see if the get has been previously called, and another bit field for the boolean get values. This is "old republic" C, if you don't like bit fields and bit manipulation, just use an array of booleans, or whatever, although it's less memory efficient. Maybe I should have watched the entire video, because maybe this isn't what you are looking for, but I didn't have time for an hour video today.
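bool getFlag(RegExpObject* regexp, const String& flagName)
{
    static char FlagNames[][16] = { "global", "ignoreCase", "multiline", "sticky", };
    static int numFlags = sizeof(FlagNames) / sizeof(FlagNames[0]);
    static int RegExpFlags = 0, RegExpCached = 0;
    for (int i = 0; i < numFlags; i++) {
        if (flagName == FlagNames[i]) {
            // First request for this flag: do the slow get() once and remember the result.
            if (!((RegExpCached >> i) & 1))
            { RegExpCached |= (1 << i);
                if (regexp->get(flagName))
                    RegExpFlags |= (1 << i);
            }
            return ((RegExpFlags >> i) & 1);
        }
    }
    printf("error not found");
    return false;
}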
Maybe you should have paid a bit more attention to what he was saying before you stopped the video, because it wasn't that the flags themself aren't cached.
The issue is that a property access requires walking the prototype chain to find where the property actually exists, and in C++ there is no cache in the property lookup function, so they're forced to walk that chain every time. JS caches that part of the property lookup process.
He also did use a bitfield to hold the actual flags because he discovered that the flags were being stored as a string and each lookup was calling contains() to search for each letter, but that was after moving to JS to take advantage of the property lookup caching.
I believe I have understood the issue. The first call to regexp.get in any language requires walking the prototype chain to find the property. JavaScript uses automatic inline caching to speed up subsequent calls to each property.
My C function implements a similar caching mechanism using a 32-bit bitfield. This can be even faster because the bitfield fits easily within the CPU cache, whereas JavaScript's inline caching, although fast, depends on various factors, including the JavaScript engine's implementation, and may not always fit in the CPU cache.
Where I think there might be confusion is that both C and JavaScript must walk the prototype chain for the first call. The real key advantage of implementing this in JavaScript is the reduced overhead from cross-language context switches, which can be significant, but this does not avoid the initial property lookup. Writing this logic in JavaScript doesn’t eliminate the overhead of the initial call; it merely optimizes subsequent accesses, which my C function also does.
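For readers following this exchange, here is a toy sketch of the kind of cache the reply above describes: remembering where on the prototype chain a property was found, rather than caching the property's value. It is not how any real engine implements inline caches (those are typically keyed on hidden classes or shapes and handle invalidation). The function names are invented, and the sketch assumes prototypes are never mutated after being cached.

```javascript
let steps = 0; // objects visited, to make the cost of a lookup visible

// Uncached lookup: walk the prototype chain until some object owns `name`.
function findHolder(object, name) {
  for (let o = object; o !== null; o = Object.getPrototypeOf(o)) {
    steps++;
    if (Object.prototype.hasOwnProperty.call(o, name)) return o;
  }
  return null;
}

// One "call site" with a single-entry cache keyed on the receiver's prototype.
// Toy assumption: prototypes are never mutated after being cached.
function makeCachedLookup(name) {
  let cachedProto = null;
  let cachedHolder = null;
  return function lookup(object) {
    const proto = Object.getPrototypeOf(object);
    const ownsIt = Object.prototype.hasOwnProperty.call(object, name);
    if (cachedHolder !== null && proto === cachedProto && !ownsIt) {
      return cachedHolder; // cache hit: no chain walk
    }
    const holder = findHolder(object, name);
    if (holder !== null && holder !== object) {
      cachedProto = proto;
      cachedHolder = holder;
    }
    return holder;
  };
}

class A { get flag() { return true; } }
class B extends A {}
class C extends B {}

const lookupFlag = makeCachedLookup("flag");
lookupFlag(new C());
console.log(steps); // 4 (instance, C.prototype, B.prototype, A.prototype)
lookupFlag(new C());
console.log(steps); // still 4: the second lookup was a cache hit
```

Note that the cache helps a different object (a second `new C()`), because it is tied to the call site and the receiver's prototype, not to one particular instance or to the flag's value.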
JavaScript is a scripting language implemented in C++; we can’t run JavaScript without C++
Well Hello!
Very cool video, thanks! Question: Why CLion over Qt Creator? As a new Qt Oslo Office employee to an old one :)
Maybe because he wants to try new stuff :^) He used to use Qt creator back in the Serenity days before the idea of Ladybird was conceived
Nice video.
TIL: vim !$
!$ in bash refers to the last argument of the previous command. It's a quick way to reference and reuse the last parameter from your previous command.
whk! js hacking, very fun!
no way ?
Why not just return the inner flags string?
Ah the flags property has a defined order and the input flags param can be in any order. Hm bizarre, returning the u32 flags to cpp should be faster still.
Anyways, i like the idea of pushing stuff to js, as that makes you “dogfood” your js runtime and thus have (another) reason to make it fast.
JavaScript is very dynamic. Someone could replace the original flag getters on a RegExp object with their own, and then we have to invoke those getters, which may lead to arbitrary behavior and side effects.. it's a fun language!
@@awesomekling haha you are right, I guess the workaround for that would be worse than the original code. Fun indeed :D
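The dynamism described in this thread can be seen with a few lines of standard JavaScript. This sketch shadows the built-in `global` getter on one RegExp object and then reads `flags`. The counter variable is only there to show that user code runs.

```javascript
const re = /foo/;
console.log(re.flags); // ""

// Shadow the built-in `global` getter on this one object.
let calls = 0;
Object.defineProperty(re, "global", {
  get() {
    calls++; // arbitrary user code runs during the flags lookup
    return true;
  },
});

// The flags getter reads each flag property from the object,
// so it now picks up the overridden getter.
console.log(re.flags);  // "g"
console.log(calls > 0); // true
```

Because of this, an engine can only skip the individual property reads when it can tell that the object and its prototype are unmodified.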
Wait, why does this video have so many dislikes?
At least RYD thinks it is disliked :/
This video currently has zero dislikes. Your extension is just showing you a random number :)
@@awesomekling good to hear :) seems like it is 50/50ing for some reason
@@awesomekling The extension has a separate backend/DB for users of the extension, so other extension users who downvote have it all kept track there and that number is returned.
Because javascript stinks probably.
Those extensions do not work. They just make stuff up. Stop using them.
I like your desktop wallpaper.
MORE OF
SerenityOS PLZ
WHF!
Helllllooooo!!!