That's the Japanese character ツ (pronounced "tsu"). It took me by surprise when I first saw it. The name is プライウマジイン
Priuma Majin
He should change
@@atxorsatti Right! It's catchy
Wouldn't it be closer to something like プライマジェン? (Maybe プライーマジェン if you want to carry over the first-syllable stress that he puts on the name)
@@DubiousNachos I think so. I'm still an N5-level Japanese speaker
Prima majin buu
The memory went up by ツ times
That's what happens when you try to get fancy for absolutely no reason at all.
ThePrimeagenClips, this is perfect! I subscribed right away!
I had no idea he quit Netflix. However, I kinda suspected he wasn't working as much given the amount of videos he was pumping out.
Why use a non-ASCII character to begin with?
To be a cool JS dev
JS developers like problems, so they create problems from trivial things
I thought the reason why JavaScript uses UTF-16 is because it predates UTF-8, but then I looked it up and it doesn't. Weird.
Is this an extension of the main Prime accounts, or is it a community-led abstraction?
Is the performance story about the change from indexing to property access or the thing about multibyte character?
Yes
Yes
Did bro manage to get screen tearing to occur in the final frame?
ツのgen sounds cool but unfortunately it's the wrong order and genのツ doesn't sound as good
I did not understand :(. I'm a backend dev; can anyone please help me with this?
They used a character that could not be represented by a single byte. It requires two bytes.
There was an optimization where, as long as every character could be represented in one byte, the source code stayed in its compact encoding.
When the new character was introduced, it forced the encoding from UTF-8 to UTF-16.
That is twice the memory for each character represented.
One byte is 8 bits; two bytes are 16 bits. That is the difference between UTF-8 and UTF-16 with regard to the memory required per character.
After realizing that the size of the source code doubled because of this single character, they removed it and replaced it with characters that could be represented in one byte.
Did that help?
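If you want to see the byte counts for yourself, here is a small Node.js sketch (the strings are made-up examples; `Buffer.byteLength` is standard Node API). One wrinkle: in UTF-8 specifically, ツ costs three bytes, not two, while in UTF-16 every character in these strings costs a flat two bytes:

```javascript
// Byte cost of an all-ASCII string vs. one containing ツ (U+30C4).
const ascii = "prime";
const withTsu = "primeツ";

// UTF-8: ASCII characters take 1 byte each; ツ takes 3 bytes.
console.log(Buffer.byteLength(ascii, "utf8"));     // 5
console.log(Buffer.byteLength(withTsu, "utf8"));   // 8

// UTF-16: every character here takes 2 bytes, so the string doubles in size.
console.log(Buffer.byteLength(ascii, "utf16le"));  // 10
console.log(Buffer.byteLength(withTsu, "utf16le")); // 12
```

So the "doubling" is about strings that could have stayed at one byte per character being forced up to two bytes per character once a single wide character shows up.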
@@ChrisCox-wv7oo My hero!
@@ChrisCox-wv7oo why use this character in the first place though?
@@anj000 To make it more difficult for people to access and mess with the client side cache. Basically obfuscation.
@@Sam54345 Wtf, it has nothing to do with obfuscation; obfuscated code with this character or a different one won't make any difference
UTF-8 characters in source code? What is this sorcery?
I put an emoji in a PHP comment once, and the interpreter would refuse to run the script 🥲
That's unexpected
How is that smile emoji even valid as a lexical token??
Not a smile emoji tho
It's part of Japanese Katakana... But to us western folks, it looks like a smile emoji.
@@FinlayDaG33k Ah okay, thanks for info! Though still, how is that a valid lexical token?
It's Unicode; you can even use emoji in JS source code.
@@TianYuanEX Because it's just a regular text character. JavaScript files are Unicode documents. Just because it's not an English/Latin letter doesn't mean it's not a regular Unicode text character.
(According to the MDN docs, any character in the ID_Start Unicode character set is allowed to start an identifier, and any character in ID_Continue set is allowed to be in an identifier after the first character.)
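A quick sketch of what that means in practice: ツ (U+30C4) is in the ID_Start set, so it can begin (or be) an identifier on its own. The names below are invented for illustration:

```javascript
// ツ (Katakana letter Tsu, U+30C4) is in Unicode's ID_Start set,
// so it is a perfectly legal JavaScript identifier by itself.
const ツ = "not an emoji";

// It may also appear later in a name, since it is in ID_Continue too.
function smileツ() {
  return ツ;
}

console.log(smileツ()); // "not an emoji"
```

Most actual emoji, by contrast, are not in ID_Start/ID_Continue, which is why `const 🙂 = 1` is a syntax error even though ツ is fine.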