One of the most beautiful arts is making complicated things look so simple. And only legends can do it.
We watch this video in my computer class at school 😂 it’s very well put together, good job.
Perfect. Simple, easy and straightforward. 10/10, great explanation!
Thank you so much, I'm very glad you think so.
The greatest teachers are the ones that can simplify the most complicated of things. Bravo to you!! tysm for the vid :)
Thank you! I'm so glad you found it helpful. 😊
Fantastic, thank you for the clarity. Have read blog posts and seen videos on this topic and never understood it quite so well.
Thank you Alexander, I'm very glad you feel this video is so useful.
Clean and clear, super well presented. Thank you for contributing great quality information on this platform. It's a breath of fresh air.
Thank you so much for your kind comment, I am so glad you felt the video was so useful. Hope to see you here again!
Simple, clear instructions- very helpful. Thank you!
I'm so glad you found it helpful, thank you.
Well explained and I love the “Try It Yourself”
by far the best on this topic!!!
Chinese: Im gonna end ASCII's whole career.
😆
Unicode: what are these..?
Japanese: Emojis
Unicode: it.. it's a face
Japanese: yeah, and?
Unicode: now has Emojis
LMAO
Good to see this comment! ΟωΟ
Unicode ; hold my 🍺
Great explanation. very helpful!
You're welcome!
Okay, I have gone through maybe 13 other videos including a video by my instructor, and all of them were not as simple and easy as you made this explanation out to be. Thank you for making this. I'm finally understanding it!
Well, he has left out quite a lot to make it look simple, like the important encoding of Unicode (ISO-10646) as UTF-8.
It took me lots of video browsing to get here, but this was the video I was looking for all this while. This is the best.
This video explains it beautifully and is very easy to understand. Thanks for the great content.
Glad it was helpful!
The explanation is wrong. When saved in UNICODE format, Notepad adds a two-byte magic number to mark the file format as being UTF-16, and the ALT-1 character is saved as two bytes. Notepad does not save in a 32-bit UNICODE format. You can verify this by putting the ALT-1 character in the file twice, and see that the file size is then 6 bytes. Also, UTF-16 encodes up to 20 bits, not 32 as stated.
Nice, I got 6 bytes!
I got 16 bytes
@@gbzedaii3765 I don't know how you managed to get 16 bytes
I got infinite 😅
I was searching for an answer to this question for more than half an hour, but you did it within 6 minutes... thanks a lot.
I'm so glad I was able to help you.
Loved this, especially the "Try it out" part - this made exam prep for Intro to IT much easier!
You guys explain this better than what my prof did in over 2 hours. lolz
7 bits store 128 characters, from 0-127 => 0000000-1111111. Correct me if I am wrong please.
@@mrsher4517 a lot of info dumped here lolz not fast enough ooooopp
Thank you so much for the heroes that create these videos. My first time delving into telcom based project and this video helped me so much for a non tech.
Refreshed the beginner lessons of my computer science class for me. Great, thanks.
I'm so glad you found it helpful Abed Behrooz
(4:30) You should preferably save in UTF-8 instead, which uses 1-4 bytes per character depending on how far into Unicode it appears. - Furthermore, you can't use characters beyond code point 1 114 111, due to how the standard is set up with the 16-bit surrogate pairs.
Unicode also supports a large list of languages, not only numbers, letters and a large list of symbols.
Best ever explanation...Thanks dude
My understanding of this topic is crystal clear now. Thanks, Tech Train.
this is the best video on this topic!! must watch!!
Thank you so much! I'm glad it was useful.
isn't 32-bit capable of storing potentially 2^32-1=4,294,967,295 characters (not only 2,147,483,647, as shown in the video)?
The short answer is 'no', because the first bit being equal to 1 is the flag used to indicate a Unicode character. Since the first bit has to be a '1', you are halving the number of combinations potentially available.
It's actually 2^31
@@xiaoling943 It's actually 2^31 - 1. xD
Unsigned vs signed integer
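For anyone who wants to check the two figures, here is a quick Python sketch (only illustrating the signed/unsigned point above; nothing beyond standard Python is assumed):

print(2**32 - 1)   # 4294967295 - highest unsigned 32-bit value, all 32 bits carry value
print(2**31 - 1)   # 2147483647 - highest signed 32-bit value, one bit is reserved as a sign/flag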
Dude your youtube channel is amazing! Especially this vid! Helped a lot with computer science 🤞👍
Thank you very much! I'm so glad you found it helpful. 👍
I love the "try it yourself" part!. Thanks a lot sir!
I'm glad you found it useful
Thank you very much! This is the best explanation I've ever seen in my life.
Thank you so much, I'm very glad you liked it
How can we learn all of them: BCD, ASCII, EBCDIC?
2:44 Shouldn't it be 128 characters/instructions represented by ASCII, because it ranges from 0-127 and not 1-127, and 2^7 = 128, which is the total number of combinations of 0s and 1s?
The best explanation... easily understood the topic. Hats off.
Thank you!
This is what I was looking for this morning.
So what are the possible questions that could come up on extended ASCII?
Thank you, you made this very easy to understand!
It was a short video but full of content. Very well explained. Thank you :) !
This actually helped me understand why a bunch of symbols have random numbers after it.
Great video. I have understood ASCII and Unicode clearly. This video deserves a thumbs up.
Wonderfully clear and concise delivery, thank you
Why can't characters be saved as 1-byte ASCII, and then 4-byte Unicode for other characters? That would reduce the memory needed.
Unicode does come in many flavours, but you don't want to have 4 bytes necessarily for everything as that would waste storage space.
@@TheTechTrain I want to make a Unicode encoding for my language and revive its script.
WOW, by far the best explanation
Glad it was helpful!
@@TheTechTrain yeahhhhhhhhhhhhhhhhhhhhhhh
Easy to understand, will subscribe for that
Extremely helpful, thanks a lot!
You're welcome!
Great method 👍🏻👍🏻👍🏻
Great job! I had a really hard time understanding it the way others have explained it, but you explained it really well and I understood straight away. Subscribed :)
Thank you Theme Park UK, I'm so glad you liked it.
Thanks, you are an amazing professor!!
Just awesome 👏👏👏👏.
Great, absolutely great! Love the video, the best video I have viewed on this topic!
Thank you so much, I'm very glad you found it so useful! (Feel free to share and help spread the word! 😉👍)
ASCII is an extension of the 5-bit Baudot Code used in such things as the original electro-mechanical Teletype Keyboard/Printer long-distance communication devices that replaced the old hand-keyed dit-dah Morse Code when more capable and much faster methods of sending on/off (digital) codes were developed at the end of the 19th Century. Much of ASCII was adopted to allow more characters to be sent and to allow more thorough control of the receiving device (End-of-Message value and so forth) for more intricate messages (for example, the Escape Code as a flag to allow non-ASCII values to be sent for use by the receiving machine as other than letters or numbers or standard symbols or ASCII Printer Commands). Sending pictures is another major extension of ASCII where the original printable characters are now just a small part of the image printed out. UNICODE is one part of this expansion but such things as "JPG" and "MP4" and other special-purpose coding schemes are now also used extensively for inter-computer messaging. Automatic "handshaking" and error determination are now absolutely needed for computer messaging that is going much too fast for human monitoring of the connections -- this can get extremely complex when automatic backup systems with quick-swapping paths are used.
Wow, what a lot of extra information! Very interesting, thank you very much for sharing that. My videos tend to be targeted at the current UK GCSE Computer Science curriculum, and so only tend to provide that level of information, but it's always good when subscribers share extra information and explanations, so thank you!
@@TheTechTrain You are welcome. I started work for the US Navy as an Electronic Engineer for Systems of the TERRIER Anti-Aircraft Guided Missile System in late 1972, just as digital computers were being added to replace the old electro-mechanical computers used to aim the missiles and initialize them just prior to firing and control the large tracking radars (huge "folded telescope" designs that used pulse tracking and a separate continuous very-high-frequency illumination beam for the missile's radar "eye" to home on). A couple of years later the profession of Computer Engineer finally was added to the US Government employment system and our group all changed over to it, so I, in a manner of speaking, was not on the "ground floor" of the computer revolution but, as far as the US Federal Government was concerned, I was "in the basement" at this extremely critical time of change when computers "got smaller" as was given as an inside joke in the Remo Williams movie. There is a YouTube video series out concerning a group in a museum rebuilding one of those old Teletype machines that used the Baudot Code and showing how it controlled all of the tiny moving parts in the machines. VERY interesting!
You've certainly seen a fair few changes in that time then! As someone with such an extensive background in the subject I feel humbled at you reviewing my little video! Are you still involved with the subject these days?
@@TheTechTrain I retired in 2014 after 41 years of US Federal Government employment. First for TERRIER until it was decommissioned in 1992 when steam warships (other than the nukes) were suddenly without any warning deleted from the Navy, where I was lucky and could immediately change over to TARTAR, which was on many gas-turbine-powered ships and lasted a few years longer before AEGIS replaced almost every other major US Navy missile system. TARTAR had some computer engineering/programming jobs open and I now learned a whole new kind of software/hardware design and support scheme -- boy, was that a step down, from the 18-bit UNIVAC C-152 computers that TERRIER used to the chained-together 16-bit computers from several manufacturers that TARTAR used, since 18-bit (though now obsolete due to 32- and 64-bit machines becoming standard) gave programmers WAY, WAY more capability than 16-bit did. When TARTAR in the US Navy "bit the dust" (I think that a foreign navy still uses it) a few years later, I moved to the FFG-7 frigates that used a kind of "poor-man's TARTAR" (still limited to the early SM-1 "homing-all-the-way" missiles when TARTAR had changed over to the much more capable command-guided, much-longer-range SM-2 missiles). I did some programming and program testing and spec writing, with my largest work effort being on the several-year-long project to upgrade the Australian FFG-7 ships to SM-2 and an Evolved Seasparrow vertical-launch close-in defense system -- that was a HUGE job like shoe-horning a size 12 foot into a size 8 shoe, but we did our part in developing the programming portion of the system and it WORKED!! By then I was doing Software Quality Assurance and Control (SQA), where I made sure all documents were properly reviewed and OKed and I held the final meetings where we decided if we have finished each major project step and can go to the next one, which was a major change for me. I had to learn all about the SQA process, originally developed for NASA (though we never got to their own Level 5 SQA System as that would have needed many more people and lots more money), and my boss had me flow-chart, by hand, the entire process with all possible branches to make me REALLY know it ASAP -- he stuck my large flow-charts up on the wall just outside his office and directed that everybody study their part in it (only I and our project librarian/documentation/computer storage medium "czar" had to learn the whole thing; just my luck!). To get some idea as to how far behind the US Navy is getting vis-a-vis software, where originally we were on the "bleeding edge" when I started work, I was the ONLY SQA person in our entire group, handling the several concurrent projects we had by juggling the timing of meetings and so forth. In the online video game DIABLO III, to name just one, they have over ONE-HUNDRED (100!!!) people dedicated to just SQA, and that is only a small part of their entire international staff. I felt like Indiana Jones being dragged by his whip behind that Nazi truck, only in my case that truck was getting farther and farther away as the whip stretched...
so useful even now, thank you
Glad to hear!
Very well and simply explained, thanks a lot, but I have a question: why can 32-bit represent only half the number of values? I mean, why 2 billion when it can represent up to 4.3 billion?
1 bit is used to represent positive or negative
What an amazing video, bravo, I definitely will Sub
I'm so glad it helped! Thank you for the sub! 👍
Damn such a beautiful way to explain things
Thank you, I'm so glad you found it helpful.
Best explanation I’ve seen yet!
Subbed, likes, etc.
This video has really helped me understand from a total beginners point of view, thank you. :)
For UTF-8, there are 21 free bits. So the highest possible code point will be 2097151 (decimal) or 1FFFFF (hex)
Don't forget that with UTF-8 only one byte is used for the ASCII characters.
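A quick arithmetic check of those figures, as a small Python sketch (the character 'A' is just an example):

print(2**21 - 1)                  # 2097151 - the largest value 21 bits can hold
print(0x1FFFFF)                   # 2097151 - the same number written in hex
print(len("A".encode("utf-8")))   # 1 - ASCII characters still take a single byte in UTF-8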
How would I type in a passcode on my mobile device in ASCII?
What are emojis encoded in: UTF-8, UTF-16 or UTF-32? 😀
How do you get the number 2,147,483,647 from 32 bits?
When calculating binary, the columns are successive powers of 2: each column is worth double the column to its right. So the first bit is 1, the second bit is 2, and each subsequent bit or column doubles again, giving 4, then 8, 16, 32, 64 and 128. These are the values of the first 8 columns - or a byte of data. With 8 columns, or 8 bits, you can calculate the highest possible value by adding the value of each column - so 128+64+32+16+8+4+2+1=255. With 32 bits we simply continue this process. Doubling each new column results in some pretty large numbers very quickly, and when you write out 32 columns, or 32 bits, that's the total you get (well, almost - the first bit is a 'flag' digit to identify the value as Unicode, so we're really only adding up the values of the first 31 columns).
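If it helps, here is the same column-doubling arithmetic as a small Python sketch (only illustrating the sums described above):

columns = [2**i for i in range(8)]
print(columns)        # [1, 2, 4, 8, 16, 32, 64, 128]
print(sum(columns))   # 255 - the highest value 8 bits (one byte) can hold
print(sum(2**i for i in range(31)))   # 2147483647 - the same process over 31 usable columns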
The Tech Train but it's not raising to the power of two, because 16x16 is not 32.
I'm not sure you quite understand what is meant by raising a number to the power of a number. 32 is 2 to the power of 5 because when we write 2 x 2 x 2 x 2 x 2 you'll see that there are five 2's. If you write 16 x 16 - where is the 2?
Keyboard keys have a different number associated with them that gets translated into "ASCII" later, upper or lower case depending on whether Shift was held down. The PS/2 scancode for A is 0x1E. You skipped over intermediate word lengths. For most of history, a character has been 8 bits, and later 16 bits. Having so many bits comes at a cost.
The Notepad shortcut might be something found only in the newest Windows. A Unicode text file is usually prefixed with a technical character called a "byte order mark" to indicate the UTF format. Saving one symbol will actually save two.
What exactly is utf-8? Could someone explain?
It's simply Unicode encoded using blocks of 8 bits. ASCII uses 7 bits, and extended ASCII uses 8 bits. Unicode 8, or UTF-8, encodes characters using one or more blocks of 8 bits.
So utf-8 and extended ascii are basically the same thing?
For characters encoded with 8 bits, yes, exactly the same. However, UTF-8 doesn't mean only 8 bits are used - only that blocks of 8 bits are used. So two 8-bit blocks could be used, which would be 16 bits.
Oh, so UTF-8 has 8 blocks of 8 bits? Totalling 64 bits?
No no, UTF-8 is made up of 1, 2, 3 or 4 blocks of 8 bits.
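A small Python sketch showing the 1, 2, 3 and 4 blocks of 8 bits in practice (the characters are just arbitrary examples):

for ch in ["A", "é", "€", "😀"]:
    print(ch, len(ch.encode("utf-8")), "byte(s)")   # prints 1, 2, 3 and 4 bytes respectively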
If it takes 32 bits to store one character, how is the emoji stored at 4, not 32?
The value of 32 refers to the number of bits (individual binary digits), but the value of 4 refers to the number of bytes, which are groups of eight bits. So 4 X 8 = 32.
Oh thank you so much, especially for your quick reply considering how long ago this video was uploaded!
You're welcome. Always feel free to ask if there are any topics you'd like me to make videos on if you feel it would be helpful. 👍
I've got my Edexcel Computer Science exam paper 1 on Monday and I'm trying to find revision resources from today through the weekend, so I appreciate it.
My GCSE students have their paper 1 on Monday too, so I feel your pain! Still... at least you're revising! (I'm not sure my own students are that promising!) I'm here if you need any advice.
(3:20) Well, you see, an "emoji" is a glyph in Unicode that is defined to be a picture, usually with its own colours, unlike other text. These characters are specifically defined to represent those pictures. - An "emoticon" is a series of characters, built up from existing symbols that were not intended to form a picture. For example, the basic smiley ":)" consists of a colon and a parenthesis, two symbols intended for other things.
No, emoji are single Unicode characters, not a series of many. That is three bytes if you encode the Unicode (also known as ISO-10646) with the UTF-8 encoding/compression.
Then characters like Å, Ä and Ö turn out as two characters if you look at UTF-8 encoded files as if they were ASCII or Latin-1 (that is, ISO-8859-1). A common misconfiguration of web servers.
@@AndersJackson Not all emoji are one Unicode codepoint. For example, 👍🏻 is made up of 2 codepoints: 👍 and the pale skin tone modifier.
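You can see the two code points from Python (a minimal sketch, assuming a Unicode-aware terminal):

emoji = "👍🏻"
print([hex(ord(c)) for c in emoji])   # ['0x1f44d', '0x1f3fb'] - thumbs up plus the pale skin tone modifier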
Please help me with a question: if every character and command is represented first by a decimal number and then by a binary number, for example "Letter A, 65, 1000001", then what happens with the number 65? What decimal number is attached to that number? I'm so confused. Thank you so much for the video!
Hey bud! It's pretty straightforward! The number 65 will be broken down as follows: '6' -> 54 and '5' -> 53 (their ASCII codes). Now convert 54 and 53 respectively to their binary formats, i.e. 00110110 and 00110101. Both are 8 bits, which means the total storage required for the number 65 is 16 bits (2 bytes). Hope this helps.
@@Flasherrrr It depends on whether the number is stored as a string or an int. For a string you do as you say, but for an int you convert the number itself to binary, and that takes a lot less information.
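A small Python sketch of that difference (only an illustration - actual storage depends on the language and file format):

text = "65"                    # stored as the two characters '6' and '5'
print([ord(c) for c in text])  # [54, 53] - two bytes in an ASCII file
number = 65                    # stored as a binary integer
print(bin(number))             # 0b1000001 - fits comfortably in a single byte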
Thank you for explaining, we are thankful to you.
It's my pleasure
1:47 aren't there 128 possible variations here (from 0 to 127)? anyway, thank you for this video!
yes, this video is a bit outdated
Does our system use a single system at a time, or does it toggle between ASCII and Unicode as needed automatically? What if the file contains simple alphabets as well as emojis?
If a text file contains only ASCII characters then it will be saved by default as an ASCII file, unless you choose otherwise. If the file contains any Unicode characters then, if you try to save it as an ASCII/ANSI file, you will be warned about the potential loss of the Unicode characters. Generally the system will try to keep file sizes low, so will only save using the larger Unicode character size if any Unicode characters are included.
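To see how a mixed file works out, here is a rough Python sketch (assuming the file is saved as UTF-8, which recent versions of Notepad default to):

text = "Hi 😀"
print(len(text.encode("utf-8")))   # 7 - 'H', 'i' and the space take 1 byte each, the emoji takes 4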
Thank you for great explanation!
This is the most simple and crisp explanation.
The reason Notepad uses four bytes in that example is not because of UTF-32, but because it uses UTF-16 with an additional BOM (byte order mark) at the beginning of the file.
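You can reproduce the four bytes in Python (a minimal sketch; the 'utf-16' codec writes the BOM, just like Notepad's old 'Unicode' save option):

print(len("☺".encode("utf-16")))    # 4 - a 2-byte BOM plus 2 bytes for the character
print(len("☺☺".encode("utf-16")))   # 6 - matching the two-character test mentioned in another comment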
Yeah, I think I've seen something like this on the Wikipedia page.
thanks for making me pass my GCSEs
Do you have your exam tomorrow? I'm so glad you found this video helpful. Let me know if there are any other topics or questions you'd find useful. If you do have your exam tomorrow, then GOOD LUCK! 🤞
@@TheTechTrain It is tomorrow, yes, thank you. I feel really confident going into the exam now :)
I'm so glad it helped. The very best of luck! Anything else you were wanting to look at? Let me know how it goes!
The Tech Train I’m revising searching and sorting at the moment
Linear and Binary, Bubble and Merge?
I tried this in Windows 10 v22H2 and found that the Alt+1 combination file size was 3 bytes instead of 4 bytes, as mentioned in the video. Any specific reason for this that you can recall?
I didn't get the emoji, how is it possible to use emojis on a PC?
: D
Hold down the Alt key, then press the number '1' on your number pad, then let go of the '1' key, then let go of the Alt key. There you are - an emoji! ☺
@@TheTechTrain ☺
@@relavilrodford9272 ☻☻☻☻☻
@@TheTechTrain It does not work for me. Is it because of the computer type? 😔
@@astronomylover2.0 yea same for me
Nice explanation... I want more about ANSI.
But how does the computer know when a binary code represents a letter or a number if both are the same? Like 'A' and 65?
As smooth as it gets, thanks!
You're a saving grace, bruv. God bless your heart. Merry Christmas and good night.
So how does the computer write the text on the screen, and how is the text created in the first place?
At the simplest level, that's largely the job of the operating system (e.g. Windows, macOS, Linux etc.). If you have a text file stored in binary then it will be the OS that loads that text file into the computer's memory, then decides on the default application to use to open the file. It will be a combination of the OS and that application which decodes the binary into text and displays it on the screen.
The Tech Train what if I created my own computer out of raw materials, like sand to silicon and other metal parts, and designed the logic gates for the CPU and other parts? How do I create my own OS?
The Tech Train and how does the OS maker design the text? Do they do it by hand and pixelate it, and let the instructions execute it?
The Tech Train pls respond, I need to be enlightened.
I suspect you're being deliberately awkward (maybe you are one of my students?) but if you are interested there are plenty of examples online of people creating working CPUs using mundane materials. Here is an example of a working CPU made using nothing but redstone in Minecraft: ruclips.net/video/ziv8SXfMbkk/видео.html. If you do ever manage to build a working computer out of rocks from your garden do please let me know!
Why can 7 bits only represent 127 symbols? Isn't it 128?
Because 64 + 32 + 16 + 8 + 4 + 2 + 1 = 127
And because 0 is a null character
@@rays.3429 But NULL is also a control character, just as 1-31 are, so actually 128 characters are represented by ASCII.
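A quick sanity check of the count in Python (a tiny sketch of the range being discussed):

print(2**7)                                   # 128 possible 7-bit patterns
print(len(range(0b0000000, 0b1111111 + 1)))   # 128 values, numbered 0 through 127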
Extremely helpful. Thanks a million😊
7 bits should result in 128 (0~127) numbers, right?
Very nice basic introduction
Glad you liked it
4:45
wait a second... 2^31 - 1 = 2147483647...
why do we call it 32 bits instead of 31 bits?
32-bit refers to a type of computer architecture where the processor and operating system can handle data in chunks of 32 bits at a time. This means that the computer can process data and perform calculations on numbers that are 32 bits long.
This is a very nice and easy-to-understand video. It even gives a simple practice exercise at the end for the learner to apply. Thank you.
Guys, how do you transform ASCII-coded data into UTF-16-coded data?
Will be using this video for my Computing class, it is perfect. Many thanks! (y)
Thanks... anything on EBCDIC?
It means we need to remember ASCII codes. Is there any way to remember ASCII codes?
Clear explanation❣Thanks a lot sir
You are welcome
You said 'A' requires an entire byte, which is 8 bits, but then you also said that ASCII is a 7-bit binary code, so what is the use of the extra bit added? Is it extended ASCII or something?
Alt 1 doesn't work for me :(
Not possible .
@bumboni ☺
Thank u so much. No one explained it this way. ❤️❤️
how we turn lettuce into numbers
very nice explanation
Great video 👌, clear explanation, very good examples.
Is Unicode an extension of ASCII?
One of the final exam questions was how many bits Unicode is.
and what is the answer?
Bit late, but I think 256, and Unicode is 65,000.
So Sir, is this also the reason why we cannot type more than 140 characters in Google?
Alt+1 is not working in my Notepad.
@Peterolen I am using Windows
Hold down Alt, then press 1, then release 1, then release Alt.
Explanation is good.. and helpful as well
Hey will u answer some of my questions relating to this topic
Thanks for this. Great explanation.