The result was amazing, and he went the smart route: you can get high color fidelity by simply brute forcing to find the optimal fore/back/char combo, or you can get "shape fidelity" by using edge detection. Putting both to work in a "visually pleasing" way is no small feat.
Now that said... speaking as someone who at some point had to do vector drawing for some years to put food on the table, which implied A LOT of bitmap tracing, i can tell you that no "easy solution" or "hand crafted algorithm" will ever achieve what a human can do by hand, either for vector drawing or for ASCII art, which you could sort of call "low res semi vector art".
This is a thing that just SCREAMS Machine Learning; it's just that... we're lacking the big data it would take. And good luck getting that, as convincing people who do bitmap>vector to give out their data is the same as convincing them to phase out part of their income ;)
p.s. anyone suggesting auto-trace tools, just no. Nice in a tight spot, a good quick fix, but generally not good enough, since it takes so much work to clean up that you might as well DIY
Actually, the lack of data isn't the blocker. It's an older architecture, but CycleGAN is kinda built for things like this. You essentially just need two collections of images in different domains, for example pictures of apples and pictures of oranges. The model then learns to convert pictures to and from both domains. There's no need for 1-to-1 equivalents of each image. It would take some modifications and some manual curation, but it's definitely possible.
Edit: Should also mention pix2pix here, which is another type of GAN called a cGAN, or conditional GAN.
@@Foulgaz3 What are you talking about? We're talking about img2vector and img2ASCII. Yes, img2img is pretty much solved, because... there's PLENTY of data. But for what we're talking about... there's no data... on the output side. You can have as much as you want on the input, there are images up the wazoo everywhere; it's the output side that's lacking.
@@ErazerPT hence the modification, but this could actually still be img2img. Just use softmax instead of sigmoid for the color channels and then use a one-hot encoding to dedicate the ascii characters you want to use to different channels. So an img2img architecture could be adapted to it pretty easily.
Then it just becomes a matter of encoding data into that format. All that should take is collecting txt files for a bunch of ascii art. Sizing would be an issue, but you could encode the ascii art and then interpolate it to whatever size you need. That would end up with floating point values, but it'd effectively just be doing label smoothing.
Eventually you could probably use some semi-supervised learning by curating the generated results to create a larger synthetic dataset. So yeah, I don't really see the problem.
@@Foulgaz3 Ok, now i understand the thought process. And you're right, you don't see the problem. The point is NOT to turn "this image" into "a generic ASCII art of the subject matter" but into "a high quality ASCII art of the subject matter". The former you have data for, just scrape the web for it. The latter you don't, because you need both the source image and the ASCII art someone made of it.
Take castles, for example. Plenty of images, and plenty of ASCII art. But no "image AND its ASCII art" pairs.
It's easy to do img2text, for example, IF you're after the "overall description" and not "every detail". And given that ASCII art is ALL about the details... extrapolate the conclusion ;)
p.s. on the scaling part, you'd need a very smart upscaler to synthesize "larger sizes" for the dataset, because ASCII art doesn't scale well with naive methods. "|-" for example will scale into "|---" at 2x in most cases, never "||--". But "-=" will most likely scale to "--==". And "/\" most likely scales to "/ \" because part of it "went up". What about "/_"? Well, that is most likely "/___" AND some "went up". Not integer scaling? Good luck with that.
It looks a lot easier than it is.
@@ErazerPT oh I didn't mean to imply it'd be easy; I've done enough similar projects to know that it wouldn't. More just that you wouldn't strictly need one-to-one pairings. Thanks for the conversation btw; it's nice to talk with someone who's clearly familiar with ML. I actually agree that there'd be plenty of other problems you'd run into along the way that make it very difficult. But I will say that the problems you pose aren't impossible obstacles.
Your concerns regarding interpolation mostly boil down to the encoding not being very robust. To fix that, you could come up with a more robust scheme that preserves visual relationships between characters. In text processing, character encoding schemes often seek to mathematically preserve semantic relationships between characters, like 'A' - 'a' + 'b' = 'B' or '[' - '(' + ')' = ']'. In this context, you could maybe record the sine and cosine of the stroke angle along with something like luminance for the image side of things. You should even be able to come up with things like implied curvature or corner angles if you get clever. It's essentially a feature engineering problem at that point. Certainly nontrivial, but not impossible.
To encode characters, you could pick a point in your encoding space for each one and use those points to create Voronoi cells for the reverse mapping. That, or just a regular nearest-neighbors algorithm. This would probably be the best way of doing it, because you could potentially convert between different styles of ascii art. By itself that'd probably be a really interesting project.
There's also other ways you could do it that'd have their own pros and cons. You'd definitely run into plenty of problems, but personally these are the types of projects I enjoy.
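As a minimal sketch of the one-hot encoding floated in this thread (the character set and helper names are made up for illustration, not anyone's actual pipeline): each ASCII cell becomes a vector over a chosen character set, and decoding back is a trivial nearest-neighbor lookup, which here reduces to an argmax.

```python
# Hypothetical sketch: one-hot encode ASCII cells, decode by argmax.
CHARSET = " .:-=/\\|_#"  # made-up example character set

def encode_cell(ch):
    vec = [0.0] * len(CHARSET)
    vec[CHARSET.index(ch)] = 1.0
    return vec

def decode_cell(vec):
    # After interpolation/label smoothing the vector is no longer one-hot;
    # picking the largest component is the nearest-neighbor decode.
    return CHARSET[max(range(len(vec)), key=lambda i: vec[i])]
```

Interpolating between two such vectors gives exactly the "label smoothing" effect mentioned above, and the argmax still recovers a legal character.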
A solid block did exist in the 8-bit character tables for the Apple ][, most IBM PCs, PET(?)/Commodores, etc. In the era when we really had to use it, it was not uncommon to see. To the point that there were translators for BBSes that handled the "upper ASCII" and converted art to whatever system you were using -- with varying results. Also, you can (like the Apple ]['s "block", where bit 8 just inverted the cell of the 7-bit character) simply use an inverted space. Once you accept you're using ANSI escape codes to do fg/bg colors (moving from ASCII art to ANSI art), blocks are just inverted spaces anyway. TUI interfaces, from ANSI-based ones to the Borland IDEs, used both. I had custom macros in {COMMO} (think a BBS client that is more akin to emacs/neovim than a dumb terminal) that drew frame lines around my comments. It irritated some people using older CP/M machines and other 7-bit systems, but was impressive for those who could view it.
0:52 I feel sorry for y'all, the Seven Seas hogged all the best ASCII artists out there. And the full block, shade and empty extended characters are available in code page 437, the code page for the US in MS-DOS, which is the only code page most ASCII artists and NFO viewers care about. The DOS text mode came with tons of line, shadow and box piece characters for making mouse-centric GUIs in the days before Windows.
I studied image manipulation for 2 years and I do gamedev from time to time, so this video was easy to follow and a nice refresher for my rusty skills. I will make a demo when I get time, I guess.
I had the same experience as primeagen with shaders. Let me try to explain what a shader is to everyone who is as confused as I was:
Shaders are programs that you write and run on the GPU. Usually there are two main types of shaders that run when OpenGL/DirectX is rendering graphics: the vertex shader and the fragment shader.
The vertex shader is a function that runs for every vertex of every triangle on screen. It receives information about a vertex of a triangle (such as its x, y and z position in the game world) and other per-vertex data, and needs to return at least the position that that vertex will be drawn at on screen. It can also compute any other information about that vertex you want, such as how much light that vertex is receiving, but only the screen position is required.
The fragment shader runs during the process of drawing each pixel of each triangle on the screen. It receives the position of the pixel as well as any other information that you computed for the vertices. "But I only computed attributes for each vertex of the triangle, not for every pixel inside it." Don't worry, OpenGL/DirectX will interpolate the values and find the per-pixel value of whatever you computed above. What's the job of the fragment shader? Return the RGBA value of the pixel you're processing. It can be used to set the color of the pixel using a texture, or to compute lighting per pixel.
Acerola used a third type of shader: the compute shader. This is basically an "arbitrary shader" that is not tied to the process of rendering the scene and can be triggered at any moment. It can be used to perform any computation you want.
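To make the fragment shader idea concrete, here's a tiny CPU-side sketch in Python (not how GPUs actually execute things, and all names here are made up): conceptually, a fragment shader is just a function from pixel position to an RGBA color, run independently for every pixel.

```python
WIDTH, HEIGHT = 4, 4  # a toy 4x4 "screen"

def fragment_shader(x, y):
    # The classic UV-gradient test shader: color each pixel by its
    # normalized screen position. On a GPU this function would run
    # for every pixel in parallel.
    r = x / (WIDTH - 1)
    g = y / (HEIGHT - 1)
    return (r, g, 0.0, 1.0)  # RGBA

framebuffer = [[fragment_shader(x, y) for x in range(WIDTH)]
               for y in range(HEIGHT)]
```

The top-left pixel comes out black and the bottom-right yellow, the same gradient you'd see from this shader in a real graphics API.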
22:17 What you described was a box blur, where all the pixels contribute equally to the final average. Twitch chat was correct when saying weighted average: depending on the kernel size and other parameters, a function is defined and a kernel is created, and that kernel is then run for every pixel in the image. (A kernel is like the grid you drew, where each cell of the grid holds a multiplier; the gaussian blur's kernel is based on a 2D gaussian [normal] distribution, normalized so that the sum of the cells is 1.)
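A rough sketch of that kernel construction (the function names and the clamped-edge handling are my own choices for illustration, not necessarily what the video does):

```python
import math

def gaussian_kernel(radius, sigma):
    # Sample a 2D gaussian on the grid, then normalize so the cells sum to 1.
    k = [[math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
          for dx in range(-radius, radius + 1)]
         for dy in range(-radius, radius + 1)]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]

def blur_pixel(img, x, y, kernel, radius):
    # Weighted average of the neighborhood; edge pixels are clamped.
    h, w = len(img), len(img[0])
    acc = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            px = min(max(x + dx, 0), w - 1)
            py = min(max(y + dy, 0), h - 1)
            acc += img[py][px] * kernel[dy + radius][dx + radius]
    return acc
```

Because the weights sum to 1, blurring a uniform image leaves it unchanged, which is a handy sanity check for any blur kernel.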
Can't wait for the "from webdev to graphics programmer" career update video. I am also transitioning from AI script kiddie (language model research) to graphics programming right now...
Also, that operation the gaussian filter does is the same one used in the convolution layers of image recognition neural networks. They have kernels too, but many types of kernels, each for detecting different features (vertical lines, horizontal lines, curves, etc.).
Given how much you love this stuff, you should check out the old DOS program TheDraw. We used TheDraw to create all the BBS screens back in the day, including ANSI and extended ASCII characters like the block. I used TheDraw a few years ago when I created a BBS for iDOS on the iPad. Up until Apple nuked it anyway, because people kept finding ways around the side-loading of DOS games.
My favorite part of this is the implied detail from motion. What I mean is your brain fills in the detail that is missing during like a cutscene or something. It's awesome.
One thing to mention is that this isn't terminal rendering but direct GPU rendering of screen pixels. Thus he can get far more resolution than pure terminal drawing.
The problem with extended ASCII is that, depending on the OS and country, it completely depended on the code page that was used. The full block appears in the DOS code page 437, which is the native code page for the original IBM PC. Now that we have Unicode, we can use any defined character from any language. I think sometimes that typographical characters and ASCII are terms that may get conflated a bit.
Computational Neuroscience and Neuroinformatics are great fields to look into. They have tons of filters and have abstracted their use cases further from what they were originally meant to do than a walk on the moon is from the reach of any american office worker who still believes in the american dream.
My GPU programming is a bit dusty, but the simplest answer i can think of for what a shader is would be: a program that can modify data available at a stage of the rendering pipeline. For GPU programming with OpenGL and DirectX, when i started, the API exposed a rasterization pipeline with programmable vertex and fragment stages (with geometry and tessellation stages added along the road, plus compute shaders for more generic computation), but now i assume it is more composite, with the raytracing capabilities of modern cards (and a lot more features are also exposed now). Compute shaders are a bit outside this definition, as they are not really attached to a specific stage of the rendering pipeline, if i recall, and are more used for side computation not bound to the rendering pipeline directly. The types of shaders available might depend on both the API and the hardware you target. (I used to work on computer graphics GPU rendering stuff in OpenGL 4, doing vertex/geometry/tessellation (if available)/fragment shaders, while some colleagues were working on mental ray shaders for raytraced results on the CPU.)
Raytracing is treated sort of like compute in that it's detached from the standard rendering pipeline, instead using its own separate pipeline that is specific to raytracing. There is a standalone ray query operation that lets you perform a more limited raytracing step in the standard rendering pipeline or even in a compute shader, but it has a few limitations.
You can get this on pretty much any game using a program called *ReShade*, then installing the ASCII shader preset. I got it working on *Unreal Tournament* and *Hades* and it's pretty trippy, though you don't get characters that properly outline edges.
I am currently at the stage where i am really understanding graphics programming, after a couple of years of trying, weeks at a time. It's hard to piece all the pieces together :) it was very painful, but now, damn, i feel like i'm in a candy store :D
Someone's going to replicate this effect for their indie game that's about being trapped in a computer terminal and it's going to win best art direction in award shows.
The game Saints Row (i believe the third one?) also had an ascii shader that activated during a certain part of the game, and you could also use a cheat code to unlock it and use it permanently. When i saw it for the first time i was amazed, because it didn't even lower my FPS on my low end laptop back then.
Acerola is the goat of doing random shit with shaders
I don't like goats.
@@felixmoore6781 i love goats
@@felixmoore6781 do you like acerola? Then you like goats
you mean goat like Greatest Of All Times?
@@ewerybody goat as in GOAT, you goat
Acerola is just fantastic. His videos so effortlessly mix shit-posting, meme-ing, pragmatic problem solving, and application of math, all while projecting a sense of comprehension onto the viewer. To be honest, I don't understand the math behind some of his VFX programming, but he makes me feel like I do. I'm really glad Prime picked this up, because Acerola deserves the visibility this will bring to his channel.
His video about color is one of my favorite videos ever. He takes such complicated subjects like color spaces and makes them easy to understand.
You've been able to do this with Reshade for years. Just install reshade to your exe and select your renderer (D3D, OpenGL, Vulkan).
He knows about reshade. He even uses it in the video. It’s about making the existing effect better.
@@jakes-dev1337 bro's a professional hater
@@jakes-dev1337 Sure, but he made his own shader and implemented it with reshade. Tell me you don't know how shaders works without telling me.
This is like saying you've been able to do this forever with OptiFine when discussing someone's custom made shader in Minecraft. Makes no sense.
Need more reactions to Acerola videos from Prime. This guy is a magician with shaders.
Agree 👍💯
You've been able to do this with Reshade for years. Just install reshade to your exe and select your renderer (D3D, OpenGL, Vulkan).
@@jakes-dev1337 Yes, and he is using Reshade? But he is not just installing some existing shaders, he is making them
Prime reacting to an Acerola video? Damn, that’s a pretty good birthday present
Happy birthday!
Hopefully it keeps getting better from here on out! Happy birthday. :)
Thank you all :)
Happy birthday 🎉🎉🎉😊
"A sufficiently advanced shader is indistinguishable from a duck."
-John Carmack (probably)
2:52 Having seen the original video before, I was waiting for the moment where Acerola would say it wasn't good, just after Prime was saying it was good. Got a good chuckle from me.
But Acerola 😳
But Acerola
But Acerola
but aceroooola
but aceroolaaa
but aceroolaa
Prime should 100% watch the pixel sorting shader video
One of the Acerola's greatest videos
Before I discovered atan2, I had to make "atan2 at home". The atan (arctangent) function takes a slope (the ratio y/x) and returns an angle. Unfortunately, if you only provide it a pre-calculated ratio, you lose information, such as whether either or both of the components were negative, so classic atan can only give you an angle in [-90, 90]. atan2 takes the components separately, so it can also use a bit of logic to return the full range of angles, [-180, 180].
I remember doing that in the Scratch programming language back when I was messing around with it lol
When I learned that other programming languages just have that function built-in, my mind was blown
Same@@HedgehogGolf
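For anyone curious, an "atan2 at home" along the lines described above might look like this in Python (a sketch; the real library atan2 also handles signed zeros and infinities more carefully):

```python
import math

def atan2_at_home(y, x):
    # Plain atan only sees the ratio y/x, so it can't tell which quadrant
    # the point is in; we restore that using the signs of x and y.
    if x > 0:
        return math.degrees(math.atan(y / x))        # right half: (-90, 90)
    if x < 0:
        base = math.degrees(math.atan(y / x))
        return base + 180 if y >= 0 else base - 180  # left half
    # x == 0: straight up or down (atan would divide by zero)
    return 90.0 if y > 0 else (-90.0 if y < 0 else 0.0)
```

For points in the right half-plane it agrees with plain atan; the extra branches are exactly the "bit of logic" that recovers the other two quadrants.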
TIL Acerola exists, and my life has been enriched.
I’ve pre-watched this Acerola vid
Acerola is dope. I love his videos and he deserves recognition.
Let’s goooo baby! Acerola is OP
ASCII does not define the _font_ or design of characters but the map between bytes and character representations. Extended ASCII is not an encoding by itself (confusing, I know), but a classification, or "repertoire". Many OEMs would create their own version of Extended ASCII, some ANSI-compliant, others not.
DOS and Windows, before Unicode was formed, had their own Windows-1252, while other OEMs like IBM had Code page 437, as you probably know from IBM PC boot screens and BIOS menus. The latter includes box-drawing characters, where the former uses those code points for accented characters.
For example, the box-drawing "block" character (0xDB) in CP437 maps to █, but the same byte in CP1252 maps to Û (U with circumflex). Both are in the Unicode standard (which is why you can see them in this text).
So no, there's not one right way to represent ASCII art. It is totally up to the character set and font design. You _could_ say it would then be "ANSI art", since these sets extend ASCII, usually with 1 extra bit, but let's not be pedantic :)
@@dealloc Pretty interesting insight. That would make a pretty bad title though
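The 0xDB example above is easy to verify with Python's built-in codecs for these legacy code pages:

```python
# The same byte decodes to different characters depending on the code page.
raw = bytes([0xDB])

print(raw.decode("cp437"))   # FULL BLOCK (U+2588)
print(raw.decode("cp1252"))  # U WITH CIRCUMFLEX (U+00DB)
```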
That's where the phrase "image is worth a thousand words" began
You have to find a thousand words in it first; what are the rules anyway?
If there's 999.9 words, does it round up?
Do they need to be a thousand different, unique words?
Does it need to be a cohesive text with structured phrases?
Are duplicates of the same word allowed? Is just a thousand "a"s okay?
Dude what if he made a shader that only rendered complete words that described the object. Like he made a game where you solve puzzles by putting on the glasses from They Live
Acerola is a very great guy, you should watch more of his stuff, it's always super interesting and cool
@@mantevian it's even interesting and very entertaining if you don't understand most of it. His videos somehow always make me want to understand what he's talking about
It's crazy how toxic chat was at the beginning just because Acerola stated his artistic opinions, but as soon as he started talking math and showed off his edge detection, suddenly chat was super kind.
Yeah it was hilarious how soon they flipped
Well his artistic opinions were trash, so...
@@xenn4985 He was kinda right, harsh, but right. The ascii shader was subpar for the reasons he stated.
@@TheeSirRandom That's not what we were talking about though?
@@xenn4985 I never saw this comment back then, but what the fuck were you talking about then?
LETS GO finally prime got to acerola. Man he is THE one, every video is a masterpiece
25:00 wheel of time mentioned, based
This is a sick effect. Scaled down to youtube size, and with compression fuzzing the details of the ascii, it reminds me of those orange gas plasma dot matrix displays in late 90s/00s pinball machines that had a few levels of pixel intensity control. All the art was hand-drawn for that format, and this filter really seems to get pretty close to that look from what I remember.
Splitting the luminance linearly into bins feels like such a bad idea, since brightness perception is a log scale. It would probably look a lot better if you used more bins for the darker values.
Not to mention that the brightness of source pixels is accounted for twice: first by selecting a character from a subset, and a second time by multiplying the result by the source color. This effectively applies a gamma correction with γ=2 to the whole image, making it much darker and less coherent.
You have it in reverse. Brightness (as shown) is in gamma space, which gives it more variation in the darks. If he did this in linear space, the whole screen would look near full brightness to us.
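A quick sketch of the binning question above (the ramp string and gamma value are arbitrary choices for illustration, not the video's exact parameters): evenly spaced bins collapse most dark pixels into the empty cell, while spacing the bins in gamma-ish space spends more characters on the darks.

```python
RAMP = " .:-=+*#%@"  # dark -> bright, an arbitrary 10-step example

def bin_linear(lum):
    # lum in [0, 1], evenly spaced bins over the ramp.
    return RAMP[min(int(lum * len(RAMP)), len(RAMP) - 1)]

def bin_gamma(lum, gamma=2.2):
    # Warping luminance before binning dedicates more characters to darks.
    return bin_linear(lum ** (1.0 / gamma))
```

For a dark pixel like lum = 0.05, the linear binning yields a blank cell while the gamma-aware version still picks a visible character.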
Acerola is literally the goat, love his every vid
Prime dancing to Persona 3 music at the end made my day haha
If the Sobel filter is an approximation of the gradient, the difference of gaussians is an approximation of the Laplacian (the divergence of the gradient). Intuitively, the Sobel filter looks only at first derivatives, while the difference of gaussians is looking at the second derivative.
I think what the difference of gaussians is doing here is effectively removing large scale contrast variations (makes things hard when thresholding for edge detection) and blurring out noise (with the gaussian filtering), so that the Sobel filter has something cleaner to work with.
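A 1D sketch of that relationship (sigmas picked arbitrarily): subtracting a wide gaussian from a narrow one yields the classic "mexican hat" shape, positive at the center with negative side lobes, which is (up to sign and scale) what a Laplacian-of-gaussian kernel looks like.

```python
import math

def gauss1d(radius, sigma):
    # Normalized 1D gaussian sampled at integer offsets.
    k = [math.exp(-x * x / (2 * sigma * sigma))
         for x in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

RADIUS = 4
dog = [narrow - wide
       for narrow, wide in zip(gauss1d(RADIUS, 1.0), gauss1d(RADIUS, 2.0))]
```

Since both gaussians sum to 1, the DoG kernel sums to 0, so flat regions produce no response, which is exactly the "removing large scale contrast variations" behavior described above.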
Another thing taking the Laplacian does is help resolve thin lines.
If you have a line that's just one pixel wide, applying a Sobel filter will not capture it because Sobel has the highest response to step changes in luminance. A Laplacian filter however responds to peaks and troughs as well, and the blurring out should help those lines get found (albeit doubled) in the Sobel pass.
I'm curious what the effect would look like with a more principled approach: instead of edge detecting, you could essentially use the characters themselves as the kernels, and then pick whichever character produces the strongest response.
@@isodoubIet Even if it is one pixel wide, shouldn't Sobel capture it? Rather, it should capture thinner lines better than thick ones (of the same peak luminance) precisely because it would have the highest step response instead of a wide smear of lower gradient to get there.
@@MrSonny6155 The step changes on either side of the line will be picked up, but the line itself will not, resulting in a signal that's doubled and incorrectly placed. Something like a difference of gaussians will pick up a correctly placed, strong signal, without doubling because what it responds to is the minimum in the second derivative on the line itself rather than the ramping up to and from the line.
@@isodoubIet Ah, you are right. I was mentally mistaking the single pixel line for a regular "step up" edge.
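Here's a tiny 1D demonstration of the point settled in this thread (the kernels are the standard central difference and discrete Laplacian; the signal is made up): on a one-pixel-wide line, the first derivative is zero at the line itself with doubled responses on either side, while the Laplacian peaks right on it.

```python
signal = [0, 0, 0, 1, 0, 0, 0]  # a single-pixel-wide "line"

def convolve(sig, kernel):
    # Valid-mode 1D convolution (no padding).
    r = len(kernel) // 2
    return [sum(sig[i + j - r] * kernel[j] for j in range(len(kernel)))
            for i in range(r, len(sig) - r)]

gradient = convolve(signal, [-1, 0, 1])   # first derivative (Sobel-like)
laplacian = convolve(signal, [1, -2, 1])  # second derivative
```

The gradient comes out [0, 1, 0, -1, 0]: nothing at the line's position, responses displaced to its sides. The Laplacian comes out [0, 1, -2, 1, 0]: the strongest response sits exactly on the line.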
Holy, this guy is a god. I studied quite a bit about edge detection and image processing in my 2nd year of uni and i knew some of what he was talking about, but man, going from just knowing about something to actually doing something as fantastic as this is just magic.
I wonder when i will become a tier 6 grand mage like him
Happy to see Acerola getting the clout
YES!!! Love Acerola's videos! Happy to see him getting more attention!
Acerola is a certified wizard!
The result was amazing, and he went the smart route because you can get high color fidelity by simply brute forcing it to find the optimal fore/back/char combo, or you get "shape fidelity" by using edge detection. To put both to work in a "visually pleasing" way is no small feat.
Now that said... from someone who had at some point to do vector drawing for some years to put food on table, and that implied A LOT of bitmap tracing, i can tell you that no "easy solution" or "hand crafted algorithm" will ever achieve what a human can do by hand, either for vector drawing or ASCII art, which you could sort of call "low res semi vector art".
This is a thing that just SCREAMS Machine Learning, it's just that... we're lacking the big data it would take... And good luck on getting that, as convincing people that do bitmap>vector to give out data is the same as convincing them to phase out part of their income ;)
p.s. anyone suggesting auto-trace tools, just no, nice in a tight spot, good quick fix, not good enough generally as it takes so much work to clean up that yo might as well DIY
Actually the lack of data isn't true. It's an older architecture, but CycleGAN is kinda built for things like this.
You essentially just need two collections of images in different domains. For example, pictures of apples and pictures of oranges. The model then learns to convert pictures to and from both domains. There's no need for there to be 1-1 equivalents of each image
It would take some modifications and some manual curation, but definitely possible
Edit:
Should also mention pix2pix here which is another type of GAN called a cGAN or conditional GAN
@@Foulgaz3 What are you talking about? Because we're talking about img2vector and img2ASCII. Yes, img2img is pretty much solved, because... there's PLENTY of data. But what we're talking about... no data... on the output side. You can have as much as you want on the input there's images up the wazzoo everywhere, it's the output side that's lacking.
@@ErazerPT hence the modification, but this could actually still be img2img.
Just use softmax instead of sigmoid for the color channels and then use a one-hot encoding to dedicate the ascii characters you want to use to different channels. So an img2img architecture could be adapted to it pretty easily.
Then it just becomes encoding data into that format. All that should take is collecting txt files for a bunch of ascii art. Sizing would be an issue, but you could encode the ascii art and then interpolate it to whatever size you need. This would end up with floating point values, but it'd effectively just be doing label smoothing.
Eventually you could probably use some semi-supervised learning by curating the results you generated to create a larger synthetic dataset.
So yeah I don't really see the problem
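The one-hot encoding idea described above can be sketched in a few lines of Python. Everything here is illustrative (the character set, the array layout); it just shows how ASCII art round-trips through per-character channels the way an img2img head with softmax output would see it:

```python
import numpy as np

# Hypothetical character set; each character gets its own channel.
CHARSET = " .:-=+*#%@"

def encode_ascii(art_rows):
    """Turn a list of equal-length ASCII strings into a one-hot
    (H, W, C) array, one channel per character in CHARSET."""
    h, w = len(art_rows), len(art_rows[0])
    out = np.zeros((h, w, len(CHARSET)), dtype=np.float32)
    for y, row in enumerate(art_rows):
        for x, ch in enumerate(row):
            out[y, x, CHARSET.index(ch)] = 1.0
    return out

def decode_ascii(tensor):
    """Inverse: take the argmax channel per cell, which is what a
    softmax output head would effectively give you at inference."""
    idx = tensor.argmax(axis=-1)
    return ["".join(CHARSET[i] for i in row) for row in idx]

art = [" .:", "#%@"]
assert decode_ascii(encode_ascii(art)) == art
```

Interpolating the encoded array to other sizes would then produce the soft (label-smoothed) values mentioned above, decoded the same argmax way.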
@@Foulgaz3 Ok, now I understand the thought process. And you're right, you don't see the problem. The point is NOT to turn "this image" into "a generic ASCII art of the subject matter" but to turn it into a "high quality ASCII art of the subject matter". The former you have data for, just scrape the web for it. The latter you don't, because you need both the source image and the ASCII art someone made of it.
Take for example castles. Plenty of images, and plenty of ASCII art. But not "image AND its ASCII art" pairs.
It's easy for example to do img2text, IF you're after the "overall description" and not "every detail". And given ASCII art is ALL about the details... extrapolate the conclusion ;)
p.s. on the scaling part, you'd need a very smart upscaler to synthesize "larger sizes" for the dataset, because ASCII art doesn't scale too well using naive methods. "|-" for example will scale into "|---" at 2x for most cases, never "||--". But "-=" will most likely scale to "--==". And "/\" scales most likely to "/ \" because part of it "went up". What about "/_"? Well that is most likely "/___" AND some "went up". Not integer scaling? Good luck with that.
It looks a lot easier than it is.
@@ErazerPT oh I didn't mean to imply it'd be easy. I've done enough similar projects to know that it wouldn't.
More just that you wouldn't strictly need one-to-one pairings.
Thanks for the conversation btw; it's nice to talk with someone who's clearly familiar with ML. I actually agree that there'd be plenty of other problems you'd run into along the way that make it very difficult.
But I will say that the problems you pose aren't impossible obstacles. Your concerns regarding interpolation mostly boil down to the encoding not being very robust. To fix it, you could come up with a more robust scheme that preserves visual relationships between characters.
In text processing, character encoding schemes often seek to mathematically preserve semantic relationships between characters, like 'A' - 'a' + 'b' = 'B' or '[' - '(' + ')' = ']'.
In this context, you could maybe record the sine and cosine of the angle along with something like luminance for the image side of things. You should even be able to come up with things like implied curvature or corner angles if you get clever.
It's essentially a problem of feature engineering at that point. Certainly nontrivial, but not impossible.
To encode characters, you could pick a point in your encoding space for each one and use those points to create Voronoi cells to do the reverse. That, or just a regular nearest neighbors algorithm.
This would probably be the best way of doing it because you could potentially convert between different styles of ASCII art. By itself that'd probably be a really interesting project.
There are also other ways you could do it that'd have their own pros and cons.
You'd definitely run into plenty of problems, but personally these are the types of projects I enjoy.
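The decode step described above (nearest neighbor = Voronoi-cell lookup) can be sketched with made-up feature values; the (sin, cos, luminance) triples below are purely illustrative, not taken from any real system:

```python
import math

# Hypothetical feature space per character: (sin(angle), cos(angle), luminance).
# The values are invented for illustration.
CHAR_FEATURES = {
    "|": (math.sin(math.pi / 2), math.cos(math.pi / 2), 0.5),
    "-": (math.sin(0.0), math.cos(0.0), 0.4),
    "/": (math.sin(math.pi / 4), math.cos(math.pi / 4), 0.45),
    " ": (0.0, 0.0, 0.0),
    "#": (0.0, 0.0, 1.0),
}

def nearest_char(feature):
    """Decode a feature vector back to a character by nearest
    neighbor -- i.e. find which Voronoi cell the point falls in."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(CHAR_FEATURES, key=lambda c: dist(CHAR_FEATURES[c], feature))

# A vertical-ish, mid-luminance feature should land on "|".
print(nearest_char((0.9, 0.1, 0.5)))  # → |
```

Because the decode only cares about distances in the feature space, swapping in a different CHAR_FEATURES table is what would let you convert between ASCII art styles.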
my mans just casually busting out the wizardry
A solid block did exist in the 8 bit character tables for Apple ][, most IBM-PCs, PET(?)/Commodores, etc. In the era when we really had to use it, it was not uncommon to see. To the point there were translators for BBSes that handled the "upper ASCII" and converted art to whatever system you were using -- with varying results. Also, you can (like the Apple ]['s "block", where bit 8 just inverted the cell of the 7 bit character) just use a space that is inverted. Once you accept you're using ANSI escape codes to do fg/bg colors (moving from ASCII art to ANSI art), blocks are just inverted spaces anyway. TUI interfaces, from ANSI-based ones to Borland IDEs, used both.
I had custom macros in {COMMO} (think a BBS client that is more akin to emacs/neovim than a dumb terminal) that drew frame lines around my comments. It irritated some people using older CP/M machines and other 7-bit systems, but was impressive for those that could view it.
0:52 I feel sorry for y'all, the Seven Seas hogged all the best ASCII artists out there.
And the full bright, shade and empty extended characters are available in code page 437, the code page for the US in MS-DOS, which is the only code page most ASCII artists and NFO viewers care about. The DOS text mode came with tons of line, shadow and box piece characters for making mouse-centric GUIs in the days before Windows.
I studied image manipulation for 2 years and I do gamedev from time to time, so this video was easy to follow and a nice refresher for my rusty skills.
I will make a demo when I get time I guess.
Yessss Acerola is awesome, glad he's getting plugged here
I had the same experience as primeagen with shaders. Let me try to explain what a shader is to everyone who is as confused as I was:
Shaders are programs that you write and run on the GPU. Usually there are two main types of shaders that run when OpenGL/DirectX is rendering graphics: vertex and fragment shader.
The vertex shader is a function that runs for every vertex of every triangle on screen. It receives information about a vertex of a triangle (such as its x, y and z position in the game world) and other per-vertex information, and needs to return at least the position where that vertex will be drawn on screen. It can also compute any other information about that vertex you want, such as how much light that vertex is receiving, but only the screen position is required.
The fragment shader runs during the process of drawing each pixel of each triangle on the screen. It receives the position of the pixel as well as any other information that you computed for the vertices. "But I only computed attributes for each vertex of the triangle, not for every pixel inside it". Don't worry, OpenGL/DirectX will interpolate the values and find the value per pixel of whatever you computed above. What's the job of the fragment shader? Return the RGBA value of the pixel you're processing. Can be used to set the color of the pixel using a texture, or compute lighting per pixel.
Acerola used a third type of shader: compute shader. This is basically an "arbitrary shader" that is not tied to the process of rendering the scene and can be triggered at any moment. Can be used to perform any computation you want.
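The two classic stages above can be mimicked with a toy software analogue in Python. This is only a mental model: the projection and "lighting" are made up for illustration, and real shaders run on the GPU in GLSL/HLSL, one invocation per vertex/pixel in parallel:

```python
# Toy software analogue of the two classic shader stages.
# All math here (orthographic projection, depth-based brightness)
# is invented for illustration.

def vertex_shader(vertex):
    """Runs once per vertex: must output a screen position,
    may output extra attributes (here: a brightness value)."""
    x, y, z = vertex
    screen_pos = (x, y)              # orthographic projection: drop z
    brightness = max(0.0, 1.0 - z)   # fake depth-based lighting
    return screen_pos, brightness

def fragment_shader(interpolated_brightness):
    """Runs once per covered pixel: must output an RGBA color.
    It receives per-vertex attributes interpolated across the triangle."""
    b = interpolated_brightness
    return (b, b, b, 1.0)            # grayscale from brightness

pos, bright = vertex_shader((0.5, 0.5, 0.25))
print(fragment_shader(bright))  # → (0.75, 0.75, 0.75, 1.0)
```

On real hardware the "interpolated" part is done for you by the rasterizer between these two stages, exactly as the comment describes.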
acerola is a friggin genius
So surprised to see Acerola here! Fuck yeah! Dude deserves the attention!
Acerola is the graphics god. Love all of his videos
If you're curious about how he knows what he knows, you could watch his 100,000 sub special: 'What Is A Graphics Programmer?' A great video as well.
Acerola is so good and entertaining, I love that guy.
I have a CS degree, and this was one big "I like your funny words magic man". Amazing what some people can do!!!
I LOVED THIS VIDEO! THANK YOU!
Acerola chads rise up.
27:40 "What kind of wizard is this!?!" This is the power of *one* graphics programmer
22:17 What you described was a box blur, where all the pixels contribute equally to the final average. Twitch chat was correct when saying weighted average: depending on the kernel size and other parameters, a function is defined and a kernel is created. That kernel is then run for every pixel in the image. (A kernel is like the grid you drew, and in each cell of the grid there is a multiplier; the Gaussian blur's kernel is based on a 2D Gaussian [normal] distribution, so that the sum of the cells is 1.)
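The box blur vs Gaussian distinction can be sketched in Python (size and sigma are arbitrary choices here):

```python
import math

def gaussian_kernel(size, sigma):
    """Build a normalized 2D Gaussian kernel: each cell is a weight
    from the 2D normal distribution, rescaled so the cells sum to 1."""
    half = size // 2
    k = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma))
          for x in range(-half, half + 1)]
         for y in range(-half, half + 1)]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]

def box_kernel(size):
    """Box blur: every cell contributes equally."""
    w = 1.0 / (size * size)
    return [[w] * size for _ in range(size)]

g = gaussian_kernel(5, 1.0)
assert abs(sum(sum(r) for r in g) - 1.0) < 1e-9   # weights sum to 1
# The center pixel dominates in the Gaussian, unlike the box blur.
assert g[2][2] > box_kernel(5)[0][0]
```

Both kernels sum to 1 so the image brightness is preserved; the only difference is how the weight is distributed across the window.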
OMFG Greyman from The Wheel of Time mentioned... Lets fuggin go.
Can't wait for the from webdev to graphics programmer career update video.
I am also transitioning from AI script kiddie (language model research) to graphics programming right now...
also the operation the Gaussian filter does is the same one used in the convolution layers of image recognition neural networks. They have kernels too, but actually many types of kernels, each for detecting different features (vertical lines, horizontal lines, curves, etc.)
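A minimal sketch of that sliding-window operation, with a hand-made Sobel-style kernel for vertical edges (strictly speaking, CNN layers compute cross-correlation, i.e. convolution without flipping the kernel):

```python
def cross_correlate(image, kernel):
    """The sliding-window op CNN layers compute: slide the kernel
    over the image, multiply cell-wise, and sum (valid mode, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    h = len(image) - kh + 1
    w = len(image[0]) - kw + 1
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = sum(image[y + i][x + j] * kernel[i][j]
                            for i in range(kh) for j in range(kw))
    return out

# Sobel-style kernel that responds to vertical edges.
VERTICAL = [[-1, 0, 1],
            [-2, 0, 2],
            [-1, 0, 1]]

# An image with a hard vertical edge down the middle.
img = [[0, 0, 1, 1]] * 3
print(cross_correlate(img, VERTICAL))  # → [[4, 4]]: strong response at the edge
```

A trained CNN learns many such kernels at once instead of having them hand-designed; the arithmetic per window is identical to the blur case, just with different weights.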
Returnal: 🫥
Prime: Sooo good
Ace: Not good
So glad to see you check out Acerola the man is absolute CHAD and I hope you check out some of his other videos because they're all so good
Prime reacting to Acerola. Is this heaven.
Great video and reaction. Got me hyped for image processing. Sad I forgot what I learned way back in school.
Given how much you love this stuff, you should check out the old DOS program TheDraw. We used TheDraw to create all the BBS screens back in the day, including ANSI and extended ASCII characters like the block. I used TheDraw a few years ago when I created a BBS for iDOS on the iPad. Up until Apple nuked it anyway, because people kept finding ways around the side-loading of DOS games.
My favorite part of this is the implied detail from motion. What I mean is your brain fills in the detail that is missing during like a cutscene or something. It's awesome.
Acerola has awesome videos just like this, going over other types of shader techniques and other computer graphics programming stuff
That was absolutely amazing, wtf.
I KNEW this was gonna happen lmao. Worlds are actually colliding rn.
But Acerola!!!
My jaw dropped like 8 times in this bruhhhhhh
> Watches a video on ASCII Shader
> Suddenly wants to become an edgelord
Acerola the goat inspired me to make my own shaders...
It reminded me school days when I learned about this subject, really great video!
This guy is literally an Edge Lord. 🤣🤣🤣
One thing to mention is that this isn't terminal rendering but direct GPU screen-pixel rendering. Thus he can get far more resolution than pure terminal drawing.
10:13 These aren't just any old cat girls, Prime...
wrong timestamp
@@RandomGeometryDashStuffThanks. Fixed.
Acerola is amazing. We need more
I love when someone puts out something insanely niche, and then the other guy with the exact same interest finds it.
15:10 And you are nothing short of that, which is one of the many reasons, I like watching your content so much ❤
Acerola hype! I love his channel
22:58 of course it is confusing, it's an upside-down (flipped-kernel) 2D convolution of the original image with the kernel :")
The problem with extended ASCII is that, depending on the OS and country, it completely depended on the code page that was used. The full block appears in the DOS code page 437, which is the native code page for the original IBM PC. Now that we have Unicode, we can use any defined character from any language. I think sometimes that typographical characters and ASCII are terms that may get conflated a bit.
did you just reference the wheel of time? Subbed.
23:19: Thinking about the Victorian era
23:21: Living in the Victorian era
Ensha couldn't hold a candle to this level of "visions of edge, lord."
acerola is the goat and i highly encourage to watch his other videos on your own
Fun fact, Acerola is a common fruit in Brazil. I like to make drinks with it after pythoning all day
You should see the video "your colours suck", which analyzes color theory.
I'm just sitting here watching waiting for his mind to explode when he realizes that Acerola is making a live shader out of this.
I fucking love Acerola so much
I am among his first 10000 or so subs and for good reason
Acerola deserves the attention!
Acerola is the best! Huge fan.
Computational Neuroscience and Neuroinformatics are great calls to look into.
They have tons of filters and stuff, and have abstracted their use cases further from what they were originally meant to do than the distance of a moonwalk would be for any American office worker who still believes in the American dream.
Dude I was trying to look at cool Elden Ring art I had no idea I was walking into a calculus lesson.
My GPU programming is a bit dusty, but the simplest answer I can think of for what a shader is would be: a program that can modify data available at a stage of the rendering pipeline. For GPU programming with OpenGL and DirectX, when I started, the API exposed a rasterization pipeline with programmable vertex and fragment stages (with geometry and tessellation stages added along the way, plus compute shaders for more generic computation), but now I assume it is more composite, with the raytracing capabilities of cards (and a lot more features exposed as well).
Compute shaders are a bit outside this definition, as they are not really attached to a specific stage of the rendering pipeline, if I recall, and are more used for side computation not bound to the rendering pipeline directly.
The types of shaders available might depend on both the API and the hardware you target (I used to work on computer graphics GPU rendering stuff in OpenGL 4, doing vertex/geometry/tessellation (if available)/fragment shaders, while some colleagues were working on mental ray shaders for raytracing results on the CPU).
Raytracing is treated sort of like compute in that it's detached from the standard rendering pipeline, instead using its own separate pipeline that is specific to raytracing. There is a standalone ray query operation that lets you perform a more limited raytracing step in the standard rendering pipeline or even in a compute shader, but it has a few limitations.
You can get this on pretty much any game using a program called *ReShade* then install the ASCII shader preset.
I got it working on *Unreal Tournament* and *Hades* and it's pretty trippy, though you don't get characters that properly outline edges
Acerola on Primeagen is something I really wished for. Love 'em :D
Acerola+Primeagen "Vim with me" colab would be amazing!
Seems we all loved acerola from before, love it
Woah processing code is so cool
Love acerola, more reacts to his other stuff would be sick haha
There used to be a "Text Mode Demo Contest" demoscene party releasing ONLY such demos
Why is my linear algebra all on a hyper sphere?
That is the AWESOME thing about shaders. I wanted to learn more about them but haven't had the time to do it
That shader seriously needs to be a full on mod for many games. Absolutely stunning!
oh my god it's happening, two of my favorite programming content creators in one video. Now I'm waiting for him to get invited to a stream
I am currently at the stage where I am really understanding graphics programming, after a couple of years of trying, for weeks at a time probably. It's hard to piece it all together :) it was very painful but now, damn, I feel like I'm in a candy store :D
FINALLY Prime is clueless about something. Feels so good to be, FOR A CHANGE, better than Prime at something. Watching all his other vids I'm just so clueless.
The difference of Gaussians is a kernel that picks up edges because it approximates the Laplacian of Gaussian, i.e. a Gaussian-smoothed second-derivative operator.
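A one-dimensional sketch of why a difference of Gaussians responds at edges (all sigmas and sizes here are illustrative choices):

```python
import math

def gauss(x, sigma):
    """1D Gaussian density."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

def dog(x, s1=1.0, s2=1.6):
    """Difference of Gaussians: subtracting a wider blur from a narrower
    one leaves a band-pass, second-derivative-like kernel."""
    return gauss(x, s1) - gauss(x, s2)

# A step edge: flat, then a jump. The DoG response is near zero in the
# flat regions and largest right around the jump.
signal = [0.0] * 10 + [1.0] * 10
kernel = [dog(x) for x in range(-4, 5)]
response = [sum(signal[i + j] * kernel[j + 4] for j in range(-4, 5))
            for i in range(4, len(signal) - 4)]
peak = max(range(len(response)), key=lambda i: abs(response[i]))
print(peak + 4)  # signal index of the strongest response, next to the edge at 9/10
```

The 2D case works the same way, which is why DoG shows up as a cheap edge detector in the video's pipeline.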
Someone's going to replicate this effect for their indie game that's about being trapped in a computer terminal and it's going to win best art direction in award shows.
The game Saints Row (I believe the third one?) also had an ASCII shader that activated during certain parts of the game, and you can also use a cheat code to unlock it and use it permanently. When I saw it for the first time I was amazed, because it didn't even lower my FPS on my low-end laptop back then.
For the record, Prime, this is now officially research.