AI Art Explained: How AI Generates Images (Stable Diffusion, Midjourney, and DALLE)

Jay Alammar

38K views

Published: Jan 4, 2025

Comments • 47

@TaherART 2 months ago
Amazing Jalal, best material I've seen so far describing this matter.
Keep up the great work old friend😊
@arp_ai 2 months ago
Glad to hear it, bro! Hope you're doing excellent!
@omidsajedi5 2 years ago ⁺¹⁶
You have a very unique way of explaining deep learning concepts. The illustrations are very concise and to the point which really helps focus on the core concepts and not get distracted by technical details. Thanks for making this great video!
@herrbonk3635 11 months ago
Good for you, I understood nothing. Some concrete technical detail would have helped me.
@Tjeminee 1 year ago
As a visual thinker, SD can be quite overwhelming under the hood. I have been using the graphical interface "ComfyUI" and it has taken me quite a distance in understanding the dynamics of SD. Your video and page helped me a lot in taking the next step to the more advanced features and expanding my options. Thanks Jay!
@anupamsaha674 5 months ago
Thank you, Sir, for sharing. Your explanation is always different. I have been following you since your transformer architecture explanation. Great!
@trajesh81 1 year ago ⁺¹
Thanks Jay! Just like your NLP Transformer series, which still stands the test of time, this is one more added to my list of go-to references! You are indeed a master in the art of teaching!
@laostalk 6 months ago
Very practical and useful information. Thanks!
@sanyahyde3959 1 year ago
Excellent video, thank you!
@maxkhan4485 1 year ago
Thank you! I finally understand Stable Diffusion!
@karthik8972 1 year ago
Thanks Jay for the video. I understand the concept of converting a noised image to a clear image.
How does it create an image which doesn't exist in its training data?
It is understood that the model doesn't understand the concepts in the image and only focuses on the patterns.
But how are the operations below performed?
1. Creating a cartoon image of a cat based on a caption, e.g. "Place a hat on top of a cat"
How does it create a cartoon image of a cat?
How does it know the exact location of the cat's head?
How does it know to place the hat exactly on the head?
2. A closeup shot of a dog facing the sun
How does it know to create a closeup shot of a dog?
How does it know to place the sun in the background?
How does it make the object turn towards the sun?
No videos exist that explain this concept. It would be of great help if you could make a video on this.
@DrNoureddinSadawi 1 year ago
Nice explanation, thanks!
@unwind_ai 2 years ago ⁺¹
Great explanation, loved it!
@andrechoi2553 1 year ago ⁺¹
Good video, very inspiring😁
@XishanAfzal 1 year ago
More than useful. Thanks
@paresh1930 1 year ago
Thank you for this great explanation!
@nqnam12345 2 years ago ⁺¹
great Jay!
@JohnGilbertmoore 1 year ago
It renders the image from text instead of a 3D model. It's like Maya, but with words, using 1B+ "pre-trained models" (images with their text descriptions) from the Internet wired up with plain English, so you don't have to build the models in 3D. You can just type what you want to create in plain English, and the AI renders out the image.
@daveonvr2192 1 year ago
Thanks Jay - I had been looking for something that does more than describe the denoising process, and the attention bit related to prompts is what I was missing. That said, I still can't quite understand how you get a completely new image. I can understand that you should be able to get back to an original image (say a dog, or a flower) via the noisification and reverse process, but how can it, say, create an image with a flower and the dog such that they are integrated in some way? Where does that data come from? A visual example of the earlier stages which show this would be helpful. The examples you had jumped from basically noise to an image (albeit unrefined) in 3 steps - I'd like to see this broken down so I can "see" what is happening. It still requires a level of acceptance without evidence that I am not happy with....
@d.p.5874 1 year ago ⁺¹
Thanks Jay for all your efforts to share a bit of your knowledge in AI.
I am not an expert, by far, but I came to the conclusion that AI is mainly a construction of hundreds of Lego bricks, assembled together into specific architectures and trained with the same gradient backpropagation algorithm. Some of them perform well, others don't.
Therefore, the only genuine piece of AI theory is the mathematical background of the training algorithm. The rest is pure heuristics, more or less well explained, a kind of AI cookbook with ad hoc recipes.
The training algorithm itself seems very limited (even if highly powerful), since it is applied in a centralized way to a predefined architecture and does not participate in defining the architecture's topology. In other words, the topology is defined before the training while, intuitively, the training should probably define the topology.
Therefore incremental learning remains a big issue in most AI architectures, if not all.
This lack of a consistent and unified AI theory (there are, to my limited knowledge, no AI theorems or demonstrations that some sort of optimum is reached using a given architecture) makes me believe that we are at the very beginning of a new science still to come.
Could you react to the above humble considerations and share your thoughts?
Kind regards,
@anilsharma32g 1 year ago
Dear Sir, I am your subscriber.
I want to create a tool that finds text errors in an image.
For example:
if I forgot to write CONTACT US, BUY NOW, or a CONTACT NUMBER, or made a SPELLING MISTAKE, etc. in my social media post,
the tool finds the error and suggests what is missing or incorrect in the post.
🙏 Please guide me and suggest what course I need to buy or what I need to learn to create this tool.
Thank you!
@rachidbensaid6629 2 years ago
Great Work, Good luck
@muhammedaneesk.a4848 1 year ago
Thanks for the explanation. Can you please make a 1 hr or 2 hr video with a deeper dive into the internals? Maybe you already have it recorded, I guess. Thanks.
@RodrigoRibeiroGomes 8 months ago
Excellent!!!
@jamiewatts333 1 year ago
Is this simplified explanation of the process of noise in Stable Diffusion true?
It's like teaching an artist about our visual world -- object definitions, shapes, dimensions, etc., and how they correspond to the person who commissioned the art (text prompts).
The artist then watches a mosaic - say of an ice cream - being inserted by hundreds of tesserae (rectangular slabs used to create a mosaic) and then removed to restore the original mosaic. During this, the artist learns how to understand, recreate, and reinterpret the ‘ice cream’ image in other mosaics. The artist goes through this with millions of other depictions in mosaics (objects, locations, etc.) so they can create entirely new mosaics based on the requests (or text prompts) of the person commissioning them.
Sampling steps are like commissioning an artist to interpret and construct a mosaic quickly or carefully. The more detail or accuracy you want, the more work and time have to go into it.
@itsnotthattough7588 1 year ago
Thanks, sir!
@adeelgilll 2 years ago ⁺¹
excellent
@方小兰 10 months ago
thank you
@mostlynotworking4112 1 year ago
Simple question: does that mean it can't create a prompt (or specific word) that it hasn't been trained on? Thank you for your video!
@justaguy2365 8 months ago
Oppose to the end!
@UnderstandingCode 1 year ago
Love from Saudi Arabia!
@10FACTSABOUTGAMES 1 year ago ⁺¹
Would you kindly tell me if it is possible to sell the artwork that I made with Stable Diffusion, and does the administration allow this? How can I communicate with them (I mean the management or support for this program), and where can the pictures be sold as pieces of art? I do not speak English, help me.
@treksis 2 years ago ⁺²
👆just like the transformers series, excellent
@CptBlaueWolke 2 years ago ⁺⁴
*AI Pictures. Art means craftsmanship and personal expression
@nerdfinite 2 years ago ⁺¹
Not all photographs are art, but photography can be an art. The nonsense I draw in a game of Pictionary has no craftsmanship or personal expression. However, illustration is an important form of art.
Not only would the AI never produce art on its own, it would never produce anything. The amount of craftsmanship and personal expression being put into the image is dependent on the person using it. A low effort random prompt to the AI is arguably not art, but that's not really the point.
@youtuberaphaell 1 year ago
Writing the prompts is personal expression
@avistryfe4534 1 year ago
@@youtuberaphaell nope. It aint shit. Even with a shortcut. You will still have zero talent or expression. Anyone can say those words.
So you have the same skill and expressive power as a toddler. Enjoy.
Pretend with your orgy of robots all you like. But you are not special.
@mingkko1 1 year ago ⁺¹
@@youtuberaphaell so is ordering food at a restaurant but that does not make you a chef.😉
@CptBlaueWolke 1 year ago
@@youtuberaphaell No, it isn't. Writing a full text by yourself is.
@simawpalmer7721 1 year ago
Thanks, great video again, but your voice has a lot of sibilants, making the listening experience atrocious. If you make enough money making these videos, I suggest hiring a professional audio producer/mixing guy to clean up the audio. Email me, I'll suggest someone.
@anneallison6402 1 year ago
This is not art, don't be silly
@mpavankumar6695 1 year ago
No, this is revolution
@pierrelebreton7634 1 year ago
Thank you, really nicely explained!
