- 30 videos
- 398,392 views
Jia-Bin Huang
Joined Jul 2, 2008
Sharing what I learned along the way!
3D Reconstruction by Shedding New Light, Literally
We present a method to reconstruct 3D objects from images captured under extreme illumination variations.
📝 arxiv.org/abs/2412.15211
🌐 relight-to-reconstruct.github.io/
00:00 Introduction
00:40 Single illumination case
01:19 Unstructured image collection
02:07 Multiview relighting
02:30 Shading embedding vs. appearance embedding
03:15 Comparisons
Reference:
Hadi Alzayer, Philipp Henzler, Jon Barron, Jia-Bin Huang, Pratul P. Srinivasan, and Dor Verbin
Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation
arXiv 2024
1,597 views
Videos
This AI Makes 3D Illusions
1.3K views · 1 month ago
Creating 3D multiview illusion art requires exceptional artistic skills and time. In this work, we show we can democratize 3D illusion generation with the power of AI (specifically, image priors from pretrained text-to-image models). Chapters: 00:00 Optical illusions 00:30 Examples of 3D illusion 01:21 Generating 3D illusion with AI 02:13 Single-view anamorphic art 03:00 Multiview anamorphic ar...
This AI Learned to Turn a Video Into Layers
8K views · 1 month ago
Layered composition has been an indispensable aspect of video editing. This work presents a method that extracts semantically meaningful layers for each object of interest. This allows applications like creative video compositions, moment retiming, action shots, and object removal. Chapters: 00:00 Layer composition 01:00 Video editing applications 02:33 Method overview 03:29 Comparison with the...
How FlashAttention Accelerates Generative AI Revolution
4K views · 2 months ago
FlashAttention is an IO-aware algorithm for computing attention used in Transformers. It's fast, memory-efficient, and exact. It has become a standard tool for speeding up LLM training and inference. Join me and learn how FlashAttention works! References: - [OnlineSoftmax] arxiv.org/abs/1805.02867 - [From Online Softmax to FlashAttention] courses.cs.washington.edu/courses/cse599m/23sp/notes/fla...
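The description above names online softmax, the recurrence FlashAttention tiles over blocks of keys. As a minimal NumPy sketch of the one-pass trick (my own illustration, not the FlashAttention kernel itself):

```python
import numpy as np

def online_softmax(scores):
    """One-pass softmax: keep a running max m and running normalizer d,
    rescaling d whenever a new max appears, so the scores are read once.
    FlashAttention applies this same rescaling per block of keys."""
    m = -np.inf   # running max
    d = 0.0       # running sum of exponentials
    for x in scores:
        m_new = max(m, x)
        d = d * np.exp(m - m_new) + np.exp(x - m_new)
        m = m_new
    return np.exp(np.asarray(scores) - m) / d
```

The same max/normalizer bookkeeping is what lets FlashAttention merge per-block partial results without ever materializing the full attention matrix in slow memory.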
How Rotary Position Embedding Supercharges Modern LLMs
4.7K views · 2 months ago
Positional information is critical to transformers' understanding of sequences and to their ability to generalize beyond the training context length. In this video, we discuss: 1) why the attention mechanism in transformers is not sufficient on its own, 2) earlier attempts at injecting positional information (e.g., sinusoidal positional encoding), 3) rotary position embedding, and 4) techniques for long-context...
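As a rough companion to the description above, here is a NumPy sketch of rotary position embedding using the split-halves pairing convention (a choice I am assuming for illustration; implementations also use interleaved pairing):

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Rotate each feature pair (x[i], x[i + d/2]) by the angle
    pos * base**(-2i/d). Dot products between rotated queries and keys
    then depend only on the relative offset between their positions."""
    d = x.shape[-1]
    half = d // 2
    freqs = base ** (-2.0 * np.arange(half) / d)  # per-pair rotation frequency
    theta = pos * freqs
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[:half], x[half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos])
```

The relative-position property can be checked directly: shifting both the query position and the key position by the same amount leaves their dot product unchanged.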
The Algorithm that Helps Machines Learn
1.4K views · 3 months ago
How do machines learn? In this video, we review the basic ideas of optimizers, algorithms that efficiently update the parameters of deep neural networks and minimize the loss function. We will cover gradient descent, momentum, RMSProp, Adam, and AdamW. References: [RMSProp] www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf [Adam] Adam: A Method for Stochastic Optimization arxiv.o...
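The update rules listed above are compact enough to write out directly; this NumPy sketch of a single Adam step is my own illustration (hyperparameter defaults follow the Adam paper):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: momentum (first moment) plus RMSProp-style
    scaling (second moment), with bias correction for the zero init."""
    m = b1 * m + (1 - b1) * grad          # first-moment (momentum) estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment (RMSProp) estimate
    m_hat = m / (1 - b1 ** t)             # bias-corrected estimates
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(x) = x^2 (gradient 2x) starting from x = 5
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 501):
    theta, m, v = adam_step(theta, 2.0 * theta, m, v, t, lr=0.1)
```

AdamW differs only in where weight decay enters: it is applied directly to the parameters rather than folded into the gradient.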
But What Are Transformers?
5K views · 5 months ago
The transformer is arguably the most influential neural network architecture of the last decade, powering the current boom of generative AI. In this video, we review the basic ideas of the original encoder-decoder transformer architecture and understand how its various design decisions were made. Enjoy!
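As a companion to the description above, here is a minimal NumPy sketch of scaled dot-product attention, the core operation of the transformer (a bare single-head version of my own, without masking or learned projections):

```python
import numpy as np

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V: each query mixes the value rows,
    weighted by its scaled similarity to every key."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w = w / w.sum(axis=-1, keepdims=True)                 # rows sum to 1
    return w @ V
```

A handy sanity check: an all-zero query gives uniform weights, so its output is exactly the mean of the value rows.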
How I Understand Flow Matching
21K views · 7 months ago
Flow matching is a new generative modeling method that combines the advantages of Continuous Normalising Flows (CNFs) and Diffusion Models (DMs). In this tutorial, I share my understanding of the basics of flow matching and provide an overview of how these ideas evolved over time. Check out the resources below to learn more about this topic. Slides Introduction: www.dropbox.com/scl/fi/tv449mdq0k...
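To make the idea concrete, here is a sketch of the conditional flow matching training pair under the linear (optimal-transport) probability path, one common choice (my own minimal illustration, not code from the video):

```python
import numpy as np

def cfm_pair(x0, x1, t):
    """Linear path x_t = (1 - t) x0 + t x1 from noise x0 to data x1.
    Its velocity d x_t / d t = x1 - x0 is the regression target: a
    network v(x_t, t) is trained with an L2 loss toward it."""
    xt = (1.0 - t) * x0 + t * x1
    target = x1 - x0
    return xt, target
```

At t = 0 the sample is pure noise and at t = 1 pure data; generation then integrates the learned velocity field from t = 0 to t = 1.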
3D Texture Made Easy
1.2K views · 7 months ago
Introducing TextureDreamer! TextureDreamer transfers textures from a few images to arbitrary 3D shapes. Excited about democratizing 3D content creation! TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion Yu-Ying Yeh, Jia-Bin Huang, Changil Kim, Lei Xiao, Thu Nguyen-Phuoc, Numair Khan, Cheng Zhang, Manmohan Chandraker, Carl S Marshall, Zhao Dong, and Zhengqin Li IEEE...
How We Can Convert Any Videos to 3D
2.5K views · 8 months ago
Videos are windows to another world. But the videos today are *flat*, confined to the original viewpoints. We showcase a method for converting any 2D videos into 3D videos that allow free-view synthesis. Fast View Synthesis of Casual Videos Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen, Simon Niklaus, Jianming Zhang, Jia-Bin Huang, and Feng Liu European Conference on Computer Vision, 202...
How Do Computers See Motion? Lucas-Kanade Method Explained
2.3K views · 9 months ago
How can machines perceive the dynamic world around us? In this video, we discuss an influential Lucas-Kanade tracking method. The core algorithm and its variants are used in a wide variety of computer vision applications. Stay until the end to learn about the inspiring story behind this seminal paper! Reference: - Bruce D Lucas and Takeo Kanade, An iterative image registration technique with an...
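For readers who want the algebra, the heart of Lucas-Kanade is a tiny least-squares solve over a local window; a NumPy sketch of the classic formulation (my own illustration) is:

```python
import numpy as np

def lucas_kanade_flow(Ix, Iy, It):
    """Solve the brightness-constancy system Ix*u + Iy*v + It = 0 in the
    least-squares sense over one window, assuming every pixel in the
    window shares a single translational motion (u, v)."""
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)  # spatial gradients
    b = It.ravel()                                  # temporal gradient
    flow, *_ = np.linalg.lstsq(A, -b, rcond=None)
    return flow  # (u, v)
```

The solve only works when A^T A is well conditioned, i.e. the window has gradients in two directions, which is exactly what the Shi-Tomasi detector below selects for.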
What are Good Features to Track? Shi-Tomasi Corner Detector Explained
1.3K views · 9 months ago
Identifying reliable features for tracking is an important step for many computer vision systems, including video stabilization, object tracking, and simultaneous localization and mapping (SLAM). This video covers the basics of corner detection algorithms. References: Jianbo Shi and Carlo Tomasi, Good Features to Track, CVPR 1994 C Harris, M Stephens, A combined corner and edge detector, Alvey ...
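The corner score itself fits in a few lines; this NumPy sketch of the Shi-Tomasi min-eigenvalue response for a single window is my own illustration:

```python
import numpy as np

def shi_tomasi_response(Ix, Iy):
    """Smaller eigenvalue of the 2x2 structure matrix M built from the
    window's image gradients. A large minimum eigenvalue means strong
    gradients in two directions, i.e. a corner that is reliable to track."""
    M = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    return np.linalg.eigvalsh(M)[0]  # eigenvalues returned in ascending order
```

A window with gradients in only one direction (an edge) scores zero; the Harris detector instead scores det(M) - k * trace(M)^2 to avoid the eigen-decomposition.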
How does OpenAI's Sora work?
52K views · 10 months ago
OpenAI presents Sora, a text-to-video model for generating high-quality video from text prompts. In this video, we give a high-level overview of how Sora works.
Compositional Text-to-Image Generation Made Easy
1.4K views · 11 months ago
Fast View Synthesis of Casual Videos Yao-Chih Lee, Zhoutong Zhang, Kevin Blackburn-Matzen, Simon Niklaus, Jianming Zhang, Jia-Bin Huang, and Feng Liu arXiv 2023 📝 Paper: arxiv.org/abs/2312.02135 🌐 Website: casual-fvs.github.io/ Abstract: Novel view synthesis from an in-the-wild video is difficult due to challenges like scene dynamics and lack of parallax. While existing methods have shown promi...
How I Understand Diffusion Models
41K views · 1 year ago
Diffusion models are powerful generative models that enable many successful applications like image, video, and 3D generation from texts. In this tutorial, I share my understanding of the diffusion model basics, including training, guidance, resolution, and speed. Below are some other great resources to learn more about diffusion models. Slides Here are the slides used in this video Training: b...
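The training step mentioned above can be sketched in a few lines; this NumPy version of the standard DDPM noise-prediction setup is my own minimal illustration:

```python
import numpy as np

def ddpm_training_pair(x0, alpha_bar_t, rng):
    """Forward-process sample x_t = sqrt(a) x0 + sqrt(1 - a) eps, with
    noise level a = alpha_bar_t. The denoiser eps_hat(x_t, t) is trained
    with an L2 loss to recover eps from the noisy x_t."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return xt, eps
```

At alpha_bar_t = 1 no noise is added; as alpha_bar_t approaches 0, x_t becomes pure Gaussian noise, which is where sampling starts.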
3D Human Digitization from a Single Image!
35K views · 1 year ago
3D Human Digitization from a Single Image!
Expressive Text-to-Image with Rich Text
6K views · 1 year ago
Expressive Text-to-Image with Rich Text
Immersive 3D Rendering from Casual Videos
18K views · 1 year ago
Immersive 3D Rendering from Casual Videos
Step into the World from a Single Image
2.4K views · 1 year ago
Step into the World from a Single Image
Miss Korea 2013 Contestants Face Morphing
170K views · 11 years ago
Miss Korea 2013 Contestants Face Morphing
When can we expect to test this awesome concept?
If this method matures, couldn't we easily scan the entire world? Everyone could just snap a few photos with their phone to recreate the real world, and people could even collaborate, sharing photos to reconstruct scenes together.
is this planning to be opensource?
Why everyone uses RoPE instead of AliBi?
Graduate student descent 😂😂 I will now never forget the concepts in this video ❤
Haha! That’s great!
Great video 👏
Thank you!
Amazing video, thank you
Thank you for your kind words!
I really, really love this video! Thanks for making those complex papers interesting and understandable for me. Now, I have the courage and enthusiasm to read the original papers. Haha!
That's great to hear! Glad you found it helpful.
Fantastic work and a very good explainer video. I hope this makes it into some products soon.
Thanks a lot!
Uncle Roger??? haha sorry, good explanation, cheers
Thank you! Cheers!
Wow! What a beautiful explanation!!!!
Thank you so much!
I don't know if there's anybody like me: the video is easy to understand, but I need to watch it more. I've watched it 10 times so far and still don't fully get the formulas. Thank you so much!
Yup, it’s not easy to understand these math equations. But I hope the video provides some intuition on why and how it works.
amazing!
Thank you! Cheers!
Really amazing video! May I ask what tools you use to create this video?
Thanks! The animation comes from PowerPoints. I edit the video with Adobe premiere pro.
This is the best video on diffusion models, I can't even imagine how you were able to distill this much info into 17 minutes
Glad it was helpful! Thanks a lot!
Very entertaining and valuable. Excellent editing and explainer! subbed
Glad you enjoyed it! Thanks for subscribing!
Would it be better to use LiDAR to accurately measure the distance of various objects in the environment?
Probably not. LiDAR estimates distance by detecting reflected light, so it has difficulty with shiny objects (specular surfaces reflect light at a specific angle). Unless the sensor is perfectly aligned with the reflected path, it won't "see" the object. LiDAR would therefore help in estimating the geometry of diffuse surfaces, but not shiny ones.
@jbhuang0604 Good point, mate. I've worked in this field: LiDAR can use polarization to identify the surface type, dual-wavelength emitters can help with transparent or translucent objects, and multi-echo processing can distinguish a surface from its reflectivity. Combined with RGB data, one can build a proper 3D reconstruction of an object and its surrounding environment, much like Apple does, though not quite as accurately. Back in game development I used these techniques to create 3D versions of objects, but at the time it was limited to shape, size, and albedo; it's far more developed now.
Wow! That’s cool! Yup, LiDAR usually has difficulty in working with reflective and transparent surfaces. It’s interesting that you could handle these challenging cases.
Among us
That’s right!
Are you Nvidia engineer?
Nope!
@jbhuang0604 just that you are presenting the algorithm as "we".
Ah! Got it. This is a collaborative work between Google and University of Maryland College Park.
It's a very good explanation and video in general as well!
Glad it was helpful!
@@jbhuang0604 Love all of the content on this channel! Thank you so much for doing this. I am spreading info about this great channel everywhere :))
nice
Glad you liked it!
great video!🎉🎉❤❤
Glad you liked the video!
that is a smart way to fix some issues and i need it!
Very cool!
Really well-made video! Love how you put all these concepts in the same framework and explain all the math intuitively!
Thanks for the kind words, Xuan!
Thank you for such a great videos with all the steps and equations explained so clearly! I was looking for the referenced papers to dive deeper and found those in the video description! I've learned so much through the video! Your students are so lucky to have such a dedicated instructor!
Thanks so much for your kind words!
Do you work at a 7/11 in Kilsyth
What’s that?
@ ice coffee
why you want to remove the black kids only Jia smh this is one of the assumptions that you train an ai on , that inherits bias (racist tendencies) without explicitly being drafted to grasp the causality of its motive.
I am a bit confused. The method is for general objects, not specific to people (or race).
@@jbhuang0604 yes as far as you know , hard to estimate what associations the network carries forward as you extrapolate capabilities over time , that may bake in bias in ways you had not considered
compounding interest of color schema as best practice could result in causal tendencies across varied modality as capacities increase toward more general purpose networks , for example -- were the masking purely done in black and white for autonomous vehicle , with black being labeled as a phantom person , an edge case scenario (1 in 1million) might result in a zoox or waymo determining its ok to smack me on the road
I am a huge fan of this work in any case , that's why I am here
Got it! Thanks for the reminder! Yes, we definitely need to work on reducing these unintentional biases.
Only 30 seconds in, I could already tell that this video was going to be quality and immediately subscribed. Awesome work!!!!
Really appreciate your comment! Glad that you like the video!
Great content, just a small tip, avoid extraneous load, some sound seems unnecessary.
Thanks for the feedback!
Great video
Thank you!
Such a high-quality video on such a new technique. A cyber bodhisattva saving clueless undergrads 😭
Thanks a lot!
Every video on that channel is a banger
Glad you like it! Thanks!
Oh my god, I saw this paper before! It was really cool! Good luck :D The video presentation quality here is excellent. (I'm the first author of Diffusion Illusions)
Thanks a lot for your comment. Your work is a big inspiration for us!
Wonderful tutorial! Thank you!
Thanks for your kind words!
This video explains the maths of Flow Matching very well!! Esp. you mentioned that Flow Matching is a generalised version of diffusion model, it suddenly makes all sense. Looking forward to your next video!!
Thank you! You made my day!
This was a very good video. also enjoyed the kanade clip at the end. thank you.
Thanks, I'm glad you enjoyed it!
Thank you for this video! Clearly explained!
Thank you!!
Thank you for this video! Amazing explanation!
You’re welcome! Happy that you liked it.
@11:18 not HMB but HBM
Good catch! Clearly I was just trying to make sure people are paying attention. :-p
Is there any plan to open-source this? I was hoping to use inpainting to fix some videos. This thing is incredible.
We are working on it! Stay tuned!
This is the tool I've been waiting for. In the future we'll be able to generate layers and add them in, we'll be able to generate 3D objects and add them to scenes, and also change the position of the camera. All these technologies exist today in isolation.
Exciting times ahead!!
Hello, when will we be able to use this great technology?
We are still working on a version that we can release.
Is this open-source?
We are working on it. Stay tuned!
I need this NOW!
We are working on it!
Combining this video with Umar Jamil implementation is useful
That’s COOL!
Can't wait till this goes live!
We are excited as well!
Considering we saw the demo of the toy airplane in this video... Practical flightcrafts/starships/battleships for practical effects is way easier to do without CGI... with of course some AI involved. :) I honestly would like to try this new tool. Any chance for a code release?
Thanks for your comment. We cannot release the version building upon Google internal pretrained model so we have to rebuild it using open source model. We are still working on it and hopefully can release that version.
@@jbhuang0604 Thats awesome, looking forward to try it! Take your time though; no need to rush. :) I can definitely wait.
Great! We are working on it!
This is super promising and exciting, congrats on making this possible!
Thank you so much!
this kind of result is pretty amazing
Yes, we are very excited about what new opportunities this tool can unlock!
pretty cool!
Thanks for the kind words!