I’m curious to know the rationale behind combining loss functions. Do you know in advance that loss function X will handle task x and loss function Y will handle task y, and then combine both losses? Or is there deeper literature on which losses do what, where you combine them and hope they work as expected?
I think the combination of Dice & Focal loss is one of those empirically "tried and tested" combos that tend to do well in segmentation tasks. Many segmentation papers mention it, including Meta's Segment Anything paper from last year. Intuitively, focal loss/BCE is good for pixel-wise or low-level classification (focal loss also addresses class imbalance), while Dice loss measures a more area-wise or higher-level notion of segmentation quality. At the end of the day, people try different loss functions/hyperparameters to see what works best for their dataset/model, and for segmentation, Dice+Focal is a traditional place to start experimenting.
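For concreteness, here's a minimal PyTorch sketch of what a combined Dice + Focal loss could look like for binary segmentation. The function name, default hyperparameters, and the simple additive weighting are illustrative assumptions, not the exact recipe from the video or the SAM paper:

import torch
import torch.nn.functional as F

def dice_focal_loss(logits, targets, alpha=0.25, gamma=2.0, smooth=1.0, focal_weight=1.0):
    # Illustrative sketch, not the video's actual code.
    # logits:  raw model outputs, shape (B, 1, H, W)
    # targets: binary ground-truth masks, same shape, float values in {0, 1}
    probs = torch.sigmoid(logits)

    # Dice term: area-level overlap between prediction and ground truth.
    intersection = (probs * targets).sum(dim=(1, 2, 3))
    union = probs.sum(dim=(1, 2, 3)) + targets.sum(dim=(1, 2, 3))
    dice = 1.0 - (2.0 * intersection + smooth) / (union + smooth)

    # Focal term: pixel-level BCE, down-weighted on easy pixels via gamma.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = probs * targets + (1 - probs) * (1 - targets)  # prob of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    focal = (alpha_t * (1 - p_t) ** gamma * bce).mean(dim=(1, 2, 3))

    return (dice + focal_weight * focal).mean()

The relative weight between the two terms (focal_weight here) is itself a hyperparameter that people tune per dataset.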
Hello, could you please upload tutorials to provide direction on referring image segmentation (RIS) research?
Great suggestion. Referring Image Segmentation is pretty wild. The next video will be a follow-up to this UNET project, where I'll be implementing YOLO from scratch, but I'll add RIS to my queue. Meanwhile, I'd also suggest checking out my Multimodal Neural Nets video, which has a ton of info on the evolution of text+image models: Multimodal AI from First Principles - Neural Nets that can see, hear, AND write.
ruclips.net/video/-llkMpNH160/видео.html
Can we get access to the code for this project? It's really interesting, as I'm a big football fan.
Ping me on Twitter - @neural_avb
What are its advantages and disadvantages over YOLO?
The short answer is that YOLO is generally used for object localization + detection with anchor boxes (or bounding boxes), while UNET operates at the pixel level, as shown in the video, and is used for pixel-perfect object segmentation. So they kinda fulfill different purposes.
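The difference shows up directly in the output shapes. A tiny illustration (all shapes and counts below are made-up examples, not either model's real head layout):

import torch

batch, num_classes, H, W = 2, 3, 256, 256

# UNET: one score per class per pixel -> a dense segmentation map.
unet_logits = torch.randn(batch, num_classes, H, W)
seg_mask = unet_logits.argmax(dim=1)  # (2, 256, 256): a class label for every pixel

# YOLO-style detector: a small set of box predictions,
# each roughly (x, y, w, h, objectness, class scores...).
num_boxes = 100
yolo_preds = torch.randn(batch, num_boxes, 5 + num_classes)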
@avb_fj What about the segmentation YOLO models, not the detection ones? For example, YOLOv8-seg.