Coding Image Segmentation with UNet from first principles | Football Computer Vision

  • Published: 19 Jan 2025

Comments • 11

  • @thierrym6165 8 months ago +1

    I'm curious about the rationale behind combining loss functions. Do you know up front that loss function X handles task x and loss function Y handles task y, and then combine the two?
    Or is there some deeper literature for deciding which losses do what, and you just hope they work as expected?

    • @avb_fj 8 months ago

      I think the combination of Dice and Focal loss is one of those empirically "tried and tested" combos that tend to do well in segmentation tasks. Many segmentation papers mention this, including Meta's Segment Anything paper from last year. Intuitively, focal loss/BCE is good for pixel-wise, low-level classification (focal loss also addresses class imbalance), while Dice loss works at a more area-wise, higher level of segmentation. At the end of the day, I think people try different loss functions and hyperparameters to see what works best for their dataset and model, and for segmentation Dice+Focal loss is a traditional place to start experimenting.
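
      To make the intuition concrete, here is a minimal pure-Python sketch of a combined Dice + Focal loss on a flattened binary mask. It is not the code from the video: the function names, the `alpha` and `gamma` defaults, and the equal loss weights are all assumptions chosen for illustration, and a real training loop would use a tensor library instead of lists.

```python
import math

def dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss over a flattened mask: 1 - 2*|P∩T| / (|P| + |T|)."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    total = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)

def focal_loss(probs, targets, alpha=0.25, gamma=2.0, eps=1e-7):
    """Mean binary focal loss; gamma down-weights pixels that are already easy."""
    losses = []
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
        p_t = p if t == 1 else 1.0 - p            # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha
        losses.append(-alpha_t * (1.0 - p_t) ** gamma * math.log(p_t))
    return sum(losses) / len(losses)

def combined_loss(probs, targets, focal_weight=1.0, dice_weight=1.0):
    """Weighted sum; the weights are hypothetical and meant to be tuned."""
    return (focal_weight * focal_loss(probs, targets)
            + dice_weight * dice_loss(probs, targets))

# Predicted foreground probabilities vs. ground-truth labels for four pixels.
targets = [1, 1, 0, 0]
print(combined_loss([0.9, 0.8, 0.2, 0.1], targets))  # small: mostly correct
print(combined_loss([0.1, 0.2, 0.8, 0.9], targets))  # large: mostly wrong
```

      The focal term scores every pixel on its own, while the Dice term only looks at the overlap between the predicted and true regions, which is the pixel-wise versus area-wise split described above.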

  • @bushrarafiachowdhury1317 4 months ago

    Hello, could you please upload tutorials to provide direction on referring image segmentation (RIS) research?

    • @avb_fj 4 months ago +1

      Great suggestion. Referring Image Segmentation is pretty wild. The next video will be a follow-up to this UNet project where I'll be implementing YOLO from scratch, but I'll add RIS to my queue. Meanwhile, I'd also suggest checking out my Multimodal Neural Nets video, which has a ton of info on the evolution of text+image models: "Multimodal AI from First Principles - Neural Nets that can see, hear, AND write".
      ruclips.net/video/-llkMpNH160/видео.html

  • @gajendrasinghdhaked 8 months ago +1

    Can we get access to the code for this project? It's really interesting, as I am a big football fan.

    • @avb_fj 7 months ago

      Ping me on Twitter - @neural_avb

  • @revimfadli4666 9 months ago

    What are its advantages and disadvantages over YOLO?

    • @avb_fj 9 months ago +12

      The short answer: YOLO is generally used for object localization and detection with anchor boxes (or bounding boxes), while UNet operates at the pixel level, as shown in the video, and is used for pixel-perfect object segmentation. So they kinda fulfill different purposes.
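
      A small sketch of the difference in output types, assuming a binary mask stored as nested Python lists (the `mask_to_box` helper is hypothetical and not from the video): a pixel mask can always be reduced to a bounding box, but a box cannot be turned back into the mask.

```python
def mask_to_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the foreground pixels, or None."""
    coords = [(x, y)
              for y, row in enumerate(mask)
              for x, value in enumerate(row)
              if value]
    if not coords:
        return None  # empty mask: nothing to enclose
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))

# Segmentation-style output: one label per pixel.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]

# Detection-style output: just the enclosing rectangle.
print(mask_to_box(mask))  # (1, 1, 2, 2)
```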

    • @persevere1052 9 months ago

      @avb_fj What about the YOLO segmentation models, not the detection ones? For example, yolov8-seg.