C4W3L08 Anchor Boxes

  • Published: 20 Aug 2024

Comments • 45

  • @MG5350
    @MG5350 4 years ago +41

    I feel that there are a lot of intricacies that are not explained. Great lecture hands down, but I'm starting to feel that I need concrete examples or implementation to understand many of these subtleties.

    • @dhidhi1000
      @dhidhi1000 4 years ago +7

      This is a video series; there are about 9 other videos explaining it.

    • @davidvultur8704
      @davidvultur8704 2 years ago

      ruclips.net/p/PL_IHmaMAvkVxdDOBRg2CbcJBq9SY7ZUvs This seems to be the one

    • @ykpoff
      @ykpoff 4 months ago

      @@dhidhi1000 so?

  • @keweml3544
    @keweml3544 3 years ago +2

    I think the anchor box algorithm is for problems that lie somewhere between image classification and pixel classification. Recognizing an object that is either the entire image or a single pixel is really tricky.

  • @nischalkhadgi8128
    @nischalkhadgi8128 5 years ago +2

    Great one. Was really helpful. Hope you add a demonstration as well.

  • @sanjivgautam9063
    @sanjivgautam9063 4 years ago +1

    The anchor box concept is not clear, though hands down this is the best explanation to date. I want to share a few ideas here. During training, the bx and by in the label are scaled to lie between 0 and 1 relative to the grid cell, while bh and bw can be greater than 1. For each of those "normalized" bounding boxes, we determine which of the predefined anchor boxes is the most suitable. How do we define "suitability"? By the IoU. So if we choose 5 anchor boxes, we check our normalized bounding box against all 5 anchor boxes, determine which one has the highest IoU, and choose that anchor box, as Andrew Ng explained above. The loss function is a bit of a headache to explain here in a comment, but it would be great if Andrew had explained it himself. Never mind though, we are getting videos from the AI god himself!
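
The matching step described in the comment above can be sketched roughly as follows. This is a minimal illustration, not the lecture's code: the function names and the anchor shapes are made up, and it assumes the box and the anchor share the same center so that only width and height matter.

```python
def shape_iou(box_wh, anchor_wh):
    """IoU of two boxes that share the same center, so only width/height matter."""
    bw, bh = box_wh
    aw, ah = anchor_wh
    inter = min(bw, aw) * min(bh, ah)
    union = bw * bh + aw * ah - inter
    return inter / union


def best_anchor(box_wh, anchors):
    """Return the index of the anchor whose shape has the highest IoU with the box."""
    ious = [shape_iou(box_wh, a) for a in anchors]
    return max(range(len(anchors)), key=lambda i: ious[i])


# Hypothetical anchors as (width, height): one tall (pedestrian-like), one wide (car-like).
anchors = [(0.5, 1.5), (1.5, 0.5)]
tall_box = (0.4, 1.2)
wide_box = (1.8, 0.6)
print(best_anchor(tall_box, anchors))  # index of the tall anchor
print(best_anchor(wide_box, anchors))  # index of the wide anchor
```

With these made-up shapes, the tall box is assigned to the first anchor and the wide box to the second.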

  • @adityarajora7219
    @adityarajora7219 4 years ago +5

    How can it predict a bounding box larger than a grid cell? Please explain, if anyone knows YOLO.

    • @sanjivgautam9063
      @sanjivgautam9063 4 years ago +3

      Here is the thing. bx and by fall between 0 and 1. However, bw and bh (width and height) can have values greater than 1, so any object that extends beyond the grid cell is still covered by bh and bw. Did you get the point? In one of his videos, he explains how bx and by fall between 0 and 1 while bh and bw can go higher than 1.
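
To make the reply above concrete, here is a small sketch of one way to express a box relative to its grid cell. The function name `encode_box` and the 3x3 grid are assumptions for illustration; the convention (center offsets within the cell, sizes in units of one cell) follows what the reply describes.

```python
def encode_box(cx, cy, w, h, grid_size):
    """Encode an image-relative box (all inputs in [0, 1]) relative to its grid cell.

    Returns (row, col, bx, by, bw, bh): bx, by are the center offsets inside the
    cell (between 0 and 1); bw, bh are in units of one cell, so they may exceed 1.
    """
    # The cell that contains the box center is responsible for the object.
    col = min(int(cx * grid_size), grid_size - 1)
    row = min(int(cy * grid_size), grid_size - 1)
    bx = cx * grid_size - col
    by = cy * grid_size - row
    bw = w * grid_size
    bh = h * grid_size
    return row, col, bx, by, bw, bh


# A box centered in the image that covers 60% of its width and 40% of its height.
row, col, bx, by, bw, bh = encode_box(0.5, 0.5, 0.6, 0.4, grid_size=3)
print(row, col, bx, by, bw, bh)
```

Here the center lands in the middle cell with offsets of 0.5, while the width and height both come out larger than one cell.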

    • @adityarajora7219
      @adityarajora7219 4 years ago +1

      @@sanjivgautam9063 Thanks, but I still didn't get the intuition. Could you give a reference to that video?

    • @sanjivgautam9063
      @sanjivgautam9063 4 years ago +4

      @@adityarajora7219 ruclips.net/video/gKreZOUi-O0/видео.html. I think you are following a playlist that doesn't have one video in it. The video in this link explains the bounding box rules.

  • @MARTIN-101
    @MARTIN-101 4 months ago

    Is there a way to detect objects without anchors,
    like attaching 2 MLP heads to a convolutional base,
    one head for classification and another for regression?
    Has this approach been implemented in research?

  • @nithinmesingerme6976
    @nithinmesingerme6976 2 years ago

    Since the sizes of the anchor boxes are fixed, how does this work for the same kind of object when one is very close and another is very far away?

  • @TheKovosh
    @TheKovosh 4 years ago +4

    One video is missing; that's why I have trouble understanding the rest.

    • @rohitborra2507
      @rohitborra2507 4 years ago

      If you find it, please post the link, bro.

    • @aymannaeem22
      @aymannaeem22 4 years ago +4

      ruclips.net/video/gKreZOUi-O0/видео.html&list=PL_IHmaMAvkVxdDOBRg2CbcJBq9SY7ZUvs&t=656

  • @sandipansarkar9211
    @sandipansarkar9211 3 years ago

    nice explanation

  • @heejuneAhn
    @heejuneAhn 3 years ago +1

    Thank you, Prof. Ng, I learned a lot. One question or request for clarification: should the anchor box shapes be taken into account in the network's receptive field? If so, do we need to use non-square convolutional filters? Thanks.

  • @nidhihada1122
    @nidhihada1122 5 years ago +2

    One doubt. If we are specifying bx, by, bh, bw, then we can specify any anchor box. So for an image where two objects are present in the same grid cell and share the same anchor box shape, even this could be handled by using their respective bx, by, bh, bw in the output, since both anchor boxes have their own bx, by, bh, bw. I could not understand why Andrew says it cannot be solved.

    • @MARTIN-101
      @MARTIN-101 4 months ago

      I figured it out yesterday, but I forgot it again 😂😂

  • @ganonlight
    @ganonlight 3 years ago +1

    These anchor boxes seem more like a workaround than an actual solution tbh

    • @vishaljain4915
      @vishaljain4915 3 years ago

      Agreed, do you have a better idea?

    • @ganonlight
      @ganonlight 3 years ago +1

      @@vishaljain4915 No not really

    • @vishaljain4915
      @vishaljain4915 3 years ago

      @@ganonlight 😂😂😂 me neither aha

    • @ganonlight
      @ganonlight 3 years ago

      @@vishaljain4915 😅

    • @akashkewar
      @akashkewar 3 years ago +4

      Anchor boxes are one of many approaches you can use for object detection. Algorithms like "CornerNet" don't use anchor boxes to locate objects but keypoints. Some algorithms, like Pose2Seg, also use pose estimation and/or semantic segmentation to give you pretty accurate bounding box predictions. Just Google "anchorless object detection". Also, tbh, most of the stuff you see in machine learning is a "workaround", but it's magic to see it work so well. There is no silver bullet that solves all problems; machine learning is all about choosing the right tools and being creative with the problem at hand.

  • @marcoburkhardt6496
    @marcoburkhardt6496 3 years ago

    just good. thanks a lot :)

  • @koeficientas
    @koeficientas 5 years ago

    If I have only 2 classes, can I assign a hardwired anchor to each class per grid cell and drop c1, c2, c3? So the output vector would be y = [pc1 bx by bh bw, pc2 bx by bh bw], where pc1 is the probability of a pedestrian and pc2 is the probability of a car.

  • @anujk.9893
    @anujk.9893 4 years ago +1

    If we define the shape and size of the anchor boxes, won't we need only 2 outputs to identify a box? bx and by would be enough; we should not need bh and bw.
    Please explain if someone knows.

    • @tomvandewiele7031
      @tomvandewiele7031 4 years ago +1

      We predict an arbitrary height and width, so we still have to output bh and bw. With anchor boxes, the IoU is used to pick the best-matching anchor box shape for each labeled object. The target shape (together with bx, by and the class) is only set as a target for the best-matching anchor box.
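
One way to picture the reply above is to build the training target for a single grid cell. The sketch below assumes the slot layout [pc, bx, by, bh, bw, c1..cK] per anchor discussed in this thread; the helper names, anchor shapes, and object values are made up for illustration.

```python
def shape_iou(box_wh, anchor_wh):
    """IoU of two boxes that share the same center, so only width/height matter."""
    inter = min(box_wh[0], anchor_wh[0]) * min(box_wh[1], anchor_wh[1])
    union = box_wh[0] * box_wh[1] + anchor_wh[0] * anchor_wh[1] - inter
    return inter / union


def cell_target(objects, anchors, num_classes):
    """Build one grid cell's target: a [pc, bx, by, bh, bw, c1..cK] slot per anchor."""
    slot_len = 5 + num_classes
    # Slots with pc = 0 mean "no object assigned to this anchor".
    target = [[0.0] * slot_len for _ in anchors]
    for bx, by, bh, bw, class_id in objects:
        ious = [shape_iou((bw, bh), a) for a in anchors]
        k = ious.index(max(ious))  # best-matching anchor for this object
        one_hot = [1.0 if c == class_id else 0.0 for c in range(num_classes)]
        # Note: a second object matching the same anchor would overwrite the first.
        target[k] = [1.0, bx, by, bh, bw] + one_hot
    return target


# Hypothetical anchors as (width, height): tall first, wide second.
anchors = [(0.5, 1.5), (1.5, 0.5)]
# Objects as (bx, by, bh, bw, class_id): a tall pedestrian (class 0) and a wide car (class 1).
objects = [(0.5, 0.6, 1.4, 0.4, 0), (0.5, 0.7, 0.5, 1.6, 1)]
target = cell_target(objects, anchors, num_classes=3)
for slot in target:
    print(slot)
```

Each object keeps its own bh and bw in the target; the anchor only decides which slot the object is written into.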

  • @TheKovosh
    @TheKovosh 4 years ago +1

    If I have a fixed-size anchor box, then what is the point of bw and bh?

    • @thomasqiao916
      @thomasqiao916 3 years ago +1

      bw and bh define the anchor box

  • @rijulsingh9803
    @rijulsingh9803 3 years ago

    So the minimum bound on number of anchor boxes is the number of classes present? Also, is there a way to optimize the size of anchor boxes? I'm a little confused here. Everything else here is crystal clear, thank you so much for this tutorial!

    • @polimetakrylanmetylu2483
      @polimetakrylanmetylu2483 2 years ago +1

      If I understand it correctly, as for 1, no: you can specify any number of anchor boxes, and each one will output its own class predictions. You can also specify just one, or any arbitrarily low or high number of them; there is no relation between the number of classes and the number of anchor boxes.
      As for 2, your NN will not output the entire bounding box; instead it outputs a correction to an anchor box. The anchor boxes have to be defined when you create the model. What you can do is collect every bounding box from your dataset as a width-height pair, and either plot the pairs and look at them, or run a clustering algorithm to find good sizes.
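
The clustering idea in the reply above could look roughly like this. It is a plain k-means sketch on made-up width-height pairs with Euclidean distance and a deterministic initialization chosen for simplicity; the function name is hypothetical, and some detectors cluster with an IoU-based distance instead.

```python
def kmeans_wh(boxes, k, iters=20):
    """Plain k-means on (width, height) pairs; returns k centroids as candidate anchors."""
    # Deterministic initialization for this sketch: k boxes spread evenly through the list.
    step = len(boxes) // k
    centroids = [boxes[i * step] for i in range(k)]
    for _ in range(iters):
        # Assign every box to its nearest centroid (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            dists = [(w - cw) ** 2 + (h - ch) ** 2 for cw, ch in centroids]
            clusters[dists.index(min(dists))].append((w, h))
        # Move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = (
                    sum(w for w, _ in members) / len(members),
                    sum(h for _, h in members) / len(members),
                )
    return centroids


# Made-up (width, height) pairs: three tall boxes and three wide boxes.
boxes = [(0.4, 1.2), (0.5, 1.5), (0.6, 1.8), (1.8, 0.6), (1.5, 0.5), (1.2, 0.4)]
anchors = kmeans_wh(boxes, k=2)
print(anchors)
```

On this toy data the two centroids settle on one tall and one wide shape, which would then be used as the anchor boxes when the model is defined.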

  • @EranM
    @EranM 5 years ago +5

    0:25 right in the nuts

    • @HabibRK
      @HabibRK 5 years ago

      it's a she

    • @lovemormus
      @lovemormus 4 years ago

      @@HabibRK how do you know it's a she?

  • @ShubhamKumar-me7xy
    @ShubhamKumar-me7xy 2 years ago

    Mid point of pedestrian :xd

  • @dota2islife262
    @dota2islife262 5 years ago

    What is the name of the course on Coursera?

    • @maxbaugh9372
      @maxbaugh9372 3 years ago

      Deep Learning Specialization - Course 4: Convolutional Neural Networks

  • @guardrepresenter5099
    @guardrepresenter5099 5 years ago

    What is pc, and how does pc know whether it is 0 or 1 while c1, c2, c3 are still unknown?

    • @adityarajora7219
      @adityarajora7219 4 years ago +1

      pc indicates, as a probability, that there is "something", and c1, c2, c3 describe what this "something" actually is.
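
As a rough illustration of the reply above, the sketch below reads one [pc, bx, by, bh, bw, c1, c2, c3] slot: pc is checked first, and the class entries are only consulted when pc says an object is present. The function name, the class names, and the 0.5 threshold are assumptions for this example.

```python
def decode_slot(slot, class_names, threshold=0.5):
    """Interpret one [pc, bx, by, bh, bw, c1..cK] slot; returns None when pc says 'no object'."""
    pc = slot[0]
    if pc < threshold:
        return None  # the remaining values are ignored in this case
    class_scores = slot[5:]
    best = class_scores.index(max(class_scores))
    return class_names[best], tuple(slot[1:5])


# Illustrative class order for c1, c2, c3.
class_names = ["pedestrian", "car", "motorcycle"]
with_object = [1.0, 0.5, 0.7, 0.5, 1.6, 0.0, 1.0, 0.0]
no_object = [0.0] * 8
print(decode_slot(with_object, class_names))
print(decode_slot(no_object, class_names))
```

So pc answers "is there anything here?", and c1, c2, c3 only matter once that answer is yes.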

  • @sandipansarkar9211
    @sandipansarkar9211 3 years ago

    nice explanation