Focal Loss for Dense Object Detection

  • Published: 9 Sep 2024

Comments • 31

  • @autripat
    @autripat 6 years ago +38

    Lovely presentation; states the problem clearly (class imbalance for dense boxes) and the solution just as clearly (modulating the cross-entropy loss towards the hard examples). Brilliant solution too!

    • @shubhamprasad6910
      @shubhamprasad6910 3 years ago

      Just want to clarify: hard examples here mean FPs and FNs during training.

  • @XX-vu5jo
    @XX-vu5jo 4 years ago +3

    I wish I could work with these people someday.

  • @lexiegao4085
    @lexiegao4085 2 years ago +4

    He is cuteeee!!!!!

  • @larryliu2298
    @larryliu2298 6 years ago +1

    great work!

  • @user-fk1wo2ys3b
    @user-fk1wo2ys3b 3 years ago

    Great report!

  • @manan4436
    @manan4436 6 years ago +1

    Wow, simple analysis leads to the best performance...
    Think different.

  • @punithavalli824
    @punithavalli824 3 years ago +2

    It was really great work. I am very curious about the αt term and the α balance factor. Can you please help me get some clarity about α and αt? It would be a great help for my studies.

    • @punithavalli824
      @punithavalli824 3 years ago +1

      Hoping for your reply.

    • @luansouzasilva31
      @luansouzasilva31 3 months ago

      It is an idea from balanced cross-entropy; they just carried it over. Datasets with 2 or more classes usually have class imbalance. This is a problem because networks tend to focus on the majority classes and learn the minority ones poorly. So the idea of alpha is to weight the loss so that majority classes have less impact than minority classes. Alpha can be thought of as the "inverse frequency" of the class distribution in the dataset.
      Example: if you have 100 dogs (class 0) and 900 cats (class 1), the distribution is 10% for dogs and 90% for cats. So the inverse frequency would be 1 - 0.1 = 0.9 for dogs, and 1 - 0.9 = 0.1 for cats. It means that alpha_dogs = 0.9 and alpha_cats = 0.1.
      In binary classification, alpha is thought of as the "weight for the positive class", so the weight for the negative class is 1 - alpha. For the above problem alpha = alpha_cats, since cats are the positive class. For multiclass classification, however, alpha is a vector with length equal to the number of classes.
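
      A minimal PyTorch sketch of this alpha-balanced cross-entropy (the name balanced_bce and the tensors p and y are my own illustration, not from the paper; alpha = 0.1 matches the cats-vs-dogs example above, with cats as the positive class):

      ```python
      import torch

      def balanced_bce(p, y, alpha=0.1):
          """Alpha-balanced binary cross-entropy: positives are weighted
          by alpha, negatives by (1 - alpha)."""
          w = alpha * y + (1 - alpha) * (1 - y)                  # per-example weight
          ce = -(y * torch.log(p) + (1 - y) * torch.log(1 - p))  # plain BCE
          return (w * ce).mean()
      ```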

  • @lastk7235
    @lastk7235 3 months ago

    TY is cool

  • @andrijanamarjanovic2212
    @andrijanamarjanovic2212 4 years ago

    Bravo!

  • @talha_anwar
    @talha_anwar 4 years ago

    Is he talking about binary cross-entropy?

  • @donghunpark379
    @donghunpark379 5 years ago

    5:35 For what reason did he point to 2.3 (left) and 0.1 (right)? Is that point meaningful, or just an example? If it is an example, how can he say a hard example incurs a 40x bigger loss than an easy example as a general rule? It seems strange.

    • @haileyliu9602
      @haileyliu9602 4 years ago +2

      He is trying to say that a hard example only incurs a loss about 20 times bigger than an easy example. So in the setting of a dense detector, where hard examples : easy examples is 1 : 1000, the loss of hard examples : the loss of easy examples is 2.3 : 100. This means the loss is overwhelmed by the easy examples.
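
      A quick sanity check of that arithmetic (a sketch assuming the standard cross-entropy CE(pt) = -log(pt), which matches the 2.3 and 0.1 values from the talk):

      ```python
      import math

      # CE(pt) = -log(pt), where pt is the probability of the true class
      hard = -math.log(0.1)   # misclassified example, pt = 0.1  -> ~2.30
      easy = -math.log(0.9)   # well-classified example, pt = 0.9 -> ~0.105

      print(hard / easy)        # ~21.9: one hard example costs ~20x more,
      print(hard, 1000 * easy)  # but 1000 easy examples sum to ~105,
                                # swamping the single hard example's 2.3
      ```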

    • @amaniarman460
      @amaniarman460 3 years ago

      Thanks for this

  • @caiyu538
    @caiyu538 1 year ago

    What is the definition of easy and hard training examples?

    • @luansouzasilva31
      @luansouzasilva31 3 months ago +2

      Easy examples are those that the model quickly learns to predict correctly. In the context of object detection, think of big objects with a distinctive shape (little chance of confusing them with other objects), etc. Hard examples are those that are highly similar to other classes or too small in the image. Detecting an airplane and differentiating it from a bottle (easy) is simpler than detecting a dog and differentiating it from a wolf (hard).
      Given this, during training the model is expected to learn the easy examples quickly, meaning its probabilities pt will be close to 1 for positive examples. The factor (1 - pt) modulates the loss: as pt gets close to 1, (1 - pt) gets close to zero, reducing the loss value. Semantically this can be seen as "reducing the impact of easy examples". The exponent gamma just controls how strong this modulation is.
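
      Putting the two ideas together, a minimal PyTorch sketch of the focal loss FL(pt) = -αt (1 - pt)^γ log(pt) with the paper's defaults α = 0.25 and γ = 2 (the name focal_loss and the tensors p and y are my own illustration):

      ```python
      import torch

      def focal_loss(p, y, alpha=0.25, gamma=2.0):
          """Binary focal loss: FL(pt) = -alpha_t * (1 - pt)^gamma * log(pt)."""
          pt = torch.where(y == 1, p, 1 - p)       # probability of the true class
          alpha_t = torch.where(y == 1,            # class-balance weight
                                torch.full_like(p, alpha),
                                torch.full_like(p, 1 - alpha))
          return (-alpha_t * (1 - pt) ** gamma * torch.log(pt)).mean()
      ```

      With gamma = 0 this reduces to the alpha-balanced cross-entropy above; larger gamma down-weights easy examples (pt close to 1) more aggressively.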

    • @caiyu538
      @caiyu538 3 months ago

      @@luansouzasilva31 thanks

  • @bobsalita3417
    @bobsalita3417 6 years ago

    Paper: arxiv.org/abs/1708.02002

  • @MrAe0nblue
    @MrAe0nblue 5 years ago +3

    I feel like he had a small prank hidden in his talk. As a deep learning expert at Google Brain, the one word he should know better than any other would be the word "classify", yet he stumbles on it multiple times. But oddly enough, only that word. Clearly, those who work at Google Brain are some of the brightest, most talented people (I'm not trying to pick on him). That's why it must be a prank, right!? Or maybe he was just a bit nervous.

    • @talha_anwar
      @talha_anwar 4 years ago +4

      Maybe they talked in Chinese there.

    • @hangfanliu3370
      @hangfanliu3370 4 years ago

      Actually, he is Japanese 😊

    • @peterfireflylund
      @peterfireflylund 4 years ago

      Kl- before a vowel is really hard for Chinese speakers to say. They say kr- instead.

    • @gordonguochengqian3793
      @gordonguochengqian3793 4 years ago

      @@hangfanliu3370 he is from P.R Taiwan

  • @hmmhuh3263
    @hmmhuh3263 2 years ago +1

    really bad presenter but great idea

  • @k.z.982
    @k.z.982 3 years ago

    apparently he does not know what he's talking about

  • @k.z.982
    @k.z.982 3 years ago

    this guy is reading...

    • @slime67
      @slime67 2 years ago +2

      You won't believe it, but every TV presenter does exactly this; nobody wants to slip up while presenting their thoughts to a large audience.
      Modern cameras have a tricky mirror system that lets you read the text while looking straight at the camera, and apparently that's not the case here :)