Lecture 09 - The Linear Model II

  • Published: 15 Jan 2025
  • Science

Comments • 37

  • @jsanch855
    @jsanch855 12 years ago +15

    Maestro!!! This has to be the standard on RUclips. Very good lectures; other universities, learn from here.

  • @Nestorghh
    @Nestorghh 12 years ago +5

    My god... this professor rocks! He is very clear, excellent. He makes himself perfectly understood. His examples are simple and I understand them perfectly. He is a genius!

  • @samchan2535
    @samchan2535 7 years ago +8

    Professor Yaser always answers a question in a more fundamental way, instead of trying to solve a problem at face value.

  • @deleteme924
    @deleteme924 8 years ago +21

    54:38 "Let's say I am in 3 dimensions." Hmm, yes, you've been in 3 dimensions for a while now. :D

    • @ankittripathi4385
      @ankittripathi4385 8 years ago +6

      Not in 3 dimensions; we actually live in 3 spatial and 1 temporal :p

    • @tonyraubenheimer5468
      @tonyraubenheimer5468 5 years ago +4

      That sounds like a joke Yaser would actually make. I can imagine it in his voice.

  • @kavourakos
    @kavourakos 4 months ago

    best machine learning course ever.

  • @akankshachawla2280
    @akankshachawla2280 5 years ago +5

    Logistic: 24:33

  • @TheOlmesartan
    @TheOlmesartan 3 months ago

    Most amazing lecture I've ever seen

  • @SN-rs1lb
    @SN-rs1lb 11 years ago +4

    Excellent!! Thanks for uploading.

  • @mhchitsaz
    @mhchitsaz 12 years ago +2

    great lecture, extremely clear and understandable

  • @zuodongzhou3334
    @zuodongzhou3334 9 years ago +3

    He is the best!

  • @YashChavanYC
    @YashChavanYC 6 years ago +6

    "The weight of the person, not the weight of the input" Hahaha xD

  • @Jd-dw8rn
    @Jd-dw8rn 7 years ago +1

    58:50 he says that conjugate gradient takes into account the second-order terms - isn't that rather Newton's method? Conjugate gradient is an improvement, but through different means, i.e. taking into account earlier directions of descent, whereas Newton's method makes explicit use of the second-order Taylor expansion.
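
    For reference, a hedged sketch of the distinction being drawn here (the notation is mine, not taken from the slides): Newton's method minimizes an explicit second-order Taylor model of E_in, while conjugate gradient uses only first-order gradients but mixes each new gradient with the previous search direction.

    \begin{align*}
      &\text{Newton's method (explicit second-order model):} \\
      &\quad E_{\text{in}}(\mathbf{w}+\Delta\mathbf{w}) \approx E_{\text{in}}(\mathbf{w})
        + \nabla E_{\text{in}}(\mathbf{w})^{\top}\Delta\mathbf{w}
        + \tfrac{1}{2}\,\Delta\mathbf{w}^{\top}H(\mathbf{w})\,\Delta\mathbf{w}
        \;\Rightarrow\; \Delta\mathbf{w} = -H(\mathbf{w})^{-1}\nabla E_{\text{in}}(\mathbf{w}) \\
      &\text{Conjugate gradient (first-order only, reuses earlier directions):} \\
      &\quad \mathbf{d}_{k+1} = -\nabla E_{\text{in}}(\mathbf{w}_{k+1}) + \beta_k\,\mathbf{d}_k,
        \qquad \beta_k = \frac{\lVert\nabla E_{\text{in}}(\mathbf{w}_{k+1})\rVert^{2}}
                              {\lVert\nabla E_{\text{in}}(\mathbf{w}_{k})\rVert^{2}}
        \quad \text{(Fletcher-Reeves)}
    \end{align*}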

  • @mcm248
    @mcm248 12 years ago +3

    Should try the online course: Learning from Data

  • @brod515
    @brod515 4 years ago

    @20:50, after he explains the problem of looking at the data before choosing a model.
    My question is: what if someone has already done this and only gives you z = (x_1^2 + x_2^2 - 0.6), and that is the data you receive?
    How could you possibly know the data was originally something else? You assume the data is the one that was collected. How would you charge the correct VC dimension?
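
    A minimal sketch of the situation being asked about, assuming the lecture's circle example f(x) = sign(x1^2 + x2^2 - 0.6); the function names and data here are mine, for illustration only:

    import numpy as np

    def transform(X):
        """Map raw inputs (x1, x2) into the z-space (1, x1^2, x2^2)."""
        x1, x2 = X[:, 0], X[:, 1]
        return np.column_stack([np.ones(len(X)), x1**2, x2**2])

    # If you are only handed the transformed z-data, fitting a linear model in
    # z-space looks like an ordinary 3-parameter linear problem, so you would
    # naturally charge d_vc = 3.  The lecture's warning is that if the transform
    # was chosen after looking at the x-data, the fair charge is the VC
    # dimension of everything that was (implicitly) explored.
    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(1000, 2))
    y = np.sign(X[:, 0]**2 + X[:, 1]**2 - 0.6)
    Z = transform(X)
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)  # linear regression in z-space
    print("weights learned in z-space:", w)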

  • @PradiptoDas-SUNYBuffalo
    @PradiptoDas-SUNYBuffalo 12 years ago +1

    Superb!

  • @gyeonghokim
    @gyeonghokim 3 years ago

    thanks a lot for such a great lecture

  • @karannaik1555
    @karannaik1555 6 years ago

    1st question explanation awesome!

  • @ahmedomar636
    @ahmedomar636 4 years ago +1

    Looks nice in 2021

  • @leonig100
    @leonig100 8 years ago +1

    I can understand the use of the soft threshold, but saying this is the probability without explanation is not acceptable, since the Professor said that there are other soft-threshold functions that can be used. Depending on the function used, there must be a correction to get the real probability.

    • @donaldslowik5009
      @donaldslowik5009 6 years ago

      The real probability is what we try to best approximate/learn within the hypothesis set. The sigmoid hypothesis set lets you adjust the range over which the probability goes from 0 to 1 (the magnitude of w: a bigger w means more of a hard transition, closer to the perceptron), the direction of the transition (the direction of w, orthogonal to the hyperplane where the probability is 0.5), and its offset from the origin (w_0).

    • @abrarfaiyaz6503
      @abrarfaiyaz6503 3 years ago

      The machine will set the weights after enough iterations so that it matches the actual probability, regardless of what soft threshold we use. Different functions will have different weights for the same data, and hence give approximately the same probability.
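
      A minimal sketch of the shape argument above; theta and the weight names follow the lecture's logistic model, but the specific numbers are mine:

      import numpy as np

      def theta(s):
          """Logistic soft threshold: theta(s) = 1 / (1 + e^(-s))."""
          return 1.0 / (1.0 + np.exp(-s))

      # For a 1-D input x with signal s = w0 + w1*x, scaling w1 up sharpens the
      # transition toward a hard (perceptron-like) threshold, while w0 shifts
      # where the estimated probability crosses 0.5.
      x = np.linspace(-3, 3, 7)
      for w0, w1 in [(0.0, 1.0), (0.0, 5.0), (1.0, 5.0)]:
          print(f"w0={w0}, w1={w1}:", np.round(theta(w0 + w1 * x), 3))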

  • @moranreznik
    @moranreznik 7 years ago +1

    Can someone please explain to me the use of the Taylor series in gradient descent? I mean, I know you use it to approximate a function near a point, but what did he do here? How can you approximate delta E_in if it is built using two E_in's? What is the input here, and around what value does he try to approximate?

    • @tradingmogador9171
      @tradingmogador9171 6 years ago +1

      He uses a first-order Taylor series to derive gradient descent:
      f(x) = f(w_0) + (x - w_0)^T ∇f(w_0) + O(||x - w_0||²)
      Replace f by E_in and x by w_1, and you recover the gradient-descent step for E_in.
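
      Spelled out in the lecture's notation (a hedged reconstruction; w(0) is the current weight vector and w(1) = w(0) + η·v̂ the next one, so both E_in terms are expanded around the same point w(0)):

      \begin{align*}
        \Delta E_{\text{in}}
          &= E_{\text{in}}\big(\mathbf{w}(0) + \eta\hat{\mathbf{v}}\big) - E_{\text{in}}\big(\mathbf{w}(0)\big) \\
          &= \eta\,\nabla E_{\text{in}}\big(\mathbf{w}(0)\big)^{\top}\hat{\mathbf{v}} + O(\eta^{2})
          \;\ge\; -\eta\,\big\lVert\nabla E_{\text{in}}\big(\mathbf{w}(0)\big)\big\rVert,
      \end{align*}
      with equality exactly when
      \[
        \hat{\mathbf{v}} = -\,\frac{\nabla E_{\text{in}}\big(\mathbf{w}(0)\big)}
                                {\big\lVert\nabla E_{\text{in}}\big(\mathbf{w}(0)\big)\big\rVert},
      \]
      which is the steepest-descent direction.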

  • @chongsun7872
    @chongsun7872 6 years ago +1

    Lol really like how he explained the steepest descent.

  • @AndyLee-xq8wq
    @AndyLee-xq8wq 1 year ago

  • @gamer966
    @gamer966 8 years ago

    I don't get it.
    In logistic regression, E_in(w_0) is a scalar, but it seems to be treated as a vector.
    What am I missing?

    • @andysilv
      @andysilv 8 years ago

      Yes, it is a scalar. Where do we use it as a vector?

    • @gamer966
      @gamer966 8 years ago

      I thought you could only take the gradient of a vector?

    • @andysilv
      @andysilv 8 years ago +2

      The gradient is a vector, but you usually take it of a function that depends on several variables. Yet the function itself is a scalar at every point.

    • @gamer966
      @gamer966 8 years ago +1

      Oh, I just realized my error.
      The gradient is the partial derivative with respect to each weight, right?

    • @andysilv
      @andysilv 8 years ago

      Yep, exactly.
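
      To make the scalar-vs-vector point concrete, here is the logistic regression error and its gradient as I recall them from the lecture (check against the slides): E_in maps a weight vector to a single number, and its gradient is the vector of partial derivatives with respect to each weight.

      \begin{align*}
        E_{\text{in}}(\mathbf{w}) &= \frac{1}{N}\sum_{n=1}^{N}
          \ln\!\big(1 + e^{-y_n\mathbf{w}^{\top}\mathbf{x}_n}\big) \in \mathbb{R}
          \quad\text{(a scalar)}, \\
        \nabla E_{\text{in}}(\mathbf{w}) &=
          \Big(\tfrac{\partial E_{\text{in}}}{\partial w_0},\dots,
               \tfrac{\partial E_{\text{in}}}{\partial w_d}\Big)^{\top}
          = -\frac{1}{N}\sum_{n=1}^{N}
            \frac{y_n\mathbf{x}_n}{1 + e^{\,y_n\mathbf{w}^{\top}\mathbf{x}_n}}
          \in \mathbb{R}^{d+1}
          \quad\text{(a vector)}.
      \end{align*}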

  • @yakyuck
    @yakyuck 10 years ago

    What is the difference between E(in) and ||E(in)||? What do those lines stand for?

    • @foihrifughe
      @foihrifughe 10 years ago +4

      E(in) there is a vector (say, (3, 4)); ||E(in)|| is the norm (magnitude) of the vector, most commonly the L2 norm, which equals square_root(3^2 + 4^2) = square_root(25) = 5.
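
      A quick numerical check of that norm example (using numpy, which is my choice here, not something from the thread):

      import numpy as np

      v = np.array([3.0, 4.0])
      print(np.linalg.norm(v))      # L2 norm: sqrt(3^2 + 4^2) = 5.0
      print(np.sqrt(np.sum(v**2)))  # the same computation written out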

  • @RajatDangiYT
    @RajatDangiYT 7 years ago +2

    Tip: Watch all these lectures at 1.5x