An Introduction to LSTMs in Tensorflow

  • Published: 26 Dec 2024

Comments • 64

  • @emmanguyen3808
    @emmanguyen3808 7 years ago +24

    I would give 10 thumbs-up for this clear and coherent presentation!

    • @SantoshGupta-jn1wn
      @SantoshGupta-jn1wn 6 years ago

      I would give 10 thumbs-up for this clear and coherent comment!

  • @ashoknp
    @ashoknp 5 years ago +2

    Excellent lecture on RNNs and LSTMs, thanks Harini and Nicholas.

  • @muratcan__22
    @muratcan__22 5 years ago

    25:50 The sentence "Errors due to further back time steps have smaller and smaller gradients because they have to pass through this huge chain rule in order to be counted in as part of the loss." clarified for me how long-term dependencies disappear in an RNN. It means that the effect of a weight change (the gradient) attributable to the first words is, by the time it reaches the loss, just a product of tiny numbers. Those numbers have almost no say in the overall gradient, so the model won't be shaped to minimize the loss with respect to them (the first words). Thanks.
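
    To see the effect numerically: a minimal scalar sketch (not from the lecture; the recurrent weight U = 0.5 and the noise level are made-up numbers) showing how the chain-rule product that carries the error back to the first time step collapses toward zero:

      import numpy as np

      np.random.seed(0)
      U = 0.5                      # assumed scalar recurrent weight, purely illustrative
      s = 0.0
      grad_to_first_step = 1.0     # running product d s_t / d s_1
      for t in range(20):
          pre = U * s + 0.1 * np.random.randn()               # the W * x_t term stands in as noise
          s = np.tanh(pre)
          grad_to_first_step *= (1 - np.tanh(pre) ** 2) * U    # one chain-rule factor per step
          print(t, grad_to_first_step)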

  • @behrangjavaherian
    @behrangjavaherian 7 years ago +5

    Good video describing LSTMs. The only part I did not like about the video is the description of Markov models at around 11:00. Markov models can capture a richer state than just the previous word. Also, the lecturer mentioned that Markov models assume each state depends only on the previous state, which is not quite accurate: in a Markov model, state s[n] is not independent of s[n-2]; it is conditionally independent of s[n-2] given s[n-1]. Independence and conditional independence have profoundly different interpretations in statistics and probability theory, as well as in machine learning.
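
    For reference (a standard definition, not a quote from the talk), the first-order Markov property the comment is drawing on can be written as

      P(s_n \mid s_{n-1}, s_{n-2}, \dots, s_1) = P(s_n \mid s_{n-1}),

    i.e. s_n is conditionally independent of s_{n-2} given s_{n-1}, even though s_n and s_{n-2} are generally not marginally independent.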

  • @mohammednagdy6661
    @mohammednagdy6661 6 years ago

    At 26:37: wouldn't this affect all the layers, not just the ones closer to the output layer, since we're taking the sum of the gradients over all the previous cell states? So wouldn't every node change by the same amount when we update our weights?

  • @曹恒-t4l
    @曹恒-t4l 6 years ago +1

    Very good video, both general and professional, which helps me clarify important concepts in cs224n.

  • @nitinissacjoy5270
    @nitinissacjoy5270 7 years ago +65

    Audio stuttering ruined a great presentation :/ You should've used an LSTM to fill in for it haha

  • @bharath5673__
    @bharath5673__ 6 years ago +1

    I really was pretty confused about how to start with LSTMs, but seriously your presentation made it super easy. #all stars for Harini

  • @trygvb
    @trygvb 7 years ago +24

    39:10 that's my first time hearing numpy pronounced as 'num-pee'

    • @mliuzzolino
      @mliuzzolino 7 years ago +8

      And it was like nails on a chalkboard.

    • @mcgil8891
      @mcgil8891 6 years ago

      😂

    • @andresmejia812
      @andresmejia812 5 years ago

      hahahaha here in Colombia everyone says num-pee

    • @Vocal4Local
      @Vocal4Local 3 years ago

      I've always called it num-pee.. it feels easier to import num-pee as en-pee

  • @saifghassan
    @saifghassan 6 years ago +3

    Kind of a good explanation, but I think at 13:31 it should be s1 = tanh(Wx1 + Us0)
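
    Assuming the update the comment refers to, s_t = tanh(W x_t + U s_{t-1}), one step in numpy looks roughly like this (dimensions are made up for illustration):

      import numpy as np

      hidden, embed = 4, 3                        # illustrative sizes
      W = 0.1 * np.random.randn(hidden, embed)    # input-to-hidden weights
      U = 0.1 * np.random.randn(hidden, hidden)   # hidden-to-hidden (recurrent) weights
      s0 = np.zeros(hidden)                       # initial state
      x1 = np.random.randn(embed)                 # first input vector

      s1 = np.tanh(W @ x1 + U @ s0)               # s1 = tanh(W x1 + U s0)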

  • @4skynet193
    @4skynet193 7 years ago +3

    A bit strange... "An LSTM has a few more parameters than a simple RNN, which has a W matrix"
    -> vanilla LSTM '97 (and this is also true for many other LSTM variants) has only W (+ bias) [often one big matrix containing all 4 gate blocks]
    >> it also passes the internal state along (c + h), but that's not so crucial to my mind

  • @ImranKhan-fi2sm
    @ImranKhan-fi2sm 5 years ago

    Hi,
    How do I handle the persistence-model problem? When doing time-series analysis I get an output that seems to be one time step ahead of the actual series. How can I rectify this? I am getting this with several ML, DL, and statistical algorithms. Please do reply.

  • @username42
    @username42 6 years ago +1

    Where is the link for the open-source software package?

  • @Bounzztothabeat
    @Bounzztothabeat 6 years ago +1

    As an amateur in this field, 19:42 was helpful af

  • @解晶莉
    @解晶莉 6 years ago

    25:27 is the derivative right?

  • @MrKrunu
    @MrKrunu 7 years ago +5

    Very nice presentation; complex things explained very nicely by Harini!!!

  • @zhuyixue4979
    @zhuyixue4979 6 years ago

    At 25:14: if the fact that the derivatives of tanh and sigmoid are most often < 1 explains vanishing gradients, what about exploding gradients, why would those happen? I am trying to understand this paper better: arxiv.org/abs/1211.5063
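
    Roughly speaking (a back-of-the-envelope sketch, not from the talk): the backpropagated gradient is a product of per-step factors, so whether it vanishes or explodes depends on whether those factors are typically below or above 1 in magnitude. In the scalar picture:

      # each backprop step multiplies in roughly (activation derivative) * (recurrent weight)
      print(0.9 ** 50)   # ~0.005 -> vanishing regime
      print(1.1 ** 50)   # ~117   -> exploding regime (recurrent weight effectively larger than 1)

    Pascanu et al. make this precise in terms of the largest singular value of the recurrent weight matrix.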

  • @MrSDPace
    @MrSDPace 7 years ago

    At 43:31, in the line sess.run(optimizer, feed_dict={x: inputs, y: labels}), what specific values should be given for inputs and labels so that I can run the session?

    • @arindamsengupta8014
      @arindamsengupta8014 6 years ago

      inputs and labels would both be arrays that you load somewhere in your code. E.g. say you have 10 images (each of size 32x32x3); your input to the network at a given instant would be an array of 32x32x3 = 3072 pixels. Your label for this input would be the class of the output in vector form. Say all 10 images were either dogs, cats, or rabbits, so you have 3 classes. For a given image input of 3072 pixels, if your class is cat, your label vector would be [0 1 0]. For another 3072 input pixels of a rabbit, the label would be [0 0 1], and so on. So once you have constructed an input tensor of size 10x3072 and a labels tensor of size 10x3, you would feed_dict them to optimize the parameters. Hope this helps!
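
      As a rough sketch of how those shapes plug into the feed_dict call (written against the old 1.x-style API shown in the talk; the model here is just a stand-in dense layer, and every name besides x, y, inputs and labels is made up):

        import numpy as np
        import tensorflow as tf   # 1.x-style API

        x = tf.placeholder(tf.float32, [None, 3072])   # flattened 32x32x3 images
        y = tf.placeholder(tf.float32, [None, 3])      # one-hot labels: dog / cat / rabbit
        logits = tf.layers.dense(x, 3)                 # stand-in model
        loss = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
        optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

        inputs = np.random.rand(10, 3072).astype(np.float32)                    # 10 fake "images"
        labels = np.eye(3, dtype=np.float32)[np.random.randint(0, 3, size=10)]  # 10 fake one-hot labels

        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            sess.run(optimizer, feed_dict={x: inputs, y: labels})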

  • @pranabsarkar
    @pranabsarkar 4 years ago

    Fantastic Presentation!

  • @ericazombie793
    @ericazombie793 3 years ago

    The TensorFlow code here is too old for the current TensorFlow version.

  • @ShivamSinhaiiitm
    @ShivamSinhaiiitm 7 years ago +2

    Wow, that's what I needed: explanation via example.

  • @valerioorfano3532
    @valerioorfano3532 7 years ago

    At minute 25, is there an error in the definition of ds_n/ds_(n-1)?

  • @iloveno3
    @iloveno3 6 years ago

    Where is the continuation?

  • @kameziax
    @kameziax 6 years ago +2

    Harini had a hard time explaining the LSTM architecture. I would recommend using simple examples to walk through the LSTM execution.

  • @recplayandstop
    @recplayandstop 7 years ago

    Did someone go through the tutorial? I need some help adding TensorBoard summaries to see what's happening.

  • @4skynet193
    @4skynet193 7 years ago +4

    One more... -> the talk is from Apr 26, 2017 >> but you show the old TensorFlow API (before 1.0)

    • @cadeop
      @cadeop 7 years ago +1

      hahah... yeah man, kinda like you're implying that MIT quality is just based on propaganda. Like Harvard economists.

    • @multiks2200
      @multiks2200 7 years ago +2

      how's that an issue? version migration is a commitment, and quite often doesn't affect the quality of the outcome.

    • @XiaosChannel
      @XiaosChannel 7 years ago +1

      Nibs Aardvark It can be an issue if people try to code that way with a newer API, since 1.0 changed it.

  • @sepidet6970
    @sepidet6970 5 years ago +1

    Thank you Harini that was a great presentation :)

  • @yagamilaitoo6232
    @yagamilaitoo6232 7 years ago

    So what should we understand from S2 = tanh(Wx + US1)? How do we find U? Is it given already, or do we need to initialize U as well?

    • @arindamsengupta8014
      @arindamsengupta8014 6 years ago

      Just like W, we would need to initialize U as well... as the model trains, it will optimize both W and U.
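
      In code (1.x-style, names and sizes illustrative), that just means both matrices start as randomly initialized trainable variables, and the optimizer updates both of them:

        import tensorflow as tf   # 1.x-style API

        hidden, embed = 128, 64
        W = tf.Variable(tf.random_normal([hidden, embed], stddev=0.1))   # input weights
        U = tf.Variable(tf.random_normal([hidden, hidden], stddev=0.1))  # recurrent weights
        # Both W and U end up in tf.trainable_variables(), so optimizer.minimize(loss)
        # computes gradients for, and updates, both of them during training.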

  • @rahulkrishnan529
    @rahulkrishnan529 6 years ago +1

    Awesome presentation. Can I look forward to a presentation on Convolutional LSTMs any time soon? Anyway, keep posting more awesome content.

  • @valerioorfano3532
    @valerioorfano3532 7 years ago

    ds_n/ds_(n-1) should be U * f'(U s_(n-1) + W x_n)
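
    For what it's worth, assuming the update s_n = tanh(W x_n + U s_{n-1}) from the slides, the chain rule gives (in the scalar case)

      \frac{\partial s_n}{\partial s_{n-1}} = \tanh'(W x_n + U s_{n-1}) \cdot U = \left(1 - \tanh^2(W x_n + U s_{n-1})\right) U,

    which matches the U * f'(...) form in this comment; in the vector case the tanh' factor becomes a diagonal Jacobian multiplying U.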

  • @siddharthagrawal8300
    @siddharthagrawal8300 6 years ago

    Can anyone explain the math of differentiating the LSTM functions please! I want to know how the backprop works. I need it for my math school project! Please help!!

  • @insoucyant
    @insoucyant 7 years ago

    Thanks for the wonderful presentation, nicely explained. Are the slides done in LaTeX? Can I get the LaTeX source? TIA

  • @cyluxo
    @cyluxo 7 years ago +1

    Thank you for the nice presentation. Can I get the slides?

  • @trygvb
    @trygvb 7 years ago

    This is amazingly helpful. Thanks

  • @tingwang6631
    @tingwang6631 7 years ago

    very helpful and clear

  • @state_song_xprt
    @state_song_xprt 7 years ago

    This is a great explanation! Thank you so much!

  • @felixetn9443
    @felixetn9443 7 years ago

    One of the best introductions I have found! BTW: Here is the link to the LSTM Tensorflow tutorial: github.com/nicholaslocascio/bcs-lstm

  • @MahmoudRabieadevedo
    @MahmoudRabieadevedo 7 years ago

    The tutorial is great

  • @joaosousapinto3614
    @joaosousapinto3614 7 years ago

    Great talk, thanks.

  • @aes9217
    @aes9217 7 years ago +1

    10 points to gryffindor

  • @riomanty
    @riomanty 7 years ago

    Great talk

  • @deepaksahoo8006
    @deepaksahoo8006 7 years ago

    Excellent

  • @sebastianavalos2055
    @sebastianavalos2055 6 years ago

    Thanks!

  • @forbeswinthrop153
    @forbeswinthrop153 7 years ago +1

    vibrant uptalker

  • @BenOgorek
    @BenOgorek 6 years ago

    Good video, though I feel like it was more "LSTMs and Tensorflow" than "LSTMs in Tensorflow." The lab looks like it fills in the rest: github.com/nicholaslocascio/bcs-lstm/blob/master/Lab.ipynb

  • @th3n04h
    @th3n04h 7 years ago

    40:00

  • @tsrevo1
    @tsrevo1 7 years ago

    I would marry her.

  • @sakcee
    @sakcee 3 months ago

    the guy crying about equations....geez