Deep Shallownet
  • 70 videos
  • 171,975 views
K-means++ & Lloyd's algorithm
Hands-on example of the K-means++ initialization algorithm and Lloyd's algorithm of iterative refinement for K-means clustering.
References:
K-means++ - Arthur, D.; Vassilvitskii, S. (2007). "k-means++: the advantages of careful seeding". Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA. pp. 1027-1035.
Lloyd - Lloyd, Stuart P. (1982). "Least squares quantization in PCM". IEEE Transactions on Information Theory, 28 (2): 129-137.
Plot - www.desmos.com/calculator/o3zfw1mxjy
-----------------------------
Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/).
6,268 views
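As a companion to the description above, here is a minimal NumPy sketch of the two pieces named in the title: k-means++ seeding followed by Lloyd's iterative refinement. The function names and the toy data are illustrative and not taken from the video.

```python
import numpy as np

def kmeans_pp_init(X, k, rng):
    # k-means++ seeding: first center uniformly at random, each further center
    # drawn with probability proportional to squared distance to the nearest chosen center.
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d2 = np.min(((X[:, None, :] - np.array(centers)) ** 2).sum(-1), axis=1)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)

def lloyd(X, centers, max_iter=100):
    # Lloyd's algorithm: alternate nearest-center assignment and centroid update
    # until the centers stop moving.
    for _ in range(max_iter):
        labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(-1), axis=1)
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(len(centers))])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(3, 0.5, (20, 2))])  # two toy blobs
centers, labels = lloyd(X, kmeans_pp_init(X, k=2, rng=rng))
```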

Videos

Principal Component Analysis (PCA) - Step by Step
398 views · 4 years ago
A step-by-step, by-hand principal component analysis on a toy dataset with visualizations. Covariance Matrix: ruclips.net/video/Ggtsxmyx5rM/видео.html Eigen Decomposition: ruclips.net/video/inBYfXAOZDA/видео.html Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Slides made in: Microsoft PowerPoint. Plot: www....
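The pipeline this entry refers to (center the data, form the covariance matrix, eigendecompose it, project) can be sketched in a few lines of NumPy; the toy dataset below is made up for illustration and is not the one used in the video.

```python
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])  # toy dataset

Xc = X - X.mean(axis=0)               # 1. center the data
C = np.cov(Xc, rowvar=False)          # 2. covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)  # 3. eigendecomposition (eigenvalues in ascending order)
order = np.argsort(eigvals)[::-1]     #    sort components by explained variance
components = eigvecs[:, order]
scores = Xc @ components[:, :1]       # 4. project onto the first principal component
```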
Decision Tree - Gini Impurity
830 views · 4 years ago
Building a decision tree with Gini impurity, by hand. Decision tree learning with a top-down greedy approach. 0:00 - Interpreting a tree 1:18 - Attribute and split types 2:22 - Building a decision tree 2:45 - Gini Impurity 5:26 - Building a tree intuitively 6:24 - Notes on overfitting Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville...
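For reference alongside this entry, the Gini impurity of a node is one minus the sum of squared class proportions, and a candidate split can be scored by the size-weighted impurity of its children. The helper functions below are a hypothetical sketch, not code from the video.

```python
from collections import Counter

def gini(labels):
    # Gini impurity: 1 - sum of squared class proportions.
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_impurity(left, right):
    # Size-weighted Gini impurity of a candidate split.
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

print(gini(["yes", "yes", "no", "no"]))              # 0.5: maximally mixed for two classes
print(split_impurity(["yes", "yes"], ["no", "no"]))  # 0.0: a pure split
```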
Radial Basis Function Kernel - Gaussian Kernel
7K views · 4 years ago
The Radial Basis Function kernel considered as a measure of similarity, showing how it corresponds to a dot product. Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Slides made in: Microsoft PowerPoint. Derivation of space: pages.cs.wisc.edu/~matthewb/pages/notes/pdf/svms/RBFKernel.pdf
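The "corresponds to a dot product" claim in this entry can be made concrete with the standard Gaussian-kernel form (the bandwidth symbol σ is an assumption here, not notation from the video):

$$k(\mathbf{x}, \mathbf{x}') = \exp\!\left(-\frac{\lVert \mathbf{x}-\mathbf{x}'\rVert^2}{2\sigma^2}\right) = \exp\!\left(-\frac{\lVert \mathbf{x}\rVert^2}{2\sigma^2}\right)\exp\!\left(-\frac{\lVert \mathbf{x}'\rVert^2}{2\sigma^2}\right)\exp\!\left(\frac{\mathbf{x}^\top\mathbf{x}'}{\sigma^2}\right)$$

Expanding the last factor as a Taylor series writes the kernel as an inner product $\langle \phi(\mathbf{x}), \phi(\mathbf{x}')\rangle$ in an infinite-dimensional feature space, which is the construction worked out in the linked derivation; the kernel value itself decays from 1 toward 0 as the points move apart, hence the similarity interpretation.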
Kernel Trick Visualization, Derivation, and Explanation.
1.4K views · 4 years ago
What the kernel trick does visually, deriving the polynomial kernel of degree two, and writing down the kernel perceptron algorithm. Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Animation made with www.geogebra.org. Slides made in: Microsoft PowerPoint
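For two-dimensional inputs, the degree-two polynomial kernel mentioned in this entry expands into an explicit feature map; the homogeneous form below is an assumption about the exact variant used in the video:

$$(\mathbf{x}^\top\mathbf{z})^2 = (x_1 z_1 + x_2 z_2)^2 = x_1^2 z_1^2 + 2\,x_1 x_2\, z_1 z_2 + x_2^2 z_2^2 = \phi(\mathbf{x})^\top \phi(\mathbf{z}), \qquad \phi(\mathbf{x}) = \left(x_1^2,\ \sqrt{2}\,x_1 x_2,\ x_2^2\right)$$

so the kernel evaluates a dot product in a three-dimensional feature space without ever computing $\phi$ explicitly, which is the trick the kernel perceptron exploits.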
Perceptron
158 views · 4 years ago
A by-hand numerical example of finding a decision boundary using the perceptron learning algorithm and using it for classification. Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Slides made in: Microsoft PowerPoint. Graph: www.desmos.com/calculator/vnoky7b2fn
Soft Margin SVM
539 views · 4 years ago
Comparing hard-margin SVM with soft-margin SVM, and finding the soft-margin classifier in Python. Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Slides made in: Microsoft PowerPoint. Graph: www.desmos.com/calculator/dd0nuurs3e CVX solver: cvxopt.org/
Maximum Margin Classifier, SVM - Support Vector Machine
1.2K views · 4 years ago
Explanation and example of finding a separating hyperplane using linear programming. 0:00 - Maximum Margin Classifier 2:20 - Linear Programming with CVX Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Slides made in: Microsoft PowerPoint. Graphs made with Desmos: www.desmos.com/calculator/rp0lniv4oj CVX sol...
Logistic Regression | Binary Logistic Regression
91 views · 4 years ago
Binary logistic regression with an example of fitting to the data. Erratum: at 3:02 and 3:15, the log-likelihoods need a negative sign, because we're finding the derivative of the negative log-likelihood. View the corrected slide at: ibb.co/YTXK51M or bit.ly/3esE0cD Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Sl...
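The sign issue in the erratum is easiest to see from the negative log-likelihood itself; with $\sigma$ the logistic function and $\hat{y}_i = \sigma(\mathbf{w}^\top \mathbf{x}_i)$ (notation assumed here, not taken from the slides), the quantity being minimized and its gradient are

$$\mathcal{L}(\mathbf{w}) = -\sum_i \left[\, y_i \log \hat{y}_i + (1-y_i)\log(1-\hat{y}_i) \,\right], \qquad \nabla_{\mathbf{w}} \mathcal{L}(\mathbf{w}) = \sum_i (\hat{y}_i - y_i)\,\mathbf{x}_i$$

so the leading minus sign matters: dropping it flips the sign of every gradient step, which is the point of the correction.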
Bayesian Inference & Maximum a Posteriori Estimation | Bayesian Statistics
1K views · 4 years ago
Bayesian statistics with examples. Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Slides made in: Microsoft PowerPoint. Plots: Desmos
Linear Regression as Maximum Likelihood
297 views · 4 years ago
Justifying linear regression as maximum likelihood estimation. Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Slides made in: Microsoft PowerPoint
Maximum Likelihood Estimation
293 views · 4 years ago
Likelihood and maximum likelihood estimation explained. Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Slides made in: Microsoft PowerPoint. Graphs: www.desmos.com/calculator
Consistency in Estimators, Bias of Consistent Estimators
6K views · 4 years ago
Consistent estimators and their bias. Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Slides made in: Microsoft PowerPoint
Bias & Variance Tradeoff with MSE
257 views · 4 years ago
The bias and variance tradeoff, representing MSE in terms of bias and variance. Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Slides made in: Microsoft PowerPoint
Variance and Standard Error of an Estimator/Statistic
1.5K views · 4 years ago
Showing the importance of variance in estimators and finding the variance of the sample mean. Recommended to read along: Deep Learning, an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (www.deeplearningbook.org/). Slides made in: Microsoft PowerPoint
Bias of an Estimator
1.3K views · 4 years ago
Point Estimators & Function Estimators
162 views · 4 years ago
K-fold Cross-Validation
216 views · 4 years ago
Parameters and Hyperparameters
110 views · 4 years ago
Weight Decay - L2 Regularization Example
4.5K views · 4 years ago
No Free Lunch Theorem (NFL)
1.1K views · 4 years ago
K-Nearest Neighbor Regression
420 views · 4 years ago
Overfitting And Underfitting In Machine Learning
189 views · 4 years ago
Linear Regression
555 views · 4 years ago
Constrained Gradient Descent
1.6K views · 5 years ago
Constrained Optimization Problem
335 views · 5 years ago
Backtracking Line Search in Gradient Descent
18K views · 5 years ago
Gradient Descent
324 views · 5 years ago
Convex Sets & Functions
430 views · 5 years ago
Lipschitz Continuity | Lipschitz Condition
12K views · 5 years ago

Comments

  • @geografixxxx · 7 months ago

    Thanks, very informative!

  • @drunky5247 · 9 months ago

  • @farooq36901 · 10 months ago

    The most underrated channel for deep learning and probability.

  • @Citreonic98 · 11 months ago

    pretty sure you forgot to include the 1/2 at the end.

  • @dabeeramir1407 · a year ago

    You explained this in a minute better than every single other video I've seen

  • @ZHENHENGCHOO · a year ago

    thanks! simple and easy to understand

  • @asrahussain8642 · a year ago

    rank of a matrix?

  • @dusanmarceta1562 · a year ago

    Isn't this data already separable in 2D space?

  • @el_nuevo · a year ago

    Are there any applications of the Hadamard product on matrices?

    • @hjdeheer · a year ago

      Yes, for example in the calculation of LSTM gates and gated recurrent units (GRUs) in deep learning.

  • @alejandrosanzfernandez3927 · a year ago

    Once you obtain x_new = 0.18, do you start the method again with x = 0.18 and beta = 1, or do you restart it with x = 0.18 and your last epsilon? Or is the exercise over?

  • @29ibrahimsayed95 · a year ago

    nice explanation

  • @aliamirkhorasani9715 · 2 years ago

    Brief and precise

  • @AJ-et3vf · 2 years ago

    So difficult to understand

  • @Sorayahyogal · 2 years ago

    It helps, thank you.

  • @actuarialscience2283 · 2 years ago

    God bless you, boss. I have really struggled today for close to 24 hours searching through garbage online till I found your video.

  • @almerchant7595 · 2 years ago

    How/why is the first term constant?

  • @wadewang574 · 2 years ago

    It seems that the y in the 1st row is the x_2 in the 3rd row ?

  • @naveennayak989 · 2 years ago

    What is meant by a fair, biased, and unbiased coin?

  • @jayakumarr3847 · 2 years ago

    Enna koop ada e paryunnathu uttaram kittunila

  • @MariaGarcia-ey3fg · 2 years ago

    This video really helped me clear it up, and your example was very simple and useful! I am confused as to why you made beta = 0.707... What made you choose this? Is it just a standard?

    • @skittles6486 · 2 years ago

      It is given: alpha lies between 0 and 0.5, and beta lies between 0 and 1.

  • @elp09bm1 · 2 years ago

    good for quick recap

  • @RogerKamena22 · 2 years ago

    Super clear explanation!

  • @plttji2615 · 3 years ago

    Thank you for the video. Can you help me prove whether the estimate is unbiased in this question? Question: Compare the average height of employees at Google with the average height in the United States; do you think it is an unbiased estimate? If not, how do you prove it is not matched?

  • @autogenes · 3 years ago

    Nice. It would have been even cooler if you had visualized the iterations of the converging centroids.

  • @Tyokok · 3 years ago

    Thanks for the great video! One question: at 1:03, where do you get or derive that while condition? Can you please provide some materials? Thanks!

    • @leotorres300 · 3 years ago

      The while condition is just there to tell you to keep checking the inequality (updating epsilon, or t in some textbooks) until it no longer holds. Then your Xc will be the value at the point where the inequality no longer holds. Evaluate your objective function at this Xc to minimize it.

    • @Tyokok · 3 years ago

      @leotorres300 Thanks for the reply! I understand the logic, but I just want to know the math behind this while condition. Why exactly is it in that form? Is it derived from the gradient or something else? It cannot be an arbitrary condition.
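For readers following this thread, here is a minimal sketch of backtracking line search with the sufficient-decrease (Armijo) condition as the while test. The quadratic objective, the alpha and beta values, and the function names are made up for illustration; this is a generic version, not the exact slide from the video.

```python
import numpy as np

def backtracking_line_search(f, grad_f, x, alpha=0.3, beta=0.7):
    # Shrink the step size t until the sufficient-decrease (Armijo) condition holds:
    # f(x - t*grad) <= f(x) - alpha * t * ||grad||^2.
    g = grad_f(x)
    t = 1.0
    while f(x - t * g) > f(x) - alpha * t * np.dot(g, g):
        t *= beta          # keep shrinking; the loop stops once the inequality fails
    return x - t * g       # one gradient-descent step with the accepted step size

# Toy example: minimize f(x) = x1^2 + 10*x2^2 starting from (1, 1).
f = lambda x: x[0] ** 2 + 10 * x[1] ** 2
grad_f = lambda x: np.array([2 * x[0], 20 * x[1]])
x = np.array([1.0, 1.0])
for _ in range(20):
    x = backtracking_line_search(f, grad_f, x)
print(x)  # approaches the minimizer (0, 0)
```

The while condition is not arbitrary: it accepts a step only once the objective decreases by at least a fixed fraction (alpha) of what the local gradient predicts, which is what ties it back to the gradient.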

  • @damianwysokinski3285 · 3 years ago

    Thank you

  • @zhangbo0037 · 3 years ago

    Better than any other video.

  • @-danR · 3 years ago

    "...ℝ , superscript 10..." . Whoopsie, TTS.

  • @evgenyavgerinov5865 · 3 years ago

    Perfect, go ahead 👏

  • @sandeshacharya553 · 3 years ago

    Your annotation is blocking the screen. Please do not use annotations.

  • @gcumauma3319 · 3 years ago

    Could you please give examples of an unbiased but inconsistent estimator?

  • @gcumauma3319 · 3 years ago

    Very lucid, but it would be more helpful with a numerical example, thanks.

  • @jeninola3814 · 3 years ago

    *Robot* this video was *effective*

  • @niazahmed9609 · 3 years ago

    Can you please share the link of the slides?

  • @mitchellsolano1631 · 4 years ago

    lol, that first slide answered all of my questions.

  • @natashaguptafanaccount · 4 years ago

    Much appreciated

  • @robotminsu · 4 years ago

    Thank you for the easy explanation.

  • @vigneshwarilango7866 · 4 years ago

    How is it useful? Any real-time examples? Or examples in data science preprocessing?

    • @deepshallownet5206 · 4 years ago

      Both distributions can be used as a prior for Bayesian inference. The probability of the failure of a battery over time can be modeled with an exponential distribution; with time, the probability of a failure grows exponentially. Laplace distributions are encountered in finance; you can check the paper "Modelling and predicting market risk with Laplace-Gaussian mixture distributions".

  • @yazanaloqaily5476 · 4 years ago

    Thanks for this perfect video. Could you please provide any resource that works through examples of Gaussian maximum likelihood?

    • @deepshallownet5206 · 4 years ago

      Applied Statistics and Probability for Engineers, 6th edition, by Douglas Montgomery and George Runger, chapter 7.4.2, example 7-13 may be what you're looking for, but I'm not sure.

    • @yazanaloqaily5476 · 4 years ago

      Thank you sir, I will check it. God bless you 🌸😍💐

  • @ccuuttww · 4 years ago

    Well, this example has a problem: how can one coin affect the probability of the other coin?

    • @deepshallownet5206 · 4 years ago

      Thank you for the question. In the example, the outcome of one coin does not affect the outcome of the other. As mentioned at 1:00, the two events are independent; that is why, when we calculate P(A=H|B=T), it equals P(A=H). If we had dependent events, it could have been the case that P(A=H|B=T) != P(A=H). Regardless of the dependency, it is true that P(A=H|B=T) = P(A=H, B=T)/P(B=T). The example showed how to use the formula.

    • @ccuuttww · 4 years ago

      @deepshallownet5206 OK, you should state that the experiments are independent.
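To make the formula in the reply above concrete, and assuming for illustration that both coins are fair (a detail not restated in this thread), the numbers work out as

$$P(A{=}H,\, B{=}T) = \tfrac{1}{2}\cdot\tfrac{1}{2} = \tfrac{1}{4}, \qquad P(A{=}H \mid B{=}T) = \frac{P(A{=}H,\, B{=}T)}{P(B{=}T)} = \frac{1/4}{1/2} = \tfrac{1}{2} = P(A{=}H)$$

so conditioning on the second coin leaves the probability for the first coin unchanged, exactly as independence requires.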

  • @mdsifath7741 · 5 years ago

    Do I need to read the book? I'm asking because I have it and there is a lot of elaboration in the book. Is watching the videos sufficient?

    • @deepshallownet5206 · 5 years ago

      It really depends on what your goal is. This book gives less elaboration than standalone books dedicated to linear algebra; for example, there are no exercises, which you will always find in linear algebra books. It covers only the minimum required to understand the rest of deep learning, but there are still some details in the book that may be hard to grasp at first. However, after watching these videos, the book should be easier to understand. Also, this playlist omits the principal component analysis derived with just linear algebra; if you want to understand that part, you will have to read the chapter. Anyway, reading the book would not hurt. The book often skips examples of concepts, making them hard to understand, but these videos explain them in an easier way and provide examples.

  • @mdsifath7741 · 5 years ago

    Keep it up man, you're doing a great job!