DBSCAN Clustering | Python | Clustering

  • Published: 27 Oct 2024

Comments • 47

  • @pankajgoikar4158 2 years ago +2

    Trust me, bro, you have done a great job. Just amazing. Very helpful and simple code. Thanks, and keep it up.

    • @StatsWire 2 years ago

      Thank you, Pankaj, for your kind words.

  • @soumyabanerjee6860 2 years ago +6

    How can we find the optimal values of eps and min_samples? Is there a method, similar to the elbow method for K-means, that we can use to find the best values?

    • @StatsWire 2 years ago +1

      I will get back on this soon.

    • @amritakaul87 2 years ago

      I have a similar question: how do I choose values for eps and min_samples for different datasets? Some of my datasets have very large values and others have much smaller ones. Thanks in advance to anyone who answers.
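
One widely used heuristic for the eps question in this thread is the k-distance plot: sort every point's distance to its k-th nearest neighbour and look for the "knee" of the curve. The sketch below is only an illustration under stated assumptions: the data is synthetic (not the dataset from the video), min_samples = 4 is an arbitrary starting value, and the features are standardized first so that datasets with large and small values are handled alike.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: NOT the dataset used in the video).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

# Standardizing puts features with very different ranges on a comparable
# scale, so a single eps is meaningful across all of them.
X_scaled = StandardScaler().fit_transform(X)

# Illustrative choice; a common rule of thumb is roughly 2 * n_features.
min_samples = 4

# When the training data itself is queried, each point comes back as its own
# nearest neighbour (distance 0). scikit-learn's DBSCAN also counts the point
# itself towards min_samples, so n_neighbors=min_samples lines up with it.
nn = NearestNeighbors(n_neighbors=min_samples).fit(X_scaled)
distances, _ = nn.kneighbors(X_scaled)

# Distance from every point to its farthest neighbour in that set, sorted.
k_distances = np.sort(distances[:, -1])

# Plot k_distances (for example with matplotlib's plt.plot) and read a
# candidate eps off the point where the curve starts to rise sharply.
print(k_distances[:5], k_distances[-5:])
```

The knee is read by eye, so treat the resulting eps as a starting point and check the clusters it produces.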

  • @AdnanKhan-vr2rc 1 year ago +2

    Why is the elbow method used here to set the value of epsilon? Epsilon is the radius of the circles in DBSCAN, right, and the elbow method gives us the optimal number of clusters for K-means? How exactly are these two related?

    • @darkprinc3ss47 1 year ago +1

      I know this is kind of old now, but I think it was meant to make sure you get the optimal number of clusters. He had clusters 0-4, which is 5 clusters, and that matches the result of the elbow method. I used all the features of the data (not just 2), so I had to guess and check values of epsilon and minimum samples until I got 5 clusters.
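
DBSCAN does not take a number of clusters as input, so the guess-and-check approach described above can be written as a small parameter sweep that reports how many clusters each (eps, min_samples) pair produces. This is a sketch on synthetic data; the grid values and the helper name count_clusters are illustrative assumptions, not taken from the video.

```python
from itertools import product

import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with 5 groups (assumption: not the video's dataset).
X, _ = make_blobs(n_samples=400, centers=5, cluster_std=0.5, random_state=42)
X_scaled = StandardScaler().fit_transform(X)


def count_clusters(labels):
    """Number of clusters in a DBSCAN labelling, ignoring the noise label -1."""
    unique = set(int(label) for label in labels)
    unique.discard(-1)
    return len(unique)


# Hypothetical grid; in practice centre it on a value read from a k-distance plot.
results = []
for eps, min_samples in product([0.1, 0.2, 0.3, 0.5], [3, 5, 10]):
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X_scaled)
    n_noise = int(np.sum(labels == -1))
    results.append((eps, min_samples, count_clusters(labels), n_noise))

for row in results:
    print("eps=%.1f min_samples=%2d clusters=%d noise=%d" % row)
```

Matching a target cluster count is only one criterion; the number of points left as noise is worth checking as well.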

  • @ananayagupta4719 1 year ago

    Where can I find this Jupyter notebook of yours?

    • @StatsWire 1 year ago

      Please find the dataset and Jupyter notebook here: github.com/siddiquiamir/Python-Clustering-Tutorials

  • @MrMadmaggot 11 months ago

    Man, do you know where I can find code for DBSCAN implemented from scratch, but for multidimensional datasets? Not only two x's, but x1, x2, x3, and x4.

    • @StatsWire 11 months ago

      I am not sure; I will have to look for it.
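
For the from-scratch question above, here is a minimal sketch of DBSCAN in plain Python that works for points of any dimension, because math.dist accepts coordinate sequences of any length. It is not the code from the video or from scikit-learn. The names region_query and dbscan are illustrative, the neighbour search is brute force (quadratic in the number of points), and the four-dimensional sample points are made up.

```python
from math import dist  # Euclidean distance in any dimension (Python 3.8+)

NOISE = -1
UNVISITED = None


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_samples:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is also a core point
    return labels


# Made-up 4-D points (x1, x2, x3, x4): two tight groups and one stray point.
points = [
    (0, 0, 0, 0), (0.1, 0, 0, 0), (0, 0.1, 0, 0), (0, 0, 0.1, 0),
    (5, 5, 5, 5), (5.1, 5, 5, 5), (5, 5.1, 5, 5), (5, 5, 5, 5.1),
    (20, 20, 20, 20),
]
labels = dbscan(points, eps=0.5, min_samples=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

As in scikit-learn, min_samples here counts the point itself. For larger datasets a spatial index would replace the brute-force region_query.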

  • @tmorid3 1 year ago

    Why do we use fit_predict instead of fit on the training set and then predict on the test set? Thanks!

    • @StatsWire 1 year ago

      Because scikit-learn's DBSCAN has no separate predict method. It is an unsupervised algorithm that assigns cluster labels only to the data it is fitted on, so there is no train/test split here: fit_predict fits the model and returns the labels for that same data in one step.
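
On the fit_predict question above, a short check can make the behaviour concrete. The sketch assumes scikit-learn's DBSCAN and uses synthetic two-moons data with illustrative parameter values. It shows that fit_predict returns the same labels that fit stores in labels_, and that the estimator exposes no predict method for unseen data.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Synthetic stand-in data (assumption: not the dataset from the video).
X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)

# Option 1: fit and get the labels in a single call.
labels_a = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)

# Option 2: fit, then read the labels stored on the fitted estimator.
model = DBSCAN(eps=0.3, min_samples=5).fit(X)
labels_b = model.labels_

print(np.array_equal(labels_a, labels_b))  # True
print(hasattr(model, "predict"))           # False
```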

  • @suridtheanalyst6770 1 year ago

    Thanks, brother! Great video.
    Can I have the Jupyter notebook?

    • @StatsWire 1 year ago +1

      Sure, please find the link for the notebook and dataset: github.com/siddiquiamir/Python-Clustering-Tutorials

    • @suridtheanalyst6770 1 year ago

      @StatsWire Thanks a lot, man

    • @StatsWire 1 year ago

      @suridtheanalyst6770 You're welcome

  • @oubaaid8108 2 years ago +1

    Thank you for the video. You can replace all that plotting mess with this one line:
    sns.scatterplot(x="x_axis", y="y_axis", hue="labels", data=df)
    Keep it up!

    • @StatsWire 2 years ago

      Thank you for providing the shortest code :) I hope this will help others.

  • @paolofasoli4310 1 year ago

    Could you change the description of the video and insert there a link to the code and to the data set, please?

    • @StatsWire 1 year ago

      Thanks for the suggestion. I will do it right away.

  • @maxjohnson7623 1 year ago

    Thank you so much sir!

  • @antoniobento2105 2 years ago

    After I click "Run" at 1:39, I get "NameError: name 'df' is not defined". I don't know why it is different for you.

    • @StatsWire 2 years ago

      Hi, please check whether df has actually been defined. Before running that line, call print(df) and see whether it prints df or raises the same error.

  • @dilnawazahmed949 1 year ago

    Can you put the dataset in the description box?

    • @StatsWire 1 year ago +1

      Please find the dataset and Jupyter notebook on my GitHub account: github.com/siddiquiamir/Python-Clustering-Tutorials

  • @hiratabassum 2 years ago

    Very helpful video

  • @mdashrafmoin1170 2 years ago +1

    How do I detect anomalies using DBSCAN? Can you provide the code?

    • @StatsWire 2 years ago +1

      That's a different topic. I will have to make a separate video on that.

    • @mdashrafmoin1170 2 years ago +1

      @StatsWire Thanks
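
Until a dedicated walkthrough exists, one simple way to use DBSCAN for anomaly detection is to treat the points it labels as noise (label -1 in scikit-learn) as anomalies. The sketch below is an assumption-laden illustration, not the author's method: the data is synthetic, with three far-away points planted by hand, and eps and min_samples are illustrative values that would need tuning on real data.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)

# Synthetic data (assumption): one dense blob plus three planted outliers.
inliers = rng.normal(loc=0.0, scale=0.3, size=(200, 2))
outliers = np.array([[4.0, 4.0], [-4.0, 3.5], [5.0, -4.5]])
X = np.vstack([inliers, outliers])

# Points that do not belong to any dense region receive the label -1.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)

anomaly_mask = labels == -1
anomalies = X[anomaly_mask]

print("number of points flagged as anomalies:", int(anomaly_mask.sum()))
print(anomalies)
```

Points on the sparse fringe of a cluster can also be labelled noise, so the flagged set depends heavily on eps and min_samples.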

  • @applepie9806 2 years ago +1

    Please turn on the automatic subtitles/captions, it helps me understand better. I'm a visual learner. I'm also really bad at understanding accents. Also the playlists are gone.

    • @StatsWire 2 years ago

      Hi, I am not sure whether I can still do that. Is it possible to turn them on now? I will do it right away if so.

  • @roopeshroope2026 1 year ago

    How do I plot a single cluster? I got only one cluster.

    • @StatsWire 1 year ago

      That doesn't sound right. If there is only one group, why would there be a need for clustering?

  • @hananetliouant9218 1 year ago

    How do I calculate the silhouette score?

    • @StatsWire 1 year ago

      You can refer to the official documentation: scikit-learn.org/stable/modules/generated/sklearn.metrics.silhouette_score.html
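
As a small illustration of the function documented at that link, the sketch below computes a silhouette score for a DBSCAN labelling. The data is synthetic and the parameter values are illustrative. Dropping noise points (label -1) before scoring is a choice made for this example, not something the documentation requires. The guard is needed because silhouette_score raises an error when fewer than two clusters are present.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in data (assumption: not the dataset from the video).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.5, random_state=1)
labels = DBSCAN(eps=0.8, min_samples=5).fit_predict(X)

# Example choice: score only the points that were assigned to a cluster.
mask = labels != -1
n_clusters = len(set(labels[mask].tolist()))

if n_clusters >= 2:
    score = silhouette_score(X[mask], labels[mask])
    print(f"{n_clusters} clusters, silhouette = {score:.3f}")
else:
    score = None
    print("The silhouette score needs at least 2 clusters.")
```

Scores range from -1 to 1, with higher values indicating tighter, better separated clusters.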

  • @fathimafarha8217 1 year ago

    Can you help me? I have a question.

    • @StatsWire 1 year ago

      Yes, let me know

    • @fathimafarha8217 1 year ago

      I am doing my thesis using DBSCAN.
      Can you help me with it?
      Can I contact you?

    • @StatsWire 1 year ago

      @fathimafarha8217 You can connect with me on Instagram: stats_wire

  • @mazharalamsiddiqui6904 3 years ago

    Nice tutorial

  • @tanirikapal8690 4 months ago

    My code is not reading the dataset.