Gaussian Mixture Model (GMM) for clustering - calculate AIC/BIC
- Published: 11 Sep 2024
- In this video, I implement a Gaussian Mixture Model (GMM) for clustering using Scikit-Learn. GMMs assume that the data are generated from a fixed number of Gaussian distributions, each of which represents one cluster. We also calculate the AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) to determine the best-fitting number of components.
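A minimal sketch of the idea with scikit-learn, using synthetic two-blob data rather than the video's dataset:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two well-separated Gaussian blobs -> two clusters the model should recover
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(8, 1, (200, 2))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)        # hard cluster assignment for each point
print(gmm.aic(X), gmm.bic(X))  # lower is better when comparing models on the same data
```

`GaussianMixture` exposes `aic(X)` and `bic(X)` directly, so model selection is just a matter of fitting several candidate component counts and comparing the scores.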
GitHub address: github.com/ran...
For more details, check Scikit-Learn documentation: scikit-learn.o...
01:04 Import the required libraries
02:50 Load penguins dataset
04:45 Drop NaN values
07:26 Replace categorical variables with numeric values
08:51 Select features and targets
10:25 Perform preprocessing
10:57 Perform GMM for clustering
12:46 Comparison of predictions with targets
17:02 Calculate AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) to determine the best fit
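The steps above can be sketched roughly as follows. To keep the snippet self-contained, a tiny hand-made DataFrame with the penguins column names stands in for the real dataset (the video loads the actual penguins data):

```python
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the penguins dataset (same column names)
rng = np.random.default_rng(0)
n = 60
df = pd.DataFrame({
    "species": ["Adelie"] * (n // 2) + ["Gentoo"] * (n // 2),
    "sex": rng.choice(["Male", "Female", None], size=n),
    "bill_length_mm": np.r_[rng.normal(39, 1, n // 2), rng.normal(47, 1, n // 2)],
    "flipper_length_mm": np.r_[rng.normal(190, 3, n // 2), rng.normal(217, 3, n // 2)],
})

df = df.dropna()                                     # drop NaN values
df["sex"] = df["sex"].map({"Male": 0, "Female": 1})  # categorical -> numeric
X = df[["bill_length_mm", "flipper_length_mm"]]      # select features
y = df["species"]                                    # target, used only for comparison

# Fit the GMM on unscaled features (the closing note in this description
# recommends skipping scaling for this example)
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)

print(pd.crosstab(y, labels))  # compare cluster labels with the true species

# AIC/BIC across candidate component counts
for k in (1, 2, 3):
    m = GaussianMixture(n_components=k, random_state=0).fit(X)
    print(k, m.aic(X), m.bic(X))
```

The cross-tabulation mirrors the "comparison of predictions with targets" step: since clustering is unsupervised, the cluster indices are arbitrary, so the table shows how well clusters line up with species rather than raw accuracy.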
#datascience #clustering #python #jupyternotebook #unsupervisedlearning #GaussianMixtureModel #distributionbasedclustering #sklearn #matplotlib
Between 10:11 and 10:43, I scaled the features but forgot to use the scaled values afterwards.
For this example, it's better not to scale the features; the unscaled features seem to work better for calculating AIC.
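For reference, a sketch of running the AIC comparison on both raw and scaled features (synthetic data, not the penguins dataset). One caveat: AIC values computed on differently scaled data are not directly comparable, because rescaling changes the likelihood itself; AIC should only be used to rank models fitted to the same version of the data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (150, 2)), rng.normal(6, 1, (150, 2))])
versions = {"raw": X, "scaled": StandardScaler().fit_transform(X)}

results = {}
for name, data in versions.items():
    # AIC for 1..3 components; compare only within the same version of the data
    results[name] = {
        k: GaussianMixture(n_components=k, random_state=0).fit(data).aic(data)
        for k in (1, 2, 3)
    }

best_k = {name: min(aics, key=aics.get) for name, aics in results.items()}
print(best_k)  # with well-separated blobs, both versions tend to prefer 2 components
```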