Complete guide to outliers | How to work with outliers | Finding an outlier in a dataset using Python

  • Published: 21 Aug 2024

Comments • 11

  • @rajasekaranm1198
    2 months ago

    I can't lie to you all: Unfold Data Science is one of the best data science learning platforms.
    I learned many useful skills from his videos.

  • @sanketadamapure802
    1 year ago +3

    Distance- and density-based methods are commonly used for outlier detection. Here are a few algorithms of this kind, plus one tree-based alternative:
    1] k-nearest neighbors (k-NN): As a classifier, k-NN assigns each data point the majority class among its k nearest neighbors. For outlier detection, each point is instead scored by the distance to its k nearest neighbors (for example, the distance to the k-th neighbor). Outliers are points that have few or no neighbors within a certain distance.
    2] Local Outlier Factor (LOF): LOF compares the local density of a data point with the local densities of its k nearest neighbors. It identifies outliers as points whose density is significantly lower than that of their neighbors, and it provides an outlier score for each data point.
    3] Isolation Forest: Isolation Forest is tree-based rather than distance-based. It builds random decision trees and measures the number of splits required to isolate a data point from the rest of the data. Outliers are identified as points with a shorter average path length across the trees.
    4] DBSCAN (Density-Based Spatial Clustering of Applications with Noise): DBSCAN groups together data points that are close to each other based on a density criterion. Points that do not belong to any dense cluster are labeled as noise and can be treated as outliers.
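
The k-NN idea from item 1 can be sketched in a few lines of plain Python. This is a minimal illustration, not the method used in the video: the function name `knn_outlier_scores`, the choice of k = 3, and the toy data are all assumptions made for the example. Each point is scored by the Euclidean distance to its k-th nearest neighbor, so isolated points get larger scores.

```python
import math

def knn_outlier_scores(points, k=3):
    """Score each point by the distance to its k-th nearest neighbor.

    Larger scores mean the point sits farther from its neighbors.
    (Hypothetical helper written for this example.)
    """
    if not 1 <= k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Assumed toy data: a tight cluster plus one far-away point.
data = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (1.0, 0.8), (8.0, 8.0)]
scores = knn_outlier_scores(data, k=3)

# The point with the largest score is the most outlying one.
most_outlying = max(range(len(data)), key=lambda i: scores[i])
print(most_outlying, scores[most_outlying])
```

How large a score must be before a point counts as an outlier is a judgment call that depends on the data. For the other three methods, scikit-learn ships ready-made estimators (`LocalOutlierFactor`, `IsolationForest`, and `DBSCAN`), which are likely a better fit than hand-written code on real datasets.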

  • @rajasekaranm1198
    2 months ago

    What a beautiful explanation!

  • @ozan4702
    2 months ago

    Thank you for the video. Do you recommend combining multiple outlier treatment methods? For example, log transform + winsorization? Or log transform + winsorization + standard scaler? If so, what should be the order of applying these methods?

  • @manjeerag868
    1 year ago

    Hi Aman
    Thank you so much for your valuable videos.
    I pinged you on LinkedIn. Please reply 🙏

  • @umeshtiwari800
    1 year ago

    Thanks, Aman

  • @balajikomma541
    1 year ago

    Sir, I'm following your "Big Data Hadoop and Unix" playlist, but after the 'Sqoop' installation video there are no further videos. Could you please tell me where the continuation of this playlist is? Kindly update the playlist.
    Also, one question: is Big Data still important for data science in 2023, or can it be handled with cloud technologies like Databricks and PySpark on AWS, Azure, or GCP? Kindly reply, sir.

  • @bhuvanavinodh3498
    1 year ago

    This Dataset pl