Merging multiple datasets for Machine Learning Project | Challenges in merging multiple datasets

  • Published: 11 Sep 2024
  • Hello Friends,
    In this episode, I am going to share details about:
    When you should merge datasets
    Why you need to merge datasets
    Challenges while merging datasets
    How to handle these challenges
    What the alternatives to merging datasets are
    code: github.com/dat...
    Stay tuned and enjoy Machine Learning !!!
    Cheers !!!
    #mergedatasets #datapreprocessing #machinelearning
    #datamagic
    Connect with me,
    ☑️ YouTube : / datamagic2020
    ☑️ Facebook : / datamagic2020
    ☑️ Instagram : / datamagic2020
    ☑️ Twitter : / datamagic5
    ☑️ Telegram: t.me/datamagic...
    For Business Inquiries : datamagic2020@gmail.com
    Best book for Machine Learning : amzn.to/3qCe0Rf
    Machine Learning for Absolute Beginners: amzn.to/3mMSRUO
    Machine Learning For Dummies: amzn.to/32K7Ms6
    Hands-On Machine Learning with Scikit-Learn and TensorFlow: amzn.to/3mOf0SL
    The Elements of Statistical Learning: amzn.to/3Jysegf
    Machine Learning in Action: amzn.to/3mNE7Ff
    🎥 Playlists :
    ☑️Machine Learning Basics
    • Machine Learning Basics
    ☑️Feature Engineering/ Data Preprocessing
    • Data Preprocessing
    ☑️OpenCV Tutorial [Computer Vision]
    • OpenCV Tutorial [Compu...
    ☑️Machine Learning Algorithms
    • ML Algorithms
    ☑️Machine Learning Environment Setup
    • Machine Learning Envir...
    ☑️Machine Learning Model Deployment
    • Machine Learning Model...
    ☑️Machine Learning Projects
    • Machine Learning 100 P...
    ☑️Kaggle Tutorial
    • Kaggle Tutorial
    ☑️Microsoft Lobe Tutorial
    • Microsoft Lobe
    Thank you for watching !!
    Please Like, Comment, Share and Subscribe!!!

Comments • 44

  • @IgarokSpider-ht2ve
    @IgarokSpider-ht2ve 10 months ago

    Best channel. Please make a video on having 9 datasets for 9 different time-series price assets. Separate them into 3 groups: (dataset 1,2,3), (dataset 4,5,6), (dataset 7,8,9). Train each group for price correlation with different AI models. Then, at the end, ensemble them into one meta-model.

  • @popoolaolawale7564
    @popoolaolawale7564 2 years ago +1

    Thanks for the brilliant explanation. Please explain further how the large dataset can be used as training data and the small dataset as test data, without the need to merge the datasets. It will be greatly appreciated.

    • @popoolaolawale7564
      @popoolaolawale7564 2 years ago

      Thanks.

    • @DataMagicAI
      @DataMagicAI  2 years ago

      It's simple: just load both datasets into two different dataframes. Drop unnecessary columns from both datasets. Use one for training and the other for testing.
      Still, I will try to create a short episode on it soon.
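
The steps in this reply can be sketched roughly as follows. This is only an illustration: the column names and values are made up, and `np.polyfit` stands in for whatever model you would actually train.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a large dataset for training and a small one for testing.
large = pd.DataFrame({
    "height_cm": [150, 160, 170, 180, 190],
    "weight_kg": [50.0, 58.0, 66.0, 74.0, 82.0],
    "source_id": [1, 2, 3, 4, 5],        # not needed for modelling
})
small = pd.DataFrame({
    "height_cm": [165, 175],
    "weight_kg": [62.0, 70.0],
    "notes": ["a", "b"],                 # not needed for modelling
})

feature, target = "height_cm", "weight_kg"

# Drop unnecessary columns by keeping only the shared feature and target.
train = large[[feature, target]]
test = small[[feature, target]]

# Fit a straight line on the large dataset only (a stand-in for any model).
slope, intercept = np.polyfit(train[feature].to_numpy(), train[target].to_numpy(), deg=1)

# Evaluate on the small dataset, which was never used for fitting.
predictions = slope * test[feature] + intercept
mae = (predictions - test[target]).abs().mean()
print(f"test MAE: {mae:.3f}")
```

The two dataframes are never concatenated; they only need to agree on the feature and target columns.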

  • @Futureyouth-be1bo
    @Futureyouth-be1bo 3 months ago +1

    I have two flight datasets. Can I merge them, or can I use one for training and one for testing?

  • @adrienykt874
    @adrienykt874 1 year ago +1

    Hi. Can we merge 2 datasets which have the same number of columns but not necessarily the exact same headers?
    If yes, can you please upload a video showing it? Thanks.

  • @manojbattula8100
    @manojbattula8100 5 months ago +1

    Is it possible to combine two different deep learning models? For example, YOLO and a CNN.

    • @hudata
      @hudata 5 months ago

      Would love to know that as well!

  • @Work-n8h
    @Work-n8h 1 month ago

    Can you make a video on federated learning tutorials, starting from the basics?

  • @sadegh333
    @sadegh333 2 years ago +1

    Thanks for the great content. Is it possible to make an episode about getting data from different sources without copying it to our computer, such as connecting directly to SQL, GitHub, and Kaggle? Thanks

    • @DataMagicAI
      @DataMagicAI  2 years ago

      I haven't tried, but I guess it's possible.
      For example:
      I can access data from a SQL table with
      pandas.read_sql_table()
      You can refer to the API doc below for more details:
      pandas.pydata.org/docs/reference/api/pandas.read_sql_table.html
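
A small sketch of reading SQL data directly into a dataframe. Note that `pandas.read_sql_table()` needs a SQLAlchemy connectable, so this example uses `pandas.read_sql_query()` instead, which also accepts a plain `sqlite3` connection. The in-memory database and the `flights` table are made up for illustration.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for a real SQL server (hypothetical table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (flight_id INTEGER, delay_min REAL)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [(1, 5.0), (2, 12.5), (3, 0.0)],
)
conn.commit()

# Read the query result straight into a DataFrame, with no intermediate file.
flights = pd.read_sql_query("SELECT flight_id, delay_min FROM flights", conn)
conn.close()

print(flights)
```

For a real database server you would swap the connection for one that points at your own host; those details depend on your setup and are not shown here.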

  • @user-uh3rx3yl6r
    @user-uh3rx3yl6r 1 year ago

    Please can you make a video on training a deep learning model using one dataset and testing it on a different dataset? Please note that the two datasets come from different representations.

    • @DataMagicAI
      @DataMagicAI  1 year ago

      Typically we wouldn’t do that; it will give poor performance. All representations should be present during training so the model can learn from them.

  • @regal7548
    @regal7548 4 months ago

    Sir, whenever I download a dataset from Kaggle, I get 6 or 7 datasets included in it, each having different variables. When I tried removing nulls from a dataset, there was a huge number of them, so I can't remove them. For example, an Airbnb dataset from Kaggle. Kindly help me. I'm stuck at the point where I don't know what to do with these datasets. 😢😢

  • @shaminakaushar8421
    @shaminakaushar8421 11 months ago

    Please can you make a video on merging a medical image dataset and a textual dataset of patient symptoms or diagnoses? I want to make a project on generating medical images (e.g., X-rays or MRI scans) based on textual patient symptoms or diagnoses, for educational or diagnostic purposes.

  • @anuragfunde1569
    @anuragfunde1569 1 year ago

    This lecture was amazing and really helps, but I also want to know whether we can merge two models trained on 2 different datasets into one model using the multi-model concept. If it's possible, can you please upload a video on this?

    • @td0rmx
      @td0rmx 1 year ago

      I also would like to know this @Data Magic (by Sunny Kusawa)

  • @hundumamarzo7618
    @hundumamarzo7618 1 year ago

    Thanks, it's a very nice presentation. But how can I combine different features/columns (not records) that share a common feature? E.g., I want to predict employee performance in banking from quantitative parameters such as transactions sold, number of new accounts, etc., per salesman. Here I want to prepare two datasets, a transaction dataset and an account dataset, and finally merge the two per salesman.

    • @DataMagicAI
      @DataMagicAI  1 year ago

      From the transaction data and the accounts data, find the feature which represents the unique employee (for example, it could be an employee ID). Based on this employee ID, merge the total transactions sold and the new accounts created. Your data might have many records of transactions or new accounts per employee; in that case, first bin the data on an hourly, daily, or weekly basis as per your business requirement. If you are doing a real-time project, contact me at datamagic2020@gmail.com and I can consult further.
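
A rough sketch of this idea, assuming both tables carry a hypothetical `employee_id` column. All column names and values are invented, and the time-based binning mentioned in the reply is left out to keep the example short.

```python
import pandas as pd

# Hypothetical transaction and account tables, several rows per employee.
transactions = pd.DataFrame({
    "employee_id": [101, 101, 102, 103, 103, 103],
    "amount": [200.0, 150.0, 300.0, 50.0, 75.0, 25.0],
})
accounts = pd.DataFrame({
    "employee_id": [101, 102, 102, 104],
    "account_id": ["A1", "A2", "A3", "A4"],
})

# Aggregate each table down to one row per employee before merging.
tx_per_employee = transactions.groupby("employee_id", as_index=False).agg(
    total_sold=("amount", "sum"),
    n_transactions=("amount", "count"),
)
acc_per_employee = accounts.groupby("employee_id", as_index=False).agg(
    new_accounts=("account_id", "count"),
)

# An outer join keeps employees that appear in only one table;
# the resulting gaps are filled with 0.
per_employee = tx_per_employee.merge(
    acc_per_employee, on="employee_id", how="outer"
).fillna(0)

print(per_employee)
```

Aggregating first avoids the row explosion you would get from joining many transactions against many accounts for the same employee.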

  • @alndr4u
    @alndr4u 1 year ago +1

    I merged two dataframes side by side; after merging, the right-side values show NaN. Please help.

    • @DataMagicAI
      @DataMagicAI  1 year ago

      The issue is that your column names are not the same in both datasets.
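
One way this symptom can arise, sketched with made-up frames: when rows are stacked with `pd.concat`, columns whose headers differ are not matched and get padded with NaN. For a side-by-side `pd.concat(..., axis=1)`, pandas aligns on the index instead, so mismatched index labels can produce the same NaN pattern. Without seeing the original data it is not possible to say which case applies.

```python
import pandas as pd

left = pd.DataFrame({"Height": [170, 180], "Weight": [65, 80]})
right = pd.DataFrame({"height_cm": [160, 175], "weight_kg": [55, 72]})

# Stacking frames whose headers differ creates four columns padded with NaN.
mismatched = pd.concat([left, right], ignore_index=True)

# Renaming to a shared set of headers first gives two fully populated columns.
right_renamed = right.rename(columns={"height_cm": "Height", "weight_kg": "Weight"})
stacked = pd.concat([left, right_renamed], ignore_index=True)

# Side by side: pandas aligns rows on the index, so reset it on both frames first.
a = pd.DataFrame({"x": [1, 2]}, index=[0, 1])
b = pd.DataFrame({"y": [3, 4]}, index=[5, 6])
side_by_side = pd.concat(
    [a.reset_index(drop=True), b.reset_index(drop=True)], axis=1
)

print(mismatched)
print(stacked)
print(side_by_side)
```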

  • @vlogersadda7625
    @vlogersadda7625 2 years ago +1

    Sir, thank you, but I want to know how to merge datasets if I have multiple targets.

    • @vlogersadda7625
      @vlogersadda7625 2 years ago

      I have a heart disease dataset and a diabetes dataset. Can I combine them into one?

    • @DataMagicAI
      @DataMagicAI  2 years ago

      If you have multiple different target values, that's fine, but your features should be the same.

  • @farzifables_17
    @farzifables_17 1 year ago

    Hello sir, I have a doubt about merging many datasets. I have six datasets; how do I merge them all when the data in each is totally different and only gender is common? How do I merge categorical data? If I merge them, the differing columns show null values. Please tell me how to handle this kind of dataset! 0:26

    • @DataMagicAI
      @DataMagicAI  1 year ago

      If you don’t have common fields in the datasets, then don’t merge them. The datasets might be for the same purpose, but their parameters are altogether different.

  • @feelgoodspace
    @feelgoodspace 1 year ago +1

    thanks a lot

  • @1of999
    @1of999 2 years ago

    Hello, I have been looking for datasets on Kaggle that have a common column or columns to merge and perform EDA on.
    I have had a rough time finding related datasets on Kaggle. Can you please help?

    • @DataMagicAI
      @DataMagicAI  2 years ago

      Below are two height-weight datasets from Kaggle. You may check them out.
      www.kaggle.com/datasets/yersever/500-person-gender-height-weight-bodymassindex
      www.kaggle.com/datasets/tmcketterick/heights-and-weights
      Always keep in mind: whenever you select common columns to merge datasets on, make sure their values are on the same scale.
      For example, if weight is in kg in one dataset, it should be in kg in the other dataset too.
      Let's assume one dataset has weight in kg and the other in pounds. Then you need to convert the weight values in both datasets to either kg or pounds, and then merge the datasets.
      Hope this helps you understand better.
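
The unit check can be sketched like this. The frames and column names below are made up for illustration and are not the actual columns of the Kaggle datasets linked above.

```python
import pandas as pd

LB_TO_KG = 0.45359237  # one international pound in kilograms

# Hypothetical frames: one records weight in kg, the other in pounds.
metric = pd.DataFrame({"person": ["a", "b"], "weight_kg": [68.0, 85.0]})
imperial = pd.DataFrame({"person": ["c", "d"], "weight_lb": [150.0, 200.0]})

# Convert pounds to kg and give the column the same name as in the metric frame.
converted = pd.DataFrame({
    "person": imperial["person"],
    "weight_kg": imperial["weight_lb"] * LB_TO_KG,
})

# Only now stack the two frames into one dataset.
combined = pd.concat([metric, converted], ignore_index=True)
print(combined)
```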

  • @ManveenKaur0902
    @ManveenKaur0902 1 year ago

    Hi, I am doing a project on the impact of climate change on Indian birds. I need to combine 2 datasets: climate change in India (temperature, precipitation), and how it is affecting Indian birds in those regions. Do I need to combine the datasets, or can some other method be used for my ML project?

    • @DataMagicAI
      @DataMagicAI  1 year ago

      If both of your datasets have common features like date and region, then you can merge them.
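
A minimal sketch of merging on date and region, assuming both datasets have columns with those names. The data below is invented purely for illustration.

```python
import pandas as pd

# Hypothetical climate and bird-observation tables keyed by date and region.
climate = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-01-01", "2020-02-01"]),
    "region": ["Kerala", "Punjab", "Kerala"],
    "temp_c": [27.5, 12.0, 28.1],
})
birds = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-02-01", "2020-02-01"]),
    "region": ["Kerala", "Kerala", "Assam"],
    "bird_count": [120, 95, 60],
})

# An inner join keeps only the (date, region) pairs present in both tables.
merged = climate.merge(birds, on=["date", "region"], how="inner")
print(merged)
```

Rows without a partner in the other table (Punjab and Assam here) are dropped by the inner join; `how="left"` or `how="outer"` would keep them with NaN in the missing columns.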

  • @rahmamohammed8541
    @rahmamohammed8541 2 years ago

    Hello, how do I combine different datasets? E.g., I have a sepsis dataset and a jaundice dataset for developing a prediction model. How do I encode them after combining? Some variables may not be relevant to the other disease. How do I solve this problem, please?

    • @DataMagicAI
      @DataMagicAI  2 years ago

      First of all, check whether combining these two datasets makes sense for solving the business problem. Are these datasets for the same purpose, and do they share common features? Only if yes should you combine them. Encoding feature values is not a big deal.

    • @rahmamohammed8541
      @rahmamohammed8541 2 years ago

      @@DataMagicAI
      Yes, they are for the same purpose, to develop a prediction model. They share some features, but there are other features they don't share. How do I combine and then encode them in a CSV file?
      Thank you

  • @soubhikbandhyopadhyay8160
    @soubhikbandhyopadhyay8160 24 days ago

    How do I feed null data to a time series ML model?

    • @DataMagicAI
      @DataMagicAI  22 days ago

      Please check out our time series analysis playlist. There are a few libraries which impute null values by themselves. You can also fill these null values with valid values.
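
As one possible way to fill nulls yourself, here is a sketch using pandas' own `ffill` and `interpolate` on a made-up daily price series. Which method is appropriate depends on the data; this is not necessarily what the playlist covers.

```python
import numpy as np
import pandas as pd

# Hypothetical daily price series with gaps.
prices = pd.Series(
    [10.0, np.nan, 14.0, np.nan, np.nan, 20.0],
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
)

# Option 1: carry the last known value forward.
forward_filled = prices.ffill()

# Option 2: interpolate linearly with respect to the timestamps.
interpolated = prices.interpolate(method="time")

print(forward_filled)
print(interpolated)
```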

  • @vlogersadda7625
    @vlogersadda7625 2 years ago

    I have a heart disease dataset and a diabetes dataset. Can I combine them into one?

    • @DataMagicAI
      @DataMagicAI  2 years ago

      If both datasets have the same features, or at least most of the features are the same, then you can combine those shared features.
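
A sketch of combining only the shared features. The two frames and their columns are hypothetical, and the extra `source` column is an addition for traceability rather than something the reply asks for.

```python
import pandas as pd

# Hypothetical frames that overlap on some columns only.
heart = pd.DataFrame({"age": [54, 61], "bmi": [27.1, 30.4], "cholesterol": [240, 260]})
diabetes = pd.DataFrame({"age": [45, 50], "bmi": [31.0, 28.7], "glucose": [150, 135]})

# Columns present in both frames, in the order they appear in the first one.
shared = [col for col in heart.columns if col in diabetes.columns]

# Stack only the shared columns and record where each row came from.
combined = pd.concat(
    [
        heart[shared].assign(source="heart"),
        diabetes[shared].assign(source="diabetes"),
    ],
    ignore_index=True,
)
print(combined)
```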

  • @vineethreddy2960
    @vineethreddy2960 1 year ago

    Hey, what if the columns of the two datasets differ entirely?

    • @DataMagicAI
      @DataMagicAI  1 year ago

      If your target values are the same, then you can merge the features of both datasets for the same target values.
      If the targets are also different, then these two datasets are for different representations; don’t try to merge them.

  • @Sandyyy143
    @Sandyyy143 8 months ago

    Bro, I merged two datasets... can I have that as a copy?

    • @DataMagicAI
      @DataMagicAI  8 months ago

      Yes. Save it and store it somewhere.
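
Saving a merged dataframe to disk might look like the sketch below. The dataframe, file name, and temporary directory are placeholders; use whatever path suits your project.

```python
import os
import tempfile

import pandas as pd

# Stand-in for the merged dataframe held in a variable.
merged = pd.DataFrame({"id": [1, 2], "value": [0.5, 1.5]})

# Write it to a CSV file; index=False leaves out the row index column.
out_path = os.path.join(tempfile.gettempdir(), "merged_dataset.csv")
merged.to_csv(out_path, index=False)

# Read it back to confirm the external copy matches.
reloaded = pd.read_csv(out_path)
print(reloaded)
```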

    • @Sandyyy143
      @Sandyyy143 8 months ago

      @@DataMagicAI I merged the datasets by following your video, but the result is stored in a variable. Now I need that dataset as an external copy. How can I get this?