How To Read And Process Huge Datasets in Seconds Using Vaex Library | Data Science | Machine Learning

  • Published: 16 Sep 2024

Comments • 48

  • @harshtamkiya8505
    @harshtamkiya8505 4 years ago +8

    This is close to madness. Wonderful library 👏

  • @gurdeepsinghbhatia2875
    @gurdeepsinghbhatia2875 4 years ago +1

    Vaex: I first heard of it from your video, sir. Awesome, sir. Thanks for providing such useful knowledge.

  • @varagantimanjula3135
    @varagantimanjula3135 4 years ago +1

    I heard about Vaex for the first time in this video. I have heard about Dask; as mentioned, please do a video on Dask as well. Thank you, Krish, for everything you are doing for the community.

  • @shivanandprasad5747
    @shivanandprasad5747 4 years ago

    Thanks, Krish. You are my idol. My interest in data science is just because of you.

  • @sandipansarkar9211
    @sandipansarkar9211 4 years ago +1

    Great explanation. Need to practice in a Jupyter notebook. Thanks.

  • @waqarsarwar7012
    @waqarsarwar7012 4 years ago +3

    waiting

  • @saumojitbhattacharjee7292
    @saumojitbhattacharjee7292 3 years ago +1

    Hi! Can you make a video on Dask for extracting a 40 GB+ dataset?

  • @suneelkumarilli8311
    @suneelkumarilli8311 4 years ago +2

    Dear Krish Naik, I tried sorting the data, but I am getting the error: 'Hdf5MemoryMapped' object has no attribute 'sort_values'.

  • @devkaranjoshi816
    @devkaranjoshi816 3 years ago +1

    Can we further pass this data into model training?

  • @dadoll1660
    @dadoll1660 4 years ago +1

    Thanks, Krish! What do you hate about this library? What do you think could be improved?

    • @krishnaik06
      @krishnaik06 4 years ago +1

      Will discuss while comparing with Dask.

  • @souravbiswas6892
    @souravbiswas6892 4 years ago +1

    Hi, please create a detailed playlist on PySpark. Thanks in advance.

  • @nocode659
    @nocode659 4 years ago +2

    PLEASE TALK ABOUT THE IMPORTANCE OF COMPETITIVE PROGRAMMING!

  • @ashishbhatnagar9590
    @ashishbhatnagar9590 4 years ago +1

    Sir, please complete the remaining Docker tutorials.

  • @user-rp7rz9qz2n
    @user-rp7rz9qz2n 4 years ago +1

    Please, sir, why is your book still not available on Amazon? I really need to buy it. Thank you.

  • @bcr5430
    @bcr5430 4 years ago +1

    When the data is too large, like 4 GB, it saves it in chunks of HDF5 files. Any idea how to read these? Should we read them one by one and concatenate them?
    Also, how do we deal with the object datatype when it comes to HDF5 files?

  • @ishwarashar8503
    @ishwarashar8503 4 years ago +1

    Hey! I have already installed Vaex successfully, but when I try to convert the CSV file it shows the error "No module named 'vaex.hdf5'".

  • @sreevanii2570
    @sreevanii2570 2 years ago

    While converting a text file to HDF5 using Vaex, it does not maintain the same precision for each column as in the text file, and I cannot modify it. Please help me.

  • @j.p.brochu8592
    @j.p.brochu8592 3 years ago +1

    pip install vaex does not work on my Windows 10 computer (with Python 3.8). It gives me errors. Has anybody else had a hard time installing Vaex?

    • @datareactor4143
      @datareactor4143 3 years ago

      I get the error message "numpy.core.multiarray failed to import" after running the first set of code.

  • @anonyme103
    @anonyme103 4 years ago +1

    For comparison, you could've used %time in each cell
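
For readers who want to try this suggestion: `%time` is an IPython/Jupyter magic that reports how long a single statement takes. Outside a notebook, a rough equivalent can be sketched with the standard library. The helper `timed` and the stand-in workload `sum_of_squares` below are hypothetical names chosen for illustration, not something shown in the video; swap in the pandas or Vaex call being compared.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def sum_of_squares(n):
    # Stand-in workload; replace with the call you want to measure.
    return sum(i * i for i in range(n))


result, elapsed = timed(sum_of_squares, 100_000)
print(f"sum_of_squares took {elapsed:.4f} s")
```

A single run is noisy, so for a fairer comparison it may help to repeat each measurement a few times (in a notebook, `%timeit` does this automatically).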

  • @DatascienceConcepts
    @DatascienceConcepts 4 years ago

    Great!

  • @abhishekjn3390
    @abhishekjn3390 4 years ago +1

    👍😊 But why do we have to install it in a virtual environment?

    • @parthagarwal4592
      @parthagarwal4592 3 years ago

      Install in your base environment.
      But I think that would fill up your C drive.
      So it's better to do so on some other drive.

  • @pandharpurkar_
    @pandharpurkar_ 3 years ago +1

    What does chunk_size mean?
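
For readers with the same question: when a large CSV is read or converted in pieces, a chunk size generally means the number of rows held in memory per batch. Assuming that is what the parameter means here, the idea can be sketched with the standard library alone. This is not Vaex's implementation, and `read_in_chunks` is a hypothetical helper written for illustration.

```python
import csv
import io
from itertools import islice


def read_in_chunks(file_obj, chunk_size):
    """Yield lists of at most chunk_size rows (as dicts) from a CSV file object."""
    reader = csv.DictReader(file_obj)
    while True:
        # Pull the next chunk_size rows; only this batch is in memory at once.
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            break
        yield chunk


# Tiny in-memory CSV standing in for a large file on disk.
sample = io.StringIO("id,value\n1,10\n2,20\n3,30\n4,40\n5,50\n")

chunk_lengths = [len(chunk) for chunk in read_in_chunks(sample, chunk_size=2)]
print(chunk_lengths)  # prints [2, 2, 1]
```

A larger chunk size means fewer batches but more memory per batch; a smaller one trades speed for a lower memory footprint.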

  • @prashanthardikar9895
    @prashanthardikar9895 3 years ago

    Hi Krish, I want to concatenate two CSV files, and I also want to edit a column value in each file before concatenating. Each file is about 200 MB, with 1200 columns and 35000 rows. There are about 90 such pairs of CSV files that I want to concatenate. Can this be done faster with Vaex than with pandas? I did not see any improvement when I tried Dask. Thanks.

  • @waqarsarwar7012
    @waqarsarwar7012 4 years ago +2

    shout out

  • @satyabansahoo6075
    @satyabansahoo6075 4 years ago

    Sir, Modin also... It makes pandas faster. Please make a video on that.

  • @singhamitgkv7709
    @singhamitgkv7709 4 years ago +1

    Sir, this library does not import in Colab.

    • @j.p.brochu8592
      @j.p.brochu8592 3 years ago

      I have the same problem in Jupyter Notebook on Windows 10 with Python 3.8. It gives me dependency issues.

  • @jagadeeshkrishnamurthy1971
    @jagadeeshkrishnamurthy1971 4 years ago +1

    Will it read files from Hadoop?

  • @abubakarsunny5992
    @abubakarsunny5992 3 years ago

    But how can I drop a column or row in the Vaex library?

  • @gokulpisharody3155
    @gokulpisharody3155 4 years ago

    When will the next batch of Advance DL & NLP start? I want to join that batch. I am currently in the ML Master course at iNeuron.

  • @MS-ry9im
    @MS-ry9im 3 years ago

    It won't work with vectorization.

  • @amirahb3329
    @amirahb3329 3 years ago

    How is this possible?

  • @raparthipoornachanderrao5121
    @raparthipoornachanderrao5121 3 years ago +2

    Could you do a video on lux-api?

  • @datareactor4143
    @datareactor4143 3 years ago

    Did anyone get the error message "numpy.core.multiarray failed to import"? Can someone help solve this error?

  • @megdesouza1611
    @megdesouza1611 3 years ago

    Can we read a file that resides on a remote desktop using vaex.open?

  • @vivekpuurkayastha1580
    @vivekpuurkayastha1580 4 years ago

    why is there no internship directly?

  • @priyankapradhan4539
    @priyankapradhan4539 4 years ago

    Can it be used to read a dataset of 1M images?

  • @AkashKumar-mb4pd
    @AkashKumar-mb4pd 4 years ago

    CAN SOMEBODY EXPLAIN PRE-PROCESSING AND WHAT STEPS IT INVOLVES?

  • @vishalkatoch8699
    @vishalkatoch8699 4 years ago

    Sir, please make videos in Hindi.

  • @bhavyaparikh6933
    @bhavyaparikh6933 4 years ago

    Why are you vibrating?