DataAnalyticsWizardry
  • Videos: 14
  • Views: 26,422
Apply a function across multiple columns with the across function
Get data here to follow along:
www.kaggle.com/dgomonov/new-york-city-airbnb-open-data
Description:
The across function makes it easy to apply a function or transformation to multiple columns of a dataset.
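A minimal sketch of the idea, assuming dplyr 1.0 or later. The `listings` tibble is a made-up stand-in for the Airbnb data, and its column names and values are illustrative assumptions, not taken from the video.

```r
library(dplyr)

# Toy stand-in for the listings data (hypothetical columns and values).
listings <- tibble(
  room_type = c("Private room", "Entire home/apt", "Private room"),
  price = c(100, 250, 80),
  minimum_nights = c(1, 3, 2)
)

# Apply one transformation to several named columns at once.
doubled <- listings %>%
  mutate(across(c(price, minimum_nights), ~ .x * 2))

# Pick the columns by type instead of by name and summarize each one.
means <- listings %>%
  summarize(across(where(is.numeric), mean))
```

Selecting with `where(is.numeric)` keeps the call unchanged when more numeric columns are added later.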
Views: 1,421

Videos

Mutate a column in R with the case_when function
4.7K views · 2 years ago
Get data here to follow along: www.kaggle.com/dgomonov/new-york-city-airbnb-open-data
Description: case_when is a handy function to know because it lets us test multiple if_else-style conditions in one statement.
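A small sketch of case_when inside mutate. The `price` values, the cut-offs, and the `price_band` labels are assumptions chosen for illustration only.

```r
library(dplyr)

# Toy data (hypothetical prices, including one missing value).
listings <- tibble(price = c(45, 120, 300, NA))

# Conditions are checked top to bottom; the first one that is TRUE wins.
listings <- listings %>%
  mutate(price_band = case_when(
    is.na(price) ~ "unknown",
    price < 100  ~ "budget",
    price < 200  ~ "mid-range",
    TRUE         ~ "premium"   # fallback for everything else
  ))
```

Putting the `is.na()` check first keeps missing prices from falling through to the fallback branch.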
Summarize Data using the Summarize function in R
359 views · 2 years ago
Get data here to follow along: www.kaggle.com/dgomonov/new-york-city-airbnb-open-data
Description: The summarize function lets us compress our dataset into a single row of meaningful metrics such as the mean, median, standard deviation, interquartile range (IQR), and others. If summarize is used together with group_by, those metrics are generated for each group.
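The difference between the ungrouped and grouped forms might look like this. The tibble below is a hypothetical stand-in for the listings data, not the video's actual code.

```r
library(dplyr)

# Toy data (hypothetical room types and prices).
listings <- tibble(
  room_type = c("Private room", "Private room", "Entire home/apt", "Entire home/apt"),
  price = c(60, 100, 150, 250)
)

# Without group_by: the whole dataset collapses to a single row.
overall <- listings %>%
  summarize(
    mean_price = mean(price),
    median_price = median(price),
    sd_price = sd(price),
    iqr_price = IQR(price)
  )

# With group_by: one row of metrics per room type.
by_room <- listings %>%
  group_by(room_type) %>%
  summarize(mean_price = mean(price), n = n())
```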
Mutate a column in R with the mutate function
2.2K views · 3 years ago
Get data here to follow along: www.kaggle.com/dgomonov/new-york-city-airbnb-open-data The mutate function helps with the creation of new columns or modification of existing columns in R. This is a guide showing a few examples of using the dplyr mutate function in R.
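As a rough sketch, creating a column and overwriting one can happen in the same mutate call. The column names and the conversion factor are illustrative assumptions.

```r
library(dplyr)

# Toy data (hypothetical columns).
listings <- tibble(price = c(100, 250), minimum_nights = c(2, 3))

listings <- listings %>%
  mutate(
    min_stay_cost = price * minimum_nights,  # new column from two existing ones
    price = price / 100                      # existing column overwritten in place
  )
```

Arguments are evaluated in order, so `min_stay_cost` is computed from the original `price` before `price` is rescaled.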
Slice Function in R (for Data Subsetting)
1.5K views · 3 years ago
Get data here to follow along: www.kaggle.com/dgomonov/new-york-city-airbnb-open-data The slice function allows us to narrow down our dataset based on specified row indices. This is a guide showing a few examples of using the dplyr slice function in R.
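A few index-based subsets, sketched on made-up data. `slice_min` assumes dplyr 1.0 or later and may not be part of the video.

```r
library(dplyr)

# Toy data (hypothetical ids and prices).
listings <- tibble(id = 1:6, price = c(90, 40, 300, 120, 75, 210))

first_three   <- listings %>% slice(1:3)               # rows 1 to 3
without_first <- listings %>% slice(-1)                # everything except row 1
cheapest_two  <- listings %>% slice_min(price, n = 2)  # the two lowest prices
```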
Selecting Specific Columns in R
5K views · 3 years ago
Get data here to follow along: www.kaggle.com/dgomonov/new-york-city-airbnb-open-data The select function allows us to narrow down our dataset based on specified column names or conditions. This is a guide showing a few examples of using the dplyr select function in R.
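Selecting by name and by condition could be sketched as follows. The columns are hypothetical, and `where()` assumes dplyr 1.0 or later.

```r
library(dplyr)

# Toy data (hypothetical columns).
listings <- tibble(
  id = 1:2,
  host_id = c(10, 11),
  host_name = c("A", "B"),
  price = c(100, 250)
)

by_name      <- listings %>% select(id, price)             # explicit names
by_prefix    <- listings %>% select(starts_with("host"))   # name pattern
numeric_only <- listings %>% select(where(is.numeric))     # condition on column type
dropped      <- listings %>% select(-host_name)            # everything but one column
```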
Filtering data in R
181 views · 3 years ago
Get data here to follow along: www.kaggle.com/dgomonov/new-york-city-airbnb-open-data The filter function allows us to narrow down our dataset based on one or more conditions. This is a guide showing a few examples of using the dplyr filter function in R.
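A short sketch of filtering on more than one condition. The data and thresholds are made up for illustration.

```r
library(dplyr)

# Toy data (hypothetical boroughs and prices).
listings <- tibble(
  neighbourhood_group = c("Brooklyn", "Manhattan", "Brooklyn", "Queens"),
  price = c(80, 300, 150, 60)
)

# Comma-separated conditions are combined with AND.
cheap_brooklyn <- listings %>%
  filter(neighbourhood_group == "Brooklyn", price < 100)

# Use | for OR and %in% to match any value in a set.
either <- listings %>%
  filter(neighbourhood_group %in% c("Queens", "Manhattan") | price > 100)
```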
Loading Data from a MySQL Database into R
1.2K views · 3 years ago
Loading data from a MySQL database is fairly similar to loading it from a SQLite database, which we saw in the previous video. We use the RMariaDB package (to connect to MySQL) together with DBI. The rest is similar to how we got the SQLite data, but we specify a couple of additional parameters.
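One way this could look, as a sketch only: the helper name `load_mysql_table` and all of its arguments are hypothetical, and no real server, database, or credentials are implied. It assumes the RMariaDB package is installed when the function is actually called.

```r
library(DBI)

# Hypothetical helper: the caller supplies every connection detail.
load_mysql_table <- function(table, dbname, host, username, password, port = 3306) {
  # Compared with SQLite, the connection needs host, port, and credentials.
  con <- dbConnect(
    RMariaDB::MariaDB(),
    dbname = dbname,
    host = host,
    port = port,
    username = username,
    password = password
  )
  # Close the connection even if reading the table fails.
  on.exit(dbDisconnect(con), add = TRUE)

  # Read the whole table into an in-memory data frame.
  dbReadTable(con, table)
}
```

Wrapping the connection in a function keeps credentials out of the global environment and guarantees the disconnect.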
Loading Data from a SQLite Database into R
1.5K views · 3 years ago
Here is an example of how to load data from a SQLite database into R. We will use the RSQLite and DBI libraries to make the connection and then collect the data into memory.
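A self-contained sketch using an in-memory database so that it runs without a file on disk. The table name and columns are assumptions, and `dbGetQuery` is used here in place of whichever collection step the video shows.

```r
library(DBI)

# Connect to a throwaway in-memory SQLite database (a file path would go here instead).
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Create a small hypothetical table to query.
dbWriteTable(con, "listings", data.frame(id = 1:3, price = c(100, 250, 80)))

# Run a query and pull the result into memory as a data frame.
listings <- dbGetQuery(con, "SELECT id, price FROM listings WHERE price >= 100 ORDER BY id")

dbDisconnect(con)
```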
Loading Multiple Excel Files into R
7K views · 3 years ago
How to load multiple Excel files into R using a combination of list.files, map_df, and read_excel.
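The combination might be sketched like this, assuming every workbook in the folder has the same columns on its first sheet. The helper name `load_excel_folder` is hypothetical, and `full.names = TRUE` is used here as a shortcut for building the paths rather than as the video's exact steps.

```r
library(purrr)
library(readxl)

# Hypothetical helper: read every .xlsx file in `folder` and stack the rows.
load_excel_folder <- function(folder) {
  # full.names = TRUE returns paths that read_excel can open directly.
  files <- list.files(folder, pattern = "\\.xlsx$", full.names = TRUE)

  # map_df reads each file and row-binds the results into one data frame.
  map_df(files, read_excel)
}
```

Calling `load_excel_folder("path/to/folder")` on a folder without any .xlsx files returns an empty data frame rather than an error.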
Reading an Excel File into R
191 views · 3 years ago
How to load an Excel file into R using the read_excel function from the readxl library.
Reading CSV files into R
186 views · 3 years ago
Get data here to follow along: www.kaggle.com/dgomonov/new-york-city-airbnb-open-data How to read CSV files into R with the read_csv function from readr (tidyverse).
How to create an interactive Lineplot in R
233 views · 3 years ago
Step-by-step tutorial to create a line plot in R. We create an interactive plot of daily COVID-19 cases in the US.
Data Source: ourworldindata.org/coronavirus-source-data
More details on ggplot: ourworldindata.org/coronavirus-source-data
Examples of geom_line: plotly.com/ggplot2/geom_line/
How to create an interactive scatterplot in R
208 views · 3 years ago
Creating scatterplots in R: how to create formatted, presentable scatterplots with the help of the ggplot and plotly libraries. Scatterplots are a great way to visualize the relationship between two numeric variables.

Comments

  • @nuestraaula1991 · 6 months ago

    Thank you so much, sir. I really needed this trick! God bless you! Greetings from Barranquilla, Colombia!

  • @lixinli3858 · 8 months ago

    Great video! I have a question: how can I create the data_source column with the name of each file (instead of file1, file2, etc)?

    • @dataanalyticswizardry8085 · 8 months ago

      Yes, that is possible. I typed the vector of file names manually as an example. You can replace that step with code that retrieves your actual file names and then assign them with names(files), as I show.
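The reply above could be sketched roughly as follows. CSV files written to a temporary folder stand in for the Excel files so that the example runs on its own; with workbooks, `read_excel` would take the place of `read_csv`. The folder and file names are made up.

```r
library(purrr)
library(readr)

# Create two small hypothetical files so the sketch is runnable.
folder <- file.path(tempdir(), "data_source_demo")
dir.create(folder, showWarnings = FALSE)
write_csv(data.frame(x = 1:2), file.path(folder, "january.csv"))
write_csv(data.frame(x = 3:4), file.path(folder, "february.csv"))

# Retrieve the paths, then name each path after its own file.
files <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)
names(files) <- basename(files)

# .id stores the name of each element in a data_source column.
combined <- map_df(files, read_csv, .id = "data_source")
```

Wrapping the names in `tools::file_path_sans_ext()` would drop the extension from the `data_source` values.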

  • @oscarsibanda9454 · 8 months ago

    Thanks, very helpful

  • @gowingabriel7567 · 1 year ago

    Thanks a lot

  • @Tony_Toni_Tone · 1 year ago

    Thank you so much. Your channel is amazing

  • @ayaqz3144 · 1 year ago

    thank you

  • @lukeward1403 · 1 year ago

    Hi, thank you for the tutorial. It was very helpful. Quick question: how would this work if one of the Excel files, say "file 2", had multiple sheets of data within it?

  • @thissatori · 1 year ago

    Great video thank you! Quick question, if my files are UTF-8, how can i tell it the character encoding? I have a bunch of UTF-8 CSVs and I cannot find the answer anywhere...

    • @dataanalyticswizardry8085 · 1 year ago

      You can specify the encoding as shown below. Hope this helps.
      map_df(files, read_csv, locale = locale(encoding = 'UTF-8'))

  • @danielchapilliquen6777 · 1 year ago

    Hello, I got an error when loading the data into R. The message is: Error in utils::unzip(zip_path, list = TRUE). The source is xlsx.

  • @ken799232006 · 1 year ago

    Thank u sir

  • @ankursrivastava7160 · 1 year ago

    Hi Sir got my data loaded but is it same as data framing?

  • @Sarbasttt · 2 years ago

    Great, Thank u sir

  • @avnistar2703 · 2 years ago

    Thanks! Really nicely explained!

  • @jasonlu2596 · 2 years ago

    So helpful, thanks so much!

  • @Moccalocca100 · 2 years ago

    Connections using insecure transport are prohibited while --require_secure_transport=ON.

  • @RealProDatascience · 2 years ago

    Thank you for this amazing tutorial. I already subscribed to your channel.

  • @ssingh597 · 2 years ago

    Thanks for the video

  • @erkamcetkin8668 · 2 years ago

    excellent efficient explanation. thank you so much.

  • @syah7991 · 2 years ago

    Thank you!

  • @mehmetkaya4330 · 2 years ago

    Thank you so much!

  • @ChristopherFNeto · 2 years ago

    Thank you, very helpful video. Is there a reason why you broke out the concat into a different step instead of using the 'full.names = TRUE' argument for list.files?

  • @18_avishkar_hande19 · 2 years ago

    great video ! but my code runs into an error :cannot allocate vector of size 10.1 mb

    • @dataanalyticswizardry8085 · 2 years ago

      Hi, if you have a lot of processes running, closing some of them might help. Can you try running garbage collection before you run the code that I showed? Just execute the following command:
      gc()
      You can also check the memory limit that R is allowed to use with the following command:
      memory.limit()
      If possible, increase it like this:
      memory.limit(size = 1800)

  • @asirintisar5504 · 2 years ago

    Thanks a lot for this helpful video