For a long time, I haven't seen such an informative presentation. You just made an inventory without explaining how the specific language features make the difference.
I use and teach both. In fact I am teaching one class in R right now and one class in Python. I've also helped establish professional organizations with data science frameworks. This video is actually very in line with what I think. I've had a hard time teaching Python to people who have only used Excel, R is easier. If I'm doing a complex analysis requiring advanced probability concepts, I work in R. But I see why professional companies have leaned more towards Python. Its infrastructure makes Python better for security, reproducibility, consistent AI for apps and machinery, working with streaming, graphical, or network data, and text mining.
My view is slightly different: R, and more specifically the Tidyverse (packages / R dialect), is the clear leader in supporting fully reproducible workflows where the pipeline ends with reporting - reporting includes everything from interactive web applications all the way down to static documents. The Python community has not been as strongly engaged in the reproducibility discussion outside of enterprise execution pipelines. They simply stop before reporting and that stopping point is premature. (Edit: typo misspelling corrected)
Hi Robert. I'm one of the 'people who have only used Excel'. Can you tell me why learning R is easier for people like me? (FYI, I'm learning Python as my first language at the moment.)
@@Kei3th1424 Why is R easier? One of my observations is that in Excel, you're somewhat writing code when you type out formulas that contain functions. The way you work with data in R is very similar in terms of the functional approach, whereas if you go to Python the syntax will be a bit more alien, with a heavier emphasis on object-oriented design and, quite frankly, a less intuitive way of working with data, in pandas especially, which makes it hard to learn for first-time programmers.
@@hansisbrucker813 Julia still has to prove mass adoption. R is the successor to S and thus much older than Python, and statisticians publish new algorithms in R, whereas Python is taught to many beginners. Both will be around in twenty years. Can you say the same about Julia? There is no doubt that Julia is great at what it does, but so were BASIC, PROLOG and Smalltalk. If Julia is good for what you do with it today, then go for it. If learning a language is a long-term investment of your time and effort, choose something established.
2:55 Jupyter notebooks can support a range of languages. It started as IPython, which was Python-centric. And while Python is still the foundation and implements the Notebook API, there is a range of kernels allowing you to run other languages in notebook cells.
JuPyteR is Ju-lia Pyt-hon and R acronym but... The Jupyter system supports over 100 programming languages (called “kernels” in the Jupyter ecosystem) including Python, Java, R, Julia, Matlab, Octave, Scheme, Processing, Scala, and many more. Out of the box, Jupyter will only run the IPython kernel, but additional kernels may be installed. (c)
R is object orientated as well as functional and using both in both languages makes a lot of sense. Do you really think that any of this makes an important difference between these languages? I don't think these are the decisive differences.
Ok. I'm an R user and when he was explaining Python's pros I was thinking R can do the same. It's unfair mentioning Pandas without mentioning Tidyverse
I use both in my workflow. Speed-wise, I think Python is about twice as fast for general purpose operations (like for loops), but in practice the performance depends largely on the packages used.
Fine up to 5:40, then you lost me. The base R language reads CSV and JSON natively. R's equivalent to pandas is tidyverse, except the R base language supports data frames natively. Also data.table and pandas have almost equivalent syntax. NumPy is not in the native Python language, you Numpty. It's installed as a module. No mention of R Markdown, R Notebooks or Quarto. It's like Jupyter except without the bonkers decision to use JSON as the raw file format. No one prototypes in one language and productionises in the other. There's no more certain way to double your time to delivery and halve your code quality.
This is how I feel whenever this debate comes up, a lot of people think R just does 1 or 2 things really well. But, the reality is packages like Shiny, Rmarkdown, and some ML libraries are misunderstood or just ignored for how good they are (community and documentation). Also, I have rarely had problems with learning a new R package that works in the base R code format, because everyone uses a similar coding format. Whereas Python packages are harder to pick apart, because people base their coding formats on different formats.
@BaroDrinksBeer Shiny is a criminally underrated part of R, along with Plumber APIs. I managed to create a significant saving for a company by making them cut an expensive PowerBI license and substitute all the dashboards for Shiny on a cloud server. Worked like a charm. Reactive modules with pre trained ML models in .rds files, with a daily retraining routine for the new data using cronjobs.
Starting with R is better, but for a weird reason, that is: "more incentive to learn python and other things". I suspect, most who started with Python are less likely to learn R because they think Python can handle "everything".
Started with Python for data analysis. Later tried the R tidyverse. Oh dear, I hate the Python workflow now: pandas, numpy, seaborn; it all feels like an awful crutch.
Specialists usually go for R: economists, biologists, epidemiologists et al., those whose domain knowledge requires statistics. In economics there is no question you go for R due to the packages for economics. If stats and the associated data are a means to an end, then it's R. If you're a programmer who primarily collates and collects data with basic processing, then likely it's Python.
This is correct. We work with chemicals and design experiments all day long for optimization purposes. R has so much functionality in what we chemists know as packages. The science behind our chemical technology is deeply rooted in research work based on Response Surface Methods and Mixture Designs. We're no programmers. We're scientists. And we analyze our data and design our experiments with R.
For data exploration in R, the {dplyr} and {tidyr} packages are excellent for subsetting and wrangling. I think this approach is superior to pandas in Python. In fact the grammar and semantic approach to data wrangling enabled by {dplyr} is a foundational reason to use R over Python (i.e. pandas). These two libraries (dplyr, tidyr) along with {ggplot2} make up the core of the Tidyverse approach to using R. This approach is so well thought out that it often makes me chuckle when people suggest or argue that Python is better. Clearly there are cases where Python is better, just as there are cases where R is better. But when it comes to analytics and reproducibility I've not yet seen a dispositive argument in favor of Python.
Python is much better for readability. It's too easy to write opaque R code. Python also has features like dictionaries, comprehensions and lambda functions that make coding more pleasant. R also sucks with data outside of dataframes. Matrices are much harder to work with than numpy arrays.
Tidyverse is a collection of libraries. It isn't even right to compare Pandas to Tidyverse. In this respect, you'd be better off comparing Tidyverse to Anaconda. It's really just a weird claim when matplotlib, pandas, numpy, seaborn, scikit-learn, pycaret and yellowbrick all have incredible documentation. Not to mention, the combination of "Read the Docs" and "Sphinx" makes open source package documentation a breeze. Why? Because good programming requires good inline documentation. Both of these tools make it incredibly easy to take your inline documentation and build/distribute it to the open source community. It's just weird to say Python libraries don't have great documentation when it's just wrong. A preference for functional programming through dplyr and tidyr is not enough to say "R is better for data analysis." In Pandas, there too is support for a more functional approach through the pipe method. Additionally, dplyr and tidyr combined do not have anywhere near the number of methods available for data manipulation. On a more important note about Pandas pipe, it allows user-defined functions to apply transformations to dataframes or series. Why is this important? Because it provides better encapsulation, more options for the developer, and improved readability. To be clear, you cannot, out of the box, combine dplyr's (or tidyr's) select, mutate, group_by, etc. with a custom function via the pipe operator. dplyr has a not-so-friendly workaround for providing this functionality, where the user needs to search around and figure out that they need a parameter called ".data", which is something very specific to dplyr's implementation and not at all obvious. This is a huge limitation when the developer has to take their work to a production setting. Or, even more simply, share their work with others and debug their code themselves. Don't even get me started on the data-masking that the tidyverse paradigm has created in the R community. 
While it makes data operations more concise, it comes at the expense of polluting the name space of the environment making the developer blind to what variables have been created in global scope increasing the risk of collisions.
Use the one you know. I don't find R much better for stats than scipy, or ggplot better than seaborn. But I do find Python better for environment management, package development, and non-data-science projects like web scraping.
This is me, chuckling. I imagine I've seen a proportionally equal amount of opaque and unreadable Python code as you may have seen opaque R code. I also imagine, and you might agree, that good coding is partially art and partially experience. So, if you have a language that is good for beginners, you're probably gonna see some bad code written in that language, no? Anyway, yes, you can write anonymous functions in R -- totally can. Chuckling again that the claims, without evidence, about Python dictionaries, comprehensions, or matrices are supposed to be dispositive in favor of Python. At best it comes down to use-cases. But really, OP's comment -- "choose the language of your community" -- is spot on. There seems to be no end of python-fan vituperative comments using straw-man criteria meant to express strengths when they actually express personal bias -- typically expressed alongside often outdated and just-plain-wrong claims about R. In any case, my comment was aimed at OP's comment about R's utility in data exploration. My position is that R (AND Tidyverse) is clearly better than OP's video suggests (somewhere in the middle-end of the video timeline). Beyond that, OP's recommendation to use R for beginners AND advanced users is insightful to say the least. Clearly a strong and sound suggestion. Meanwhile, I can't stop chuckling at this response. I mean, just no. But, look, it's totally fine to have a preference. Not convincing, but if your community digs Python, keep doing you.
@@BingbongRecto R's equivalent functions to Python's lambda functions are: apply, lapply, sapply, vapply, tapply. Followed by mclapply (multi-core lapply, i.e. parallel processing).
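For comparison, here is a minimal Python sketch of the pattern this reply points at. As far as I can tell, the closer analogy is that R's apply family plays the role of Python's `map` (a function that applies another function to each element), while `lambda` corresponds to an anonymous `function(x)` passed into it. The variable names are purely illustrative.

```python
values = [1, 2, 3, 4]

# lambda creates the anonymous function; map applies it to every element,
# which is roughly the role lapply/sapply play in R.
squares = list(map(lambda x: x * x, values))

# The same result with a list comprehension, the more common Python idiom.
squares_again = [x * x for x in values]

print(squares)
print(squares == squares_again)
```

Either form works; which one reads better is largely a matter of taste.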
I started this video being confused about which to learn and use. The rest of the video (until the end) left me still confused but with more information to justify my confusion. If I understood the very end correctly, I should learn and use both.
The thing about Python is that it completely half asses both the OOP and functional paradigms, meaning that while being multi-paradigm in theory, I think most Python programmers are better off sticking to good procedural code with common sense applications of other paradigms. Note that I'm not saying that's a bad thing. There's a reason Python is my go to language.
The answer to this post has certainly changed with the advent of Shiny and then again with the advent of webR. We should all get used to the two languages growing closer together.
The main difference is that Python is a general-purpose language and R is a niche language. R was not created to be used by computer scientists; it was created to help other scientists do their computer stuff. On the other hand, Python is more focused on computer scientists but also on the general public, who may just want to write a simple script to automate boring stuff. It is difficult for one language to be better than another; they are normally just designed for different purposes. In my case, as a software engineer, I feel more confident with Python. But this is because I normally don't have to do specialized analysis in a specific field such as economics, biology, etc. For this reason pandas, Spark or others are enough for me, and I am able to write the other utilities (such as parsers, command line interpreters, etc.) that I need in my application (which is software used by someone else, not a paper or an analysis).
R is the best for data analysis: any ad hoc work, visualisation, anything involving data tables / frames. Python is good for complicated apps and ETL. In Python you need to spend way more time writing code for the same analytics tasks. Regarding CSV files: it is even easier to work with CSVs in R.
I recently started learning R for a "Quantitative Business Analysis" course I'm taking for my MBA. Really wish I knew about R when I was studying chemistry for undergrad. Thanks for the video!
Earlier this year, I searched for any language among the recent public favorites that would support reading fresh data from measuring instruments. I mean direct hardware port connectivity. Tough task! All of them seemed to be happy to just start manipulating already existing data, Excel and such. The third book I read about Python mentioned there is a library to provide port connectivity, but did not really elaborate. My old favorite, FORTH dropped out of competition when Windows started denying direct access to hardware. Quick Basic worked for a while longer. And after that, you needed drivers, possibly provided by the instrument supplier, if you were lucky. That is what Visual Basic depends on.
That sounds like a C++ job. You can use C++ inside your R code with the Rcpp package, and similar things will probably exist for Python. Nevertheless, there are places for compiled languages even in modern times.
I can't understand why nobody mentions in these debates that R's data.table syntax is basically the same as Python's pandas, with the gigantic upside of requiring half the amount of code to perform the same task. While I get that R is a specialized language and Python is general purpose (and thus has a much bigger community), people always make the completely subjective point that "python is easier to learn, much easier syntax". I can only imagine that this kind of mindset comes from people who believe that R has no options besides its base implementation (while simultaneously assuming that numpy and pandas come as default). Tidyverse R is basically as clear and friendly as SQL ("select", "group by"...). There is no way that pandas is more newbie friendly than this for Python to be the "go-to language for data exploration" and, even if that's the case, once again, R data.table is very similar and requires much less input. With all that being said... I use both languages, and although my preference is clear, there are some situations where one is better suited than the other. I think they should be treated more as complementary than rival tools.
Not true. I do the entire end to end process using R. RSelenium for scraping information. Tidyverse for cleaning data and exploratory data analysis. Tidymodels for machine learning. Shiny for production and putting the analysis on the web for other people.
For early analysis use R, then transport your work to Python, then optimize the result with C. In this context I would use Julia from start to the end.
I found R to be really difficult since it does not follow traditional programming logic, since it is very math based. Do enough years of C++, Perl, Python, C, Java etc... R just seems weird
R was created with the aim of making it easier for humans, or data analysts, to perform computations, not to make it easier for computers to do their job. So the logic is closer to how humans think. For example, indexing starts from 1, just as you start counting from 1, right?
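To make the indexing difference concrete from the Python side, a small sketch: Python counts positions from 0, but `enumerate(..., start=1)` gives human-style numbering when you want it. The list contents are made up for illustration.

```python
letters = ["a", "b", "c"]

first = letters[0]                 # Python positions start at 0
last = letters[len(letters) - 1]   # so the last position is length - 1

# enumerate(..., start=1) produces 1-based numbering for display purposes.
numbered = [(position, letter) for position, letter in enumerate(letters, start=1)]

print(first, last)
print(numbered)
```

In R the same first element would be `letters[1]`, which is the behaviour the comment describes.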
Some may think R is weird for there is no variable type that can contain only one number, others may think Python is weird for the lack of curly braces around blocks. None of these are real obstacles in learning a programming language.
I have used both R and Python professionally. Here’s my take. R is for SMEs (not programmers). Python is for programmers. That said, you can mostly use both for the same job. The difference between them only becomes significant at the edges.
Or learn just one of these and then something different such as C++, SQL, HTML/CSS, a Shell, VIM or Emacs. Anything that has less overlap and thus brings more new things to the table!
R has no competition in academic research and statistics. The statsmodels framework is not mature enough. For visualization, ggplot2 (R) is more interesting. After that, R and Python are mostly even. With the Quarto CLI and reticulate, you can work with both.
I mean except for the fact that Python can be used to build massive code bases and complicated infrastructure easily, then sure, they are even... Ever wonder why a lot of companies use Python for their code base and nobody uses R?
Python is easier to pick up for a complete beginner IMO. I would definitely pick Python first and then learn R later if needed for more complex statistical analysis.
Have used both for years. My first choice is definitely Python. Running a parallel processing task in R is a nightmare, and I have faced extremely strange bugs in the process. Not to mention that vectorized coding is also not as straightforward in R as in Python. But unfortunately some advanced statistical packages for certain types of data (like single-cell genomics) are not available or not very sophisticated in Python. It's not exactly a problem with Python; it's because most analysts use R and thus mostly develop statistical packages for R. So, overall, I really don't like jumping from one to the other all the time, but I usually have to do it anyway. However, when possible, my first choice is Python.
"You might be expecting a it-depends answer, but no, I'm going to tell you exactly which one to pick" - and then proceeds to give different scenarios for picking each one, in other words IT DEPENDS!
I'm currently learning Python for automation stuff and want to know R only for data analysis. Is this the best approach in your opinion, since you know both?
I have to point out that when you talk about data exploration and don't mention the tidyverse for R it feels a bit lacking... As an R user, I must say that starting with Python is probably better since it's more broadly used and integrates better with other languages, as well as software developers that are using different methods, while R needs to be recompiled in order to work within the organisation
Was looking for this comment about tidyverse. As a statistician, I first learned python because of its general use and overall simplicity. Both are good for beginners either way. Then moved to R for more advanced work. And the more general and computer science-y I go, the more python I use
Don't agree that R is more suited to processing Excel files. 70% of the time I import CSV. It's designed to process Excel, CSV, TXT, JSON, and many more formats.
It's just as easy to work with API calls and JSON in R. I don't have to worry about switching condas in python environments because subversions of packages work for one application and not another. When actually developing code against data it's hard to beat the RStudio IDE.
R is the best programming language as far as data-related work is concerned. I have worked with both Python and R and have found R to be better for data-related tasks, be it machine learning, data science, data analysis or visualization.
This man knows nothing about Python, I assume. Visualisation is so hard and complicated in R. We are talking about Python 3.11 these days. It is a masterpiece that can do anything in data science. You can do NLP using nltk, linear algebra using numpy, any kind of data manipulation or .csv/.txt reading and writing using pandas, and visualisation using matplotlib or seaborn. Anything can be done using Python and, maybe you cannot believe it, but the result is even much better.
Everyone has their own definition of big data. If you mean "bigger than RAM" data - yes, R has you covered. If you think of petabytes, then there are no simple answers. Nothing will deal with that "easily", but who ever gets to process really big data without the database selecting only parts of the data first? Everyone talks about big data but almost nobody ever has to deal with it.😂
I'm using both... Python for personal work and hobbies with data, while I use R for assignments lol... I love both and found that Python requires only a few libraries, fewer than 10, to get any statistical job done, while R requires more than 100 libraries lol😂😂😂
Having 40 years of programming experience and having learned R before Python, I beg to differ (somewhat). Yes, tidyverse is superior to Pandas, but only as long as you hard-code your variables. Yes, there are more statistical functions for R than for Python, but Python is catching up fast. I use statsmodels and, at least in my work, it can do anything an R library could. R is easier to use for elementary tasks, like performing a t-test (it's a one-liner in R), but as soon as you do something more complex, it is easy to get lost in R. Similar for graphics: basic plotting is easier in R, but to get the plot to look as _you_ want it (as opposed to 'as the computer wants it'), Matplotlib and Seaborn beat ggplot any day. Python is easy, logical, systematic, orthogonal, elegant... In R, there are several libraries for computing the ROC curve and none is really satisfactory. In Python, you use sklearn and it works like a charm. I could go on like this forever...
Only time will tell if investing in Julia is worthwhile. For now it is an interesting prospect with an unclear future. It is great if it does exactly what you need now, or as a hobby.
Have to use both. When it comes to graphing R is by far the winner. When it comes to statistical work it becomes very obvious that R was designed to do this. While Python has the problem of dataframes not being a first class datatype which gives you messy and ugly code when you need to make scikit-learn talk to numpy to do just a simple regression. Python wins in the bigger stuff. Large, and complicated machine learning stuff. However with the new Nx, and related, libraries in Elixir we might have a contender that might blow Python out of the water soon.
Bahahahahahaha. Aaahahahahahahagabaha Can't stop laughing lol. Serious ml frameworks/ optimization tools: Torch, Jax, pyro all in python and more every day. Hardly anything in R. Mlops tools: mlflow, airflow, spark all python native. Some small section of support for R sometimes lol. And dont let me start with IDEs. rStudio feels like a tool from 1999. Chatgpt, good got support. Nothing there. rStudio is a joke, worst IDE I have ever seen. R is inconsistent with "quirks" a.k.a. shitty design no wonder CoPilot and ChatGPT have trouble understanding it.
No, there is no question that R is merely an implementation of the S language and thus the older language. The video is just wrong about this. I do not feel that being older is an advantage in a programming language. As a matter of fact R carries a lot of clutter from its age, and I envy the Python people who dared to start anew with Python 3.
I prefer R to Python for 3 main reasons: - The pipe operator in R makes code much easier to build and read (more intuitive) - we just have to import the libraries at the beginning and we don't have to call them to use one of their functions - indexing starts from 1, not 0 (crazy)
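Python has no built-in pipe operator, but the left-to-right style praised here can be approximated. The sketch below uses a hypothetical helper named `pipe` (not part of the standard library or of any package mentioned in this thread) built on `functools.reduce`; the input list is made up.

```python
from functools import reduce

def pipe(value, *steps):
    """Pass value through each function in order, left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)

result = pipe(
    [3, -1, 4, -1, 5],
    lambda xs: [x for x in xs if x > 0],  # keep positive values
    sorted,                               # order them
    sum,                                  # collapse to one number
)

print(result)
```

This reads top to bottom like a chain of R pipes, though it is only an illustration of the idea and not an established Python idiom.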
"There is a healthy debate raging over the best language for learning data science. Many people believe it’s the statistical programming language R. (We call those people wrong.)", said Joel Grus in his book "Data Science from Scratch: First Principles with Python". - I am currently reading the book and when I arrived at the quote, I coulnd't stop laughing on my own. That being said, I'd still choose Python, though💁♂
I always start with R and if I need any help from Python I can just load a helper .py file in my R code and call the Python functions I need directly from R.
I used R as part of the Society of Actuaries certification program and I did not appreciate the time wasted on R. If you plan to do "something", really anything, with your data, such as serving it on a website, storing it in a database, or making it part of a PDF / Word / Excel / PowerPoint file programmatically, start with Python. R shares the same packages as Python tbh, but the choices and updates are just not there. It is certainly a pain to get started with package management in Python, but you will thank yourself later for learning a language that is so versatile. R may be a decent language in and of itself, but you will reach a bottleneck pretty soon and have to relearn everything in Python.
Thank you Toni Kroos. You gave people the pleasure of watching and now you educate people. You're a pure diamond.
dayumm bro, fr, I went back and had a look, ong bro, he can fr go viral istg
He looks more like Ole Gunnar Solskjaer honestly. Super sub that comes in and scores (explains something so easily to me) after a dull goalless game with nothing significant (me spending days stuck being confused).
🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣🤣i was like, where have i seen this face before...seeing this comment made me realize where
THIS SHOULD BE THE STANDARD. Thank you for actually presenting your findings rather than give an "it depends" answer. Great video.
from now on i'm importing numpy as numpty instead of np
They were just trying
I’m just
Going by😮🙃
Oops sorry 2:18
I have 10 years working with R and I have never ever had a problem I couldn't solve with it. From ML to basic data analysis and visualization ( I'm a soil scientist). I program in both but I really like R
Is 37 too late to become a programmer? I want to learn R and Python.
@@elvinmustafov7313 Nope, but you should consider getting good spine insurance; the struggle at 40 is real. Programming languages are not flavors: they are designed to solve certain problems and make money (cough, JavaScript). Python is the easiest language, so if you have never gone down the path of programming, go for Python just to get a taste of it. When it comes to data, ML, and AI, Python is the go-to, but that does not mean R is useless: R is there for statistical learning, visualization, and much more.
@@elvinmustafov7313 As he said, R is easy to get into. Start with C: it will make things easier and give you a better understanding, though it will take a few months. Or just go with R if you have less time.
@@EDMisalive Trying to learn R and Python; this was the advice and opinion of a few programmers that I know.
@@elvinmustafov7313 As a general rule of thumb, personally, it's never too late for anything (maybe it sounds a bit cheesy, but I believe it to be quite true).
In my personal experience, I never used R for statistical analysis haha, only for data gathering and cleansing. I've pulled data from SQL Server, CSV files, text files, GitHub repo metadata, APIs and point-of-sale systems. For cleansing, libraries such as dplyr, lubridate, anydate, stringr, etc. do the work, and magrittr helps keep the data transformations really clear. Finally, once the data is cleaned up and ready, land it in the warehouse.
My personal opinion is that you can do the same ETL processes in R as in Python, even with less code (if you know how to use R properly).
This is a really underrated usage of R, especially when you have the DBI package able to work with the major open source implementations of SQL databases. I manage a server hosting a lot of Shiny apps that rely on SQLite databases to work properly and the update routine of ETL is done using R scripts running on a daily basis with cronjobs.
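For comparison, the kind of scheduled ETL routine described above can be sketched in Python using only the standard library. This is a rough illustration, not anyone's production setup: the `sales` table, its columns, and the inline CSV are all made-up assumptions, and a real job would read from files or APIs and be triggered by cron.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; a real job would read a file, an API, or another database.
RAW_CSV = """date,store,amount
2023-01-05, north ,10.50
2023-01-05,south,
2023-01-06,north,7.25
"""


def extract(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Trim whitespace, drop rows with a missing amount, convert amount to float."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append((row["date"].strip(), row["store"].strip(), float(amount)))
    return cleaned


def load(conn, rows):
    """Append the cleaned rows to the (assumed) sales table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (date TEXT, store TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, text=RAW_CSV):
    """Run extract, transform, load, then return the total amount per store."""
    load(conn, transform(extract(text)))
    query = "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY store"
    return conn.execute(query).fetchall()
```

Calling `run_etl(sqlite3.connect(":memory:"))` runs the whole pipeline against a throwaway database; pointing the connection at a file would make it persistent, which is roughly what the R scripts with DBI and SQLite would be doing.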
In my opinion, when we talk about Python as a competitor of R, we should refer to it as Python-Pandas. More credit should be given to Wes McKinney for the surge of Python in data science.
Pandas is a disaster. I don't know how it could become an industry standard. Sadly it's very obvious the creators of the video have no direct experience with these tools. Not only missing numpy's name, but saying things like "Python supports csv and json while R support xls" or "Python is good for data wrangling thanks to pandas, while R is good in statistical analysis" -- these are simply not true, absolutely misleading.
@@ultrasoft5555 so which one is best according to you? python or R
@@ultrasoft5555 This is definitely not misleading. Python can actually read JSON and CSV files; web developers usually use JSON files, and even those using Django use JSON files. And the pandas library has functions that enable reading CSV; I think it's called read_csv or something like that.
And that guy literally explained what numpty is, that was intentional.
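To make the file-format point concrete, here is a small sketch that reads the same records from CSV and from JSON using only Python's standard library. The sample data and the helper names are invented for illustration. pandas does provide `read_csv` for the same job, but it is left out here so the example runs without extra packages.

```python
import csv
import io
import json

# Made-up sample data in both formats.
CSV_TEXT = "name,score\nada,91\ngrace,88\n"
JSON_TEXT = '[{"name": "ada", "score": 91}, {"name": "grace", "score": 88}]'


def read_csv_records(text):
    """Parse CSV text; every CSV field arrives as a string, so convert score by hand."""
    rows = csv.DictReader(io.StringIO(text))
    return [{"name": row["name"], "score": int(row["score"])} for row in rows]


def read_json_records(text):
    """Parse JSON text; numbers keep their type automatically."""
    return json.loads(text)


csv_records = read_csv_records(CSV_TEXT)
json_records = read_json_records(JSON_TEXT)
```

Both calls end up with the same list of dictionaries. The only real difference is that CSV carries no type information, which is why the score needs an explicit `int()`.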
@@MKgobe the idea that python “ONLY” supports those is the problem. Python can read any file format under the sun (specific packages for everything from xlsx to avro, parquet etc ) I am pretty sure python supports more formats simply because it is multi purpose.
Also - in visualization, the guy has probably never seen libraries like plotly, seaborn etc. Heck, even a package that copies ggplot2 exists in Python.
@@MrPrabhatRastogi I recently switched to python from R. I'm by no means an expert, but the fact that he said if you like cool visualizations go for R, made this whole video kind of sus for me, cuz I love visualization and plotly seems to work better in python and seaborn can look even better than ggplot2 imo. Even though coding ggplot2 is much easier than plotly or seaborn, you cannot beat how smooth and nice those libraries look.
On point there, R is my best for manipulation and visualization. In all you do, just learn Python and R.
For a long time, I haven't seen such an informative presentation. You just made an inventory without explaining how the specific language features make the difference.
May I just say your ability to mirror-write is breathtaking!
I know, you would not believe how much practice it requires. [j/k] Search on "lightboard videos".
@@IBMTechnology just write normal and mirror the video in post. wouldn't it work that way?
I use and teach both. In fact I am teaching one class in R right now and one class in Python. I've also helped establish professional organizations with data science frameworks. This video is actually very in line with what I think. I've had a hard time teaching Python to people who have only used Excel, R is easier. If I'm doing a complex analysis requiring advanced probability concepts, I work in R. But I see why professional companies have leaned more towards Python. Its infrastructure makes Python better for security, reproducibility, consistent AI for apps and machinery, working with streaming, graphical, or network data, and text mining.
My view is slightly different: R, and more specifically the Tidyverse (packages / R dialect), is the clear leader in supporting fully reproducible workflows where the pipeline ends with reporting - reporting includes everything from interactive web applications all the way down to static documents. The Python community has not been as strongly engaged in the reproducibility discussion outside of enterprise execution pipelines. They simply stop before reporting and that stopping point is premature.
(Edit: typo misspelling corrected)
Hi Robert. I'm one of the 'people who have only used Excel'. Can you tell me why learning R is easier for people like me? (FYI, I'm learning Python as my first language atm.)
@@Kei3th1424 Why is R easier? One of my observations is that in Excel you're somewhat writing code when you type out formulas that contain functions. The way you work with data in R is very similar in terms of the functional approach, whereas if you go to Python the syntax will be a bit more alien, with a heavier emphasis on object-oriented design and, quite frankly, a less intuitive way of working with data, in pandas especially, which makes it hard to learn for first-time programmers.
What is your take on Julia vs R?
@@hansisbrucker813 Julia still has to prove mass adoption. R is the successor to S and thus much older than Python, and statisticians publish new algorithms in R, whereas Python is taught to many beginners. Both will be around in twenty years. Can you say the same about Julia? There is no doubt that Julia is great at what it does, but so were BASIC, PROLOG, and Smalltalk. If Julia is good for what you do with it today, then go for it. If learning a language is a long-term investment of your time and effort, choose something established.
Funny, intelligent, and not agonizing to watch for more than 5 minutes. Well done:)
2:55 Jupyter notebooks can support a range of languages. It started as IPython, which was Python-centric. And while Python is still the foundation and implements the Notebook API, there is a range of kernels allowing you to run other languages in notebook cells.
JuPyteR is Ju-lia Pyt-hon and R acronym but... The Jupyter system supports over 100 programming languages (called “kernels” in the Jupyter ecosystem) including Python, Java, R, Julia, Matlab, Octave, Scheme, Processing, Scala, and many more. Out of the box, Jupyter will only run the IPython kernel, but additional kernels may be installed. (c)
@@stearin1978 thanks, informative.
Also RStudio supports programming languages other than R
You will end up using both, because Machine Learning and Statistics cannot be split apart. Python for ML and R for stats.
the most wise comment
I use python for stats all the time, works great. I used to use R but I'll probably never go back
I agree with you
Now tell me what is machine learning all about...
I'd concur if Machine learning was only Neural Networks, which for some people it may be.
I totally agree on his take.. though python is not only OO but also functional
R is object orientated as well as functional and using both in both languages makes a lot of sense. Do you really think that any of this makes an important difference between these languages? I don't think these are the decisive differences.
Ok. I'm an R user and when he was explaining Python's pros I was thinking R can do the same. It's unfair mentioning Pandas without mentioning Tidyverse
I use both in my workflow. Speed-wise, I think Python is about twice as fast for general purpose operations (like for loops), but in practice the performance depends largely on the packages used.
Fine up to 5:40, then you lost me.
The base R language reads CSV and JSON natively.
Rs equivalent to pandas is tidyverse, except the R base language supports data frame natively. Also data.table and pandas have almost equivalent syntax.
NumPy is not in the native Python language, you Numpty. It's installed as a module.
No mention of Rmarkdown, Rnotebooks or quarto. It's like jupyter except without the bonkers decision to use JSON as the raw file format.
No one prototypes in one language and productionises in the other. There's no more certain way to double your time to delivery and halve your code quality.
This is how I feel whenever this debate comes up, a lot of people think R just does 1 or 2 things really well. But, the reality is packages like Shiny, Rmarkdown, and some ML libraries are misunderstood or just ignored for how good they are (community and documentation). Also, I have rarely had problems with learning a new R package that works in the base R code format, because everyone uses a similar coding format. Whereas Python packages are harder to pick apart, because people base their coding formats on different formats.
@BaroDrinksBeer Shiny is a criminally underrated part of R, along with Plumber APIs. I managed to create a significant saving for a company by making them cut an expensive PowerBI license and substitute all the dashboards for Shiny on a cloud server. Worked like a charm. Reactive modules with pre trained ML models in .rds files, with a daily retraining routine for the new data using cronjobs.
Starting with R is better, but for a weird reason, that is: "more incentive to learn python and other things".
I suspect, most who started with Python are less likely to learn R because they think Python can handle "everything".
For me it's the opposite. I learned programming with R, and feel like Python is much less intuitive.
Started with Python for data analysis. Later tried R tidyverse. Oh dear, I hate the Python workflow now: pandas, numpy, seaborn - it all feels like an awful crutch.
I learned R because i loved their dataframe structure. Now with pandas, R can go pound sand.
Pandas is an awkward attempt to add dataframe objects in python that resemble the way R works.
Specialists usually go for R: economists, biologists, epidemiologists et al., those whose domain knowledge requires statistics. In economics there is no question you go for R, due to the packages for economics. If statistics and the associated data are the end in themselves, then it's R. If you're a programmer who primarily collates and collects data with basic processing, then it's likely Python.
What a biased opinion. "Cool people use R. Python is for programmers who do basic shit" lmao
@@MrMaxtng No, it's true. Any job focusing heavily on data analysis will have you using R. Where I live any actuary has to know R…
This is correct. We work with chemicals and design experiments all day long for optimization purposes. R has so much functionality in what we chemists know as packages. The science behind our chemical technology is deeply rooted in research work based on Response Surface Methods and Mixture Designs. We're no programmers. We're scientists. And we analyze our data and design our experiments with R.
@@MrMaxtng It is literally the case though.
@@victorst5997 R is also used in Biology and medicine.
For data exploration in R, the {dplyr} and {tidyr} packages are excellent for sub-setting and wrangling. I think this approach is superior to pandas in python. In fact the grammar and semantic approach to data wrangling enabled by {dplyr} is a foundational reason to use R over python (i.e. pandas). These two libraries (dplyr, tidyr) along with {ggplot2} make up a core of the Tidyverse approach to using R. This approach is so well thought out that it often makes me chuckle when people suggest or argue that python is better.
Clearly there are cases where python is better, just as there are cases where R is better. But when it comes to analytics and reproducibility I've not yet seen a dispositive argument in favor of python.
Python is much better for readability. It's too easy to write opaque R code. Python also has features like dictionaries, comprehensions and lambda functions that make coding more pleasant.
R also sucks with data outside of dataframes. Matrices are much harder to work with than numpy arrays.
Tidyverse is a collection of libraries. It isn't even right to compare Pandas to Tidyverse. In this respect, you'd be better off comparing Tidyverse to Anaconda. It's really just a weird claim when matplotlib, pandas, numpy, seaborn, scikit-learn, pycaret, and yellowbrick all have incredible documentation. Not to mention, the combination of "Read the Docs" and "Sphinx" makes open source package documentation a breeze. Why? Because good programming requires good inline documentation. Both of these tools make it incredibly easy to take your inline documentation and build/distribute it to the open source community. It's just weird to say Python libraries don't have great documentation when it's just wrong.
A preference for functional programming through dplyr and tidyr is not enough to say "R is better for data analysis." In Pandas, there too is support for a more functional approach through the pipe method. Additionally, dplyr and tidyr combined do not have anywhere near the number of methods available for data manipulation. On a more important note about Pandas pipe, it allows user-defined functions to apply transformations on dataframes or series. Why is this important? Because it provides better encapsulation, more options for the developer, and improved readability. To be clear, you cannot, out of the box, combine dplyr's (or tidyr's) select, mutate, groupby, etc. with a custom function via the pipe operator. dplyr has a not-so-friendly workaround for providing this functionality, where the user needs to search Google and figure out that they need a parameter called ".data", which is something very specific to dplyr's implementation and not at all obvious. This is a huge limitation when the developer has to take their work to a production setting. Or, even more simply, share their work with others and debug their code themselves.
Don't even get me started on the data-masking that the tidyverse paradigm has created in the R community. While it makes data operations more concise, it comes at the expense of polluting the name space of the environment making the developer blind to what variables have been created in global scope increasing the risk of collisions.
Use the one you know.
I don't find R much better for stats than scipy, or ggplot better than seaborn. But I do find python better for environment management, package development, and non-data-science projects like webscraping.
This is me, chuckling. I imagine I've seen a proportionally equal amount of opaque and unreadable python code as you may have seen opaque R code. I also imagine, and you might agree, that good coding is partially art and partially experience. So, if you have a language that is good for beginners, you're probably gonna see some bad code written in that language. no?
Anyway, yes, you can write anonymous functions in R -- totally can. Chuckling again that the claims, without evidence, about Python dictionaries, comprehensions, or matrices are supposed to be dispositive in favor of Python. At best it comes down to use-cases. But really, OP's comment -- "choose the language of your community" -- is spot on.
There seems to be no end of python-fan vituperative comments using straw-man criteria meant to express strengths when they actually express personal bias -- typically expressed alongside often outdated and just-plain-wrong claims about R.
In any case, my comment was aimed at OPs comment about R's utility in data exploration. My position is that R (AND Tidyverse) is clearly better than OPs video suggests (somewhere in the middle-end of the video timeline). Beyond that, OPs recommendation to use R for beginners AND advanced is insightful to say the least. Clearly a strong and sound suggestion.
Meanwhile, I can't stop chuckling at this response. I mean, just no. But, look, it's totally fine to have a preference. Not convincing, but if your community digs Python, keep doing you.
@@BingbongRecto R's equivalent functions to Python's lambda functions are: apply, lapply, sapply, vapply, tapply. Followed by mclapply (multi-core lapply, i.e. parallel processing).
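For readers coming from the Python side, the apply family corresponds most closely to `map` and comprehensions, with the lambda being the anonymous function that gets passed in. A small sketch, with the R calls in the comments given only as a loose gloss and the variable names invented for illustration:

```python
values = [1, 2, 3, 4]

# Roughly what sapply(values, function(x) x^2) does in R:
# apply an anonymous function to every element.
squares = list(map(lambda x: x ** 2, values))

# The more common Python spelling of the same operation.
squares_comprehension = [x ** 2 for x in values]

# Loosely comparable to tapply(values, groups, sum):
# apply a function within each group.
groups = ["a", "b", "a", "b"]
sums_by_group = {}
for key, value in zip(groups, values):
    sums_by_group[key] = sums_by_group.get(key, 0) + value
```

The lambda itself is just the function being applied; the looping is done by `map` or the comprehension, which is the role the apply functions play in R.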
I started this video being confused about which to learn and use. The rest of the video (until the end) left me still confused but with more information to justify my confusion. If I understood the very end correctly, I should learn and use both.
You should consider what your team does and what you plan on doing. If in doubt, learn the more General Use language first.
You can use both for big data applications. It's really based on preference.
Python sure has OOP support, but doesn't force you to use them. It has some pretty good functional support also, but again, not forced to use them.
The thing about Python is that it completely half asses both the OOP and functional paradigms, meaning that while being multi-paradigm in theory, I think most Python programmers are better off sticking to good procedural code with common sense applications of other paradigms.
Note that I'm not saying that's a bad thing. There's a reason Python is my go to language.
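As a small illustration of the multi-paradigm point, here is one calculation written twice, once as plain procedural code and once in a functional style, over a tiny class. The `Reading` class and the sample data are hypothetical, chosen only to keep the example short.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Reading:
    """A minimal value object; this is the OOP side of the example."""
    sensor: str
    value: float


readings = [Reading("a", 1.5), Reading("b", -2.0), Reading("a", 3.0)]


def total_positive_procedural(items):
    """Plain loop with a running total."""
    total = 0.0
    for item in items:
        if item.value > 0:
            total += item.value
    return total


def total_positive_functional(items):
    """Same result using filter and reduce with anonymous functions."""
    positives = filter(lambda r: r.value > 0, items)
    return reduce(lambda acc, r: acc + r.value, positives, 0.0)
```

Both functions return the same total, and nothing in the language forces one style over the other. Which reads better is a matter of taste, which is more or less the point being made above.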
R also has OOP systems: S3, S4, S7, R6, etc. R6 is the equivalent to classes in Python.
The good thing about Python is that I can create a model and also a backend service to serve it with one language.
it can be done in R too - what's your point?
@@borisn.1346 but you wouldn't want that!
Why would I run a backend application in R😂 when fastapi and django rock
@@einstein_god yes fastapi is great for modelling.
The answer to this post has certainly changed with the advent of Shiny and then again with the advent of webR. We should all get used to the two languages growing closer together.
The main difference is that Python is a general-purpose language and R is a niche language. R was not created to be used by computer scientists; it was created to help other scientists do their computer stuff.
On the other hand, Python is more focused on computer scientists, but also on the general public who maybe just want to write a simple script to automate boring stuff.
It is difficult for a language to be better than other one, they normally just are designed for different purposes.
In my case, as a software engineer, I feel more confident with Python. But this is because I normally don't have to do specialized analysis in a specific field such as economics, biology, etc. For this reason pandas, Spark, or others are enough for me, and I am able to write other utilities (such as parsers, command-line interpreters, etc.) that I need in my application (which is software used by someone else, not a paper or an analysis).
R is used in the financial sector and in auditing. With R you can do complex statistical analysis.
I'm learning it by myself and I'm finding it somewhat difficult to handle this larger range of applications. Do you know about other data tools?
Hahah (2:09) clearly the best speaker from IBM. Sympathetic man and instructive content, gladly more
R is the best for data analysis: any ad hoc work, visualisation, anything that involves working with data tables / frames
Python is good for complicated apps and etl
In python you need to spend way more time writing code for the same analytics tasks
Regarding CSV files - it is even easier to work with CSVs in R
I recently started learning R for a "Quantitative Business Analysis" course I'm taking for my MBA. Really wish I knew about R when I was studying chemistry for undergrad. Thanks for the video!
Me too learnt about R 3 years after doing my post graduate masters 😢😢😢
@@neroetal I learnt R in the middle of my career as a researcher... I felt like I was born a second time 😄
Earlier this year, I searched for any language among the recent public favorites that would support reading fresh data from measuring instruments. I mean direct hardware port connectivity. Tough task! All of them seemed to be happy to just start manipulating already existing data, Excel and such. The third book I read about Python mentioned there is a library to provide port connectivity, but did not really elaborate. My old favorite, FORTH dropped out of competition when Windows started denying direct access to hardware. Quick Basic worked for a while longer. And after that, you needed drivers, possibly provided by the instrument supplier, if you were lucky. That is what Visual Basic depends on.
That sounds like a C++ job. You can use C++ inside your R code with the Rcpp package, and similar things probably exist for Python. Nevertheless, there are places for compiled languages even in modern times.
Matplotlib and seaborn libraries can be used for data visualization in Python. Heat maps, bar charts, Gantt charts
Plotly is another option.
I can't understand why nobody mentions in these debates that R's data.table syntax is basically the same as Python's pandas, with the gigantic upside of requiring half the amount of code to perform the same task.
While I get that R is a specialized language and Python is general purpose (and thus has a much bigger community), people always make the completely subjective point that "python is easier to learn, much easier syntax". I can only imagine that this kind of mindset comes from people who believe that R has no options besides its base implementation (while simultaneously assuming that numpy and pandas come as default).
Tidyverse R is basically as clear and friendly as SQL ("select", "group by"...). There is no way that pandas is more newbie-friendly than this, enough for Python to be the "go-to language for data exploration", and even if that's the case, once again, R data.table is very similar and requires much less input.
With all that being said... I use both languages, and although my preference is clear, there are some situations where one is better suited than the other. I think they should be treated more as complementary than rival tools.
Agree
R data table is my choice
Python pandas is just terrible if we talk about amount of the code you need for the same operations
Not true. I do the entire end to end process using R. RSelenium for scraping information. Tidyverse for cleaning data and exploratory data analysis. Tidymodels for machine learning. Shiny for production and putting the analysis on the web for other people.
For early analysis use R, then transport your work to Python, then optimize the result with C.
In this context I would use Julia from start to the end.
He can write mirrored letters so freaking fast! 🤯
Maybe the video was flipped
It's high time we added Julia to these conversations. The "Ju" in "Jupyter Notebook" literally stands for Julia.
I found R to be really difficult since it does not follow traditional programming logic, since it is very math based.
Do enough years of C++, Perl, Python, C, Java etc... R just seems weird
R uses actual logic.
R was created with the aim of making it easier for humans, or data analysts, to perform computations, not to make it easier for computers to do their job. So the logic is closer to how humans think. For example, indexes start from 1, just like you start counting from 1, right?
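For contrast, this is what the other convention looks like in Python, where positions start at 0 and slices exclude the stop index. The `one_based` helper is a hypothetical wrapper written only to show the off-by-one translation; it is not something either language provides.

```python
letters = ["a", "b", "c", "d"]

first = letters[0]         # position 0 is the first element
first_two = letters[0:2]   # the stop index is excluded, so this is positions 0 and 1
last = letters[-1]         # a negative index counts from the end
                           # (in R, letters[-1] would instead drop the first element)


def one_based(seq, position):
    """Hypothetical helper: fetch an element by 1-based position, as R would."""
    if position < 1 or position > len(seq):
        raise IndexError("position out of range for 1-based access")
    return seq[position - 1]
```

With this helper, `one_based(letters, 1)` and `letters[0]` refer to the same element, which is the whole difference between the two conventions.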
Some may think R is weird for there is no variable type that can contain only one number, others may think Python is weird for the lack of curly braces around blocks. None of these are real obstacles in learning a programming language.
I have used both R and Python professionally. Here’s my take. R is for SMEs (not programmers). Python is for programmers. That said, you can mostly use both for the same job. The difference between them only becomes significant at the edges.
true, both should be learned. if you have some knowledge in one language ,the other one isn't a big problem. so just do it!
Or learn just one of these and then something different such as C++, SQL, HTML/CSS, a Shell, VIM or Emacs. Anything that has less overlap and thus brings more new things to the table!
R has no competition in academic research and statistics. The statsmodels framework is not mature enough. For visualization, ggplot2 (R) is more interesting. After that, R and Python are mostly even. With the Quarto CLI and reticulate, you can work with both.
I mean except for the fact that Python can be used to build massive code bases and complicated infrastructure easily, then sure, they are even... Ever wonder why a lot of companies use Python for their code base and nobody uses R?
Haha, we hear so much about Python that going towards R feels awkward. Great video :)
Python is easier to pick up for a complete beginner IMO. I would definitely pick Python first and then learn R later if needed for more complex statistical analysis.
What's the pirate's favourite programming language?
You think it would be R, but their true love be the C.
I'm a Data Engineer that doesn't use AWS. I use both Python and R. I use Python for automated data tasks and R when I am not doing an automated task.
Wow Martin, I always thought of you as a "master homebrewer". Now I know you're a "master inventor" too!
Popular data visualisation libraries such as Seaborn, Matplotlib, Plotly and several other specialised ones are fantastic.
Have used both for years. My first choice is definitely Python. Running a parallel processing task in R is a nightmare, and I have faced extremely strange bugs in the process. Not to mention that doing vectorized coding is also not as straightforward in R as in Python. But unfortunately some advanced statistical packages for certain types of data (like single cell genomics) are not available or not very sophisticated in Python. It’s not exactly a problem of Python, but it’s because most analysts use R and thus, develop statistical packages for R mostly.
So, overall, I really don't like jumping from one to another all the time, but I usually have to do it anyway. However, when possible, my first choice is Python.
"You might be expecting a it-depends answer, but no, I'm going to tell you exactly which one to pick" - and then proceeds to give different scenarios for picking each one, in other words IT DEPENDS!
If you use R before Python, R is better. If you use python before R, Python is better. I love all, and use both in my works.
I'm currently learning Python for automation stuff and want to know R only for data analysis. Is this the best approach in your opinion since you know both.
@@hugowesley4074 If you learn python first
I have to point out that when you talk about data exploration and don't mention the tidyverse for R it feels a bit lacking... As an R user, I must say that starting with Python is probably better since it's more broadly used and integrates better with other languages, as well as software developers that are using different methods, while R needs to be recompiled in order to work within the organisation
Was looking for this comment about tidyverse. As a statistician, I first learned python because of its general use and overall simplicity. Both are good for beginners either way. Then moved to R for more advanced work. And the more general and computer science-y I go, the more python I use
R is under rated I think.
Python borrowed a lot from R
Rated R
Python is a great glue language.
No it's not.
It's just very specific.
Well...if you need to do statistical tests in Python, you'll be suffering.
Thank you for this very calm and useful presentation.
I mainly use R for shallow models but as i need neural networks and beyond or other deep models, i switch to python
I have used R every day as a data scientist at my current job for a year now! In my previous job, we used Python!
And which do you prefer?
Please say a word about R.
I'm trying to start with it now, and I have a bit of doubt about my choice.
What a nice video to like and share.
I don't agree that R is more suited to processing Excel files. 70% of the time I import CSV. It's designed to process Excel, CSV, txt, JSON, and many more formats.
It's just as easy to work with API calls and JSON in R. I don't have to worry about switching condas in python environments because subversions of packages work for one application and not another. When actually developing code against data it's hard to beat the RStudio IDE.
Wow, I could listen to your soothing voice all day 😊
R is the best programming language as far as data-related work is concerned. I have worked with both Python and R and have found R to be better for data-related tasks, be it machine learning, data science, data analysis, or visualizations.
1:18 see what he did there?
Pure shock that the guy behind my favorite homebrewing channel works for IBM!
The way this video is made is awesome, and I use both, as my team requires.
Thanks for your comment. We strive to make videos that mean something to our audience - and when we make mistakes we make light of those. #human
Brilliant, I loved the explanation. I believe I will be learning R. I'm using this for my learning of Business Analysis and my data analysis study.
If you need strictly data analytics, R is a great choice!
Accepts misspelling of "Numpy"👏
Proceeds to misspell "Jupyter"😂
Awesome and highly needed comparison tho!😏👍
How is he writing backwards so well?
Came here to say this, that this video is obviously just an excuse to show off his backward writing.
The video is probably mirrored. I assume he is right-handed.
It's a mirrored image
What a great video!! Thank you!
just use both people. No need for competition, these are 2 very useful tools
I wouldn't forget about MATLAB. I often see researchers use it in ML, DS and ANN.
But that's a lot of money. Mastering a language only to lose it because no one is paying for the license is a bummer.
Hahahahaha no one heard of Matlab for 15 years at least. Approx 5 scientists use it all together.
scilab is a good open source alternative
R and Matlab are both natively vectorial languages. And this is a huge bonus. Python is scalar.
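A quick sketch of what "scalar" means in practice for plain Python lists, using only the standard library and invented sample data. Arithmetic operators on lists are not elementwise, so elementwise work needs an explicit loop or comprehension. NumPy arrays are what add vectorized arithmetic, and they are deliberately not used here.

```python
xs = [1, 2, 3]
ys = [10, 20, 30]

# On a plain list, * means repetition, not elementwise multiplication.
repeated = xs * 2

# Elementwise doubling has to be spelled out.
doubled = [x * 2 for x in xs]

# Likewise, + concatenates lists instead of adding them pairwise.
concatenated = xs + ys

# Pairwise addition needs zip (or a loop).
pairwise_sums = [x + y for x, y in zip(xs, ys)]
```

In R or MATLAB the natural spelling of the last line is simply the sum of the two vectors, which is the bonus being described above.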
I used to hate R, but I started having fun with it when I used it for my thesis
This man knows nothing about Python, I assume. Visualisation is so hard and complicated in R. We are talking about Python 3.11 these days. It is a master piece that can do anything in datascience. You can do nlp using nltk, linear algebra using numpy, any kinds of data manipulation or .csv .txt data reading or writing using pandas, visualisation using matplotlib or seaborn, anything can be done using python and maybe you cannot believe it but the result is even much better
How many months does it take to become an expert in both of these languages?
Does R support big data?
lol of course and works much better with it
Everyone has their own definition of big data. If you mean "bigger than RAM" data - yes, R has you covered. If you think of petabytes, then there are no simple answers. Nothing will deal with that "easily", but who ever gets to process really big data without the database selecting only parts of the data first? Everyone talks about big data but almost nobody ever has to deal with it.😂
I think Python is a better way to start as a beginner than R. Plus, it is easier to use with tools like Spark and for web development.
I'm using both... Python for personal work and hobbies with data, while I use R for assignments lol ...
I love both and found that Python requires only a few libraries, fewer than 10, to get any statistical job done, while R requires more than 100 libraries lol😂😂😂
Having 40 years of programming experience and having learned R before Python, I beg to differ (somewhat). Yes, tidyverse is superior to Pandas, but only as long as you hard-code your variables. Yes, there are more statistical functions for R than for Python, but Python is catching up fast. I use statsmodels and, at least in my work, it can do anything an R library could. R is easier to use for elementary tasks, like performing a t-test (it's a one-liner in R), but as soon as you do something more complex, it is easy to get lost in R. Similar for graphics: basic plotting is easier in R, but to get the plot to look as _you_ want it (as opposed to 'as the computer wants it'), Matplotlib and Seaborn beat ggplot any day. Python is easy, logical, systematic, orthogonal, elegant... In R, there are several libraries for computing the ROC curve and none is really satisfactory. In Python, you use sklearn and it works like a charm. I could go on like this forever...
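To give a sense of what sits behind the t-test one-liner mentioned above, here is a sketch that computes the Welch two-sample t statistic and its degrees of freedom with only Python's standard library. It is an illustration under assumptions, not a replacement for a statistics package: the function name `welch_t` is made up, and no p-value is computed because that would need the t distribution, which the standard library does not provide. In practice one would call R's `t.test(x, y)` or SciPy's `scipy.stats.ttest_ind`.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test.

    Uses sample variances (n - 1 denominator) and the Welch-Satterthwaite
    approximation for the degrees of freedom.
    """
    nx, ny = len(x), len(y)
    vx, vy = variance(x), variance(y)
    se2 = vx / nx + vy / ny            # squared standard error of the mean difference
    t = (mean(x) - mean(y)) / sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df
```

The convenience of the one-liner is that the statistic, the degrees of freedom, the p-value, and a confidence interval all come back from a single call, whereas here only the first two are worked out by hand.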
"In R, there are several libraries for computing the ROC curve and none is really satisfactory, ...", you want to try tidymodels?
40 years and you know nothing
Good job lol
What about Julia?
Only time will tell if investing in Julia is worthwhile. For now it is an interesting prospect with an unclear future. It is great if it does exactly what you need now, or as a hobby.
how does the trick with the pen work? thanks
What a nice video to like and share.
Have to use both. When it comes to graphing, R is by far the winner. When it comes to statistical work, it becomes very obvious that R was designed to do this, while Python has the problem of dataframes not being a first-class datatype, which gives you messy and ugly code when you need to make scikit-learn talk to numpy to do just a simple regression.
Python wins in the bigger stuff. Large, and complicated machine learning stuff. However with the new Nx, and related, libraries in Elixir we might have a contender that might blow Python out of the water soon.
Bahahahahahaha. Aaahahahahahahagabaha
Can't stop laughing lol. Serious ML frameworks / optimization tools: Torch, JAX, Pyro, all in Python and more every day. Hardly anything in R. MLOps tools: MLflow, Airflow, Spark, all Python native. Some small section of support for R sometimes lol. And don't let me start with IDEs. RStudio feels like a tool from 1999. Chatgpt, good got support. Nothing there. RStudio is a joke, worst IDE I have ever seen. R is inconsistent with "quirks" a.k.a. shitty design, no wonder Copilot and ChatGPT have trouble understanding it.
Anyone who criticizes R language should be sent to the wall.
Wait, Python can also be used for data viz: seaborn, matplotlib, etc., to name a few.
Yeah. But that is true for every programming language. I use both but R is just WAY stronger in this area.
As a biostatistics student, we are trained to specialize in R more than Python
I think it just depends on the problem you want to solve. I personally like R for statistics
Very nicely explained.
I guess the question of which is older depends on whether you consider S to be ‘proto-R’ or something else entirely
No, there is no question that R is merely an implementation of the S language and thus the older language. The video is just wrong about this. I do not feel that being older is an advantage in a programming language. As a matter of fact, R carries a lot of clutter from its age and I envy the Python people who dared to start anew with Python 3.
I think as long as you are not going to the machine learning page, definitely use R as a data scientist, for now.
Why? You can also do machine learning in R (e.g. with the tidymodels package). For deep learning Python is better, with tensorflow and keras.
Nice vid. Thanks. Beware pandas’ current treatment of missing data though, where the sum of NaNs becomes 0.
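A small sketch of the behaviour described above, assuming pandas and numpy are installed. By default `Series.sum()` skips NaN values, so a series containing only NaNs sums to 0.0; `min_count=1` or `skipna=False` makes the result NaN instead. The variable names are only for illustration.

```python
import numpy as np
import pandas as pd

all_missing = pd.Series([np.nan, np.nan])

# Default: NaNs are skipped, and the sum of nothing is 0.0.
default_total = all_missing.sum()

# Require at least one non-missing value, otherwise return NaN.
strict_total = all_missing.sum(min_count=1)

# Do not skip NaNs at all, so any NaN propagates into the result.
no_skip_total = all_missing.sum(skipna=False)

print(default_total, strict_total, no_skip_total)
```

Which of these is "right" depends on the analysis; the point is only that the default silently turns all-missing data into a zero.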
Very good breakdown
I expected a "well it depends kind of answer", and I got a "well it depends (on the following questions) kind of answer".
I prefer R to Python for 3 main reasons:
- The pipe operator in R makes code much easier to build & read (more intuitive)
- we just have to import the libraries at the beginning and we don't have to call them to use one of their functions
- indexing starts from 1, not 0 (crazy)
Indexing over N_0 is simply superior to N_>0. As a mathematician, indexing over the natural numbers starts at 0.
These are very abstract generalisations. I do data analysis in genomics. In that area I have to use both R and Python. One is not enough
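To make the indexing disagreement concrete, here is a minimal Python sketch of zero-based indexing with half-open slices (R, by contrast, starts at 1 and includes both ends of a range). The list contents are made up for illustration.

```python
items = ["a", "b", "c", "d"]

first = items[0]               # zero-based: the first element is index 0
last = items[len(items) - 1]   # the last element is index n - 1

# Slices are half-open: items[:2] stops before index 2,
# so the two halves split the list with no overlap and no gap.
head, tail = items[:2], items[2:]

print(first, last, head, tail)
```

One argument often made for this convention is that `items[:k] + items[k:]` always rebuilds the original list; whether that outweighs counting from 1 is exactly what the thread is arguing about.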
Nice presentation, from the homebrew channel guy?
This is not biased for R at all. Man, how much more obvious can you get?
Thank you. loved the video.
that ka numpty explanation of that guy just coming from the dark side just made my night aaaaahhhhhh
Great Video!
"There is a healthy debate raging over the best language for learning data science. Many people believe it’s the statistical programming language R. (We call those people wrong.)", said Joel Grus in his book "Data Science from Scratch: First Principles with Python". - I am currently reading the book and when I arrived at the quote, I coulnd't stop laughing on my own. That being said, I'd still choose Python, though💁♂
from scratch... starts every script with import numpy as np 🤣
I will go definitely for JULIA
I always start with R and if I need any help from Python I can just load a helper .py file in my R code and call the Python functions I need directly from R.
I used R as part of the Society of Actuaries certification program and I did not appreciate the time wasted on R. If you plan to do "something", really anything, with your data, such as serving it on a website, storing it in a database, or making it into part of a pdf / word / excel / powerpoint programmatically, start with Python. R shares the same packages as Python tbh, but the choices and updates are just not there. It is certainly a pain to get started with package management in Python, but you will thank yourself later for learning a language that is so versatile. R may be a decent language in and of itself, but you will reach a bottleneck pretty soon and have to relearn everything in Python.
I see no one even mentioned how good plotly in R is compared to plotly in Python
So much easier
It's difficult to google for 'R' readings 😉
R can import JSON files