
Thread replies: 84
Thread images: 7

File: r-vs-python-blog.jpg (13KB, 525x300px)
Statslet here.
Can someone please explain to me the advantages of R over Python? I just cannot make sense of R's type system, and the syntax is ugly.
Does R provide better libs? What do I actually miss if I use Python only?
>>
R was designed thinking about stats and python is general purpose. python is way more clean but for specific things you'll have to write way less code in R. If you are into stats, learn both. you probably miss little if you use python only
>>
>>9116472
Exactly this.
>>
>>9116472
How good should I be in R to be a good statistician/"""data scientist"""?
>>
Better IDE, better convenience functions. It's a comfy language.
>>
Also what are your major trip ups with R's data types?
>>
>>9116641
How is RStudio better than PyCharm? It is a good IDE, but I can't think of a single aspect where PyCharm is worse.
>>
I was considering base release: Rgui vs IDLE. Personally I hate Rstudio.
>>
>>9116644
They are not intuitive like in other languages.
What is the difference between a vector, an array and a list? I googled it, but it still isn't completely clear to me.
>>
>>9116683
What's wrong with external IDEs?
>>
>>9116609
R and Python are both high-level languages; anything you do in one can also be done in the other, though certain things like statistical tests will be easier to implement in R than Python.

If you have no prior experience with statistics but want to git gud, I'd strongly advise you to study the theory before learning R, or at least do both concurrently. fortunes::fortune(184) is probably relevant here.

Good statisticians write good R code because they think like good statisticians and the language enables them to express their thoughts in an idiomatic fashion. R is fairly aggressive about this, so the converse also holds: people who are bad at statistical analysis (even if they may be good programmers) will find the language unintuitive as it forces them to write their programs in a manner that is uncomfortable for them.
>>
>>9116696
A vector is what you'd call a list/tuple in python, just a single dimension series of values. All values in a vector have to have the same type (numeric or Boolean or string etc). An array is a vector extended to multiple dimensions, like a matrix. Same rules apply wrt types. A list is what you use when you have different data types and potentially different data sizes you want to store in a single object.

Say you want to store a bunch of observations, where you have 20 measurements from each experiment. A matrix (or a data frame) would be a good way to store that, you just set up the table and populate it.

Now say you had a different situation where you wanted to store someone's name and a variable set of information on them. For some people you might have one piece of info, for others you might have ten. No one array will suffice, but you can structure your data in a list. The list packages the data together.
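
Since OP is coming from python, here's a rough sketch of the same distinction in python terms. It's only an analogy, not an exact mapping: array.array stands in for a same-type R vector and a dict stands in for an R list. The record and the is_homogeneous helper are made up for illustration.

```python
from array import array

# R vector: one dimension, every element the same type.
# array("d", ...) only accepts numbers; appending a string raises TypeError.
measurements = array("d", [1.5, 2.0, 3.25])

# R list: a container whose elements may differ in type and in length.
person = {
    "name": "Alice",                      # hypothetical record
    "scores": [90, 85],                   # two pieces of info here...
    "notes": ["late", "retest", "ok"],    # ...three here
}

def is_homogeneous(values):
    """True if all elements share one type, as an R atomic vector requires."""
    return len({type(v) for v in values}) <= 1

print(is_homogeneous(measurements))            # True
print(is_homogeneous(list(person.values())))   # False: str and list mixed
```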

>>9116701
Nothing is wrong with them, I just don't think they should be part of the discussion if you're comparing languages.
>>
>>9116712
But isn't there a special type for matrices?
>>
>>9116766
A matrix is a 2d array. They're identical types under the hood. The matrix() function is just sugar.
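
To make "identical under the hood" a bit more concrete: in R a matrix is a plain vector carrying a dim attribute, filled column by column. A minimal python sketch of that layout, with a hypothetical helper name (at) and 0-based indexing instead of R's 1-based:

```python
# A 2-D "matrix" as a flat vector plus a dim attribute (column-major, as in R).
data = [1, 2, 3, 4, 5, 6]   # the underlying vector
dim = (2, 3)                # 2 rows, 3 columns

def at(data, dim, row, col):
    """Element at (row, col), 0-based, assuming column-major storage."""
    nrow, ncol = dim
    if not (0 <= row < nrow and 0 <= col < ncol):
        raise IndexError("subscript out of bounds")
    return data[col * nrow + row]

# Rebuild the rows to see how the flat vector is laid out.
rows = [[at(data, dim, r, c) for c in range(dim[1])] for r in range(dim[0])]
print(rows)   # [[1, 3, 5], [2, 4, 6]]
```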
>>
>>9116770
Oh, thanks.
By the way, what is your degree? Also what do you mean by "good statisticians"?
>>
>>9116794
ah that was someone else that posted >>9116709, you'll have to ask them.

im a biologist
>>
R is not even a real programming language, R sucks dude.
>>
>>9116960
Neither is Python.
Python is a scripting language.
>>
>>9116960
t. cs brainlet
>>
>>9116967
>>9116960
could not be more wrong
>>
How long to learn R from the ground up? I know my stats theory, just interested in how long it takes to learn the whole language, ggplot2 and similar libraries.
>>
>>9117011
There are hundreds of libs, I honestly have no idea how one should deal with them.
>>
>>9117019
Yeah, but seeing a 700 pages long book on ggplot2 alone kinda killed my motivation. Do you recommend any other library besides ggplot though?
>>
File: maxresdefault.jpg (46KB, 1280x720px)
>>9117024
> 700 pages long book on ggplot2
>>
>>9117019
>>9117024
like with any language: start simple. you don't need ggplot2 to get started, base graphics are fine for learning.

R base is fine for most use. all the external packages tend to deal with hyperspecific use cases that 99.9% of people don't need to give a shit about.

R has a lot of data sets built into the default install, and if you want to learn to do exploratory data analysis, you only need base R.

ignore all the extra shit until you have a specific need for it
>>
>>9117011
>>9117049
and to expand on that:

if you have any basic programming knowledge, R is easy to pick up. it's a high level language, honestly a lot like python except for syntax.

the most useful thing you can do is learn to deal with data frames and how to do complex subsetting. get used to using boolean descriptions in subsetting, and how to cross reference between data frames (subset a metadata data frame to get columns of interest, then query your data storage data frame with the information from the metadata frame). data frames are the bread and butter of R
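
a rough sketch of that metadata-then-data pattern, written with plain python lists of dicts standing in for data frames (all sample names and values are made up, and in R you'd do this with logical indexing on real data frames):

```python
# Two small "tables": one describes the samples, one holds the measurements.
metadata = [
    {"sample": "s1", "tissue": "liver", "treated": True},
    {"sample": "s2", "tissue": "liver", "treated": False},
    {"sample": "s3", "tissue": "brain", "treated": True},
]
data = [
    {"sample": "s1", "expression": 4.2},
    {"sample": "s2", "expression": 3.1},
    {"sample": "s3", "expression": 7.8},
]

# Step 1: boolean description on the metadata (treated liver samples).
wanted = {row["sample"] for row in metadata
          if row["tissue"] == "liver" and row["treated"]}

# Step 2: use that result to query the data table.
subset = [row for row in data if row["sample"] in wanted]
print(subset)   # [{'sample': 's1', 'expression': 4.2}]
```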
>>
>>9117058
also: i think the biggest hangup people might have with R is that most of its basic data manipulation functions are return-edited-object, but there are a few (row/colnames() come to mind) that are edit in place. that and, if you try to do something like,

x <- c(1,2)
y <- c(1,2,3)
x < y

R will happily do it (with a warning here, since 3 isn't a multiple of 2). if you're doing an inherently vectorized operation between two vectors of unequal length, it recycles the shorter one, repeating it until it matches the length of the longer one. the above comparison is comparing:
1 vs 1, 2 vs 2, 1 vs 3

for any other programming language, that's BONKERS behavior. but for R, that's normal, because R was never meant to be a general purpose programming language. It's a domain specific language, with lots of design choices and convenience functions meant for quick statistical analysis
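
for the python people, here's a minimal sketch that mimics the recycling rule with itertools. recycle_compare is a made-up name and this is only an emulation of the behavior described above, not how R implements it:

```python
from itertools import cycle, islice
import warnings

def recycle_compare(x, y):
    """Element-wise x < y with R-style recycling of the shorter sequence."""
    if not x or not y:
        return []   # a zero-length operand gives a zero-length result
    n = max(len(x), len(y))
    if n % min(len(x), len(y)) != 0:
        # R also warns in this case, but still returns a result.
        warnings.warn("longer object length is not a multiple of shorter object length")
    xs = islice(cycle(x), n)   # repeat x until it has n elements
    ys = islice(cycle(y), n)   # same for y (a no-op for the longer one)
    return [a < b for a, b in zip(xs, ys)]

print(recycle_compare([1, 2], [1, 2, 3]))   # [False, False, True]
```

the pairs compared are 1 vs 1, 2 vs 2 and 1 vs 3, same as in the R example.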
>>
>>9116449
STATA
>>
What do you think about Wolfram Mathematica, fellow statisticians?
>>
>>9117146
have no experience, but to my knowledge it's kind of a different beast

i think of it in terms of, say, matlab versus mathematica/maple

you can twist one to approximate the other, but it's just really not the same thing
>>
>>9116677
Can you get PyCharm to show variable assignments in a live workspace like RStudio or Octave GUI? I always use IPython when I'm doing something quick in Python
>>
File: observer.png (78KB, 1140x616px)
>>9117604
Yes, to some extent.
>>
>>9117202
It looks like a really powerful tool, but it is expensive, maybe that's why there isn't a lot of information about it.
>>
>>9116449
Question doesn't really make much sense cos of >>9116472, but stay the fuck away from both, use MatLab if you want to learn something by writing models yourself, and not just execute "open source" functions made by autists
>>
>>9116994
C is a programming language, Python is just an attempt at ideologically-based sugar coding made by an autistic sperglord
>>
>>9117115
found the mongoloid
>>
>>9118578
Why can't one write his own functions in R or Python? Also I expect most of the libs to be written in C++, and one can easily integrate C++ into Python, and probably into R also.
And MATLAB isn't free.
And why should one write his own functions, when he can simply read the theory of the used algorithm?
>>
>>9118587
Asm is a programming language, C is just an attempt at ideologically-based sugar coding made by an autistic sperglord
>>
>>9118592
>And why should one write his own functions, when he can simply read the theory of the used algorithm?
You haven't really understood a model until you can rebuild it from scratch. Bottom-up is for people who understand, Top-down is for management school retards who just click play. It becomes important when you start working on real data and your functions fit like shit, you need to understand precisely where the problem is
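
As a small illustration of what "rebuild it from scratch" can look like, here is a hedged sketch of simple least-squares line fitting written out by hand in Python. The function name fit_line and the data are made up, and real work would also need checks for degenerate input (for example all x values equal).

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x, written out by hand."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squares of x and cross-products of x and y around their means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residuals are where you look first when the fit is bad.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return intercept, slope, residuals

# Made-up data lying exactly on y = 1 + 2x, so the fit should recover it.
a, b, res = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)   # 1.0 2.0
```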
>>
>>9118593
> ihih look mom I made an abstraction joke im so smart

Except there is no official C penis enlargement scam-looking web page filled with shit like "elegant is better than ugly"
>>
>using anything else than MATLAB
>>
How about SAS? Is it the safest pick for the job market? Is it better?
>>
>>9118607
You actually make me think about R being better than Python. Though you can't deny that Python has a number of useful libs.
>>
>>9118578
>>9118616
t. mathworks sales
>>
>>9118628
No no no wait

I love to shit on Python because it deserves it, but I still consider it a legitimate programming language and it saved my ass when I needed to use Tensorflow, credit where credit is due.

R, on the other hand, is the most autistic project ever conceived, because it's a language maintained by statisticians and not by computer scientists, which makes it ugly, incredibly tedious to use, and completely unpredictable (like that time when they discontinued the tseries package JUST BECAUSE). Stay the fuck away from R and consider yourself free to assume anyone who uses it is a complete autist
>>
>>9118616
matlab sucks for statistics though

if you're doing statistics and you're not using R then you're fucking up bad
>>
>>9118634
>not getting matlab automatically for no extra cost when you went to university

I finished university 2 years ago and I can still use matlab on my computer
>>
>>9118695
no u

cython + stan
>>
>>9120763
stan is also available for R, and besides, statistical work is a lot more than just Bayesian modeling.
>>
>>9120789
Someone develop this please
>>
what are the best books for learning how to use python for data/stats stuff? ive heard libraries like scipy being thrown around, i want to learn how to use those.
>>
>>9118593
>Asm is a programming language
WRONG
>>
>>9120853
develop what? rstan is already a thing: https://github.com/stan-dev/rstan/wiki/RStan-Getting-Started
>>
>>9116449
"Statslet".... I am so done with this dumb website.
>>
>>9116449
If you have a choice in the matter, R is a lot cleaner for stats. Having used both R and python's pandas and numpy libraries, things are a lot nicer in R. Not to say there's anything wrong with python, but it's obvious it wasn't built specifically for statistics like R is.

Also, R runs way faster for run-of-the-mill data manipulation in my experience.
>>
>>9121104
Asm is a programming language. Even hex is a programming language.
>>
File: yoba confused.png (37KB, 512x512px)
>>9116709
>I'd strongly advise you to study the theory before learning R
I wanted to do exactly that, can you recommend a couple of good textbooks? Being a russkie myself, I have little idea about your textbooks.
>>
>>9122236
Hi. What's your background?
>>
File: 1501137505003.png (757KB, 963x973px)
>>9122328
quantum physics
>>
>>9122342
Don't you have to know probability to do quantum physics?
>>
>>9122358
on the basic level. For statistical mechanics you need to know more, but none of that was 'data analysis' oriented. It was "statistics for physicists".
>>
>>9122372
I am reading Bishop's Pattern Recognition and Machine Learning now.
>>
>>9122400
What about you? Yandex schooler by any chance?

>>9120985
https://courses.edx.org/courses/course-v1:UCSanDiegoX+DSE200x+2T2017/course/
maybe you can start with this and see how it goes. For me it was ok, but a little boring
>>
>>9121636
t. statslet
>>
>>9122416
No, Yandex is too hard to get into for me. Just a brainlet undergrad in the UK.
>>
I personally find manipulating data easier with R than Python. Also there seems to be far less work involved to achieve a result with it.
>>
>>9116449
RStudio is the best ide. Prove me wrong.
>>
>>9122534
Relative to Rgui, noticeable response lag even for incredibly simple operations like addition, and very slow startup times.
>>
>>9122236
You should check out the swirl package in R. That's what I'm using to learn R, and it goes through some stat concepts as well.

To learn the basics of stat, check out www.burkeyacademy.com. It's run by an economics professor, but it has a decent amount of stat courses, as well as a little bit of R. Also, if you're interested in econometrics, there are a bunch of those courses as well.
>>
>>9122463
Same here bruh.
>>
>>9122142
so is my dick in ur mom s ass
>>
>>9118617
couldn't tell you much, but my tutor who's teaching our data analysis class uses python exclusively for his data contracting work and gave us the option to learn using that, even though the university curriculum tells us to learn SAS. My team chose python. It's tough and less intuitive but it's free and doesn't require the funds of a large company to use.
>>
File: 1500369260607.gif (4MB, 280x358px)
>>9122616
the best post on /sci/
>>
>>9116449
God I hate R. it has the most autistic community of any language ever, and I'm including lispers and haskellers in that.

Even when I can't find a python library to do something (most recently to find linear combinations of time series that are stationary) I'd rather use rpy2 and run the R code in python than open up an R IDE.

Python also has its share of shit, like non existent documentation for 90% of what you want to do, and hundreds of abandoned libraries that force you to use outdated versions of other libraries and get them to run, and bugs in major modules like pandas that have been reported since 2014 but are still not fixed.

Both are just shit languages actually. But python is by far the less shit one.
>>
I've been waiting for this thread all my life
>>
>>9122607
swirl is literally brainlet tier. Takes you through the basics but nothing else.
>>
>>9122757
it's not even really good for teaching, just as a refresher if you've been away from the language for a while
>>
File: what's the problem.png (8KB, 225x235px)
Use a real language, OP
>>
>>9123133
Is it french?
>>
>>9123141
no
>>
>>9116709
I joined a research project where I'll have to use R before even taking statistics. If what you're saying is true I'm screwed.
>>
>>9116449
>MUH TYPES
>>>/g/
>>
>>9123133
You too. Keep your autism in >>>/g/.
>>
>>9116449
R is designed as a vector-based language. It also works better as a functional language. It's easier to implement stats-based packages like data clustering or machine learning.

The problem is that it's slower for the sake of readability
This is a 4chan archive - all of the content originated from that site.
This means that RandomArchive shows their content, archived.