
I'm curious about the people who use R at the big tech companies you've worked at. Were the R users people who had just come out of school and were still working in their academic dev environment before weaning themselves off it?

I always found that was the group who used R - kind of a use-what-you're-used-to approach until it gets out of step with the rest of the workflow.

I'd also say that the amount of R I see is far less than Python.



So, (speaking as someone who started with R and now predominantly writes Python), I think there's a bunch of things going on here.

1. R is 100% better for analytics work and statistical modelling. There's just no contest.

2. Python is much, much better for getting data (APIs, scraping, etc.) and dealing with non-table-like data. Again, there's basically no contest here.

3. Software engineers hate R (in most cases), which means that it's easier to hand over work for production in Python.

This leads to a situation where it looks like most of the prod-level work is being done in Python, but if you look under the covers you'll discover that most prototyping/analysis/exploration is done in R and then ported to Python if it works.

Like, Python is a great language for lots of things, but it's pretty terrible for exploratory DS work (pandas is like the worst features of base R and base Python mashed together in an unholy hybrid).

There's also the fact that all the NN stuff is predominantly Python, so lots of companies believe that they need Python people, which reinforces the stereotype.

And finally, while I love R, Python has more guardrails, and it's harder to make an unmaintainable mess with it (relative to R). Particularly when people use all the various lazy evaluation packages that the tidyverse has used over the past decade (I once maintained a codebase that used all of these in different places, it was not a fun experience).


One of the better comments in this thread. I would only qualify that different levels of ability mediate much of the "how hard is it to make an unmaintainable mess" dimension. dplyr/tidy code can be pasta, as can pandas, and there is really a whole new level of that given LLM-generated nonsense edited/tweaked by novices masquerading as seniors.

Apropos this idea of a VS Code competitor, I wish they would spend more effort on existing products. I find Quarto frustratingly buggy and meanwhile see no reason to move my workflow from VS Code to this new thing. YMMV.


> I would only qualify that different levels of ability mediate much of the "how hard is it to make an unmaintainable mess" dimension

Oh definitely, but at least Python's stdlib is relatively consistent, which helps packages be a little more so.

My favourite example is t.test, which is not a t method for the test class, unlike summary.lm, which really is a summary method for the lm class.
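
To make the naming problem concrete, here is a toy Python model of dot-based dispatch. It is only a sketch of the idea, not R's actual lookup: `registry`, `register` and `dispatch` are hypothetical helpers invented for illustration, and the "objects" are plain dicts carrying a class vector.

```python
# Toy model: a generic finds its method by looking up "<generic>.<class>" by name.
# All names here (registry, register, dispatch) are made up for illustration.
registry = {}

def register(name):
    """Store a function under a dotted name such as 'summary.lm'."""
    def wrap(fn):
        registry[name] = fn
        return fn
    return wrap

def dispatch(generic, obj):
    """Try each class of the object in order and call the first match."""
    for cls in obj["class"]:
        fn = registry.get(f"{generic}.{cls}")
        if fn is not None:
            return fn(obj)
    raise TypeError(f"no {generic} method for {obj['class']}")

@register("summary.lm")
def summary_lm(obj):
    return "summary of a linear model"

@register("t.test")
def t_test(obj):
    return "Student's t-test, not a transpose"

model = {"class": ["lm"]}
odd = {"class": ["test"]}

print(dispatch("summary", model))  # intended: summary method for class lm
print(dispatch("t", odd))          # accidental: "t.test" reads as t for class test
```

Nothing in the name alone distinguishes the two cases, which is the inconsistency being complained about.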

And there's like 4 different styles of function naming in base & stats alone.

Python has problems (for god's sake, why isn't len a method?) but it's a little more consistent.
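
For anyone who hasn't hit the len thing: the built-in function delegates to a `__len__` method on the object, so the method exists, you just don't normally call it directly. A minimal sketch, where `Basket` is a made-up class for illustration:

```python
class Basket:
    """Hypothetical container used only to illustrate len()."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # len(basket) ends up calling this method.
        return len(self.items)

b = Basket(["a", "b", "c"])

print(len(b))       # the idiomatic spelling
print(b.__len__())  # same value, but nobody writes it this way
```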

I used to think that R was responsible for a lot more of the mess than I now do, having seen the same kind of DS code (and I am a DS) written in both Python and R.

And it would be sweet if R had a pytest equivalent. If I never have to write self.assertEqual again, it'll be too soon.
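
For comparison, here is roughly the same check written both ways. This is just a sketch: `mean` is a stand-in function invented for the example, and the pytest-style test is called directly so the snippet runs without pytest installed.

```python
import unittest

def mean(xs):
    """Stand-in function under test (hypothetical)."""
    return sum(xs) / len(xs)

# unittest style: a TestCase subclass and an assertion method.
class TestMean(unittest.TestCase):
    def test_mean(self):
        self.assertEqual(mean([1, 2, 3]), 2)

# pytest style: a plain function and a bare assert.
def test_mean():
    assert mean([1, 2, 3]) == 2

# Run the unittest version explicitly, then call the plain function directly.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMean)
result = unittest.TextTestRunner(verbosity=0).run(suite)
test_mean()
```

Under pytest itself you would drop the last three lines and let the runner collect `test_mean` on its own.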



