An illustration of a data scientist gazing into the distance

Data Scientists Face an Existential Crisis

The term data scientist has always been a bit controversial. William Cleveland coined the term in 2001 to advocate the practical use of statistics in other technical fields and believed that use warranted a new name. Nowadays, professionals sporting a data science title typically hold a Ph.D., possess some detailed domain knowledge, and are either computer science majors who learned statistics or statisticians who learned to program. And while most of us have seen Drew Conway’s diagram showing that combination of skills, I think Joel Grus’ addition of evil intent allows us to better recognize other interesting combinations (see Figure 1 below).

Figure 1: Differing views on what makes a good data scientist

However, new technologies and a difficult economic environment caused by COVID-19 restrictions have raised new questions about the data scientist role, including:

The Reality: Organizations Need Data Scientists More Than Ever

The COVID-19 crisis has created much fear, uncertainty, and doubt (commonly abbreviated as FUD) in all of our lives. However, based on what we see from organizations using our packages and products, RStudio believes that this FUD is unwarranted because:

Serious Data Science Helps Data Scientists Demonstrate Their Value

From RStudio’s point of view, the most convincing arguments we see for a bright data science future come from the work being done by the R and Python data science communities. We hear such stories regularly, and we’ve been collecting examples of this work as part of our Customer Stories program. We’ll add more of those stories in the months to come as we continue to talk to the tens of thousands of data scientists in our community who use our tools every day. It’s their work that inspires us to build and distribute the software we create.

These stories have helped us envision what we believe to be the new role of the successful data scientist. Specifically, we believe that the role of a data scientist is to deliver what we’ve called Serious Data Science. As Conway’s Venn diagram in Figure 1a implies, Serious Data Science draws on some of the best practices found in software development, statistics, and domain expertise to deliver results that are:

Look for a deeper discussion of the ways data scientists can enhance their role using Serious Data Science during the month of June.

Learn More about the Data Scientist’s Role and Serious Data Science

For a real-world view of how data scientists work to solve hard, complex, and valuable problems, watch Pim Bongaerts from the California Academy of Sciences speak about using data science to help save coral reefs.

Eduardo Ariño de la Rubia, Data Science Manager at Facebook, spoke at rstudio::conf 2020 on the role of a data scientist, with an emphasis on how they bring value beyond putting models in production.

We also recommend our prior blog posts in this series: