
Data science is now a hot area of investment for many organizations. Countless blogs, articles, and analyst reports emphasize that effective data science is critical for competitive advantage, and many business leaders believe that data science is vital for an organization to survive, much less thrive, over the next several years.

However, many data science leaders grapple with an existential crisis for their teams. On the one hand, many vendors and analyst reports emphasize the rise of Citizen Data Scientists, empowered by tools that promise to augment and automate the hard work of data science to automagically answer vital questions, no Data Scientist required. On the other hand, machine learning and deep learning methods in the hands of software engineers, fueled by lots of computational power, answer more and more questions (as long as the problem is well-defined, and there is sufficient data available). Squeezed in between these trends, what is the role of a data scientist?

Even worse, nearly as many blogs and analyst reports emphasize how hard it is to implement data science effectively in an organization. They state emphatically that most analytics and data science projects fail, and that most companies don’t achieve the revenue and profit growth they hoped their data science investments would deliver.

We will dive into the role of a data scientist in more detail in the coming weeks, but here we will focus on this question: Why is getting real, lasting value from data science investments so difficult?

Many data science projects lack credibility and impact over time

In talking to many different organizations implementing data science projects, we have seen many challenges that prevent data science investments from delivering the value they should. These typically fall into three categories:

Andrew Mangano, Data Intelligence Lead at Salesforce, spoke at rstudio::conf 2020 about the importance of delivering useful insights to your stakeholders.

Real-world problems need serious data science

So what’s the answer? And how do you cut through all the hype and confusion?

The reality is that the world is full of hard, vaguely defined problems that are nonetheless valuable to solve. Commodity approaches (whether via augmented analytics for citizen data scientists, or standard machine learning approaches for software engineers) yield commodity answers. Real-world business problems require smart, agile data science teams empowered with the flexibility and breadth of open source languages like R and Python. We know this because tens of thousands of you use our software every day to do amazing things.

To deliver real, lasting value, organizations need to set aside the hype and build on a strong foundation. We recommend adopting a strategy we call Serious Data Science. As shown in Figure 1, Serious Data Science is an approach to data science designed to deliver insights that are:

Serious Data Science is...

Credible
  • Uses widely deployed and trusted tools
  • Includes comprehensive data science capabilities
  • Offers flexibility through the use of code
  • Provides transparency through visualizations and code

Agile
  • Employs existing knowledge and analytic investments
  • Allows rapid development and iteration
  • Scales well for enterprise and production use
  • Empowers your business stakeholders

Durable
  • Provides reusable, reproducible code and results
  • Delivers relevant, up-to-date insights
  • Supports and is supported by a vital open source community
  • Avoids vendor lock-in

Figure 1: Crucial elements of a Serious Data Science platform.

Why you should adopt Serious Data Science

We’ll be writing in detail about these components of Serious Data Science in the weeks to come. But before we get to that, we must address a topic near and dear to every data science leader: the role of the data scientist within the organization. Our post next Tuesday will address how that role is changing in today’s organizations, and why data scientists will need the Serious Data Science framework to continue demonstrating their value in the months and years to come.

Learn more about Serious Data Science

If you’d like to learn more about Serious Data Science, we recommend: