RStudio 1.3 Released

2020-05-27 Jonathan McPherson
Today we’re excited to announce the general release of RStudio 1.3. This release features many major improvements to the IDE, including: dramatically improved accessibility for sight-impaired users, which also upgrades keyboard navigation, contrast ratios, and visibility for everyone; a real-time spell-checking engine, with suggestions, customizable dictionaries, and a built-in whitelist for common data science terms; extensible, in-IDE tutorials powered by the learnr package; and settings and preferences that are now stored in plain text files you can back up or manage with other tools, and that can also be applied globally to all users on an RStudio Server. Read more →

The Role of the Data Scientist

2020-05-27 Carl Howe, Sean Lopp
A slew of new vendors believe that no-code analytics and visualization tools can replace the role of the traditional data scientist. This brief describes why we believe organizations will demand pro-code data scientists for years to come. Read more →

Driving Real, Lasting Value with Serious Data Science

2020-05-19 Lou Bajuk
Driving lasting value in an organization with data science is critical but difficult. The truth is most projects fail. What’s the answer? Serious Data Science is credible, agile and durable. Read more →

Equipping Your Data Science Team to Work from Home

2020-05-12 Carl Howe
Photo by Djurdjica Boskovic on Unsplash

If your data science team experienced an abrupt transition to working at home, it may be a good time to rethink their development tools. In this post, I’ll talk about why laptop-centric data science gets in the way of strong data science teams and why you should consider deploying development and publishing servers.

Working from Home Has Affected Both People and Data

Like tigers and koalas, we data scientists are fairly solitary creatures. Read more →

sparklyr 1.2: Foreach, Spark 3.0 and Databricks Connect

2020-05-06 Yitao Li
A new version of sparklyr is now available on CRAN! In this sparklyr 1.2 release, the following improvements stand out: a registerDoSpark() method to create a foreach parallel backend powered by Spark, which enables hundreds of existing R packages to run in Spark; support for Databricks Connect, allowing sparklyr to connect to remote Databricks clusters; and improved support for Spark structures when collecting and querying their nested attributes with dplyr. Read more →

Wrangling Unruly Data: The Bane of Every Data Science Team

2020-05-05 Carl Howe
There’s an old saying (at least old in data scientist years) that goes, “90% of data science is data wrangling.” This rings particularly true for data science leaders, who watch their data scientists spend days painstakingly picking apart ossified corporate datasets or arcane Excel spreadsheets. Does data science really have to be this hard? And why can’t they just delegate the job to someone else?

Data Is More Than Just Numbers

The reason that data wrangling is so difficult is that data is more than text and numbers. Read more →

Avoid Irrelevancy and Fire Drills in Data Science Teams

2020-04-28 Sean Lopp, RStudio
Balancing the twin threats of data science development

Data science leaders naturally want to maximize the value their teams deliver to their organization, and that often means helping them navigate between two possible extremes. On the one hand, a team can easily become an expensive R&D department, detached from actual business decisions, slowly chipping away only to end up answering stale questions. On the other hand, teams can be overwhelmed with requests, spending all of their time on labor-intensive, manual fire drills, always creating one more “Just in Time” PowerPoint slide. Read more →

Getting to the Right Question

2020-04-22 Carl Howe, RStudio
The Root Problem: We Don’t All Speak the Same Language

Organizations across the modern business world recognize the critical importance of Data Science for competitive advantage. That recognition has driven Glassdoor to rate Data Scientist as one of the 25 top paying jobs in America in 2020. However, many organizations struggle to put these data scientists’ knowledge to work in their businesses where they can actually have an impact on success. Read more →

RStudio and COVID-19

2020-04-17 Hadley Wickham
A lot of the R community are involved in the response to the COVID-19 pandemic, and we want to help out where we can. Read more →

Effective Visualizations for Credible, Data-Driven Decision Making

2020-04-16 Jason Milnes
Recently, we were joined by the smart folks at Roche and Novartis to present a webinar on effective data visualization. You can watch the recording of the full presentation here. It was the latest installment in a series of webinars highlighting industry leaders in the Pharmaceutical and Life Science spaces who are doing world-changing data science work. They presented many great insights, most of which are relevant to data scientists in every industry, and we wanted to share what we learned. Read more →