We are excited to announce the second formal summer internship program at RStudio. The goal of this program is to enable RStudio employees to collaborate with students to do work that will help both RStudio users and the broader R community, and help ensure that the community of R developers is as diverse as its community of users. Over the course of the internship, you will work with experienced data scientists, software developers, and educators to create and share new tools and ideas.

The internship pays approximately $12,000 USD (paid hourly), lasts 10 to 12 weeks, and will start around June 1 (depending on your availability). Applications are open now and close on February 22nd. To qualify, you must currently be a student (broadly construed - if you think you’re a student, you probably qualify) and have some experience writing code in R and using Git and GitHub. To demonstrate these skills, your application needs to include a link to a package, Shiny app, or data analysis repository on GitHub. It’s OK if you create something specifically for this application: we just want to know that you’re already familiar with the mechanics of collaborative development in R.

RStudio is a geographically distributed team, which means you can be based anywhere in the United States (we hope to expand the program to support interns in other countries next year). Unless you are based in Boston or Seattle, you will be working 100% remotely, though you will meet with your mentor regularly online, and we will pay for you to travel to one face-to-face work sprint with them.

We are recruiting interns for the following projects:

Calibrated Peer Review - Prototype tools for running experiments to test whether calibrated peer review is a useful and feasible feedback strategy in introductory data science classes and industry workshops. (Mine Çetinkaya-Rundel)

Tidy Blocks - Prototype and evaluate a block-based version of the tidyverse so that young students can do simple analysis using an interface like Scratch. (Greg Wilson)

Data Science Training for Software Engineers - Develop course materials to teach basic data analysis to programmers using software engineering problems and data sets. (Greg Wilson)

Tidy Practice - Develop projects that let learners practice tidyverse (or other) skills using interesting real-world data. (Alison Hill)

Teaching and Learning with RStudio - Create a one-stop guide to teaching with RStudio, similar to Teaching and Learning with Jupyter (https://jupyter4edu.github.io/jupyter-edu-book/). (Alison Hill)

Grader Enhancements - grader works with learnr tutorials to grade student code. This internship will extend this ambitious project so that grader can identify students’ exact mistakes and help them do better. (Garrett Grolemund)

Object Scrubbers - Many R objects contain elements that could be recreated on demand, and these elements can make objects very large when they are built from large data sets. In addition, terms, formulas, and other objects can carry the entire global environment with them when they are saved. This internship would help write a set of methods that scrub different types of objects to reduce their size on disk. (Max Kuhn and Davis Vaughan)

Production Testing Tools for Data Science Pipelines - This project will build on “applicability domain” methods from computational chemistry to create functions that can be included in a dplyr pipeline to perform statistical checks on data in production. (Max Kuhn)

Shiny Enhancements - There are several Shiny and Shiny-related projects available, depending on the intern’s interests and skill set. Possible topics include: Shiny UI enhancements, removing performance bottlenecks by rewriting code in C and C++, fixing bugs, and creating a set of higher-order reactives for more sophisticated reactive programming. (Barret Schloerke)

ggplot2 Enhancements - Contribute to ggplot2 or an associated package (like scales). You’ll write R code for graphics, but mostly you’ll learn the challenges of managing a large, popular open source project, including the care needed to avoid breaking changes and the active gardening of issues. Your work will impact the millions of people who use ggplot2. (Hadley Wickham)

R Markdown Enhancements - R Markdown is a cornerstone product of RStudio used by millions to create documents in their own publishing pipelines. The code base has grown organically over several years; the goal of this project is to refactor it. This involves tidying up inconsistencies in formatting, adding a comprehensive test suite, and improving the consistency and coverage of the documentation. (Rich Iannone)

Apply now! Application deadline is February 22nd.

RStudio is committed to being a diverse and inclusive workplace. We encourage applicants of different backgrounds, cultures, genders, experiences, abilities and perspectives to apply. All qualified applicants will receive equal consideration without regard to race, color, national origin, religion, sexual orientation, gender, gender identity, age, or physical disability. However, applicants must legally be able to work in the United States.