As of this writing, two deep learning frameworks are widely used in the Python community: TensorFlow and PyTorch. TensorFlow, together with its high-level API Keras, has been usable from R since 2017, via the tensorflow and keras packages. Today, we are thrilled to announce that now, you can use Torch natively from R!
This post addresses three questions:
- What is deep learning, and why might I care?
- What’s the difference between
- How can I participate?
If you are already familiar with deep learning – or all you can think right now is “show me some code” – you might want to head directly over to the more technical introduction on the AI blog. Otherwise, you may find it more useful to hear about the context first, and then play with the step-by-step example in that complementary post.
What is deep learning, and why might I care?
If you’re a data scientist, and your data normally comes in tabular, mostly-numerical form, a toolbox of linear and non-linear methods like those presented in James et al.’s Introduction to Statistical Learning may be all you need. This holds even more strongly if the number of data points is limited, as tends to be the case in some academic fields, such as anthropology or ethnology. In this case, Bayesian modeling, as taught by Richard McElreath’s Statistical Rethinking, may be the best approach. Carrying the argument to the extreme: Yes, we can construct deep learning models to predict penguin species based on biometric attributes, and doing this may be very useful in teaching, but this type of task is not really where deep learning shines.
In contrast, deep learning has seen its greatest successes when there are lots of data of a type that is often (misleadingly) called “unstructured” – images, text, heterogeneous data resisting unification. Over the last decade, public triumphs have spread from image classification and related tasks, like segmentation and detection (important in many sciences), to natural language processing (NLP); prominent examples are translation, summarization, and dialogue generation. Beyond these areas of benchmark datasets and official, academically organized competitions, deep learning is pervasively employed in generative art, recommendation systems, and probabilistic modeling. Needless to say, current research is working to expand its limits even more, striving to integrate capabilities for e.g. concept learning or causal inference.
Many readers are likely to work in a field that could benefit from deep learning. But even if you don’t, learning about how a technology works yields power, power to look behind appearances and make up your own mind and decisions.
What’s the difference between
In the Python world, as of 2020, which framework you end up using for a project may be largely a matter of chance and context. (Admittedly, to say so takes the fun out of “TensorFlow vs. PyTorch” debates, but that’s no different from other popular “comparison games”. Take vim vs. emacs, for example. How many people, among those who use one of them preferentially, have come to do so because “that’s what I learned first” or “that’s what was used in my first company”?).
Not too long ago, there was a big difference, though. Before the introduction of TensorFlow 2 (the current release is 2.3), TensorFlow code was compiled to a static graph, and raw TensorFlow code was hard to write. Many users didn’t have to write low-level code, however: The high-level API Keras provided concise, declarative idioms to define, train, and evaluate a neural network. On the other hand, Keras did not, at that time, offer a way to easily customize the training process. Ease of customization, then, used to be PyTorch’s competitive advantage, relevant to researchers in particular. On the other hand, PyTorch did not, initially, excel in production and deployment facilities. Historically, thus, the respective strengths used to be seen as ease of experimentation on the one side, and production readiness on the other.
Today, however, with TensorFlow having become more flexible and PyTorch being increasingly employed in production settings, the traditional dichotomy has weakened. For the R user, this means that practical considerations are likely to prevail.
One such practical consideration that, for some users, may be of
tremendous importance, is the following.
based on reticulate, that
helpful genie which lets you use Python packages seamlessly from R. In
other words, they do not replace Python TensorFlow/Keras; instead,
they wrap its functionality and in many cases, add syntactic sugar,
resulting in more R-like, aestethically-pleasing (to the R user) code.
torch is different. It is built directly on
PyTorch’s C++ backend. There is no dependency on Python, resulting in a
leaner software stack and more straightforward installation. This should
make a huge difference, especially in environments where users have no
control over, or are not allowed to modify, the software their
Otherwise, at the current point in time, maturity of the ecosystem (on
the R side) naturally constitutes a major difference. As of this
writing, a lot more functionality – as well as documentation – is
available in the
tensorflow ecosystem than in the
torch one. But
time doesn’t stand still, and we’ll get to that in a second.
To wrap up, let’s quickly mention another aspect, to be explained in
more detail in a dedicated article. Due to its in-built facility to do
torch can also be used as an R-native,
high-performing, highly-customizable optimization tool, beyond the realm
of deep learning. For now though, back to our hopes for the future.
How can I participate?
As with other projects, we sincerely hope that the R community will find the new functionality useful. But that is not all. We also hope that you, many of you, will take part in the journey. There is not just a whole framework to be built. There is not just a whole “bag of data types” to be taken care of (images, text, audio…), each of which requires their own pre-processing functionality. There is also the expanding, flourishing ecosystem of libraries built on top of PyTorch: PySyft and CrypTen for privacy-preserving machine learning, PyTorch Geometric for deep learning on manifolds, and Pyro for probabilistic programming, to name just a few.
Thanks for reading, and have fun with