The Deep Learning Framework Wars

Within the realm of data science, deep learning frameworks are predominantly delivered via software found in the Python ecosystem. When looking at the options in the space, it may appear to some as a battle for supremacy, but the reality is that, for a variety of reasons, people simply have their favorites. Calling this a “war” is perhaps a bit overdramatic.

Around 2012, deep learning became a major point of interest in the machine learning and data science space. Around 2016, numerous frameworks became available, kicking off this “war” of deep learning frameworks. Some came out of universities, others from research labs within corporations. In those early days, lots of framework names would get thrown around, including Keras, Caffe, MXNet, PyTorch, TensorFlow (TF), PaddlePaddle, and CNTK (now called Microsoft Cognitive Toolkit).

The maturity and adoption of these frameworks are quite interesting from a political perspective. PaddlePaddle hasn’t really caught on outside of China, but is still in active development. MXNet hasn’t really caught on outside of Amazon. TensorFlow came out of Google and was considered very verbose and difficult to use, but Keras was integrated directly into TensorFlow as its high-level API. This API was considered easy to use and really helped to drive adoption of TensorFlow.

PyTorch came out of Facebook and seemed popular among researchers.

Now, this is not to say people are not happily using these frameworks outside of the political boundaries I’ve laid out. The one certain fact in this space is that no matter how stable the selections look at any point, be assured more change will occur.

Taking an honest look at the levels of adoption of these deep learning frameworks, as well as of the libraries adjacent to them, we can observe some interesting trends.

First is PyTorch, with its tremendous following and mindshare. If you look at the metrics alone it might be easy to miss, but PyTorch is quite possibly the most used and talked about deep learning framework out there. The numbers will lead you to see TensorFlow as the top framework, but people are talking most about PyTorch. The community around PyTorch is significant and very active, and the use cases show that PyTorch is doing real work. One of the names that built a tremendous following is Hugging Face, which built its Transformers library on PyTorch and has attracted many users to its natural language processing (NLP) libraries. OpenAI is another name that has come up fast with DALL-E 2, its AI for creating images from text. They put all their eggs in the PyTorch basket.

Second is TensorFlow. It is used for many of the same use cases as PyTorch, and it is easy to find it attached to speech recognition, object detection, and natural language processing. Google is the benefactor of TensorFlow, and this is both good and bad. The good is that Google puts the time into making it perform well. The bad is that Google is more focused on making its infrastructure meet its own needs than those of the community. Again, this isn’t to say that this is always the case, or that you should run away. It is simply a sentiment that can be found in the community, and one that has prompted many to move to PyTorch.

Software supporting these workflows is as important as the frameworks themselves. There is a plethora of these libraries, but an important one to call out is Numba. This library is far more generalized than the deep learning frameworks. It is a just-in-time (JIT) compiler that translates a subset of Python and NumPy code into fast machine code, which is a way to optimize mathematical workloads for native hardware. Basically, a single line of code annotating an algorithm can produce speed approaching that of code written in a low-level language, all without the need to leave the comfort and convenience of the Python language.
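To make the “single line of code” idea concrete, here is a minimal sketch, assuming Numba is installed. The function name sum_of_squares is hypothetical and chosen only for illustration, and the fallback decorator is an assumption added so the snippet still runs as ordinary Python where Numba is missing.

```python
try:
    # With Numba available, @njit compiles the function to machine code
    # the first time it is called.
    from numba import njit
except ImportError:
    # Assumption for portability: without Numba, fall back to a no-op
    # decorator so the function runs as plain (uncompiled) Python.
    def njit(func):
        return func


@njit  # the single annotating line
def sum_of_squares(n):
    """Add up i*i for i in 0..n-1 using an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(10))  # 285
```

The loop body is unchanged Python; only the decorator differs from an uncompiled version. Any actual speedup depends on the workload and hardware, so it is worth timing both variants on your own code.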

Finally, the last library to discuss here is JAX. It is very young and exceedingly immature compared to the others mentioned. JAX is often conflated with the deep learning frameworks, but the reality is that it looks a lot more like Numba than it does like PyTorch or TensorFlow. The real claim to fame for JAX right now is its ability to provide automatic differentiation, which is a crucial step in many scientific computing endeavors.
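As a rough illustration of automatic differentiation, the sketch below assumes JAX is installed. The function f is a made-up example, not something from any particular project; jax.grad and jax.jit are the library's own transformations.

```python
import jax


def f(x):
    """A hypothetical scalar function: f(x) = x^3 + 2x."""
    return x ** 3 + 2.0 * x


# jax.grad returns a new function that computes df/dx = 3x^2 + 2.
df = jax.grad(f)

# jax.jit compiles that derivative, which is the Numba-like side of JAX.
fast_df = jax.jit(df)

print(df(3.0))       # derivative at x = 3, analytically 3*9 + 2 = 29
print(fast_df(3.0))  # same value from the compiled version
```

No derivative was written by hand here; JAX derives it from the Python definition of f, which is why the library appeals to scientific computing users as much as to deep learning practitioners.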

We don’t really have a war on our hands, and there are most certainly a lot of tools at our disposal to solve complex problems. If you aren’t in the thick of it, then sit back and grab some popcorn, as this will be a very interesting space to watch over the coming years.

