Note: The repository for this project may be found here. As per the instructions on Gitter, the license badge has been added to this proposal file as well.
Note 2: The repository for my personal site, set up on GitHub Pages, can be found here. I did have a site earlier, but this was a good opportunity to refresh the content and move it to GitHub Pages :)
For a long time, I've wanted to set aside time to work on a comprehensive, extensible NodeJS wrapper for the PyTorch library. PyTorch is a Python package that enables Tensor computation (like NumPy) with strong GPU acceleration, and deep neural networks built on a tape-based autograd system.
PyTorch has been catching up with TensorFlow in terms of popularity (ref) over the last few years. With features like dynamic computation graphs and the ability to use standard control-flow statements during training, it is becoming the framework of choice for a lot of people in the deep learning space. We're also seeing projects like TensorBoard, initially a library for the TensorFlow ecosystem, becoming PyTorch compatible, and the PyTorch community has really grown over the last year or two.
With the advent of NodeJS, and the rise in popularity of JavaScript in general, I've realised that unlike TensorFlow (ref: TensorFlow.js), PyTorch lacks a JavaScript equivalent which would allow developers and data scientists to deploy PyTorch models in NodeJS environments. As a result, most PyTorch models are (usually) deployed as microservice-style Flask/Django applications, which isn't always the ideal solution, especially when the goal is to deploy a single model from a relatively small codebase. I believe that developing a NodeJS binding for PyTorch would be an extremely interesting and impactful project, because it would let Node developers deploy and host PyTorch-trained models without having to adopt a whole different language or framework (Python with Flask/Django, or even something like TorchServe).
This is a library that I've personally been hoping would crop up for the longest time. Having worked a fair amount over the last year on both training Deep Learning models and putting them into production, I kept hoping this problem would eventually be addressed. I found a few alternatives, but each has its own set of caveats and pitfalls that I would want to mitigate:
- torch-js: This is the closest example to what I envision this project potentially looking like. However, its functionality is limited (things as basic as PyTorch transforms aren't really supported) and the project seems more like a one-off. There is, however, an active fork of this project by arition. I plan on building on this fork to provide some of the underlying functionality for my library, because it would mean that we don't need to worry about running TorchScript in a JS environment ourselves.
- onnx.js: onnx.js lets developers run models exported to ONNX in a JavaScript environment. ONNX is an amazing project, but it suffers from the pitfall that any common interface/protocol faces, namely that it cannot expose the features that make a library like PyTorch distinct.
I think that building a NodeJS wrapper for PyTorch, while maybe a tall ask, could be a beneficial project. Starting with just an execution runtime in NodeJS may be a more feasible approach, allowing developers who have trained PyTorch models in Python to use a NodeJS runtime to put them into production. I think this could also open up a whole world of interesting possibilities, including calling PyTorch models during client-side execution, since JavaScript runs in many forms on both the client side and the server side.
I think this would be the primary use case for such a library anyway, as we've seen with the community response to TensorFlow.js. JavaScript and NodeJS are great for writing web servers, but not always for training Deep Learning models. However, deploying trained models in JS runtimes is something that I feel is worth working on.
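To make the "execution runtime first" idea a little more concrete, here is a rough TypeScript sketch of what an inference-only interface might look like. Every name in it (`Tensor`, `InferenceModule`, `StubLinearModule`, `tensor`) is hypothetical and made up for illustration; none of them come from torch-js, onnx.js, or PyTorch. The stub backend computes a single linear layer in plain JavaScript so that the sketch runs on its own, whereas a real backend would delegate `forward` to a native TorchScript module.

```typescript
// Hypothetical inference-only API sketch. None of these names belong to an
// existing library; they only illustrate the shape such a runtime could take.

/** A minimal dense tensor: flat data plus a shape. */
interface Tensor {
  readonly data: Float32Array;
  readonly shape: readonly number[];
}

/** Anything that can run a forward pass on a tensor. */
interface InferenceModule {
  forward(input: Tensor): Tensor;
}

/** Builds a tensor, checking that the shape matches the number of values. */
function tensor(values: number[], shape: number[]): Tensor {
  const expected = shape.reduce((a, b) => a * b, 1);
  if (expected !== values.length) {
    throw new Error(`shape [${shape}] needs ${expected} values, got ${values.length}`);
  }
  return { data: Float32Array.from(values), shape };
}

/**
 * Stand-in backend: one linear layer y = Wx + b computed in plain JS.
 * Assumption: a real implementation would wrap a natively loaded model instead.
 */
class StubLinearModule implements InferenceModule {
  private readonly weight: number[][]; // [outFeatures][inFeatures]
  private readonly bias: number[];     // [outFeatures]

  constructor(weight: number[][], bias: number[]) {
    this.weight = weight;
    this.bias = bias;
  }

  forward(input: Tensor): Tensor {
    const inFeatures = this.weight[0].length;
    if (input.shape.length !== 1 || input.shape[0] !== inFeatures) {
      throw new Error(`expected a 1-D input of length ${inFeatures}`);
    }
    // Each output element is bias[i] plus the dot product of row i with the input.
    const out = this.weight.map((row, i) =>
      row.reduce((sum, w, j) => sum + w * input.data[j], this.bias[i]),
    );
    return tensor(out, [out.length]);
  }
}

// Usage: calling code only depends on the InferenceModule interface,
// so the stub could later be swapped for a native backend.
const model: InferenceModule = new StubLinearModule([[1, 2], [3, 4]], [0.5, -0.5]);
const output = model.forward(tensor([1, 1], [2]));
console.log(Array.from(output.data)); // prints [ 3.5, 6.5 ]
```

Keeping the surface this small (load a trained model, call `forward`) is the point of the sketch: training stays in Python, and the Node side only needs to know how to feed inputs in and read outputs back.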