Data Science with JavaScript

publishedabout 1 month ago
2 min read

Hi there, since the beginning of this year we've been exploring how far we can take Data Science with JavaScript. As part of this journey, we started hal9.ai, an integrated environment to help us be more productive when analyzing data with JavaScript.

JavaScript Visualizations built with Plot.js and Hal9
JavaScript Visualizations built with Plot.js and Hal9

We want to ask for your feedback, but more importantly, we want to use this post to share what we've learned so far:

  1. Visualizations: JavaScript is great at visualizing interactive data, this is probably obvious but worth mentioning nonetheless. Some of the highlights here, D3.js is still a great library to perform visualizations; however, D3.js is really low level -- Kinda like TensorFlow, not Keras. We actually went to create our own charting library to combine the flexibility of D3 with the ease-of-use of other libraries like Plotly; just to find out later on that Plot.js got launched as an amazing library that builds on top of D3. So we ended up integrating Plot.js as our recommended charting library.
  2. Transformations: We found out that JavaScript in combination with D3.js has a pretty decent set of data import and transformation functions; however, it comes nowhere near to Pandas or dplyr. After shopping around, we found out about Tidy.js, loved it, and adopted it. The combination of Tidy.js and D3.js and Plot.js is absolutely amazing for visualizations and data wrangling with small datasets, say 10-100K rows. We were very happy with this for a while; however, once you moved away from visualizations into data analysis, we found out 100K rows is quite restrictive, which is also slow when having 1K-10K columns. So we switched gears and started using Arquero.js, a columnar JS library that enabled us to process +1M rows in the browser, decent size for real-world data analysis.
  3. Modeling: We are currently exploring this space so our findings are not final, but let us share what we've found so far. TensorFlow.js is absolutely amazing, it provides a native port from TensorFlow to JavaScript with support for CPU, WebGL, WebAssembly and NodeJS backends. We were able to write an LSTM to do time series prediction, so far so good. However, we started hitting issues when we started to do simpler models, like a linear regression. We tried Regression.js but we found it falls short, so we wrote our own script to handle single-variable regressions using TF.js and gradient descent. It actually worked quite well but exposed a flaw in this approach; basically, there is a lot of work to be done to bring many models into the web. We also found out Arquero.js does not play nicely with TF.js since well, Arquero.js does not support tensors; so we went on to explore Danfo.js, which integrates great with TF.js but we found out it's hard to scale transformations to +1M rows and found other rough edges. Since then, well, we started exploring Pyodide and perhaps contributing to Danfo.js, or perhaps involving more server-side compute with NodeJS, but that will be a story for another time.

So net-net, we are still super excited about exploring Data Science, Data Engineering, Visualization and Artificial Intelligence with JavaScript; but realistically, it is going to take a few years for this to mature.

In the meantime, we think Data Science with JavaScript shines with smaller datasets and interactive visualizations; which we believe Hal9 can help you be productive at. That said, we do believe that motivated JavaScript users can help unblock themselves by adding new functionality and contributing back libraries to NPM or components to our open source project, please do reach out in Hal9's GitHub repo if you wanna lend a hand!

Alright, so call to action? Please head to hal9.ai and give it a shot! We would love to hear where you think this could be useful, what features we are missing, and any feedback you may have.

To keep in touch, please subscribe to our weekly email at news.hal9.ai, contact us at info@hal9.ai, or follow us on Twitter as @hal9.ai

Thanks for reading along!


I respect your privacy. Unsubscribe at any time