Scala, Apache Spark and Deeplearning4j

Scala programmers seeking to build machine learning solutions can use Deeplearning4j’s Scala API, ScalNet, or work directly with the Java framework through its Builder pattern.
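As a rough illustration of the second option, the sketch below configures a small two-layer network from Scala by chaining the Java builders. It assumes the `deeplearning4j-core` and an ND4J backend artifact are on the classpath; the layer sizes, the object name `BuilderSketch` and the method name `buildConf` are made up for the example, and builder method names can differ between Deeplearning4j releases.

```scala
import org.deeplearning4j.nn.conf.{MultiLayerConfiguration, NeuralNetConfiguration}
import org.deeplearning4j.nn.conf.layers.{DenseLayer, OutputLayer}
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork
import org.nd4j.linalg.activations.Activation
import org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction

// Hypothetical example object; not part of Deeplearning4j.
object BuilderSketch {

  // Chain the Java builders from Scala: 4 inputs -> 8 hidden units -> 3 classes.
  def buildConf(): MultiLayerConfiguration =
    new NeuralNetConfiguration.Builder()
      .list()
      .layer(0, new DenseLayer.Builder()
        .nIn(4).nOut(8)
        .activation(Activation.RELU)
        .build())
      .layer(1, new OutputLayer.Builder(LossFunction.NEGATIVELOGLIKELIHOOD)
        .nIn(8).nOut(3)
        .activation(Activation.SOFTMAX)
        .build())
      .build()

  def main(args: Array[String]): Unit = {
    // Turn the configuration into a network and allocate its parameters.
    val model = new MultiLayerNetwork(buildConf())
    model.init()
    println(s"parameters: ${model.numParams()}")
  }
}
```

Each `Builder` call returns the builder itself, so the whole configuration reads as one expression; no Scala-specific wrapper is needed to use it.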

Skymind’s numerical computing library, ND4J (n-dimensional arrays for the JVM), comes with a Scala API, ND4S. Our full walkthrough of Deeplearning4j’s Apache Spark integration is here. Our examples include a number of tutorials that use Scala notebooks with Zeppelin.

Scala

Scala is one of the most exciting languages to be created in the 21st century. It is a multi-paradigm language that fully supports functional, object-oriented, imperative and concurrent programming. It also has a strong type system, and from our point of view, strong typing is a convenient form of self-documenting code.
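A small sketch of what that mix can look like in practice, using only the Scala standard library. The `Layer`, `Dense` and `Dropout` types and the `ParadigmDemo` object are hypothetical names invented for this example, and it covers the object-oriented, functional and imperative styles but not concurrency.

```scala
// Object-oriented: a sealed hierarchy whose cases the compiler knows in full.
sealed trait Layer { def units: Int }
final case class Dense(units: Int) extends Layer
final case class Dropout(units: Int, rate: Double) extends Layer

object ParadigmDemo {

  // Functional: a pure function. The signature alone documents input and output.
  def totalUnits(layers: List[Layer]): Int =
    layers.map(_.units).sum

  // Imperative: the same result with a mutable accumulator and a loop.
  def totalUnitsLoop(layers: List[Layer]): Int = {
    var total = 0
    for (layer <- layers) total += layer.units
    total
  }

  // Pattern matching on the sealed trait: the compiler can warn about a missing case.
  def describe(layer: Layer): String = layer match {
    case Dense(n)      => s"dense($n)"
    case Dropout(n, r) => s"dropout($n, rate=$r)"
  }

  def main(args: Array[String]): Unit = {
    val layers = List(Dense(4), Dropout(8, 0.5))
    println(layers.map(describe).mkString(" -> "))
    println(totalUnits(layers))
  }
}
```

The type annotations do the documenting here: a reader can tell what `totalUnits` accepts and returns without opening its body.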

Scala runs on the JVM and has access to the riches of the Java ecosystem, but it is less verbose than Java. As we use it with ND4J, its syntax is strikingly similar to Python, a language that many data scientists are comfortable with. Like Python, Scala makes programmers happy, but like Java, it is quite fast.

Finally, Apache Spark is written in Scala, and any library that purports to work on distributed runtimes should at the very least be able to interface with Spark. Deeplearning4j and ND4J go a step further: they run inside a Spark cluster and offer Scala APIs called ScalNet and ND4S.

We believe Scala’s many strengths will lead it to dominate numerical computing, as well as deep learning. We think that will happen on Spark. And we have tried to build the tools to make it happen now.

Apache Spark

Deeplearning4j depends on Apache Spark for fast ETL. Many machine-learning tools also rely on Spark for the computation itself, but this is quite inefficient and slows down neural net training. The trick to using Apache Spark well is to push the computation down to a numerical computing library like ND4J and its underlying C++ code.
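The idea in the paragraph above can be sketched without any cluster at all. In the example below, plain Scala collections stand in for Spark partitions, and `BatchKernel.scoreBatch` is a hypothetical stand-in for a native numerical routine such as an ND4J matrix operation; neither name comes from Deeplearning4j. The point is only the shape of the code: gather a partition's records into one batch and hand the whole batch to the kernel in a single call, which is roughly what the body of an `RDD.mapPartitions` function would do on Spark.

```scala
// Hypothetical stand-in for a native numerical kernel (for example an ND4J op).
object BatchKernel {
  var calls = 0 // counts kernel invocations, for illustration only

  // Dot product of every row with the weight vector, computed in one call.
  def scoreBatch(rows: Array[Array[Double]], weights: Array[Double]): Array[Double] = {
    calls += 1
    rows.map(row => row.zip(weights).map { case (x, w) => x * w }.sum)
  }
}

object PartitionBatching {

  // Mimics the body of a mapPartitions function: materialize the partition
  // as one batch, then call the kernel once instead of once per record.
  def scorePartition(records: Iterator[Array[Double]],
                     weights: Array[Double]): Iterator[Double] = {
    val batch = records.toArray
    if (batch.isEmpty) Iterator.empty
    else BatchKernel.scoreBatch(batch, weights).iterator
  }

  def main(args: Array[String]): Unit = {
    val weights = Array(0.5, -1.0)
    // Two "partitions" holding three records in total.
    val partitions = Seq(
      Seq(Array(1.0, 2.0), Array(3.0, 4.0)),
      Seq(Array(5.0, 6.0))
    )
    val scores = partitions.flatMap(p => scorePartition(p.iterator, weights))
    println(scores.mkString(", "))
    println(s"kernel calls: ${BatchKernel.calls}")
  }
}
```

With this split, the distributed layer only moves and groups data, while the arithmetic happens in one place that can be swapped for an optimized native implementation.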

A non-exhaustive list of organizations using Scala:

  • Airbnb
  • Amazon
  • Apple
  • Ask.com
  • AT&T
  • Autodesk
  • Bank of America
  • Bloomberg
  • Credit Suisse
  • eBay
  • Foursquare
  • (The) Guardian
  • IBM
  • Klout
  • LinkedIn
  • NASA
  • Netflix
  • Precog
  • Siemens
  • Sony
  • Twitter
  • Tumblr
  • UBS
  • (The) Weather Channel
  • Xerox
  • Yammer