Features
Here’s a non-exhaustive list of Deeplearning4j’s features. We’ll be updating it as new nets and tools are added.
Integrations
- Spark
- Hadoop/YARN
- Model Import from Keras (see the sketch just below)
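A saved Keras model can be loaded through the deeplearning4j-modelimport module. A minimal sketch, assuming a Keras Sequential model saved to a hypothetical model.h5:

```java
import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;

public class KerasImportSketch {
    public static void main(String[] args) throws Exception {
        // "model.h5" is a hypothetical file written from Keras with model.save(...)
        MultiLayerNetwork network =
                KerasModelImport.importKerasSequentialModelAndWeights("model.h5");
        System.out.println(network.summary());
    }
}
```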
APIs
- Scala
- Java
Libraries
- ND4J: N-dimensional arrays for the JVM (see the sketch after this list)
- libND4J: Native CPU/GPU operations for ND4J
- DataVec: Data preparation for DL4J
- Deeplearning4j
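ND4J in particular supplies the array type everything else builds on. A minimal sketch of its API (the shapes and values here are arbitrary):

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class Nd4jSketch {
    public static void main(String[] args) {
        INDArray x = Nd4j.rand(3, 4);          // 3x4 matrix of uniform random values
        INDArray gram = x.mmul(x.transpose()); // 3x3 matrix product
        System.out.println(gram.add(1.0));     // element-wise scalar addition
    }
}
```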
Nets
- Restricted Boltzmann machines
- Convolutional nets
- Recursive autoencoders
- Recurrent nets: Long Short-Term Memory (LSTM), including bidirectional LSTMs
- Deep-belief networks
- Denoising and stacked denoising autoencoders
- Deep autoencoders
Since Deeplearning4j is a composable framework, users can arrange shallow nets to create various types of deeper nets. Combining convolutional nets with recurrent nets, for example, is how Google accurately generated captions from images in late 2014.
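As a rough sketch of that composability, the configuration below chains a convolutional layer into an LSTM inside a single MultiLayerNetwork, loosely following the pattern of DL4J's video-classification example. The frame size, layer widths, and preprocessor shapes are hypothetical choices for illustration, not the Google captioning model:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.GravesLSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.conf.preprocessor.CnnToRnnPreProcessor;
import org.deeplearning4j.nn.conf.preprocessor.RnnToCnnPreProcessor;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class ConvPlusRecurrentSketch {
    public static void main(String[] args) {
        int height = 32, width = 32, channels = 1; // hypothetical frame size
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .list()
                // Shallow net #1: a convolutional feature extractor, applied per frame
                .layer(0, new ConvolutionLayer.Builder(5, 5)
                        .nIn(channels).nOut(8).stride(1, 1)
                        .activation(Activation.RELU).build())
                // Shallow net #2: a recurrent layer over the sequence of frame features
                .layer(1, new GravesLSTM.Builder()
                        .nIn(28 * 28 * 8).nOut(64)   // 28x28x8 = flattened conv output per step
                        .activation(Activation.TANH).build())
                .layer(2, new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                        .nIn(64).nOut(4)
                        .activation(Activation.SOFTMAX).build())
                // Preprocessors reshape activations between recurrent and convolutional layouts
                .inputPreProcessor(0, new RnnToCnnPreProcessor(height, width, channels))
                .inputPreProcessor(1, new CnnToRnnPreProcessor(28, 28, 8))
                .build();
        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
    }
}
```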
Tools
DL4J contains the following built-in vectorization tools and algorithms:
- DataVec: Machine Learning Data Pipelines for Vectorization/Tensorization
- Moving-window for images
- Moving-window for text
- Viterbi for sequential classification
- Word2Vec (see the training sketch after this list)
- Bag-of-Words encoding for word count and TF-IDF
- Doc2Vec for Paragraph Vectors
- Constituency parsing
- DeepWalk
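As one concrete example from this list, a Word2Vec model can be trained from a plain-text corpus in a few lines. A minimal sketch, assuming a hypothetical raw_sentences.txt with one sentence per line:

```java
import org.deeplearning4j.models.word2vec.Word2Vec;
import org.deeplearning4j.text.sentenceiterator.BasicLineIterator;
import org.deeplearning4j.text.sentenceiterator.SentenceIterator;
import org.deeplearning4j.text.tokenization.tokenizerfactory.DefaultTokenizerFactory;
import org.deeplearning4j.text.tokenization.tokenizerfactory.TokenizerFactory;

public class Word2VecSketch {
    public static void main(String[] args) throws Exception {
        SentenceIterator iter = new BasicLineIterator("raw_sentences.txt"); // hypothetical corpus
        TokenizerFactory tokenizer = new DefaultTokenizerFactory();

        Word2Vec vec = new Word2Vec.Builder()
                .minWordFrequency(5)   // ignore very rare words
                .layerSize(100)        // dimensionality of the word vectors
                .windowSize(5)         // context window
                .iterate(iter)
                .tokenizerFactory(tokenizer)
                .build();
        vec.fit();

        System.out.println(vec.wordsNearest("day", 10)); // 10 nearest neighbors
    }
}
```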
DL4J supports the following optimization algorithms:
- Stochastic gradient descent
- Stochastic gradient descent with line search
- Conjugate gradient line search (cf. Hinton 2006)
- L-BFGS
Each of these optimization algorithms may be paired with training features (known as ‘updaters’ in DL4J) such as:
- SGD (learning rate only)
- Nesterov's momentum
- Adagrad
- RMSProp
- Adam
- AdaDelta
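Both choices are made on the configuration builder. A minimal sketch pairing stochastic gradient descent with Nesterov's momentum, assuming a DL4J version recent enough to configure updaters as objects (the learning rate and momentum values are arbitrary):

```java
import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.nd4j.linalg.learning.config.Nesterovs;

public class OptimizerUpdaterSketch {
    public static void main(String[] args) {
        NeuralNetConfiguration.Builder builder = new NeuralNetConfiguration.Builder()
                // optimization algorithm: how parameter updates are computed/searched
                .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                // updater: how raw gradients are turned into steps
                .updater(new Nesterovs(0.01, 0.9)); // learning rate 0.01, momentum 0.9
    }
}
```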
Hyperparameters
- Dropout (random omission of feature detectors to prevent overfitting)
- Sparsity (a penalty that encourages sparse activations, useful for sparse/rare inputs)
- Adagrad (feature-specific learning-rate optimization)
- L1 and L2 regularization (weight decay)
- Weight transforms (useful for deep autoencoders)
- Probability distribution manipulation for initial weight generation
- Gradient normalization and clipping
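Most of these are set on the same configuration builder. A minimal sketch with arbitrary values:

```java
import org.deeplearning4j.nn.conf.GradientNormalization;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.weights.WeightInit;

public class HyperparameterSketch {
    public static void main(String[] args) {
        NeuralNetConfiguration.Builder builder = new NeuralNetConfiguration.Builder()
                .dropOut(0.5)                  // dropout; in DL4J the value is the retain probability
                .l1(1e-5).l2(1e-4)             // L1 and L2 regularization (weight decay)
                .weightInit(WeightInit.XAVIER) // distribution used to generate initial weights
                // gradient clipping: cap each gradient element at an absolute value of 1.0
                .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
                .gradientNormalizationThreshold(1.0);
    }
}
```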
Loss/objective functions
- MSE: Mean Squared Error (used for linear regression)
- EXPLL: Exponential Log Likelihood (used for Poisson regression)
- XENT: Cross-Entropy (used for binary classification)
- MCXENT: Multiclass Cross-Entropy
- RMSE_XENT: RMSE Cross-Entropy
- SQUARED_LOSS: Squared Loss
- NEGATIVELOGLIKELIHOOD: Negative Log Likelihood
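A loss function is chosen when building an output layer. A minimal sketch using MSE for a regression output (the layer sizes are arbitrary):

```java
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class LossFunctionSketch {
    public static void main(String[] args) {
        // MSE paired with an identity activation, the usual setup for linear regression
        OutputLayer out = new OutputLayer.Builder(LossFunctions.LossFunction.MSE)
                .nIn(10).nOut(1)
                .activation(Activation.IDENTITY)
                .build();
    }
}
```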
Activation functions
Activation functions are defined in ND4J (in the org.nd4j.linalg.activations.Activation enum):
- ReLU
- Leaky ReLU
- Tanh
- Sigmoid
- Hard Tanh
- Softmax
- Identity
- ELU: Exponential Linear Units
- Softsign
- Softplus
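Each can be selected per layer through the Activation enum. A minimal sketch with arbitrary layer sizes:

```java
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.nd4j.linalg.activations.Activation;

public class ActivationSketch {
    public static void main(String[] args) {
        // A hidden layer using leaky ReLU from the list above
        DenseLayer hidden = new DenseLayer.Builder()
                .nIn(784).nOut(256)
                .activation(Activation.LEAKYRELU)
                .build();
    }
}
```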