It is widely used for teaching, research, and industrial applications, contains many built-in tools for standard machine learning tasks, and gives transparent access to well-known toolboxes such as scikit-learn, R, and Deeplearning4j.

Machine Learning without Programming

Weka can be used to build machine learning pipelines, train classifiers, and run evaluations without having to write a single line of code:

Open a dataset

First, we open the dataset on which we would like to evaluate a classifier.

Choose a classifier

Second, we select a learning algorithm to use, e.g., the J48 classifier, which learns decision trees.

Evaluate predictive accuracy

Finally, we run a 10-fold cross-validation evaluation and obtain an estimate of predictive performance.

Programmers can also implement this pipeline using Weka's Java API.
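The three steps above (open a dataset, choose J48, run 10-fold cross-validation) could be written against the Java API roughly as follows. This is a minimal sketch, not an official listing. The class name `WekaPipelineSketch` and the helper `crossValidateJ48` are hypothetical names introduced for illustration. The dataset path is taken from the command line, the class attribute is assumed to be the last one when none is set, the random seed of 1 is arbitrary, and `weka.jar` is assumed to be on the classpath.

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Hypothetical class name; compiling and running requires weka.jar on the classpath.
class WekaPipelineSketch {

    // Cross-validates a J48 decision tree on the given data and returns the statistics.
    // The data must already have its class attribute set.
    static Evaluation crossValidateJ48(Instances data, int folds, long seed) throws Exception {
        J48 tree = new J48();                    // step 2: choose a classifier
        Evaluation eval = new Evaluation(data);  // collects the evaluation statistics
        eval.crossValidateModel(tree, data, folds, new Random(seed)); // step 3: evaluate
        return eval;
    }

    public static void main(String[] args) throws Exception {
        // Step 1: open the dataset; the path is supplied by the caller.
        Instances data = DataSource.read(args[0]);

        // Assumption: if the file does not mark a class attribute, use the last one.
        if (data.classIndex() == -1) {
            data.setClassIndex(data.numAttributes() - 1);
        }

        // 10 folds as in the walkthrough above; the seed value is arbitrary.
        Evaluation eval = crossValidateJ48(data, 10, 1L);
        System.out.println(eval.toSummaryString());
    }
}
```

The summary printed at the end reports, among other statistics, the number and percentage of correctly classified instances, which is the estimate of predictive performance referred to in the last step.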

Deep Learning with WEKA

WekaDeeplearning4j is a deep learning package for Weka. Deep neural networks, including convolutional networks and recurrent networks, can be trained directly from Weka's graphical user interfaces, providing state-of-the-art methods for tasks such as image and text classification.

WEKA Interoperability

WEKA can be integrated with popular data science tools such as R, Python, and Spark.

R

Weka models can be built, evaluated, and used in R through the RWeka package for R; conversely, R algorithms and visualization tools can be invoked from Weka using the RPlugin package for Weka.

Python

Weka's functionality can be accessed from Python using the Python Weka Wrapper. Conversely, Python toolkits such as scikit-learn can be used from Weka.

Spark

For running Weka-based algorithms on truly large datasets, the distributed Weka for Spark package is available. For example, it makes it possible to train any Weka classifier in Spark.