Incremental learning is a specialized category of machine learning techniques, closely related to time series analysis. Classical statistical time series techniques rely on relatively simple computational procedures, whereas incremental learning requires dedicated algorithms to handle the temporal component of the data and extract useful information and trends from it. Statistical time series techniques typically try to remove the non-stationary component of the data; incremental machine learning techniques, by contrast, handle non-stationary data directly, and this is what makes them special. In this post, let us discuss what exactly incremental learning is and what problems are associated with it.
Various data mining techniques can be used to analyze problems in real-world applications and discover their solutions in a scientific manner. Supervised learning techniques use instances that have already been pre-classified in some manner: each instance carries a label that identifies the class to which it belongs.
Classification is a supervised data mining technique that predicts data values using results already known from data gathered in advance. Classification maps data into groups of classes established in advance; in short, it tells us which category each data instance should belong to, thereby simplifying the processing of large amounts of data, such as stream data or data from decision support systems. Data evolves over time, and a rapid increase in its volume is observed.
Classical supervised learning presumes that the dataset contains all the information required for learning the pertinent concepts. However, this assumption has proven impractical for many real-world applications, e.g., spam identification, climate monitoring, or credit card fraud detection. Data is usually received over time in streams of instances and is not at hand from the beginning. Such data conventionally arrives in one of two ways: incremental (instance-by-instance) arrival or batch arrival.
The question, therefore, is how to use all the information available up to a certain time step t to classify new instances arriving at time step t+1. Learning in such cases is referred to as incremental learning. When learning from enormous data that grows in sequential steps is desired, incremental learning (or online) algorithms are the most suitable and preferred choice. An incremental learning algorithm is fed input data as it arrives in sequence; it computes a sequence of hypotheses, each taking the order of arrival into account and describing all the data seen so far. Note that each hypothesis should depend only on the prior hypotheses and the current training data.
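To make this concrete, here is a minimal sketch (my own illustration, with invented names and toy data, not taken from any particular library) of an incremental learner: the model keeps only a small per-class summary, so each update depends solely on the previous hypothesis and the newly arrived instance, never on past data.

```python
# Toy 1-D incremental classifier: per-class running sums are the whole
# "hypothesis", so each update is O(1) and past instances can be discarded.
class IncrementalMeanClassifier:
    def __init__(self):
        self.sums = {}    # label -> running sum of the feature values seen
        self.counts = {}  # label -> number of instances seen for that label

    def learn_one(self, x, y):
        # Refine the summary using only the new instance and the current state.
        self.sums[y] = self.sums.get(y, 0.0) + x
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict_one(self, x):
        # Predict the label whose running mean is closest to x.
        return min(self.sums, key=lambda y: abs(x - self.sums[y] / self.counts[y]))

# Data arrives one instance at a time, in sequence.
model = IncrementalMeanClassifier()
stream = [(1.0, "low"), (1.2, "low"), (9.8, "high"), (10.1, "high")]
for x, y in stream:
    model.learn_one(x, y)
```

After consuming the stream, the model can classify new values (e.g., `model.predict_one(1.1)` gives `"low"`) even though none of the original instances were retained.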
Properties of an Incremental Learning Algorithm:
An incremental learning algorithm should possess all of the properties mentioned below:
1. The algorithm should be able to gain additional knowledge from recent data.
2. To update an already trained classifier, the algorithm should not need access to the original training data.
3. It should preserve the knowledge it has previously learned (no catastrophic forgetting should occur).
4. New data may introduce new concept classes. The algorithm should be able to recognize and accommodate such new classes as they appear over time.
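These properties can be illustrated with a small sketch (class and variable names are invented for this post): a 1-D incremental Gaussian naive Bayes classifier that stores only count, mean, and sum of squared deviations per class, updated with Welford's online algorithm. It never revisits old data (property 2), keeps its old per-class summaries intact (property 3), and an unseen label simply creates a new summary (property 4).

```python
import math

class IncrementalGaussianNB:
    """Per class, store only [count, mean, M2]; update online (Welford)."""
    def __init__(self):
        self.stats = {}  # label -> [count, mean, M2]

    def learn_one(self, x, y):
        # A brand-new label just starts a fresh summary (property 4).
        count, mean, m2 = self.stats.get(y, [0, 0.0, 0.0])
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # Welford's update for sum of squared deviations
        self.stats[y] = [count, mean, m2]

    def _log_likelihood(self, x, label):
        count, mean, m2 = self.stats[label]
        var = max(m2 / count, 1e-3) if count > 1 else 1.0  # floor for tiny classes
        return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

    def predict_one(self, x):
        return max(self.stats, key=lambda label: self._log_likelihood(x, label))

model = IncrementalGaussianNB()
for x, y in [(1.0, "a"), (1.4, "a"), (5.0, "b"), (5.4, "b")]:
    model.learn_one(x, y)
# Later in the stream, a previously unseen class "c" arrives (property 4);
# the old instances for "a" and "b" are long gone (property 2).
for x, y in [(9.0, "c"), (9.4, "c")]:
    model.learn_one(x, y)
```

After class `"c"` appears, the model still classifies values near 1 as `"a"` and values near 5 as `"b"` (property 3), while values near 9 go to the new class.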
This definition points out that learning from stream data needs a classifier that can be incrementally modified to take full advantage of newly arrived data, while simultaneously preserving its performance on older data. This underlines the "stability-plasticity" dilemma, which describes how a learning system can be designed to remain stable and unchanged in the face of irrelevant events, while remaining plastic (i.e., able to change when necessary in order to deal with different situations) with respect to recent and crucial data (e.g., concept change).
Hence, the stability-plasticity dilemma defines a continuum along which incremental learning classifiers can be placed. Batch learning algorithms (i.e., algorithms trained on static data) sit at one end of this continuum, representing stability: they overlook all new data and rely entirely on formerly learned concepts.
At the other end of the continuum lie online learning algorithms, where the model is updated upon observing each new instance and the instance is discarded immediately afterwards. Although batch learning systems occupy the stability end of the continuum, they are fundamentally not incremental in nature: batch learners cannot incorporate any new instances once they have been trained, and thus fail to fulfill property 1 of incremental learning.
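The plasticity end of the continuum is well illustrated by the classic perceptron, sketched here in plain Python (a textbook toy, not a production implementation): each instance triggers at most one weight update and is then thrown away.

```python
# Online perceptron: y is +1 or -1; the weights are updated only on a mistake,
# and each instance is discarded right after its single update.
def perceptron_update(weights, bias, x, y, lr=1.0):
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    if y * activation <= 0:  # mistake (or zero margin): move toward the instance
        weights = [w + lr * y * xi for w, xi in zip(weights, x)]
        bias = bias + lr * y
    return weights, bias

# Stream of 2-D instances, consumed one at a time.
weights, bias = [0.0, 0.0], 0.0
stream = [([2.0, 1.0], 1), ([-1.5, -0.5], -1), ([1.0, 2.0], 1), ([-2.0, -1.0], -1)]
for x, y in stream:
    weights, bias = perceptron_update(weights, bias, x, y)

def predict(x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1
```

Note that the weight vector is the entire "memory" of the learner; nothing about past instances is retained except their cumulative effect on the weights.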
Solution to the Incremental Learning Problem: Ensemble Methods
This constraint is alleviated by building ensembles of batch learners, where fresh batch learners are trained on the fresh data and then combined through a voting scheme. Ensemble techniques (multiple classifier systems) are widespread in machine learning, and particularly so in the incremental setting.
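The chunk-based idea can be sketched as follows (a simplified illustration loosely in the spirit of Learn++-style systems; the class names and the deliberately weak threshold learner are invented for this post). Each incoming batch trains one frozen batch learner, and predictions are a majority vote over all learners trained so far.

```python
from collections import Counter

class MeanThresholdLearner:
    """A deliberately weak batch learner for 1-D, two-class (0/1) data:
    it thresholds at the midpoint of the two class means."""
    def fit(self, xs, ys):
        pos = [x for x, y in zip(xs, ys) if y == 1]
        neg = [x for x, y in zip(xs, ys) if y == 0]
        self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict(self, x):
        return 1 if x >= self.threshold else 0

class VotingEnsemble:
    def __init__(self):
        self.members = []

    def learn_batch(self, xs, ys):
        # Train a fresh batch learner on the new chunk only; old members
        # are never retrained, so old chunks need not be kept around.
        self.members.append(MeanThresholdLearner().fit(xs, ys))

    def predict(self, x):
        votes = Counter(m.predict(x) for m in self.members)
        return votes.most_common(1)[0][0]

ensemble = VotingEnsemble()
ensemble.learn_batch([1.0, 2.0, 8.0, 9.0], [0, 0, 1, 1])  # chunk 1
ensemble.learn_batch([0.5, 1.5, 7.5, 9.5], [0, 0, 1, 1])  # chunk 2
```

The ensemble as a whole is incremental even though every member is a pure batch learner, which is exactly how this approach sidesteps the failure of property 1.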
An ensemble is obtained by merging diverse classifiers. Various differentiating parameters help achieve diversity, which in turn enables each classifier to produce a different decision boundary. Appropriate diversity causes the individual classifiers to make different errors, and a strategic combination of them can then reduce the total error of the entire system.
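This error-reduction claim can be quantified under the (idealized) assumption that the members err independently: if each of n classifiers errs with probability p < 0.5, the majority vote errs only when more than half of them err, a probability that shrinks as n grows (Condorcet's jury theorem). A short numerical check:

```python
from math import comb

def majority_vote_error(n, p):
    """Probability that a majority of n classifiers err, when each errs
    independently with probability p. Use odd n so there are no ties."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )
```

For example, with p = 0.3 a single classifier errs 30% of the time, but a 5-member vote errs about 16.3% of the time, and a 21-member vote far less still. Real classifiers are never fully independent, which is precisely why the diversity-inducing strategies below matter.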
An ensemble can be built up in several ways, such as:
1) Bagging
2) Boosting – AdaBoost is a particularly popular algorithm
3) Stacked generalization
4) Mixture of experts
The need for diversity can be fulfilled through various approaches, such as:
1) Training each classifier on a different chunk of the data
2) Training each classifier with different parameters of a given classifier architecture
3) Training each classifier using a different classifier model
4) The random subspace method (training each classifier on a different subset of the features)
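The last of these can be sketched concretely (all names and the toy nearest-centroid member are invented for illustration): each ensemble member is trained on, and predicts from, its own random subset of feature indices, making the members diverse even when they share the same training data.

```python
import random
from collections import Counter

class NearestCentroidMember:
    """Batch learner restricted to a fixed subset of feature indices."""
    def __init__(self, feature_idx):
        self.feature_idx = feature_idx

    def _project(self, x):
        return [x[i] for i in self.feature_idx]

    def fit(self, xs, ys):
        by_label = {}
        for x, y in zip(xs, ys):
            by_label.setdefault(y, []).append(self._project(x))
        # Per-label centroid, computed only over this member's features.
        self.centroids = {
            y: [sum(col) / len(col) for col in zip(*rows)]
            for y, rows in by_label.items()
        }
        return self

    def predict(self, x):
        p = self._project(x)
        return min(
            self.centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(p, self.centroids[y])),
        )

def random_subspace_ensemble(xs, ys, n_members, subspace_size, seed=0):
    rng = random.Random(seed)
    n_features = len(xs[0])
    members = []
    for _ in range(n_members):
        idx = rng.sample(range(n_features), subspace_size)  # random feature subset
        members.append(NearestCentroidMember(idx).fit(xs, ys))
    return members

def vote(members, x):
    return Counter(m.predict(x) for m in members).most_common(1)[0][0]

# Toy 3-feature data: class "a" near the origin, class "b" near (5, 5, 5).
xs = [[0.0, 0.2, 0.1], [0.3, 0.1, 0.0], [5.0, 4.8, 5.2], [4.9, 5.1, 5.0]]
ys = ["a", "a", "b", "b"]
members = random_subspace_ensemble(xs, ys, n_members=5, subspace_size=2)
```

Here every member sees only 2 of the 3 features, so their decision boundaries differ, yet the final vote (e.g., `vote(members, [0.1, 0.1, 0.1])`) still recovers the correct class.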
The ensemble approach can serve as a basic design pattern for addressing every challenge associated with incremental learning; many of the known solutions to these issues are inherently built on ensemble systems.
Read more about specialized incremental clustering techniques here:
My previously published research paper; the complete paper can be found here