# Incremental Classification: Gaussian Naïve Bayes Algorithm with example

Naïve Bayes algorithm

Naïve Bayes is an easy and effective technique for predictive modeling in machine learning. Incremental learning expects the training data to arrive one instance at a time. For classification, one of the simplest models in the machine learning and statistics literature that can be trained this way is Gaussian Naive Bayes. Before reading about this technique, make sure that you are familiar with the basic concepts of machine learning and incremental learning.

Naïve Bayes Algorithm: Introduction

It is a classification method built on Bayes’ theorem with an assumption of independence between predictors. In simple terms, a Naive Bayes classifier assumes that the presence of a specific attribute in a class is unrelated to the presence of any other attribute.

For example, a vehicle may be considered a car if it has 4 wheels, doors and a typical size. If it is bigger and has more wheels, it is probably another type of vehicle. Even if these attributes depend on each other or on the presence of other attributes, each of them is treated as contributing independently to the probability that this vehicle is a car, and hence the method is called ‘Naive’.
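The independence assumption can be sketched in a few lines of Python. The probabilities below are hypothetical values invented for illustration, and the names `priors`, `likelihoods` and `naive_score` are not from any library. The sketch multiplies the class prior by one likelihood per attribute, which is exactly what the ‘naive’ assumption permits.

```python
# Hypothetical probabilities, made up purely for illustration.
priors = {"car": 0.7, "truck": 0.3}
likelihoods = {
    "car": {"four_wheels": 0.95, "has_doors": 0.99, "typical_size": 0.90},
    "truck": {"four_wheels": 0.30, "has_doors": 0.99, "typical_size": 0.10},
}


def naive_score(label, observed):
    """Prior times the product of per-attribute likelihoods (unnormalized)."""
    score = priors[label]
    for attribute in observed:
        score *= likelihoods[label][attribute]
    return score


observed = ["four_wheels", "has_doors", "typical_size"]
scores = {label: naive_score(label, observed) for label in priors}

# Normalize so that the scores sum to 1 and can be read as posteriors.
total = sum(scores.values())
posteriors = {label: score / total for label, score in scores.items()}

for label, value in posteriors.items():
    print(f"P({label} | observed attributes) = {value:.4f}")
```

Each attribute contributes its own factor regardless of the others, so adding a new attribute only adds one more multiplication.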

The Naïve Bayes algorithm is simple to build and particularly useful for very large data sets. Despite its simplicity, Naïve Bayes is known to sometimes outperform even highly sophisticated classification methods.

Bayes’ theorem

In general, Bayes’ theorem describes the probability of an event based on prior knowledge of conditions that might be related to the event. It therefore fits machine learning naturally, because that is just what machine learning does: it makes estimates about the future based on experience. Mathematically, one may write Bayes’ theorem as:

P(A|B) = P(B|A) · P(A) / P(B)

Let’s break the equation down:

• A and B are events.
• P(A) and P(B) are the probabilities of observing A and B on their own, without regard to each other, with P(B) not equal to 0.
• P(A|B) is the probability of A given that B is true.
• Likewise, P(B|A) is the probability of observing event B given that event A is true.

P(A|B) and P(B|A) are called conditional probabilities. P(A|B), the probability of A under the condition B, is also known as the posterior probability of A.

Example:

Assume that A is the event “you will see at least one person when you look outside your home”, and that B is the event “it is pouring”. Then P(A|B) is the probability that you will see at least one person outside your home if it is pouring. This works for negations too. Hence P(A|not B), where not B is “it is not pouring”, is the probability that you will see at least one person outside your home if it is not pouring.
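To make the example concrete, here is a small sketch that plugs made-up numbers into Bayes’ theorem. All three input probabilities are assumptions chosen only to show the arithmetic; they are not measurements.

```python
# Hypothetical inputs, chosen only to illustrate the arithmetic.
p_b = 0.2              # P(B): it is pouring
p_a_given_b = 0.3      # P(A|B): someone is outside, given that it is pouring
p_a_given_not_b = 0.9  # P(A|not B): someone is outside, given that it is not

# Law of total probability: P(A) = P(A|B) P(B) + P(A|not B) P(not B)
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

# Bayes' theorem, turned around: P(B|A) = P(A|B) P(B) / P(A)
p_b_given_a = p_a_given_b * p_b / p_a

print(f"P(A)   = {p_a:.3f}")
print(f"P(B|A) = {p_b_given_a:.3f}")
```

With these made-up inputs, P(A) comes out at 0.78 and P(B|A) at about 0.077: seeing someone outside makes it less likely that it is pouring than the 0.2 we started with.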

Applications of the Naïve Bayes classifier in the real world:

A standard use case for Naive Bayes is text classification: deciding whether a given text document belongs to one or more classes. In the text document case, the attributes used might be the frequency or presence of key words.

However, Naïve Bayes can perform poorly on sentiment analysis, because there the words used often matter less than the order in which they are written.

Other applications include medical diagnosis, spam e-mail detection and face recognition.

Gaussian Naïve Bayes

Gaussian Naïve Bayes is a variant of the plain Naïve Bayes algorithm that can be used in an incremental (online) fashion. It can learn from a single data sample at a time, because it is able to perform online updates to its model parameters.
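One way such online updates can work is sketched below. It keeps a running mean and variance for a single attribute using Welford’s algorithm; this is an illustrative choice, and a library implementation may use a different update formula. The class name `RunningGaussian` is hypothetical.

```python
class RunningGaussian:
    """Running mean and variance of one attribute, updated one value at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        # Welford's update: no earlier sample needs to be stored.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self):
        # Population variance; 0.0 until at least one value has been seen.
        return self.m2 / self.count if self.count else 0.0


stats = RunningGaussian()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)

print(stats.mean, stats.variance)
```

For these eight values the running estimates end up at a mean of 5 and a variance of 4, the same as computing them over the whole list at once. A full incremental classifier would keep one such pair per attribute and per class.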

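It assumes that the values of each attribute within a class follow a normal (Gaussian) distribution. It is a special case of Naïve Bayes, applied in particular when the attributes of the input data set have continuous values. The mean it estimates for each attribute is also the value that minimizes the sum of squared errors to the observed samples.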

The Gaussian NB model is used for classification, which is a supervised learning task.

This is one of the most convenient algorithms to work with, since you only need to estimate the mean and the standard deviation of each attribute for each class from the training data set.
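The following is a minimal from-scratch sketch of that idea, not the scikit-learn implementation. It reuses the toy data from the walkthrough in this article; the helper names `fit_gaussian_nb`, `log_gaussian` and `predict_one` are hypothetical, and the sketch assumes every attribute has a non-zero spread within each class.

```python
import numpy as np

# Toy data, the same values as in the scikit-learn walkthrough in this article.
X = np.array([[-3, 7], [1, 5], [1, 1], [-2, 0], [2, 3], [-4, 0],
              [-1, 1], [1, 2], [-2, 2], [-2, 7], [-4, 1], [2, 7]], dtype=float)
y = np.array([3, 4, 3, 3, 4, 3, 3, 4, 3, 3, 4, 4])


def fit_gaussian_nb(X, y):
    """Estimate the prior, per-attribute mean and standard deviation per class."""
    params = {}
    for label in np.unique(y):
        rows = X[y == label]
        params[int(label)] = {
            "prior": len(rows) / len(X),
            "mean": rows.mean(axis=0),
            "std": rows.std(axis=0),  # assumed non-zero for every attribute
        }
    return params


def log_gaussian(x, mean, std):
    """Log of the normal density, evaluated per attribute."""
    return -0.5 * np.log(2 * np.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def predict_one(params, x):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    scores = {
        label: np.log(p["prior"]) + log_gaussian(x, p["mean"], p["std"]).sum()
        for label, p in params.items()
    }
    return max(scores, key=scores.get)


params = fit_gaussian_nb(X, y)
print(params[3]["mean"], params[3]["std"])
print(predict_one(params, np.array([1.0, 2.0])))
```

Working with log probabilities avoids multiplying many small numbers together, which is a common design choice for this kind of model.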

How to build a Gaussian Naïve Bayes model in Python?

# Import the necessary Python packages: GaussianNB from sklearn, and NumPy

from sklearn.naive_bayes import GaussianNB
import numpy as np

# Input data: predictor (x) and target (y) variables. For now we use a small made-up NumPy array

x = np.array([[-3,7],[1,5], [1,1], [-2,0], [2,3], [-4,0], [-1,1], [1,2], [-2,2], [-2,7], [-4,1], [2,7]])

y = np.array([3, 4, 3, 3, 4, 3, 3, 4, 3, 3, 4, 4])

# Build a Gaussian Naïve Bayes classifier object by calling the constructor from the sklearn library with default parameters

GnbModelObj = GaussianNB()

# Call the fit method to train the model on the given training set. Here you can substitute your own input data after converting it into the appropriate NumPy array format

GnbModelObj.fit(x, y)

# To predict the output, pass the test data as a parameter to predict() and print the results

predicted = GnbModelObj.predict([[1,2],[3,4]])

print(predicted)

Output: [4 4]
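The walkthrough above trains on the whole data set in a single call to fit(). Since the topic here is incremental learning, the sketch below feeds the same toy data to the model one sample at a time using scikit-learn's partial_fit() method, and checks that the result agrees with batch training. The variable names are illustrative. Note that partial_fit() needs the full list of classes on the first call, because a single sample cannot reveal them all.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[-3, 7], [1, 5], [1, 1], [-2, 0], [2, 3], [-4, 0],
              [-1, 1], [1, 2], [-2, 2], [-2, 7], [-4, 1], [2, 7]])
y = np.array([3, 4, 3, 3, 4, 3, 3, 4, 3, 3, 4, 4])
all_classes = np.unique(y)

# Incremental training: one sample per call to partial_fit().
incremental_model = GaussianNB()
for i in range(len(X)):
    incremental_model.partial_fit(X[i:i + 1], y[i:i + 1], classes=all_classes)

# Batch training on the same data, for comparison.
batch_model = GaussianNB().fit(X, y)

test_points = [[1, 2], [3, 4]]
incremental_pred = incremental_model.predict(test_points)
batch_pred = batch_model.predict(test_points)

print(incremental_pred)
print(batch_pred)
```

In a real streaming setting the loop would read samples as they arrive instead of slicing an array that is already in memory.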

Reference:

SKLearn Official documentation