Data normalization is sometimes necessary when learning a classifier such as a Support Vector Machine (SVM). It is essential especially when combining features that have different ranges (min to max). Additionally, if some feature dimensions have high variance, learning a classifier like an SVM will take longer, and those dimensions may dominate the learning, which hurts the classifier's ability to generalize to test data. There are two common ways to perform normalization.
1) Range Scaling : Getting all feature dimensions to lie within a certain range such as $[0,1]$ or $[-1,1]$. This can be easily accomplished by defining $x_{i,j} = l + \frac{(x_{i,j}-\min{x_{:,j}})(u-l)}{\max{x_{:,j}}-\min{x_{:,j}}}$, where $i = 1,2,\ldots,N$ indexes the data samples, $j = 1,2,\ldots,M$ indexes the feature dimensions, and $l$ and $u$ are the lower and upper bounds of the range we want our data scaled to.
2) Standard Normal : Another way to normalize is to scale each feature dimension to have zero mean and standard deviation one, i.e., $x_{i,j} = \frac{x_{i,j} - \mu_j}{\sigma_j}$, where $\mu_j$ and $\sigma_j$ are the mean and standard deviation of dimension $j$. Both transformations are sketched in the code after this list.
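The following MATLAB snippet is a minimal sketch (not from the original post) of both transformations on a toy data matrix; the matrix `X` and the bounds `l`, `u` are illustrative choices.

```matlab
% Minimal sketch of both normalizations on an N-by-M matrix X,
% with rows as samples and columns as feature dimensions.
X = rand(100, 3) * 50;        % toy data with arbitrary feature ranges

% 1) Range scaling to [l, u]
l = -1;  u = 1;
mn = min(X, [], 1);           % per-dimension minimum (1-by-M)
mx = max(X, [], 1);           % per-dimension maximum (1-by-M)
Xrange = l + bsxfun(@rdivide, bsxfun(@minus, X, mn) * (u - l), mx - mn);

% 2) Standard normal (z-scoring): zero mean, unit standard deviation
mu = mean(X, 1);
sd = std(X, 0, 1);
Xstd = bsxfun(@rdivide, bsxfun(@minus, X, mu), sd);
```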
More details on both methods here.
I now show an example where I perform standard-normal normalization. I generated two-class data that appears linearly separable.
Now, if one normalizes the positive- and negative-class data separately, we end up with data that is inseparable, as in the figure below.
So, whenever one performs normalization, one should pool the positive- and negative-class data and normalize them together; the normalization will then yield the desired result.
Additionally, one should store the mean and standard deviation of each feature dimension computed on the training data and apply the same transformation to the testing data, so that it lies in the same range. This is necessary because in all pattern recognition problems we assume that the test data follows the same distribution as the training data.
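For instance, carrying the training statistics over to the test set might look like the sketch below; `Xtrain` and `Xtest` are hypothetical N-by-M matrices, not variables from the original listing.

```matlab
% Sketch (assumed variable names): compute statistics on training data
% only, then apply the identical transformation to the test data.
mu = mean(Xtrain, 1);          % per-dimension training mean
sd = std(Xtrain, 0, 1);        % per-dimension training std. deviation
XtrainNorm = bsxfun(@rdivide, bsxfun(@minus, Xtrain, mu), sd);
XtestNorm  = bsxfun(@rdivide, bsxfun(@minus, Xtest,  mu), sd);  % reuse mu, sd
```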
Here is the MATLAB code that generated the data and the normalizations shown in the figures above.
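The exact listing is not reproduced in this excerpt, so what follows is a minimal sketch that recreates the experiment; the sample count, class means, and plot styling are illustrative assumptions, not the exact values behind the figures.

```matlab
% Minimal sketch recreating the experiment; sample count, class means,
% and plot styling are illustrative assumptions.
N = 100;
Xpos = randn(N, 2) + repmat([ 2,  2], N, 1);   % positive class
Xneg = randn(N, 2) + repmat([-2, -2], N, 1);   % negative class

% z-scoring helper: zero mean, unit standard deviation per column
znorm = @(A) bsxfun(@rdivide, bsxfun(@minus, A, mean(A, 1)), std(A, 0, 1));

% Wrong: normalize each class separately; both classes collapse onto
% zero mean and the data becomes inseparable.
XposBad = znorm(Xpos);
XnegBad = znorm(Xneg);

% Right: pool both classes and normalize together; separability is kept.
Xall = znorm([Xpos; Xneg]);

figure;
subplot(1, 3, 1);
plot(Xpos(:,1), Xpos(:,2), 'b.', Xneg(:,1), Xneg(:,2), 'r.');
title('Original data');
subplot(1, 3, 2);
plot(XposBad(:,1), XposBad(:,2), 'b.', XnegBad(:,1), XnegBad(:,2), 'r.');
title('Normalized per class (inseparable)');
subplot(1, 3, 3);
plot(Xall(1:N,1), Xall(1:N,2), 'b.', Xall(N+1:end,1), Xall(N+1:end,2), 'r.');
title('Normalized jointly (separable)');
```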