Normalization
• Normalization is a data preparation technique that is frequently used in machine learning. The process of transforming the columns in a dataset to the same scale is referred to as normalization. Not every dataset needs to be normalized for machine learning.
• Normalization makes the features more consistent with each other, which allows the model to predict outputs more accurately. The main goal of normalization is to make the data homogeneous across all records and fields.
• Normalization refers to rescaling real-valued numeric attributes into a 0 to 1 range. Data normalization is used in machine learning to make model training less sensitive to the scale of features.
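For example, here is a minimal NumPy sketch of min-max rescaling into the 0 to 1 range (the dataset values are made up for illustration):

import numpy as np

# Toy dataset: rows are records, columns are features on very different scales
X = np.array([[1.0, 200.0],
              [2.0, 500.0],
              [3.0, 800.0]])

# Min-max normalization: rescale each column into the 0 to 1 range
X_min = X.min(axis=0)
X_max = X.max(axis=0)
X_norm = (X - X_min) / (X_max - X_min)

print(X_norm)  # every column now lies between 0 and 1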
• Normalization is important in algorithms such as k-NN, support vector machines, neural networks, and principal component analysis. The type of feature preprocessing and normalization that is needed can depend on the data.
• Batch normalization is a method of adaptive reparameterization, motivated by the difficulty of training very deep models. In deep networks, the weights are updated at every layer, so the output of a layer is no longer on the same scale as its input.
• When we input data to a machine learning or deep learning algorithm, we tend to rescale the values to a balanced scale, because this ensures that the model can generalize appropriately.
• Batch normalization is a technique for standardizing the inputs to layers in a neural network. Batch normalization was designed to address the problem of internal covariate shift, which arises as a consequence of updating multiple-layer inputs simultaneously in deep neural networks.
• Batch normalization is applied to individual layers or, optionally, to all of them: in each training iteration, we first normalize the inputs by subtracting their mean and dividing by their standard deviation, where both are estimated based on the statistics of the current mini-batch.
• Next, we apply a scale coefficient and an offset to recover the lost degrees of freedom. It is precisely due to this normalization based on batch statistics that batch normalization derives its name.
• We take the output a[i−1] from the preceding layer, multiply it by the weights W and add the bias b of the current layer; the variable i denotes the current layer:
z[i] = W[i] a[i−1] + b[i]
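A minimal NumPy sketch of this linear step (the layer sizes and the names W, b, and a_prev are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer i with 3 units, fed by 4 activations from layer i-1
W = rng.normal(size=(3, 4))        # weights W[i] of the current layer
b = np.zeros((3, 1))               # bias b[i] of the current layer
a_prev = rng.normal(size=(4, 1))   # output a[i-1] of the preceding layer

z = W @ a_prev + b                 # pre-activation z[i] = W[i] a[i-1] + b[i]
print(z.shape)                     # (3, 1)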
• Next, we usually apply the non-linear activation function, which produces the output a[i] of the current layer. When applying batch norm, we correct the data before feeding it to the activation function.
• To apply batch norm, we calculate the mean as well as the variance of the current z over the mini-batch, where m is the number of examples in the mini-batch:
μ = (1/m) Σⱼ₌₁ᵐ zⱼ
• When calculating the variance, we add a small constant ε to the variance to prevent potential divisions by zero:
σ² = (1/m) Σⱼ₌₁ᵐ (zⱼ − μ)² + ε
• To normalize the data, we subtract the mean and divide by the standard deviation:
ẑ[i] = (z[i] − μ) / √σ²
• This operation scales the inputs to have a mean of 0 and a standard deviation of 1.
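Putting the steps above together, here is a minimal NumPy sketch of the batch norm transform over one mini-batch (the names batch_norm, gamma, beta, and eps are illustrative; in a real network, gamma and beta would be learned during training):

import numpy as np

def batch_norm(Z, gamma, beta, eps=1e-5):
    """Normalize mini-batch Z (features x batch size) per feature,
    then apply the scale gamma and offset beta."""
    mu = Z.mean(axis=1, keepdims=True)        # batch mean per feature
    var = Z.var(axis=1, keepdims=True) + eps  # batch variance plus small constant
    Z_norm = (Z - mu) / np.sqrt(var)          # mean 0, standard deviation 1
    return gamma * Z_norm + beta              # recover the lost degrees of freedom

# Usage: 3 features, mini-batch of m = 8 examples
rng = np.random.default_rng(1)
Z = rng.normal(loc=5.0, scale=2.0, size=(3, 8))
gamma = np.ones((3, 1))   # scale, typically initialized to 1
beta = np.zeros((3, 1))   # offset, typically initialized to 0

out = batch_norm(Z, gamma, beta)
print(out.mean(axis=1))   # approximately 0 for every feature
print(out.std(axis=1))    # approximately 1 for every feature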
• Advantages of Batch Normalization:
a) The model is less sensitive to hyperparameter tuning.
b) It reduces internal covariate shift.
c) It diminishes the dependence of gradients on the scale of the parameters or on their initial values.