As we have seen, normalization and standardization bring attributes with different scales onto a common scale, but they do not change the shape of the distribution: a skewed distribution remains skewed after scaling. A log transformation can bring a right-skewed distribution closer to a normal distribution, or at least make a highly skewed distribution less skewed.
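To make the first point concrete, here is a minimal sketch showing that min-max normalization and standardization leave skewness unchanged. The data is a synthetic right-skewed sample generated only for illustration (it is not the dataset used in this article), and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic right-skewed sample (an assumption for illustration only).
rng = np.random.default_rng(0)
values = pd.Series(rng.lognormal(mean=0.0, sigma=1.0, size=10_000))

# Min-max normalization: rescale the values to the range [0, 1].
normalized = (values - values.min()) / (values.max() - values.min())

# Standardization: shift to zero mean and scale to unit standard deviation.
standardized = (values - values.mean()) / values.std()

# Skewness measures the shape of the distribution, so shifting and
# rescaling by a positive factor do not change it.
skew_original = values.skew()
skew_normalized = normalized.skew()
skew_standardized = standardized.skew()

print(skew_original, skew_normalized, skew_standardized)
```

All three printed values should agree up to floating-point rounding, because both scalers only shift and stretch the data.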
A few real-world examples where log scales are used:
- Measuring earthquakes (the Richter magnitude scale)
- Measuring sound (decibels)
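Both of these scales are logarithmic because the underlying quantities span many orders of magnitude. As a small sketch of why that helps, the snippet below takes the base-10 logarithm of some hypothetical amplitudes (made-up numbers, not real measurements): values that differ by a factor of ten end up one unit apart.

```python
import numpy as np

# Hypothetical amplitudes spanning four orders of magnitude.
amplitudes = np.array([1.0, 10.0, 100.0, 1_000.0, 10_000.0])

# On a base-10 log scale, each tenfold increase adds one unit,
# so the very large values are pulled much closer to the small ones.
log_amplitudes = np.log10(amplitudes)

print(log_amplitudes)
```

This compression of large values is the same effect that shortens the long right tail of a skewed attribute.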
Now let us look at our last example, where the attribute “Fare” is highly right-skewed.
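import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# `data` is assumed to be an already-loaded DataFrame with a 'Fare' column.
# np.log is vectorized, so it can be applied to the whole column at once.
# Note: np.log(0) is -inf, so if 'Fare' contains zeros, np.log1p (log(1 + x)) is a common alternative.
data['new_fare'] = np.log(data['Fare'])  # Apply log transformation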
As we can see, the original data was highly skewed, and the skewness is reduced after the transformation. The transformed values are also spread more evenly, so they are easier to see in the plot.
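The reduction in skewness can also be checked numerically rather than only by eye. The following is a minimal sketch that uses a synthetic stand-in for the “Fare” column (the `sample` DataFrame and its values are assumptions for illustration, not the article's data) and compares pandas' sample skewness before and after the transformation.

```python
import numpy as np
import pandas as pd

# Hypothetical right-skewed 'Fare' values (synthetic, strictly positive).
rng = np.random.default_rng(42)
sample = pd.DataFrame({"Fare": rng.lognormal(mean=3.0, sigma=1.0, size=5_000)})

# Apply the log transformation to the whole column.
sample["new_fare"] = np.log(sample["Fare"])

# Series.skew() returns the sample skewness; values near 0 indicate
# a roughly symmetric distribution.
skew_before = sample["Fare"].skew()
skew_after = sample["new_fare"].skew()

print(f"skewness before: {skew_before:.2f}, after: {skew_after:.2f}")
```

On real data the skewness after the transformation will usually not be this close to zero, but it should be clearly smaller in magnitude for a right-skewed attribute.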
Point to Remember: