
Normalization Methods

Normalization

Data transformation is one of the critical steps in data mining. Among the many data transformation methods, normalization is one of the most frequently used. For example, Z-score normalization can be used to reduce possible noise in sound frequency data.

We will introduce three common normalization methods: Max-Min Normalization, Z-Score Normalization, and Scale Multiplication.

Max-Min Normalization
$x_{normal} = \frac{x - \min(x)}{\max(x) - \min(x)}$
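It scales all the data into the range from 0 to 1.
Example:
Chinese high schools use a 150-point scale, US high schools use a 100-point scale, and Russian high schools use a 5-point scale. Max-Min normalization puts all three on the same 0-to-1 scale so that the grades can be compared.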
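
As a rough illustration of the grading example, the sketch below applies the Max-Min formula to a few made-up scores. The function name `min_max_normalize` and its optional `lower`/`upper` arguments are assumptions introduced here, not part of the original article. It also assumes each grading scale starts at 0, so the known scale bounds are used instead of the minimum and maximum of the sample. Plain Python lists are used so it runs without NumPy.

```python
def min_max_normalize(values, lower=None, upper=None):
    """Rescale values to [0, 1].

    If lower/upper are not given, the data's own min and max are used,
    as in the formula above. Passing them explicitly is an assumption
    made here for scales whose bounds are known in advance.
    """
    lo = min(values) if lower is None else lower
    hi = max(values) if upper is None else upper
    if hi == lo:
        raise ValueError("cannot normalize when max equals min")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical scores on three grading scales (made-up numbers,
# each scale assumed to start at 0).
chinese = min_max_normalize([120, 90, 150], lower=0, upper=150)
american = min_max_normalize([80, 60, 100], lower=0, upper=100)
russian = min_max_normalize([4, 3, 5], lower=0, upper=5)

print(chinese)
print(american)
print(russian)
```

With these made-up inputs the three lists come out identical, which is the point of the example: after normalization a score no longer depends on which scale it was recorded on.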


Z-Score Normalization

$X_{z-normal} = \frac{X - mean}{sd}$
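It expresses each value as its distance from the mean in units of standard deviation, so the transformed data has mean 0 and standard deviation 1.
Example:
It is useful when comparing data sets measured in different units (for example, centimeters and inches).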
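
To make the unit example concrete, here is a minimal sketch that Z-score normalizes the same set of made-up heights once in centimeters and once in inches. The helper name `z_score` and the sample values are assumptions for illustration. It uses the population standard deviation, which is also the default of `numpy.std`.

```python
import statistics


def z_score(values):
    """Return (v - mean) / sd for each value, using the population sd."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        raise ValueError("cannot normalize when standard deviation is 0")
    return [(v - mean) / sd for v in values]


# Hypothetical heights, recorded in centimeters and converted to inches.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
heights_in = [h / 2.54 for h in heights_cm]

z_cm = z_score(heights_cm)
z_in = z_score(heights_in)

# The two lists agree up to floating-point rounding.
for a, b in zip(z_cm, z_in):
    print(round(a, 4), round(b, 4))
```

Because converting units only multiplies every value by a constant, the factor cancels between the numerator and the standard deviation, so both versions yield the same Z-scores.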

Scale multiplication

$X_{scaled} = X \times 10$ or $X_{scaled} = X / 10$
It rescales the data by a power of 10.
Example:
Some monetary transaction amounts are very large, so we divide them by 1000 to make them easier to read.
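
The transaction example can be sketched in a few lines. The function name `scale_down` and the amounts are made up for illustration; the divisor of 1000 follows the example above.

```python
def scale_down(values, divisor=1000):
    """Divide every value by a constant (here a power of 10)."""
    return [v / divisor for v in values]


# Hypothetical transaction amounts.
transactions = [2_500_000, 1_250_000, 40_000]
in_thousands = scale_down(transactions)

print(in_thousands)
```

Unlike the two methods above, this does not map the data to a fixed range or a fixed spread. It only changes the unit in which the values are displayed, so the ratios between values are unchanged.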

Code
import random
import matplotlib.pyplot as plt
import numpy as np


y=random.sample(range(0,150),50)
x=list(map(int,y))
x1=np.array(x)
xmin=min(x)
xmax=max(x)

#Max-Min normalization
mmnorm=(x1 - xmin)/(xmax-xmin)
#plot

fig,axs=plt.subplots(1,2,sharey=True)

#Original random number
axs[0].hist(x, bins=10)
axs[0].title.set_text("Random Data")


#Max-Min normalized histogram plot
axs[1].hist(mmnorm, bins=10,color="lightblue")
axs[1].title.set_text("Max-Min Normalized Data")
plt.show()

#Z-score Normalization

y2=random.sample(range(0,150),50)
x2=list(map(int,y2))
x21=np.array(x2)
mean=np.mean(x21)
sd=np.std(x21)


#Z-score normalization (np.std uses the population standard deviation by default)
znorm=(x21-mean)/sd

#plot

fig,axs=plt.subplots(1,2,sharey=True)

#Original random number
axs[0].hist(x2, bins=10, color="green")
axs[0].title.set_text("Random Data")


#Z-score normalized histogram plot
axs[1].hist(znorm, bins=10,color="lightgreen")
axs[1].title.set_text("Z-score Normalized Data")
plt.show()

#Scale multiplication

y3=random.sample(range(1000,10000),50)
x3=list(map(int,y3))
x31=np.array(x3)

#scale normalization
snorm=x31/1000

#plot

fig,axs=plt.subplots(1,2,sharey=True)

#Original random number
axs[0].hist(x3, bins=10, color="orange")
axs[0].title.set_text("Random Data")


#Scale normalized histogram plot
axs[1].hist(snorm, bins=10,color="yellow")
axs[1].title.set_text("Scale Normalized Data")
plt.show()
