sklearn.preprocessing.StandardScaler: Data Standardization

Date: 2019-09-21 19:38:16

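If one feature's variance is much larger than that of the other features, it will dominate the learning algorithm, and the estimator will not learn from the other features as we expect. The resulting model may converge slowly or not converge at all. Features like this therefore need to be standardized or normalized.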

1.StandardScaler

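Standardization subtracts each feature's mean and then divides by its standard deviation (not its variance). After the transformation, each feature has mean 0 and standard deviation 1. This rescales the data without changing the shape of its distribution, so the result follows a standard normal distribution only if the original feature was normally distributed. The transformation is: $x' = (x - \mu) / \sigma$, where $\mu$ is the feature's mean and $\sigma$ is its standard deviation.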
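To make the formula concrete, here is a minimal sketch that applies it by hand with NumPy and compares the result with StandardScaler. The toy matrix `X` and the names `mu`, `sigma`, `X_manual`, and `X_sklearn` are made up for illustration. It assumes the population standard deviation (`ddof=0`), which is what StandardScaler uses.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data: two features on very different scales (illustrative values).
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0],
              [4.0, 900.0]])

# Manual standardization: subtract the per-feature mean, then divide by the
# per-feature population standard deviation (NumPy's default, ddof=0).
mu = X.mean(axis=0)
sigma = X.std(axis=0)
X_manual = (X - mu) / sigma

# The same transformation via scikit-learn.
X_sklearn = StandardScaler().fit_transform(X)

# Both approaches should agree up to floating-point error.
print(np.allclose(X_manual, X_sklearn))
print(X_manual.mean(axis=0))  # close to 0 for each feature
print(X_manual.std(axis=0))   # close to 1 for each feature
```

Both columns end up on the same scale even though the raw values differ by two orders of magnitude.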

import numpy as np
from sklearn.preprocessing import StandardScaler

'''
scale_: per-feature scaling factor, which is also the standard deviation
mean_: mean of each feature
var_: variance of each feature
n_samples_seen_: number of samples processed; can be increased with partial_fit
'''
# A single feature with the values 1..9, shaped as a column vector
x = np.array(range(1, 10)).reshape(-1, 1)
ss = StandardScaler()
ss.fit(X=x)
print(x)
print(ss.n_samples_seen_)
print(ss.mean_)
print(ss.var_)
print(ss.scale_)
print('Standardized data:')
# The scaler is already fitted, so transform is enough here
y = ss.transform(x)
print(y)

>>>

[[1]
 [2]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]
 [9]]
9
[5.]
[6.66666667]
[2.5819889]
Standardized data:
[[-1.54919334]
 [-1.161895  ]
 [-0.77459667]
 [-0.38729833]
 [ 0.        ]
 [ 0.38729833]
 [ 0.77459667]
 [ 1.161895  ]
 [ 1.54919334]]
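The attribute notes above mention that partial_fit increases the sample count. Here is a minimal sketch of that behavior on the same 1..9 data. The three-way batch split and the names `full` and `incremental` are assumptions for illustration, not part of the original example.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

x = np.arange(1, 10, dtype=float).reshape(-1, 1)

# Fit on the whole array at once.
full = StandardScaler().fit(x)

# Fit the same data in three batches of three rows each.
incremental = StandardScaler()
for batch in np.array_split(x, 3):
    incremental.partial_fit(batch)
    # The running sample count grows with every batch.
    print(incremental.n_samples_seen_)

# The running statistics should agree with the one-shot fit.
print(np.allclose(incremental.mean_, full.mean_))
print(np.allclose(incremental.var_, full.var_))
```

This pattern can help when the data does not fit in memory or arrives as a stream, since the scaler only needs one batch at a time.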


Original article: https://www.cnblogs.com/lovewhale1997/p/11563881.html
