Learning Data Analysis with Python (Study Notes, Part 4)

Date: 2019-01-14 20:10:06


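These notes are based on:

Python for Data Analysis, 2nd Edition

The book is in English only, which makes for tiring reading, so whether these notes are accurate depends on Baidu Translate.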

Previously:

Post collecting the various methods:

　　https://www.cnblogs.com/baili-luoyun/p/10250177.html

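　　　　Summary: this post is a basic introduction to pandas

　　　　　　I: pandas has two main data structures

　　　　　　　　Series and DataFrame

　　　　　　II: Series

　　　　　　　　1: Definition

　　A Series is a one-dimensional array-like object made up of a sequence of values (of any NumPy data type) and an associated array of data labels, called its index.

　　　　　　　　2: Representation

　　The string representation of a Series shows the index on the left and the values on the right.

　　　　　　　　3: Creating a one-dimensional Series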

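import pandas as pd

obj = pd.Series([4, 5, 6, 7, 8])   # create a one-dimensional Series with the default integer index
print(obj)
print(obj.index)    # the index labels
print(obj.values)   # the underlying NumPy array
>>>>
0    4
1    5
2    6
3    7
4    8
dtype: int64
RangeIndex(start=0, stop=5, step=1)
[4 5 6 7 8]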

　　　　　　　　4: Accessing values through the index

　　　　　　　　　　1>: Single label

obj1 = pd.Series([4, 6, -7, -8], index=['d', 'a', 'b', 'c'])  # supply custom index labels
print(obj1)
# look up a value by its index label
print(obj1['d'])
>>>>
d    4
a    6
b   -7
c   -8
dtype: int64
4
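
Label lookup can also be written explicitly. The following is a minimal sketch, assuming the same `obj1` as above; it uses `.loc` for label-based access and `.iloc` for position-based access, which avoids ambiguity when the labels themselves are integers. The variable names `by_label` and `by_position` are only illustrative.

```python
import pandas as pd

obj1 = pd.Series([4, 6, -7, -8], index=['d', 'a', 'b', 'c'])

by_label = obj1.loc['d']      # look up by index label
by_position = obj1.iloc[0]    # look up by integer position (first element)

print(by_label, by_position)
```

Both lookups return the same element here because the label 'd' sits at position 0.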

　　　　　　　　　　2>: Multiple labels

# select several values by passing a list of labels
print(obj1[['d', 'a', 'c']])
>>>>
d    4
a    6
c   -8
dtype: int64

　　　　　　　　　 3>: Boolean filtering

# keep only the entries whose value is negative
print(obj1[obj1 < 0])
>>>>
b   -7
c   -8
dtype: int64
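
To make the filtering step more concrete, here is a small sketch (assuming the same `obj1`) that shows the intermediate boolean mask and how two conditions can be combined. The names `mask`, `negatives`, and `small_positive` are illustrative only.

```python
import pandas as pd

obj1 = pd.Series([4, 6, -7, -8], index=['d', 'a', 'b', 'c'])

mask = obj1 < 0            # boolean Series with the same index as obj1
negatives = obj1[mask]     # keeps only the entries where mask is True

# Combine conditions with & or | and parentheses, not `and` / `or`
small_positive = obj1[(obj1 > 0) & (obj1 < 5)]

print(mask.tolist())
print(small_positive)
```

The comparison produces one True/False per label, and indexing with that mask keeps the labels of the surviving entries.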

　　　　　　　　　　4>: Scalar multiplication

# multiply every value by 2; the index is preserved
print(obj1 * 2)
>>>>
d     8
a    12
b   -14
c   -16
dtype: int64

　　　　　　　　　5>: Applying element-wise (NumPy) functions

import numpy as np

# apply the exponential function to every value
print(np.exp(obj1))
>>>>
d     54.598150
a    403.428793
b      0.000912
c      0.000335
dtype: float64
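
As a quick check that these operations keep the link between labels and values, the sketch below (assuming the same `obj1`) applies a scalar operation and a NumPy function and compares the resulting index with the original. `np.abs` is used here only as a second example of a NumPy function.

```python
import numpy as np
import pandas as pd

obj1 = pd.Series([4, 6, -7, -8], index=['d', 'a', 'b', 'c'])

doubled = obj1 * 2          # scalar arithmetic, applied element by element
magnitudes = np.abs(obj1)   # NumPy function, also applied element by element

# Both results are still Series carrying the original labels
print(doubled.index.equals(obj1.index))
print(magnitudes.to_dict())
```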

　　　　　　　　6>: Index-to-value mapping (a Series behaves like a fixed-length, ordered dict)

# `in` tests whether a label is present in the index
print('b' in obj1)
print('e' in obj1)
>>>>
True
False
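
One detail that is easy to miss: like a dict, `in` on a Series checks the index labels, not the stored values. The sketch below (assuming the same `obj1`) contrasts the two; the variable names are illustrative.

```python
import pandas as pd

obj1 = pd.Series([4, 6, -7, -8], index=['d', 'a', 'b', 'c'])

label_present = 'b' in obj1         # 'b' is an index label
value_as_label = 4 in obj1          # 4 is a value, not a label
value_present = 4 in obj1.values    # search the underlying array instead
value_mask = obj1.isin([4, -8])     # element-wise membership test on values

print(label_present, value_as_label, value_present)
print(value_mask.tolist())
```

To test for values rather than labels, search `obj1.values` or use `isin`.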

　　　　　　　　5: Creating a Series from a dict

　　　　　　　　　　1>: Building a Series from a dict

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
obj3 = pd.Series(sdata)   # dict keys become the index labels
print(obj3)
>>>>
Ohio      35000
Texas     71000
Oregon    16000
Utah       5000
dtype: int64

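　　　　　　　　　　2>: Passing an explicit index together with the dict values

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
obj3 = pd.Series(sdata)
print(obj3)
# pass an explicit index: only matching keys are kept, in the given order
states = ['California', 'Ohio', 'Oregon', 'Texas']
obj4 = pd.Series(sdata, index=states)
# 'California' is not a key in sdata, so its value is NaN; 'Utah' is not in states, so it is dropped
print(obj4)
>>>>
Ohio      35000
Texas     71000
Oregon    16000
Utah       5000
dtype: int64
California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
dtype: float64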
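
Another way to look at this step is as a reindexing of the dict-based Series. The sketch below, using the same `sdata` and `states`, builds the Series both ways and compares them; `from_constructor` and `from_reindex` are illustrative names, and the equivalence is shown for this example rather than claimed as how pandas is implemented.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']

from_constructor = pd.Series(sdata, index=states)
from_reindex = pd.Series(sdata).reindex(states)   # align existing data to new labels

# The missing label forces a float dtype, because NaN is a float
print(from_constructor.equals(from_reindex))
print(from_reindex.dtype)
```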

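　　　　　　　　　　3>: Detecting missing data

is_missing = pd.isnull(obj4)     # True where the value is NaN
print(is_missing)
not_missing = pd.notnull(obj4)   # True where a value is present
print(not_missing)
>>>>
California     True
Ohio          False
Oregon        False
Texas         False
dtype: bool
California    False
Ohio           True
Oregon         True
Texas          True
dtype: bool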
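
Once missing entries are detected, they are usually counted, dropped, or filled. The following sketch goes one step beyond the notes above and assumes the same `obj4`; it uses the Series methods `isna`, `dropna`, and `fillna`, and the fill value 0 is an arbitrary choice for illustration.

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
obj4 = pd.Series(sdata, index=['California', 'Ohio', 'Oregon', 'Texas'])

n_missing = obj4.isna().sum()   # number of missing entries
cleaned = obj4.dropna()         # new Series without the NaN entries
filled = obj4.fillna(0)         # new Series with NaN replaced by 0

print(n_missing)
print(cleaned.index.tolist())
```

None of these calls modify `obj4`; each returns a new object.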

　　　　　　　　　　4>: Naming the Series and its index

obj4.name = 'population'     # name of the Series itself
obj4.index.name = 'state'    # name of the index
print(obj4)
>>>>
state
California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
Name: population, dtype: float64

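　　　　　　　　　　5>: Changing the index labels by assignment

obj = pd.Series([4, 7, -6, 3])
print(obj)
# replace the index in place; the new list must have the same length as the Series
obj.index = ['bob', 'Steve', 'jeff', 'Ryan']
print(obj)
>>>>
0    4
1    7
2   -6
3    3
dtype: int64
bob      4
Steve    7
jeff    -6
Ryan     3
dtype: int64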
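
Assigning to `index` only works when the number of labels matches the number of values. The sketch below illustrates that constraint and, as an assumption about what a reader might want next, shows `rename` as a way to relabel individual entries without touching the original Series. The names `mismatch_rejected` and `renamed` are illustrative.

```python
import pandas as pd

obj = pd.Series([4, 7, -6, 3])

try:
    obj.index = ['bob', 'Steve']   # two labels for four values
    mismatch_rejected = False
except ValueError as exc:
    mismatch_rejected = True
    print(exc)

# rename returns a new Series with selected labels replaced
renamed = obj.rename({0: 'bob'})
print(renamed.index.tolist())
```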

　　　　　　　　　III: DataFrame

　　　　1: Definition

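A DataFrame is a tabular data structure containing an ordered collection of columns, each of which can hold a different value type (numeric, string, boolean, and so on). It has both a row index and a column index, and it can be thought of as a dict of Series that all share the same index. Internally, the data is stored as one or more two-dimensional blocks rather than as a list, dict, or some other collection of one-dimensional arrays.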
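
The "dict of Series sharing one index" view can be sketched directly. The example below is only an illustration of the definition: the `region` column and its values are made up for this sketch and do not come from the original notes.

```python
import pandas as pd

# Two Series with the same index labels
population = pd.Series({'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000})
region = pd.Series({'Ohio': 'Midwest', 'Texas': 'South', 'Oregon': 'West'})  # illustrative values

# A dict of Series becomes a DataFrame: keys are column names, the shared index labels the rows
frame = pd.DataFrame({'population': population, 'region': region})

print(frame)
print(type(frame['population']))   # each column is itself a Series
```

Each column keeps its own value type, while the row labels are shared across columns.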

　　　　2: Creation


Original post: https://www.cnblogs.com/baili-luoyun/p/10268364.html
