Python pandas.DataFrame.loc: a detailed usage guide
Official documentation
DataFrame.loc
Access a group of rows and columns by label(s) or a boolean array.
.loc[] is primarily label based, but may also be used with a boolean array.
# Labels are the primary input, but boolean values are accepted as well.
Allowed inputs are: # a single label, a list of labels, or a slice of labels
A single label, e.g. 5 or 'a' (note that 5 is interpreted as a label of the index, and never as an integer position along the index). # Here 5 is a label value, not a numeric position.
A list or array of labels, e.g. ['a', 'b', 'c'].
A slice object with labels, e.g. 'a':'f'.
Warning: # With a label slice, both the start and the stop are included.
Note that contrary to usual Python slices, both the start and the stop are included.
A boolean array of the same length as the axis being sliced, e.g. [True, False, True].
Examples in detail
I. Selecting values
1. Create df
import pandas as pd
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])
df
Out[15]:
            max_speed  shield
cobra               1       2
viper               4       5
sidewinder          7       8
2. Single label. A single row label returns a Series.
df.loc['viper']
Out[17]:
max_speed    4
shield       5
Name: viper, dtype: int64
3. List of labels. A list of row labels returns a DataFrame.
df.loc[['cobra','viper']]
Out[20]:
       max_speed  shield
cobra          1       2
viper          4       5
4. Single label for row and column. Select a row and a column at the same time; the result is a single value.
df.loc['cobra', 'shield']
Out[24]: 2
5. Slice with labels for row and single label for column. As mentioned above, both the start and the stop of the slice are included. This selects several rows and one column; when the rows are given as a label slice, both endpoints are part of the result.
df.loc['cobra':'viper', 'max_speed']
Out[25]:
cobra  1
viper  4
Name: max_speed, dtype: int64
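The inclusive stop is the main way a label slice differs from a positional slice. A minimal sketch comparing the two on the same frame; `.iloc` is not covered in this article and appears here only for contrast:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# Label slice: both 'cobra' and 'viper' are returned.
by_label = df.loc['cobra':'viper', 'max_speed']

# Positional slice: the stop position 1 is excluded, so only row 0 is returned.
by_position = df.iloc[0:1, 0]

print(list(by_label.index))     # ['cobra', 'viper']
print(list(by_position.index))  # ['cobra']
```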
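6. Boolean list with the same length as the row axis. Select rows with a boolean list.
The list is matched against the rows by position: a row is selected when the value at its position is True.
df
Out[30]:
            max_speed  shield
cobra               1       2
viper               4       5
sidewinder          7       8
df.loc[[True]]
Out[31]:
       max_speed  shield
cobra          1       2
df.loc[[True,False]]
Out[32]:
       max_speed  shield
cobra          1       2
Note: the two calls above pass lists that are shorter than the row axis. They ran on the older pandas version used for this session; recent pandas versions raise an IndexError unless the list has one entry per row, as in the next call.
df.loc[[True,False,True]]
Out[33]:
            max_speed  shield
cobra               1       2
sidewinder          7       8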
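The length requirement on a boolean list can be checked directly. The sketch below assumes a recent pandas release, where a list of the wrong length is rejected with an IndexError; the variable names are illustrative only:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

mask = [True, False, True]      # one entry per row
selected = df.loc[mask]

short_mask_error = None
try:
    df.loc[[True]]              # one entry for three rows
except IndexError as exc:       # raised by recent pandas versions
    short_mask_error = exc

print(list(selected.index))     # ['cobra', 'sidewinder']
print(short_mask_error)
```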
7. Conditional that returns a boolean Series. Select rows with a condition.
df.loc[df['shield'] > 6]
Out[34]:
            max_speed  shield
sidewinder          7       8
8. Conditional that returns a boolean Series with column labels specified. Combine a row condition with specific columns.
df.loc[df['shield'] > 6, ['max_speed']]
Out[35]:
            max_speed
sidewinder          7
9. Callable that returns a boolean Series. Select rows with a function whose result is a boolean Series.
df
Out[37]:
            max_speed  shield
cobra               1       2
viper               4       5
sidewinder          7       8
df.loc[lambda df: df['shield'] == 8]
Out[38]:
            max_speed  shield
sidewinder          7       8
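A callable is most useful in a method chain, where the frame being filtered has no name yet. A minimal sketch; the `total` column is a hypothetical addition made up for this illustration:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# assign() returns a new frame; the lambda passed to .loc receives that
# frame, so the freshly created 'total' column can be used in the filter.
fast = (
    df.assign(total=lambda d: d['max_speed'] + d['shield'])
      .loc[lambda d: d['total'] > 5, ['max_speed', 'total']]
)

print(fast)
```

The original `df` is left unchanged, because `assign` works on a copy.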
II. Assigning values
1. Set value for all items matching the list of labels. Assign to the rows and the column selected by lists.
df.loc[['viper', 'sidewinder'], ['shield']] = 50
df
Out[43]:
            max_speed  shield
cobra               1       2
viper               4      50
sidewinder          7      50
2. Set value for an entire row. Assign one value to every cell of a row.
df.loc['cobra'] = 10
df
Out[48]:
            max_speed  shield
cobra              10      10
viper               4      50
sidewinder          7      50
3. Set value for an entire column. Assign one value to every cell of a column.
df.loc[:, 'max_speed'] = 30
df
Out[50]:
            max_speed  shield
cobra              30      10
viper              30      50
sidewinder         30      50
4. Set value for rows matching a boolean condition. Assign to every row selected by the condition.
df.loc[df['shield'] > 35] = 0
df
Out[52]:
            max_speed  shield
cobra              30      10
viper               0       0
sidewinder          0       0
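The conditional assignment above overwrites whole rows because no column is given. To change only one column of the matching rows, the condition and the column label can be passed together. A minimal sketch on a fresh copy of the first frame; the threshold 4 is arbitrary:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# Rows where shield > 4, column 'max_speed' only; all other cells are untouched.
df.loc[df['shield'] > 4, 'max_speed'] = 0

print(df)
```

Writing the same thing as `df[df['shield'] > 4]['max_speed'] = 0` assigns to a temporary object and is not reliable; a single `.loc` call avoids that problem.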
III. Integer row index
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=[7, 8, 9], columns=['max_speed', 'shield'])
df
Out[54]:
   max_speed  shield
7          1       2
8          4       5
9          7       8
Select several rows with a slice of row labels. The integers are labels, so the stop value 9 is included:
df.loc[7:9]
Out[55]:
   max_speed  shield
7          1       2
8          4       5
9          7       8
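With an integer index it is easy to mistake labels for positions. The following sketch shows that `.loc` looks up the label 7, that position 0 is reached through `.iloc` (used here only for contrast), and that a number missing from the index is rejected:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=[7, 8, 9], columns=['max_speed', 'shield'])

row_by_label = df.loc[7]       # label 7 is the first row
row_by_position = df.iloc[0]   # position 0 is the same first row

missing = None
try:
    df.loc[0]                  # 0 is not a label in this index
except KeyError as exc:
    missing = exc

print(row_by_label.equals(row_by_position))  # True
print(df.loc[7:8].shape)                     # (2, 2)
```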
IV. MultiIndex
1. Create a MultiIndex
tuples = [
    ('cobra', 'mark i'), ('cobra', 'mark ii'),
    ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'),
    ('viper', 'mark ii'), ('viper', 'mark iii')
]
index = pd.MultiIndex.from_tuples(tuples)
values = [[12, 2], [0, 4], [10, 20],
          [1, 4], [7, 1], [16, 36]]
df = pd.DataFrame(values, columns=['max_speed', 'shield'], index=index)
df
Out[57]:
                     max_speed  shield
cobra      mark i           12       2
           mark ii           0       4
sidewinder mark i           10      20
           mark ii           1       4
viper      mark ii           7       1
           mark iii         16      36
2. Single label. The label is matched against the outermost row level, and a DataFrame is returned.
df.loc['cobra']
Out[58]:
         max_speed  shield
mark i          12       2
mark ii          0       4
3. Single index tuple. Passing an index tuple returns a Series.
df.loc[('cobra', 'mark ii')]
Out[59]:
max_speed  0
shield    4
Name: (cobra, mark ii), dtype: int64
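4. Single label for row and column. Here both labels are matched against the row MultiIndex ('cobra' on the outer level, 'mark i' on the inner level), so the call behaves like passing a tuple and returns a Series.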
df.loc['cobra', 'mark i']
Out[60]:
max_speed  12
shield    2
Name: (cobra, mark i), dtype: int64
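Because Python itself turns `df.loc['cobra', 'mark i']` into `df.loc[('cobra', 'mark i')]`, the two spellings hand the same tuple to `.loc`. One way to make the intent explicit is to add a column selector after the row tuple. A minimal sketch that rebuilds the frame from above so it runs on its own:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples([
    ('cobra', 'mark i'), ('cobra', 'mark ii'),
    ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'),
    ('viper', 'mark ii'), ('viper', 'mark iii'),
])
values = [[12, 2], [0, 4], [10, 20], [1, 4], [7, 1], [16, 36]]
df = pd.DataFrame(values, columns=['max_speed', 'shield'], index=index)

# Two labels: pandas matches both against the levels of the row index.
implicit = df.loc['cobra', 'mark i']

# Row tuple plus an explicit "all columns" selector: no room for ambiguity.
explicit = df.loc[('cobra', 'mark i'), :]

print(implicit)
print(explicit)
```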
5. Single tuple. Note that using [[ ]] returns a DataFrame: passing a list that contains the tuple returns a DataFrame.
df.loc[[('cobra', 'mark ii')]]
Out[61]:
               max_speed  shield
cobra mark ii          0       4
6. Single tuple for the index with a single label for the column. To get one cell, pass the MultiIndex tuple for the row first and the column label second.
df.loc[('cobra', 'mark i'), 'shield']
Out[62]: 2
7. Slice from an index tuple to a single label or to another index tuple. Both ends are included:
df.loc[('cobra', 'mark i'):'viper']
Out[63]:
                     max_speed  shield
cobra      mark i           12       2
           mark ii           0       4
sidewinder mark i           10      20
           mark ii           1       4
viper      mark ii           7       1
           mark iii         16      36
df.loc[('cobra', 'mark i'):'sidewinder']
Out[64]:
                    max_speed  shield
cobra      mark i          12       2
           mark ii          0       4
sidewinder mark i          10      20
           mark ii          1       4
df.loc[('cobra', 'mark i'):('sidewinder','mark i')]
Out[65]:
                    max_speed  shield
cobra      mark i          12       2
           mark ii          0       4
sidewinder mark i          10      20
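The slices above always start from the outer level. Selecting by the inner level alone is not shown in this article; one possible approach, sketched here as an assumption about what a reader might want next, is `pd.IndexSlice`, which needs a sorted index:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples([
    ('cobra', 'mark i'), ('cobra', 'mark ii'),
    ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'),
    ('viper', 'mark ii'), ('viper', 'mark iii'),
])
values = [[12, 2], [0, 4], [10, 20], [1, 4], [7, 1], [16, 36]]
df = pd.DataFrame(values, columns=['max_speed', 'shield'], index=index)

# Slicers need a lexsorted index. This one is already sorted, so
# sort_index() changes nothing and only makes the requirement explicit.
df = df.sort_index()

idx = pd.IndexSlice
# Every outer label, inner label 'mark ii', all columns.
mark_ii = df.loc[idx[:, 'mark ii'], :]

print(mark_ii)
```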