# 2.5 collections module

The `collections` module provides a number of useful objects for data handling. This part briefly introduces some of these features.

### Example: Counting Things

```python
portfolio = [
    ('GOOG', 100, 490.1),
    ('IBM', 50, 91.1),
    ('CAT', 150, 83.44),
    ('IBM', 100, 45.23),
    ('GOOG', 75, 572.45),
    ('AA', 50, 23.15)
]
```

### Counters

```python
from collections import Counter
total_shares = Counter()
for name, shares, price in portfolio:
    total_shares[name] += shares

total_shares['IBM']     # 150
```
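To see what `Counter` saves you, here is a minimal sketch that tabulates the same portfolio twice, once with a plain dictionary and once with a `Counter`. A `Counter` treats missing keys as 0, so `+=` works without first checking whether the key exists. The names `totals_dict` and `totals_counter` are illustrative and not part of the original example.

```python
from collections import Counter

portfolio = [
    ('GOOG', 100, 490.1),
    ('IBM', 50, 91.1),
    ('CAT', 150, 83.44),
    ('IBM', 100, 45.23),
    ('GOOG', 75, 572.45),
    ('AA', 50, 23.15)
]

# Plain dict: the missing-key case must be handled explicitly.
totals_dict = {}
for name, shares, price in portfolio:
    totals_dict[name] = totals_dict.get(name, 0) + shares

# Counter: a missing key counts as 0, so += just works.
totals_counter = Counter()
for name, shares, price in portfolio:
    totals_counter[name] += shares

print(totals_dict['IBM'])       # 150
print(totals_counter['IBM'])    # 150
print(totals_counter['MSFT'])   # 0 (no KeyError, and the key is not added)
```

Both approaches produce the same totals. The `Counter` version is shorter and also provides extras such as `most_common()`.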

### Example: One-Many Mappings

```python
portfolio = [
    ('GOOG', 100, 490.1),
    ('IBM', 50, 91.1),
    ('CAT', 150, 83.44),
    ('IBM', 100, 45.23),
    ('GOOG', 75, 572.45),
    ('AA', 50, 23.15)
]
```

```python
from collections import defaultdict
holdings = defaultdict(list)
for name, shares, price in portfolio:
    holdings[name].append((shares, price))

holdings['IBM']     # [(50, 91.1), (100, 45.23)]
```

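A `defaultdict` ensures that every time you access a key, you get a default value. If the key is missing, the factory passed to the constructor (`list` in this case) is called to create it.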
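The default-value behavior can be sketched as follows. This is a small illustration with hand-entered data, not part of the original example. It also shows one thing worth knowing: simply reading a missing key with `[]` creates an entry for it, whereas `.get()` and `in` do not.

```python
from collections import defaultdict

holdings = defaultdict(list)

# Accessing a missing key calls list() and stores the result under that key.
holdings['IBM'].append((50, 91.1))
holdings['IBM'].append((100, 45.23))
print(holdings['IBM'])        # [(50, 91.1), (100, 45.23)]

# Merely reading a missing key with [] also creates an entry for it.
print(holdings['AA'])         # []
print('AA' in holdings)       # True

# .get() and `in` look up a key without creating it.
print(holdings.get('CAT'))    # None
print('CAT' in holdings)      # False
```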

### Example: Keeping a History

```python
from collections import deque

# Keep only the last N items; older ones are discarded automatically
history = deque(maxlen=N)
with open(filename) as f:
    for line in f:
        history.append(line)
        ...
```
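The snippet above leaves `N` and `filename` unspecified. As a runnable sketch of the same idea, the version below uses an in-memory list as a stand-in for the file and picks `maxlen=3` arbitrarily. Both choices are assumptions made for illustration.

```python
from collections import deque

# Stand-in for the lines of a file (hypothetical data).
lines = ['line 1', 'line 2', 'line 3', 'line 4', 'line 5']

history = deque(maxlen=3)
for line in lines:
    # Once the deque is full, appending drops the oldest item.
    history.append(line)

print(list(history))    # ['line 3', 'line 4', 'line 5']
```

Only the three most recent items survive, which is what makes a bounded `deque` convenient for "last N" bookkeeping.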

## Exercises

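The `collections` module might be one of the most useful library modules for special-purpose data handling problems such as tabulating and indexing. Start by running your `report.py` program in interactive mode: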

```bash
bash % python3 -i report.py
```

### Exercise 2.18: Tabulating with Counter

```python
>>> portfolio = read_portfolio('Data/portfolio.csv')
>>> from collections import Counter
>>> holdings = Counter()
>>> for s in portfolio:
...     holdings[s['name']] += s['shares']
...
>>> holdings
Counter({'MSFT': 250, 'IBM': 150, 'CAT': 150, 'AA': 100, 'GE': 95})
>>>
```

```python
>>> holdings['IBM']
150
>>> holdings['MSFT']
250
>>>
```

```python
>>> # Get three most held stocks
>>> holdings.most_common(3)
[('MSFT', 250), ('IBM', 150), ('CAT', 150)]
>>>
```

```python
>>> portfolio2 = read_portfolio('Data/portfolio2.csv')
>>> holdings2 = Counter()
>>> for s in portfolio2:
...     holdings2[s['name']] += s['shares']
...
>>> holdings2
Counter({'HPQ': 250, 'GE': 125, 'AA': 50, 'MSFT': 25})
>>>
```

```python
>>> holdings
Counter({'MSFT': 250, 'IBM': 150, 'CAT': 150, 'AA': 100, 'GE': 95})
>>> holdings2
Counter({'HPQ': 250, 'GE': 125, 'AA': 50, 'MSFT': 25})
>>> combined = holdings + holdings2
>>> combined
Counter({'MSFT': 275, 'HPQ': 250, 'GE': 220, 'AA': 150, 'IBM': 150, 'CAT': 150})
>>>
```
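If you do not have the CSV files at hand, the combining step can be tried with small hand-written counters. The sketch below reuses a few values from the session above rather than reading any file, so the data is an assumption made for illustration. It also shows that subtraction, unlike addition, drops any result that is zero or negative.

```python
from collections import Counter

# Hand-entered subsets of the two portfolios (illustrative data).
holdings = Counter({'MSFT': 250, 'IBM': 150, 'GE': 95})
holdings2 = Counter({'HPQ': 250, 'GE': 125, 'MSFT': 25})

# Addition sums the counts key by key.
combined = holdings + holdings2
print(combined['MSFT'])    # 275
print(combined['GE'])      # 220
print(combined['HPQ'])     # 250

# Subtraction keeps only positive results: GE (95 - 125) is dropped.
diff = holdings - holdings2
print(diff)                # Counter({'MSFT': 225, 'IBM': 150})
```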

### Commentary: collections module

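The `collections` module is one of the most useful library modules in all of Python. In fact, we could do an extended tutorial on just that. However, doing so now would be a distraction. For now, put `collections` on your list of bedtime reading for later.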
