
03 Python

The document provides an overview of Python dictionaries and sets. It discusses how dictionaries can serve as sparse arrays, and how to create, access, update and remove entries from dictionaries. It also gives examples of counting word frequencies in text using dictionaries.

Uploaded by

webinar trainer
Copyright
© All Rights Reserved

Python dicts and sets

Some material adapted from Upenn cis391 slides and other sources

Overview

• Python doesn’t have traditional vectors and arrays!
• Instead, Python makes heavy use of the dict datatype (a hashtable) which can serve as a sparse array
• Efficient traditional arrays are available as modules that interface to C
• A Python set is derived from a dict
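A minimal sketch of the sparse-array idea from the Overview, written for Python 3. The names `sparse` and `get_cell` are illustrative and do not come from the slides. Only the cells that were set are stored, keyed by index, and any other index reads as a default of 0.

```python
# Sparse "array": store only the entries that were set, keyed by index.
sparse = {}              # index -> value
sparse[3] = 10
sparse[1000000] = 7      # no need to allocate a million slots

def get_cell(index, default=0):
    """Return the stored value, or the default for an unset index."""
    return sparse.get(index, default)

total = sum(sparse.values())   # only stored cells contribute
```

Here `get_cell(42)` returns 0 even though index 42 was never stored, and `len(sparse)` stays at 2.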

Dictionaries: A Mapping type

• Dictionaries store a mapping between a set of keys and a set of values
• Keys can be any immutable type.
• Values can be any type
• A single dictionary can store values of different types
• You can define, modify, view, lookup or delete the key-value pairs in the dictionary
• Python’s dictionaries are also known as hash tables and associative arrays

Creating & accessing dictionaries

>>> d = {'user':'bozo', 'pswd':1234}
>>> d['user']
'bozo'
>>> d['pswd']
1234
>>> d['bozo']
Traceback (innermost last):
  File '<interactive input>' line 1, in ?
KeyError: bozo
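A failed lookup such as d['bozo'] can be avoided or handled in a couple of ways. The following Python 3 sketch reuses the slide's dictionary `d`; the variable names `found`, `missing`, and `message` are illustrative only.

```python
d = {'user': 'bozo', 'pswd': 1234}

# Test membership before indexing.
found = 'user' in d          # True
missing = 'bozo' in d        # False: 'bozo' is a value, not a key

# Or catch the exception raised by a failed lookup.
try:
    value = d['bozo']
except KeyError as err:
    message = "no such key: %s" % err.args[0]
```

Membership tests with `in` look only at keys, which is why `'bozo' in d` is false even though 'bozo' is stored as a value.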

Updating Dictionaries

>>> d = {'user':'bozo', 'pswd':1234}
>>> d['user'] = 'clown'
>>> d
{'user':'clown', 'pswd':1234}
• Keys must be unique
• Assigning to an existing key replaces its value
>>> d['id'] = 45
>>> d
{'user':'clown', 'id':45, 'pswd':1234}
• Dictionaries are unordered (true for the Python 2 shown here; since Python 3.7, dicts preserve insertion order)
• New entries can appear anywhere in output
• Dictionaries work by hashing

Removing dictionary entries

>>> d = {'user':'bozo', 'p':1234, 'i':34}
>>> del d['user']  # Remove one.
>>> d
{'p':1234, 'i':34}
>>> d.clear()  # Remove all.
>>> d
{}
>>> a=[1,2]
>>> del a[1]  # del works on lists, too
>>> a
[1]
Useful Accessor Methods

>>> d = {'user':'bozo', 'p':1234, 'i':34}
>>> d.keys()  # List of keys, VERY useful
['user', 'p', 'i']
>>> d.values()  # List of values
['bozo', 1234, 34]
>>> d.items()  # List of item tuples
[('user','bozo'), ('p',1234), ('i',34)]

A Dictionary Example

Problem: count the frequency of each word in text read from the standard input, print results
Six versions of increasing complexity
• wf1.py is a simple start
• wf2.py uses a common idiom for default values
• wf3.py sorts the output alphabetically
• wf4.py downcases words, strips punctuation from them, and ignores stop words
• wf5.py sorts the output by frequency
• wf6.py adds command line options: -n, -t, -h
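Before the wf scripts, one note on the accessor methods keys(), values(), and items(): the list results shown for them are Python 2 behavior. In Python 3 these methods return view objects, so wrap them in `list()` when an actual list is needed. A small sketch, with illustrative variable names:

```python
d = {'user': 'bozo', 'p': 1234, 'i': 34}

keys = list(d.keys())        # views become lists only when wrapped in list()
values = list(d.values())
pairs = list(d.items())

# items() is the usual way to loop over keys and values together.
lines = []
for key, value in d.items():
    lines.append("%s=%s" % (key, value))
```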

Dictionary example: wf1.py

#!/usr/bin/python
import sys
freq = {}  # frequency of words in text
for line in sys.stdin:
    for word in line.split():
        if word in freq:
            freq[word] = 1 + freq[word]
        else:
            freq[word] = 1
print freq

The if/else test inside the inner loop is a common pattern.
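The same counting logic can be wrapped in a function so that it can be tried without piping text through standard input. This is a Python 3 sketch; the function name `count_words` and the sample text are assumptions for illustration and are not part of wf1.py.

```python
def count_words(lines):
    """Count whitespace-separated words with the test-then-increment pattern."""
    freq = {}
    for line in lines:
        for word in line.split():
            if word in freq:
                freq[word] = 1 + freq[word]
            else:
                freq[word] = 1
    return freq

sample = ["the cat sat", "the cat ran"]
counts = count_words(sample)
print(counts)
```

Passing `sys.stdin` in place of `sample` should give the script's behavior, since a file object also yields one line at a time.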

Dictionary example wf2.py

#!/usr/bin/python
import sys
freq = {}  # frequency of words in text
for line in sys.stdin:
    for word in line.split():
        freq[word] = 1 + freq.get(word, 0)
print freq

In freq.get(word, 0), the first argument is the key and the second is the default value returned if the key is not found.

Dictionary example wf3.py

#!/usr/bin/python
import sys
freq = {}  # frequency of words in text
for line in sys.stdin:
    for word in line.split():
        freq[word] = 1 + freq.get(word, 0)
for w in sorted(freq.keys()):
    print w, freq[w]
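A Python 3 sketch of the get-with-default idiom combined with alphabetical output. The function names `count_words` and `report` and the sample input are hypothetical; the scripts themselves read from standard input.

```python
def count_words(lines):
    """Count words using dict.get with a default of 0."""
    freq = {}
    for line in lines:
        for word in line.split():
            freq[word] = 1 + freq.get(word, 0)
    return freq

def report(freq):
    """Return 'word count' lines in alphabetical order of the words."""
    return ["%s %d" % (w, freq[w]) for w in sorted(freq)]

freq = count_words(["b a b", "c b"])
for line in report(freq):
    print(line)
```

Iterating over `sorted(freq)` is equivalent to `sorted(freq.keys())`, because iterating a dict yields its keys.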

Dictionary example wf4.py

#!/usr/bin/python
import sys
# characters stripped from both ends of each word
punct = """'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'"""
freq = {}  # frequency of words in text
stop_words = set()
for line in open("stop_words.txt"):
    stop_words.add(line.strip())
for line in sys.stdin:
    for word in line.split():
        word = word.strip(punct).lower()
        if word not in stop_words:
            freq[word] = freq.get(word,0)+1
# print sorted words and their frequencies
for w in sorted(freq.keys()):
    print w, freq[w]
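The normalization step in wf4.py can be sketched in Python 3 as follows. Two things here are assumptions rather than the script's own choices: the standard library constant `string.punctuation` stands in for the hand-written punctuation string, and tokens that become empty after stripping are skipped. The names `normalize` and `count_words` and the inline stop-word set are hypothetical.

```python
import string

def normalize(word):
    """Strip leading/trailing punctuation and lowercase the word."""
    return word.strip(string.punctuation).lower()

def count_words(lines, stop_words):
    """Count normalized words that are not stop words."""
    freq = {}
    for line in lines:
        for raw in line.split():
            word = normalize(raw)
            # Skip stop words, and tokens that were pure punctuation.
            if word and word not in stop_words:
                freq[word] = freq.get(word, 0) + 1
    return freq

stop_words = {"the", "a"}    # stand-in for the contents of stop_words.txt
freq = count_words(['The cat, a "Cat" -- the end!'], stop_words)
```

Because `strip` only removes characters from the two ends of a string, an interior apostrophe as in "don't" is left alone.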

Dictionary example wf5.py

#!/usr/bin/python
import sys
from operator import itemgetter
…
# sort (word, frequency) pairs by frequency, highest first
words = sorted(freq.items(), key=itemgetter(1), reverse=True)
for (w,f) in words:
    print w, f

Dictionary example wf6.py

from optparse import OptionParser
# read command line arguments and process
parser = OptionParser()
# default=None reports all words; a default of -1 would make
# words[:options.number] drop the last word
parser.add_option('-n', '--number', type="int",
                  default=None, help='number of words to report')
parser.add_option("-t", "--threshold", type="int",
                  default=0, help="print if frequency > threshold")
(options, args) = parser.parse_args()
...
# print the top options.number words but only those
# with freq > options.threshold
for (word, freq) in words[:options.number]:
    if freq > options.threshold:
        print freq, word
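The ranking and filtering done by wf5.py and wf6.py can be sketched as one Python 3 function. The name `top_words`, its parameters, and the sample counts are hypothetical; command line parsing is left out so the logic can be run directly.

```python
from operator import itemgetter

def top_words(freq, number=None, threshold=0):
    """Return (word, count) pairs, most frequent first.

    number=None keeps every pair; pairs with count <= threshold are dropped.
    """
    ranked = sorted(freq.items(), key=itemgetter(1), reverse=True)
    return [(w, f) for (w, f) in ranked[:number] if f > threshold]

freq = {'cat': 3, 'dog': 1, 'end': 2}
everything = top_words(freq)
top_two = top_words(freq, number=2)
frequent = top_words(freq, threshold=1)
```

A slice bound of -1 would drop the last pair, so `None` is used in this sketch to mean "no limit".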

Why must keys be immutable?

• The keys used in a dictionary must be immutable objects
>>> name1, name2 = 'john', ['bob', 'marley']
>>> fav = name2
>>> d = {name1: 'alive', name2: 'dead'}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list objects are unhashable
• Why is this?
• Suppose we could index a value for name2
• and then did fav[0] = "Bobby"
• Could we find d[name2] or d[fav] or …?

defaultdict

>>> from collections import defaultdict
>>> kids = defaultdict(list, {'alice': ['mary', 'nick'], 'bob': ['oscar', 'peggy']})
>>> kids['bob']
['oscar', 'peggy']
>>> kids['carol']
[]
>>> age = defaultdict(int)
>>> age['alice'] = 30
>>> age['bob']
0
>>> age['bob'] += 1
>>> age
defaultdict(<type 'int'>, {'bob': 1, 'alice': 30})
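A defaultdict(int) also fits the word-frequency problem, since a missing word starts at 0 without any explicit test. A Python 3 sketch; the function name `count_words` and the sample text are assumptions for illustration.

```python
from collections import defaultdict

def count_words(lines):
    """Count words; missing keys start at int() == 0 automatically."""
    freq = defaultdict(int)
    for line in lines:
        for word in line.split():
            freq[word] += 1
    return freq

freq = count_words(["the cat sat", "the cat ran"])
before = len(freq)
probe = freq['dog']      # reading a missing key inserts it with value 0
after = len(freq)
```

One design point to keep in mind: merely looking up a missing key in a defaultdict adds that key, so `after` is one larger than `before`. Use `'dog' in freq` or `freq.get('dog')` to check without inserting.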
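Returning to the question of why keys must be immutable, the following Python 3 sketch shows the list key being rejected and a tuple with the same contents being accepted. The variable names `list_key_ok`, `key`, and `still_found` are illustrative.

```python
name2 = ['bob', 'marley']

# A list cannot be a key because it is unhashable.
try:
    d = {name2: 'dead'}
    list_key_ok = True
except TypeError:
    list_key_ok = False

# A tuple with the same contents is immutable and hashable, so it works.
key = tuple(name2)
d = {key: 'dead'}

name2[0] = 'Bobby'                    # mutating the list later...
still_found = d[('bob', 'marley')]    # ...cannot disturb the stored key
```

Since the tuple cannot change after it is stored, its hash stays valid and the entry can always be found again, which is exactly what a mutable list key could not promise.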
