Pythonlearn 07 Files

This document discusses file processing and reading files in Python. It begins by explaining that a text file can be thought of as a sequence of lines, with each line ending in a newline character. It then covers opening a file using the open() function, which returns a file handle that can be used to read or write to the file. The rest of the document discusses reading the file line by line as a sequence, with each line represented as a string.

Uploaded by

Hưng Minh Phan

10/01/21

Reading Files
Chapter 7

Python for Everybody
www.py4e.com

[Diagram labels: Software, Input and Output Devices, Central Processing Unit, Main Memory, Secondary Memory. Callouts: "What Next?", "It is time to go find some Data to mess with!", "Files R Us". Code fragment: if x < 3: print]

File excerpt shown in the diagram:

From [email protected] Sat Jan 5 09:14:16 2008
Return-Path: <[email protected]>
Date: Sat, 5 Jan 2008 09:12:18 -0500
To: [email protected]
From: [email protected]
Subject: [sakai] svn commit: r39772 - content/branches/
Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772
...

File Processing

A text file can be thought of as a sequence of lines

From [email protected] Sat Jan 5 09:14:16 2008
Return-Path: <[email protected]>
Date: Sat, 5 Jan 2008 09:12:18 -0500
To: [email protected]
From: [email protected]
Subject: [sakai] svn commit: r39772 - content/branches/

Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772

http://www.py4e.com/code/mbox-short.txt

Opening a File

• Before we can read the contents of the file, we must tell Python which file we are going to work with and what we will be doing with the file

• This is done with the open() function

• open() returns a “file handle” - a variable used to perform operations on the file

• Similar to “File -> Open” in a Word Processor

Using open()

fhand = open('mbox.txt', 'r')

• handle = open(filename, mode)

• returns a handle used to manipulate the file

• filename is a string

• mode is optional and should be 'r' if we are planning to read the file and 'w' if we are going to write to the file

What is a Handle?

>>> fhand = open('mbox.txt')
>>> print(fhand)
<_io.TextIOWrapper name='mbox.txt' mode='r' encoding='UTF-8'>
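The two modes can be tried without having mbox.txt on disk. The following is a minimal sketch, assuming a scratch file called demo.txt in a temporary directory (a hypothetical name, not one used in the slides): it writes the file with mode 'w', then reopens it with mode 'r' and inspects the handle.

```python
import os
import tempfile

# Hypothetical scratch file; the slides use mbox.txt instead.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# Mode 'w' opens the file for writing, creating it if needed.
fhand = open(path, 'w')
fhand.write('Hello\nWorld!\n')
fhand.close()

# Mode 'r' opens the file for reading; it is also the default.
fhand = open(path, 'r')
mode = fhand.mode        # the mode the handle was opened with
contents = fhand.read()  # the text that was written earlier
fhand.close()

print(mode)
print(contents)
```

The handle is not the data itself; it is the object through which the data is read or written, which is why read() has to be called on it.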

When Files are Missing

>>> fhand = open('stuff.txt')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'stuff.txt'

The newline Character

• We use a special character called the “newline” to indicate when a line ends

• We represent it as \n in strings

• Newline is still one character - not two

>>> stuff = 'Hello\nWorld!'
>>> stuff
'Hello\nWorld!'
>>> print(stuff)
Hello
World!
>>> stuff = 'X\nY'
>>> print(stuff)
X
Y
>>> len(stuff)
3
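Both points can be checked in a short script. This is a sketch under the assumption that stuff.txt does not exist in a freshly created temporary directory; the directory and the variable names missing and length are illustrative only.

```python
import os
import tempfile

# A path inside a brand-new temporary directory, so the file cannot exist.
path = os.path.join(tempfile.mkdtemp(), 'stuff.txt')

try:
    fhand = open(path)
    missing = False
except FileNotFoundError as err:
    # open() raises FileNotFoundError when the file is not there.
    missing = True
    print('No such file:', err.filename)

# The newline is written as two characters in source code (\ and n)
# but is stored as a single character in the string.
stuff = 'X\nY'
length = len(stuff)
print(length)
```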

File Processing

A text file can be thought of as a sequence of lines

From [email protected] Sat Jan 5 09:14:16 2008
Return-Path: <[email protected]>
Date: Sat, 5 Jan 2008 09:12:18 -0500
To: [email protected]
From: [email protected]
Subject: [sakai] svn commit: r39772 - content/branches/

Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772

File Processing

A text file has newlines at the end of each line

From [email protected] Sat Jan 5 09:14:16 2008\n
Return-Path: <[email protected]>\n
Date: Sat, 5 Jan 2008 09:12:18 -0500\n
To: [email protected]\n
From: [email protected]\n
Subject: [sakai] svn commit: r39772 - content/branches/\n
\n
Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772\n
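The trailing newlines become visible when each line is printed with repr(). A minimal sketch, assuming a small scratch file with made-up header lines (the file name and its contents are hypothetical, not taken from mbox-short.txt):

```python
import os
import tempfile

# Hypothetical scratch file standing in for a mail file.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
fhand = open(path, 'w')
fhand.write('To: someone\nFrom: someone else\n\nDetails: see below\n')
fhand.close()

lines = []
fhand = open(path)
for line in fhand:
    lines.append(line)
    # repr() shows the newline as \n instead of moving to a new line.
    print(repr(line))
fhand.close()
```

Note that the blank line in the file is not an empty string: it is a string holding exactly one newline character.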

Reading Files in Python

File Handle as a Sequence

• A file handle open for read can be treated as a sequence of strings where each line in the file is a string in the sequence

• We can use the for statement to iterate through a sequence

• Remember - a sequence is an ordered set

xfile = open('mbox.txt')
for cheese in xfile:
    print(cheese)
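One practical difference from a list is worth seeing once: the handle is read from start to end, so a second loop over the same handle finds nothing left. The sketch below assumes a three-line scratch file; the file name and the variable names first_pass and second_pass are illustrative.

```python
import os
import tempfile

# Hypothetical three-line scratch file.
path = os.path.join(tempfile.mkdtemp(), 'lines.txt')
fhand = open(path, 'w')
fhand.write('first\nsecond\nthird\n')
fhand.close()

xfile = open(path)

# The first loop walks through every line in order.
first_pass = []
for cheese in xfile:
    first_pass.append(cheese)

# The handle is now at the end of the file, so this loop body never runs.
second_pass = []
for cheese in xfile:
    second_pass.append(cheese)

xfile.close()
print(len(first_pass), len(second_pass))
```

To read the file again, open it a second time and loop over the new handle.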

Counting Lines in a File

• Open a file read-only

• Use a for loop to read each line

• Count the lines and print out the number of lines

fhand = open('mbox.txt')
count = 0
for line in fhand:
    count = count + 1
print('Line Count:', count)

$ python open.py
Line Count: 132045

Reading the *Whole* File

We can read the whole file (newlines and all) into a single string

>>> fhand = open('mbox-short.txt')
>>> inp = fhand.read()
>>> print(len(inp))
94626
>>> print(inp[:20])
From stephen.marquar
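The two approaches can be compared on the same file. This sketch assumes a tiny scratch file with three made-up lines (not mbox.txt), and shows that the string returned by read() contains one newline per counted line.

```python
import os
import tempfile

# Hypothetical scratch file with three short lines.
path = os.path.join(tempfile.mkdtemp(), 'tiny.txt')
fhand = open(path, 'w')
fhand.write('From: a\nTo: b\nSubject: c\n')
fhand.close()

# Approach 1: count the lines with a for loop.
fhand = open(path)
count = 0
for line in fhand:
    count = count + 1
fhand.close()

# Approach 2: read the whole file into one string.
fhand = open(path)
inp = fhand.read()
fhand.close()

print('Line Count:', count)
print(len(inp))          # characters, newlines included
print(inp.count('\n'))   # one newline per line in this file
print(inp[:7])           # slicing works as on any string
```

Reading line by line keeps only one line in memory at a time, while read() holds the entire file, which matters for large files.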

Searching Through a File

We can put an if statement in our for loop to only print lines that meet some criteria

fhand = open('mbox-short.txt')
for line in fhand:
    if line.startswith('From:') :
        print(line)

OOPS!

What are all these blank lines doing here?

From: [email protected]

From: [email protected]

From: [email protected]

From: [email protected]

...

OOPS!

What are all these blank lines doing here?

• Each line from the file has a newline at the end

• The print statement adds a newline to each line

From: [email protected]\n
\n
From: [email protected]\n
\n
From: [email protected]\n
\n
From: [email protected]\n
\n
...

Searching Through a File (fixed)

• We can strip the whitespace from the right-hand side of the string using rstrip() from the string library

• The newline is considered “white space” and is stripped

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if line.startswith('From:') :
        print(line)

From: [email protected]
From: [email protected]
From: [email protected]
From: [email protected]
....
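The effect of rstrip() is easiest to see on a single string. A small sketch, using a made-up address and subject line rather than real lines from the file:

```python
# Hypothetical line, shaped like one read from a mail file.
line = 'From: someone@example.com\n'

# rstrip() returns a new string without the trailing whitespace.
stripped = line.rstrip()
print(repr(line))
print(repr(stripped))

# It removes every trailing whitespace character, not only the newline.
padded = 'Subject: hello  \t\n'
cleaned = padded.rstrip()
print(repr(cleaned))

# Whitespace at the start or in the middle of the string is kept.
indented = '  From: x\n'
print(repr(indented.rstrip()))
```

Because strings cannot be changed in place, the result has to be assigned back, as in line = line.rstrip().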

Skipping with continue

We can conveniently skip a line by using the continue statement

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if not line.startswith('From:') :
        continue
    print(line)

Using in to Select Lines

We can look for a string anywhere in a line as our selection criteria

fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if not '@uct.ac.za' in line :
        continue
    print(line)

From [email protected] Sat Jan 5 09:14:16 2008
X-Authentication-Warning: set sender to [email protected] using -f
From: [email protected]
Author: [email protected]
From [email protected] Fri Jan 4 07:02:32 2008
X-Authentication-Warning: set sender to [email protected] using -f
...
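The same selection loop can be run on a small file with known contents. This is a sketch assuming a scratch file with four made-up lines; the addresses are hypothetical, and the test is written as "not in", which behaves the same as the slide's "not ... in" form.

```python
import os
import tempfile

# Hypothetical scratch file; two of its four lines mention @uct.ac.za.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
fhand = open(path, 'w')
fhand.write('From: alice@uct.ac.za\n'
            'Subject: hello\n'
            'From: bob@example.com\n'
            'Author: carol@uct.ac.za\n')
fhand.close()

selected = []
fhand = open(path)
for line in fhand:
    line = line.rstrip()
    # Skip to the next line unless the text appears somewhere in this one.
    if '@uct.ac.za' not in line:
        continue
    selected.append(line)
    print(line)
fhand.close()
```

Unlike startswith(), the in operator matches anywhere in the line, so the Author: line is selected as well as the From: line.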

Prompt for File Name

fname = input('Enter the file name: ')
fhand = open(fname)
count = 0
for line in fhand:
    if line.startswith('Subject:') :
        count = count + 1
print('There were', count, 'subject lines in', fname)

Enter the file name: mbox.txt
There were 1797 subject lines in mbox.txt

Enter the file name: mbox-short.txt
There were 27 subject lines in mbox-short.txt

Bad File Names

fname = input('Enter the file name: ')
try:
    fhand = open(fname)
except:
    print('File cannot be opened:', fname)
    quit()

count = 0
for line in fhand:
    if line.startswith('Subject:') :
        count = count + 1
print('There were', count, 'subject lines in', fname)

Enter the file name: mbox.txt
There were 1797 subject lines in mbox.txt

Enter the file name: na na boo boo
File cannot be opened: na na boo boo
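One possible variation wraps the same logic in a function and catches only file-related errors. This is a sketch, not the slides' version: the function name count_subject_lines, the scratch file, and the choice to return None instead of calling quit() are all assumptions made for illustration. A bare except also catches unrelated problems such as a keyboard interrupt, whereas OSError covers the cases where a file cannot be opened.

```python
import os
import tempfile

def count_subject_lines(fname):
    """Count lines starting with 'Subject:'; return None if the file cannot be opened."""
    try:
        fhand = open(fname)
    except OSError:
        # FileNotFoundError and PermissionError are both kinds of OSError.
        print('File cannot be opened:', fname)
        return None
    count = 0
    for line in fhand:
        if line.startswith('Subject:'):
            count = count + 1
    fhand.close()
    return count

# Hypothetical scratch file with two subject lines.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'mail.txt')
fhand = open(path, 'w')
fhand.write('Subject: one\nFrom: x\nSubject: two\n')
fhand.close()

good = count_subject_lines(path)
bad = count_subject_lines(os.path.join(folder, 'na na boo boo'))
print(good, bad)
```

Returning a value rather than quitting lets the caller decide what to do, for example asking for another file name.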

Summary

• Secondary storage
• Opening a file - file handle
• File structure - newline character
• Reading a file line by line with a for loop
• Searching for lines
• Reading file names
• Dealing with bad files

Acknowledgements / Contributions

These slides are Copyright 2010- Charles R. Severance (www.dr-chuck.com) of the University of Michigan School of Information and open.umich.edu and made available under a Creative Commons Attribution 4.0 License. Please maintain this last slide in all copies of the document to comply with the attribution requirements of the license. If you make a change, feel free to add your name and organization to the list of contributors on this page as you republish the materials.

Initial Development: Charles Severance, University of Michigan School of Information

… Insert new Contributors and Translators here

...
