This week we'll learn how to use one of the most fundamental computer data structures: the file.

Review

  • Last week we took input from the command line
    • We'll use this again this week.
  • We also looked at how to take input from an environment variable

Here's an example of taking input on the command line:

import sys 
 
prog, myvar, yourvar = sys.argv 
print (f'prog = {prog} myvar = {myvar} yourvar = {yourvar}')

You should know what this program does when you call it this way:

$ python3.6 program fun profit 

What would happen if you call it like this?

$ python3.6 program fun 
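
The second call doesn't supply enough arguments for the three names on the left of the assignment, so Python raises a ValueError. Here's a minimal sketch of one way to check the argument count first. The parse_args function and the usage message are made up for this illustration; they aren't part of the program above.

```python
import sys


def parse_args(argv):
    """Return (myvar, yourvar), or None if the argument count is wrong."""
    if len(argv) != 3:
        return None
    prog, myvar, yourvar = argv
    return myvar, yourvar


if __name__ == '__main__':
    args = parse_args(sys.argv)
    if args is None:
        # Hypothetical usage message, shown instead of a traceback.
        print(f'usage: {sys.argv[0]} myvar yourvar')
    else:
        print(f'myvar = {args[0]} yourvar = {args[1]}')
```

Checking len() before unpacking lets the program explain what it expected instead of stopping with a traceback.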
  • Environment variables are essential for controlling how programs work.
  • The FLASK_DEBUG variable is great for debugging your Flask applications.

Here's an example of accessing environment variables.

import os 
print ('You are logged in as the user {}'.format(os.environ['USER']))

Try that code in your c9 project. What user are you?
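
Looking up a variable that isn't set with os.environ['NAME'] raises a KeyError. Here's a small sketch that uses the dictionary-style get() method to supply a fallback instead. The variable name CIS15_MISSING_VARIABLE is invented for the example and is assumed not to be set on your machine.

```python
import os

# os.environ behaves like a dictionary, so get() returns a fallback
# value instead of raising KeyError when the variable is not set.
user = os.environ.get('USER', 'unknown')
print('You are logged in as the user {}'.format(user))

# CIS15_MISSING_VARIABLE is a made-up name used only for illustration.
missing = os.environ.get('CIS15_MISSING_VARIABLE', 'not set')
print('CIS15_MISSING_VARIABLE is {}'.format(missing))
```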

Reading Files

  • Variables are the most fundamental type of storage on a computer
    • But they are gone when the power is off
  • Files are typically stored on non-volatile media.
    • Non-volatile means that it retains its data without the need for electrical power
    • Hard disks (spinning and SSD) are non-volatile
    • Flash keys are too.
  • Files are logically a bunch of bytes in a particular order
    • A byte is 8 bits
    • A bit is one binary digit (a single one or zero)
  • In order to access a file you must first open it

The examples below use this file:

file.txt
This is file.txt
This is the second line
This is the third line

This code opens the file file.txt and makes it accessible through the variable f.

>>> f = open('file.txt') 
>>> f 
<_io.TextIOWrapper name='file.txt' mode='r' encoding='UTF-8'>
>>> type(f)
<class '_io.TextIOWrapper'>
  • The contents of a file can be slurped up one line at a time
  • They can also be read in bulk (the whole file) or in specific increments.

Here's code that reads the file one line at a time:

>>> f = open('file.txt') 
>>> f.readline()
'This is file.txt\n'
>>> f.readline()
'This is the second line \n'
>>> f.readline()
'This is the third line \n'
>>> f.readline()
'\n'
>>> f.readline()
''
>>> f.readline()
''
  • Notice that each readline() advances our place in the file.
  • The file works like a book: as you read, you move forward.
  • When readline() hits the end of the file it returns the empty string
  • The read() function reads the whole file into a string.

Here's how the read() function works:

>>> f = open('file.txt') 
>>> f.read()
'This is file.txt\nThis is the second line \nThis is the third line \n\n'
>>> f.read()
''
You may have noticed that the file contents contain something funny: the \n character. This looks like two characters but it's only one, the newline character. The newline character starts the next line. We'll learn more about fancy characters next week.
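
Reading in specific increments works by passing a number of characters to read(). The sketch below assumes a two-line stand-in for file.txt that it creates in a temporary directory, so it doesn't touch your real files. It also checks that the newline really is a single character.

```python
import os
import tempfile

# Build a small stand-in for file.txt in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'file.txt')
f = open(path, 'w')
f.write('This is file.txt\nThis is the second line\n')
f.close()

f = open(path)
first = f.read(4)            # the first four characters
rest_of_line = f.read(13)    # the rest of line one, including the newline
f.close()

print(repr(first))
print(repr(rest_of_line))
print(len('\n'))             # the newline counts as one character
```

Just like readline(), each read() call picks up where the last one stopped.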
  • When you're done with a file you have to close it.
  • Closing is extra important when you're writing to a file because the contents may not be on disk until you call close()
  • After calling close() you can no longer access the file.

Here's an example of how to use close()

>>> f = open('file.txt')
>>> f.read()
'This is file.txt\nThis is the second line \nThis is the third line \n\n'
>>> f.close()
>>> f.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file.
  • A ValueError happens when you read a closed file.
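
If you aren't sure whether a file is still open you can ask it. Here's a minimal sketch that uses the closed attribute of the file object and catches the ValueError with try/except, which these notes haven't covered yet. The temporary file is a stand-in for file.txt.

```python
import os
import tempfile

# Make a small file to experiment with (a stand-in for file.txt).
path = os.path.join(tempfile.mkdtemp(), 'file.txt')
f = open(path, 'w')
f.write('This is file.txt\n')
f.close()

f = open(path)
print(f.closed)    # the file is open at this point
f.close()
print(f.closed)    # the file is closed at this point

# Reading a closed file raises ValueError, which we can catch.
try:
    f.read()
    read_failed = False
except ValueError as err:
    read_failed = True
    print('Could not read:', err)
```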

Code Example

Here's a program that works like UNIX's head program. It prints the first few lines of a file.

head.py
''' CIS-15 Example 
head.py - A Python version of UNIX head
'''
 
import sys 
 
prog, filename = sys.argv 
 
# Open the file
filehandle = open(filename, 'r')
 
# Print the first four lines
print (filehandle.readline(), end='')
print (filehandle.readline(), end='')
print (filehandle.readline(), end='')
print (filehandle.readline(), end='')
 
# Close the file. 
filehandle.close()

Here's the same program written with Python's with open construct:

head.py
''' CIS-15 Example 
 
head.py - A Python version of UNIX head
'''
 
import sys 
 
prog, filename = sys.argv 
 
# Open the file
with open(filename, 'r') as filehandle :
 
    # Print the first four lines
    print (filehandle.readline(), end='')
    print (filehandle.readline(), end='')
    print (filehandle.readline(), end='')
    print (filehandle.readline(), end='')
 
# File is closed automatically!
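
Calling readline() four times works, but the number of lines is fixed and a short file just prints empty strings at the end. Here's a sketch of the same idea with a loop that stops at the end of the file. The head function and its count parameter are my own names, not part of the course code, and the temporary file is a stand-in for file.txt.

```python
import os
import tempfile


def head(filename, count=4):
    """Return up to count lines from the start of filename."""
    lines = []
    with open(filename, 'r') as filehandle:
        for _ in range(count):
            line = filehandle.readline()
            if line == '':      # the empty string means end of file
                break
            lines.append(line)
    return lines


# Try it on a three-line stand-in for file.txt.
path = os.path.join(tempfile.mkdtemp(), 'file.txt')
with open(path, 'w') as f:
    f.write('This is file.txt\nThis is the second line\nThis is the third line\n')

for line in head(path, 2):
    print(line, end='')
```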

Writing Files

  • What happens when you want to store data permanently?
  • You need to write a file.
  • When you want to write a file you have to slightly change the call to open()

Here's an example of how to write a file:

>>> f = open('output.txt', 'w') 
>>> f.write('This is my important saved data.') 
32
>>> f.write('And some less important data.') 
29
>>> f.close()
  • Notice something?
  • The write() function returns an integer
    • The integer is the number of characters actually written to the file.
    • This is useful when you write formatted strings or f-strings, whose final length you may not know in advance.

Here's what output.txt contains after running the code above:

This is my important saved data.And some less important data.
  • The output doesn't contain newlines (\n).
  • write() isn't like print(): it never automatically adds any characters.
  • If you want your files to have newlines you must write them explicitly
>>> f = open('output.txt', 'w') 
>>> f.write('This is my important saved data.\n') 
33
>>> f.write('And some less important data.\n') 
30
>>> f.close()

Note that the character counts went up! Now the file contains:

This is my important saved data.
And some less important data.
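
If you'd like the newline added for you, print() can send its output to a file through its file argument. This is a minimal sketch of that alternative, writing to a temporary stand-in for output.txt.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'output.txt')

f = open(path, 'w')
# print() adds the newline for us, exactly as it does on the screen.
print('This is my important saved data.', file=f)
print('And some less important data.', file=f)
f.close()

f = open(path)
contents = f.read()
f.close()
print(repr(contents))
```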

Be Careful Writing Files

  • When you open a file with the w flag one of two things will happen:
    • If the file doesn't exist it will be created.
    • If the file exists it will be opened and the entire contents will be deleted
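
If you'd rather get an error than lose an existing file, Python 3.3 and later also accept the x mode (exclusive creation). It isn't covered elsewhere in these notes, so treat this as a sketch. It uses a temporary stand-in for output.txt.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'output.txt')

# The first open succeeds because the file doesn't exist yet.
f = open(path, 'x')
f.write('first version\n')
f.close()

# The second open refuses to touch the existing file.
try:
    f = open(path, 'x')
    refused = False
except FileExistsError:
    refused = True
    print('output.txt already exists, leaving it alone')

f = open(path)
contents = f.read()
f.close()
```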

Seeking in a File

  • The file keeps track of the place you were reading or writing.
  • You can manipulate that place using the seek() function.
  • seek() is necessary if you want to re-read something that you've already read.

Here's an example of using seek()

>>> f = open('file.txt')
>>> f.readline()
'This is file.txt\n'
>>> f.seek(0)
0
>>> f.readline()
'This is file.txt\n'
>>> f.seek(1)
1
>>> f.readline()
'his is file.txt\n'
  • Notice that the seek() function returns the place that was sought to.
  • Seeking to position 1 seeks back to the second byte in the file

You can seek in a file opened for writing, too, but be careful:

>>> f = open('output.txt', 'w') 
>>> f.write('This is the first thing that I wrote.\n')
38
>>> f.seek(0)
0
>>> f.write('Another thing\n')
14
>>> f.close()

The file now contains:

Another thing
rst thing that I wrote.
  • Notice that the first part of the first sentence was overwritten
  • The write() function never inserts data such that old data is moved.
  • The write() function always overwrites.
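
Since write() can't push old data out of the way, one simple way to insert text is to read the whole file, build the new contents in a string, and write everything back. Here's a sketch of that approach. The insert_at_start function is a made-up helper, and it assumes the file is small enough to fit in memory.

```python
import os
import tempfile


def insert_at_start(filename, text):
    """Put text in front of the existing contents of filename."""
    f = open(filename)
    old_contents = f.read()
    f.close()

    # Opening with 'w' empties the file, so write the old data back too.
    f = open(filename, 'w')
    f.write(text + old_contents)
    f.close()


path = os.path.join(tempfile.mkdtemp(), 'output.txt')
f = open(path, 'w')
f.write('This is the first thing that I wrote.\n')
f.close()

insert_at_start(path, 'Another thing\n')

f = open(path)
result = f.read()
f.close()
print(result, end='')
```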

About Files and Lines

  • A common mistake for beginners is to think that the lines we see on screen are special inside a file. They aren't.
  • A file is a compact data structure
  • There is no “extra space” between the end of the line and the beginning of the next one.
  • Look at the seek() example above and notice how overwriting messed up the lines
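
You can see how tightly packed a file is by looking at the characters on either side of a newline. In this sketch the contents of a two-line stand-in for file.txt are read into a string and indexed by position.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'file.txt')
f = open(path, 'w')
f.write('This is file.txt\nThis is the second line\n')
f.close()

f = open(path)
contents = f.read()
f.close()

newline_at = contents.find('\n')       # position of the first newline
print(newline_at)
print(repr(contents[newline_at]))      # the newline itself
print(repr(contents[newline_at + 1]))  # the very next character starts line two
```

The first character of the second line sits immediately after the newline. Nothing separates them.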

File Modes

  • So far we've opened files for reading or writing.
  • It's possible to do both.
  • Here's a cheat sheet for understanding the common file modes.
Mode  Meaning
r     Opens the file for reading, starting from byte 0.
w     Opens the file for writing. The file will be created if it doesn't exist. All contents of the file will be lost.
r+    Opens the file for reading and writing, starting from byte 0. The file must exist and its contents will be preserved.
w+    Opens the file for reading and writing, starting from byte 0. The file will be created if it doesn't exist. All contents of the file will be lost.
a     Opens the file for writing, starting at the end of the file. Contents are safe.
  • Appending to a file can be very useful.
  • If you want to log what's happened, opening a file to append is great.

Here's an example of a program that remembers what time it was run:

logger.py
from datetime import datetime 
 
log = open('runlog.txt', 'a')
log.write(str(datetime.now()))
log.write('\n')
log.close()

After running this program three times on my machine I got this output in the runlog.txt file:

2017-09-25 20:04:14.998552
2017-09-25 20:04:15.933838
2017-09-25 20:04:16.532732
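
To compare the modes side by side, here's a small experiment that opens the same file with w, a, r+ and w again, checking the contents after each step. The read_all helper and the temporary file are made up for this sketch.

```python
import os
import tempfile


def read_all(filename):
    """Return the whole contents of filename as a string."""
    f = open(filename)
    contents = f.read()
    f.close()
    return contents


path = os.path.join(tempfile.mkdtemp(), 'modes.txt')

f = open(path, 'w')      # creates the file
f.write('one\n')
f.close()
after_w = read_all(path)

f = open(path, 'a')      # adds to the end, old contents are safe
f.write('two\n')
f.close()
after_a = read_all(path)

f = open(path, 'r+')     # starts at byte 0 and overwrites in place
f.write('ONE')
f.close()
after_r_plus = read_all(path)

f = open(path, 'w')      # opening with w again empties the file
f.close()
after_second_w = read_all(path)

print(repr(after_w), repr(after_a), repr(after_r_plus), repr(after_second_w))
```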

Python 3's Context Managers

  • Python has a feature called a context manager, which you use through the with statement.
  • You'll learn how to code a context manager later, after you learn about classes.
  • The file object returned by open() is a context manager.
  • When you use open this way you don't have to remember to close() the file.
  • It's always done for you automatically.

Here's a rewrite of logger.py above to use open() as a context manager:

logger_context.py
from datetime import datetime
 
with open('runlog.txt', 'a') as log :
    log.write(str(datetime.now()))
    log.write('\n')
  • Notice that the variable name log is assigned by the as keyword
    • The with/as line ends in a colon :
    • The write() functions are inside the with/as context manager
    • You don't call close()
  • The close() function is automatically called when the with/as block exits
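
The automatic close() happens even when the code inside the block fails. Here's a minimal sketch that raises an error on purpose and then checks the closed attribute. The RuntimeError is simulated, and the temporary file is a stand-in for runlog.txt.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'runlog.txt')

try:
    with open(path, 'a') as log:
        log.write('started\n')
        raise RuntimeError('simulated failure')   # leave the block early
except RuntimeError:
    print('The block failed. Was the file still closed?')

print(log.closed)

# The data written before the failure made it to the file.
with open(path) as check:
    saved = check.read()
```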

Understanding Web Requests

  • Programming for the web isn't like programming on a single machine.
    • There are at least two computers involved (many more in advanced setups)
    • Program flow is based on request/response cycles
  • In web programming the two parties involved are
    • The client. This is the web browser (e.g. Chrome or Firefox).
      • The client initiates all requests.
    • The server. The server is your program.
      • The server waits for client requests
      • The server validates and responds to requests
  • Requests have to be directed somewhere.
  • The location that a request is directed to is a URL
  • URL stands for Uniform Resource Locator.
  • URLs have four parts

Here's an example URL:

http://www.lifealgorithmic.com/cis-15?arg=value&arg2=value2
URL Part                   Meaning
http:                      Specifies the Hypertext Transfer Protocol should be used to connect to the server. There are many protocols but HTTP is the most common.
//www.lifealgorithmic.com  The host or the DNS name of the server to be contacted.
/cis-15                    The path of the resource. This is like a file path but the server is free to interpret this any way it wants to. This part is case sensitive.
?arg=value&arg2=value2     Arguments to the path. These are a convention used by GET forms. The server is free to do anything it wants with these.
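
Python's standard library can split a URL into these same parts. Here's a short sketch using urlsplit() and parse_qs() from urllib.parse on the example URL. Note that the library calls the host part the netloc and the arguments the query.

```python
from urllib.parse import urlsplit, parse_qs

url = 'http://www.lifealgorithmic.com/cis-15?arg=value&arg2=value2'
parts = urlsplit(url)

print(parts.scheme)    # the protocol
print(parts.netloc)    # the host
print(parts.path)      # the path
print(parts.query)     # the arguments, still as one string

# parse_qs() turns the arguments into a dictionary of lists.
arguments = parse_qs(parts.query)
print(arguments)
```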
  • Web browsers primarily issue two kinds of requests.
    • GET requests load a URL with arguments like the ones in the URL above
      • GET requests reveal their arguments in the browser location bar
      • That makes them a poor choice for forms with sensitive data (or a lot of data)
    • POST requests load a URL with arguments embedded in the body of the request.
      • Embedded arguments are a bit easier to retrieve in a program.

Here's an example of what the web browser sends the server in a GET request:

GET /hello.htm HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: www.tutorialspoint.com
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: Keep-Alive

Here's an example of what the web browser sends the server in a POST request:

POST /cgi-bin/process.cgi HTTP/1.1
User-Agent: Mozilla/4.0 (compatible; MSIE5.01; Windows NT)
Host: www.tutorialspoint.com
Content-Type: application/x-www-form-urlencoded
Content-Length: length
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Connection: Keep-Alive

licenseID=string&content=string&/paramsXML=string
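
The two requests carry their arguments in the same encoded form, just in different places. Here's a rough sketch using urlencode() from the standard library to build both. The form values are placeholders borrowed from the examples above, and the request line is assembled by hand only to show where the arguments go.

```python
from urllib.parse import urlencode

# Placeholder form data, loosely based on the POST example above.
form = {'licenseID': 'string', 'content': 'string'}
encoded = urlencode(form)

# In a GET request the arguments ride along in the URL.
get_request_line = 'GET /cgi-bin/process.cgi?' + encoded + ' HTTP/1.1'

# In a POST request the same text travels in the body, and
# Content-Length says how many bytes the body contains.
post_body = encoded
content_length = len(post_body.encode('utf-8'))

print(get_request_line)
print('Content-Length:', content_length)
print(post_body)
```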
  • For more information about GET and POST requests read this:
  • Flask makes it easy to select where a request goes.
  • This selection in web speak is known as routing
    • Which is not the same as the routing you learn about in cis-81

Here's a snippet from this week's sample code:

@app.route('/result', methods=['POST'])
def result():
    return render_template('result.html')    
 
@app.route('/words', methods=['POST'])
def do_madlib():
    return render_template('words.html')
 
@app.route('/')
def start_madlib():
    return render_template('madlib.html')
  • The @app.route() decorator informs Flask where to send requests.
  • Here's what they mean:
Route                                    Meaning
@app.route('/result', methods=['POST'])  POST requests that have /result in the path. GET requests are not allowed.
@app.route('/words', methods=['POST'])   POST requests that have /words in the path. GET requests are not allowed.
@app.route('/')                          GET requests sent to the root path (/). GET is the default, no other request is allowed.
  • It's possible to have a route that handles multiple request types. Those appear like this:
@app.route('/words', methods=['POST', 'GET'])
def do_madlib():
    return render_template('words.html')
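
You can watch the routing rules work without a browser by using Flask's built-in test client. This is a minimal sketch, assuming Flask is installed. The routes return plain strings instead of templates so the example stands on its own, and a request with the wrong method gets the status code 405 (Method Not Allowed).

```python
from flask import Flask

app = Flask(__name__)


@app.route('/words', methods=['POST'])
def do_madlib():
    return 'words page'


@app.route('/')
def start_madlib():
    return 'start page'


# The test client sends requests to the app without starting a server.
client = app.test_client()

get_root = client.get('/').status_code          # GET is the default method
post_words = client.post('/words').status_code  # POST is allowed on /words
get_words = client.get('/words').status_code    # GET is not allowed on /words
post_root = client.post('/').status_code        # POST is not allowed on /

print(get_root, post_words, get_words, post_root)
```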

Example Code

You can find starter code for this week's project here:

madlib.py
from flask import Flask, request 
 
app = Flask(__name__) 
 
start_template = '''
<html>
    <head>
        <title>Enter your Mad Lib</title>
    </head>
    <body>
        <form action="/" method="post">
 
          Enter the madlib with embedded {} marks:<br>
          <input type="text" name="madlib"><br>
 
          Enter the type of the first word:<br>
          <input type="text" name="word_type_1"><br>
 
          <input type="submit" value="Submit">
        </form>
    </body>
</html>
'''
 
words_template = ''' 
<html>
    <head>
        <title>Enter the Words</title>
    </head>
    <body>
        <form action="/result" method="post">
 
          Enter a {word_type_1}:<br>
          <input type="text" name="word_1"><br>
 
          <input type="hidden" name="madlib" value="{madlib}">
 
          <input type="submit" value="Submit">
        </form>
    </body>
</html>
''' 
 
result_template = '''
<html>
    <head>
        <title>The Completed Mad Lib</title>
    </head>
    <body>
        The completed Mad Lib is:<br>
        <pre>{complete_madlib}</pre>
        <br>
        <a href="/">Start Over</a>
    </body>
</html>
'''
 
@app.route('/result', methods=['POST'])
def result() :
    madlib = request.form['madlib']
    word1 = request.form['word_1']
    complete = madlib.format(word1)
    return result_template.format(complete_madlib=complete)
 
@app.route('/', methods=['GET', 'POST'])
def index() :
    if request.method == 'GET' :
        return start_template
    else:
        madlib = request.form['madlib']
        wt1 = request.form['word_type_1']
        return words_template.format(word_type_1 = wt1, madlib = madlib )
 
 
if __name__ == '__main__' : 
    app.run(host='0.0.0.0', port=8080, debug=True)
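
The starter code calls format() twice: once to drop the word into the Mad Lib and once to drop the finished Mad Lib into the page. Here's that flow on its own, without Flask. The sentence, the word, and the shortened template are made-up sample values.

```python
# Sample inputs standing in for what the forms would submit.
madlib = 'The {} fox jumps over the dog.'
word1 = 'quick'

# Step 1: positional format() fills the {} mark in the Mad Lib.
complete = madlib.format(word1)

# Step 2: keyword format() fills the named mark in the page template.
result_template = '<pre>{complete_madlib}</pre>'
page = result_template.format(complete_madlib=complete)

print(complete)
print(page)
```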