Python Tutorial Part Seven: File As A Database, Objects

By Simon Bluck

Input and Output

(Updated from Part 6: Input)

  • A disk file consists simply of a sequence of bytes. The file is automatically extended as you write more bytes.

  • You can input data (text or bytes) from a file or from stdin (the text window you are running your program in).

  • You can output data (text or bytes) to a file or to stdout (the text window you are running your program in).

  • You have to open a file for reading, or for writing, before you read or write data from or to that file — and you have to close it afterwards.

  • There are general functions available to format the data you write, and to help interpret the data you read.

  • You can do line-oriented reading/writing, or just read/write text/bytes without regard for a line structure.

  • There are straightforward facilities for writing Python objects to, and reading them back from, a dedicated file, using json. Those facilities serialise and deserialise the objects. Serialisation records an object’s data and structure as a series of bytes, in a way that lets deserialisation reconstruct the original object.

  • Operations on files should always be checked for success, and failures handled and reported appropriately. File operations can so easily fail – e.g. you don’t have permission to read/write the file, or you’ve run out of disk space.
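As a rough sketch of the json point above, the following writes a dictionary to a file and reads it back. The data and the file name trip.json are invented for this illustration, and the file is put in a temporary directory so nothing of yours is overwritten:

```python
import json
import os
import tempfile

# Hypothetical data and file name, chosen only for this illustration.
trip = {"name": "Salt Spring Island", "days": ["Tue", "Wed"], "km": 42.5}
path = os.path.join(tempfile.mkdtemp(), "trip.json")

# Serialise: json.dump writes the object to an open text file.
with open(path, "w") as fout:
    json.dump(trip, fout)

# Deserialise: json.load rebuilds an equivalent object from the file.
with open(path) as fin:
    restored = json.load(fin)

print(restored == trip)  # True: same data and structure
print(restored is trip)  # False: it is a new object
```

Note that json only handles a limited set of types (dictionaries, lists, strings, numbers, True/False and None), and a tuple comes back as a list.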

Examples

Create a file called Fred in the current directory. The file is opened for writing. If the file already exists, it will simply be overwritten. The open call returns a “file object”:

fout = open('Fred', 'w') # 'w' opens it for writing.

Write some text to the file:

fout.write('This is some text\n')
fout.write('with a number: ' + str(42) + '\n')

Close the file:

fout.close()

Open the file for reading; read the data; and close the file:

fin = open('Fred')  # Second parameter defaults to 'r'.
fredsdata = fin.read()  # Read all the data.
fin.close()

You can instead read successive lines from a file using the readline() method. When readline() ultimately returns an empty string, that indicates you’ve reached the end of the file. E.g.:

while True:
    line = fin.readline()
    if line == '':  break
    print(line, end='')

Or, nicely, you can regard the file object as an iterator that delivers one line at a time and do:

for line in fin:
    print(line, end='')

Aside: for line-based files the file object is its own iterator, and a lazy one: it obtains each line only as it is required. This means the whole file doesn’t get read into memory at once, just each line as it is needed.
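A small sketch of that line-at-a-time behaviour, using a scratch file in a temporary directory (the file contents are made up for the example). It asks the file object for one line with next(), and then collects whatever is left:

```python
import os
import tempfile

# Build a small scratch file to read from.
path = os.path.join(tempfile.mkdtemp(), "Fred")
with open(path, "w") as fout:
    fout.write("first line\nsecond line\nthird line\n")

with open(path) as fin:
    same_object = iter(fin) is fin  # The file object is its own iterator.
    first = next(fin)               # Fetch just one line.
    remaining = list(fin)           # Fetch the lines not yet consumed.

print(same_object)  # True
print(repr(first))  # 'first line\n'
print(remaining)    # ['second line\n', 'third line\n']
```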

The file reading can be further refined, to ensure the file is closed, even if errors occur. Python has a designed-in “context manager” mechanism for ensuring that a block of code is entered and exited in a sound manner. This is effected through use of the with statement:

with open('Fred') as fin:
    for line in fin:
        print(line, end='')

Using with here arranges for Python to “context manage” the contained block of code in a way that is specific to the file object returned by open. It guarantees the file will be closed if it was successfully opened, even if an error occurs while reading from the file.
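Combining with and error checking might look something like the sketch below. The helper name read_lines and the file names are invented for this example. Failures to open or read a file are reported through OSError and its subclasses (such as FileNotFoundError and PermissionError):

```python
import os
import tempfile

def read_lines(path):
    """Return the lines of a text file, or None if it cannot be read."""
    try:
        with open(path) as fin:
            return fin.readlines()
    except OSError as err:  # Missing file, no permission, and so on.
        print("Could not read", path, "-", err.strerror)
        return None

folder = tempfile.mkdtemp()
good_path = os.path.join(folder, "Fred")
with open(good_path, "w") as fout:
    fout.write("This is some text\n")

# After the with block the file is closed, without an explicit close().
print(fout.closed)            # True
print(read_lines(good_path))  # ['This is some text\n']

missing = read_lines(os.path.join(folder, "Nobody"))  # Reports the failure.
print(missing)                # None
```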

File as a Database

You can use a file as a database, updating parts of it:

# Create an empty file called dbs, failing if it already exists:
fdbs = open('dbs', 'x')  # 'x' means exclusive creation.
fdbs.close()
...
fdbs = open('dbs', 'r+b')  # binary (bytes) read/write.
fdbs.seek(0)  # Position at start of file.
fdbs.write(b'just sum text')
fdbs.seek(5)  # Position after the first 5 bytes
fdbs.write(b'some text')
fdbs.seek(0)
print(fdbs.read())
fdbs.close()

And that outputs:

b'just some text'
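One common way to make in-place updates manageable is to give every record the same length, so that record number n always starts at byte n times that length. The sketch below illustrates that idea; RECORD_SIZE, write_record and read_record are names invented here, not part of Python:

```python
import os
import tempfile

RECORD_SIZE = 8  # Hypothetical fixed record length, in bytes.

def write_record(f, index, data):
    """Overwrite record number index with data, padded with spaces."""
    if len(data) > RECORD_SIZE:
        raise ValueError("record too long")
    f.seek(index * RECORD_SIZE)
    f.write(data.ljust(RECORD_SIZE))

def read_record(f, index):
    """Return record number index with the padding stripped."""
    f.seek(index * RECORD_SIZE)
    return f.read(RECORD_SIZE).rstrip()

path = os.path.join(tempfile.mkdtemp(), "dbs")
with open(path, "x+b") as fdbs:  # Create, then read/write as bytes.
    for i, name in enumerate([b"Home", b"FulFord", b"Ganges"]):
        write_record(fdbs, i, name)
    write_record(fdbs, 1, b"Fulford")  # Update the middle record in place.
    middle = read_record(fdbs, 1)
    fdbs.seek(0)
    whole = fdbs.read()

print(middle)  # b'Fulford'
print(whole)   # b'Home    Fulford Ganges  '
```

The price of this simplicity is that no record can be longer than RECORD_SIZE, and short records waste space.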

Note that print and write are fairly similar in what they achieve. And print has a file argument so you can write e.g.:

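print(b'some text', file=fdbs)

But print doesn’t write data as bytes: it converts each argument to text (a str) and writes that. Given a file opened in binary mode, such as fdbs above, the call raises a TypeError, so print doesn’t work for writing to a binary file.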
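The difference can be seen without touching the disk by using the in-memory streams from the io module as stand-ins: io.BytesIO behaves like a file opened in binary mode, and io.StringIO like one opened in text mode. A minimal sketch:

```python
import io

binary_file = io.BytesIO()  # Stand-in for a file opened in binary mode.
try:
    print(b"some text", file=binary_file)
    outcome = "written"
except TypeError:
    outcome = "TypeError"
print(outcome)  # TypeError

text_file = io.StringIO()   # Stand-in for a file opened in text mode.
print(b"some text", file=text_file)
print(repr(text_file.getvalue()))  # "b'some text'\n"

# For a binary file, use write with bytes instead:
binary_file.write("some text\n".encode())
print(binary_file.getvalue())  # b'some text\n'
```

Notice that with the text stream print doesn’t fail, but what it writes is the text form of the bytes object, complete with the b prefix and quotes.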

Other Python Tutorials

There are a surprising number of free-to-use online learning tutorials and tutorial-like material. These can be excellent resources for self-learning, and I would recommend working through one. Interactive ones are great, but the non-interactive ones come with lots of examples and you can play around with those in a Python interpreter to help your understanding and learning.

Here’s a list of (mostly Python 3) tutorials:

python.org tutorial — Free, comprehensive, non-interactive. The standard Python Tutorial.

Hitchhiker’s Guide to Python — (O’Reilly) A great read, even though somewhat incomplete. It doesn’t itself cover the language details, but it does have some references for that in the Learning Python section. Lots of useful references.

TutorialsPoint — See FAQ for use restrictions. Interactive, though a little clunky. Covers a lot.

Codecademy Python 2 — Nicely interactive, ensuring you really do learn it. Doesn’t cover the whole of the language, but they’ve just announced significant updates for Q1 2017.

python-course.eu by Bernd Klein — Free, and extensive. Also many other courses relating to Python, including Python 2.

Computer Science Circles (University of Waterloo) — Free; semi-interactive.

How to Think Like a Computer Scientist (Runestone) — Superbly interactive, but examples sometimes don’t leverage the full power of Python. Open Source. Also other Python and other courses.

Problem Solving with Algorithms and Data Structures using Python (Runestone) — Superbly interactive, but again some examples could better leverage the power of Python, e.g. by using while with an else part.

Non-Programmer’s Tutorial for Python 3 (Wikibooks) — Free, non-interactive.

A Beginner’s Python Tutorial (Wikibooks) — Free, non-interactive.

Hands-on Python Tutorial — Free for non-commercial use.

Python Programming Tutorial (Programiz) — Free (according to Programiz Facebook page), very comprehensive, non-interactive. Very clear to follow.

Pythonspot — Free according to their Twitter account. Also available as a PDF.

Welcome To Python For You And Me (Book by Kushal Das) — Lots of examples.

Learn X in Y minutes (CC BY-SA 3.0 license)

One Day of IDLE Toying — Visual guide to IDLE.

Python Objects

The design of Python can be viewed as consisting of a number of layers. At the lowest level, the Python data model consists of objects, just objects. Every object has a unique identity, a type and a value. All three of those are fixed, except that value can be changed if the object is mutable.
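As a small sketch of those three properties, the built-in functions id() and type() report an object’s identity and type. Here a list (mutable) changes its value while keeping its identity, whereas a string (immutable) refuses to be changed:

```python
numbers = [1, 2]               # A list is mutable.
identity_before = id(numbers)
numbers.append(3)              # The value changes...
print(numbers)                         # [1, 2, 3]
print(id(numbers) == identity_before)  # True: ...the identity does not.
print(type(numbers) is list)           # True: nor does the type.

text = "abc"                   # A str is immutable.
try:
    text[0] = "x"              # Attempt to change the value in place.
    changed = True
except TypeError:
    changed = False
print(changed)                 # False
```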

The object’s identity is implementation dependent. In CPython, the standard implementation, it is the memory address of the object (which implies that CPython does not move the object around in memory).

The object’s type is itself an object, and we call that the class of the object. I.e. an object is an instance of a class object. The class describes what the created object should look like.
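This can be checked interactively. In the sketch below the empty Trip class is just a placeholder for illustration:

```python
class Trip:
    pass  # Placeholder class with no contents of its own.

trip = Trip()  # Create an instance of the class.

print(type(trip) is Trip)      # True: the object's type is its class.
print(isinstance(trip, Trip))  # True
print(type(42) is int)         # True: built-in types work the same way.

# The class is itself an object, so it has a type too:
print(type(Trip) is type)        # True
print(isinstance(Trip, object))  # True
```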

An object’s value is the full set of information it contains. That information is organised much like an ordered dictionary, i.e. as a sequence of name:value pairs.
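For instances of ordinary user-defined classes, those name:value pairs can be inspected with the built-in vars() function. The Point class below is invented for this sketch:

```python
class Point:
    def __init__(self, place, when):
        self.place = place
        self.when = when

p = Point("Ganges", "Wed")
print(vars(p))  # {'place': 'Ganges', 'when': 'Wed'}

# Assigning to a new name adds another pair, after the existing ones.
p.info = "Do lots of sight-seeing."
print(list(vars(p)))  # ['place', 'when', 'info']
```

Not every object can be inspected this way: an int, for example, has no such dictionary, and vars(42) raises a TypeError.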

Python allows us to define our own objects. We do that by writing a class definition, and then using that to instantiate (create) an object.

Example of a class definition

Here’s a class definition, combined within a program that uses it. This has examples of: class definition; object creation; object initialisation; method definition; method invocation.

import textwrap

class Trip:
    """ A trip is named and consists of points (places).
        A point has a place, time and optional info.
    """

    all_trips = []     # all_trips is shared by all instances.

    def __init__(self, trip_name):
        self.trip_name = trip_name
        self.trip_points = []  # Points for this trip
        self.all_trips.append(self)

    def add_point(self, place, when, info=None):
        self.trip_points.append({"place":place, "when":when, "info":info})

    def print_trip(self):
        print(self.trip_name)
        print("-" * len(self.trip_name))
        for trip_point in self.trip_points:
            print(trip_point["when"], " ", trip_point["place"])
            if trip_point["info"]:
                print(textwrap.indent(trip_point["info"], "        "))
        print()

    def print_all(self):
        for trip in self.all_trips:  trip.print_trip()


# Let's go somewhere:
trip = Trip("Salt Spring Island")  # Creates object.
trip.add_point("Home", "Tue")
trip.add_point("FulFord", "Tue")
trip.add_point("Ganges", "Wed", "Do lots of sight-seeing.")

trip.print_all()

Output

Salt Spring Island
------------------
Tue   Home
Tue   FulFord
Wed   Ganges
        Do lots of sight-seeing.
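The difference between all_trips, which belongs to the class, and trip_points, which belongs to each instance, can be seen by creating two objects. The sketch below uses a cut-down class, here called MiniTrip, purely for illustration:

```python
class MiniTrip:
    all_trips = []  # One list, shared by the class and all instances.

    def __init__(self, trip_name):
        self.trip_name = trip_name
        self.trip_points = []        # A separate list for each instance.
        self.all_trips.append(self)  # Appends to the shared list.

a = MiniTrip("Salt Spring Island")
b = MiniTrip("Galiano Island")

print(a.all_trips is b.all_trips)         # True: the same list object.
print(a.all_trips is MiniTrip.all_trips)  # True
print(a.trip_points is b.trip_points)     # False: one list each.
print([t.trip_name for t in MiniTrip.all_trips])
# ['Salt Spring Island', 'Galiano Island']
```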