4. Making Lists

We have seen simple Python values such as numbers and strings and booleans, but we have not yet seen how to combine them into bigger structures. We do so now.

Introducing lists

A list in Python is an ordered sequence of elements. Here is the list containing the words representing the first few numbers:

['zero', 'one', 'two', 'three', 'four', 'five']

Equally, we could put numbers or booleans in our list, or nothing – the empty list is written []. Here are the first few prime numbers:

[2, 3, 5, 7, 11, 13]

We can find the length of the list with len, just as we used it to find the length of a string:

Python
>>> len([2, 3, 5, 7, 11, 13])
6

It is possible to mix up elements of different types:

[1, 'one', False]

We will not be doing that in this book, however.

Accessing elements

We can fetch a single element from the list (the first element is number 0):

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> l[0]
'zero'
>>> l[5]
'five'
>>> l[6]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

Notice the error when we go out of range. It is uncommon to write large computer programs correctly the first time, and we often have to track down and correct such errors.
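
One way to avoid this error is to compare the position with the length of the list before fetching the element. The following is only a minimal sketch: the function name element_or_default and its default argument are our own inventions for illustration, not something built in to Python.

```python
# element_or_default is a hypothetical helper, named here for illustration only.
def element_or_default(l, position, default):
    # Valid positions run from 0 up to, but not including, len(l).
    if position >= 0 and position < len(l):
        return l[position]
    return default

l = ['zero', 'one', 'two', 'three', 'four', 'five']
print(element_or_default(l, 5, 'missing'))  # prints five
print(element_or_default(l, 6, 'missing'))  # prints missing
```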

Iterating over lists

We can iterate over the elements of a list with a for loop, just as we iterated over a range of numbers with range:

Python
>>> for x in l:
...     print(x + ' has ' + str(len(x)) + ' letters.')
... 
zero has 4 letters.
one has 3 letters.
two has 3 letters.
three has 5 letters.
four has 4 letters.
five has 4 letters.

There is a connection between this mechanism and the range function we used with for loops earlier. We can use the list function to build lists from ranges:

Python
>>> list(range(1, 10))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(1, 10, 3))
[1, 4, 7]

In fact, we could write a for loop by making a list from a range:

Python
>>> for x in list(range(1, 5)):
...     print(x)
... 
1
2
3
4

This has the same effect as simply writing range(1, 5), but needlessly constructs the list of numbers. When we use range on its own no such intermediate list need be created.
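
To check that the two forms really do visit the same numbers, we might add the numbers up both ways and compare the totals. This is a small sketch, and the names total_from_range and total_from_list are ours:

```python
# Add up the numbers 1 to 4 by looping over the range directly.
total_from_range = 0
for x in range(1, 5):
    total_from_range = total_from_range + x

# Do the same, but build the intermediate list first.
total_from_list = 0
for x in list(range(1, 5)):
    total_from_list = total_from_list + x

print(total_from_range)  # prints 10
print(total_from_list)   # prints 10
```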

Sometimes we need both the index in the list and the item at that index. By using enumerate, and giving two names – one for the index and one for the value – we can do this easily:

Python
>>> for i, elt in enumerate([1, 2, 4, 8, 16]):
...     print('2 to the power ' + str(i) + ' is ' + str(elt))
... 
2 to the power 0 is 1
2 to the power 1 is 2
2 to the power 2 is 4
2 to the power 3 is 8
2 to the power 4 is 16
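
For comparison, here is a sketch of the same loop written without enumerate, by iterating over the positions with range and len and fetching each element ourselves. The name powers is ours. It prints the same five lines as before, but the enumerate version is usually easier to read.

```python
powers = [1, 2, 4, 8, 16]

# range(len(powers)) gives the positions 0, 1, 2, 3, 4.
for i in range(len(powers)):
    elt = powers[i]  # fetch the element at position i ourselves
    print('2 to the power ' + str(i) + ' is ' + str(elt))
```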

List slices

We can pick parts of the list out using what is called a slice. A slice is defined using start and stop positions:

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> l[1:4]
['one', 'two', 'three']
>>> l[1:6]
['one', 'two', 'three', 'four', 'five']
>>> l[0:6]
['zero', 'one', 'two', 'three', 'four', 'five']

Notice that the stop value defines the position to stop before, just like with a range. We may omit the start or stop value. The slice then stretches all the way to that end of the list:

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> l[:4]
['zero', 'one', 'two', 'three']
>>> l[1:]
['one', 'two', 'three', 'four', 'five']

Even when the slice contains only one value, it is a list of one element, not just the element:

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> l[4:5]
['four']

If a slice contains no values, it is the empty list []:

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> l[4:4]
[]

A negative number in a slice counts from the end of the list instead:

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> l[-3:-1]
['three', 'four']
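
Negative numbers can be combined with an omitted start or stop value. As a short sketch (the names last_two and all_but_last are ours), we can pick out the last two items, or everything except the last item:

```python
l = ['zero', 'one', 'two', 'three', 'four', 'five']

last_two = l[-2:]      # start two from the end, run to the end
all_but_last = l[:-1]  # start at the beginning, stop before the last item

print(last_two)      # prints ['four', 'five']
print(all_but_last)  # prints ['zero', 'one', 'two', 'three', 'four']
```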

Adding to a list

We can add an item to the end of a list:

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> l.append('six')
>>> l
['zero', 'one', 'two', 'three', 'four', 'five', 'six']

Functions like append, which are accessed by putting a dot after the value itself, are called methods. Notice that the list l is modified, rather than a new list being returned. We can also concatenate lists, using the same + operator we used for concatenating strings:

Python
>>> l1 = [1, 2, 3]
>>> l2 = [4, 5, 6]
>>> l1 + l2
[1, 2, 3, 4, 5, 6]

The lists l1 and l2 are unaltered.
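
The difference between + and append can be seen side by side. In this small sketch, the names original and longer are ours:

```python
original = [1, 2, 3]

longer = original + [4]  # builds a new list; original is left alone
print(original)          # prints [1, 2, 3]
print(longer)            # prints [1, 2, 3, 4]

original.append(4)       # changes the list called original itself
print(original)          # prints [1, 2, 3, 4]
```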

Modifying lists

We have seen that, unlike strings, lists can be modified. Lists are mutable, strings immutable (from the word mutate, meaning to change). In fact, we can change existing elements as well as adding elements:

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> l[0] = 'nought'
>>> l
['nought', 'one', 'two', 'three', 'four', 'five']

We can, of course, delete items from the list. We use the built-in del construct:

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> del l[1]
>>> l
['zero', 'two', 'three', 'four', 'five']

The del construct can also be used with a slice:

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> del l[1:3]
>>> l
['zero', 'three', 'four', 'five']

Alternatively, if we wish to retrieve an element and delete it too, we can use the pop method:

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> l.pop(1)
'one'
>>> l
['zero', 'two', 'three', 'four', 'five']
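
If we give pop no position at all, it removes and returns the last item of the list. A brief sketch, in which the name last is ours:

```python
l = ['zero', 'one', 'two', 'three', 'four', 'five']

last = l.pop()  # no position given, so the last item is taken
print(last)     # prints five
print(l)        # prints ['zero', 'one', 'two', 'three', 'four']
```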

The remove method on lists allows us to remove an item by giving not the index but the actual item.

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> l.remove('two')
>>> l
['zero', 'one', 'three', 'four', 'five']

If the list contains more than one of the given item, only the first is removed. Let us put it back again, in its old position, using the insert method:

Python
>>> l
['zero', 'one', 'three', 'four', 'five']
>>> l.insert(2, 'two')
>>> l
['zero', 'one', 'two', 'three', 'four', 'five']
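
Returning to remove for a moment, we can check its behaviour on a list with a repeated item. In this sketch the list numbers is made up for the purpose. Only the first 'one' disappears:

```python
numbers = ['one', 'two', 'one', 'three']

numbers.remove('one')  # removes the 'one' at position 0 only
print(numbers)         # prints ['two', 'one', 'three']
```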

Since lists are mutable, we sometimes need to copy a list – simply assigning it to another variable name will not copy it. For this, we can use the copy method:

Python
>>> l
['zero', 'one', 'two', 'three', 'four', 'five']
>>> l2 = l
>>> l3 = l.copy()
>>> l[0] = 'nought'
>>> l
['nought', 'one', 'two', 'three', 'four', 'five']
>>> l2
['nought', 'one', 'two', 'three', 'four', 'five']
>>> l3
['zero', 'one', 'two', 'three', 'four', 'five']

Membership testing

We can test to see if an item is a member of a list using in or not in:

Python
>>> l = ['zero', 'one', 'two', 'three', 'four', 'five']
>>> 'two' in l
True
>>> 'six' not in l
True

We can use index to find the index of the first occurrence of an item, so long as it exists:

Python
>>> l
['zero', 'one', 'two', 'three', 'four', 'five']
>>> l.index('two')
2

Or, we can count the number of occurrences of an item:

Python
>>> l
['zero', 'one', 'two', 'three', 'four', 'five']
>>> l.count('zero')
1
>>> l.count('six')
0
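
These three mechanisms are easier to tell apart on a list with repeated items. Here is a sketch using a made-up list we have called words:

```python
words = ['one', 'two', 'one', 'three', 'one']

print('one' in words)      # prints True
print(words.count('one'))  # prints 3
print(words.index('one'))  # prints 0, the position of the first 'one'
print(words.index('two'))  # prints 1
```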

In the questions we will use many of these mechanisms, as well as exploring some new ones, to build functions which process lists.

Common problems

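As soon as we begin to build compound data structures which contain positions, we open ourselves up to getting the positions wrong. Here, for example, we ask for position 1 hoping for the first element, forgetting that positions start at 0: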

Python
>>> l = ['one', 'two', 'three']
>>> l[1]
'two'

Equally seriously, we can try to use a position which is simply not available:

Python
>>> l[3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

Often, these errors are exposed only after using a program for a while – we happen to hit a certain input which fails when many others have succeeded. These kinds of errors can be particularly difficult to track down. They also occur when deleting items from the list:

Python
>>> del l[3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list assignment index out of range

>>> l.remove('zero')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: list.remove(x): x not in list

And, of course, when looking things up:

Python
>>> l.index('zero')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 'zero' is not in list

In this case, we could check membership in the list first, and use index only if the item is known to be present. Later in the book, we shall learn another way: to let the errors occur, and then to handle and recover from them.
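
The first approach, checking membership before calling index, might be sketched as follows. The function name position_of is ours, and returning -1 when the item is absent is simply a choice made for this sketch (index itself never returns a negative number, so -1 cannot be mistaken for a real result).

```python
# position_of is a hypothetical helper, named here for illustration only.
def position_of(item, l):
    # Only call index when we know the item is present.
    if item in l:
        return l.index(item)
    # Our own convention for "not found"; any other signal would do.
    return -1

l = ['one', 'two', 'three']
print(position_of('two', l))   # prints 1
print(position_of('zero', l))  # prints -1
```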

Another problem concerns our use of ranges. A range in Python is not a list:

Python
>>> range(1, 10)
range(1, 10)

To turn it into a list, we can use list:

Python
>>> list(range(1, 10))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

For example, we could try to concatenate two ranges:

Python
>>> range(1, 10) + range(20, 30)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'range' and 'range'

We must turn them into lists first:

Python
>>> list(range(1, 10)) + list(range(20, 30))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]

This confusion arises because, in the for construct, we can use a range without converting it to a list: for knows both how to iterate over a range and how to iterate over a list.

Summary

In this chapter we have introduced lists, our first compound data structure. We have manipulated lists by addition and deletion and slicing. We have iterated over lists, and tested items for membership. The range of interesting programs we can write has grown still further. In the next chapter, we will look at some more advanced list functionality.

Questions

Write a function first to return the first element of a list, and a function last to return the last element of the list. You may assume the list is non-empty.
Write a function to build a new list which is the reverse of a given list.
Write a function to print the minimum and maximum numbers in a list. You may assume the list is non-empty.
As well as start and stop positions, a slice may have a third part, the step (just like a step in a range). For example, l[0:10:2]. Write a function evens to return a list containing the items at even positions 0, 2, … in the given list.
A negative step value in a slice selects the elements from end to beginning. Use this to make your reverse function simpler.
Write a function setify which takes a list, possibly containing duplicates, and builds a new list which represents a set with no duplicates. For example, setify([1, 2, 3, 2, 1]) might yield [1, 2, 3] or [1, 3, 2].
Write a function histogram to print out a table of frequencies of the elements in a list. You might use the setify function you have just written to help.
The membership tester in works on strings too. Use it to write a function which checks if three given words are all in a given sentence.
Write a function copy_list to copy a list in the same way as the copy method, but without using it.
Use your copy_list function to write a function which removes an item from a list in the manner of the remove method, but returns a new list.
A Caesar cipher is a crude method of making secret messages. The alphabet is ‘rotated’ by some amount (here, we started at Q instead of A):
```
ABCDEFGHIJKLMNOPQRSTUVWXYZ
QRSTUVWXYZABCDEFGHIJKLMNOP
```
Each letter in the lower row is the substitute for the letter in the upper row. For example, here is an encoded message:
```
DAHHK SKNHZ
```
Write a function to generate the rotated alphabet, for any given amount of rotation. Now write encoding and decoding functions for messages.
Use lists to improve your answer to the Morse code question from the previous chapter, by using them to hold the code and letter data – rather than using a big if construct as before.
Randomly generate a secret four-digit code (see question 7 of the previous chapter). Have the user repeatedly guess it, telling them how many digits (a) were correct and in the correct place; and (b) were correct but in the incorrect place. Repeat until the user gets the right answer.