EC102, Lab 3, Numpy and Matplotlib

[This is the third in a series of lecture notes for the lab component of the core ‘Macroeconomics I’ course that I teach in the M.A. Economics programme at Ambedkar University, Delhi.]

First: Lab 1.
Previous: Lab 2.

Built in sequence types: list, tuples and strings

So far we have seen simple types of objects — integers and floating point numbers. Python also has a number of built-in types that represent collections of other objects. The simplest — list and tuple — deal with sequences of objects. We can define lists by putting the objects separated by commas in square brackets. So,

a = [4,5,6]

defines a list consisting of the three numbers 4, 5 and 6 and binds it to the name a. We can refer to individual members of the list by subscripted expressions of the form a[i] where a is any expression (not only a name) which produces a list and i is an integer specifying the position in the list of the element we want to refer to. Python counts positions starting from 0, so

a[1]

produces 5.

Lists are the first type we meet the objects of which have methods. Methods are functions tied to a particular object. While an ordinary function f is called by an expression like f(arg1,arg2), a method m attached to an object o is called by an expression like o.m(arg1,arg2).

For example, lists have an append method which adds an object to the end of the list. Suppose we want to add the number 5.5 to the end of the list a defined above, we would write

a.append(5.5)

Here the append method bound to the object a is called, so it knows which list to add elements to. Printing a now gives

[4,5,6,5.5]

There are many more methods for lists which you are described in the Python Standard Library Reference.

Note above that the list a in our example now has objects both of the integer type (such as 6) as well as of floating-point type (such as 5.5). Python lists are heterogeneous — they can contain objects of multiple types.

Tuples are similar to lists except that they cannot be modified once created — a property useful in some situations. Tuples are specified by separating expressions by commas (without the [] of lists). So

b = 3,4

defines a tuple with two elements 3 and 4. Conventionally the definitions of tuples are enclosed in parentheses like this

b = (3,4)

Just like lists, tuples can be subscripted, so that b[1] equals 5.

Strings are sequences of characters. In Python we can specify constant strings by enclosing the text in single or double quotes. The following defines two strings

a = "Hello"
b = 'world'

Python defines a rich set of tools for working with strings. The simplest of them is the + operator which in the case of strings just pastes the strings together. So a + " " + b produces the string "Hello world".

Numpy arrays

While the sequence types above are fundamental to general-purpose programming in Python, they are not designed with the needs for high-performance scientific computing in mind. Most numerical software for Python is built instead to operate with the multidimensional array types provided by the open-source NumPy package.

NumPy is part of the Anaconda distribution. In all subsequent examples I assume that you have imported the package in your Python session by executing the statement

import numpy as np

Note that we are using the as keyword in the import statement to give a shorter name to the numpy package. This is entirely for our convenience.

The central type defined by the NumPy package is an rectangular n-dimensional array type called ndarray. What do we mean by a “rectangular n-dimensional array”? In two-dimensions this is a matrix with M rows each row of which has N elements, so that any pair i, j with 0 ≤ i ≤ M − 1 and 0 ≤ j ≤ N − 1 refers to an element (remember Python starts counting elements from 0).

In general a n-dimensional array has d₁ × ⋯ × d_n elements, with any tuple (i₁,⋯,i_n) such that each tuple (i₁,…,i_n) with 0 ≤ i_k ≤ d_k − 1, 1 ≤ k ≤ n referring to an element.

In the special case n = 1 the n-dimensional array is just a vector with d₁ elements.

Constructing arrays

The numpy module provides many functions for constructing ndarrays. The most direct is the array function which given a Python sequence (such as a list or tuple) constructs a 1-dimensional array with the same elements. Try

a = np.array([4,5,6])

To construct 2-d arrays we pass a list of lists.

b = np.array([[0,-1],[-2,0]])

creates a 2-d array. Arrays can be subscripted just like lists and tuples. To subscript arrays with more than one dimension, we pass a tuple of indices as the subscript. So to get the element in the second column of the first row of b we do (once again, remember counting from zero):

b[0,1]

NumPy also provides much more powerful generalizations of subscripting which we will discuss below.

Another very useful function for constructing arrays is linspace. It is called like linspace(start,stop,num) and produces a 1-d array of num evenly spaced point starting with start and ending with stop.

np.linspace(0,1,5)

generates

array([ 0.  ,  0.25,  0.5 ,  0.75,  1.  ])

The decimal point in 0. and 1. indicates that they are floating-point numbers and not integers. Unlike Python lists, numpy arrays must have all elements of the same type. In most circumstances numpy figures out the correct type to use.

Other useful functions for constructing arrays are zeros, ones and arange.

Array Attributes

Just like methods are functions attached to objects, attributes are secondary objects attached to an object. Every NumPy array has a number of attributes that give information about the array. The most import ones are ndim which gives the number of dimensions in the array and shape which is a tuple with the number of elements along each dimensions.

Suppose we define

b = np.array([[2,3,4],[1,0,1]])

then b.ndim will be 2 and b.shape will be (2,3). While working interactively with NumPy it is useful to examine the attributes of arrays we construct to make sure that we are producing the results that we intended.

Arithmetic and functions on arrays

Given an array

x = np.array([1,2,3])

one can write x+2 to get a new array each of whose elements are obtained by adding 2 to the corresponding element of x. In general, for an arithmetic operation between a scalar and an array, the result is an array obtained by carrying out the operation betweent the scalar and corresponding elements of the array.

Suppose we define another array

y = np.array([-1,0,-1])

then x+y will be an array each of whose element will be obtained by adding the corresponding elements of x and y, i.e. the array with elements [0,2,2]. In general an arithmetic operation between two arrays with the same shape carries out the operations element-by-element between the two arrays.

The numpy module also defines versions of standard mathematical functions which apply element by element to an entire array. For example np.sqrt is the square-root function, so

np.sqrt(x)

with the above definition of x produces

array([ 1.        ,  1.41421356,  1.73205081])

These facilities mean that once we have constructed some initial arrays, further calculations can often be carried out by using operations on entire arrays without having to refer to the individual elements of the array at all.

Matplotlib

The Matplotlib package, which is also installed by default with Anaconda, provides facilities for producing graphical figures. We will begin by using the pyplot module from this package which is meant for quick interactive use. In IPython notebook run

import matplotlib.pyplot as plt
%matplotlib inline

The second line above is not strictly part of the Python language. Rather it is an instruction to the IPython software that any plots produced by Matplotlib should be included as graphical output within the notebook itself. This also works in the IPython QtConsole.

The simplest plotting function in the pyplot module is plot. In its simplest use we pass two one-dimensional vectors to plot and it interprets the elements of the first vector as x-coordinates, the elements of the second vector as y-coordinates and draws line segments joining the corresponding points. By choosing points close together we can produce graphs that are visually indistinguishable from smooth curves.

For example, suppose you want to draw the graph of the function

y = ϕ(x) = 4x²

for x in the range [−5,5]. We do

x = np.linspace(-5,5,100)
y = 4*x**2
plt.plot(x,y)

The first line creates a 100-element vector with values spaced evenly between -5 and 5. These are the values of x for which we will compute y = ϕ(x). The number 100 chosen here is arbitrary, but if you choose too few points your graph will come out blocky. The next line computes the value of y corresponding to each of the values of x. Note how we can use the vector as a whole in arithmetic expressions and have the results computed element by element. The final line plots y against x joining the points by line segments to get a smooth-looking graph.

plot can take a third argument which specifies a format. So for example

plt.plot(x,y,"r.")

plots the same graph as before but now as a scatter plot with the red dots as markers. The r signifies the color and . signifies the marker. In general the following format descriptions are available:

`-`	solid line style
`--`	dashed line style
`-.`	dash-dot line style
`:`	dotted line style
`.`	point marker
`,`	pixel marker
`o`	circle marker
`v`	triangle_down marker
`^`	triangle_up marker
`<`	triangle_left marker
`>`	triangle_right marker
`1`	tri_down marker
`2`	tri_up marker
`3`	tri_left marker
`4`	tri_right marker
`s`	square marker
`p`	pentagon marker
`*`	star marker
`h`	hexagon1 marker
`H`	hexagon2 marker
`+`	plus marker
`x`	x marker
`D`	diamond marker
`d`	thin_diamond marker
`\|`	vline marker
`_`	hline marker

The following color abbreviations are supported:

`b`	blue
`g`	green
`r`	red
`c`	cyan
`m`	magenta
`y`	yellow
`k`	black
`w`	white

There are not the only way to specify colors. You can give a color by name or red-green-blue (RGB) values by using a named color argument

plt.plot(k,y,'--',color='turquoise')

See the initial part of this document for more on specifying colors.

You can overlay multiple plots in the same figure. Suppose we want to overlay the line 0.01x on the graph above. It is as simple as issuing two commands one after another.

plt.plot(x,y)
plt.plot(x,0.1*x,"--")

If you run this you will see that Matplotlib automatically assigns different colors to the two lines to distinguish them.

It is not even necessary to call the plot function twice. You can have multiple x, y, and (optional) format sets as arguments to a single call to plot. So the following single call to plot produces the same figure as our earlier example

plt.plot(x,y,x,0.1*x,"--")

Each set of inputs must provide its own x and y so we had to repeat x in the call above.

To save your graphs use the savefig function. It takes as its argument a filename and saved the current graph to that file, figuring out the file format by the extension of the filename.

plt.savefig("plot.png")

creates a file called plot.png in the same folder as the notebook. The .png extension denotes a PNG file which can easily be incorporated in word-processor document or web pages.

Exercise

The Solow model in macroeconomics specifies a production function of the form y = f(k). Let’s assume Cobb-Douglas technology so that f(k) = k^α. The savings rate is given by s and the rate of population growth by n. k evolves according to the differential equation

dk/dt = sf(k) − nk

Assume s = 0.2, α = 0.3, n = 0.1. On one diagram plot sf(k) and nk.
On the same diagram now also plot sf(k) for s = 0.5.
Use Python to compute the steady-state capital stock at which dk/dt = 0. Read up on the Matplotlib function axvline and use it to draw a dashed vertical line at the steady-state k for s = 0.2.