%matplotlib inline
import numpy as np;
from matplotlib import pyplot as plt
Benjamin Bray, January 2017 for EECS 445 @ University of Michigan
(Updated August 2018 for CS 4540 @ Georgia Tech)
The goal of this tutorial is not to teach you everything you need to know about Python and the scientific libraries you will be using in this class, but rather to provide you with helpful resources and vocabulary you can use to search for help on your own.
This document contains dozens of helpful links!
Even just a few years ago, installing Python was a nightmare, especially on Windows. Now, we are lucky enough to have Anaconda, a Python distribution that comes with many useful libraries and a package manager called conda that simplifies the installation of new libraries. For this class, we strongly recommend using Anaconda. Some warnings about Python:
PATH. This can be avoided by using conda environments or Python virtual environments, which also allow you to install different Python / library versions.
Use Python 3! Unless you're running legacy code or need to use obscure, unmaintained libraries, there is absolutely no good reason to continue using Python 2, which was officially retired in January 2020.
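If you are unsure which Python installation a terminal or notebook is actually using, a quick check like the following can help. This is only a minimal sketch using the standard sys module; the paths and version strings it prints will differ from machine to machine.

```python
import sys

# Path of the interpreter actually running this code
# (with Anaconda, this usually points inside the Anaconda or environment folder).
print(sys.executable)

# Full version string of that interpreter
print(sys.version)

# Fail early if this is accidentally Python 2.
assert sys.version_info.major >= 3, "Please use Python 3"
```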
Python is an interpreted, dynamically typed, open-source language that is especially suitable for quick prototyping and scientific visualization. Python is very easy to read, which is why many people like to think of it as executable pseudocode.
"I came to Python not because I thought it was a better/acceptable/pragmatic Lisp, but because it was better pseudocode. Several students claimed that they had a hard time mapping from the pseudocode in my AI textbook to the Lisp code that Russell and I had online. So I looked for the language that was most like our pseudocode, and found that Python was the best match. Then I had to teach myself enough Python to implement the examples from the textbook. I found that Python was very nice for certain types of small problems, and had the libraries I needed to integrate with lots of other stuff, at Google and elsewhere on the net." — Peter Norvig (emphasis added)
Python is not a functional programming language (unlike, say, Haskell), but it does borrow some functional programming concepts, such as list comprehensions and lambda functions.
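As a small illustration of those two borrowed features, here is a sketch with made-up example data (the variable names are not from the rest of this notebook):

```python
# list comprehension: squares of the even numbers below 10
even_squares = [k**2 for k in range(10) if k % 2 == 0]
print(even_squares)   # [0, 4, 16, 36, 64]

# lambda: a small anonymous function, here used as a sort key
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=lambda w: len(w))
print(by_length)      # ['fig', 'pear', 'banana']
```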
Python has all the basic features you would expect it to have. There are no curly braces to delimit blocks; indentation does that job, and therefore whitespace matters. Here is some basic Python code:
# recursively compute the nth Fibonacci number
def fibonacci(n):
    if n <= 1:
        return 1;
    else:
        return fibonacci(n-1) + fibonacci(n-2);

# print the first several Fibonacci numbers
for k in range(10):
    print(k, fibonacci(k));
You should look up the following basic concepts in Python, only a handful of which I will demonstrate in this notebook:
args and kwargs
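For instance, args and kwargs refer to the conventional names used with the * and ** syntax for functions that accept a variable number of arguments. The function below is a hypothetical example written only to show the mechanics:

```python
def describe(*args, **kwargs):
    # args is a tuple of the positional arguments
    # kwargs is a dict of the keyword arguments
    return len(args), sorted(kwargs.keys())

print(describe(1, 2, 3, color="red", size=10))   # (3, ['color', 'size'])

# the same symbols unpack a list / dict at the call site
values = [1, 2]
options = {"color": "blue"}
print(describe(*values, **options))              # (2, ['color'])
```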
# list example
A = [1,2,3,4];
A
# list comprehension example
B = [ k**2 for k in range(10) ];
B
# dictionary example
grades = {
    "A"  : 4.0,
    "A-" : 3.7,
    "B+" : 3.3,
    "B"  : 3.0,
    "B-" : 2.7,
    "C+" : 2.3,
    "C"  : 2.0,
    "C-" : 1.7
}
grades["A"]
Semicolons are not strictly necessary in Python code, but I prefer to use them because my code feels "naked" otherwise. In IPython and Jupyter notebooks, omitting the semicolon after the last command in a cell causes its return value to be displayed, while a trailing semicolon suppresses it. (The plain python terminal interpreter echoes the value either way.) For example:
5 + 7
5 + 7;
Python is remarkably fast for an interpreted language, but in general it cannot beat compiled languages designed with speed in mind. However, well-written Python code that leaves its heavy lifting to compiled libraries (such as numpy) can come very close in speed to languages like C++ and Fortran. Python can also be extended directly with C/C++/Fortran code, using native Python C Extensions or using libraries like ctypes.
(You may enjoy the blog post Why Your Python Runs Slow, Part 1: Data Structures.)
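To get a feel for the speed difference, you can time a pure-Python loop against the equivalent numpy call. This is a rough sketch, not a careful benchmark: the function names are made up for this example, and the timings it prints depend entirely on your machine.

```python
import timeit
import numpy as np

x = np.arange(100000, dtype=np.float64)

def loop_sum_of_squares(values):
    # pure-Python loop: every iteration goes through the interpreter
    total = 0.0
    for v in values:
        total += v * v
    return total

def numpy_sum_of_squares(values):
    # vectorized: the loop runs inside numpy's compiled code
    return float(np.dot(values, values))

# both give the same answer (up to floating-point rounding) ...
print(np.isclose(loop_sum_of_squares(x), numpy_sum_of_squares(x)))

# ... but the running times differ
print("loop :", timeit.timeit(lambda: loop_sum_of_squares(x), number=5))
print("numpy:", timeit.timeit(lambda: numpy_sum_of_squares(x), number=5))
```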
You will often hear the phrase "Pythonic", referring to Python code that is particularly clear, and that conforms to the conventions established by the Python community. Pythonic code is not hard to write--if you are familiar with the language features, there is usually only one good way to perform a particular action. If you want to know what "Pythonic" code looks like, have a look at PEP 8: Style Guide for Python Code.
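As one small, hedged illustration of what people usually mean by "Pythonic" (the data and variable names here are invented for the example), compare a loop with manual index bookkeeping to the version using enumerate:

```python
names = ["Ada", "Grace", "Alan"]

# works, but reads like C: manual index bookkeeping
labels_c_style = []
for i in range(len(names)):
    labels_c_style.append(str(i) + ": " + names[i])

# more Pythonic: enumerate plus a list comprehension
labels_pythonic = ["{}: {}".format(i, name) for i, name in enumerate(names)]

print(labels_pythonic)   # ['0: Ada', '1: Grace', '2: Alan']
```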
The following Python manifesto is built into the language itself:
import this
There are several common ways to run Python:
Interactively, by typing python into a terminal.
By running .py files all at once: python myscript.py
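A script run this way is just a text file of Python code. The sketch below shows what a hypothetical myscript.py might contain; the greet function is made up for illustration, and the __name__ check is a common convention rather than a requirement.

```python
# contents of a hypothetical myscript.py
def greet(name):
    return "Hello, " + name + "!"

# this block runs for `python myscript.py`, but not when another
# file does `import myscript`
if __name__ == "__main__":
    print(greet("world"))
```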
This document was written entirely inside of a Jupyter notebook (formerly known as IPython notebooks, but renamed because they now support many other languages)! These notebooks are great for documentation and prototyping because they allow you to mix Python code with math, text, and images! Use a mixture of Markdown, MathJax (which uses $\LaTeX$ notation), and HTML to mark up your notebooks.
$$ \frac{\partial u}{\partial t} + (u \cdot \nabla) u - \nu \nabla^2 u = -\nabla w + g $$
If you aren't familiar with $\LaTeX$ math notation, you may find this cheat sheet helpful. You can also use Detexify to look up the commands by drawing symbols!
You can embed dynamically-created plots by calling the %matplotlib magic command at the top of a notebook (remember to call it before importing matplotlib, though!). We will cover matplotlib in more detail later.
%matplotlib inline
from matplotlib import pyplot as plt;
plt.plot(np.linspace(0,1,100)**2)
Python is used extensively in machine learning and the sciences for its scientific visualization capabilities. The following libraries will be very useful to you for EECS 445:
Many of these libraries come preinstalled with Anaconda, and the rest can be installed using the conda or pip package managers.
The numpy library makes it easy to work with vectors and matrices. Many other libraries (like SciKit-Learn and pandas) make extensive use of numpy objects, so it is essential to familiarize yourself with its basic features. The conventional way to import numpy uses a shorter alias:
import numpy as np;
numpy: Manually-Defined Arrays
You can populate an array manually, just as you would with Matlab:
A = np.array([ [1,2,3], [4,5,6], [7,8,9]]);
A
numpy: Data Types
If you do not provide a datatype, numpy will infer the data type for you. Be careful! Your calculations may not work as expected if numpy thinks you are working with integer arrays.
A.dtype
Numpy allows you to explicitly define the type as follows:
A = np.array([ [1,2,3], [4,5,6], [7,8,9] ], dtype=np.float64)
A.dtype
Alternatively, we could have used floating-point notation to define our array:
B = np.array([1., 2., 3.]);
B.dtype
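One way integer arrays can surprise you is when you assign a non-integer value into them. The following sketch shows the behavior on a small example array; astype is just one of several ways to convert to floating point.

```python
import numpy as np

A = np.array([1, 2, 3])       # all integers, so numpy infers an integer dtype
print(A.dtype)                # an integer type (exact width depends on platform)

A[0] = 2.7                    # the fractional part is silently dropped
print(A)                      # [2 2 3]

B = A.astype(np.float64)      # convert explicitly when you need floats
B[0] = 2.7
print(B[0])                   # 2.7
```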
numpy: Ranges
# np.arange(start, stop, step) -- includes start, but stops before reaching stop
np.arange(0,20,2)
# np.linspace(start, stop, num) -- includes both endpoints
np.linspace(0,1,11)
numpy: Built-In / Special Matrices
np.ones((3,3))
np.zeros(10)
np.eye(5)
np.diag([1,2,3,4])
numpy: Random Matrices
# random matrix with uniform[0,1] entries
np.random.rand(3,4)
# random matrix with Gaussian-distributed entries
np.random.randn(4,2)
numpy: Matrix / Vector Operators
All of the standard operators are elementwise...
np.diag(np.arange(3)) + np.array([ [1,2,3], [4,5,6], [7,8,9] ])
...including multiplication.
np.eye(3) * np.array([ [1,2,3], [4,5,6], [7,8,9] ])
To perform matrix-matrix and matrix-vector multiplication, use np.dot or the @ operator.
A = np.random.randn(3,3);
x = np.random.randn(3);
np.dot(A, x)
np.eye(3) @ np.array([ [1,2,3], [4,5,6], [7,8,9] ])
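Since random matrices make the difference hard to see by eye, here is a small sketch with hand-picked matrices (chosen only for this example) that contrasts the elementwise * with the matrix product:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

print(A * B)          # elementwise: [[0 2], [3 0]]
print(A @ B)          # matrix product: [[2 1], [4 3]]
print(np.dot(A, B))   # same as A @ B for 2-D arrays

v = np.array([1, 1])
print(A @ v)          # matrix-vector product: [3 7]
```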
We can easily take the transpose of a matrix:
A = np.random.randint(1,20,(3,5))
print(A);
print(A.T);
numpy: Indexing and Slicing
Consider the following matrix:
A = np.random.randint(-10, 10, (3,4))
print(A)
We can easily obtain individual elements by indexing. Everything in Python, including numpy, is zero-indexed.
A[1,2]
We can obtain submatrices by slicing. For example,
# second column
A[:,1]
# third row
A[2,:]
# submatrix (end index is not included)
A[0:2,1:3]
numpy: Slicing Weirdness
Be careful! Basic slices (like A[:,1]) are views into the original array's memory, rather than copies, so modifying a slice also modifies the original matrix.
A = np.eye(3);
B = A[:,1];
print(A, B);
B[0] = 100;
print(A, B);
To avoid this problem, you can explicitly copy ndarrays:
A = np.eye(3);
A_copy = np.copy(A);
A_copy[0,1] = 100;
print(A);
print(A_copy);
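If you are ever unsure whether two arrays share data, np.shares_memory can tell you. The sketch below (variable names are chosen just for this example) contrasts a basic slice, indexing with a list, and an explicit copy:

```python
import numpy as np

A = np.eye(3)

view = A[:, 1]              # basic slice: shares memory with A
fancy = A[:, [1]]           # indexing with a list: a fresh copy
explicit = A[:, 1].copy()   # explicit copy of the slice

print(np.shares_memory(A, view))       # True
print(np.shares_memory(A, fancy))      # False
print(np.shares_memory(A, explicit))   # False

view[0] = 100               # also changes A[0, 1]
print(A[0, 1])              # 100.0
```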
If you want to learn more about why this happens, read the following documents:
The matplotlib library allows for the creation of basic graphs and charts in Python. The matplotlib API is almost an exact copy of Matlab's, so if you already know Matlab you should feel right at home!
I highly recommend reading the PyPlot tutorial.
x = np.linspace(0,1,100);
y1 = x ** 2;
y2 = np.sin(x);
plt.plot(x, y1, label="parabola");
plt.plot(x, y2, label="sine");
plt.legend();
plt.xlabel("x axis");
# import 3d stuff
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
# create figure; enable 3d
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# plot parametric curve
t = np.linspace(0, 10, 100);
x = np.cos(t * 3);
y = np.sin(t * 3);
ax.plot(x,y,t)
# need to define a grid on the xy plane
xvals = np.linspace(-10,10, 100);
yvals = np.linspace(-10,10, 100);
X,Y = np.meshgrid(xvals, yvals);
Z = X**2 + Y**2;
# enable 3d
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z, cmap="coolwarm");
Create a 5x5 matrix with 1,2,3,4 just below the diagonal.
np.diag([1,2,3,4], k=-1)
Create an 8x8 matrix and fill it with a checkerboard pattern.
A = np.zeros((8,8))
A[::2,::2] = 1
A[1::2,1::2] = 1
A
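As an alternative sketch (one of many possible solutions, not the "official" one), the same pattern can be built from the row and column indices: an entry is 1 exactly when its row index plus its column index is even.

```python
import numpy as np

# row and column index of every entry in an 8x8 grid
rows, cols = np.indices((8, 8))

# 1 where row + column is even, 0 elsewhere
B = ((rows + cols) % 2 == 0).astype(float)

# the slicing version from the exercise above, for comparison
A = np.zeros((8, 8))
A[::2, ::2] = 1
A[1::2, 1::2] = 1

print(np.array_equal(A, B))   # True
```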
Write a function to generate a random orthogonal matrix of a specified size.
def random_orthogonal(n):
    A = np.random.randn(n,n);
    Q,R = np.linalg.qr(A);
    return Q;
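A quick sanity check for this solution is to confirm that the result really is orthogonal, that is, that Q.T @ Q is the identity up to rounding error. The sketch below repeats the function so that it runs on its own; the size 4 is an arbitrary choice.

```python
import numpy as np

def random_orthogonal(n):
    # QR-factorize a random Gaussian matrix and keep the orthogonal factor
    A = np.random.randn(n, n)
    Q, R = np.linalg.qr(A)
    return Q

Q = random_orthogonal(4)

# the columns are orthonormal, so Q^T Q should be the identity
print(np.allclose(Q.T @ Q, np.eye(4)))          # True

# an orthogonal matrix has determinant +1 or -1
print(np.isclose(abs(np.linalg.det(Q)), 1.0))   # True
```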