In [1]:

```
%matplotlib inline
import numpy as np;
from matplotlib import pyplot as plt
```

Benjamin Bray, January 2017 for EECS 445 @ University of Michigan

*(Updated August 2018 for CS 4540 @ Georgia Tech)*

The goal of this tutorial is not to teach you everything you need to know about Python and the scientific libraries you will be using in this class, but rather to provide you with helpful resources and vocabulary you can use to search for help on your own.

*This document contains dozens of helpful links!*

Even just a few years ago, installing Python was a nightmare, especially on Windows. Now, we are lucky enough to have Anaconda, a Python distribution that comes with many useful libraries and a package manager called `conda` that simplifies the installation of new libraries. For this class, we **strongly recommend** using Anaconda. Some warnings about Python:

- Python (like most software) works best on Linux :)
  - Good luck debugging problems on Windows or Mac. ;)
  - Windows users may find these precompiled Python extension binaries helpful.
- Anaconda comes with a graphical interface for Python, but familiarizing yourself with the command line interface will make life much easier!
- On all platforms, installing Python may have unintended consequences on your platform `PATH`. This can be avoided by using `conda` environments or Python virtual environments, which also allow you to install different Python / library versions.
  - This is useful, e.g. if you need to use Python 3 for one course and Python 2 for another, however...

Use Python 3! Unless you're running legacy code or need to use obscure, unmaintained libraries, there is absolutely no good reason to continue using Python 2, which will be officially retired in 2020.

For the rest of this tutorial, I will assume you are using Anaconda with Python >= 3.5.
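If you are unsure which interpreter you are actually running, a quick check along these lines may help. This is only a minimal sketch; the helper name `meets_minimum` is made up for illustration and is not part of any library.

```python
import sys

def meets_minimum(major=3, minor=5):
    """Return True if the running interpreter is at least major.minor."""
    return sys.version_info >= (major, minor)

print(sys.version.split()[0])   # version string of the running interpreter
print(meets_minimum())          # True on any Python >= 3.5
```

Running this inside each `conda` environment is an easy way to confirm which Python version that environment provides.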

Python is an interpreted, dynamically typed, open-source language that is especially suitable for quick prototyping and scientific visualization. Python is very easy to read, which is why many people like to think of it as *executable pseudocode*.

"I came to Python not because I thought it was a better/acceptable/pragmatic Lisp, but because it was *better pseudocode*. Several students claimed that they had a hard time mapping from the pseudocode in my AI textbook to the Lisp code that Russell and I had online. So I looked for the language that was most like our pseudocode, and found that Python was the best match. Then I had to teach myself enough Python to implement the examples from the textbook. I found that Python was very nice for certain types of small problems, and had the libraries I needed to integrate with lots of other stuff, at Google and elsewhere on the net."

-- Peter Norvig (emphasis added)

Python is **not** a functional programming language (like e.g. Haskell), but it does borrow some functional programming concepts like list comprehensions and lambda functions.
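To make these two borrowed features concrete, here is a small sketch that computes the same list with a comprehension and with `lambda` plus `map`/`filter`, and then uses a lambda as a sort key. The variable names are arbitrary and chosen only for illustration.

```python
# list comprehension: squares of the even numbers below 10
even_squares = [k**2 for k in range(10) if k % 2 == 0]

# the same result in a more "functional" style, using lambda with map/filter
even_squares_fn = list(map(lambda k: k**2, filter(lambda k: k % 2 == 0, range(10))))

# lambdas are handy as short sort keys
words = ["numpy", "pandas", "sympy", "nltk"]
by_length = sorted(words, key=lambda w: len(w))

print(even_squares)     # [0, 4, 16, 36, 64]
print(even_squares_fn)  # [0, 4, 16, 36, 64]
print(by_length)        # ['nltk', 'numpy', 'sympy', 'pandas']
```

Most Python programmers would reach for the comprehension here, since it reads closer to the pseudocode it replaces.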

Python has all the basic features you would expect it to have. In Python, there are no curly braces to indicate scope, and therefore **whitespace matters**. Here is some basic Python code:

In [2]:

```
# recursively compute the nth Fibonacci number
def fibonacci(n):
    if n <= 1:
        return 1;
    else:
        return fibonacci(n-1) + fibonacci(n-2);

# print the first several Fibonacci numbers
for k in range(10):
    print(k, fibonacci(k));

You should look up the following basic concepts in Python, only a handful of which I will demonstrate in this notebook:

- Lists vs. Tuples
- `args` and `kwargs`
- Dictionaries
- List / Dictionary Comprehensions
- String Manipulation
- Generators
- Python Classes
- Decorators
- Lambda Functions
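As a starting point for your own searching, the sketch below touches a few of the items in this list: tuples vs. lists, `*args` / `**kwargs`, generators, and dictionary comprehensions. The function names `describe` and `countdown` are invented for this example.

```python
# tuples are immutable, lists are mutable
point = (1.0, 2.0)
values = [1, 2, 3]
values.append(4)          # fine; point.append(...) would raise AttributeError

# *args collects extra positional arguments, **kwargs extra keyword arguments
def describe(*args, **kwargs):
    return len(args), sorted(kwargs)

# a generator produces values lazily, one at a time
def countdown(n):
    while n > 0:
        yield n
        n -= 1

# dictionary comprehension
squares = {k: k**2 for k in range(4)}

print(describe(1, 2, 3, color="red", size=10))  # (3, ['color', 'size'])
print(list(countdown(3)))                       # [3, 2, 1]
print(squares)                                  # {0: 0, 1: 1, 2: 4, 3: 9}
```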

In [3]:

```
# list example
A = [1,2,3,4];
A
```

Out[3]:

In [4]:

```
# list comprehension example
B = [ k**2 for k in range(10) ];
B
```

Out[4]:

In [5]:

```
# dictionary example
grades = {
    "A" : 4.0,
    "A-" : 3.7,
    "B+" : 3.3,
    "B" : 3.0,
    "B-" : 2.7,
    "C+" : 2.3,
    "C" : 2.0,
    "C-" : 1.7
}
grades["A"]
```

Out[5]:

Semicolons are not strictly necessary in Python code, but I prefer to use them because my code feels "naked" otherwise. In IPython and Jupyter notebooks, the value of the last expression in a cell is echoed as output unless the line ends with a semicolon. For example:

In [6]:

```
5 + 7
```

Out[6]:

In [7]:

```
5 + 7;
```

Python is remarkably fast for an interpreted language, but in general it cannot beat compiled languages designed with speed in mind. **However**, well-written Python code can come very close in speed to languages like C++ and Fortran. Python can also be extended directly with C/C++/Fortran code, using native Python C Extensions or using libraries like `ctypes`.

*(you may enjoy the blog post Why Your Python Runs Slow, Part 1: Data Structures)*
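One plausible reading of "well-written" here (an assumption, since the text does not spell it out) is pushing inner loops into compiled library routines such as those in `numpy`. The sketch below times a pure-Python loop against a vectorized call for the same sum of squares; the function names are hypothetical and the timings will vary by machine, so no numbers are claimed.

```python
import timeit
import numpy as np

n = 100000
values = np.arange(n, dtype=np.float64)

def python_sum_of_squares(xs):
    # explicit Python-level loop: one interpreter step per element
    total = 0.0
    for x in xs:
        total += x * x
    return total

def numpy_sum_of_squares(xs):
    # a single call into compiled code
    return float(np.dot(xs, xs))

t_loop = timeit.timeit(lambda: python_sum_of_squares(values), number=5)
t_vec = timeit.timeit(lambda: numpy_sum_of_squares(values), number=5)
print("loop: %.4f s, vectorized: %.4f s" % (t_loop, t_vec))
```

Both functions compute the same quantity; only the place where the loop runs differs.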

You will often hear the phrase "*Pythonic*", referring to Python code that is particularly clear, and that conforms to the conventions established by the Python community. Pythonic code is not hard to write--if you are familiar with the language features, there is usually only one good way to perform a particular action. If you want to know what "Pythonic" code looks like, have a look at PEP 8: Style Guide for Python Code.

The following Python manifesto is built into the language itself:

In [8]:

```
import this
```

There are several common ways to run Python:

- Interactively on the command line, using the Python REPL. Simply type `python` into a terminal.
  - Some users prefer the IPython interactive terminal instead.
- Execute the contents of `.py` files all at once: `python myscript.py`
- Interactively in a Jupyter notebook, which is a mix of the above two approaches.
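For the second option, a script is often organized so that it can be both run directly and imported. The following is a minimal sketch of what a hypothetical `myscript.py` might contain; the `main` function and its greeting behavior are invented for illustration.

```python
# contents of a hypothetical myscript.py
import sys

def main(argv):
    """Greet each name given on the command line (or 'world' if none)."""
    names = argv[1:] or ["world"]
    greetings = ["Hello, %s!" % name for name in names]
    for line in greetings:
        print(line)
    return greetings

# this guard is true when the file is run as `python myscript.py`,
# but false when the file is imported from another module
if __name__ == "__main__":
    main(sys.argv)
```

With this layout, `python myscript.py Ada` would print a greeting for `Ada`, while `import myscript` would define `main` without running it.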

This document was written entirely inside of a Jupyter notebook (formerly known as IPython notebooks, but renamed because they now support many other languages)! These notebooks are great for documentation and prototyping because they allow you to mix Python code with math, text, and images! Use a mixture of Markdown, MathJax (which uses $\LaTeX$ notation), and HTML to mark up your notebooks.

$$ \frac{\partial u}{\partial t} + (u \cdot \nabla) u - \nu \nabla^2 u = -\nabla w + g $$

If you aren't familiar with $\LaTeX$ math notation, you may find this cheat sheet helpful. You can also use Detexify to look up the commands by drawing symbols!

You can embed dynamically-created plots by calling the `%matplotlib` magic command at the top of a notebook (remember to call it **before** importing `matplotlib`, though!). We will cover `matplotlib` in more detail later.

```
%matplotlib
from matplotlib import pyplot;
```

In [9]:

```
plt.plot(np.linspace(0,1,100)**2)
```

Out[9]:

Python is used extensively in machine learning and the sciences for its scientific visualization capabilities. The following libraries will be very useful to you for EECS 445:

- The SciPy stack.
  - NumPy for numerical linear algebra.
  - Matplotlib for plotting and visualization, in particular the PyPlot interface.
  - SymPy for symbolic math.
  - Pandas for reading and processing data.
- Machine Learning
  - SciKit-Learn contains implementations of almost every algorithm we will discuss in EECS 445.
  - NLTK for natural language processing.

Many of these libraries come preinstalled with Anaconda, and the rest can be installed using the `conda` or `pip` package managers.

The `numpy` library makes it easy to work with vectors and matrices. Many other libraries (like SciKit-Learn and `pandas`) make extensive use of `numpy` objects, so it is essential to familiarize yourself with its basic features. The conventional way to import `numpy` uses a shorter alias:

`import numpy as np;`

`numpy`: Manually-Defined Arrays

You can populate an array manually, just as you would with Matlab:

In [10]:

```
A = np.array([ [1,2,3], [4,5,6], [7,8,9]]);
A
```

Out[10]:

`numpy`: Data Types

If you do not provide a datatype, `numpy` will infer the data type for you. **Be careful!** Your calculations may not work as expected if `numpy` thinks you are working with integer arrays.

In [11]:

```
A.dtype
```

Out[11]:

Numpy allows you to explicitly define the type as follows:

In [12]:

```
A = np.array([ [1,2,3], [4,5,6], [7,8,9] ], dtype=np.int64)
A.dtype
```

Out[12]:

Alternatively, we could have used floating-point notation to define our array:

In [13]:

```
B = np.array([1., 2., 3.]);
B.dtype
```

Out[13]:
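To illustrate the kind of surprise an inferred integer type can cause, the sketch below assigns a fractional value into an integer array and then into a float copy of it. It is only one example of the pitfall, chosen for brevity.

```python
import numpy as np

A = np.array([1, 2, 3])        # dtype inferred as an integer type
A[0] = 2.7                     # silently truncated to 2
print(A, A.dtype)

B = A.astype(np.float64)       # explicit conversion returns a new float array
B[0] = 2.7                     # stored as 2.7
print(B, B.dtype)
```

If an array is going to hold real-valued data, passing `dtype=np.float64` when creating it, or writing the literals as `1.`, `2.`, `3.`, avoids the truncation.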

`numpy`: Ranges

In [14]:

```
# np.arange(start, stop, step) -- includes start, excludes stop
np.arange(0,20,2)
```

Out[14]:

In [15]:

```
# np.linspace(start, stop, num) -- includes both endpoints
np.linspace(0,1,11)
```

Out[15]:
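The endpoint behavior of the two functions is easiest to see side by side. This small sketch builds the same grid both ways; the spacing of 0.25 is an arbitrary choice.

```python
import numpy as np

a = np.arange(0, 1, 0.25)    # step given; the stop value 1 is excluded
b = np.linspace(0, 1, 5)     # number of points given; the stop value 1 is included

print(a.tolist())   # [0.0, 0.25, 0.5, 0.75]
print(b.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]
```

As a rule of thumb, `np.arange` is convenient when you know the step, and `np.linspace` when you know how many points you want.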

`numpy`: Built-In / Special Matrices

In [16]:

```
np.ones((3,3))
```

Out[16]:

In [17]:

```
np.zeros(10)
```

Out[17]:

In [18]:

```
np.eye(5)
```

Out[18]:

In [19]:

```
np.diag([1,2,3,4])
```

Out[19]:

`numpy`: Random Matrices

In [20]:

```
# random matrix with entries drawn uniformly from [0,1)
np.random.rand(3,4)
```

Out[20]:

In [21]:

```
# random matrix with Gaussian-distributed entries
np.random.randn(4,2)
```

Out[21]:
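Random matrices change on every run, which can make results hard to reproduce. One common remedy, sketched here with an arbitrary seed value, is to seed the generator before drawing.

```python
import numpy as np

np.random.seed(445)            # arbitrary seed, chosen only for illustration
first = np.random.rand(2, 3)

np.random.seed(445)            # resetting the seed replays the same sequence
second = np.random.rand(2, 3)

print(np.array_equal(first, second))   # True
```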

`numpy`: Matrix / Vector Operators

All of the standard operators are **elementwise**...

In [22]:

```
np.diag(np.arange(3)) + np.array([ [1,2,3], [4,5,6], [7,8,9] ])
```

Out[22]:

...including multiplication.

In [23]:

```
np.eye(3) * np.array([ [1,2,3], [4,5,6], [7,8,9] ])
```

Out[23]:

To perform matrix-matrix and matrix-vector multiplication, use `np.dot` or the `@` operator.

In [24]:

```
A = np.random.randn(3,3);
x = np.random.randn(3);
np.dot(A, x)
```

Out[24]:

In [25]:

```
np.eye(3) @ np.array([ [1,2,3], [4,5,6], [7,8,9] ])
```

Out[25]:
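Because the identity matrix hides the difference between the two kinds of product, it can help to compare them on matrices where the results differ. The small matrices below are made up for this purpose.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
x = np.array([1, -1])

print((A * B).tolist())       # elementwise product: [[0, 2], [3, 0]]
print((A @ B).tolist())       # matrix product:      [[2, 1], [4, 3]]
print(np.dot(A, x).tolist())  # matrix-vector:       [-1, -1]
```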

We can easily take the transpose of a matrix:

In [26]:

```
A = np.random.randint(1,20,(3,5))
print(A);
print(A.T);
```

`numpy`: Indexing and Slicing

Consider the following matrix:

In [27]:

```
A = np.random.randint(-10, 10, (3,4))
print(A)
```

We can easily obtain individual elements by indexing. Everything in Python, including numpy, is **zero-indexed**.

In [28]:

```
A[1,2]
```

Out[28]:

We can obtain submatrices by slicing. For example,

In [29]:

```
# second column
A[:,1]
```

Out[29]:

In [30]:

```
# third row
A[2,:]
```

Out[30]:

In [31]:

```
# submatrix (end index is not included)
A[0:2,1:3]
```

Out[31]:

`numpy`: Slicing Weirdness

**Be careful!** Slices made with basic indexing (like `A[:,1]`) are views onto the original array's memory rather than copies, so modifying the slice also modifies the original.

In [32]:

```
A = np.eye(3);
B = A[:,1];
print(A, B);
```

In [33]:

```
B[0] = 100;
print(A, B);
```

To avoid this problem, you can explicitly copy ndarrays:

In [34]:

```
A = np.eye(3);
A_copy = np.copy(A);
A_copy[0,1] = 100;
print(A);
print(A_copy);
```
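If you are not sure whether an indexing expression gave you a view or a copy, `np.shares_memory` can tell you. The sketch below compares a basic slice, an integer-array index, and an explicit copy; the variable names are arbitrary.

```python
import numpy as np

A = np.eye(3)
view = A[:, 1]             # basic slice: a view onto A's data
fancy = A[[0, 2], :]       # integer-array ("fancy") indexing: a copy
explicit = A[:, 1].copy()  # explicit copy

print(np.shares_memory(A, view))      # True
print(np.shares_memory(A, fancy))     # False
print(np.shares_memory(A, explicit))  # False
```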

If you want to learn more about why this happens, read the following documents:

The `matplotlib` library allows for the creation of basic graphs and charts in Python. The `matplotlib` API is almost an exact copy of Matlab's, so if you already know Matlab you should feel right at home!

I highly recommend reading the PyPlot tutorial.

In [35]:

```
x = np.linspace(0,1,100);
y1 = x ** 2;
y2 = np.sin(x);
plt.plot(x, y1, label="parabola");
plt.plot(x, y2, label="sine");
plt.legend();
plt.xlabel("x axis");
```
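The same plot can also be written with explicit figure and axes objects, which some people find easier to manage once a notebook contains several figures. This is a sketch of that alternative style, not a replacement for the cell above; the output filename `curves.png` is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 100)

fig, ax = plt.subplots()
ax.plot(x, x ** 2, label="parabola")
ax.plot(x, np.sin(x), label="sine")
ax.set_xlabel("x axis")
ax.set_ylabel("y axis")
ax.set_title("Two curves on one set of axes")
ax.legend()

fig.savefig("curves.png", dpi=150)   # write the figure to an image file
```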

In [36]:

```
# import 3d stuff
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
# create figure; enable 3d
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# plot parametric curve
t = np.linspace(0, 10, 100);
x = np.cos(t * 3);
y = np.sin(t * 3);
ax.plot(x,y,t)
```

Out[36]:

In [37]:

```
# need to define a grid on the xy plane
xvals = np.linspace(-10,10, 100);
yvals = np.linspace(-10,10, 100);
X,Y = np.meshgrid(xvals, yvals);
Z = X**2 + Y**2;
# enable 3d
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z, cmap="coolwarm");
```

Create a 5x5 matrix with 1,2,3,4 just below the diagonal.

In [38]:

```
np.diag([1,2,3,4], k=-1)
```

Out[38]:

Create an 8x8 matrix and fill it with a checkerboard pattern.

In [39]:

```
A = np.zeros((8,8))
A[::2,::2] = 1
A[1::2,1::2] = 1
A
```

Out[39]:
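There is more than one reasonable solution to this exercise. As an alternative sketch, a cell belongs to the "1" squares exactly when its row index plus its column index is even, which can be computed for the whole board at once.

```python
import numpy as np

rows, cols = np.indices((8, 8))             # row and column index of every cell
board = ((rows + cols) % 2 == 0).astype(float)

# reference: the slicing-based construction
expected = np.zeros((8, 8))
expected[::2, ::2] = 1
expected[1::2, 1::2] = 1

print(np.array_equal(board, expected))      # True
```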

Write a function to generate a random orthogonal matrix of a specified size.

In [40]:

```
def random_orthogonal(n):
    A = np.random.randn(n,n);
    Q,R = np.linalg.qr(A);
    return Q;
```
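A quick sanity check for a solution like this is to confirm that the result satisfies $Q^T Q = I$. The sketch below repeats the function so that it runs on its own, then tests a 4x4 example (the size is arbitrary).

```python
import numpy as np

def random_orthogonal(n):
    # same construction as above: Q factor of a random Gaussian matrix
    A = np.random.randn(n, n)
    Q, R = np.linalg.qr(A)
    return Q

Q = random_orthogonal(4)
print(np.allclose(Q.T @ Q, np.eye(4)))          # True
print(np.isclose(abs(np.linalg.det(Q)), 1.0))   # True
```

The determinant of an orthogonal matrix is either +1 or -1, which is why the check uses its absolute value.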