An important part of the advanced lab courses is learning how to properly analyze data and professionally present results. Many basic plotting or spreadsheet-based programs like Microsoft Excel, Kaleidagraph and Graphical Analysis either are not capable of doing the rigorous error analysis we require or do so in a “black-box”, hidden way. Additionally, these programs have limited capabilities for manipulating plots when making figures.
For these reasons (as well as an overall desire to introduce basic programming into the undergraduate physics curriculum), we will be using Python in this course to handle the analysis and presentation of data for reports.
When it comes to preliminary analysis and data processing, you may use whatever methods you like. It is fine, for example, to use a spreadsheet to record or manipulate data, or even to do simple plotting as-you-go, or pre-processing of the data.
When fitting and plotting for the report or for presentations, however, you must complete your work in Python or another equivalent high-level program.
NOTE: Requests to use an alternate language (e.g. Root, Mathematica) will be evaluated on a case-by-case basis, and students will need to demonstrate their ability to produce all required outputs before starting the course. Do not assume that because you have a program like Mathematica installed on your computer you have the tools needed for this class.
Our requirements for plots and analysis include the following:
The Python tutorial and scripts we provide will walk you through how to do all these things. (And more!) It is expected that you can use these tutorial scripts as a starting place for each experiment, and that with some light modifications, they will provide all the analysis tools you will need.
In computer programming, there are compiled languages and interpreted languages.
For a compiled language, you must first write a complete piece of code which is then translated (by the compiler) into an executable file. If there is an error in the code, it will not compile; therefore, the code must be complete and correct before you can run it. Similarly, once compiled, the executable cannot be edited; to change the program, you must change the code then recompile. Examples of compiled languages include Fortran, C/C++, and LabVIEW.
An interpreted language, on the other hand, is executed by feeding commands one line at a time to an interpreter (also called a kernel). The interpreter executes the commands and saves output to memory as each comes up. The code does not need to be complete (or correct) before you start, and you add new commands based on the outputs generated. Examples of interpreted languages include Mathematica, JavaScript and Python.
The most direct way to use Python is to open up the terminal on Mac (or the Command Prompt in Windows), start a python kernel, and type commands directly in line-by-line. This is fine if you need only a simple calculation, but it becomes very cumbersome if you need to do more than a few lines at once.
The second way to use Python is to write a script, which is a simple text document that contains a list of commands to be run in sequential order. You can write the entire piece of code, then feed the code to the kernel all at once. The kernel will run through the script from top to bottom, so if there is an error at some point, the code will still run up to that point before stopping. A script can be written in a simple plain text editor (scripts usually are given the extension *.py to differentiate them from normal text files), or there are specialized python editors like Spyder, Komodo or IDLE that have added features.
Usually when a script is run, a new kernel opens, runs the script, then closes. In this way, no variables or other saved values from a previous program remain in memory when a new script is run.
Finally, the most powerful way to run Python is through an interactive environment called a Jupyter Notebook. In a Jupyter notebook, we have cells that contain either python code or so-called “markdown” text (which can include plain text, headers, bullet point lists, hyperlinks, images, videos, code snippets and more). This format allows a user to separate code into smaller chunks (meaning you can edit and rerun just a part of the code rather than the whole thing when you want to make a change) and to intersperse it with rich commenting and context.
A Jupyter notebook also saves output within the file (meaning you don't need to rerun the program to see what result it gave last time) and keeps a detailed record of changes (so it can function as a timeline of edits).
You may use Python in whichever way you'd like when running code yourself, but we will be providing tutorials in Jupyter notebook format exclusively. Similarly, TAs and lab staff will be able to most easily help you if you are working in this environment. Therefore, we highly encourage you to use Jupyter whenever possible in this course.
For this course, we encourage you to download and use Anaconda. Navigate to the download page and select the appropriate Python 3.x download for your operating system. (Do not download the Python 2.x version.)
Anaconda is a complete package that is easy to use and install, even for beginners. It includes basic Python, the most common scientific libraries, a nice script text editor (Spyder), the interactive Jupyter notebook, and more.
WARNING: We DO NOT suggest you try to install Python or add libraries from any source other than directly through Anaconda unless you have experience with such things already. We can provide help only if you download Anaconda directly; we likely cannot help you rebuild your operating system after a bungled manual install if you try a different method.
Similarly, though Mac OS installs a rudimentary version of Python as part of its normal operations, this is not sufficient for what we need to do in this class. Do not try to patch or upgrade what you have; instead download Anaconda.
Once you have Anaconda downloaded, open the Anaconda Navigator program and launch Jupyter Notebook to get started.
You should see a new tab in your internet browser start up and you should see the directory structure of your computer. If you have an existing notebook you want to open, navigate to the location and double-click it. If you want to create a new notebook, navigate to an appropriate location and select New: Python 3.
Jupyter does not use script files (*.py files), but instead uses notebooks (*.ipynb files). These notebooks consist of a number of cells, and these cells can either be Python code, markdown text, or plain text (Raw NBConvert). The notebook allows you to run your Python snippets individually (no need to re-run the whole program to update just one plot, for example) and the non-Python cells allow you to add formatted text comments, hyperlinks, images and more. To execute a cell, hit “shift+enter”. (That will run the code or execute the markdown syntax.)
Here is a good primer on some markdown syntax that can help make your comment cells clearer and more useful. Note that this syntax is not part of Python, so it will not function in code cells.
At this point, you can continue the tutorial in a more interactive form using a Jupyter notebook. You can do this by following the link to our python tutorial notebook. If you want to test your installation of Jupyter notebooks, you can download the file and run it locally by right-clicking and downloading the file in the following link: Download for Python tutorial
The top section of the code is where you (typically) import modules (sometimes called libraries). A module is a file containing some Python code. Importing a module creates a namespace which is a list of names. Names can be variables, functions, lists of variables, etc. Think of importing a module as importing a bunch of prepackaged definitions.
To access a name from an imported namespace, call it using the form “namespace.name”. For example, if the module called “coolstuff” contains a function called “geewhiz()” you can now use it as “coolstuff.geewhiz()”.
There are a couple ways to import modules:
import someModule
A namespace is created named “someModule” and the contents of the module can be accessed as “someModule.name”
import someModule as newlyNamedModule
A namespace is created named “newlyNamedModule” and the contents of the module can be accessed as “newlyNamedModule.name”
from someModule import justTheOneNameYouWant
Only the listed name is imported. To call that name, you do not need to specify the namespace, just the name. However, be careful not to reuse that name in your own code or your newer definition will replace the imported one.
from someModule import *
All names from the module will be imported. To call a name, you do not need to specify the namespace, just the name. However, again, if you reuse that name in your own code, the newer definition will replace the imported one.
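As a quick illustration of the first three import forms, here is a minimal sketch using Python's standard `math` module. The choice of `math` and the alias `m` are ours for the example; the placeholder names above (someModule and so on) are not real modules.

```python
# Form 1: import the whole module; names live in the "math" namespace.
import math
root_1 = math.sqrt(16.0)

# Form 2: import under a shorter alias (the alias "m" is arbitrary).
import math as m
root_2 = m.sqrt(16.0)

# Form 3: import a single name; no namespace prefix is needed.
from math import sqrt
root_3 = sqrt(16.0)

print(root_1, root_2, root_3)

# Pitfall: redefining an imported name replaces it in your own code...
def sqrt(value):
    return "not a square root anymore"

print(sqrt(16.0))
# ...but the copy inside the module namespace is untouched.
print(math.sqrt(16.0))
```

This is one reason the "namespace.name" forms are usually the safer habit: a name that carries its prefix cannot be overwritten by accident.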
For this class, the three most common libraries we will pull from are “numpy”, “scipy”, and “matplotlib.pyplot”.
Comments are pieces of text that are ignored by Python. They can appear anywhere in the code and are preceded by #. They may even begin mid-way through a line, though they always “end” when the line finishes.
If you want to do a multi-line comment, you must use the clunkier system of “triple quotes”. Anything after three apostrophes – ''' – will be ignored until three more apostrophes appear to close the quote.
Code | Output |
# This is a comment | |
print("Hello world") # This is a comment too... but only the part after the # | > Hello world |
'''Now, we're in a multi-line comment... | |
... and we're still in it... | |
... but now we're at the end''' | |
print("What's your name?") | > What's your name? |
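One detail worth knowing: a triple-quoted block is technically a string rather than a true comment. When it stands alone, Python builds the string and then discards it, which is why it behaves like a comment. A minimal sketch (the variable name `note` is ours):

```python
# A bare triple-quoted string: created, then discarded, so nothing happens.
'''This line is skipped over,
and so is this one.'''

# The same syntax assigned to a variable is an ordinary multi-line string.
note = '''first line
second line'''

print(note)
print(len(note.splitlines()))
```

In a Jupyter notebook, a bare string on the last line of a cell is echoed as that cell's output, so # comments are usually the tidier choice there.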
In the above example we used the built-in Python command “print()” which will display whatever follows on the screen. We can have “print” show multiple things by separating them by commas or we can have “print” create a blank line by giving it no argument at all. If “print” is followed by a variable, it will display the current value associated with that variable. If we want to display text, we must pass “print” a string which is a list of characters contained in either single or double quotes.
For example, the following code gives the following output:
Code | Output |
a = 3.14159 | |
b = 1 | |
c = 'Q: Am I a string?' | |
print(a, b) | > 3.14159 1 |
print() | > (blank line) |
print(c) | > Q: Am I a string? |
print("A:", "Yes you are.") | > A: Yes you are. |
Note a few things.
First, “print” automatically inserts a single space between arguments; there is a single space between 3.14159 and 1 and a space was automatically added between “A:” and “Yes…”
Second, single and double quotes are treated identically to identify strings, but you cannot mix them; you must begin and end with the same type.
Finally, our variables a, b and c each held a different type of data. The variable a is a floating point number or float (that is, a number with a decimal value), b is an integer, and c is a string.
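If you are ever unsure what kind of data a variable holds, the built-in `type()` function will tell you. A short sketch reusing the variables from the example above (the name `total` is ours):

```python
a = 3.14159
b = 1
c = 'Q: Am I a string?'

# type() reports what kind of data a variable currently holds.
print(type(a))
print(type(b))
print(type(c))

# Mixing an int and a float in arithmetic gives a float.
total = a + b
print(type(total))

# Double quotes build the same kind of object as single quotes.
print('hello' == "hello")
```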
Lists and numpy arrays
Besides the variable types shown above, we can also assign a whole list to a variable. Lists in Python are contained in square brackets and elements are separated by commas; lists can also be nested to form higher dimensional arrays or tensors. We can act on lists with mathematical operators, but the results can sometimes be confusing.
Look at the following example:
Code | Output |
a = [1, 2, 3, 4] | |
b = [9, 8, 7, 6] | |
print('[a,b] =', [a,b]) | > [a,b] = [[1, 2, 3, 4], [9, 8, 7, 6]] |
print('a + b =', a + b) | > a + b = [1, 2, 3, 4, 9, 8, 7, 6] |
print('2*a =', 2*a) | > 2*a = [1, 2, 3, 4, 1, 2, 3, 4] |
The first line is a nested list (as expected). In the second line, Python interprets “+” to mean “add second list to the end of the first list.” In the third line, Python interprets “2*a” to be “a + a” and again appends the second list to the first.
Can we instead treat lists more naturally so that we can act on them with operators in a normal way? Within the “numpy” library, there is a variable type known as the array. If we use arrays instead of lists, then mathematical operations carry their normal meaning and we can use lists of variables in the same way we use variables themselves.
Code | Output |
import numpy as np | |
a = np.array([1, 2, 3, 4]) | |
b = np.array([9, 8, 7, 6]) | |
print('a + b =', a + b) | > a + b = [10 10 10 10] |
print('2*a =', 2*a) | > 2*a = [2 4 6 8] |
print('1.5 + b =', 1.5 + b) | > 1.5 + b = [10.5  9.5  8.5  7.5] |
This time, adding the arrays meant adding each pair of elements, multiplying an array by a number meant multiplying each element by that number, and adding a number to an array meant adding that number to each element.
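The same element-by-element behavior applies to other operators and to numpy's own functions. The sketch below assumes numpy is installed (it is included with Anaconda); the array names and values are made up for illustration and are not course data.

```python
import numpy as np

# Hypothetical measurements, invented for this example.
lengths = np.array([1.0, 2.0, 3.0, 4.0])
widths = np.array([0.5, 0.5, 2.0, 2.0])

# Operators act element by element on arrays of the same length.
areas = lengths * widths
squares = lengths ** 2
roots = np.sqrt(squares)

print(areas)
print(squares)
print(roots)

# An existing list can be converted to an array at any point.
plain_list = [9, 8, 7, 6]
as_array = np.array(plain_list)
print(as_array - 1)
```

Note the use of `np.sqrt` rather than a function written for single numbers: numpy's functions are designed to accept a whole array at once.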
If you want to select just one element of a list or array, you may select it by specifying the index between square brackets, e.g. a[i]. The counter always begins with zero, so a list of four items has elements with indices 0, 1, 2 and 3. You can also use the negative sign to count from the end; a[-1] will return the last element of the list a.
If you want to select a subset of the list, you can specify the start and end points in the form a[m:n]. Note that it will include the element at the starting index, but exclude the element at the ending index.
Omitting the first index will include all elements from the beginning of the list. Omitting the second index will include all elements to the end of the list.
You may select every nth element of a list by using two colons, e.g. a[::n].
Code | Output |
a = [0, 1, 2, 3, 4, 5] | |
print(a[1]) | > 1 |
print(a[-1]) | > 5 |
print(a[1:4]) | > [1, 2, 3] |
print(a[:-1]) | > [0, 1, 2, 3, 4] |
print(a[::2]) | > [0, 2, 4] |
print(a[::3]) | > [0, 3] |
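The pieces described above can also be combined in the general form a[start:stop:step]. A few more cases, as a minimal sketch on the same list (the negative step is an extra not covered in the text above):

```python
a = [0, 1, 2, 3, 4, 5]

# Start, stop and step together: indices 1 and 3 (index 5 is excluded).
middle = a[1:5:2]
# Omitting the second index runs to the end of the list.
tail = a[2:]
# Negative indices count from the end in slices too: the last two elements.
last_two = a[-2:]
# A negative step walks through the list backwards.
backwards = a[::-1]

print(middle)     # [1, 3]
print(tail)       # [2, 3, 4, 5]
print(last_two)   # [4, 5]
print(backwards)  # [5, 4, 3, 2, 1, 0]

# Slicing builds a new list; the original is left unchanged.
print(a)          # [0, 1, 2, 3, 4, 5]
```

The same slicing notation works on numpy arrays.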
Python has an unusual notation for raising a number to a power: two asterisks.
Code | Output |
print('2 squared is', 2**2) | > 2 squared is 4 |
print('2 cubed is', 2**3) | > 2 cubed is 8 |
print('the square root of 4 is', 4**0.5) | > the square root of 4 is 2.0 |
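Two pitfalls are worth seeing once, since both run without any error message. The sketch below is ours, not part of the course scripts: the caret symbol used for powers in many other programs means something entirely different in Python, and a leading minus sign is applied after the power.

```python
# The caret is NOT a power in Python; it is the bitwise XOR operator.
power = 2 ** 3
xor = 2 ^ 3
print(power)   # 8
print(xor)     # 1

# ** is applied before a leading minus sign, so use parentheses
# when the negative number itself should be raised to the power.
without_parens = -2 ** 2
with_parens = (-2) ** 2
print(without_parens)  # -4
print(with_parens)     # 4
```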
In programming, some symbols have slightly different meanings from those used in pure mathematics.
Take, for example, the equal sign. In Python (as in most programming languages), a single equal sign is used to assign a value; it does not represent equality. Importantly, the left hand side and right hand side have different meanings and are treated at different times. During an assignment, the right hand side is evaluated first and the result is stored as the variable(s) given on the left hand side.
The following assignments are allowed:
Code | Interpretation |
x = 1 | assigns the integer 1 to the variable x |
x = x + 1 | takes the current value of x, adds 1 to it, then assigns this new value to x |
x = a + b | sums the current values of a and b and assigns that sum to x |
y, z = 1, 2 | assigns 1 to y and 2 to z |
The following assignments are not allowed:
Code | Interpretation |
1 = x | the value of x cannot be assigned to the number 1 (a number isn’t a variable) |
x + 1 = x | the left hand side is an expression, not a variable |
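To see the "right hand side first" rule in action, here is a minimal sketch. The built-in `compile()` is used only so that the example can show Python rejecting the disallowed lines without stopping the program; the names `rejected` and `bad_line` are ours.

```python
# The right-hand side is evaluated first, then stored on the left.
x = 1
x = x + 1          # uses the old value (1), stores the new value (2)
print(x)           # 2

# Because the whole right side is evaluated before anything is stored,
# two variables can be swapped in one line with no temporary variable.
a, b = 10, 20
a, b = b, a
print(a, b)        # 20 10

# The disallowed forms are rejected as syntax errors before any code runs.
rejected = []
for bad_line in ["1 = x", "x + 1 = x"]:
    try:
        compile(bad_line, "<example>", "exec")
    except SyntaxError:
        rejected.append(bad_line)
print(rejected)    # ['1 = x', 'x + 1 = x']
```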
One can make comparisons between variables of the same type, with a few caveats.
Code | Output |
a = 1 | |
b = 5 | |
c = 'Hello' | |
d = 'Hulloh' | |
e = 'He' + 'llo' | |
print(a > b) | > False |
print(a < b) | > True |
print(c == d) | > False |
print(c == e) | > True |
print(c == d and a < b) | > False |
print(c == d or a < b) | > True |
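The text above does not list the caveats, so the following are our own picks of three that commonly come up in data analysis, shown as a minimal sketch. `math.isclose` is a standard-library function; the variable names are ours.

```python
import math

# Caveat 1: floats are stored with finite precision, so == can surprise you.
total = 0.1 + 0.2
exactly_equal = (total == 0.3)
close_enough = math.isclose(total, 0.3)
print(exactly_equal)   # False
print(close_enough)    # True

# Caveat 2: string comparison is case-sensitive.
same_text = ('Hello' == 'hello')
print(same_text)       # False

# Caveat 3: ordering values of different types raises an error.
try:
    1 < 'Hello'
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)        # False
```

When comparing measured or calculated values, a tolerance-based check such as `math.isclose` is generally a safer choice than `==`.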
Indentation has meaning in Python. You may always include white space in the middle of a line or a blank line in the middle of your program, but you may not always be able to include an arbitrary amount of white space at the beginning of a line.
Blocks of code, such as loops and conditional (if/else) statements, use indents to delimit different levels. A new block begins when the indent increases and ends when the indent decreases back to a previous indent level. Look at the following example:
Code |
if x > 4: |
    print("x is greater than 4...") |
    if y < 1: |
        print("...and y is less than 1") |
    else: |
        print("...and y is greater than or equal to 1") |
else: |
    print("x is less than or equal to 4...") |
    print("...and nobody cares about y") |
Here, we see three indent levels. Let’s look at three outputs for given values of x and y.
Values | x = 5, y = 5 | x = 5, y = 0 | x = 1, y = 5 |
Output | > x is greater than 4… > …and y is greater than or equal to 1 | > x is greater than 4… > …and y is less than 1 | > x is less than or equal to 4… > …and nobody cares about y |
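The same indentation rule applies to loops and function definitions. As a sketch, the example above can be wrapped in a function (the name `describe` and the list `messages` are ours) and run for all three pairs of values with a `for` loop:

```python
def describe(x, y):
    """Return the messages from the example above as a list of strings."""
    messages = []
    if x > 4:
        messages.append("x is greater than 4...")
        if y < 1:
            messages.append("...and y is less than 1")
        else:
            messages.append("...and y is greater than or equal to 1")
    else:
        messages.append("x is less than or equal to 4...")
        messages.append("...and nobody cares about y")
    return messages

# The loop body is everything indented beneath the "for" line.
for x, y in [(5, 5), (5, 0), (1, 5)]:
    print("x =", x, "y =", y)
    for line in describe(x, y):
        print("   ", line)

# Back at the left margin: this runs once, after the loop has finished.
print("done")
```

If the final `print("done")` were indented to match the loop body, it would instead run three times, once per pair of values.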