Introduction To Python, Numpy and The OPUS Expression Language
Introduction to Python
Last Session
Today’s Objectives
Starting Python
• Another way to start an interactive Python session is with IDLE, which is a simple
GUI for Python that comes on all platforms. It is available from the Windows
program menu, or can be started from the command line by typing ‘idle’.
• IDLE also has an editor that lets you write scripts, or programs, and save
them to disk for running later. Note that Python programs on disk conventionally
have a .py filename suffix, like ‘myprogram.py’. The .py indicates that it is Python.
• But let’s stick with the interactive prompt for the moment. Notice that it has a
prompt with three ‘greater than’ signs: >>>
• Here is your first program: make Python print ‘Hello World!’ (this is the first
program in every programming language course)
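As a minimal sketch of that first program, written with the function-call form of print so that it also runs under Python 3 (the variable name `message` is only for illustration):

```python
# Store the greeting in a variable, then print it.
message = "Hello World!"
print(message)
```

At the interactive prompt you would type the print line after the >>> and press Enter; Python prints the text on the next line.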
• Let’s begin by using Python for some interactive calculations. Note that we can
simply type in a formula and Python will evaluate it:
>>> (2+3)/3.2
1.5625
• And we can assign values to variables and then operate on those variables:
>>> a = 2
>>> b = 3
>>> c = 3.2
>>> (a+b)/c
1.5625
• Let’s begin paying attention to data types. Numbers without decimals are
Integers. Numbers with decimals are Floats. Let’s see what happens if we use
only integers with the preceding example:
>>> (2+3)/3
1
• Is this what you expected? Probably not. In Python 2, dividing one integer by
another is an integer division, and this gives a truncated result (no decimal
value) with no rounding up. (In Python 3, / always returns a float, and //
performs the integer division.)
• If you want to produce a result as a Float, at least one of the arguments
(elements) of the calculation must be a float. That is why the previous example
gave the ‘right’ answer.
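A short sketch of the point about integer and float division. The session above shows Python 2 behaviour; this version is written for Python 3, where `/` always returns a float and `//` is the operator that drops the fractional part. The variable names are illustrative only.

```python
a, b = 2, 3

# Floor division keeps only the whole-number part
# (this is what Python 2 did for integer / integer).
truncated = (a + b) // 3

# Making one operand a float gives the full result in Python 2 and Python 3.
as_float = (a + b) / 3.0

# float() converts an integer explicitly before dividing.
converted = float(a + b) / 3

print(truncated)
print(as_float == converted)
```

Either of the last two forms is a reasonable way to make the intent explicit when a calculation mixes integers and floats.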
• Integer: 1, 2, 3
>>> a = 1
• Float: 2.32592
>>> b = 3.141
• String: ‘Abracadabra’
>>> c = 'Kalamazoo'
>>> c[0]  # Python uses indexes to look things up -- the first entry is 0
'K'
• List: [‘apple’, ‘pear’, ‘orange’], [1, 5, 9, 12]
>>> d = ['apple', 'pear', 'orange']
>>> d[0:2]  # Using a second index, you will get one less than you might expect
['apple', 'pear']
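The four data types and the indexing rules above can be checked with a small script. This is only a sketch; the helper names (`type_names`, `first_letter`, and so on) are made up for illustration.

```python
a = 1                             # Integer
b = 3.141                         # Float
c = 'Kalamazoo'                   # String
d = ['apple', 'pear', 'orange']   # List

# type() reports the data type of each value.
type_names = [type(x).__name__ for x in (a, b, c, d)]

first_letter = c[0]    # indexes start at 0
last_letter = c[-1]    # negative indexes count from the end
first_two = d[0:2]     # a slice stops just before the second index
rest = d[1:]           # leaving out the second index runs to the end

print(type_names)
print(first_letter, last_letter, first_two, rest)
```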
Python Modules
def countdown(n):
    while n > 0:
        print n
        n = n-1
    print "Blastoff!"
countdown(n)
save this file to blastoff.py
Usually we would store longer commands or multi-line scripts like this in a Python
module. We write the script in a text editor like the one built into IDLE, or an
alternative like Scite or TextMate (there are many options) that ‘understands’ Python
syntax, highlights it and provides useful shortcuts, and then save it to disk as a
file with a name like blastoff.py (don’t call it countdown.py, since the function
has already used that name).
But we don’t want to ‘hard-code’ the number for n -- we want to pass it as an argument to
the program. So we add 2 lines to the beginning of the program:
import sys
n = int(sys.argv[1])
Now we can run it from the command line, passing the value of n as an argument.
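Putting the pieces together, here is a sketch of what the complete blastoff.py might look like. It is written for Python 3 (print as a function). The `main` helper, the usage message, and the returned list of lines are additions for illustration and are not part of the original script.

```python
import sys


def countdown(n):
    """Print n, n-1, ..., 1 and then "Blastoff!"; return the lines printed."""
    lines = []
    while n > 0:
        lines.append(str(n))
        n = n - 1
    lines.append("Blastoff!")
    for line in lines:
        print(line)
    return lines


def main(argv):
    # argv[0] is the script name; argv[1] should be the starting number.
    if len(argv) != 2 or not argv[1].isdigit():
        print("usage: python blastoff.py N")
        return 1
    countdown(int(argv[1]))
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Saved as blastoff.py, this could be run with a command such as `python blastoff.py 5`, where 5 is an arbitrary example value for n.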
save this file to write_file.py, and then run it.
save this file to read_file.py, and run it:
python read_file.py
Some text
goes here
(\n inserts a newline)
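The write_file.py and read_file.py scripts themselves are not shown here. As a rough sketch only, assume the first script writes the two lines shown in the output above and the second reads them back. The function names and the example.txt file name are hypothetical, and both steps are combined in one block so that it runs on its own.

```python
import os
import tempfile


def write_file(path):
    # "\n" inserts a newline, so this writes two lines of text.
    with open(path, "w") as f:
        f.write("Some text\ngoes here\n")


def read_file(path):
    # Read the whole file back as a single string.
    with open(path) as f:
        return f.read()


# Use a temporary directory so the sketch does not clutter the working folder.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
write_file(path)
contents = read_file(path)
print(contents, end="")
```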
Introducing Numpy
Numpy Basics
>>> import numpy
>>> a = numpy.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
• Now you can operate on this array all at once rather than iterating over it:
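A small sketch of what operating on the whole array at once means, compared with an explicit Python loop. The variable names are illustrative.

```python
import numpy

a = numpy.arange(10)

# Each operation applies to every element of the array in one step.
doubled = a * 2
squared = a ** 2
total = a.sum()

# The equivalent pure-Python version iterates over the values one at a time.
looped = [x * 2 for x in range(10)]

print(doubled)
print(squared)
print(total)
```

The array forms are shorter to write and, for large arrays, usually much faster than the loop.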
Numpy Computations
• Dividing an integer array by an integer is done as integer math in Python 2, so
you get truncated results. Force it to a float to get the expected result. In
general, keep track of data types in calculations.
>>> a/2.0
array([ 0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])
>>> a.dtype.name
'int32'
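A sketch of keeping track of data types with Numpy arrays, written for Python 3, where `//` is the operator that gives the truncated integer result. The exact integer dtype name (int32 or int64) depends on the platform, so the sketch checks only the dtype kind.

```python
import numpy

a = numpy.arange(10)

floored = a // 2            # integer math: fractional parts are dropped
halved = a / 2.0            # dividing by a float gives a float array
as_float = a.astype(float)  # explicit conversion of the whole array

# dtype.kind is 'i' for signed integers and 'f' for floats.
print(floored.dtype.kind, halved.dtype.kind, as_float.dtype.kind)
```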
• Multiply a and b, element by element (note that this is not matrix multiplication):
>>> a*b
array([[ 0.66270365, 1.20811349, 1.38790865],
[ 0.91277594, 1.09838649, 1.6169034 ]])
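To make the difference between element-by-element and matrix multiplication concrete, here is a sketch with two small 2 x 3 arrays. The values are made up for illustration and are not the arrays used above.

```python
import numpy

a = numpy.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
b = numpy.array([[0.5, 0.5, 0.5],
                 [2.0, 2.0, 2.0]])

# a*b multiplies matching elements, so the result has the same 2 x 3 shape.
elementwise = a * b

# Matrix multiplication needs compatible shapes: (2 x 3) times (3 x 2) gives 2 x 2.
matrix_product = numpy.dot(a, b.T)

print(elementwise)
print(matrix_product)
```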
Sunday, September 12, 2010
P. Waddell, 2010
• The Open Platform for Urban Simulation (OPUS) is a software platform written
using Python and Numpy. UrbanSim is implemented in OPUS.
• Models require extensive and often complex calculations to predict choices of
agents, like household location or whether a parcel should be developed.
OPUS Datasets
• In OPUS, we create ‘Datasets’ that are collections of Numpy arrays describing all
of the elements in a type of entity, like buildings, or parcels, or households
• The data are stored on disk as Numpy arrays, but are moved back and forth
from databases using OPUS tools built into the GUI or from the command line
• There is a one-to-one correspondence between an OPUS Dataset and a
corresponding database Table
• One (odd) convention we have used is that the name of a table in the database
has been plural, while the counterpart OPUS Dataset is singular (this may be
made consistent in the future, but was intended to reduce possible confusion)
• Datasets that are related to each other have some common key: the household
dataset has a building_id, which corresponds to the internal id of the building
dataset -- this allows complex joins and navigation among datasets in memory
• To make much of this easier for users to use and extend, we developed the
‘OPUS Expression Language’
• It is built on Python and Numpy, but adds syntax parsing that makes it possible
to use more natural and concise commands to get complex work done
• Examples:
- parcel.aggregate(building.total_sqft)
  # Sum the sqft in buildings in a parcel
- zone.aggregate(household.income, function=mean, intermediates=[building, parcel])
  # Compute an average household income in a zone, navigating through the
  # relationships that households are in buildings, which are in parcels, which
  # are in zones
- parcel.aggregate(building.total_sqft)/parcel.lot_sqft
  # Calculate a Floor Area Ratio (FAR)
- parcel.disaggregate(zone.population_density)
  # Assign to each parcel in a zone the value of the zone variable population_density
- zone.number_of_agents(job)
  # Count the number of jobs within a zone
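To make concrete what these expressions compute, here is a rough sketch using plain Numpy arrays. It is not how OPUS implements the expression language, and it uses no OPUS API. The toy data, the variable names, and the assumption that ids are 0-based integers usable directly as array indexes are all made up for illustration.

```python
import numpy

# Hypothetical toy data: 2 zones, 3 parcels, 5 buildings, 6 households, 4 jobs.
parcel_zone_id = numpy.array([0, 0, 1])             # zone of each parcel
parcel_lot_sqft = numpy.array([3000.0, 4000.0, 500.0])
building_parcel_id = numpy.array([0, 0, 1, 2, 2])   # parcel of each building
building_total_sqft = numpy.array([1000.0, 500.0, 2000.0, 300.0, 700.0])
household_building_id = numpy.array([0, 1, 2, 3, 4, 4])
household_income = numpy.array([30000.0, 50000.0, 40000.0,
                                60000.0, 80000.0, 100000.0])
job_zone_id = numpy.array([0, 1, 1, 1])
zone_population_density = numpy.array([12.5, 40.0])

# Like parcel.aggregate(building.total_sqft): sum building sqft per parcel.
parcel_building_sqft = numpy.bincount(
    building_parcel_id, weights=building_total_sqft, minlength=3)

# Like parcel.aggregate(building.total_sqft)/parcel.lot_sqft: Floor Area Ratio.
parcel_far = parcel_building_sqft / parcel_lot_sqft

# Like zone.aggregate(household.income, function=mean, intermediates=[...]):
# follow the keys household -> building -> parcel -> zone, then average.
household_zone_id = parcel_zone_id[building_parcel_id[household_building_id]]
zone_income_sum = numpy.bincount(
    household_zone_id, weights=household_income, minlength=2)
zone_household_count = numpy.bincount(household_zone_id, minlength=2)
zone_mean_income = zone_income_sum / zone_household_count

# Like parcel.disaggregate(zone.population_density): copy zone values to parcels.
parcel_population_density = zone_population_density[parcel_zone_id]

# Like zone.number_of_agents(job): count jobs in each zone.
zone_number_of_jobs = numpy.bincount(job_zone_id, minlength=2)

print(parcel_far, zone_mean_income, parcel_population_density, zone_number_of_jobs)
```

Each expression in the list replaces several lines of index bookkeeping like this, which is the kind of work the expression language is meant to hide.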
Assignment 1
• Given the relationships in this diagram
• Assume households have income, persons and cars
• Assume that parcels have lot_sqft, and buildings have total_sqft and residential_units
• Create OPUS Expressions to do the following things:
‣ Calculate the total population by zone
‣ Calculate an average number of cars per household by zone
‣ Calculate the ratio of jobs to population within each city
‣ Calculate the density of employment by zone
‣ Calculate the per capita income by county
‣ Create 4 additional indicators using this
data structure, and assuming you can add
attributes to any of these objects
Due on Monday