Thursday, November 3, 2011

Scope and Variable Binding in Python

1. Python only has two notions of scope -- global and local. This means that if a variable is declared in a function, it is bound to the function. Otherwise it is bound to the global state.

You can inspect everything that is bound globally by calling globals(), and everything that is bound locally by calling locals()

2. In the following post I will investigate variable binding and scope in python via an interactive python session. That is, by example. If you want to follow along, check out the documentation for dir() and dis.

Local modifications stay local

# define a global variable a
In [1]: a = 5
# modify a locally
In [2]: def foo():
...: a = 3
...:
In [3]: foo()
# did global a change?
In [4]: a
Out[4]: 5
# nope.

Global variables can be accessed but not modified

In [10]: a = 5

In [11]: def foo():
....: print a
....:

In [12]: foo()
5

In order to modify a function in global scope, you must declare it to be global first. The relevant difference is explained by the following byte code snippets. In the case where a is a local variable, STORE_FAST is called, whereas in the case where a is a declared global, STORE_GLOBAL is called.

Compare:

In [21]: def foo():
....: a = 3
....:

In [22]: dis.dis(foo.func_code)
2 0 LOAD_CONST 1 (3)
3 STORE_FAST 0 (a)
6 LOAD_CONST 0 (None)
9 RETURN_VALUE

And:

In [24]: def foo():
....: global a
....: a = 3
....:
# modification is successful
In [25]: foo()
In [26]: print a
3
In [27]: dis.dis(foo.func_code)
3 0 LOAD_CONST 1 (3)
3 STORE_GLOBAL 0 (a)
6 LOAD_CONST 0 (None)
9 RETURN_VALUE

Expressions are always in global scope.

As a review, expressions are constructs like if/while/for/try/except. The point below is made with if statements, but would apply equally to any of the others.

# b is first defined inside a trivially true if statement
In [6]: if True:
...: b = 21
...:
# b is still defined globally
In [7]: b
Out[7]: 21

# it does matter that the if statement is evaluated
In [10]: if False:
....: b = 1337
....:
In [11]: b

Out[11]: 21

This contrasts with other languages like C, where variables have block scope.

#include

int main()
{
if(1)
{
int a = 6;
printf("%d\n", a);
}
printf("%d\n", a);
return 0;

}

for example, generates a compile time error: ‘a’ undeclared at the second printf statement.

Nested Function Scope

# let’s see what happens here
In [18]: def foo():
....: a = 5
....: def goo():
....: print a
....: goo()
In [19]: foo()
Out[19]: 5

Inner functions have access to the outer functions’ local variables. This is the basis of functional closures. Similar behavior occurs for lambdas.

In [23]: def foo():
....: a = 5
....: b = (lambda : a)()
....: print b
....:

The fact that functions and lambdas have access to external scope is useful for closures, but also opens up this common bug*:

# define a sequence of functions
In [40]: fs = [(lambda n: i + n) for i in xrange(10)]

# e.g.
In [41]: fs
Out[41]:
[<function __main__.<lambda>>,
<function __main__.<lambda>>,
<function __main__.<lambda>>,
<function __main__.<lambda>>,
<function __main__.<lambda>>,
<function __main__.<lambda>>,
<function __main__.<lambda>>,
<function __main__.<lambda>>,
<function __main__.<lambda>>,
<function __main__.<lambda>>]

# now apply one of the functions in the sequence
In [43]: fs[0](3)

Out[43]: 12 # wait...wtf?

Let’s see exactly what’s happening here, just for fun:

In [49]: [dis.dis(i.func_code) for i in fs]
1 0 LOAD_GLOBAL 0 (i)
3 LOAD_FAST 0 (n)
6 BINARY_ADD
7 RETURN_VALUE
1 0 LOAD_GLOBAL 0 (i)
3 LOAD_FAST 0 (n)
6 BINARY_ADD
7 RETURN_VALUE
.
.
.

<snipped>

The lambda loads i from globals, because i cannot be found within its local scope. But during the construction of the function sequence, i changes. By the time any of the functions are called, i has been incremented to 9, and so fs[0][3] doesn’t add 0 to 3 but rather 9 to 3. This is reflected in the byte code by the LOAD_GLOBAL instruction.

To avoid problems like this one, remember that python will always look for unknown function variables in global (or higher level function) scope.

Class vs Instance vs method scope

Scope behaves in the same way for functions as for classes and objects. In fact, functions are objects. If you have some experience with python, you probably already know how to define class variables and instance variables. Class variables are declared under the class declaration, and outside of any functions. Instance variables are defined attached to self, as follows:

class Foo(object):
# class variable
a_class_var = 'yo yo'
def __init__(self):
# instance variable
self.an_instance_var = 'whattttt'

Brief Exercise for the reader: input the above into ipython, and compare dir(Foo)with dir(Foo())

Consider the following class

class Foo(object):
# class variable
x = 'yo yo'

bar = Foo()

What is bar.x?

What is Foo.x?

Solution: Both are ‘yo yo’

How about this class?

class Foo(object):
x = 'yo yo'
def __init__(self):
# print self.x
self.x = 'whattttt'
# print self.x

bar = Foo()

Now,

What is bar.x?

What is Foo.x?

In this case, bar.x= ‘whattttt’ and Foo.x = ‘yo yo’. If you uncomment the print statements though, you’ll notice that initially self.x = ‘yo yo’. Huh?

Conceptually, we can think of Foo as a template for all of it’s instances. The class variable x, declared as ‘yo yo’ gets copied to the instance bar on creation. self.x in the initializer is not referring to the class’s x, but rather to the instance’s x. We never tamper with the class’s x at all. If you do want to change Foo’s x, you would set self.__class__.blah.

One important thing to note is that creating foo from Foo does a shallow copy of Foo’s internals. So

class Foo2(object):
hi = []
def __init__(self):
self.hi.append('1')

bar2 = Foo2()
print bar2.hi
bar3 = Foo2()
print bar3.hi
print Foo2.hi

produces the output:
[
1]
[1, 1]
[1, 1]

The class Foo2’s copy of hi was modified by its instances because on instance creation, a reference to hi was copied, but not hi itself. You may have seen this error disguised as:

In [1]: def foo(blah = []):
...: blah.append(1)
...: return blah
...:
In [2]: foo()
Out[2]: [1]
In [3]: foo()
Out[3]: [1, 1]
In [4]: foo()
Out[4]: [1, 1, 1]

End

Hopefully these examples have given you some intuition about scope in python. For more detailed explanations of the examples in this post, check out the Python Language Reference. Or try out your own explorations in ipython.

*http://math.andrej.com/2009/04/09/pythons-lambda-is-broken/ inspired by this post. Even though the title of that article seems strongly worded. It appears that Guido agrees: http://www.artima.com/weblogs/viewpost.jsp?thread=98196