Evaluating Python expressions

Python provides several ways to interact with the interpreter from within a program. For example, the eval function evaluates a string as if it were a Python expression. You can pass it a literal, a simple expression, or even an expression that calls built-in functions:

Example: Using the eval function
# File: builtin-eval-example-1.py

def dump(expression):
    result = eval(expression)
    print expression, "=>", result, type(result)

dump("1")
dump("1.0")
dump("'string'")
dump("1.0 + 2.0")
dump("'*' * 10")
dump("len('world')")

1 => 1 <type 'int'>
1.0 => 1.0 <type 'float'>
'string' => string <type 'string'>
1.0 + 2.0 => 3.0 <type 'float'>
'*' * 10 => ********** <type 'string'>
len('world') => 5 <type 'int'>
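
The listing above uses the print statement of older Python releases. As a minimal sketch, assuming Python 3 (the original targets an earlier version), the same idea might look like the following. The helper name describe is hypothetical, and it returns its result instead of printing it so that it is easier to check.

```python
# Assumes Python 3; describe() is a hypothetical variant of dump() above.
def describe(expression):
    """Evaluate a trusted expression string; return its value and type name."""
    result = eval(expression)
    return result, type(result).__name__

for expression in ("1", "1.0", "'string'", "1.0 + 2.0", "'*' * 10", "len('world')"):
    value, type_name = describe(expression)
    print(expression, "=>", value, type_name)
```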

A problem with eval is that if you cannot trust the source from which you got the string, you may get into trouble. For example, someone might use the built-in __import__ function to load the os module, and then remove files on your disk:

Example: Using the eval function to execute arbitrary commands
# File: builtin-eval-example-2.py

print eval("__import__('os').getcwd()")
print eval("__import__('os').remove('file')")

/home/fredrik/librarybook
Traceback (innermost last):
 File "builtin-eval-example-2", line 2, in ?
 File "<string>", line 0, in ?
os.error: (2, 'No such file or directory')

Note that you get an os.error exception, which means that Python actually tried to remove the file!
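
To see the same effect without putting real files at risk, one possible sketch (assuming Python 3) points the expression at a path that is known not to exist and catches the resulting error. The names scratch_dir, missing_path and outcome are made up for this illustration.

```python
# Assumes Python 3; all variable names here are illustrative.
import os
import tempfile

# A path inside a fresh, empty temporary directory, so nothing real is deleted.
scratch_dir = tempfile.mkdtemp()
missing_path = os.path.join(scratch_dir, "no-such-file")

try:
    eval("__import__('os').remove(%r)" % missing_path)
    outcome = "removed"
except OSError as exc:
    # Reaching this branch means eval really did call os.remove.
    outcome = "attempted: " + type(exc).__name__
finally:
    os.rmdir(scratch_dir)

print(outcome)
```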

Luckily, there’s a way around this problem. You can pass a second argument to eval, which should contain a dictionary defining the namespace in which the expression is evaluated. Let’s pass in an empty namespace:

>>> print eval("__import__('os').remove('file')", {})
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "<string>", line 0, in ?
os.error: (2, 'No such file or directory')

Hmm. We still end up with an os.error exception.

The reason for this is that Python looks in the dictionary before it evaluates the code, and if it doesn’t find a variable named __builtins__ in there (note the plural form), it adds one:

>>> namespace = {}
>>> print eval("__import__('os').remove('file')", namespace)
Traceback (innermost last):
  File "<stdin>", line 1, in ?
  File "<string>", line 0, in ?
os.error: (2, 'No such file or directory')
>>> namespace.keys()
['__builtins__']

If you print the contents of the namespace variable, you’ll find that it contains the full set of built-in functions.
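
A short sketch of that check, assuming Python 3, where the inserted value may be either the built-in module or its dictionary (the code below accepts both):

```python
# Assumes Python 3; variable names are illustrative.
namespace = {}
eval("1 + 1", namespace)

# eval has added a __builtins__ entry to the dictionary we passed in.
added = namespace["__builtins__"]
names = added if isinstance(added, dict) else vars(added)

print(sorted(namespace))
print("len" in names, "__import__" in names)
```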

The solution to this little dilemma isn’t far away: since Python doesn’t add this item if it is already there, you just have to add a dummy item called __builtins__ to the namespace before calling eval:

 
Example: Safely using the eval function to evaluate arbitrary strings
# File: builtin-eval-example-3.py

print eval("__import__('os').getcwd()", {})
print eval("__import__('os').remove('file')", {"__builtins__": {}})

/home/fredrik/librarybook
Traceback (innermost last):
  File "builtin-eval-example-3.py", line 2, in ?
  File "<string>", line 0, in ?
NameError: __import__

Note that this doesn’t protect you from CPU or memory resource attacks. For example, something like eval("'*'*1000000*2*2*2*2*2*2*2*2*2") will most likely cause your program to run out of memory after a while.
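
When the strings are only expected to contain plain literals, one alternative worth considering (not part of the original text, and assuming Python 3) is ast.literal_eval from the standard library, which refuses operators such as the repeated multiplication above as well as function calls. The wrapper name try_literal is hypothetical. This is a sketch only: very large or deeply nested input can still consume resources.

```python
# Assumes Python 3; try_literal() is a hypothetical helper.
import ast

def try_literal(text):
    """Return (True, value) for a plain literal, (False, None) otherwise."""
    try:
        return True, ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return False, None

print(try_literal("[1, 2.0, 'three']"))
print(try_literal("'*' * 1000000 * 2 * 2"))      # rejected, never evaluated
print(try_literal("__import__('os').getcwd()"))  # rejected, never evaluated
```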

 

Comment:

This can be too restrictive: harmless built-ins like len will also not work, and you won't have access to global variables. A more flexible solution is to temporarily remove specific things from the global namespace, such as __import__, sys and os:

import sys, os

def safe_evaluate(s):
  # back up __import__, os and sys
  backup = {}
  backup['import'] = globals()['__builtins__'].__import__
  backup['os'] = globals()['os']
  backup['sys'] = globals()['sys']

  # delete them from the global namespace
  del globals()['__builtins__'].__import__
  del globals()['os']
  del globals()['sys']

  try:
    # evaluate your string
    print eval(s)
  finally:
    # restore the global namespace, even if eval raised an exception
    globals()['__builtins__'].__import__ = backup['import']
    globals()['os'] = backup['os']
    globals()['sys'] = backup['sys']

s1 = "len('Hello world')"
s2 = "__import__('os').getcwd()"

safe_evaluate(s1)
safe_evaluate(s2)

Now, s1 will safely be evaluated but s2 will fail.

Posted by Sjoerd de Vries (2006-10-25)
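
Another way to address the "too restrictive" concern raised in this comment, without modifying the real global namespace, is to pass eval a small whitelist of built-ins. The following is a minimal sketch assuming Python 3. The names SAFE_BUILTINS and restricted_eval are hypothetical, and which built-ins count as harmless is a judgment call. It only controls which names are visible, so it remains open to the attribute-based loopholes raised in the later comments.

```python
# Assumes Python 3; SAFE_BUILTINS and restricted_eval are hypothetical names.
SAFE_BUILTINS = {"len": len, "abs": abs, "min": min, "max": max}

def restricted_eval(expression):
    """Evaluate with only the whitelisted built-ins visible by name."""
    # Pass a copy so the expression cannot alter the shared whitelist.
    return eval(expression, {"__builtins__": dict(SAFE_BUILTINS)}, {})

print(restricted_eval("len('Hello world')"))  # prints 11

try:
    restricted_eval("__import__('os').getcwd()")
except NameError as exc:
    print("blocked:", exc)
```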

Comment:

Not sure about the above, really. Removing a module from your globals doesn't necessarily mean that you cannot reach that module's content via some other module (or even indirectly, via a function or class). No time to come up with a counter-example, though.

Posted by Fredrik (2006-10-25)
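
One possible counter-example of the kind alluded to in this comment, sketched under the assumption of Python 3: the namespace below never exposes os by name, yet the expression reaches it through another module that happens to import it. Using shutil is just one convenient choice; any module that imports os at its top level would serve.

```python
# Assumes Python 3; the choice of shutil is illustrative.
import os
import shutil

# No 'os' name and no built-ins are visible, only the shutil module.
namespace = {"__builtins__": {}, "shutil": shutil}

reached = eval("shutil.os", namespace)
print(reached is os)  # prints True
```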

Comment:

Well, the previous thing did not work since I forgot to hide the variables backup and safe_evaluate themselves. I am fairly new to Python, but still, shame on me!

After reading a lot of discussion about the subject, I feel that the only way to make it safe is to meet two conditions:

  • EVERY callable instance in the namespace passed to eval/exec must be individually backed up and restored afterwards.
  • Dangerous types and classes, for example 'file', should be made inaccessible. Built-in types themselves cannot be removed, but references to them in __builtin__ can be.

The first condition can be a lot of work depending on the implementation, but, if done properly, I cannot see a way around it. An attacker can make an object callable, but if it is never called outside eval, it will do no harm.

For the second condition, people have found two loopholes to retrieve dangerous types:

  • The __subclasses__() method
  • The mro() method, which retrieves the type 'object', whose __subclasses__() method returns many dangerous types and classes.

The only way to close those loopholes is to scan the eval'ed/exec'ed string for them.

I hope that this is an accurate description, but maybe there is some loophole that I overlooked.

Posted by Sjoerd de Vries (2006-10-26)
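
The loophole described in this comment can be demonstrated with a short sketch, assuming Python 3. The names PROBE and looks_suspicious are hypothetical. The scan shown is deliberately naive (it rejects any expression containing a double underscore) and is not claimed to close every route.

```python
# Assumes Python 3; PROBE and looks_suspicious are hypothetical names.
PROBE = "().__class__.mro()[-1].__subclasses__()"

# Even with an empty __builtins__, attribute access alone reaches 'object'
# and, through it, the list of its direct subclasses.
subclasses = eval(PROBE, {"__builtins__": {}})
print(len(subclasses), "classes reachable")

def looks_suspicious(expression):
    """Naive scan: flag any expression containing a double underscore."""
    return "__" in expression

print(looks_suspicious(PROBE))
print(looks_suspicious("len('abc')"))
```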

Comment:

Oh, and finally, the exec statement also needs to be disabled.

Posted by Sjoerd de Vries (2006-10-26)
