From Python Closures to Decorators

Published on Sun 05 July 2015 in the Python category

A function defined inside another function is called a nested function, for example:

def Print(msg):
  def doPrint():
    print msg
  doPrint()

print dir(Print)
#['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']
print Print.__closure__
# None
Print('Hello')
# Hello

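Now change the last line of the function above to return doPrint, so that the function doPrint is returned instead of being called. The parameter msg of the outer function, Print2, then becomes part of how doPrint is built: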

def Print2(msg):
  def doPrint():
    print msg
  return doPrint
print dir(Print2)
#['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']
print Print2.__closure__
#None
fn = Print2("Hello____1")
print fn
#<function doPrint at 0x7f168699bf50>
fn()
#Hello____1

The example above is closely related to closures. Wikipedia explains closures as follows:

In programming languages, closures (also lexical closures or function closures) are techniques for implementing lexically scoped name binding in languages with first-class functions. Operationally, a closure is a record storing a function together with an environment: a mapping associating each free variable of the function (variables that are used locally, but defined in an enclosing scope) with the value or reference to which the name was bound when the closure was created. A closure, unlike a plain function, allows the function to access those captured variables through the closure's copies of their values or references, even when the function is invoked outside their scope.

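Simply put, a closure is a function that references free variables. The referenced free variables stay alive together with the function, even after the environment that created them has gone. Wikipedia gives a generic example: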

function startAt(x)
   function incrementBy(y)
       return x + y
   return incrementBy

variable closure1 = startAt(1)
variable closure2 = startAt(5)
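The pseudocode above translates almost directly into Python. The following is a minimal sketch, written for Python 3 (the other snippets in this post use Python 2 print statements); the names start_at and increment_by are simply snake_case renamings of the pseudocode's names.

```python
def start_at(x):
    # x is a free variable of increment_by: used inside it, defined outside it.
    def increment_by(y):
        return x + y
    return increment_by

closure1 = start_at(1)
closure2 = start_at(5)

# Each closure keeps its own captured x, even though start_at has returned.
print(closure1(3))  # 4
print(closure2(3))  # 8
```

closure1 and closure2 share the same code but carry different environments, which is the "function plus environment" record the Wikipedia definition describes.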

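Whenever you call dir on a user-defined function, you will see a built-in attribute named __closure__. The official Python documentation describes it as: "None or a tuple of cells that contain bindings for the function's free variables." In other words, __closure__ is a read-only tuple of cells that hold the free variables bound to the function. Let's look at the __closure__ of the closure fn from the previous example: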

print fn.__closure__
#(<cell at 0x7f8218ee1558: str object at 0x7f8218ee0b10>,)
print fn.__closure__[0].cell_contents
#Hello____1

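From this we can derive the three conditions for a closure in Python:

* There must be a nested function.
* The nested function must reference a variable defined in its enclosing function.
* The enclosing function must return the nested function.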
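To see why the second condition matters, here is a small sketch (Python 3; the function names no_free_variable and with_free_variable are made up for this illustration). Both outer functions nest and return an inner function, but only the one whose inner function references an outer variable ends up with a non-empty __closure__.

```python
def no_free_variable():
    def inner():
        return "constant"   # refers to nothing from the enclosing scope
    return inner

def with_free_variable(msg):
    def inner():
        return msg          # refers to msg from the enclosing scope
    return inner

plain = no_free_variable()
closure = with_free_variable("hi")

print(plain.__closure__)                     # None
print(closure.__closure__[0].cell_contents)  # hi
print(closure.__code__.co_freevars)          # ('msg',)
```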

For more examples and uses of closures, have a look at the source code under the core package of Scrapy, the Python crawling framework.

Python's decorators are built on top of Python's closures.

def outter(fn):
    def inner():
        print "{0:s} called begin....".format(fn.__name__)
        fn()
        print "{0:s} called end....".format(fn.__name__)
    return inner

@outter
def printHello():
    print "Hello"

@outter
def printWorld():
    print "World"

if __name__ == '__main__':
    printHello()

#Output
printHello called begin....
Hello
printHello called end....

Let's see where the closure shows up here:

print printHello
#<function inner at 0x7fd900872b18>
print printHello.__closure__
#(<cell at 0x7fd9008b2558: function object at 0x7fd900872aa0>,)
print printHello.__closure__[0].cell_contents
#<function printHello at 0x7fd900872aa0>

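As you can see, the name printHello now refers to the nested function inner, which is why its metadata (such as __name__) has changed; how to keep a function's metadata intact after decoration is covered later. The nested function inner references the original printHello, so this is a textbook closure. In fact, Python's decorator syntax is just syntactic sugar: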

@outter
def printHello():
  ...

can be written equivalently as printHello = outter(printHello)
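The equivalence can be checked directly. The sketch below is for Python 3 and uses a simplified variant of outter that returns strings instead of printing, purely so the two results can be compared; the names decorated, plain and manual are invented for this illustration.

```python
def outter(fn):
    def inner():
        return "begin " + fn() + " end"
    return inner

@outter
def decorated():
    return "Hello"

def plain():
    return "Hello"

manual = outter(plain)     # what the @ syntax does, written by hand

print(decorated())         # begin Hello end
print(manual())            # begin Hello end
print(decorated.__name__)  # inner
```

Both names end up bound to an inner function, which is also why decorated.__name__ reports inner rather than decorated.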

Decorators with arguments:

def formatPrint(begin, end):
    def create_decorator(fn):
        def inner():
            return begin + fn() + end
        return inner
    return create_decorator


@formatPrint('<', '>')
def foo():
    return "hello, world"

if __name__ == '__main__':
    print foo()

#Output
<hello, world>

When a decorator takes arguments,

@formatPrint('<', '>')
def foo():
  ...

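is equivalent to foo = formatPrint('<', '>')(foo). That is, formatPrint is first called with the arguments '<' and '>' and returns the function create_decorator; from there the logic is the same as for the decorator without arguments above. Let's take a look at foo.__closure__: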

for ele in foo.__closure__:
  print  ele.cell_contents
#Output:
<
>
<function foo at 0x7f6998149b90>
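The two steps can also be spelled out by hand. This is a sketch for Python 3 that reuses formatPrint from above; the helper names original, decorator and captured are introduced only for this illustration. It pairs each cell in __closure__ with its variable name via __code__.co_freevars, which lists the free variable names in the same order as the cells.

```python
def formatPrint(begin, end):
    def create_decorator(fn):
        def inner():
            return begin + fn() + end
        return inner
    return create_decorator

def foo():
    return "hello, world"

original = foo
decorator = formatPrint('<', '>')  # step 1: call with the arguments
foo = decorator(foo)               # step 2: decorate as usual

# Map each free variable name to the contents of its cell.
captured = dict(zip(foo.__code__.co_freevars,
                    (cell.cell_contents for cell in foo.__closure__)))

print(foo())                       # <hello, world>
print(sorted(captured))            # ['begin', 'end', 'fn']
print(captured['fn'] is original)  # True
```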

With several decorators stacked:

@formatPrint('<', '>')
@formatPrint('<', '>')
@formatPrint('<', '>')
def foo():
  ...

is equivalent to foo = formatPrint('<', '>')(formatPrint('<', '>')(formatPrint('<', '>')(foo))), which is why you see the following in foo.__closure__:

for ele in foo.__closure__:
  print  ele.cell_contents
#Output:
<
>
<function inner at 0x7f965f201b18>
# Let's call the referenced function:
foo.__closure__[2].cell_contents()
#Output:
'<<hello, world>>'
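Because all three layers above use the same brackets, the order of application is hard to see. The sketch below (Python 3) gives each layer a different pair of brackets, which is an assumption made purely for illustration, and looks up the fn cell by name instead of relying on its position.

```python
def formatPrint(begin, end):
    def create_decorator(fn):
        def inner():
            return begin + fn() + end
        return inner
    return create_decorator

@formatPrint('(', ')')   # applied last, so it is the outermost layer
@formatPrint('[', ']')
@formatPrint('<', '>')   # closest to the def, so it is applied first
def foo():
    return "hello, world"

print(foo())             # ([<hello, world>])

# The outermost inner captured the next layer down as its fn.
fn_index = foo.__code__.co_freevars.index('fn')
next_layer = foo.__closure__[fn_index].cell_contents
print(next_layer())      # [<hello, world>]
```

The decorator nearest the function definition wraps first, and each outer layer holds the layer beneath it as a free variable.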

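A class can also be used as a decorator, which makes it possible to keep state across calls. For example, to count how many times a function is called: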

class Counter:
    _instanceDict = {}
    def __init__(self, fn):
        self.func = fn
        if fn not in Counter._instanceDict:
            Counter._instanceDict[fn] = 0

    def __call__(self):
        Counter._instanceDict[self.func] += 1
        # Return the wrapped function's result instead of discarding it
        return self.func()

    @staticmethod
    def displayCounts():
        return dict([(f.__name__, Counter._instanceDict[f]) for  f in Counter._instanceDict])

@Counter
def foo():
    pass

@Counter
def bar():
    pass

if __name__ == '__main__':
    for i in range(50):
        foo()

    bar()

    print Counter.displayCounts()
#Output
{'foo': 50, 'bar': 1}                                  

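Following the same reasoning as before, the decorator above is equivalent to foo = Counter(foo), so foo becomes an instance of the class. The class needs __init__ for initialization and __call__ to make its instances callable like functions, which is a bit like overloading operator() in a C++ class.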
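As a variation on the same idea, the state can live on each instance rather than in a class-level dictionary. The following is a sketch for Python 3; the class name CallCounter, the attribute calls and the function add are hypothetical names chosen for this example. It also forwards arbitrary arguments to the wrapped function.

```python
class CallCounter:
    def __init__(self, fn):
        self.func = fn
        self.calls = 0          # per-function state lives on the instance

    def __call__(self, *args, **kwargs):
        self.calls += 1
        # Forward the arguments and pass the result back to the caller.
        return self.func(*args, **kwargs)

@CallCounter
def add(a, b):
    return a + b

print(add(1, 2))                     # 3
print(add(3, 4))                     # 7
print(add.calls)                     # 2
print(isinstance(add, CallCounter))  # True
```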

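For more decorator examples, see the official Python documentation. The first example in that document addresses the problem that a decorated function is no longer itself, mainly because its metadata has changed: it copies the original function's __name__, __module__ and so on back onto the wrapper function and updates the wrapper's __dict__. Python's functools library provides a decorator, wraps, specifically for this problem. Here is its source code: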

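# Python 2.7 defaults: which attributes are copied and which are merged
WRAPPER_ASSIGNMENTS = ('__module__', '__name__', '__doc__')
WRAPPER_UPDATES = ('__dict__',)

def update_wrapper(wrapper,
                   wrapped,
                   assigned = WRAPPER_ASSIGNMENTS,
                   updated = WRAPPER_UPDATES):
    # Copy the listed attributes from the wrapped function onto the wrapper
    for attr in assigned:
        setattr(wrapper, attr, getattr(wrapped, attr))
    # Merge the listed dict attributes (by default just __dict__)
    for attr in updated:
        getattr(wrapper, attr).update(getattr(wrapped, attr, {}))
    return wrapper

def wraps(wrapped,
          assigned = WRAPPER_ASSIGNMENTS,
          updated = WRAPPER_UPDATES):
    # partial here is functools.partial
    return partial(update_wrapper, wrapped = wrapped,
                   assigned = assigned, updated = updated)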

The two implementations are much the same, so I won't go over them in detail here.
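For completeness, here is roughly how wraps is used in practice. This is a minimal sketch for Python 3 that reuses the outter and printHello names from earlier, with the printing stripped out and a docstring added for illustration.

```python
from functools import wraps

def outter(fn):
    @wraps(fn)              # copy fn's metadata onto inner
    def inner():
        return fn()
    return inner

@outter
def printHello():
    """Return a greeting."""
    return "Hello"

print(printHello.__name__)                 # printHello
print(printHello.__doc__)                  # Return a greeting.
print(printHello.__closure__ is not None)  # True
```

The function is still a closure over the original printHello; only the metadata visible from the outside has been restored.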