The Sourcerer
by Kris Kowal.
The Sourcerer has moved! Please visit askawizard.blogspot.com.
Fri, 03 Oct 2008
Designing Django's Object-Relational-Model - The Python Saga - Part 6
Django is a web application framework in the Python language. One of the advantages that Django has over other libraries is that it was written and designed by Python experts. That is to say, they knew about variadic arguments, properties, and metaclasses. Furthermore, they knew how to cleverly use these ideas to sweep a lot of complexity under the hood (or bonnet if you will; some of them appear to be British) so that common developers, or uncommonly good developers who want to think about other things most of the time, can gracefully suspend disbelief that anything complicated is going on when they design their database in pure Python. This article will illustrate how Django uses metaclasses and properties to present an abstraction layer where you can specify a database schema with Python classes. For simplicity, the "database" backend will be plain Python primitive objects—tables will be dictionaries of dictionaries.
In the end, we want to be able to write code that looks a whole lot like it's using Django's Object-Relational-Model:
class Cow(Model):
    id = PrimaryKey()
    name = ModelProperty()

cow = Cow(id = 0, name = 'Moolius')
cow.save()
cow = Cow.objects.get(0)
The easiest part (for the purpose of this exercise) is Django's concept of an "object manager". In Django, every model has an object manager that provides a query API and, depending on the backend, might cache instances of Model objects. Conveniently, a very narrow subset of the object manager API is almost exactly the same as a dictionary. Conceptually, the object manager boils down to a dictionary proxy for the database where you can use the get function to retrieve records from the database. For simplicity, our ObjectManager is just going to be a dictionary.
class ObjectManager(dict):
    pass
Beyond the scope of this article, the ObjectManager should be handy for grabbing lots of objects from the database at once. Django provides a very thorough and relatively well-optimized lazy query system with its object managers. The ObjectManager has get and filter methods which, instead of simply accepting the primary key, accept keyword arguments that translate to predicate logic rules. In particular, the filter function is lazy, so you can chain filter commands to construct complex queries and Django only goes to the database once.
While it would be super-cool to model all of this with native Python, it actually is a lot of code, so that's a topic for maybe later. We'll just use the built-in dict.get.
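For a taste of what "lazy" means here, the following is a minimal sketch in Python 3, not Django's implementation. The LazyQuery class is hypothetical, it only supports exact-match keyword predicates, and the "database" is still a dictionary of dictionaries. Chained filter calls only accumulate predicates; the records are consulted once, when the query is iterated.

```python
class LazyQuery(object):
    """Hypothetical stand-in for a lazy query set; not Django's QuerySet."""

    def __init__(self, records, predicates=()):
        self._records = records              # dict of primary key -> record dict
        self._predicates = tuple(predicates)

    def filter(self, **kws):
        # Return a new query; nothing touches the records yet.
        return LazyQuery(self._records, self._predicates + (kws,))

    def __iter__(self):
        # The "database" is only consulted here, at iteration time.
        for record in self._records.values():
            if all(
                record.get(key) == value
                for predicate in self._predicates
                for key, value in predicate.items()
            ):
                yield record


class ObjectManager(dict):
    def filter(self, **kws):
        return LazyQuery(self).filter(**kws)


cows = ObjectManager()
cows[0] = {'id': 0, 'name': 'Moolius', 'color': 'brown'}
cows[1] = {'id': 1, 'name': 'Bessie', 'color': 'brown'}
query = cows.filter(color='brown').filter(name='Bessie')
# This record is added after the query was built but before it is evaluated.
cows[2] = {'id': 2, 'name': 'Bessie', 'color': 'brown'}
ids = sorted(record['id'] for record in query)
```

Because evaluation is deferred, the record saved after the query was constructed still shows up in the results.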
We'll also need all of the code from Part 5 since models will be another application of the ordered property pattern. This is how Django creates SQL tables with fields in the same order as the Python properties.
from ordered_class import \
    OrderedMetaclass,\
    OrderedClass,\
    OrderedProperty
We use the OrderedMetaclass to make a ModelMetaclass. The model metaclass will have all the same responsibilities as our StructMetaclass, including "dubbing" the properties so that they know their own names. The model metaclass will also create an ObjectManager for the class. This isn't the complete ModelMetaclass; we'll come back to it.
class ModelMetaclass(OrderedMetaclass):
    def __init__(self, name, bases, attys):
        super(ModelMetaclass, self).__init__(name, bases, attys)
        if '_abstract' not in attys:
            self.objects = ObjectManager()
            for name, property in self._ordered_properties:
                # dub accepts only the property's name
                property.dub(name)
The next step is to create a ModelProperty base class. This class will be an OrderedProperty so it's sortable. It will also need to implement the dub method so it can figure out its name. Other than that, it'll be just like the StructProperty from the previous section: it will get and set its corresponding item in the given object.
class ModelProperty(OrderedProperty):
    def __get__(self, objekt, klass):
        return objekt[self.item_name]
    def __set__(self, objekt, value):
        objekt[self.item_name] = value
    def dub(self, name):
        self.item_name = name
        return self
There is a distinction in the refinement of ModelProperty from StructProperty: ModelProperty objects will eventually need to distinguish the value stored in the dictionary from the value returned when you access an attribute. In the primitive case, they're the same, but for ForeignKey objects, down the road, you'll store the primary key for the foreign model instead of the actual object. This is the same as the behavior in an underlying database backend.
class ModelProperty(OrderedProperty):
    def __get__(self, objekt, klass):
        return objekt[self.item_name]
    def __set__(self, objekt, value):
        objekt[self.item_name] = value
    def dub(self, name):
        self.attr_name = name
        self.item_name = name
        return self
Let's consider a PrimaryKey ModelProperty. The purpose of a PrimaryKey is to designate a property of a model that will be used as the index in its object manager dictionary. In Django, this can be an implicit id field at the beginning of the table. For simplicity in this exercise, we'll require every model to explicitly declare a PrimaryKey. The ModelMetaclass will identify which of its ordered properties is the primary key by observing its type. Other than their type, a primary key's behavior is the same as a normal ModelProperty, so it's a really easy declaration:
class PrimaryKey(ModelProperty):
    pass
Now we can go back to our ModelMetaclass and add the code we need for every class to know the name of its primary key. I build a list of the names of the PrimaryKey properties in _ordered_properties and pop off the last one, leaving error checking as an exercise for a more rigorous implementation. There should be only one primary key.
class ModelMetaclass(OrderedMetaclass):
    def __init__(self, name, bases, attys):
        super(ModelMetaclass, self).__init__(name, bases, attys)
        if '_abstract' not in attys:
            self.objects = ObjectManager()
            for name, property in self._ordered_properties:
                property.dub(name)
            self._pk_name = [
                name
                for name, property in self._ordered_properties
                if isinstance(property, PrimaryKey)
            ].pop()
Now all we need is a Model base class. The model base class will just be a dictionary with the model metaclass and a note that it's abstract: that is, it does not have properties so the metaclass better not treat it as a normal model.
class Model(OrderedClass, dict):
    __metaclass__ = ModelMetaclass
    _abstract = True
The model will also have a special pk attribute for accessing the primary key and a save method for committing a model to the ObjectManager.
class Model(OrderedClass, dict):
    __metaclass__ = ModelMetaclass
    _abstract = True
    def save(self):
        self.objects[self.pk] = self
    @property
    def pk(self):
        return getattr(self, self._pk_name)
Now we have all the pieces we need to begin using the API. Let's look at that cow model.
class Cow(Model):
    id = PrimaryKey()
    name = ModelProperty()

cow = Cow(id = 0, name = 'Moolius')
cow.save()
cow = Cow.objects.get(0)
All of this works now. You make a cow model; that invokes the model metaclass that sets up Cow._pk_name to be "id" and tacks on a Cow.objects object manager. Then we make a cow and put it in Cow.objects with the save method. This is analogous to committing it to the database backend. From that point, we can use the object manager to retrieve it again.
We can refine the Model base class to take advantage of the fact that it's not just a dictionary anymore: it's an ordered dictionary. We create a better __init__ method that will let us assign the attributes of our Cow either positionally or with keywords. That makes our cow more like a hybrid of a list and a dictionary. Also, since our model instances aren't merely dictionaries, we create a new __repr__ method that will note that cows are cows and moose are moooose. The new __repr__ method also takes the liberty to write the items in the order in which their properties were declared.
class Model(OrderedClass, dict):
    …
    def __init__(self, *values, **kws):
        super(Model, self).__init__()
        found = set()
        for (name, property), value in zip(
            self._ordered_properties,
            values,
        ):
            setattr(self, name, value)
            found.add(name)
        for name, value in kws.items():
            if name in found:
                raise TypeError("Multiple values for argument %s." % repr(name))
            setattr(self, name, value)
    …
    def __repr__(self):
        return '<%s %s>' % (
            self.__class__.__name__,
            " ".join(
                "%s:%s" % (
                    property.item_name,
                    repr(self[property.item_name])
                )
                for name, property in self._ordered_properties
            )
        )
Now we can make a cow model with positional and keyword arguments, and print it out nice and fancy-like:
>>> Cow(0, name = 'Moolius')
<Cow id:0 name:'Moolius'>
The next step is to introduce ForeignKey model properties. These are properties that will refer, via a relation on a primary key, to an object in another model. So, the ForeignKey class will accept a Model for the foreign model. Its dub method will override the item_name (preserving the attr_name) provided by its super-class's dub method. The new item_name will be the attr_name and the name of the primary key from the foreign table, delimited by an underscore. This will let the foreign key property hide the fact that it does not contain a direct reference to the foreign object; it just keeps the foreign object's primary key. However, if you access the foreign key property on a model instance, it will go off and diligently fetch the corresponding model instance. If you assign to the foreign key property, it'll tolerate either a primary key or an actual instance.
class ForeignKey(ModelProperty):
    def __init__(self, model, *args, **kws):
        super(ForeignKey, self).__init__(*args, **kws)
        self.foreign_model = model
    def __get__(self, objekt, klass):
        return self.foreign_model.objects.get(objekt[self.item_name])
    def __set__(self, objekt, value):
        if isinstance(value, self.foreign_model):
            objekt[self.item_name] = value.pk
        else:
            objekt[self.item_name] = value
    def dub(self, name):
        super(ForeignKey, self).dub(name)
        self.item_name = '%s_%s' % (
            name,
            self.foreign_model._pk_name,
        )
Now we can write code with more than one model using relationships. Let's give our cow a bell.
class Bell(Model):
    id = PrimaryKey()

class Cow(Model):
    id = PrimaryKey()
    name = ModelProperty()
    bell = ForeignKey(Bell)

bell = Bell(0)
bell.save()
cow = Cow(0, 'Moolius', bell)
cow.save()
Note that you must save the bell so that the cow can fetch it from Bell.objects when you access cow.bell. The cow itself only stores the bell's primary key.
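For readers on Python 3, here is a condensed sketch of the whole model layer from this article in one runnable piece. It is a port under stated assumptions, not the original code: the metaclass is declared with the metaclass keyword, a plain dict stands in for ObjectManager, only properties declared directly on the class are collected (no walk over __mro__), and the duplicate-argument check in __init__ is omitted.

```python
from itertools import count

_counter = count()

class ModelProperty(object):
    def __init__(self):
        self._creation_counter = next(_counter)

    def dub(self, name):
        self.attr_name = self.item_name = name
        return self

    def __get__(self, objekt, klass=None):
        return self if objekt is None else objekt[self.item_name]

    def __set__(self, objekt, value):
        objekt[self.item_name] = value

class PrimaryKey(ModelProperty):
    pass

class ForeignKey(ModelProperty):
    def __init__(self, model):
        super().__init__()
        self.foreign_model = model

    def dub(self, name):
        super().dub(name)
        # Store the foreign primary key under a name like "bell_id".
        self.item_name = '%s_%s' % (name, self.foreign_model._pk_name)
        return self

    def __get__(self, objekt, klass=None):
        if objekt is None:
            return self
        return self.foreign_model.objects.get(objekt[self.item_name])

    def __set__(self, objekt, value):
        if isinstance(value, self.foreign_model):
            value = value.pk  # accept an instance or a bare key
        objekt[self.item_name] = value

class ModelMetaclass(type):
    def __init__(cls, name, bases, attys):
        super().__init__(name, bases, attys)
        if '_abstract' in attys:
            return
        cls.objects = {}  # a plain dict stands in for ObjectManager
        cls._ordered_properties = sorted(
            ((key, value) for key, value in attys.items()
             if isinstance(value, ModelProperty)),
            key=lambda pair: pair[1]._creation_counter,
        )
        for key, prop in cls._ordered_properties:
            prop.dub(key)
        cls._pk_name = [
            key for key, prop in cls._ordered_properties
            if isinstance(prop, PrimaryKey)
        ].pop()

class Model(dict, metaclass=ModelMetaclass):
    _abstract = True

    def __init__(self, *values, **kws):
        super().__init__()
        for (name, _), value in zip(self._ordered_properties, values):
            setattr(self, name, value)
        for name, value in kws.items():
            setattr(self, name, value)

    @property
    def pk(self):
        return getattr(self, self._pk_name)

    def save(self):
        self.objects[self.pk] = self

class Bell(Model):
    id = PrimaryKey()

class Cow(Model):
    id = PrimaryKey()
    name = ModelProperty()
    bell = ForeignKey(Bell)

bell = Bell(0)
bell.save()
cow = Cow(0, 'Moolius', bell)
cow.save()
```

After this runs, the cow's underlying dictionary holds bell_id rather than the bell itself, while cow.bell hands back the saved Bell instance.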
There's more to Django's ORM, of course. This article doesn't cover parsing and validation, which are both assisted by the ORM. Nor does it cover queries, query sets, the related_name for ForeignKey properties on foreign models, Django's ability to use strings for forward references to models that have not yet been declared, or many of the other really neat features.
What this article does cover, though, is that you can create a powerful abstraction of a proxied database with pure Python in less than 200 lines of code. This means that you could create a lightweight proxy over HTTP to a Django database that exposes itself with a REST API. You could also create an abstraction layer that would allow you to pump Django ORM duck-types back into Django to use pure Python objects in addition to or instead of a database backend.
But, if this article does nothing else, I hope it communicates that Django is cool. I have read a whole lot of code from every dark corner of the web and I have liked very little of it; people I've worked with will testify that I've regularly "hated on" every library or framework I've ever seen. I've never met Simon Willison or the growing developer community around Django. However, I've read their code and now I can tell you, over the course of several articles, that they're really smart and you should use their code.
This entry was posted on Fri, 03 Oct 2008 at 15:13.
Wed, 01 Oct 2008
Ordered Properties - The Python Saga - Part 5
In C and SQL, structs and records have properties that appear in a particular order. In Python, objects use hash-tables to store their attributes, so the order is neither deterministic nor relevant. That's no consolation if you're trying to model C structs or SQL tables in Python though. Reading through the Django code, I discovered that those wily coders had synthesized a technique that combines the virtues of properties and metaclasses so that we can generally model the order in which fields of a struct or SQL table appear.
The trick is to provide an API that allows your users to write code like:
from time import gmtime
from cstruct import Struct, IntegerProperty

class EpochTime(Struct):
    seconds = IntegerProperty()
    microseconds = IntegerProperty()

epoch_time = EpochTime()
timestruct = gmtime(epoch_time.seconds)
In this case, the Struct type cares about the size, order, and disposition of the C structure it models so that it can unpack a byte buffer presumably retrieved from a C-type library.
There are two components to the general solution. First, there's an OrderedProperty base type. Ordered properties track the order in which they were initialized. For this purpose, ordered properties use a global counter. The absolute value of a property's creation counter is irrelevant. The only requirement is that they monotonically increase as each property is declared, and that classes declared concurrently do not interfere.
from itertools import count
next_counter = count().next
I learned about itertools from Brett Cannon who was preparing his thesis defense when I was at Cal Poly. In another piece of code, which I would cite if I could recall, I learned the trick of using the itertools.count function to atomize a global counter. It takes advantage of Python's dubious global interpreter lock.
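In Python 3, iterators no longer have a .next method; the built-in next function plays that role. A small sketch of the same counter idea under that assumption (the _counter name is mine):

```python
from itertools import count

_counter = count()

def next_counter():
    # The builtin next() replaces the Python 2 .next method on iterators.
    return next(_counter)

first, second, third = next_counter(), next_counter(), next_counter()
```

Only the ordering of the values matters for ordered properties, which is why it does not matter where the counter starts.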
Then the OrderedProperty base type just stores a creation counter upon initialization. Keep in mind that all derived types must trickle their __init__ call—your base types are not always what they seem and there are use cases you can not foresee where you might be required to receive and pass your initialization arguments to an unknown super-type. That's a side-effect of working in a language with mix-ins, and it's a "good thing".
class OrderedProperty(object):
    def __init__(self, *args, **kws):
        self._creation_counter = next_counter()
        super(OrderedProperty, self).__init__(*args, **kws)
The next part of the puzzle is inspecting your property order. We use a metaclass to note when new types are created and we give them an _ordered_properties attribute with the ordered-property items in their attributes. Keep in mind that base types might have properties too. __mro__ is an attribute of all classes that is a linearized list of a class's inheritance hierarchy. It's effectively the result of a dynamic topological sort algorithm for the closure of your base types. We traverse it backwards so that if you create a dictionary via dict(_ordered_properties), name collisions are resolved chronologically.
class OrderedMetaclass(type):
    def __init__(self, name, bases, attys):
        super(OrderedMetaclass, self).__init__(name, bases, attys)
        self._ordered_properties = sorted(
            (
                (name, value)
                for base in reversed(self.__mro__)
                for name, value in base.__dict__.items()
                if isinstance(value, OrderedProperty)
            ),
            key = lambda (name, property): property._creation_counter,
        )
Then, we create our OrderedClass that we can inherit to get free _ordered_properties attributes.
class OrderedClass(object):
    __metaclass__ = OrderedMetaclass
Let's see it in action:
class Foo(OrderedClass):
    bar = OrderedProperty()
    baz = OrderedProperty()

Foo._ordered_properties == [
    ('bar', <Ordered Property instance somewhere>),
    ('baz', <Ordered Property instance somewhere>),
]
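The code in this article targets Python 2. As a sketch of how the same pattern might look in Python 3, where the __metaclass__ attribute is replaced by a metaclass keyword in the class statement and lambdas can no longer unpack tuple arguments, consider the following. The property names zebra, apple, and mango are only there to show that declaration order wins over alphabetical order.

```python
from itertools import count

_counter = count()


class OrderedProperty(object):
    def __init__(self, *args, **kws):
        self._creation_counter = next(_counter)
        super().__init__(*args, **kws)


class OrderedMetaclass(type):
    def __init__(cls, name, bases, attys):
        super().__init__(name, bases, attys)
        cls._ordered_properties = sorted(
            (
                (key, value)
                # Walk the MRO backwards, base types first.
                for base in reversed(cls.__mro__)
                for key, value in vars(base).items()
                if isinstance(value, OrderedProperty)
            ),
            # Python 3 lambdas cannot unpack tuples, so index the pair.
            key=lambda pair: pair[1]._creation_counter,
        )


class OrderedClass(metaclass=OrderedMetaclass):
    pass


class Foo(OrderedClass):
    zebra = OrderedProperty()
    apple = OrderedProperty()


class Bar(Foo):
    mango = OrderedProperty()


foo_names = [name for name, _ in Foo._ordered_properties]
bar_names = [name for name, _ in Bar._ordered_properties]
```

Bar picks up the properties of Foo through the walk over __mro__, and its own property sorts after them because it was created later.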
With a small modification, we can track ordered inner classes too.
class OrderedMetaclass(type):
    def __init__(self, name, bases, attys):
        super(OrderedMetaclass, self).__init__(name, bases, attys)
        self._creation_counter = next_counter()
        self._ordered_properties = sorted(
            (
                (name, value)
                for base in reversed(self.__mro__)
                for name, value in base.__dict__.items()
                if isinstance(value, OrderedProperty)
                or isinstance(value, OrderedMetaclass)
            ),
            key = lambda (name, property): property._creation_counter,
        )
So, we put all of those order classes in our generic ordered properties module. Now we can delve into our particular Struct example.
First, we need field types. We will need a base type and derivative types for all of the field types we want to model from C, like IntegerProperty. Bear in mind that OrderedProperty is just a mix-in; it doesn't actually define __get__ and its ilk because those are all specific to the derivative implementation. All struct fields are going to store their actual values in a special _values dictionary on their corresponding Struct instance.
class StructProperty(OrderedProperty):
    def __get__(self, objekt, klass):
        return objekt._values[self.name]
    def __set__(self, objekt, value):
        objekt._values[self.name] = value
    def dub(self, name):
        self.name = name
        return self
You'll notice that instances of StructProperty need to know their name, the key in their corresponding instance's _values dictionary. The struct property instances don't get this from their initializer. Instead, each property will get "dubbed", given a name, when the StructMetaclass visits its _ordered_properties.
Let's just look at an integer.
class IntegerProperty(StructProperty):
    _format = 'i'
The only thing special about an integer is that its format specifier for the pack routine is "i".
We're going to want to use a metaclass again, this time inheriting from our OrderedMetaclass. The purpose of this metaclass will be to analyze the ordered properties, construct an aggregate format specifier for pack and unpack methods, and to dub each of its properties with their name.
class StructMetaclass(OrderedMetaclass):
    def __init__(self, name, bases, attys):
        super(StructMetaclass, self).__init__(name, bases, attys)
        for name, property in self._ordered_properties:
            property.dub(name)
        self._format = "".join(
            property._format
            for name, property in self._ordered_properties
        )
All that remains is to create our API base class, Struct. This class initializes its values and declares its metaclass.
from struct import unpack

class Struct(OrderedClass):
    __metaclass__ = StructMetaclass
    def __init__(self, *args, **kws):
        super(Struct, self).__init__(*args, **kws)
        self._values = {}
    def unpack(self, value, prefix = None):
        if prefix is None: prefix = ""
        for (name, property), value in zip(
            self._ordered_properties,
            unpack(prefix + self._format, value)
        ):
            self._values[name] = value
So, now our epoch time example is possible using Struct and IntegerProperty, completely oblivious to the machinations behind the scenes.
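from struct import pack
from time import gmtime
from datetime import datetime

class EpochTime(Struct):
    seconds = IntegerProperty()
    microseconds = IntegerProperty()

epoch_time = EpochTime()
# The struct has no values until it unpacks a buffer; this sample
# buffer is illustrative only and would normally come from a C library.
epoch_time.unpack(pack('ii', 1222893480, 0))
timestruct = gmtime(epoch_time.seconds)
datestruct = (timestruct[:6] + (epoch_time.microseconds,))
print datetime(*datestruct)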
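Here is a condensed, self-contained sketch of the struct example for Python 3. It makes a few simplifying assumptions that are mine rather than the article's: the ordering logic is folded into StructMetaclass, only properties declared directly on the class are collected, and the byte buffer is produced locally with struct.pack instead of coming from a C library.

```python
import struct
from itertools import count

_counter = count()

class StructProperty(object):
    _format = ''

    def __init__(self):
        self._creation_counter = next(_counter)

    def dub(self, name):
        self.name = name
        return self

    def __get__(self, objekt, klass=None):
        if objekt is None:
            return self
        return objekt._values[self.name]

    def __set__(self, objekt, value):
        objekt._values[self.name] = value

class IntegerProperty(StructProperty):
    _format = 'i'

class StructMetaclass(type):
    def __init__(cls, name, bases, attys):
        super().__init__(name, bases, attys)
        cls._ordered_properties = sorted(
            ((key, value) for key, value in attys.items()
             if isinstance(value, StructProperty)),
            key=lambda pair: pair[1]._creation_counter,
        )
        for key, prop in cls._ordered_properties:
            prop.dub(key)
        # Aggregate format string, in declaration order.
        cls._format = ''.join(
            prop._format for _, prop in cls._ordered_properties
        )

class Struct(metaclass=StructMetaclass):
    def __init__(self):
        self._values = {}

    def unpack(self, data, prefix=''):
        values = struct.unpack(prefix + self._format, data)
        for (name, _), value in zip(self._ordered_properties, values):
            self._values[name] = value

class EpochTime(Struct):
    seconds = IntegerProperty()
    microseconds = IntegerProperty()

# Sample buffer for illustration: two little-endian 32-bit integers.
data = struct.pack('<ii', 1222893480, 250000)
epoch_time = EpochTime()
epoch_time.unpack(data, prefix='<')
```

The prefix argument carries the byte-order character of the struct module, so the same class can read either little-endian or big-endian buffers.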
This entry was posted on Wed, 01 Oct 2008 at 20:38.
Tue, 30 Sep 2008
Metaclasses - The Python Saga - Part 4
The original type function, whose behavior is preserved in modern Python 2.5, accepts an object and returns the class, albeit the type, that would emit it. It's like the typeof operator in JavaScript that returns the String name of the primitive type of an object, or the C++ function that returns a pointer to an object's virtual function table. They're all sufficient for comparing apples to oranges, but all of them are also insufficient for the more interesting comparison of apples to the idea of a Fuji apple: the question, "Does your type inherit from this?", that can be accomplished with Python's isinstance, JavaScript's instanceof, or C++'s infernal dynamic_cast. So, type's single argument behavior is effectively retired.
At some transcendental moment, somebody deeply involved in the Python project must have been thinking, "Well, if functions and classes return objects, what returns a class? Could a class, like a property, be syntactic sugar for some deeply metaphysical latent behavior in pure Python?". I figure this is how the type function grew its new wings.
So consider a class declaration:
class Foo(object):
    bar = 10
    def __init__(self, bar = None):
        if bar is not None:
            self.bar = bar
This is what is actually happening behind the curtains:
name = 'Foo'
bases = (object,)
bar = 10
def __init__(self, bar = None):
    if bar is not None:
        self.bar = bar
attys = {'bar': bar, '__init__': __init__}
Foo = type(name, bases, attys)
That is to say, there is no magic in the syntax. Ultimately all of the magic happens when you call type. By "magic" I mean functionality that cannot be replicated in pure Python without the interpreter's intervention.
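To check that claim, here is a small runnable sketch that builds the same class by calling type directly with a name, a tuple of bases, and an attribute dictionary. It is written for Python 3 but the three-argument call is the same in Python 2.

```python
def __init__(self, bar=None):
    if bar is not None:
        self.bar = bar

# Equivalent to the class statement: name, bases, attribute dictionary.
Foo = type('Foo', (object,), {'bar': 10, '__init__': __init__})

default_foo = Foo()      # falls back to the class attribute
custom_foo = Foo(20)     # overrides it on the instance
```

The resulting Foo behaves like any class written with the class statement, including isinstance checks.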
The type function returns a type: a function that returns new instances. It's also called a "metaclass". type just happens to also be the implied metaclass of object. That is to say, you can create your own metaclasses.
The big question about metaclasses is, "Why on earth would you want to define a metaclass?". David Mertz from IBM wrote that you would simply know when you needed them. Since I read that article, I've racked my brain for a reason to use metaclasses to no avail. At some point, I was reading Django's ORM code and it occurred to me that the reason you would want to define a metaclass is to provide a class in your API that, when subclassed by unsuspecting users, would invoke certain preparations without their knowledge or consent. Here's how:
Define a metaclass. The best way to define a metaclass is to inherit type and override its __init__ method.
class FooType(type):
    def __init__(self, name, bases, attys):
        super(FooType, self).__init__(name, bases, attys)
        print '%s was declared!' % name
Define a base class for your API. The trick here is that you can override its metaclass. Let's look at this one in an interactive interpreter:
>>> class Foo(object):
...     __metaclass__ = FooType
...
Foo was declared!
>>>
Whoa! You didn't call anything. Not true. Here's what actually happened:
name = 'Foo'
bases = (object,)
attys = {}
attys['__metaclass__'] = FooType
Foo = attys.get('__metaclass__', type)(name, bases, attys)
Python checks your attributes for a metaclass before defaulting to type.
That means that your FooType.__init__ got called. Hot damn. I wonder what happens if you create a subclass.
>>> class Bar(Foo):
...     pass
...
Bar was declared!
>>>
Whoa! I totally inherited a metaclass.
So, the reason for writing a metaclass is that metaclasses give you an opportunity to get and manipulate your derived class objects before anyone instantiates them. You get to do this once, right after the class dictionary is fully populated. You can take this opportunity to monitor class declarations, to prepare additional attributes, or to interpolate additional base types.
Keep in mind that metaclasses are jealous. If you create a metaclass for a type that inherits from base classes in someone else's API, your metaclass must inherit from their metaclass. I suspect that it's best not to assume that your base types use a particular metaclass. Thankfully, you can use an expression for your base type.
class FooType(getattr(Bar, '__metaclass__', type)):
    pass

class Foo(Bar):
    __metaclass__ = FooType
This takes advantage of the Python idiom of accessor methods like dict.get and getattr that accept a default-if-none-exists argument. Unfortunately, Python's object doesn't explicitly state that type is its metaclass. Otherwise, you could safely say:
class FooType(Bar.__metaclass__):
    pass
Such things are to be looked for in Python 3. I find that the Python developers have either, after considerable review and debate, already accepted or rejected most of my ideas before I even consider them, so I'm not even going to check for a PEP on this one.
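One way to get at a class's metaclass without guessing is the single-argument form of type: called on a class, it returns that class's metaclass, whether or not one was declared explicitly. The sketch below uses Python 3 syntax; BarType, Bar, and Plain are hypothetical stand-ins for types from someone else's API.

```python
class BarType(type):
    """Stand-in for a metaclass defined in someone else's API."""


class Bar(metaclass=BarType):
    pass


class Plain(object):
    pass


# type(Bar) is Bar's metaclass, so FooType inherits from it safely.
class FooType(type(Bar)):
    declared = []

    def __init__(cls, name, bases, attys):
        super().__init__(name, bases, attys)
        FooType.declared.append(name)


class Foo(Bar, metaclass=FooType):
    pass
```

For a class with no explicit metaclass, such as Plain, the same expression simply yields type.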
This entry was posted on Tue, 30 Sep 2008 at 22:45.
Mon, 29 Sep 2008
Properties - The Python Saga - Part 3
Properties come out of a tired programming language genesis. In the beginning, there were structs. The trouble with structs was that an opaque data structure could not programmatically monitor or intercept access and mutation of its member data.
So that's not a big deal; we could solve the problem with classes. The best practice to avoid programming yourself into a corner was to never expose a datum; you would write accessor and mutator functions, whether you needed them at the moment or not. Thus, as your design grew, you could eventually do nice things like validation, observation, or proxying. The trouble with this approach was that you had to write six times as much code on the off chance you'd need to extend it some day. But it was worth it.
The idea of managed properties came along eventually in various languages (Python, C#, some implementations of JavaScript, and recent versions of [C]). The notion is that you would initially write all of your classes like structs with member data camped in public view. You would encourage your API consumers to interact with those members directly. Then, as need arose, you would subvert the member variables with property objects. These objects would intercept accesses and mutations with functions that you could write at any time of your design process.
Let's observe this design shift in Python. Here's a class with unmanaged data:
class Foo(object):
    def __init__(self):
        self.bar = 10
Here's some other fellow's code that uses your class:
foo = Foo()
foo.bar = 20
print foo.bar
del foo.bar
So there you have it. Just to keep on the same page, the idea at this point is to add a feature to Python that permits both of those code samples to work and, in fact, be perfectly cromulent. However, we also want to eventually add features to Foo such that its bar attribute can be managed, validated, proxied, secured, or outright lied about. Enter property. property is a function that accepts an accessor function and optional mutator and deleter functions. The property must be a class attribute to work. Here's how you would use a property:
class Foo(object):
    def __init__(self):
        self.bar = 10
    # property calls these with the instance only (plus the value for the setter)
    def get_bar(self):
        return self.baz / 2
    def set_bar(self, value):
        self.baz = value * 2
    def del_bar(self):
        del self.baz
    bar = property(get_bar, set_bar, del_bar)
Now we have a Foo class that transparently maintains the invariant that "bar" will always be half of "baz".
Sometimes you don't need to have a setter for a property, and you almost never need a deleter. For the common case, you can use the property function as a decorator.
class Foo(object):
    def __init__(self):
        self.baz = 20
    @property
    def bar(self):
        return self.baz / 2
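If you later need a mutator after all, Python 2.6 and newer let you stay in decorator form: a property object has a setter method that can itself be used as a decorator. Here is a sketch, run under Python 3, that adds validation to bar; the rule that bar may not be negative is an invented example.

```python
class Foo(object):
    def __init__(self):
        self.baz = 20

    @property
    def bar(self):
        return self.baz / 2

    @bar.setter
    def bar(self, value):
        # Hypothetical validation rule, just to show interception.
        if value < 0:
            raise ValueError("bar must not be negative")
        self.baz = value * 2


foo = Foo()
foo.bar = 15
```

Callers still write foo.bar = 15 as if bar were plain data, which is the whole point of the design shift described above.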
Creating the property function.
So, it's easy to assume that the property function does all the magic behind the scenes, setting up traps in your class's accessor and mutator paths. There's actually another layer of code that can be done entirely in Python. That is, we can implement the property function in pure Python. The trick is that the property function is actually a type or factory method (who cares which) that returns a Python duck-type: a property object. A property object is any object that implements __get__, __set__, or __delete__. These are special magic Python functions that intercept access, mutation, and deletion on members. All you have to do is install an object on a class with one of these methods defined, with the name of the member you want to manage. The property function just handles the common cases. Let's redefine the property function in Python, as the Property class.
class Property(object):
    def __init__(self, fget):
        self.fget = fget
    def __get__(self, objekt, klass):
        return self.fget(objekt)
This defines enough of the Property object to decorate an accessor function.
class Foo(object):
    def __init__(self):
        self.baz = 20
    @Property
    def bar(self):
        return self.baz / 2
Here's a full implementation of Property. You will note that, in order to exactly emulate the property object, the __init__ method has the same argument names as the internal property so that code that uses keyword arguments will function in perfect ambivalence.
class Property(object):
    def __init__(
        self,
        fget,
        fset = None,
        fdel = None,
        doc = None,
    ):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        self.__doc__ = doc
    def __get__(self, objekt, klass):
        return self.fget(objekt)
    def __set__(self, objekt, value):
        self.fset(objekt, value)
    # The deletion hook is __delete__; __del__ is the finalizer.
    def __delete__(self, objekt):
        self.fdel(objekt)
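To see the full protocol at work, here is a runnable sketch of a Property class exercised through access, assignment, and deletion. Note that the deletion hook in the descriptor protocol is named __delete__. The guards that raise AttributeError when no setter or deleter was given, and the early return for access on the class itself, are additions of mine that approximate what the built-in property does.

```python
class Property(object):
    def __init__(self, fget, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        self.__doc__ = doc

    def __get__(self, objekt, klass=None):
        if objekt is None:
            return self  # accessed on the class, not an instance
        return self.fget(objekt)

    def __set__(self, objekt, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(objekt, value)

    def __delete__(self, objekt):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(objekt)


class Foo(object):
    def __init__(self):
        self.baz = 20

    def get_bar(self):
        return self.baz // 2

    def set_bar(self, value):
        self.baz = value * 2

    def del_bar(self):
        del self.baz

    bar = Property(get_bar, set_bar, del_bar)


foo = Foo()
initial = foo.bar      # goes through __get__
foo.bar = 7            # goes through __set__
doubled = foo.baz
del foo.bar            # goes through __delete__
```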
This entry was posted on Mon, 29 Sep 2008 at 23:34.
Sun, 28 Sep 2008
Decorators - The Python Saga - Part 2
Python introduced a short-hand for the adapter pattern on functions. You can "decorate" a function with another function. This is a neat tool you can use to factor out some common code from a bunch of functions. You can fiddle with the arguments, return values, or intercept exceptions thrown by any function you decorate.
The canonical example is a memoize decorator. The idea is to generalize the notion of memoization so you can simply subscribe to it in any function you want to memoize.
def factorial(n):
    if n == 1: return 1
    return n * factorial(n - 1)
factorial = memoize(factorial)
You accomplish this by writing the memoize decorator. A decorator is a function that accepts a function and returns another. Python virtuously provides a shorthand for taking the function, decorating it, and assigning it to a variable with the same name.
@memoize
def factorial(n):
    if n == 1: return 1
    return n * factorial(n - 1)
In the imagined normal case of decorators, the returned function accepts the same arguments and returns the same kinds of values as the accepted function. However, a decorator does have the liberty of extending or restricting that interface, like accepting additional arguments or raising an exception if the arguments are of the wrong type. It might also perform some common computation on the original arguments and pass the result to the original function as an additional argument. In any case, you can use some closures to create a decorator:
def memoize(function):
    cache = {}
    def decorated(*args):
        if args not in cache:
            cache[args] = function(*args)
        return cache[args]
    return decorated
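To watch the cache do its job, here is a runnable sketch of the same decorator with a list that records every time the real function body runs. The calls list is instrumentation I added for the demonstration, and functools.wraps is used so the decorated function keeps its original name; neither is part of the decorator above.

```python
from functools import wraps


def memoize(function):
    cache = {}

    @wraps(function)  # preserve the wrapped function's name and docstring
    def decorated(*args):
        if args not in cache:
            cache[args] = function(*args)
        return cache[args]
    return decorated


calls = []  # records each argument the undecorated body actually sees


@memoize
def factorial(n):
    calls.append(n)
    if n <= 1:
        return 1
    return n * factorial(n - 1)


first = factorial(5)    # computes 5, 4, 3, 2, 1
second = factorial(5)   # answered entirely from the cache
third = factorial(6)    # computes only 6, reuses the cached 5
```

Because the recursive call goes through the decorated name, the intermediate results are cached too, so the third call only has to do one new multiplication.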
Of course, that's too simple. A lot of things you put after the "@" symbol are just functions that return decorators so that they can be configured with arguments. For example, you probably want to make a memoize decorator that lets you specify your own cache object. So, you need another layer of indirection.
def memoize(cache = None):
    if cache is None: cache = {}
    def decorator(function):
        def decorated(*args):
            if args not in cache:
                cache[args] = function(*args)
            return cache[args]
        return decorated
    return decorator

@memoize({})
def factorial(n):
    if n == 1: return 1
    return n * factorial(n - 1)
Since, in Python, functions, objects, and types are indistinguishable to the casual observer, you can do the exact same thing with a class, although I shudder to think that you might want to forgo the simplicity and elegance of closures. After the transform, the previous code might look like this:
class memoize(object):
    def __init__(self, cache = None):
        self.cache = cache
    def __call__(self, function):
        return Memoized(function, self.cache)

class Memoized(object):
    def __init__(self, function, cache = None):
        if cache is None: cache = {}
        self.function = function
        self.cache = cache
    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.function(*args)
        return self.cache[args]

@memoize()
def factorial(n):
    if n == 1: return 1
    return n * factorial(n - 1)
So now you can use a Least Recently Used Cache, assuming it is a dictionary-like-object (a duck-dict, if you will):
from lru_cache import LruCache

@memoize(LruCache(max_size = 100, cull = .25))
def factorial(n):
    if n == 1: return 1
    return n * factorial(n - 1)
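The memoize decorator only needs three things from its cache: membership tests, item assignment, and item lookup. As a sketch of a duck-dict that satisfies that contract, here is a hypothetical LruCache built on collections.OrderedDict for Python 3. It is not the lru_cache module imported above, and it models only max_size, not the cull parameter.

```python
from collections import OrderedDict


class LruCache(object):
    """Hypothetical duck-dict; only max_size is modeled here."""

    def __init__(self, max_size=100):
        self.max_size = max_size
        self._items = OrderedDict()

    def __contains__(self, key):
        return key in self._items

    def __getitem__(self, key):
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def __setitem__(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)   # evict least recently used

    def __len__(self):
        return len(self._items)


def memoize(cache=None):
    if cache is None:
        cache = {}
    def decorator(function):
        def decorated(*args):
            if args not in cache:
                cache[args] = function(*args)
            return cache[args]
        return decorated
    return decorator


cache = LruCache(max_size=3)


@memoize(cache)
def square(n):
    return n * n


results = [square(n) for n in (1, 2, 3, 4)]
```

After the fourth call the cache is over its limit, so the entry for the oldest argument is dropped while the results stay correct.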
Download decorators.zip.
This entry was posted on Sun, 28 Sep 2008 at 21:42.
Variadic Positional and Keyword Arguments - The Python Saga - Part 1
Python supports "variadic" arguments. Variadic arguments are the man behind the curtain for C's printf function. The idea is that a function can accept a variable number of positional arguments, the values to put in your format string. In C this is accomplished with an ellipsis, ..., and some VA macro-linked-list-stuff that I always have to look up. Python goes a couple steps further with variadic arguments and the results are stunning, orthogonal, and actually useful almost every day. With Python, you get both "positional" arguments, like C, and keyword arguments: those arguments that conceptually map, in any order, to the names of the arguments in your function's declaration. The magic symbols are "*" and "**" for positional and keyword arguments respectively. With one "*", you can declare a function that accepts any number of arguments, which arrive as a tuple under the declared name:
def foo(*args):
    return args
# args is a tuple, not a list
assert foo(1, 2, 3) == (1, 2, 3)
You can also pass a list of positional arguments to a function with very similar syntax:
def foo(a, b, c):
    return [a, b, c]
assert foo(*[1, 2, 3]) == [1, 2, 3]
And you can do the same thing with keyword arguments except you use dictionaries:
def foo(**kwargs):
    return kwargs
assert foo(a = 10, b = 20, c = 30) == {'a': 10, 'b': 20, 'c': 30}
Likewise, you can pass keyword arguments:
def foo(a, b, c):
    return [a, b, c]
assert foo(**{'a': 10, 'b': 20, 'c': 30}) == [10, 20, 30]
You can use them in combination, along with default arguments to provide beautiful, orthogonal, reusable abstractions:
def foo(a, b = None, c = None, d = None):
    return [a, b, c, d]
assert foo(*[1, 2], **{'c': 3}) == [1, 2, 3, None]

def bar(a, b, c, *args, **kws):
    return [a, b, c], args, kws
# leftover positionals land in the args tuple, leftover keywords in kws
assert bar(1, 2, 3, 4, 5, f = 6) == ([1, 2, 3], (4, 5), {'f': 6})
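One everyday use of the combination is a wrapper that forwards whatever it receives to another function without knowing that function's signature. Here is a small sketch; the logged helper and its calls attribute are invented for illustration.

```python
def logged(function):
    calls = []

    def wrapper(*args, **kws):
        # args is a tuple of positionals, kws a dict of keywords.
        calls.append((args, kws))
        # Forward both, unchanged, to the wrapped function.
        return function(*args, **kws)

    wrapper.calls = calls
    return wrapper


def foo(a, b=None, c=None, d=None):
    return [a, b, c, d]


foo = logged(foo)
result = foo(1, 2, d=4)
```

The wrapper never names a, b, c, or d, yet the defaults and keyword matching of foo still apply because the arguments are re-expanded with "*" and "**" at the call.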
This entry was posted on Sun, 28 Sep 2008 at 00:09.