Generic Functions and CLOS - Confession 11

2014.07.16 07:34:16

One of the more difficult concepts to explain is that of generic functions, the way Object Oriented Programming (OOP) is done in Common Lisp. Since a lot of people seem to be confused about it even after having read Practical Common Lisp (PCL), I will try my best to explain it anew here.

For those who have never touched OOP before, the idea behind it is to encapsulate code in packages called objects. These objects have, like structs, fields that can store data. The crux now is that you have functions that act differently depending on which object they're called with. This methodology is expressed differently depending on the language. In a lot of them, such as Java, C#, C++ and Python, the functions are declared as part of the class. They belong to the class, so to speak, and you reach these methods the same way you would access a field, except that you can call them as functions.

However, this is not the only way it can be done. Instead of methods that belong to the class, another methodology would be to have generic functions that are called like any other function, except that they dispatch to a method depending on the arguments they receive. This means that while the generic function looks and is called like any other, it is specialised depending on the kind of arguments you pass into it.

This approach leads to a couple of very interesting and useful extensions that would not be possible in the, I hesitate to call it ‘classic’, way of methods. First you might realise that classes don't actually come into play at any point in this. That's right! Generic functions don't require you to create any classes at all: you can define methods that specialise on other internal types such as string, integer and so on, or even methods that specialise on nothing at all. And if you do define classes, they're much closer to an evolution of structs that now possess inheritance and slots instead of fields. Classes in the sense of CLOS are immensely extensible, but I will get to that some other time perhaps.
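To make this concrete, here is a minimal sketch of a generic function whose methods specialise on STRING, on INTEGER, and on nothing at all. No DEFCLASS appears anywhere. The name DESCRIBE-THING and the returned keywords are made up for this illustration.

```lisp
;; DESCRIBE-THING is a hypothetical name chosen for this sketch.
(defgeneric describe-thing (thing)
  (:documentation "Return a keyword naming the kind of THING."))

;; Specialised on the built-in class STRING.
(defmethod describe-thing ((thing string))
  :a-string)

;; Specialised on the built-in class INTEGER.
(defmethod describe-thing ((thing integer))
  :an-integer)

;; Not specialised at all: applicable to any argument, and therefore
;; the least specific method of the three.
(defmethod describe-thing (thing)
  (declare (ignore thing))
  :something-else)

;; The generic function is called like any ordinary function.
(describe-thing "hello")  ; uses the STRING method
(describe-thing 42)       ; uses the INTEGER method
(describe-thing 'a-symbol) ; falls through to the unspecialised method
```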

Alright, so generic functions can exist without the need for classes. This alone sounds strange enough, but how is it OOP? A generic function by itself consists of the function name, the argument list and maybe a documentation string. That is not much, and by itself it is useless, as any arguments you might supply would result in an error. Why? Because there are no methods to dispatch to. The generic function would not know how to handle the arguments. In order to fix this, we define one or more methods on the generic function. Any method belongs to the generic function it matches in name and has to match the generic function's argument list. Methods are defined by specialising some (or none!) of the required arguments. Specialising means that the method is only called if the argument that is passed matches the class your method's argument is specialised on. Wording this is a bit difficult because here's the second difference that this approach to OOP allows: multiple dispatch. Methods can specialise on multiple arguments at the same time.
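A small sketch of both points: a generic function without methods signals an error when called, and methods can then specialise on one or both required arguments. COLLIDE, *BEFORE-METHODS* and the returned keywords are hypothetical names used only here.

```lisp
;; COLLIDE is a hypothetical generic function with two required arguments.
(defgeneric collide (a b)
  (:documentation "Return a keyword naming the method that handled A and B."))

;; No methods exist yet, so calling the generic function signals an error.
;; The result is stored so it can be inspected later.
(defparameter *before-methods*
  (handler-case (collide 1 "x")
    (error () :no-applicable-method)))

;; Multiple dispatch: both arguments are specialised.
(defmethod collide ((a integer) (b integer))
  :integer-integer)

(defmethod collide ((a integer) (b string))
  :integer-string)

;; Only the first argument is specialised; B may be anything.
(defmethod collide ((a string) b)
  (declare (ignore b))
  :string-anything)

(collide 1 2)     ; picks the (INTEGER INTEGER) method
(collide 1 "x")   ; picks the (INTEGER STRING) method
(collide "x" 1)   ; picks the (STRING anything) method
```

Every method here takes exactly two required arguments, just like the generic function it belongs to.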

While the classic approach forces methods to be single dispatch by belonging to one class and one class only, generic functions allow your methods to specialise on however many required arguments they want. To illustrate with an example, you could define a DRAW generic function that takes a TARGET and an OBJECT. You could then specialise varying methods on TARGET, such as a file, image buffer or whatever, that all call DRAW again with the supplied OBJECT and a general painting device they created. Every object you want to be able to draw then merely specialises on this painting device for the TARGET and its own class for the OBJECT and uses those to draw itself. This in effect allows you to then call DRAW on a multitude of targets and objects without having to rewrite the same functions hundreds of times or jump through gross hoops to avoid it.
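The DRAW idea could be sketched roughly like this. Every class name, slot name and the representation of the "painting device" as a list of recorded strokes is an assumption made for the sake of a short, runnable example, not a real graphics API.

```lisp
;; All names below are hypothetical. PAINTER stands in for the general
;; painting device; it merely records what was drawn on it.
(defclass painter ()
  ((strokes :initform '() :accessor strokes)))

(defclass image-buffer () ())

(defclass circle ()
  ((radius :initarg :radius :reader radius)))

(defclass rectangle () ())

(defgeneric draw (target object)
  (:documentation "Draw OBJECT onto TARGET."))

;; Target-side method: specialised on TARGET only. It creates a painter
;; and calls DRAW again with that painter and the same OBJECT.
(defmethod draw ((target image-buffer) object)
  (let ((painter (make-instance 'painter)))
    (draw painter object)
    (reverse (strokes painter))))

;; Object-side methods: specialised on PAINTER and on the object's class.
(defmethod draw ((target painter) (object circle))
  (push (list :circle (radius object)) (strokes target)))

(defmethod draw ((target painter) (object rectangle))
  (push :rectangle (strokes target)))

;; Any target can now draw any object.
(draw (make-instance 'image-buffer) (make-instance 'circle :radius 2))
```

Adding a new target means writing one method specialised on that target; adding a new drawable means writing one method specialised on PAINTER and the new class.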

Another benefit of the fact that methods are defined separately from the class and the generic function is that someone from the outside can easily extend your generic function with their own methods. As an example, a library might offer a generalised SERIALIZE function and a user might extend this for special treatment of their own objects with a new method. Before I get into some of the classic pitfalls that people fall into when they try CLOS and don't yet understand it, I'd like to explain how the generic function dispatch works, as that is an essential part.

You might be wondering how CLOS can even decide which method to dispatch to, given that things like inheritance exist. In CLOS dispatch is decided according to how specific the methods are. The higher up the class hierarchy the class a method specialises on is, the less specific the method is; the closer that class is to the class of the passed argument, the more specific it is. Specialisations on arguments that come earlier in the argument list take precedence over those on later arguments. Thus methods are ordered by their order of arguments and how specific each specialisation is. This total order is especially important since any method can call CALL-NEXT-METHOD, which will relay the call to the next method in line. This is in some cases equivalent to calling SUPER in other languages, but not really.
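Here is a short sketch of both ordering rules, using a made-up two-class hierarchy. ANIMAL, DOG, SPEAK and MEET are hypothetical names.

```lisp
;; A hypothetical hierarchy: DOG inherits from ANIMAL.
(defclass animal () ())
(defclass dog (animal) ())

(defgeneric speak (creature))

;; Less specific: applies to every ANIMAL.
(defmethod speak ((creature animal))
  (list "some noise"))

;; More specific for a DOG, so it runs first. CALL-NEXT-METHOD relays
;; the call to the next method in line, here the ANIMAL method.
(defmethod speak ((creature dog))
  (cons "woof" (call-next-method)))

(speak (make-instance 'dog))    ; => ("woof" "some noise")
(speak (make-instance 'animal)) ; => ("some noise")

;; Earlier arguments take precedence when methods are ordered.
(defgeneric meet (a b))

(defmethod meet ((a dog) (b animal))
  :dog-first)

(defmethod meet ((a animal) (b dog))
  :dog-second)

;; Both methods apply to two dogs. The first argument is compared first,
;; and DOG is more specific than ANIMAL there, so :DOG-FIRST wins.
(meet (make-instance 'dog) (make-instance 'dog)) ; => :DOG-FIRST
```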

Now in my explanation of how methods are ordered I lied a bit to simplify things. How the applicable methods are combined into a single call is decided by the method combination. Defining custom method combinations is something that I have not explored myself yet either; however, it is important to be aware of another feature that CLOS brings thanks to method combinations. Aside from plain methods on a given generic function, you can define methods that carry a qualifier. This qualifier tells the method combination how to treat the method. The standard method combination recognises three qualifiers: :before, :after, and :around. All applicable :before methods are executed in order of most specific to least specific before the ‘proper’ (primary) method/s. All :after methods are executed in order of least specific to most specific afterwards. And :around methods are executed in order of most to least specific, wrapped around the whole block of :before-proper-:after. Each :around method thus has to call CALL-NEXT-METHOD, or the remaining methods, including the proper method, will never be called.
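The execution order under the standard method combination is easiest to see by recording it. In the sketch below the order follows the Common Lisp specification: :around methods run most specific first, :before methods most specific first, then the primary methods, then :after methods least specific first. The names *STEP-LOG*, RECORD-STEP, BASE-WIDGET, FANCY-WIDGET, REFRESH and TRACED-REFRESH are all hypothetical.

```lisp
;; A log of the order in which methods run (hypothetical helper names).
(defvar *step-log* '())

(defun record-step (tag)
  (push tag *step-log*))

(defclass base-widget () ())
(defclass fancy-widget (base-widget) ())

(defgeneric refresh (widget))

;; Primary ("proper") methods.
(defmethod refresh ((widget base-widget))
  (record-step :primary-base)
  :done)

(defmethod refresh ((widget fancy-widget))
  (record-step :primary-fancy)
  (call-next-method))

;; Qualified methods.
(defmethod refresh :before ((widget base-widget))
  (record-step :before-base))

(defmethod refresh :before ((widget fancy-widget))
  (record-step :before-fancy))

(defmethod refresh :after ((widget base-widget))
  (record-step :after-base))

(defmethod refresh :after ((widget fancy-widget))
  (record-step :after-fancy))

;; The :around method wraps everything else and must call
;; CALL-NEXT-METHOD for the other methods to run at all.
(defmethod refresh :around ((widget fancy-widget))
  (record-step :around-enter)
  (multiple-value-prog1 (call-next-method)
    (record-step :around-exit)))

(defun traced-refresh ()
  "Refresh a FANCY-WIDGET and return the recorded steps in order."
  (setf *step-log* '())
  (refresh (make-instance 'fancy-widget))
  (reverse *step-log*))

(traced-refresh)
;; => (:AROUND-ENTER :BEFORE-FANCY :BEFORE-BASE
;;     :PRIMARY-FANCY :PRIMARY-BASE
;;     :AFTER-BASE :AFTER-FANCY :AROUND-EXIT)
```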

As you might imagine, being able to attach methods before and after, or even build environments around a method, can be incredibly useful. Exactly how useful is difficult to fathom without using CLOS, but suffice it to say that it does wonders for building extensible software and frameworks. Taking a step back to standard methods again, recall that all methods necessarily belong to a generic function. This is a tripping point for a lot of people at first because they are unaware of the generic function. CLOS allows you to define methods straight away without defining a generic function first. However, the generic function is simply created implicitly behind the scenes for you. This is fine in most cases, but it will trip you up if you try to change your method's argument list.

Changing the argument list will result in an error because the generic function, which does indeed exist but is nowhere explicitly defined in your code, still has the old signature of the initial method. Recall now that methods need to match their generic function in signature. They have to, otherwise how would dispatching even work? The same problem happens the other way around, if you try to change your generic function's arguments but methods exist that don't match it. There are two ways to solve this. One is to delete the entire generic function along with all its methods using FMAKUNBOUND. The other is to remove all conflicting methods using REMOVE-METHOD or the Slime inspector, and then to redefine the generic function and the newly matching methods.
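The pitfall and the FMAKUNBOUND fix can be sketched like this. GREET and *REDEFINITION-RESULT* are hypothetical names, and the HANDLER-CASE is only there so the error can be observed without landing in the debugger.

```lisp
;; Defining a method without a DEFGENERIC implicitly creates the
;; generic function GREET with one required argument.
(defmethod greet ((name string))
  (format nil "Hello, ~a" name))

;; Trying to define a method with a different argument list fails,
;; because the implicit generic function still expects one argument.
(defparameter *redefinition-result*
  (handler-case
      (progn
        (defmethod greet ((name string) (greeting string))
          (format nil "~a, ~a" greeting name))
        :redefined)
    (error () :lambda-list-mismatch)))

;; Fix: throw away the generic function and all of its methods ...
(fmakunbound 'greet)

;; ... and define the method again, which creates a fresh generic
;; function with the new two-argument signature.
(defmethod greet ((name string) (greeting string))
  (format nil "~a, ~a" greeting name))

(greet "Ada" "Hi") ; => "Hi, Ada"
```

Note that FMAKUNBOUND discards every method of the generic function, so any other methods you still want have to be evaluated again afterwards.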

Another tripping point is that, since CL does not keep track of which piece of your source file a compiled definition came from, it cannot distinguish whether you simply changed a definition or are adding a new one. Thus if you already have a method compiled that is specialised in some fashion, but then change its specialisation in your code and recompile, the old method still exists alongside the new one. You will have to either remove the old method manually, or remove the entire generic function as above and redefine all desired methods.
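As a sketch of the manual route, FIND-METHOD can look up the stale method by its qualifiers and specialisers, and REMOVE-METHOD then deletes it. MEASURE and the two result variables are hypothetical names; the two DEFMETHOD forms stand for the "before" and "after" versions of the same source definition.

```lisp
(defgeneric measure (thing))

;; The original definition, specialised on STRING.
(defmethod measure ((thing string))
  (list :old-string-method (length thing)))

;; The source is later edited to specialise on SEQUENCE instead and
;; recompiled. This adds a second method; it does not replace the first.
(defmethod measure ((thing sequence))
  (list :new-sequence-method (length thing)))

;; The stale STRING method is more specific, so it still wins.
(defparameter *before-removal* (measure "abc"))

;; Look up the stale method (no qualifiers, specialised on the class
;; STRING) and remove it. The final NIL makes FIND-METHOD return NIL
;; instead of signalling an error if no such method exists.
(let ((stale (find-method #'measure '() (list (find-class 'string)) nil)))
  (when stale
    (remove-method #'measure stale)))

;; Now only the SEQUENCE method is left.
(defparameter *after-removal* (measure "abc"))
```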

Aside from these tripping points, the power CLOS offers is difficult to grasp if you are used to other systems; exploring all its capabilities is a fantastic adventure. For the basics, simply remember that methods exist separately from classes, but each of them belongs to a generic function.

I hope I was able to illustrate some of the concepts of CLOS in a somewhat comprehensible fashion. If there are further questions, corrections or other feedback, I would very much welcome that as I hope this blog entry can at some point be truly useful to Lisp newcomers.

Addendum 1: As Xach pointed out to me, a generic function without methods can occasionally be of use if appropriate methods are added to NO-APPLICABLE-METHOD, the standard generic function that is called if no matching methods are found.
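One commonly seen way to do this is an EQL specialiser on the generic function object itself, sketched below. FIND-HANDLER is a hypothetical name, and I am assuming here that your implementation accepts such a method on NO-APPLICABLE-METHOD; whether a strictly portable program may rely on it is something to check against the specification.

```lisp
;; FIND-HANDLER is a hypothetical generic function with no methods.
(defgeneric find-handler (key)
  (:documentation "Find a handler for KEY."))

;; NO-APPLICABLE-METHOD receives the generic function and the original
;; arguments. The EQL specialiser restricts this method to FIND-HANDLER
;; (assumption: the implementation permits this).
(defmethod no-applicable-method ((gf (eql #'find-handler)) &rest args)
  (list :no-method-for args))

;; Instead of signalling an error, the call now returns a fallback value.
(find-handler 42) ; => (:NO-METHOD-FOR (42))
```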

Addendum 2: To be really precise, methods don't specialise on types (even internal types as stated above), but rather on classes. This works because most of the simple types in the specification have an equivalent class defined so that CLOS may work with it.

Thanks to eudoxia, Guthur, hitecnologys, splittist, and Xach from Freenode/#lisp as well as isoraqathedh from TyNET/#Stevenchan for additions and corrections.

Written by shinmera