Archive

Author Archive

Closures explained with JavaScript

April 22nd, 2011 31 comments

Last year I wrote A brief introduction to closures which was meant to help people understand exactly what a closure is and how it works. I’m going to attempt that again, but from a different angle. I think with these kinds of concept, you just need to read as many alternative explanations as you can in order to get a well rounded view.

First-class functions

As I explained in Why JavaScript is AWESOME, one of the most powerful parts of JavaScript is its first-class functions. So, what does “first-class” mean in a programming language? To quote wikipedia, something is first-class if it:

  • can be stored in variables and data structures
  • can be passed as a parameter to a subroutine
  • can be returned as the result of a subroutine
  • can be constructed at runtime
  • has intrinsic identity (independent of any given name)

So, functions in JavaScript are just like objects. In fact, in JavaScript functions are objects.

The ability to nest functions gives us closures. Which is what I’m going to talk about next…

Nested functions

Here’s a little toy example of nested functions:

The important thing to note here is that there is only one f defined. Each time f is called, a new function g is created, local to that execution of f. When that function g is returned, we can assign its value to a globally defined variable. So, we call f and assign the result to g5, then we call f again and assign the result to g1. g1 and g5 are two different functions. They happen to share the same code, but they were executed in different environments, with different free variables. (As an aside, we don’t need to use a function definition to define g and then return it. Instead, we can use a function expression which allows us to create a function without naming it. These are called ‘anonymous functions’ or lambdas. Here’s a version of the above with g converted to an anonymous function.)

Free variables and scope

A variable is free in any particular scope if it is defined within an enclosing scope. To make that more concrete, in the scope of g, the variable x is free, because it is defined within the scope of f. Any global variables are free within the scopes of f and g.

What do I mean by scope? A scope is an area of code where a variable may be defined, without the enclosing scope knowing about it. JavaScript has ‘function scope’, so each function has its own scope. So, any variable defined within f is invisible outside of f. Scopes can be nested, so in the above example, g has its own scope which is contained within the scope of f, which is contained within the global scope. Whenever a variable is accessed, the JavaScript interpreter first looks within the current scope. If the variable is not found to be defined within the current scope, the interpreter checks the enclosing scope, then the enclosing scope of the enclosing scope, all the way up to the global scope. If the variable is not found, a ReferenceError is thrown.

Closures are functions that retain a reference to their free variables

And this is the meat of the matter. Let’s look at a simplified version of the above example first:

It’s no surprise that when you call f with the argument 5, when g is called it has access to that argument. What’s a bit more surprising is that if you return g from the argument, the returned function still has access to the argument 5 (as shown in the original example). The bit that can be a bit mind-blowing (and I think generally the reason that people have such trouble understanding closures) is that the returned g actually is remembering the variable x that was defined when f was called. That might not make much sense, so here’s another example:

When person is called, the argument name is bound to the value passed. So, the first time person is called, name is bound to ‘Dave’, and the second time, it’s bound to ‘Mary’. person defines two internal functions, get and set. The first time these functions are defined, they have a free variable name which is bound to ‘Dave’. These two functions are returned in an array, which is unpacked on the outside to get two functions getDave and setDave. (If you want to return more than one value from a function, you can either return an object or an array. Using an array here is more verbose, but I didn’t want to confuse the issue by including objects as well. Here’s a version of the above using an object instead, if that makes more sense to you.)

And this is the magic bit. getDave and setDave both remember the same variable name, which was bound to ‘Dave’ originally. When setDave is called, that variable is set to ‘Bob’. Now when getDave is called, it returns ‘Bob’ (Dave never liked the name ‘Dave’ anyway). So getDave and setDave are two functions that remember the same variable. This is what I mean when I say “Closures are functions that retain a reference to their free variables”. getDave and setDave both remember the free variable name. Even though person has now returned, the variable name lives on, because it is referenced by getDave and setDave.

The variable name was bound to ‘Dave’ when person was called the first time. When person is called a second time, a new version of name comes into existence, as well as new versions of get and set. So the functions getMary and setMary are completely different to the functions getDave and setDave. They execute identical code, but in two different environments, with different free variables.

Summary

In summary, a closure is a function called in one context that remembers variables defined in another context – the context in which it was defined. Multiple closures defined within the same context remember that same context, so changes they make are shared between them. Every time a function is called that creates closures, a new, shared context is created for the new closures.

The best way to learn is to play, so I’d recommend editing the fiddles above (by clicking on the + at the top right) and having a fiddle. Enjoy!


Addendum: Lasse Reichstein had some useful comments on this post on the JSMentors mailing list – read them here. The important thing to realise is that a closure actually remembers its environment rather than its free variables, so if you define a new variable in the environment of the closure after the closure’s definition, it will be accessible inside the closure.

Seven steps towards becoming a professional developer

April 9th, 2011 11 comments

When you first start learning to program, the best thing you can do is just write code, and lots of it. Coding is a muscle, and it needs to be exercised.

That’s just a start though. If you enjoy it, and it’s something you want to do professionally, then there are some essential practices that you need to learn, and make part of your routine.

This list is split into two parts. The first items are directly related to coding. They all take some time to setup and master, but once you’ve got to that stage you’ll get nothing but good out of them. Think of them as an investment in your future.

The second bit is about the world beyond coding, and they mostly involve interacting with real people. If that’s not a problem, then great, but if it is, you’re gonna have to get over it, because dealing with people is a surprisingly big part of this job.

Coding

  1. You need to be testing your code. You won’t get everything perfect the first time, so at some point you’ll need to refactor. And if you refactor without tests, “you’re not refactoring; you’re just changing shit”.

  2. Your code needs to be under version control. I use git, but there are plenty of other options. Code that’s not under some form of version control may as well not exist. And using something with easy branching like git makes it much easier to experiment with big changes to your code that might not work out.

  3. Learn to touch-type. It’ll take a bit of effort, but it’s worth it. If you’re serious about being a professional, then you’re going to be spending a lot of your time on a keyboard. Learn to use it properly. Steve Yegge wrote a great blog post on touch-typing. If you want to learn, I don’t know of anything better than GNU Typist.

Everything else

  1. Start a blog. Write about code you’ve written, problems you’ve solved, things you’re learning about. It’s not for anyone else, just you. The act of putting your thoughts into words forces your brain to look at issues from a different angle. There may come a point that people start finding your content and making use of it, so it’s worth making it public.

  2. Join Twitter. Find local developers and the big names in whatever languages you’re currently learning. Twitter’s like a global virtual watercooler. People talk a lot of shit, but some of the most important conversations happen there as well, as well as job offers, so if you’re not involved you could miss out.

  3. Use and participate in Stack Overflow. You can get a lot of useful guidance if you learn how to ask questions sensibly, and you can learn loads by answering questions as well. Also, once you’ve got a good amount of answered questions, Stack Overflow becomes a great showcase of your knowledge and writing.

  4. Find a local user group. Google for the name of your town or city, the name of a language you’re interested in, and ‘user group’ (in Ruby that would be "#{city_name} #{language} user group"). There’ll be something out there.

That all looks too much like hard work to me

If you’re just starting out, these lists may seem a bit daunting. You don’t need to do everything at once, but if you’re taking this seriously then at some point you should be doing them all.

Categories: Miscellaneous Tags:

A linguistic analysis of Rebecca Black’s ‘Friday’

April 1st, 2011 14 comments

Edit: New readers are advised to check the date of posting before taking anything too seriously…


In recent months I’ve grown bored of this whole programming thing. It’s time for me to start focussing on my true passion – linguistic analysis of popular music.

I was first introduced to this fascinating subject by my university lecturer Allan Moore, whose insights into the hidden depths of popular songs were quite enlightening.

My first post on this new subject will be an analysis of Rebecca Black’s latest hit, ‘Friday’.

‘Friday’, or, Roads to Freedom

Rebecca Black, or, the protagonist

‘Friday’ starts with what is ostensibly a description of Black’s morning routine. Obviously, there is much more to the lyrics than what is visible on the surface. The first question we must ask ourselves is whether the protagonist of the song is Black herself, or a persona created purely for the purposes of the song. The ‘truth’ is not important – only our perception counts, so I won’t attempt to answer this question.

The protagonist wakes up at 7 a.m. Is that normal for the youth of America? It’s hard to say, but it is obscenely early, and it looks like Black is highlighting the plight of school children across the United States, under this oppressive regime.

‘Gotta be fresh’: what does this mean? ‘Fresh’ in the sense of clean or awake, or ‘fresh’ in the more hip-hop sense of the word? The use of such an ambiguous word at the start of this song sets the tone for the rest of it.

The protagonist goes to the bus stop, but then sees her friends. We assume that she was planning to take the bus, but was this just a ruse? Is the ‘bus stop’ a metaphor for meeting friends? Here comes the first dilemma of the day – should she sit in the front or the back? Clearly this is a metaphor for the classroom – should she sit at the front of the class or the back? We all know the ramifications of these alternative positions. In a normal classroom there is always the option of the middle-ground, but in a car this is reduced to the stark choice of front or back. Look a bit further, and you’ll find an oddity. Black doesn’t ask ‘which seat should I take’, but rather ‘which seat can I take’. It turns out there is no choice involved here – the protagonist is at the will of the fates (much like her school life).

We then get to the crux of the song – the evocation of ‘Fridayness’. Just like the Crunchie bar is designed to evoke ‘that Friday feeling’, ‘Friday’ is all about the truth and beauty of Friday. To really get the point across, Black repeats the word ‘Friday’ multiple times in the chorus. She also highlights the ambiguity of this strange day of the week. Is it the weekend? Or the precursor to the weekend? With the line ‘Everybody’s lookin’ forward to the weekend’ it’s hard to tell.

Now it’s 7:45. A.m or p.m? Who can tell? Have 12 hours passed already, or is this still the morning? I think what is happening here is some kind of quantum superposition of the two states – the actual time doesn’t matter as much as the day. We are reminded slightly of the post-modernist masterpiece ‘Ferris Bueller’s Day Off’, where school time and play time merge into a state of general ‘beingness’.

‘Cruisin’ so fast, I want time to fly’: we have now moved from quantum physics to general relativity. By including these references so close to each other, Black is almost mocking the current lack of a unified theory of Physics. This is followed by more Physics/Buddhist thought: ‘I got this, you got this’. We don’t know what ‘this’ is, but the fact that Black and her friend both have it at the same time tells us a lot about her holistic view of the world.

Skip forward a bit, and we get to the most genius part of the song, repeated here:

Yesterday was Thursday, Thursday
Today i-is Friday, Friday
We-we-we so excited
We so excited
We gonna have a ball today
Tomorrow is Saturday
And Sunday comes afterwards
I don’t want this weekend to end

Black is truly putting the day of Friday in its context – after Thursday, before Saturday, and two days before Sunday. The lyrics here are split – we hear about Thursday and Friday, then a break, and then Saturday and Sunday. So Friday belongs with Thursday, and Saturday with Sunday. The past is the present, the future the future. This is the kind of depth that is all-too lacking in most modern pop music.

We now move to the secondary protagonist of the song, who I believe to be Black’s alter ego. This character is driving, thus removing the dilemma of front seat/back seat (there can be only one driver’s seat). Then on to ‘It’s Friday, it’s a weekend’, which contrasts with the other lyric ‘Lookin’ forward to the weekend’ – setting up a conceptual dissonance that is never quite resolved.

Conclusion, or, why does my heart feel so bad?

We have only scratched the surface of ‘Friday’ in this analysis, but I hope I’ve given you an idea of the depth of Black’s music. It’s all too easy to see this piece as just another Bieberfied tween pop song with no artistic merit, but as you will now see, this could not be further from the truth. So, go forth and enjoy this beautiful Friday!

jQuery vs DOM return false

March 22nd, 2011 2 comments

What does return false do inside a JavaScript event handler? What does it do in a jQuery event handler? Are they the same thing?

The short answer is “no”. return false in a native JavaScript event handler prevents the default behaviour of the event. It’s the equivalent of event.preventDefault(). This is the same for DOM level 0 event handlers like this:

element.onclick = function () { };

and DOM level 2 event handlers like this:

element.addEventListener('click', function () {});

return false in jQuery, on the other hand, will stop the default behaviour and stop the event propagating. This can be seen in the jQuery source:

if ( ret === false ) {
  event.preventDefault();
  event.stopPropagation();
}

If the return value of the handler is false, jQuery will call preventDefault and stopPropagation on the event object.

I’ve made a fiddle to demonstrate this behaviour, showing what return false does with DOM level 0 handlers and with jQuery handlers, and how to achieve the same in a DOM level 2 handler.

Categories: Programming Tags: ,

jQuery.tmpl() presentation

March 20th, 2011 7 comments

Back in January I did a presentation at the Bristol Ruby User Group on jQuery.tmpl() – jQuery’s templating engine.

The presentation was filmed and is available on the BRUG website. The slides are also available.

I wrote the slideshow engine using jQuery.tmpl() – if you’re interested in viewing some real-world jQuery templating, have a look at the source on Github. I tried to use as many of the concepts introduced in the presentation as possible, so it’s a useful showcase of what can be achieved using jQuery templates.

Categories: Programming Tags: ,

id_barcode – a barcode maker for Adobe InDesign

March 4th, 2011 6 comments

id_barcode is multiple things. It’s an example of writing structured, modular code for an InDesign plugin. It’s a demonstration of test-driven development for producing InDesign plugins. It also makes barcodes.

InDesign barcode maker

First off, this is a barcode maker for InDesign. It’s fairly simple – just put in your ISBN, and choose the font for the ISBN and for the numbers under the bars.

The script file is available to download here. If you don’t know how to install an InDesign script, see these instructions on InDesign Secrets.

Here it is:

id_barcode dialog

and this is what the barcode looks like:

Pretty wild.

Writing modular InDesign plugins

I split id_barcode across four files:

Really barcode_main.js should be split up into two files, one for the GUI and one for rendering the barcode.

Each part of the code does one thing. barcode_library doesn’t know anything about InDesign, all it does is produce a data structure giving the relative bar widths. The GUI code doesn’t contain any logic, it just lets the user enter the relevant details. Then the rendering code takes the barcode from the GUI, uses the library to get the bar widths, and draws the barcode with these widths and the fonts selected by the user.

Because I wanted to make the plugin easy to use, I wrote a shell script to concatenate the relevant JavaScript files and remove the #import lines. This was pretty straightforward.

Test-driven development

I’ve been reading Test-Driven JavaScript Development – an excellent book. I used some of the basics from that to construct a very basic testing framework, contained in barcode_test.js. Also in that file are a number of tests used to ensure that barcode_library.js was doing the right thing. Every time I changed something, or refactored, rather than checking to see if the barcodes still came out correctly I could just run the tests and get instant feedback. I haven’t written tests for the rendering code, but that would be a good next step.

Next steps

So far this plugin is very basic. You can’t customise anything apart from the fonts, so if you standardly use a different barcode design you’d need to do manual work to each barcode. The error handling’s a bit rubbish as well. If I make it more customisable, I think it would be worth test-driving the layout changes. The most important thing is to write a test that ensures the bars are all the correct width. I’d then have to start delving further into InDesign’s object model, which is a bit of a mess. We’ll see.

If anybody’s interested in contributing you can fork me on github!

Categories: Programming Tags: ,

A short word on nomenclature

February 26th, 2011 17 comments

These symbols have accepted names (in the context of programming):

  • [ ] These are opening and closing brackets. They are square, so are sometimes called square brackets.
  • { } These are opening and closing braces. You could call them curly braces, but there’s no need.
  • ( ) Theses are parentheses. One is a parenthesis. If those are a bit of a mouth- or keyboardful, call them parens. Don’t call them brackets or I’ll destroy your keyboard.
  • ~ This is a tilde.
  • | This is a pipe.

These symbols have multiple accepted names.

  • # I call this a hash because I’m British. Americans call it a pound sign. Pedants call it an octothorpe.
  • * Asterisk, star, splat. Definitely not an “asterix”.
  • ! Exclamation mark/point, bang.
  • < > Less-than and greater-than, but may also be called angle brackets.
  • . Full-stop, period, dot.
  • ? Question mark, query.
Categories: Programming Tags:

Why you should learn brainfuck (or: learn you a brainfuck for great good!)

February 20th, 2011 23 comments

Before I begin, a little disclaimer: there’s no way to write about brainfuck without offending someone. Some people will be upset about the swearing, others about the censorship. I’ve come up with a partial solution: Decensor this article (decensor plugin available on github) If you’re reading this in a feed reader then sorry for the swearing.

What is brainfuck?

Brainfuck is close to being the simplest programming language possible, with only 8 instructions:

> < + - , . [ ]

These instructions move an internal data pointer, increment and decrement the value at the data pointer, input and output data, and provide simple looping.

As an aside: brainfuck was written with the intent of having a language with the smallest-possible compiler. Many compilers for brainfuck are smaller than 200 bytes! See Wikipedia for more details.

Why would I want to learn such a stupid language?

It’s been suggested that you should learn at least 6 programming languages to be a good programmer. The more programming languages you know, the more perspectives you’ll have on problems. Brainfuck is such a simple language to learn, there’s almost no reason not to learn it. If you read the whole of this blog post now, and try out all the code samples in the interpreter, in half-an-hour to an hour you’ll have a new programming language under your belt. Tell me that’s a waste of time :)

Another good reason to learn brainfuck is to understand how basic a Turing-complete programming language can be. A common argument when programmers compare languages is “well they’re all Turing-complete”, meaning that anything you can do in one language you can do in another. Once you’ve learnt brainfuck, you’ll understand just how difficult it can be to use a Turing-complete language, and how that argument holds no water.

Ok, you’ve convinced me. Teach me!

Good. I knew you were one of the smart ones.

Brainfuck doesn’t have variable names – just one long array of cells, each of which contains a number:

address: 0  1  2  3  4  5  6  ...
value:  [0][0][0][0][0][0][0][...]

Each cell is initialized to zero. A data pointer starts off pointing to the first cell:

pointer: v
address: 0  1  2  3  4  5  6  ...
value:  [0][0][0][0][0][0][0][...]

Moving the data pointer and modifying the data

A brainfuck program is a stream of the 8 commands (other characters are allowed in this stream but are ignored). Here’s a basic brainfuck program:

>>>>++

The greater-than command moves the data pointer one cell to the right. The plus command increases the value of the cell under the data pointer by 1. Here’s what the data looks like after we run the above program:

pointer:             v
address: 0  1  2  3  4  5  6  ...
value:  [0][0][0][0][2][0][0][...]

The data pointer moved four cells to the right, and the value of that cell was increased by 2. Pretty simple.

You can probably guess what the less-than and minus commands do:

program: >>>>++<<++-

pointer:       v
address: 0  1  2  3  4  5  6  ...
value:  [0][0][1][0][2][0][0][...]

So, move four cells right, increment twice, move two cells left, increment twice and decrement once.

Well done, you’ve already learnt half of the syntax of brainfuck.

I/O

A programming language isn’t much use without the ability to input and output data. Brainfuck has two commands for I/O – , (comma) and . (full-stop/period). The comma inputs a character from the input into the current cell, and the period outputs the character in the current cell. The ASCII code of the character is used on input and output. You may not be used to interchanging characters and integers, so it’s worth having a look at this chart to see how they map.

Here’s a simple program that inputs a character, increments it once, then outputs it:

program: ,+.
input:   a

pointer:  v     
address:  0  1  2  3  4  5  6  ...
value:  [98][0][0][0][0][0][0][...]

The program takes a character from the input, in this case the letter ‘a’, and puts it into the first data cell. It then increments it, and outputs the value of the cell. A lowercase ‘a’ has the ASCII code 97, and ‘b’ has the ASCII code 98.

Try it out. Go to my brainfuck interpreter, put the string ,+. into the “Code” box and the letter ‘a’ into the “Input” box (or use this link), then press “Run”. Try it with different input, then try using more pluses or minuses.

Every time you use the comma command you remove a character from the input stream:

program: ,>,>,.<.<.
input:   abc

pointer:  v     
address:  0   1   2  3  4  5  6  ...
value:  [97][98][99][0][0][0][0][...]

This program puts the first character of input in the first cell, the second character in the second cell and the third character in the third cell (although it looks like an evil dragon smiley). Try it out to see what happens.

Looping

Ok, that’s three-quarters of the language covered now. All that’s left is looping. This is a bit trickier, but fairly simple once you’ve got the hang of it.

[ and ] (left and right brackets) are used for looping. Anything in between pairs of brackets will loop (you can have nested loops – the pairs match like parentheses (i.e. this is the equivalent of an inner loop) and this is the outer loop (did you see what I did there? (inner inner!))). Of course there’s no point looping forever, so brainfuck needs a way of knowing when to stop. It does this by checking the value of the current data cell to see if it is zero. If it is, execution will skip to after the end of the loop.

To illustrate this I’m going to need to show the position of the current instruction. This is called the instruction pointer, or i_ptr in the figure below.

To start with, the ‘z’ is read into the first cell:

i_ptr:   v
program: ,[.-]
input:   z

pointer:  v     
address:  0   1  2  3  4  5  6  ...
value:  [122][0][0][0][0][0][0][...]

The instruction pointer moves to the next instruction in the program. It checks to see if the value in the current data cell is zero. It isn’t, so the loop is entered.

i_ptr:    v
program: ,[.-]
input:   z

pointer:  v     
address:  0   1  2  3  4  5  6  ...
value:  [122][0][0][0][0][0][0][...]

The value in the current data cell is output (‘z’), then decremented:

i_ptr:      v
program: ,[.-]
input:   z

pointer:  v     
address:  0   1  2  3  4  5  6  ...
value:  [121][0][0][0][0][0][0][...]

The instruction pointer reaches the end of the loop and jumps back to the beginning:

i_ptr:    v
program: ,[.-]
input:   z

pointer:  v     
address:  0   1  2  3  4  5  6  ...
value:  [121][0][0][0][0][0][0][...]

The current cell is still not zero, so the loop is entered again, the new value is output (‘y’), and decremented again. This keeps on going until the value of the first data cell is zero. At this point, the instruction pointer jumps to after the end of the loop, which also happens to be the end of the program:

i_ptr:        v
program: ,[.-]
input:   z

pointer: v     
address: 0  1  2  3  4  5  6  ...
value:  [0][0][0][0][0][0][0][...]

Try it out to see what happens.

Refresher

And that’s the entire language. Here’s a quick recap:

  • > Move the data pointer one cell to the right
  • < Move the data pointer one cell to the left
  • + Increment the value of the cell at the data pointer
  • - Decrement the value of the cell at the data pointer
  • , Take a character from the input and place its value into the current data cell
  • . Output the value of the current data cell as a character
  • [ If the current data cell is zero, skip to after the closing bracket, otherwise continue
  • ] Skip back to the matching opening bracket (a common optimization is to skip over this instruction if the current cell is zero, rather than going back to the opening bracket and checking)

The end

If you’ve read the whole blog post and tried out all the code, you now know brainfuck. Well done, that’s another item for your CV! If you try making anything relatively complex you’ll realise the true value of the abstractions we build software on. All languages have different levels of abstraction, and you should be working at the highest level you can afford. It’s worth dropping down to the level of brainfuck occasionally so you can really appreciate the abstractions you’re given.

Further reading

Categories: Programming Tags: ,

Zen and the art of statefulness

February 12th, 2011 38 comments

The venerable master Qc Na was walking with his student, Anton. Hoping to prompt the master into a discussion, Anton said “Master, I have heard that objects are a very good thing – is this true?” Qc Na looked pityingly at his student and replied, “Foolish pupil – objects are merely a poor man’s closures.”

Chastised, Anton took his leave from his master and returned to his cell, intent on studying closures. He carefully read the entire “Lambda: The Ultimate…” series of papers and its cousins, and implemented a small Scheme interpreter with a closure-based object system. He learned much, and looked forward to informing his master of his progress.

On his next walk with Qc Na, Anton attempted to impress his master by saying “Master, I have diligently studied the matter, and now understand that objects are truly a poor man’s closures.” Qc Na responded by hitting Anton with his stick, saying “When will you learn? Closures are a poor man’s object.” At that moment, Anton became enlightened.

Anton van Straaten

When I first read the above koan some time ago, I didn’t really understand it. I had a very basic idea of closures, but at the time they were just a syntactic oddity to me – something you could do a few cool things with, but not particularly useful. Since then I’ve worked through quite a bit of Structure and Interpretation of Computer Programs and delved into functional programming in JavaScript, which has given me a much deeper understanding of closures. On the other side of the divide, I’ve been doing a lot of Ruby programming, which has helped me grok objects a lot better. I now feel like I can begin to comprehend what the quote was getting at.

My moment of enlightenment came today while reading Test-Driven JavaScript Development and looking at the code for a JavaScript strftime function (abbreviated here for brevity):

Date.prototype.strftime = (function () {
  function strftime(format) {
    var date = this;

    return (format + "").replace(/%([a-zA-Z])/g, function (m, f) {
      //format date based on Date.formats
    });
  }

  // Internal helper
  function zeroPad(num) {
    return (+num < 10 ? "0" : "") + num;
  }

  Date.formats = {
    // Formatting methods
    d: function (date) {
      return zeroPad(date.getDate());
    },
    //...
    //various other format methods
    //...

    // Format shorthands
    F: "%Y-%m-%d",
    D: "%m/%d/%y"
  };

  return strftime;
}());

The above code uses an IIFE (Immediately-invoked function expression) to produce a function with additional data (if Date.formats was instead declared as a local variable, this would be a better example). If it doesn’t make sense, I thoroughly recommend Ben Alman’s post on IFFEs for an overview of the technique. The code executes, and returns a function. The important thing is that one function is used to define another function and its context.

In Ruby, when you define a class, the class definition is executed as Ruby code, unlike in Java, for example, where a class definition is just a syntactic construct read at compile-time, but not executed in the way other Java code is. A Ruby class definition is read at run-time, and builds up a new class as it is interpreted.

In a lot of ways, Ruby class definitions and JavaScript function-defining functions are equivalent. I’ll give you a little example to illustrate:

zen.js

var Cat = function () {
  var age = 1;

  function catYears() {
    return age * 7;
  }

  function birthday() {
    age++;
  }

  return {
    catYears: catYears,
    birthday: birthday
  }
};

var powpow = Cat();
var shorty = Cat();

//yo shorty, it's your birthday:
shorty.birthday();

alert(powpow.catYears()); // => 7
alert(shorty.catYears()); // => 14

zen.rb

class Cat
  def initialize
    @age = 1
  end

  def cat_years
    @age * 7
  end

  def birthday
    @age += 1
  end
end

powpow = Cat.new
shorty = Cat.new

#yo shorty, it's your birthday:
shorty.birthday

puts powpow.cat_years # => 7
puts shorty.cat_years # => 14

Strictly speaking, in zen.js I’m returning an object, but the private data and methods of that object are saved in a closure. So, zen.js stores its state in a closure, while zen.rb stores its state in an object. Every time Cat() is called in zen.js, a new closure is created with a unique age. Every time Cat.new is called in zen.rb, a new object is created with a unique @age. (These two examples aren’t strictly equivalent – each cat in zen.js gets a new copy of the functions, whereas in zen.rb they share the same methods. It’s possible to make the JavaScript version function more like a Ruby class definition, but it takes a bit more code.)

Of course, there’s a lot you can do with JavaScript closures that you can’t do with Ruby objects. And there’s a lot you can do with Ruby objects that can’t be emulated using JavaScript closures. Any time you decide that one is better than the other, just imagine Qc Na hitting you with his stick.

Further reading

Really really simple Ruby metaprogramming

February 5th, 2011 12 comments

Metaprogramming in Ruby has the reputation of being something only the true zen Ruby masters can even hope to understand. They say the only true way to learn metaprogramming is to train with Dave Thomas in his mountain retreat for five years – the Ruby equivalent of this XKCD:

But it’s not that bad. In fact, I’m going to go so far to say that it’s possible to learn to metaprogram in Ruby without even meeting Dave Thomas. Yeah, I know, it doesn’t sound possible. But just you wait…

What is metaprogramming?

Metaprogramming is what makes Ruby awesome. It’s writing code to write code. It’s dynamic code generation. It’s the move from imperative to declarative. It’s a rejection of Java’s endless repetition of boilerplate code. It’s the living embodiment of DRY. Here’s an example:

class Monkey
  def name
    @name
  end

  def name= n
    @name = n
  end
end

I can see you there, in the middle of the classroom, with one arm straining up, the other one holding it up because it’s so damn hard to hold up. OK, Jimmy, what is it? Oh, you don’t need to write all that code in Ruby? You can just use attr_accessor?

Jimmy’s right. That code snippet above could be written like so:

class Monkey
  attr_accessor :name
end

So, attr_accessor‘s magic right? Well, actually, it’s not. Just like in Scooby-Doo where what looked like magic to start off with turned out to be an elaborate hoax, attr_accessor is just a class method of Module. It defines the name and name= methods on Monkey, as if you’d manually defined them.

And that’s all metaprogramming is. Just a way of adding arbitrary code to your application without having to write (or copy-paste) that code.

An (extended) example

The source code for this example is available in my StringifyStuff plugin on github, which in turn was adapted from Ryan Bates’ Railscast Making a Plugin. Incidentally, if you feel you can improve the plugin, fork me on github!

I’m assuming basic knowledge of Ruby, and some familiarity with Rails would be helpful as well, but not essential.

Have a quick peek at the github repo now – the important files to look at for now are init.rb and lib/stringify_time.

First, stringify_time. This file defines a module called StringifyTime, which defines a method called stringify_time:

module StringifyTime
  def stringify_time *names
    #crazy stuff going on in here
  end
end

The other two files, stringify_stuff and stringify_money are similar.

Now, init.rb:

class ActiveRecord::Base
  extend StringifyStuff
  extend StringifyTime
  extend StringifyMoney
end

This extends ActiveRecord::Base with the three modules listed. This now means that any class that inherits from ActiveRecord::Base (e.g. your models) now has the methods defined in each of those modules as class methods.

The StringifyStuff plugin is used like so:

class Project < ActiveRecord::Base
  stringify_stuff
  stringify_time :start_date, :end_date
end

stringify_time is passed a list of symbols representing the relevant model attributes. It will provide useful accessors for those attributes. Let’s have a look at a simplified version of stringify_time:

module StringifyTime
  def stringify_time *names
    names.each do |name|

      define_method "#{name}_string" do
        read_attribute(name) &&
          read_attribute(name).strftime("%e %b %Y").strip()
      end

    end
  end
end

stringify_time is passed a list of symbols. It iterates over these symbols, and for each one calls define_method. define_method is the first clever Ruby metaprogramming trick we’re going to look at. It takes a method name and a block representing the method arguments and body, and magically adds an instance method with that name, arguments and body to the class in which it was called. In this case, it was called from Project, so this will gain a new method with the name "#{name}_string". In this example, we passed in :start_date and :end_date, so two methods will be added to Project: start_date_string and end_date_string.

So far so good. Now though, it gets a little bit hairy, so hold onto your horses (or equivalent equid).

In a normal model instance method, you’d access an attribute using a method of the same name. So if you wanted to get the start_date converted to a string, you’d write:

def my_method
  start_date.to_s
end

The problem with doing this in define_method is that we don’t have start_date as an identifier – it’s held as a symbol. There are two ways to accomplish the above if start_date was passed in as a symbol:

def my_method attr_sym
  read_attribute(attr_sym).to_s #This is a Rails method
end

or:

def my_method attr_sym
  send(attr_sym).to_s #Ruby built-in method
end

For simplicity, I’m using write_attribute, but send is useful to know about too.

So, back to define_method:

      define_method "#{name}_string" do
        read_attribute(name) &&
          read_attribute(name).strftime("%e %b %Y").strip()
      end

So, each time round the loop, name will be one of the symbols passed into stringify_time. Let’s go with start_date to see what happens. define_method will define a new instance method on the Project class called start_date_string. This method will check to see if there is a non-nil attribute called start_date, and if so, call strftime on it. This formatted date will be the return value of the method.

Wrap up

That’s more than enough for one day (I’ve been told off for writing kilo-word posts before). I’ll explain the workings of the rest of the plugin in a future post. If you want to learn more, I highly recommend David Flanagan’s The Ruby Programming Language.

Metaprogramming is writing code that gives you more code. It’s a bit like the Sorcerer’s Apprentice; in fact, I think that should be recommended watching for this subject.

Be the Sorcerer. Don’t be Mickey.

Categories: Programming Tags: , ,