Archive

Posts Tagged ‘programming’

JavaScript and the end of progressive enhancement

May 4th, 2011 87 comments

Progressive enhancement is the Right WayTM to do things in web development. It works like this:

  1. Write the HTML for your content, 100% semantically (i.e. only the necessary tags to explain the meaning of the content).

  2. Style the HTML using CSS. You may need to add hooks to your HTML in the form of classes and ids for the CSS to target.

  3. Add JavaScript enhancements to the interface, but only enhancements.

Parallel to this, the functionality of the site/app is progressively enhanced:

  1. All navigation must happen via links and form submissions – don’t use JavaScript for navigation.

  2. Add JavaScript enhancements to navigation, overriding the links and form submissions, for example to avoid page reloads. Every single link and form element should still work when JavaScript is disabled.

But of course you knew that, because that’s how you roll. I’m not sure this is the way forward though. To explain why I’ll need to start with a bit of (over-simplified) history.

A history lesson

Static pages

Originally, most sites on the web were a collection of static HTML pages saved on a server. The bit of the URL after the domain gave the location of the file on the server. This works fine as long as you’re working with static information, and don’t mind each file having to repeat the header, footer, and any other shared HTML.

Server-side dynamism

In order to make the web more useful, techniques were developed to enable the HTML to be generated on the fly, when a request was received. For a long time, this was how the web worked, and a lot of it still does.

Client-side dynamism

As JavaScript has become more prevalent on the client, developers have started using Ajax to streamline interfaces, replacing page loads with partial reloads. This either works by sending HTML snippets to replace a section of page (thus putting the burden of HTML generation on the server) or by sending raw data in XML or JSON and letting the browser construct the HTML or DOM structure.

Shifting to the client

As JavaScript engines increase in speed, the Ajax bottleneck comes in the transfer time. With a fast JavaScript engine it’s quicker to send the data to the client in the lightest way possible and let the client construct the HTML, rather than constructing the HTML on the server and saving the browser some work.

This raises an issue – we now need rendering code on the client and the server, doing the same thing. This breaks the DRY principle and leads to a maintenance nightmare. Remember, all of the functionality needs to work without JavaScript first, and only be enhanced by JavaScript.

No page reloads

If we’re trying to avoid page reloads, why not take that to its logical conclusion? All the server needs to do is spit out JSON – the client handles all of the rendering. Even if the entire page needs to change, it might be faster to load the new data asynchronously, then render a new template using this new data.

Working this way, a huge amount of functionality would need to be written twice; once on the server and once on the client. This isn’t going work for most people, so we’re left with two options – either abandon the “no reload” approach or abandon progressive enhancement.

The end of progressive enhancement

So, where does this leave things? It depends on how strongly you’re tied to progressive enhancement. Up until now, progressive enhancement was a good way to build websites and web apps – it enforced a clean separation between content, layout, behaviour and navigation. But it could be holding us back now, stopping the web moving forward towards more natural interfaces that aren’t over-burdened by its history as something very different to what it is today.

There are still good reasons to keep using progressive enhancement, but it may be time to accept that JavaScript is an essential technology on today’s web, and stop trying to make everything work in its absence.

Or maybe not

I’m completely torn on this issue. I wrote this post as a way of putting one side of the argument across in the hope of generating some kind of discussion. I don’t know what the solution is right now, but I know it’s worth talking about. So let me know what you think!

Appendix: Tools for this brave new world

Backbone.js is a JavaScript MVC framework, perfectly suited for developing applications that live on the client. You’ll want to use some kind of templating solution as well – I use jQuery.tmpl (I’ve written a presentation on the topic if you’re interested in learning more) but there are lots of others as well.

Sammy.js (suggested by justin TNT) looks like another good client-side framework, definitely influenced by Sinatra.

If anybody has any suggestions for other suitable libraries/frameworks I’ll gladly add them in here.


Edit: I’ve now written a follow-up to this post: The end of progressive enhancement revisited.

Magic in software development

April 24th, 2011 8 comments

First off, so we’re on the same page, we need to be talking in the same terms. So, what is magic? I’d define it as anything below your current level of abstraction that you don’t understand.

SICP cover

I first started thinking about this topic because of this angst I felt about all this magic I was relying on. There’s so many things that I rely on every day and I have no idea how they work. As far as I’m concerned, they’re magic. As long as I think about these things as magic, they seem unknowable. I think that’s where the angst comes from. What I need to remind myself is that everything I’ve learned about software seemed like magic before I understood it, and there’s nothing that’s unknowable, just things that aren’t known yet.

This post is my attempt to come to an understanding of what this ‘magic’ is, whether it’s a problem, and how best to deal with it.

The ‘Dave’ technique

At University, I learned how to record music. We recorded onto tape, so we had to learn how a tape machine worked. To learn how a tape machine worked, we had understand hysteresis and remanence and modulation and various other things that I’ve mostly forgotten now. We also had to understand how radio worked, so we learned about amplitude and frequency modulation, and how stereo FM was built on top of mono FM with an AM difference signal on top so as to be backwards compatible, and various other technical issues. Then came digital audio with ADCs, DACs, sampling, quantisation, dithering, etc. Each of these three topics was one or two modules, i.e. a semester’s-worth of lectures, so we went pretty deep.

Did that knowledge make me a better recording engineer? Probably not. But it was a helpful basis for the latter things I learned, which did help make me a better recording engineer.

The ‘Tim’ technique

Another lecturer taught us computer audio. This started off with an introduction to computers, starting at the level of AND, OR and NOT gates, then on to NAND and NOR (and how the simpler to understand AND, OR and NOT gates can be implemented in terms of NAND and NOR), then on to the ALU, then the CPU, then adding memory and disk storage and network and MIDI connections, creating a full computer. This was all presented in two or three lectures, so inevitably certain steps had to be left out (e.g. “we put together a load of these gates and we get the ALU”).

The benefit of this approach was that we were given a vague conceptual understanding of the basis on which computer audio is built, without having to spend hours and hours learning the full intricacies.

Bottom-up

If we’re intent on uncovering all the magic, how should we learn – bottom-up or top-down? Bottom-up, right from the bottom, implies first learning enough physics to understand electronics, which you’d need to understand digital circuit design, which you’d need to understand CPU design, which you need to understand assembly language, which you need to understand C, then enough C to understand Ruby (say). You’ll also need to understand how networks work right from the bottom of the TCP/IP stack and beyond; how storage works right from the level of hysteresis in magnetic particles on a hard disk; how RAM works from the level of individual capacitors; etc.

This reminds me of the famous Carl Sagan quote:

If you wish to make an apple pie from scratch, you must first invent the universe.

Top-down

Top-down is essentially the process followed by Structure and Interpretation of Computer Programs, at a micro and macro level. At the micro level, we are taught to solve problems using “wishful thinking”: first represent the problem using large custom building blocks, then build the pieces those blocks are made out of using language primitives. And at the macro level, the book teaches the workings of Scheme through a series of increasingly complex models.

Bottom-up vs top-down

The problem with the bottom-up approach is it would take decades of studying before you were let near an interpreted language. This approach may have worked in the fifties when there wasn’t as much to learn. But these days you won’t get anything useful done.

The problem with the top-down approach is that the levels we work at these days are so large, you can’t just learn all of your current level and then start to look at the lower levels, because you’d never finish learning all the details of the current level.

Leaky abstractions

Joel Spolsky has something to say on this matter in his piece The Law of Leaky Abstractions. This basically says that all abstractions are leaky, meaning that we cannot rely on them to shield us from the levels below our current level. So, to work at the current level we must learn the lower levels. It’s OK to work at the level of the abstraction, and it will save us time working at that level, but when it breaks down we still need to understand why. As Joel says:

the abstractions save us time working, but they don’t save us time learning.

Magic’s alright, yeah

Some of us don’t like the idea of magic. It feels like cheating. But it’s useful as a placeholder, something to learn about later; not now. It’s useful to delve down to lower levels, understanding how the abstractions we work with every day are implemented, but at some point the law of diminishing returns kicks in. (In some fields, we can’t even assume we’ll ever learn how the magic works. As @sermoa said, “we don’t understand exactly how quantum physics works but we accept that it does and make good use of it”.)

What we need to get to, eventually, is an understanding of the whole stack, but at greater and greater granularities as we get further away from our main level. So, if you’re a Ruby on Rails programmer, you need to know most of Ruby (but not all of it). It’s helpful to know a bit of C, and a bit of how the operating system responds to certain commands. Below that level, it’s good to have a vague understanding of how a computer works. And it’s useful to know that electrical signals can only transmit at a finite speed, and that there’s a physical limit to the number of transistors in a processor. As time goes on you can begin to flesh out these lower levels, remembring that the higher levels are generally going to be more beneficial to your day-to-day work.

Before you’ve delved deeper, those lower levels are magic. And that’s fine.


Thanks to @marcgravell, @ataraxia_status, @sermoa, @bytespider, @ghickman, @SamirTalwar and @josmasflores for their input on this post.

Also, thanks to Dave Fisher and Tim Brookes for being the sources of the ‘Dave’ and ‘Tim’ techniques.

Categories: Programming Tags: ,

Closures explained with JavaScript

April 22nd, 2011 31 comments

Last year I wrote A brief introduction to closures which was meant to help people understand exactly what a closure is and how it works. I’m going to attempt that again, but from a different angle. I think with these kinds of concept, you just need to read as many alternative explanations as you can in order to get a well rounded view.

First-class functions

As I explained in Why JavaScript is AWESOME, one of the most powerful parts of JavaScript is its first-class functions. So, what does “first-class” mean in a programming language? To quote wikipedia, something is first-class if it:

  • can be stored in variables and data structures
  • can be passed as a parameter to a subroutine
  • can be returned as the result of a subroutine
  • can be constructed at runtime
  • has intrinsic identity (independent of any given name)

So, functions in JavaScript are just like objects. In fact, in JavaScript functions are objects.

The ability to nest functions gives us closures. Which is what I’m going to talk about next…

Nested functions

Here’s a little toy example of nested functions:

The important thing to note here is that there is only one f defined. Each time f is called, a new function g is created, local to that execution of f. When that function g is returned, we can assign its value to a globally defined variable. So, we call f and assign the result to g5, then we call f again and assign the result to g1. g1 and g5 are two different functions. They happen to share the same code, but they were executed in different environments, with different free variables. (As an aside, we don’t need to use a function definition to define g and then return it. Instead, we can use a function expression which allows us to create a function without naming it. These are called ‘anonymous functions’ or lambdas. Here’s a version of the above with g converted to an anonymous function.)

Free variables and scope

A variable is free in any particular scope if it is defined within an enclosing scope. To make that more concrete, in the scope of g, the variable x is free, because it is defined within the scope of f. Any global variables are free within the scopes of f and g.

What do I mean by scope? A scope is an area of code where a variable may be defined, without the enclosing scope knowing about it. JavaScript has ‘function scope’, so each function has its own scope. So, any variable defined within f is invisible outside of f. Scopes can be nested, so in the above example, g has its own scope which is contained within the scope of f, which is contained within the global scope. Whenever a variable is accessed, the JavaScript interpreter first looks within the current scope. If the variable is not found to be defined within the current scope, the interpreter checks the enclosing scope, then the enclosing scope of the enclosing scope, all the way up to the global scope. If the variable is not found, a ReferenceError is thrown.

Closures are functions that retain a reference to their free variables

And this is the meat of the matter. Let’s look at a simplified version of the above example first:

It’s no surprise that when you call f with the argument 5, when g is called it has access to that argument. What’s a bit more surprising is that if you return g from the argument, the returned function still has access to the argument 5 (as shown in the original example). The bit that can be a bit mind-blowing (and I think generally the reason that people have such trouble understanding closures) is that the returned g actually is remembering the variable x that was defined when f was called. That might not make much sense, so here’s another example:

When person is called, the argument name is bound to the value passed. So, the first time person is called, name is bound to ‘Dave’, and the second time, it’s bound to ‘Mary’. person defines two internal functions, get and set. The first time these functions are defined, they have a free variable name which is bound to ‘Dave’. These two functions are returned in an array, which is unpacked on the outside to get two functions getDave and setDave. (If you want to return more than one value from a function, you can either return an object or an array. Using an array here is more verbose, but I didn’t want to confuse the issue by including objects as well. Here’s a version of the above using an object instead, if that makes more sense to you.)

And this is the magic bit. getDave and setDave both remember the same variable name, which was bound to ‘Dave’ originally. When setDave is called, that variable is set to ‘Bob’. Now when getDave is called, it returns ‘Bob’ (Dave never liked the name ‘Dave’ anyway). So getDave and setDave are two functions that remember the same variable. This is what I mean when I say “Closures are functions that retain a reference to their free variables”. getDave and setDave both remember the free variable name. Even though person has now returned, the variable name lives on, because it is referenced by getDave and setDave.

The variable name was bound to ‘Dave’ when person was called the first time. When person is called a second time, a new version of name comes into existence, as well as new versions of get and set. So the functions getMary and setMary are completely different to the functions getDave and setDave. They execute identical code, but in two different environments, with different free variables.

Summary

In summary, a closure is a function called in one context that remembers variables defined in another context – the context in which it was defined. Multiple closures defined within the same context remember that same context, so changes they make are shared between them. Every time a function is called that creates closures, a new, shared context is created for the new closures.

The best way to learn is to play, so I’d recommend editing the fiddles above (by clicking on the + at the top right) and having a fiddle. Enjoy!


Addendum: Lasse Reichstein had some useful comments on this post on the JSMentors mailing list – read them here. The important thing to realise is that a closure actually remembers its environment rather than its free variables, so if you define a new variable in the environment of the closure after the closure’s definition, it will be accessible inside the closure.

Seven steps towards becoming a professional developer

April 9th, 2011 11 comments

When you first start learning to program, the best thing you can do is just write code, and lots of it. Coding is a muscle, and it needs to be exercised.

That’s just a start though. If you enjoy it, and it’s something you want to do professionally, then there are some essential practices that you need to learn, and make part of your routine.

This list is split into two parts. The first items are directly related to coding. They all take some time to setup and master, but once you’ve got to that stage you’ll get nothing but good out of them. Think of them as an investment in your future.

The second bit is about the world beyond coding, and they mostly involve interacting with real people. If that’s not a problem, then great, but if it is, you’re gonna have to get over it, because dealing with people is a surprisingly big part of this job.

Coding

  1. You need to be testing your code. You won’t get everything perfect the first time, so at some point you’ll need to refactor. And if you refactor without tests, “you’re not refactoring; you’re just changing shit”.

  2. Your code needs to be under version control. I use git, but there are plenty of other options. Code that’s not under some form of version control may as well not exist. And using something with easy branching like git makes it much easier to experiment with big changes to your code that might not work out.

  3. Learn to touch-type. It’ll take a bit of effort, but it’s worth it. If you’re serious about being a professional, then you’re going to be spending a lot of your time on a keyboard. Learn to use it properly. Steve Yegge wrote a great blog post on touch-typing. If you want to learn, I don’t know of anything better than GNU Typist.

Everything else

  1. Start a blog. Write about code you’ve written, problems you’ve solved, things you’re learning about. It’s not for anyone else, just you. The act of putting your thoughts into words forces your brain to look at issues from a different angle. There may come a point that people start finding your content and making use of it, so it’s worth making it public.

  2. Join Twitter. Find local developers and the big names in whatever languages you’re currently learning. Twitter’s like a global virtual watercooler. People talk a lot of shit, but some of the most important conversations happen there as well, as well as job offers, so if you’re not involved you could miss out.

  3. Use and participate in Stack Overflow. You can get a lot of useful guidance if you learn how to ask questions sensibly, and you can learn loads by answering questions as well. Also, once you’ve got a good amount of answered questions, Stack Overflow becomes a great showcase of your knowledge and writing.

  4. Find a local user group. Google for the name of your town or city, the name of a language you’re interested in, and ‘user group’ (in Ruby that would be "#{city_name} #{language} user group"). There’ll be something out there.

That all looks too much like hard work to me

If you’re just starting out, these lists may seem a bit daunting. You don’t need to do everything at once, but if you’re taking this seriously then at some point you should be doing them all.

Categories: Miscellaneous Tags:

id_barcode – a barcode maker for Adobe InDesign

March 4th, 2011 6 comments

id_barcode is multiple things. It’s an example of writing structured, modular code for an InDesign plugin. It’s a demonstration of test-driven development for producing InDesign plugins. It also makes barcodes.

InDesign barcode maker

First off, this is a barcode maker for InDesign. It’s fairly simple – just put in your ISBN, and choose the font for the ISBN and for the numbers under the bars.

The script file is available to download here. If you don’t know how to install an InDesign script, see these instructions on InDesign Secrets.

Here it is:

id_barcode dialog

and this is what the barcode looks like:

Pretty wild.

Writing modular InDesign plugins

I split id_barcode across four files:

Really barcode_main.js should be split up into two files, one for the GUI and one for rendering the barcode.

Each part of the code does one thing. barcode_library doesn’t know anything about InDesign, all it does is produce a data structure giving the relative bar widths. The GUI code doesn’t contain any logic, it just lets the user enter the relevant details. Then the rendering code takes the barcode from the GUI, uses the library to get the bar widths, and draws the barcode with these widths and the fonts selected by the user.

Because I wanted to make the plugin easy to use, I wrote a shell script to concatenate the relevant JavaScript files and remove the #import lines. This was pretty straightforward.

Test-driven development

I’ve been reading Test-Driven JavaScript Development – an excellent book. I used some of the basics from that to construct a very basic testing framework, contained in barcode_test.js. Also in that file are a number of tests used to ensure that barcode_library.js was doing the right thing. Every time I changed something, or refactored, rather than checking to see if the barcodes still came out correctly I could just run the tests and get instant feedback. I haven’t written tests for the rendering code, but that would be a good next step.

Next steps

So far this plugin is very basic. You can’t customise anything apart from the fonts, so if you standardly use a different barcode design you’d need to do manual work to each barcode. The error handling’s a bit rubbish as well. If I make it more customisable, I think it would be worth test-driving the layout changes. The most important thing is to write a test that ensures the bars are all the correct width. I’d then have to start delving further into InDesign’s object model, which is a bit of a mess. We’ll see.

If anybody’s interested in contributing you can fork me on github!

Categories: Programming Tags: ,

A short word on nomenclature

February 26th, 2011 17 comments

These symbols have accepted names (in the context of programming):

  • [ ] These are opening and closing brackets. They are square, so are sometimes called square brackets.
  • { } These are opening and closing braces. You could call them curly braces, but there’s no need.
  • ( ) Theses are parentheses. One is a parenthesis. If those are a bit of a mouth- or keyboardful, call them parens. Don’t call them brackets or I’ll destroy your keyboard.
  • ~ This is a tilde.
  • | This is a pipe.

These symbols have multiple accepted names.

  • # I call this a hash because I’m British. Americans call it a pound sign. Pedants call it an octothorpe.
  • * Asterisk, star, splat. Definitely not an “asterix”.
  • ! Exclamation mark/point, bang.
  • < > Less-than and greater-than, but may also be called angle brackets.
  • . Full-stop, period, dot.
  • ? Question mark, query.
Categories: Programming Tags:

Why you should learn brainfuck (or: learn you a brainfuck for great good!)

February 20th, 2011 23 comments

Before I begin, a little disclaimer: there’s no way to write about brainfuck without offending someone. Some people will be upset about the swearing, others about the censorship. I’ve come up with a partial solution: Decensor this article (decensor plugin available on github) If you’re reading this in a feed reader then sorry for the swearing.

What is brainfuck?

Brainfuck is close to being the simplest programming language possible, with only 8 instructions:

> < + - , . [ ]

These instructions move an internal data pointer, increment and decrement the value at the data pointer, input and output data, and provide simple looping.

As an aside: brainfuck was written with the intent of having a language with the smallest-possible compiler. Many compilers for brainfuck are smaller than 200 bytes! See Wikipedia for more details.

Why would I want to learn such a stupid language?

It’s been suggested that you should learn at least 6 programming languages to be a good programmer. The more programming languages you know, the more perspectives you’ll have on problems. Brainfuck is such a simple language to learn, there’s almost no reason not to learn it. If you read the whole of this blog post now, and try out all the code samples in the interpreter, in half-an-hour to an hour you’ll have a new programming language under your belt. Tell me that’s a waste of time :)

Another good reason to learn brainfuck is to understand how basic a Turing-complete programming language can be. A common argument when programmers compare languages is “well they’re all Turing-complete”, meaning that anything you can do in one language you can do in another. Once you’ve learnt brainfuck, you’ll understand just how difficult it can be to use a Turing-complete language, and how that argument holds no water.

Ok, you’ve convinced me. Teach me!

Good. I knew you were one of the smart ones.

Brainfuck doesn’t have variable names – just one long array of cells, each of which contains a number:

address: 0  1  2  3  4  5  6  ...
value:  [0][0][0][0][0][0][0][...]

Each cell is initialized to zero. A data pointer starts off pointing to the first cell:

pointer: v
address: 0  1  2  3  4  5  6  ...
value:  [0][0][0][0][0][0][0][...]

Moving the data pointer and modifying the data

A brainfuck program is a stream of the 8 commands (other characters are allowed in this stream but are ignored). Here’s a basic brainfuck program:

>>>>++

The greater-than command moves the data pointer one cell to the right. The plus command increases the value of the cell under the data pointer by 1. Here’s what the data looks like after we run the above program:

pointer:             v
address: 0  1  2  3  4  5  6  ...
value:  [0][0][0][0][2][0][0][...]

The data pointer moved four cells to the right, and the value of that cell was increased by 2. Pretty simple.

You can probably guess what the less-than and minus commands do:

program: >>>>++<<++-

pointer:       v
address: 0  1  2  3  4  5  6  ...
value:  [0][0][1][0][2][0][0][...]

So, move four cells right, increment twice, move two cells left, increment twice and decrement once.

Well done, you’ve already learnt half of the syntax of brainfuck.

I/O

A programming language isn’t much use without the ability to input and output data. Brainfuck has two commands for I/O – , (comma) and . (full-stop/period). The comma inputs a character from the input into the current cell, and the period outputs the character in the current cell. The ASCII code of the character is used on input and output. You may not be used to interchanging characters and integers, so it’s worth having a look at this chart to see how they map.

Here’s a simple program that inputs a character, increments it once, then outputs it:

program: ,+.
input:   a

pointer:  v     
address:  0  1  2  3  4  5  6  ...
value:  [98][0][0][0][0][0][0][...]

The program takes a character from the input, in this case the letter ‘a’, and puts it into the first data cell. It then increments it, and outputs the value of the cell. A lowercase ‘a’ has the ASCII code 97, and ‘b’ has the ASCII code 98.

Try it out. Go to my brainfuck interpreter, put the string ,+. into the “Code” box and the letter ‘a’ into the “Input” box (or use this link), then press “Run”. Try it with different input, then try using more pluses or minuses.

Every time you use the comma command you remove a character from the input stream:

program: ,>,>,.<.<.
input:   abc

pointer:  v     
address:  0   1   2  3  4  5  6  ...
value:  [97][98][99][0][0][0][0][...]

This program puts the first character of input in the first cell, the second character in the second cell and the third character in the third cell (although it looks like an evil dragon smiley). Try it out to see what happens.

Looping

Ok, that’s three-quarters of the language covered now. All that’s left is looping. This is a bit trickier, but fairly simple once you’ve got the hang of it.

[ and ] (left and right brackets) are used for looping. Anything in between pairs of brackets will loop (you can have nested loops – the pairs match like parentheses (i.e. this is the equivalent of an inner loop) and this is the outer loop (did you see what I did there? (inner inner!))). Of course there’s no point looping forever, so brainfuck needs a way of knowing when to stop. It does this by checking the value of the current data cell to see if it is zero. If it is, execution will skip to after the end of the loop.

To illustrate this I’m going to need to show the position of the current instruction. This is called the instruction pointer, or i_ptr in the figure below.

To start with, the ‘z’ is read into the first cell:

i_ptr:   v
program: ,[.-]
input:   z

pointer:  v     
address:  0   1  2  3  4  5  6  ...
value:  [122][0][0][0][0][0][0][...]

The instruction pointer moves to the next instruction in the program. It checks to see if the value in the current data cell is zero. It isn’t, so the loop is entered.

i_ptr:    v
program: ,[.-]
input:   z

pointer:  v     
address:  0   1  2  3  4  5  6  ...
value:  [122][0][0][0][0][0][0][...]

The value in the current data cell is output (‘z’), then decremented:

i_ptr:      v
program: ,[.-]
input:   z

pointer:  v     
address:  0   1  2  3  4  5  6  ...
value:  [121][0][0][0][0][0][0][...]

The instruction pointer reaches the end of the loop and jumps back to the beginning:

i_ptr:    v
program: ,[.-]
input:   z

pointer:  v     
address:  0   1  2  3  4  5  6  ...
value:  [121][0][0][0][0][0][0][...]

The current cell is still not zero, so the loop is entered again, the new value is output (‘y’), and decremented again. This keeps on going until the value of the first data cell is zero. At this point, the instruction pointer jumps to after the end of the loop, which also happens to be the end of the program:

i_ptr:        v
program: ,[.-]
input:   z

pointer: v     
address: 0  1  2  3  4  5  6  ...
value:  [0][0][0][0][0][0][0][...]

Try it out to see what happens.

Refresher

And that’s the entire language. Here’s a quick recap:

  • > Move the data pointer one cell to the right
  • < Move the data pointer one cell to the left
  • + Increment the value of the cell at the data pointer
  • - Decrement the value of the cell at the data pointer
  • , Take a character from the input and place its value into the current data cell
  • . Output the value of the current data cell as a character
  • [ If the current data cell is zero, skip to after the closing bracket, otherwise continue
  • ] Skip back to the matching opening bracket (a common optimization is to skip over this instruction if the current cell is zero, rather than going back to the opening bracket and checking)

The end

If you’ve read the whole blog post and tried out all the code, you now know brainfuck. Well done, that’s another item for your CV! If you try making anything relatively complex you’ll realise the true value of the abstractions we build software on. All languages have different levels of abstraction, and you should be working at the highest level you can afford. It’s worth dropping down to the level of brainfuck occasionally so you can really appreciate the abstractions you’re given.

Further reading

Categories: Programming Tags: ,

Zen and the art of statefulness

February 12th, 2011 38 comments

The venerable master Qc Na was walking with his student, Anton. Hoping to prompt the master into a discussion, Anton said “Master, I have heard that objects are a very good thing – is this true?” Qc Na looked pityingly at his student and replied, “Foolish pupil – objects are merely a poor man’s closures.”

Chastised, Anton took his leave from his master and returned to his cell, intent on studying closures. He carefully read the entire “Lambda: The Ultimate…” series of papers and its cousins, and implemented a small Scheme interpreter with a closure-based object system. He learned much, and looked forward to informing his master of his progress.

On his next walk with Qc Na, Anton attempted to impress his master by saying “Master, I have diligently studied the matter, and now understand that objects are truly a poor man’s closures.” Qc Na responded by hitting Anton with his stick, saying “When will you learn? Closures are a poor man’s object.” At that moment, Anton became enlightened.

Anton van Straaten

When I first read the above koan some time ago, I didn’t really understand it. I had a very basic idea of closures, but at the time they were just a syntactic oddity to me – something you could do a few cool things with, but not particularly useful. Since then I’ve worked through quite a bit of Structure and Interpretation of Computer Programs and delved into functional programming in JavaScript, which has given me a much deeper understanding of closures. On the other side of the divide, I’ve been doing a lot of Ruby programming, which has helped me grok objects a lot better. I now feel like I can begin to comprehend what the quote was getting at.

My moment of enlightenment came today while reading Test-Driven JavaScript Development and looking at the code for a JavaScript strftime function (abbreviated here for brevity):

Date.prototype.strftime = (function () {
  function strftime(format) {
    var date = this;

    return (format + "").replace(/%([a-zA-Z])/g, function (m, f) {
      //format date based on Date.formats
    });
  }

  // Internal helper
  function zeroPad(num) {
    return (+num < 10 ? "0" : "") + num;
  }

  Date.formats = {
    // Formatting methods
    d: function (date) {
      return zeroPad(date.getDate());
    },
    //...
    //various other format methods
    //...

    // Format shorthands
    F: "%Y-%m-%d",
    D: "%m/%d/%y"
  };

  return strftime;
}());

The above code uses an IIFE (Immediately-invoked function expression) to produce a function with additional data (if Date.formats was instead declared as a local variable, this would be a better example). If it doesn’t make sense, I thoroughly recommend Ben Alman’s post on IFFEs for an overview of the technique. The code executes, and returns a function. The important thing is that one function is used to define another function and its context.

In Ruby, when you define a class, the class definition is executed as Ruby code, unlike in Java, for example, where a class definition is just a syntactic construct read at compile-time, but not executed in the way other Java code is. A Ruby class definition is read at run-time, and builds up a new class as it is interpreted.

In a lot of ways, Ruby class definitions and JavaScript function-defining functions are equivalent. I’ll give you a little example to illustrate:

zen.js

var Cat = function () {
  var age = 1;

  function catYears() {
    return age * 7;
  }

  function birthday() {
    age++;
  }

  return {
    catYears: catYears,
    birthday: birthday
  }
};

var powpow = Cat();
var shorty = Cat();

//yo shorty, it's your birthday:
shorty.birthday();

alert(powpow.catYears()); // => 7
alert(shorty.catYears()); // => 14

zen.rb

class Cat
  def initialize
    @age = 1
  end

  def cat_years
    @age * 7
  end

  def birthday
    @age += 1
  end
end

powpow = Cat.new
shorty = Cat.new

#yo shorty, it's your birthday:
shorty.birthday

puts powpow.cat_years # => 7
puts shorty.cat_years # => 14

Strictly speaking, in zen.js I’m returning an object, but the private data and methods of that object are saved in a closure. So, zen.js stores its state in a closure, while zen.rb stores its state in an object. Every time Cat() is called in zen.js, a new closure is created with a unique age. Every time Cat.new is called in zen.rb, a new object is created with a unique @age. (These two examples aren’t strictly equivalent – each cat in zen.js gets a new copy of the functions, whereas in zen.rb they share the same methods. It’s possible to make the JavaScript version function more like a Ruby class definition, but it takes a bit more code.)

Of course, there’s a lot you can do with JavaScript closures that you can’t do with Ruby objects. And there’s a lot you can do with Ruby objects that can’t be emulated using JavaScript closures. Any time you decide that one is better than the other, just imagine Qc Na hitting you with his stick.

Further reading

Really really simple Ruby metaprogramming

February 5th, 2011 12 comments

Metaprogramming in Ruby has the reputation of being something only the true zen Ruby masters can even hope to understand. They say the only true way to learn metaprogramming is to train with Dave Thomas in his mountain retreat for five years – the Ruby equivalent of this XKCD:

But it’s not that bad. In fact, I’m going to go so far to say that it’s possible to learn to metaprogram in Ruby without even meeting Dave Thomas. Yeah, I know, it doesn’t sound possible. But just you wait…

What is metaprogramming?

Metaprogramming is what makes Ruby awesome. It’s writing code to write code. It’s dynamic code generation. It’s the move from imperative to declarative. It’s a rejection of Java’s endless repetition of boilerplate code. It’s the living embodiment of DRY. Here’s an example:

class Monkey
  def name
    @name
  end

  def name= n
    @name = n
  end
end

I can see you there, in the middle of the classroom, with one arm straining up, the other one holding it up because it’s so damn hard to hold up. OK, Jimmy, what is it? Oh, you don’t need to write all that code in Ruby? You can just use attr_accessor?

Jimmy’s right. That code snippet above could be written like so:

class Monkey
  attr_accessor :name
end

So, attr_accessor‘s magic right? Well, actually, it’s not. Just like in Scooby-Doo where what looked like magic to start off with turned out to be an elaborate hoax, attr_accessor is just a class method of Module. It defines the name and name= methods on Monkey, as if you’d manually defined them.

And that’s all metaprogramming is. Just a way of adding arbitrary code to your application without having to write (or copy-paste) that code.

An (extended) example

The source code for this example is available in my StringifyStuff plugin on github, which in turn was adapted from Ryan Bates’ Railscast Making a Plugin. Incidentally, if you feel you can improve the plugin, fork me on github!

I’m assuming basic knowledge of Ruby, and some familiarity with Rails would be helpful as well, but not essential.

Have a quick peek at the github repo now – the important files to look at for now are init.rb and lib/stringify_time.

First, stringify_time. This file defines a module called StringifyTime, which defines a method called stringify_time:

module StringifyTime
  def stringify_time *names
    #crazy stuff going on in here
  end
end

The other two files, stringify_stuff and stringify_money are similar.

Now, init.rb:

class ActiveRecord::Base
  extend StringifyStuff
  extend StringifyTime
  extend StringifyMoney
end

This extends ActiveRecord::Base with the three modules listed. This now means that any class that inherits from ActiveRecord::Base (e.g. your models) now has the methods defined in each of those modules as class methods.

The StringifyStuff plugin is used like so:

class Project < ActiveRecord::Base
  stringify_stuff
  stringify_time :start_date, :end_date
end

stringify_time is passed a list of symbols representing the relevant model attributes. It will provide useful accessors for those attributes. Let’s have a look at a simplified version of stringify_time:

module StringifyTime
  def stringify_time *names
    names.each do |name|

      define_method "#{name}_string" do
        read_attribute(name) &&
          read_attribute(name).strftime("%e %b %Y").strip()
      end

    end
  end
end

stringify_time is passed a list of symbols. It iterates over these symbols, and for each one calls define_method. define_method is the first clever Ruby metaprogramming trick we’re going to look at. It takes a method name and a block representing the method arguments and body, and magically adds an instance method with that name, arguments and body to the class in which it was called. In this case, it was called from Project, so this will gain a new method with the name "#{name}_string". In this example, we passed in :start_date and :end_date, so two methods will be added to Project: start_date_string and end_date_string.

So far so good. Now though, it gets a little bit hairy, so hold onto your horses (or equivalent equid).

In a normal model instance method, you’d access an attribute using a method of the same name. So if you wanted to get the start_date converted to a string, you’d write:

def my_method
  start_date.to_s
end

The problem with doing this in define_method is that we don’t have start_date as an identifier – it’s held as a symbol. There are two ways to accomplish the above if start_date was passed in as a symbol:

def my_method attr_sym
  read_attribute(attr_sym).to_s #This is a Rails method
end

or:

def my_method attr_sym
  send(attr_sym).to_s #Ruby built-in method
end

For simplicity, I’m using write_attribute, but send is useful to know about too.

So, back to define_method:

      define_method "#{name}_string" do
        read_attribute(name) &&
          read_attribute(name).strftime("%e %b %Y").strip()
      end

So, each time round the loop, name will be one of the symbols passed into stringify_time. Let’s go with start_date to see what happens. define_method will define a new instance method on the Project class called start_date_string. This method will check to see if there is a non-nil attribute called start_date, and if so, call strftime on it. This formatted date will be the return value of the method.

Wrap up

That’s more than enough for one day (I’ve been told off for writing kilo-word posts before). I’ll explain the workings of the rest of the plugin in a future post. If you want to learn more, I highly recommend David Flanagan’s The Ruby Programming Language.

Metaprogramming is writing code that gives you more code. It’s a bit like the Sorcerer’s Apprentice; in fact, I think that should be recommended watching for this subject.

Be the Sorcerer. Don’t be Mickey.

Categories: Programming Tags: , ,

Ruby vs JavaScript: functions, Procs, blocks and lambdas

January 15th, 2011 103 comments

In my post Why JavaScript is AWESOME I compared Ruby to JavaScript like this:

A language like Ruby is a toolbox with some really neat little tools that do their job really nicely. JavaScript is a leather sheath with a really really sharp knife inside.

What I was partly getting at was how the two languages handle the passing around of code. Both have their own way of working with anonymous functions. They’ve taken very different approaches, so if you’re moving from one language to the other it can be confusing. I’d like to try to explain the differences. Let’s look at JavaScript first.

JavaScript functions

JavaScript has two ways of defining functions – function expressions (FE) and function declarations (FD). It can be a confusing distinction because they look the same. The difference is that a function declaration always has the word function as the first thing on the line.

//Function declaration:
function doSomething() {
    alert("Look ma, I did something!");
}

//Function expression:
var somethingElse = function () {
    alert("This is a different function");
};

The FE can optionally contain a name after the function keyword so it can call itself. I’m not going to go into depth on FD vs FE, but if you’re interested in learning more read Ben Alman’s piece on Immediately-Invoked Function Expressions, which has a good description and some useful links.

You can think of an FE as a function literal, just like {a: 'cat', b: 'dog'} is an object literal. So they don’t just have to be assigned to variables – they can be passed as function arguments, returned from functions, and stored in data structures (objects and arrays). In the end though, there’s only one function type – doSomething and somethingElse are the same type of object.

Ruby

Ruby is at the opposite end of the scale to JavaScript. Instead of having just the one function type, it has multiple types: blocks, Procs, and lambdas. (There are also methods and method objects but that’s a different story.)

Blocks

A block in Ruby is a chunk of code. It can take arguments via its funky pipe syntax, and its return value is whatever the last line evaluates to. Blocks aren’t first-class citizens in Ruby like functions are in JavaScript – they can only exist in one place – as the last argument to a method call. They’re very much a special syntactic construct baked right into the language. All methods can received blocks whether they include a parameter for one or not – it will be received anyway, and you can interact with it without it having a name. It can be called with the yield keyword, and you can check to see if a block was supplied with block_given?. Robert Sosinski gives a good example of using blocks from the point of view of the caller and that of the receiver in his post Understanding Ruby Blocks, Procs and Lambdas:

class Array
  def iterate!
    self.each_with_index do |n, i|
      self[i] = yield(n)
    end
  end
end

array = [1, 2, 3, 4]

array.iterate! do |n|
  n ** 2
end

Procs

If you need a block of code to act as a first-class citizen, you need to turn it into a Proc. That can be achieved with Proc.new. This takes a block and returns a Proc. Pretty cool eh? Procs are useful where you’d want to use a block but you’re not passing it as the last argument to a method. Rails provides a good example:

class Order < ActiveRecord::Base
  before_save :normalize_card_number,
    :if => Proc.new { |order| order.paid_with_card? }
end

The :if => syntax shows that we’re passing a hash to before_save. We can’t put a block as a hash value, so we need to make a Proc instead. Other callbacks can just take a raw block:

class User < ActiveRecord::Base
  validates_presence_of :login, :email

  before_create do 
    |user| user.name = user.login.capitalize if user.name.blank?
  end
end

For more discussion on Procs and blocks have a look at Eli Bendersky’s post: Understanding Ruby blocks, Procs and methods.

Lambdas

A lambda in Ruby is probably the closest thing to a function expression in JavaScript. A lambda is created similarly to Proc.new:

lambda { |x| x ** 3 }

Both lambda and Proc.new take a block and return an object. However, there are important differences in how they deal with their arguments and how they deal with return and break. To talk about that though I’ll need to go back to basics to discuss the reasons for using these self-contained chunks of code.

The reasons for using anonymous functions

There are a lot of good reasons to support anonymous functions in a programming language. In Ruby and JavaScript, the uses generally fall under two rough categories: iteration and deferred execution.

Iteration with anonymous functions

Iteration has a lot of uses – you can sort a collection, map one collection to another, reduce a collection down into a single value, etc. It basically comes down to going through a collection and doing something with each item, somehow. I’m going to show a very basic example of iteration – going through a list of numbers and printing each one.

In Ruby, there are two common ways to work through an array or other enumerable object, doing something to each item. The first is to use the for/in loop, which works like this:

arr = [1, 2, 3, 4]
for item in arr
  puts item
end

The second is to use the each iterator and a block:

arr = [1, 2, 3, 4]
arr.each do |item|
  puts item
end

Using iterators and blocks is much more common, so you’ll usually see it done like this.

JavaScript has a for/in loop as well, but it doesn’t really work very well on arrays. The standard way to iterate over an array is with the humble for loop:

var arr = [1, 2, 3, 4];
for (var i = 0, len = arr.length; i < len; i++) {
    console.log(arr[i]);
}

Many libraries offer support for something like Ruby’s each iterator – here is the jQuery version:

var arr = [1, 2, 3, 4];
$.each(arr, function(index, value) { 
    console.log(value); 
});

There’s an important difference here between the way Ruby and JavaScript handle the each iterator. Blocks in Ruby are specially designed for this kind of application. That means that they’ve been designed to work more like a looping language construct than an anonymous function. What does this mean practically? I’ll illustrate with an example:

arr = [1, 2, 3, 4]
for item in arr
  break if item > 2 #for non Rubyists, this is just like a compact if statement
  puts item
end

arr.each do |item|
  break if item > 2
  puts item
end

These two code snippets do the same thing – if item is greater than 2 iteration stops. On one level, this makes perfect sense – the each iterator takes a block of code, and if you want to stop iterating you break, as you would in any other language’s native foreach loop. On another level though, this doesn’t make any sense – break is used for loops, not anonymous functions. If you want to leave a function, you use return, not break.

Interestingly, if you use return inside a Ruby block, you don’t just return from the block, but the containing method. Here’s an example:

def find_first_positive_number(arr)
  arr.each do |x|
    if x >= 0
      return x
    end
  end
end

numbers = [-4, -2, 3, 7]
first = find_first_positive_number(numbers)

When arr.each gets to 3, find_first_positive_number will return 3. This demonstrates that in blocks in Ruby, the break and return keywords act as if the block was just a looping construct.

jQuery.each, on the other hand, is working with normal functions (there’s nothing else to work with in JavaScript). If you return inside the function passed to each, you’ll just return from that function, and each will move onto the next value. It’s therefore the equivalent of continue in a loop. To break out of the each entirely, you must return false.

So, in Ruby, break, continue and return work in the same way whether you’re using the for/in looping construct or using an iterator with a block. With a jQuery iterator, return is the equivalent of continue, return false is the equivalent of break, and there’s no equivalent of return without putting extra logic outside the loop. It’s easy to see why Ruby blocks were made to work this way.

Deferred execution

This is where a function is passed to another function for later use (this is often called a callback). For example, in jQuery.getJSON:

$.getJSON('ajax/test.json', function(data) {
    console.log(data);
});

The anonymous function isn’t executed until the JSON data comes back from the server. Similarly, with Rails:

class User < ActiveRecord::Base
  validates_presence_of :login, :email

  before_create do |user|
    user.name = user.login.capitalize
  end
end

The block won’t be executed immediately – it’ll be saved away, ready to be executed as soon as a new User is created. In this case, break no longer makes sense – there’s no loop to break out of. return generally also doesn’t make sense in this case, as it may try to return to a function that has already returned. For more details on this see my question on Stack Overflow. This is where lambdas come to the rescue. A lambda in Ruby works much more like a Ruby method or a JavaScript function: return just leaves the lambda; it doesn’t try to exit the containing method.

So why use a block as a callback? Probably the best reason is that Ruby makes blocks so easy, so it’s simpler just to pass a block to the before_create method than to create a lambda and pass that (the before_create internals will be simpler as well).

Pros and cons

As I said at the beginning, Ruby and JavaScript have taken two extremely different approaches to functions. There’s a whole different discussion about methods across the two languages as well (JavaScript uses the same function type for methods, whereas Ruby has yet another type) but I’m not going to go there. Ruby has something different for every situation, and each one is optimised for its primary use-case. This can lead to confusion but can also make code easier to read. On the other hand, in JavaScript, a function is a function is a function. Once you know how they work, it’s easy.

There’s no point arguing about which of these two approaches is better – there’s no answer to that. Some people will prefer one over the other, but I like them both. I love the way blocks in Ruby make certain kinds of functional-like programming just roll off the fingers, but I also love the way JavaScript doesn’t complicate things, and makes proper functional programming much more possible.

If you made it this far, good on ya. I didn’t mean to write such a mammoth post, but sometimes these things happen. Cheers!