Talks ∋ My first ruby

I gave this talk at the June 2010 meeting of the London Ruby User Group. Accordingly some of the things in it might be out of date, particularly references to versions of software.

The transcript below comes from the presenter notes so represents a perfect version, which is unlikely to be exactly what I said on the night.

A video of this talk is available from Skills Matter:

a screenshot of the video for the talk from the skills matter website

A photo of a drawing of a ruby in red crayon with the crayon on top of the drawing; text: My first ruby by Murray Steele (aged 31 ½)

I’m going to talk to you about the first ruby script I ever wrote. It’s called MY first ruby, but I promise it won’t all be about me. There’ll be a little bit of personal history, just so you can get an idea of what sort of programmer I was at the time, but most of the talk will focus on the script itself.

My 2010 passport photo and the logos of my employer (Unboxed Consulting) and preferred programming language (ruby) from 2010; text: Me in 2010, employed by Unboxed Consulting, codes mostly in Ruby, passport photo

First the personal history, which I promise won’t take up much time. This is me now. But I wrote this script in 2003, not 2010, so we have to go back in time. Step into my time portal and come with me to the late summer of 2003…

My 2003 passport photo and the logos of my employer (Applied Psychology Research) and preferred programming languages (Java and wxPython) from 2003; text: Me in 2003, employed by Applied Psychology Research, codes mostly in Java and wxPython

Cast your mind back. Things were different then. Summers were longer, the sun was brighter, I had more hair, and ruby 1.8.0 (the grandfather of the version you all know and love) had only just been released that August.

I was in the dying stages of my first job out of university. I was interviewed to work primarily as a Java programmer; but I ended up doing a load of C++, a bit of VB and some Python GUI development. It was a great company, but the management decided to move the company offices to Cambridge, which I felt meant I was living the wrong way around; you commute into London, not out of it. I’d never done any Ruby programming, but I did have a taste for dynamic languages having done a lot of (not very OO) Python programming (we used tuples a lot). I still considered myself a Java programmer, not least because I’d just accepted a job at a Java-based SMS gateway company.

A photo of my friend James Adam, with a speech bubble containing his first description of ruby to me; text: I FORSAKE JAVA! I RENOUNCE C++! ALL HAIL the Mighty RUBY!! Riddle: 2.upto(10) { |num| print "ruby is #{num} times better than java or c++" }  Work it out…if you dare.

I heard of Ruby because my friend, James Adam (pictured), had been doing a PhD instead of working and while he flirted with Java and Python for a couple of early instalments of his thesis code, he finally settled in about 2001 on Ruby for the majority of his code, and wouldn’t stop going on about it to anyone that would listen.

The prime vehicle for James’ evangelism of Ruby was to send messages to a mailing list that our university classmates had set up to keep in touch after we graduated.

A screenshot of the egroups interface in Netscape navigator

This mailing list was originally running on egroups. Which was fine. It hosted files and polls and had a web interface. It was pretty sweet for the year 2000.

A screenshot of the Yahoo! Groups interface in Internet Explorer

Then they got bought by Yahoo! and it became Yahoo! Groups! Basically the same service, but it had a red and blue logo instead of a purple one, and yellow was banished from the UI.

However, and here’s where we finally get to the ruby script, Yahoo! Groups! started dropping emails intermittently, or taking several days for emails to show up.

As you can imagine, this sort of delay in the inane ramblings of a group of 20-somethings debating the merits of their first jobs during the dot-com era was TOO. MUCH. TO. BEAR!

A screenshot of the Yahoo! Groups interface in Internet Explorer with the word Cancelled stamped on it; text: cancelled

So despite thousands of available choices, and even a single click install of mailman on our shared hosting1, we decided to write our own replacement.

James somehow convinced us that we should write it in ruby even though he was the only one who knew any ruby. James knocked up a simple test script as a proof of concept and we decided to go ahead.

A photo of me dressed up fancy with a speech bubble proclaiming my readiness to program in ruby; text: I am all dressed up ready to code me some Ruby. YEAH!!!

And so, this talk is called “My First Ruby”, and while it’s true that it’s my first ruby, it’s not like I wrote it alone. After James wrote the prototype we both had a free weekend on the 13th September, 2003 and knocked up the initial version together.

A photo of me dressed up fancy, expanded to reveal James standing near by with a speech bubble explaining that heʼll help me; text: You donʼt even know Ruby, Iʼll make sure you do it right.

My first ruby script was written under the guidance of someone who had been using it for a couple of years. This probably isn’t that different to many of you though; any code you’ve written has hopefully had the benefit of other people at your job working on the same project, if you’re lucky you’ve even been pair programming with them. Even if it’s code you’ve written at home, chances are you’ve put it up on github and have the chance of thousands of rubyists looking at it. Don’t be scared of letting other people look at your code.

A photo of a hand holding open a book, the pages show (Go) code; text: https://www.flickr.com/photos/ajstarks/4196202909/

Speaking of reading code, Chris Lowis, LRUG’s resident podcaster, recently wrote a great blog post about open source rails projects and what you can learn from reading the source of them. I think it goes both ways; if you read other people’s code you learn a lot, and if you let other people read your code not only do they learn from you, but so will you when they critique it. They'll suggest patches for edge-cases you didn’t cover or even a neat re-factoring. It’s also a nice confidence boost when you read some code for say, gemcutter, and you notice something you think could be improved.

Anyway, aside over.

The diff of my first ever ruby commit, itʼs a small change; diff: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-11-my-first-ruby-diff
Gist for code in slide

So. Here’s the first code I committed to the project.

This probably isn’t the actual first bit of ruby I ever wrote. As I already mentioned, James had hacked up a prototype and the first commit to our source control was timed at around lunchtime on the Saturday. It’s 7 years ago, so while I can’t remember exactly what we did that day, I do recall spending a lot of the morning hunting around for a spare ethernet cable. I’m pretty sure we did some hacking on the code before we decided that CVS2 might be a good idea.

Anyway, I think there’s plenty in here that’s worth talking about. Why did I add this “empty constructor”? The commit message says it’s to allow YAML to make the object good. I’m not sure I know what that means, and on the face of it, it looks like I’m just stamping some error checking on here. I suspect however I was just experimenting with all the fun new things that Ruby can do.

The Java logo with a speech bubble and the Ruby logo with a speech bubble.  The contents of the speech bubbles contain the same object constructor code in each language for comparison; Java code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-12-a-constructor-java, Ruby code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-12-b-initializer-rb
Gists for Java code and Ruby code in slide

To compare what this felt like to me in 2003 it’d help to compare the final code to how I’d do the same thing in Java.

(Well, I think so anyway; my Java is rusty). This was pretty amazing! So much less code! First, there’s the fact that by allowing default values in method signatures I can get rid of that entire 2nd constructor. Then there’s using the if statement as a statement modifier, by placing it at the end. I don’t know why, but I’m a massive fan of this format, and I think it’s one of the reasons that I get ruby. It just reads so neatly. Finally, there’s a lack of extraneous syntax.
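To make those two features concrete, here’s a minimal sketch. It is not the code from the slide: the `Message` class and its arguments are invented purely for illustration.

```ruby
# Hypothetical class, invented for illustration; not the class from the slide.
class Message
  attr_reader :subject, :body

  # Default values in the signature stand in for Java's overloaded constructors.
  def initialize(subject = nil, body = "")
    # Statement modifier: the condition trails the action it guards.
    raise ArgumentError, "body is required" if body.nil?

    @subject = subject
    @body = body
  end
end

empty = Message.new                # behaves like a no-argument constructor
full  = Message.new("hello", "hi") # the same method handles both cases
```

One method covers both call styles, which is where most of the saving over the Java version comes from.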

But, you didn’t come here to listen to a talk about why ruby syntax is better than java. And to be fair to Java in 2003, the syntax is nothing like the mess it is now with generics and annotations.

I think having shown my first “committed” ruby, it’s time to talk about the system as a whole rather than go through it and pick holes in every commit of mine.

The BBFC logo for a 15 certificate film; text: 15 for frequent strong language

Now I have to warn those of a sensitive nature, for reasons best left unexplored we decided to call our new mailing list software after a favourite insult from our university days. And I’m not going to be able to avoid saying it or showing it on screen, so I have to warn you;

text: Fucknut

We called it:

Fucknut

Fucknut is a mailing list with an attached web front-end for viewing the archived messages and attachments and managing your user account. Basically, it’s a less accomplished version of mailman.

A hand-drawn architecture diagram of the fucknut system; Emails go into Procmail which reads from procmail.rc that contains a regexp to pipe the email to the fucknut.rb script; this script uses the YAML::Syck, Rmail, and Net::SMTP libraries, and outputs to an Archive database, and a set of users

The main component of fucknut is the part that processes mail, and this is also the oldest part of it as it’s based on James’ initial prototype.

It starts with a .procmailrc file. For those that don’t know, procmail is a UNIX tool that you can get to run against every mail that is delivered to your shell account and the .procmailrc file controls it. You can think of it like a rails routes.rb file for mail (except it doesn’t use ruby or have a nice dsl).

You define a regexp to match against some part of the incoming mail and if it matches you can decide what to do, for example forward the mail to /dev/null, or invoke a script on it (it passes the mail in via STDIN). You also decide if you want to stop processing or continue to see if it matches other rules.

For fucknut we have a rule that matches against the TO (or CC) address, and if it matches the list address we ask procmail to invoke a list handler script for that list.

These handler scripts are slim wrappers that set up the environment for the mailing list processor and then pass it the mail as a ruby object. We use Rmail for this (not the Tmail or Mail gems which you may be more familiar with).

This mail part of fucknut also uses YAML::Syck (which is now the default YAML parser in 1.8.x so you’re just using YAML now3) to deal with some configuration stuff and Net::SMTP to send out email.

We’ll cover this in detail later, but having received an email it stores it in an archive db and then sends it on to the other users on the list.

A snippet of code that shows the handleMessage method; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-16-handlemessage-rb
Gist for code in slide

This is the main method from the mail processing script and it describes the route that the mail takes through the system.

The first thing it does is make sure that the sender of the mail is one of the users. We’re all digital natives so a user is allowed to have several email addresses attached to their account and can post from any of them.

If it’s not a valid user the mail is discarded. If it is a valid user we continue processing it.

The next thing we do is set the from address to the user’s preferred posting address. I might send email from my work account, but I don’t want people using that account to mail me (and this was important at the time because my work email address changed about 4 times as the company underwent furious rebranding every few months at the whim of our VCs) so we tell fucknut to make it seem like all my mail comes from my personal account.
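Those first two steps could be sketched like this. It’s an illustration rather than the real code: the `User` struct, the helper names and the case-insensitive matching are all assumptions of mine.

```ruby
# Hypothetical user record: every address a user may post from,
# plus the one address their list mail should appear to come from.
User = Struct.new(:name, :addresses, :preferred_address)

# Returns the user owning the address, or nil for an unknown sender.
# Matching case-insensitively is an assumption, not confirmed by the original.
def find_sender(users, address)
  wanted = address.downcase
  users.find { |user| user.addresses.any? { |a| a.downcase == wanted } }
end

# Returns the From address to use, or nil when the mail should be discarded.
def from_address_for(users, address)
  sender = find_sender(users, address)
  sender && sender.preferred_address
end

users = [
  User.new("murray", ["murray@work.example", "murray@home.example"], "murray@home.example")
]
from_address_for(users, "Murray@Work.example")  # => "murray@home.example"
from_address_for(users, "stranger@example.org") # => nil
```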

We then massage the subject of the mail to add our list identifier and keep “re:”s down to a minimum.

Then we do various things to the headers, mostly required of us by the 12 hundred RFCs that there are about mailing lists4.

Then we process any attachments to save them to a separate data store. And, because we wrote this in 2003 when many people were still on dialup, remove any attachments over a certain limit.

Then we archive the message to our database (for which read dump the raw mail to disk).

Finally, we go through the complete user list and send the mail out to all the users. Including the sender.

And that’s it. That’s Fucknut at a glance. Now I’ve described the system I’ll go over some of the code that I think is particularly terrible.

A screenshot of an email thread titled “regarding ‘R E :’”, there are 23 messages;

Originally we were just going to add the list name in square brackets to the start of the mail, but then we realised we had to do something to prevent various mail clients messing up the re: re: re: re: stuff. After a tortuous requirements gathering thread we decided to settle on [list name] <original subject> and Re: [list name] <original subject without any re:>. As you might be able to see, this took 3 days and 23 messages to argue about and decide to do the thing we were going to do anyway. A further argument, if you ever needed it, for not starting a bikeshed discussion if you can possibly get away with it.

There was nothing else about this app that involved this level of debate.

A snippet of code that shows the processSubject method; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-18-processsubject-rb
Gist for code in slide

This code is pretty bad. I’m massaging a string, I should be using regexp here. I don’t necessarily agree with using regexp for everything, but all this string manipulation would be much better done with regexp. It wasn’t until a few months into doing ruby professionally (2005-ish I think) that Jon Lim (who I was working with at the time) asked me why I kept using .slice and [] all the time instead of .gsub with a simple regexp. I think this was a hangover from my Java days where regular expressions were perceived as slow and crappy, and strings were immutable so you did everything with StringBuffers.

A snippet of code that shows a refactored version of the processSubject method; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-19-processsubject-refactored-rb
Gist for code in slide

I’m pretty sure that, even though it uses regexps, this is easier to read and makes it clearer what’s going on.
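If you can’t see the slide, here’s a rough sketch of the regexp approach. It is not the code from the gist: the `normalise_subject` name is made up, and I’m assuming that only leading “re:”s need collapsing.

```ruby
# Hypothetical helper, not the method from the slide.
# Produces "[list] subject" for new threads and "Re: [list] subject" for replies.
def normalise_subject(subject, list_name)
  tag = "[#{list_name}]"
  leading_res = /\A(?:\s*re:)+\s*/i

  untagged = subject.gsub(tag, "")       # a String pattern is matched literally
  reply = untagged.match?(leading_res)   # was there at least one leading "re:"?
  original = untagged.sub(leading_res, "").strip

  reply ? "Re: #{tag} #{original}" : "#{tag} #{original}"
end

normalise_subject("Hello", "list")                    # => "[list] Hello"
normalise_subject("RE: Re: [list] re: Hello", "list") # => "Re: [list] Hello"
```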

A snippet of code showing a class, ListConfiguration, that extends from Hash with a comment suggesting this is for convenience; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-20-listconfiguration-rb; text: this is a loosely extended Hash with some little things to make itʼs use more convenient
Gist for code in slide

The next thing we do is add list headers. We store those as part of the system config and have this object, ListConfiguration, described in a comment at the top as “an extended Hash with some little things to make it’s use more convenient”.

You know what, I’ve never ever used a hash in ruby and thought, “Gee, I wish this was more convenient”. I can’t say I even really think that the new 1.9 hash syntax is that much better. It also shouldn’t extend Hash; it should contain a Hash, delegate the bits of the Hash API that I want, and then provide its own methods where I want more convenience.
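A minimal sketch of that contain-and-delegate shape, using Forwardable from the standard library. The delegated methods and the `list_name` key are my guesses at what such a class might need, not what the real ListConfiguration exposes.

```ruby
require "forwardable"

# Hypothetical rewrite, assuming callers only need to read the configuration.
class ListConfiguration
  extend Forwardable

  # Expose only the parts of the Hash API that callers actually need.
  def_delegators :@settings, :[], :fetch, :key?

  def initialize(settings = {})
    @settings = settings
  end

  # An example convenience method; "list_name" is an invented key.
  def list_name
    @settings.fetch("list_name", "unnamed list")
  end
end

config = ListConfiguration.new("list_name" => "example-list")
config["list_name"]       # => "example-list"
config.respond_to?(:each) # => false, the rest of Hash stays hidden
```

The pay-off is that the class has a small, deliberate surface instead of inheriting every method Hash happens to have.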

A snippet of code showing one of the convenience methods from ListConfiguration; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-21-listheaders-rb; text: this method is a bit overkill, but by caching the actual list header object instead of having to get it from the main config hash each time we should save some overhead. the ListHeaders part is the most frequently accessed, because it contains a lot of varied list info.
Gist for code in slide

This is one of those methods. Clearly there’s some premature optimisation going on here. Maybe I misunderstood YAML backed Hashes and thought it would always be hitting the YAML file. Even if I did, don’t optimize until you have to.

Given all that, you know what would be more convenient than having to write that method in the first place…

A snippet of code that is more convenient than the previous method; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-22-listheaders-rb
Gist for code in slide

…this.

We mostly assigned instances of ListConfiguration to a @config variable when we use it, so just treat it like a hash.

Or, if I wanted to save 1 char typing whenever I accessed the list headers…

Another snippet of code that is more convenient than the original method; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-23-listheaders-rb
Gist for code in slide

… we could define a listHeaders method and use that on the @config instance. But, really, the first thing is better, I’ve genuinely no idea what was going on here.

The listHeaders snippet of code again, without the descriptive comment; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-24-listheaders-rb
Gist for code in slide

The final weird thing about this ListConfiguration object is, if we look back at that listHeaders method you’ll notice we don’t use symbols or strings as the keys into the hash. We use constants which are defined at the top of the Config module. For example…

A snippet of code explaining that the constants used in listHeaders are just strings; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-25-listheaders-rb; text: these just resolve to unique strings, used as keys within the config hash
Gist for code in slide

…like these.

Why? I don’t know. It just doesn’t make any sense, until you remember my Java roots where these sorts of “magic” strings would be defined as public static Strings because you’d only want to create that object once (woe betide the Java programmer in the early 2000s who went around creating more objects than they strictly needed to). Thing is, Ruby has symbols which save one char on typing a string and are more idiomatic. I can only assume I didn’t know about symbols when I wrote this. I probably wanted some comfort that I wasn’t spelling a config key wrong and causing a nil error so shied away from using strings. Using constants means I’d get a runtime error saying there’s a missing constant instead of a nil error somewhere down the line.
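For what it’s worth, you can keep that comfort with symbols by using Hash#fetch, which raises a KeyError on a missing key. A small sketch, with invented key names and values:

```ruby
# Invented config contents, purely for illustration.
config = { list_headers: { "List-Id" => "<example-list.example.org>" } }

headers = config[:list_headers] # the idiomatic symbol key
typo    = config[:list_haeders] # nil: Hash#[] lets the typo through silently

# Hash#fetch fails at the point of the typo instead of somewhere down the line.
error = begin
  config.fetch(:list_haeders)
rescue KeyError => e
  e
end
```

Here `typo` is nil and `error` holds the KeyError, so the misspelling shows up exactly where it was made.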

A snippet of the code for processing attachments; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-26-attachment-processing-fragment-rb
Gist for code in slide

The next bit of code to talk about is this, it’s a chunk of the attachment processing method.

We get here if the message is multipart, and this fragment is run on each part.

We extract and store all the attachments, but we also remove them entirely if they are over a certain size.

This code is actually not too shabby. It’s quite long because there are loads of edge-cases. Over the years, this is the part of the code that’s seen the most changes. Turns out multipart mime messages are hard and you can nest things and it gets weird. Whatever your naïve approach is, it’s going to crumble as people send richer and richer messages from more and more esoteric mail clients. In fact I had to fix a bug in it only a month ago, someone’s mail client started sending nested multipart messages with multipart/alternative.

What you’re looking at is what the code used to look like. Some of you may already have noticed the bug.

If the part of the mail we’re dealing with didn’t have a filename (such as multipart/alternative which is effectively nesting another set of parts), then our code logged the error but then blindly continued on, assuming that it had a filename it could do something with. It never came to bite us until someone sent a multipart/alternative message.
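Stripped right down, the shape of the bug and one way of closing the gap look something like this. The `Part` struct and method name are stand-ins I’ve invented, and skipping the part is only an assumption about the fix; proper handling of nested parts would more likely recurse into them.

```ruby
# Hypothetical stand-in for one part of a multipart mail.
Part = Struct.new(:content_type, :filename, :body)

# Returns the filenames that would be saved, appending problems to log.
def attachment_filenames(parts, log)
  saved = []
  parts.each do |part|
    if part.filename.nil?
      log << "no filename for #{part.content_type} part, skipping"
      next # the buggy version logged here and then carried on regardless
    end
    saved << part.filename
  end
  saved
end

log = []
parts = [
  Part.new("multipart/alternative", nil, ""),
  Part.new("image/jpeg", "photo.jpg", "...")
]
attachment_filenames(parts, log) # => ["photo.jpg"]
```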

This is something I hope would have been fixed by TDD. With TDD I’d probably have mocked the log or file.size calls and built the function up slowly. But, I may not have caught it with TDD because I might never have thought to try out a nested multipart file.

The other thing it shows is that the real world will always conspire to break your code. If I had tests though I probably would have been able to feed the mail that broke it into the test suite and find out what tests suddenly failed.

The handleMessage code snippet again; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-16-handlemessage-rb
Gist for code in slide

I’ve covered a few of the little refactorings I’d do to individual methods, but I think the whole thing could do with an overhaul. It’s pretty much one class and it does everything. I think something like the following…

A hand-drawn diagram showing an alternative architecture for processing messages using a pipeline; Email goes into the pipeline of processors: subject -> sender -> archive (which outputs a .msg file) -> attachments (which outputs a series of files) -> users (which sends emails)

…might work. It’s really a pipeline and, like Rack, it might make sense to have a chain of smaller classes linked together, they all take in a mail message and do something to or with it and then pass it on. That way everything can be tested properly in isolation and it reduces the coupling between things.
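Here’s a minimal sketch of that idea with only two stages filled in. Every class name is hypothetical, and the `Email` struct stands in for the Rmail object the real code passes around.

```ruby
# Hypothetical stand-in for a parsed mail message.
Email = Struct.new(:from, :subject, :body)

# Each stage responds to #call, takes a mail and returns a mail (or nil to halt).
class SubjectTagger
  def initialize(list_name)
    @list_name = list_name
  end

  def call(email)
    email.subject = "[#{@list_name}] #{email.subject}"
    email
  end
end

class SenderFilter
  def initialize(known_addresses)
    @known_addresses = known_addresses
  end

  def call(email)
    @known_addresses.include?(email.from) ? email : nil
  end
end

class Pipeline
  def initialize(*stages)
    @stages = stages
  end

  # Feeds the mail through each stage in turn, stopping early on nil.
  def call(email)
    @stages.each do |stage|
      email = stage.call(email)
      return nil if email.nil?
    end
    email
  end
end

pipeline = Pipeline.new(
  SubjectTagger.new("example-list"),
  SenderFilter.new(["murray@home.example"])
)
pipeline.call(Email.new("murray@home.example", "Hello", "hi")).subject
# => "[example-list] Hello"
```

Because each stage only needs to respond to `call`, a plain lambda works as a stage too, and each one can be tested on its own with a hand-built `Email`.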

A screenshot of Safari showing the Ruby Application Archive search page showing the results of a search for “www”

That’s the first part of fucknut covered, so now we’re onto the 2nd part, the web front-end.

2003 was a dark time for web development in ruby. There were lots of little libraries, but nothing as comprehensive as Rails, certainly not the rails we have now, but not even the rails we got in 2005. The thing that may surprise you is that we didn’t even have gems back in 2003. The first release of gems was in March 2004. To find ruby stuff we scoured something called the Ruby Application Archive which was a website similar to what rubyforge is now5, where you could list ruby projects and categorise them. Except it did no hosting, you just pointed the links to where the data was.

On top of this, someone wrote a program called raa-install, which would go and find projects on RAA, download them and install them. At this point most libraries, if they had any installation, used the ruby setup.rb incantations, and raa-install would run those for you too. It didn’t do dependency graph information, but that’s because this info wasn’t on RAA. The thing that’s not clear to me looking back, is why gems and rubyforge came along when there was already this in place. I’ve not looked into it nor looked at the code. There’s probably an interesting story there.

So, as an aside, I know there’s a bit of hate for gems right now6, mostly about issues over dependencies and keeping applications and system stuff independent. I wouldn’t worry about it though, rubygems is not the first ruby code distribution and management system that’s existed, so maybe it won’t be the last. If it’s served its purpose perhaps it is time to move on; be that bundler, rip or something else.

A screenshot of Safari showing the Ruby Application Archive entry for narf version 0.3.4

Anyway, after a couple of false starts we settled on something called narf, which appeared to be the most high-level thing at the time. Now, remember this is me talking about narf as it was in September 2003. It was version 0.3.4 then, and it got up to 0.7.3 before development appeared to stop in 2005, and some of the docs imply it was headed in a direction that would abstract things further.

A code fragment showing the handler.rb script; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-31-handler-rb
Gist for code in slide

So, the first thing that seems weird to me, is that narf isn’t just a library that you include into your scripts. It also comes with an executable. As it turns out this executable is there to redirect any exceptions and errors from the ruby process back into the CGI environment. But this fact is buried as an aside in the docs. It’s weird.

A code fragment showing the Apache configuration for running the handler.rb script; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-32-apache-conf
Gist for code in slide

There’s no nice abstraction of routes, but then that’s not surprising, it’s not a higher-level MVC framework. It’s a web framework. It abstracts the first level of CGI interaction, it doesn’t build on top of it to give you what rails or sinatra gives you.

So, you run it as a CGI script, although you could have used mod_ruby (no, not passenger, the mod_ruby that existed ages ago that no-body used). And if you’ve installed ruby_fastcgi7 you can run it as fastcgi. So, it’s very bare bones, if you want fancy urls you have to write them yourselves using RewriteRule directives. This is ours.

We clearly got bored, as there are a few more URLs in the webhead other than looking at the archives, but we clearly couldn’t be bothered with making them pretty, most likely because we’d have to write them in this non-expressive regex format.

That said, I’m pretty sure early versions of rails asked you to do the same thing. I could be wrong though. Already it should be clear we are working at a lower level of abstraction. We’re close to the metal here.

A code fragment showing some of the narf API use in handler.rb; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-33-cgi-handler-rb
Gist for code in slide

Having asked apache to invoke your script, you require the narf libraries and this gives you a Web object. This object is what you interact with to communicate with the webserver. This is a fragment of our main CGI script.

Apart from showing off my naïve ruby stylings (4 space indent! collect instead of map!) this explains a lot of the narf api:

  • Web[] to get params that were sent with the request
  • Web.print_template to invoke some template processing on a file providing a list of variables
  • Web.flush to send everything back to the webserver

A fragment of the view templates that narf uses; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-34-error-page-html
Gist for code in slide

This is what that template looks like.

As you can see here, there are 2 ways of rendering data in these templates.

The first is that, moustache style, it’ll evaluate and render the results of any expressions within {} braces. The key/values in the hash you provide to print_template are available as $vars for evaluation. Much like the :locals hash when rendering a rails partial.

The other way of interacting is to use these <narf:> prefixed tags. Think of them like rails helpers, except instead of looking like code, they look like HTML. The web community swings back and forth on this sort of thing every so often, should the code in our views look like code (front-enders keep your hands off!) or should it look like markup (front-enders get stuck in!). I can think of only one templating engine for rails (radius which is used by radiant) that does it this way though.

Some of the narf tags, like <narf:foreach>, would emit things and you could use what was emitted inside the curly braces. As far as I can tell though, the braces are just for evaluating simple expressions, no logic. If you wanted logic you have to use narftags.

Anyway, that’s a whistlestop tour of narf as it was. To be honest it’s clearly early days and some of the things littered in the documentation suggest it was pointed in the right direction (it came with a testing framework and the docs suggested building the app test first using that framework).

A fragment of the LoginHandler; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-35-loginhandler-rb
Gist for code in slide

showLogin is effectively the render action for the "login" command. If the user isn’t logged in we want to show them the login page, we’ll call .showLogin on the LoginHandler.

So, what’s going on here? Again I’ve defined strings as constants when they really didn’t need to be, it’s all internal. I’ve also, for no reason abstracted the call to print_template out into calling it and the args I’d want to pass to it, I think it was just excitement at using the splat operator. Nowhere in the code do I ever call loginTemplateArgs except in this method, and I can’t think where I’d want to given that the showLogin method is simply a pass through to print_template. The only way this might make sense were if it was like this…

A fragment of the LoginHandler code showing a potential refactor to the showLogin method; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-36-showlogin-rb
Gist for code in slide

Maybe somewhere else (and I use this form for other handlers and their “actions”) I might want to render a template that shows a fragment of the login ui, and so grabbing the params that it needs from the LoginHandler might make sense.

But I don’t. This just makes it more complex, and it should really be…

A fragment of the LoginHandler code showing a further refactoring of the showLogin method; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-37-showlogin-refactored-rb
Gist for code in slide

It’s just simpler.

A hand-drawn diagram of the 3 “controllers” in the system: UserUpdater, LoginHandler, ArchiveDisplayer;

You’ll have noticed that I talk about LoginHandler. Narf doesn’t give you any controller framework, so we came up with our own. There are 3 handlers:

  1. LoginHandler - this deals with user sessions
  2. UserUpdater - this deals with letting the user manipulate their details from the user database
  3. ArchiveDisplayer - this deals with showing archived messages

They all have the same constructor: takes the Web object and the path to the fucknut root. They all have a .do method which looks at the cmd param of the Web object and acts accordingly. For example if the cmd is 'showmsg' in the ArchiveDisplayer, it finds the requested message and displays it. If the cmd is 'update' in the UserUpdater it fetches the current user’s details and updates them based on the POSTed params.
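The shape of those handlers is roughly the following. This is a sketch rather than our actual code: a plain Hash stands in for narf’s Web object, the method is called `handle` here where ours was `.do`, and the `id` param and return strings are invented.

```ruby
# Hypothetical handler; a Hash of params stands in for the Web object.
class ArchiveDisplayer
  def initialize(params, root)
    @params = params
    @root = root
  end

  # Looks at the cmd param and acts accordingly.
  def handle
    case @params["cmd"]
    when "showmsg" then show_message(@params["id"])
    else unknown_command
    end
  end

  private

  def show_message(id)
    "showing message #{id} from #{@root}"
  end

  def unknown_command
    "unknown command"
  end
end

ArchiveDisplayer.new({ "cmd" => "showmsg", "id" => "42" }, "/srv/list").handle
# => "showing message 42 from /srv/list"
```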

That’s where consistency ends though. LoginHandler returns a Session object (our own wrapper to a couple of methods on Web) and has other methods for rendering templates like showLogin above. For the other 2 calling .do is effectively an action endpoint and that handler will deal with everything from then on.

It’s clear that LoginHandler and the other 2 aren’t really the same sort of thing, and yet I’ve made them look the same. The fact that I was using them differently meant I should have realised that they were different things, or that I was doing something else wrong. A LoginHandler could easily have acted like the other 2 (where .do is a render endpoint) if I’d had some other object that deals with whether the user is logged in or not. Frameworks give you rules and consistency, when you go it alone without much thought you end up with messes like this.

Also, looking at the code for the main handler script and these other cmd handlers I’m amazed at how much plumbing has gone into my code to determine what to do based on the params, as opposed to actually doing it. With a higher level abstraction (like rails or sinatra routing) I can get on with saying: “this url means this code gets run”.

A fragment of the code from the UserUpdater showing a lack of abstraction; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-39-userupdater-fragment-rb

Let’s look at some code in one of those handlers: UserUpdater.

This is some of the code that is run when the 'update' command is sent to the handler.

This code is directly inside the UserUpdater#do method. It’s not even refactored out into its own method! My mind reels at this nowadays, but clearly it didn’t in 2003. We’ve got:

  1. view code - I’m building HTML fragments that later I send to a template.
  2. model code -
    1. data conversion - converting params from strings into objects
    2. data validation - checking that the data isn’t nil or invalid

Clearly there’s the shock that I’ve had to hand code all this, then there’s the shock that it’s all there mixed up in one method, and finally that although we have a User class it didn’t even cross our minds to keep this logic inside there. There’s nothing inside that User class that deals with validation. If I didn’t check here that the sendTo mail address was valid, an invalid one would simply be saved by the User when I asked it to later in the method.
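For contrast, here is a minimal sketch of what keeping that validation inside the model might look like. It is not the original User class: the class name, the error messages and the deliberately naive email check are all assumptions made for the example.

```ruby
# Illustrative model that validates itself before saving.
# Not the original User class; names and rules are hypothetical.
class SketchUser
  attr_reader :send_to, :errors

  def initialize(send_to)
    @send_to = send_to
    @errors = []
  end

  # Collect validation errors and report whether there were none.
  def valid?
    @errors = []
    if send_to.nil? || send_to.strip.empty?
      @errors << 'sendTo is missing'
    elsif send_to !~ /\A[^@\s]+@[^@\s]+\z/ # naive check, for illustration only
      @errors << 'sendTo does not look like an email address'
    end
    @errors.empty?
  end

  # Refuse to save invalid data; real persistence is left out of the sketch.
  def save
    return false unless valid?
    true
  end
end

user = SketchUser.new('not-an-address')
puts user.save
puts user.errors.inspect
```

With something like this the handler only has to ask the model to save and then look at the errors, rather than checking each param itself.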

The URL template that the ArchiveDisplayer has to process: /archives/<year>/<month>/<message_id>.msg

Finally, I want to show some code from the ArchiveDisplayer. The code in here isn’t actually so bad. Maybe it’s because I wrote this in January 2004 in a week I had off between jobs, so I was clearly more learned. Or maybe it’s because once you get past the web stuff, what the code is doing is fairly straightforward. It just has to go through our disk based archive structure:

archive/<year>/<month>/<message_id>.msg

and display the messages. It has some of the flaws already discussed in that everything happens in one class even when it probably could be decomposed more; it has a mix of view code and model code. But actually, looking back at it, although the framework is unfamiliar, some of the methods look not too far from what I’d write as view helpers today.

A fragment of the code used to process query params in the ArchiveDisplayer; code: https://gist.github.com/h-lame/1f032a1f8181fe220d6f1c2c4d98f64e#file-slide-41-param-parsing-rb

Apart from this one.

That doesn’t look too bad, until you look at the rest of the method (this is just a fragment). That last if statement (which gets the navigation link or placeholder for taking you from the page you are on to the page for the previous year) is repeated almost in its entirety to get the navigation links for the previous month, the next month and the next year.
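A sketch of how that repetition might be pulled out into one helper is below. It is not based on the original method body: the helper names, the URL shape, the “disabled” placeholder markup and the way available months are passed in are all assumptions for the sake of the example.

```ruby
# Hypothetical helper: a link when the target page exists,
# otherwise a plain placeholder with the same label.
def nav_link(label, target_path, exists)
  if exists
    %(<a href="#{target_path}">#{label}</a>)
  else
    %(<span class="disabled">#{label}</span>)
  end
end

# Build all four navigation links from one table of targets.
# `available` is an assumed list of [year, month] pairs that have messages.
def nav_links(year, month, available)
  targets = {
    'previous year'  => [year - 1, month],
    'previous month' => month == 1 ? [year - 1, 12] : [year, month - 1],
    'next month'     => month == 12 ? [year + 1, 1] : [year, month + 1],
    'next year'      => [year + 1, month]
  }
  targets.map do |label, (y, m)|
    nav_link(label, "/archives/#{y}/#{m}", available.include?([y, m]))
  end
end

puts nav_links(2004, 1, [[2003, 12]])
```

The if statement now exists once, and the four cases differ only in the data they pass to it.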

I remember during my final year in university doing a Prolog exercise and getting a good mark for it because somehow I’d managed to bend my mind into making it do a reasonable job at playing noughts and crosses (for a partial board) without using all the memory on the planet. However, one of the negative remarks was for a section of code where I’d repeated some lines without abstracting them into another method call. “Surely we put this sort of copy and paste behind us in CS1001?”

Apparently almost 4 years later I was still doing it.

A photo of a smiley face drawn in crayon with the crayons lying on top of the drawing; text: The end

So that’s my first ruby. And before I go, I just want to explain why I gave this talk. Mostly it’s because I hope that, after having me come up here and show you this code from 7 years ago and how bad it is, more people will want to get up and show off their code in future meetings. Either, as I’ve intended, because I’ve shown that everyone writes bad code and you needn’t be worried about it. Or, unintentionally, because you’re worried that if you don’t I’ll turn this into a series of lectures where I talk you through every piece of ruby code I’ve ever written.

Thanks.