Sumo! A Generic Microformats Parser For JavaScript

BarCamp London 2 is coming up next weekend, and having been hard at work on all my paying projects I’ve hardly had any time to work on other stuff. This meant that I didn’t really have anything to show and tell, which is a bit crap. I thought this would be a good excuse to work on something that I’ve been threatening to do for a while: a generic microformats parser for JavaScript. Yes, there are a few parsers about, but they all tend to be tied to a single format. I wanted to build a framework that would allow any microformat to be defined using a JSON-like definition, so it could handle new formats as they emerge.

So, this past week, I’ve pulled a few 3am nights and got something up and running which I’m quite pleased with. I’ve decided, for no apparent reason, to christen it Sumo. It lives in my Subversion repository, and at the moment I’ve written definitions for hCard, hCalendar and hReview.

It’s still in pretty early stages and I’m probably still missing some details of microformat parsing (damn, it’s bloody complex), but it’s in a playable state, so go ahead and look at the test, fire up Firebug and have a bash. To read all microformats of a certain type on a page you do:

var hcards = HCard.discover();
hcards[0].fn; //=> 'Dan Webb'
hcards[0].n.familyName; //=> 'Webb'

It returns an array of HCard instances, nicely parsing out all the properties and doing all the insanely fiddly shit like implied n properties and class value parsing for you. Also, check the definition files: it’s my best attempt at implementing a DSL in JavaScript. Unfortunately, I’ve just found this post. It seems someone’s had a similar idea. Damn. Oh well, it was a learning experience. I’ll be interested to see what the Operator crew come up with.
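
For a rough idea of how a definition-driven parser can work, here’s a minimal sketch. It isn’t Sumo’s actual code: the defineFormat helper, the shape of the definition object, and the plain-object node tree standing in for the DOM are all assumptions made for illustration. Unlike the call above, this discover takes the root to search as an argument so the example runs without a browser, and it skips the fiddly parts (implied n, the abbr pattern and so on).

```javascript
// Hypothetical sketch, not Sumo's real API. Nodes are plain objects
// ({ className, text, children }) standing in for DOM elements.

// True if the node's class attribute contains the given class name.
function hasClass(node, name) {
  return (node.className || '').split(/\s+/).indexOf(name) !== -1;
}

// Depth-first collection of every descendant carrying the class name.
function findByClass(node, name, found) {
  found = found || [];
  (node.children || []).forEach(function (child) {
    if (hasClass(child, name)) found.push(child);
    findByClass(child, name, found);
  });
  return found;
}

// The node's own text followed by the text of its descendants.
function textOf(node) {
  return (node.children || []).reduce(function (acc, child) {
    return acc + textOf(child);
  }, node.text || '');
}

// 'family-name' -> 'familyName'
function camelize(name) {
  return name.replace(/-([a-z])/g, function (_, ch) { return ch.toUpperCase(); });
}

// Walk the definition: a string spec means "take the text",
// an object spec means "parse these nested properties".
function parseProperties(node, properties) {
  var result = {};
  Object.keys(properties).forEach(function (className) {
    var match = findByClass(node, className)[0];
    if (!match) return;
    var spec = properties[className];
    result[camelize(className)] = (typeof spec === 'object')
      ? parseProperties(match, spec)
      : textOf(match).trim();
  });
  return result;
}

// Build a parser from a JSON-like definition (assumed shape).
function defineFormat(definition) {
  return {
    discover: function (root) {
      return findByClass(root, definition.root).map(function (node) {
        return parseProperties(node, definition.properties);
      });
    }
  };
}

var HCard = defineFormat({
  root: 'vcard',
  properties: {
    'fn': 'text',
    'n': { 'family-name': 'text', 'given-name': 'text' }
  }
});

// A stand-in for a page containing one hCard.
var page = {
  children: [{
    className: 'vcard',
    children: [{
      className: 'fn n',
      children: [
        { className: 'given-name', text: 'Dan' },
        { text: ' ' },
        { className: 'family-name', text: 'Webb' }
      ]
    }]
  }]
};

var hcards = HCard.discover(page);
hcards[0].fn;           //=> 'Dan Webb'
hcards[0].n.familyName; //=> 'Webb'
```

The point of the definition object is that supporting a new format should mean writing another definition rather than another parser.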

The reason I really wanted to put a solid JS microformats parser out is that I really think that microformats and unobtrusive scripting are a match made in heaven that hasn’t been exploited much yet. On the @media 2007 site I wrote a little script that pulls hCards, extracts the geo info (albeit in a non-standard way because we didn’t want to use the abbr pattern) and places them on a Google Map. I believe I’ve seen Jeremy Keith doing a similar thing on the dConstruct site last year. It’s definitely a great way to go.
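
As a rough illustration of the hCard-to-map idea, the sketch below turns parsed hCards into plain marker objects. It is not the @media script: the toMarkers name, the geo.latitude and geo.longitude property names, and the sample data are assumptions, and handing the markers to an actual mapping API is left out.

```javascript
// Hypothetical helper: hcards are assumed to be plain objects shaped like
// { fn: '...', geo: { latitude: '...', longitude: '...' } }.
function toMarkers(hcards) {
  return hcards
    // Skip cards that carry no geo information at all.
    .filter(function (card) { return card.geo; })
    // Convert the text values found in the markup into numbers.
    .map(function (card) {
      return {
        title: card.fn,
        lat: parseFloat(card.geo.latitude),
        lng: parseFloat(card.geo.longitude)
      };
    })
    // Drop anything whose coordinates did not parse.
    .filter(function (marker) {
      return !isNaN(marker.lat) && !isNaN(marker.lng);
    });
}

// Made-up sample data.
var markers = toMarkers([
  { fn: 'Example venue', geo: { latitude: '51.5', longitude: '-0.1' } },
  { fn: 'No location' },
  { fn: 'Bad location', geo: { latitude: 'n/a', longitude: '0' } }
]);
```

Only the first card survives here. The other two are dropped quietly, so a page with incomplete markup still gets a working map.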

Microformats standardise structures of nodes with class names to represent certain information. Unobtrusive scripting (more often than not) hooks into structures of nodes with class names to add behaviour…it’s like they were made for each other. It’s a good basis for making easily deployable and portable unobtrusive ‘behaviours’ and I can see loads of applications for this.

This was what I was planning to show at BarCamp, but in the process of writing Sumo I realised I’d started writing a lot of JavaScript in a metaprogramming kind of style, and I’ve recently picked up loads of useful techniques in this vein, so now I’ve changed my mind. My BarCamp session is going to be Metaprogramming JavaScript. I’ll write up the whole shebang into a proper article soon too.

So yeah, if you’re BarCamping, I’ll see y’all there.

10 Comments (Closed)

Yoink!

Matthew Pennell at 09.02.07 / 09AM

Wha?

Dan at 09.02.07 / 09AM

Lovely!

Andy at 09.02.07 / 12PM

Do you plan to add XFN next/soon?

Devon Young at 09.02.07 / 15PM

Devon: Yeah, I will add it soon. XFN and RelTag are both different to the other microformats in that they don’t use classes, so it’s going to require more work on the core parser. The other reason it’s not there already is that I couldn’t immediately see a use case for it in DOM Scripting. What behaviour could you attach to XFN?

Dan at 09.02.07 / 16PM

Your link to http://www.kaply.com/weblog/2007/01/31/parsing-microformats/ is busted.

That said, please do write this up. Fun stuff.

Jeremy Dunck at 09.02.07 / 16PM

Argh, bloody textile! It shall be fixed.

Dan at 09.02.07 / 16PM

Are we free to use this in our commercial projects?

I’ve been looking for something like this for a while and I can see some great possibilities!

I thought I should ask first of course!

Steven Hambleton at 14.02.07 / 22PM

Steve: It certainly will be available for whatever use you like. I normally license code using the MIT license. I’ve not put a license on this yet, mainly because it’s not finished. I’d definitely wait it out until the test suite is complete, but feel free to have a play around and get back to me with any problems.

Dan at 15.02.07 / 09AM

Thanks for very interesting article. Can I translate your article into polish and publish at my webblog? I will back here and check your answer. Keep up the good work. Greetings

Pozycjonowanie at 16.02.07 / 14PM
