two small tech. challenges down...about a million more to go!

Two quick ‘technical’ challenges implemented today:

First, I have now implemented a poor man's cache system for the reviews.com homepage…here are a few quick details:

1. I updated IIS so that it serves index.html first (before index.cfm) if it exists…so basically, as long as there is an index.html file in place, the system will use that as the home page.

2. I then built a very simple Ruby program that screen scrapes http://www.reviews.com/index.cfm (which is the real-time, dynamically built version of the reviews site that had been the main home page for the past 5+ years)…the program then saves the scraped version out to index.html (creating or replacing any pre-existing index.html page in the web site root directory)…a rough sketch of this follows the list.

3. I scheduled this 'rebuild' of the index.html page to occur once every 10 minutes (but we can put it on any schedule we want).
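For the curious, here's a minimal sketch of what that scrape-and-save step might look like in Ruby (the file paths and error handling here are my own assumptions for illustration, not the actual production script):

#!/usr/bin/env ruby
# Minimal sketch of the scrape-and-save step (paths here are assumptions, not the real setup).
require 'open-uri'

SOURCE = 'http://www.reviews.com/index.cfm'   # the real-time, dynamically built page
TARGET = 'C:/inetpub/wwwroot/index.html'      # assumed web site root directory

begin
  html = URI.open(SOURCE).read                # pull down the fully rendered home page
  File.write(TARGET, html)                    # create or replace the static index.html
rescue StandardError => e
  # If the scrape fails for any reason, leave the existing index.html alone so the site stays up.
  warn "rebuild skipped: #{e.message}"
end

Running it every 10 minutes is then just a matter of a scheduled task (Windows Task Scheduler on an IIS box, cron anywhere else).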

With this system in place, the normal user should always see the index.html page as the initial www.reviews.com page (and so it should load pretty darn fast)…with the 10 minute schedule, that version of the page should also be pretty up to date at any given time (regardless of how often/when new reviews are published)…and since the real index.cfm page still exists and is still in place, all pre-existing links (and things like the 'next' and 'prev' buttons on the home page) can continue to work without interruption.

Pretty cool for a quick fix to a long and ongoing problem.

The second challenge is a bit more involved…and so it's a bit more exciting for me to report on…well, not really all that exciting because it's not 100% released just yet, but it's getting there…anyway, now that I have the feed store bit and the XML merge bits plugged into the back end of fubnub, I've finally begun working on the 'group by' and 'sort by' features.

The interesting thing about this part of the challenge is that it's one of those little problems that is actually very easy for a human to perform but quite complex to handle within a computer language (or at least within the system fubnub currently operates in).

To be fair, only part of the problem is related to the technical challenges of computer programming…the other part is related to poor initial design on my part (because I'm now taking the system in quite a different direction than I originally intended/planned).

Anyway, the approach I'm taking at the moment (which I'm not 100% sure I'll stick with, though I haven't explored the path far enough yet to know for sure) involves adding a 'filter' option after parsing a given feed, but before applying the user-defined template.

So the filter simply states what tags from your feeds you want to group and sort by.

On the back end, I still parse your XML feed into an array (of Hash values)…I then use the filter to determine what you want to group by, and loop through the array building out a new master hash - and yes, for those of you keeping track, that means I now have a hash of arrays of hashes…are you starting to see the complexity of this approach?

Once I have this 'grouped' hash, I can then roll it back into an array and sort that array based on the counts of repeating elements within each group…it's pretty hard to explain (especially since it was pretty hard for me to get a handle on building it in the first place!)…so maybe a quick example is the best way to show you what I'm trying to do.

Let’s assume you have something like this as your raw feed:

<track><artist>Billy Joel</artist><song>The Entertainer</song></track>
<track><artist>The Beatles</artist><song>Rocky Racoon</song></track>
<track><artist>Billy Joel</artist><song>Piano Man</song></track>

What all this Hash to Array to Hash and back again junk eventually gives you is something like this:

<track><artist>Billy Joel</artist><song>The Entertainer</song></track>
<track><artist>Billy Joel</artist><song>Piano Man</song></track>
<track><artist>The Beatles</artist><song>Rocky Racoon</song></track>

Now I know that, given a small data set like this, it doesn't seem all that impressive (or even useful considering the work involved)…but expanding this over a few hundred (or more) nodes shows that it's actually a pretty cool first step in grouping and sorting.
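For the programmers in the crowd, here's a rough Ruby sketch of the kind of group-and-sort pass I'm describing (the filter format and the variable names are just my own illustration here, not fubnub's actual code):

# Rough illustration of the group-and-sort pass (the filter format and names are made up here).
tracks = [
  { 'artist' => 'Billy Joel',  'song' => 'The Entertainer' },
  { 'artist' => 'The Beatles', 'song' => 'Rocky Racoon' },
  { 'artist' => 'Billy Joel',  'song' => 'Piano Man' }
]

filter = { 'group_by' => 'artist' }   # which tag from the feed to group (and sort) by

# Build the hash of arrays of hashes, keyed by the group_by tag...
grouped = Hash.new { |h, k| h[k] = [] }
tracks.each { |track| grouped[track[filter['group_by']]] << track }

# ...then roll it back into an array, ordering groups by how many items they contain.
sorted = grouped.sort_by { |_key, items| -items.size }.flat_map { |_key, items| items }

sorted.each { |track| puts "#{track['artist']} - #{track['song']}" }
# => Billy Joel - The Entertainer
#    Billy Joel - Piano Man
#    The Beatles - Rocky Racoon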

The next step in the process is to update the templating part of the code to be aware of the filters as well…and use the 'group by' tags so that the template knows just how to display these new 'groupings'…until then, the group by and sort by features are working, but it's not 100% obvious in the front-end rendering…
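And just to make that end goal a bit more concrete, here's one purely hypothetical way the filter-aware rendering might eventually look (this is not fubnub's real templating code, just an illustration built on the grouped data from the sketch above):

# Purely hypothetical sketch of filter-aware rendering (not fubnub's actual templating code).
grouped = {
  'Billy Joel'  => [{ 'song' => 'The Entertainer' }, { 'song' => 'Piano Man' }],
  'The Beatles' => [{ 'song' => 'Rocky Racoon' }]
}

# One way a template could display the groupings: a heading per group, then that group's items.
html = grouped.sort_by { |_artist, items| -items.size }.map do |artist, items|
  songs = items.map { |t| "<li>#{t['song']}</li>" }.join
  "<h3>#{artist}</h3><ul>#{songs}</ul>"
end.join("\n")

puts html
# => <h3>Billy Joel</h3><ul><li>The Entertainer</li><li>Piano Man</li></ul>
#    <h3>The Beatles</h3><ul><li>Rocky Racoon</li></ul>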

Anyway, it's another one of those little behind-the-scenes challenges that I (as a developer) am proud to have worked through/solved…but that you (and the rest of the world) will never notice (assuming I solved it properly). Sometimes that's just how the cookie crumbles for us developers, I guess!
