ECDP open data hack


Lamplight hacks – the ECDP data dashboard (2011)

Here’s a little hack showing data from
ECDP
. They’re a charity in Essex – from their website:

We are an organisation run by and for disabled people. Established in 1995 our origins are firmly rooted in the belief that the voice of disabled people, both as individuals and collectively, is vital if the lives of disabled people are to be enhanced.
We currently provide a wide range of support, information, advice and guidance services. We also play a key role in influencing the agenda of tomorrow to effect real social change to enhance the everyday lives of disabled people.

and they publish monthly performance data as an open data initiative.

The data are from their website, and the background photos are from their photo stream on flickr. This is only intended to show the ways that charities could easily (and pretty much freely) publish data. The data are ‘hosted’ as a Google docs spreadsheet; all the processing and data retrieval happens on this static web page, not on a server. It’s also intended to show how it’s fairly easy to bring in data from different sources, and how data might be standardised to enable benchmarking etc.

This is in response to a session at the NCVO annual conference on open data. If I were better at multi-tasking I might have managed this during the session, but I’m not!

So here’s their performance data for the last 7 months for their four top-level key performance indicators, against target. Blue lines are actual data, and orange lines are targets. Note there’s a bit of inconsistency about whether targets are culmulative or per month, but let’s not worry about that.

KPI-BENE-COUNT-MONTH

KPI-VOLU-COUNT-MONTH

KPI-PART-COUNT-MONTH

KPI-MEMB-COUNT-MONTH

Notes

This is a very simple display of their KPI data. It’s all client-side javascript using YAHOO Query Language (YQL) and Yahoo User Interface (YUI).

What’s happening: in overview…

  1. We use Yahoo Query Language (YQL) to get the ECDP performance data from a csv file on the web.
  2. We take this data and make four charts (using Yahoo User Interface) and display them
  3. We look up the KPI codes from the first csv file (something like “KPI-BENE-COUNT-MONTH” – it doesn’t really matter, I just made it up for now to make the point). These are held in another .csv file on the web, so we use YQL again to fetch these, depending on what KPI’s were in our data file.
  4. When these come back, we replace the codes with the useful descriptions
  5. We also look up ECDP’s flickr photostream, again using YQL, and when that arrives we put the images in as background images on the charts. Just for fun
  6. So we’re combining three separate data sources (a bit artifically, admittedly).

That’s why there are some flickers as we retrieve data and put it in the right place. All the data is being loaded live when we look at the page, but from the page. If I were doing this for real I might be a bit more graceful in when and how I’d want to show the elements, but for the sake of a demo it helps to see things appear as they are ready, because that’s part of the point.

Here’s how it works it a bit more detail…

  1. As part of the commitment to open data, ECDP publish some excel spreadsheets you can download (here)

    But a set of Excel spreadsheets are hard to do anything with automatically.

  2. I copied some of the top KPI data into a Google Docs spreadsheet and published it: you can see it here

    This csv file has month/year down the first column, and then four more columns with the KPI data, and then their targets for each indicator.

  3. So we’ve got 7 months of data in one csv file online.
  4. Now we can query this csv data using the yql module in YUI

    YQL lets you write SQL-like requests from the internet:

    select * from csv where url =  
             "https://spreadsheets.google.com/a/lamplight3.co.uk/pub?hl=en&hl=en&key=0Akeai-XJWldxdEMyXzJYVEg0QklOVDF1eWY5dk9BTXc&single=true&gid=0&output=csv"
             and columns = "category,kp1,kp2,kp3,kp4";
            

    The response back is a json object containing the data from our
    spreadsheet.

  5. And then it’s easy to make a chart:

    var chart = new Y.Chart({
        dataProvider: jsonresponse, 
        render: "#results"
    });
    

  6. In fact, that would put all 4 KPIs on one chart: I’ve done a bit more here to get four different charts with targets, but the principle is the same. I just do a bit more iteration. And there’s extra stuff with photos etc.

Why is this interesting/a good idea? It’s only a hack, inspired after hearing ECDP talk about this at the NCVO conference. But here’s why I think this is a neat way to do things:

  • This is all client-side. I’ve not had to do anything with a server except save a spreadsheet in Google docs. This web page is a static html page with a bit of javascript.
  • Which means I don’t need to host data, or, more importantly, have all the data I need on one domain. Using YQL like this means it’s really quick and easy to start combining data sources.
  • For example, you could pull in another organisations’ performance data. Or lots. Or mash it up with all sorts of other things.
  • To get the data, pictures and build the charts is 120 lines of code, including comments (view source if you want to have a look). At the most basic, all I need to write to get a chart of this data up is:
          
    YUI().use('node', 'yql', 'charts', function (Y) {
        
        var yqlStatement = 'select * from csv where url = ' + 
            '"https://spreadsheets.google.com/a/lamplight3.co.uk/pub?' +
            'hl=en&hl=en&key=0Akeai-XJWldxdEMyXzJYVEg0QklOVDF1eWY5dk9BTXc&single=true&gid=0&output=csv" '+
            ' and columns = "category,kp1,kp2,kp3,kp4" and category != "Date";'
    
        Y.YQL(yqlStatement, function (r) {    
            var chart = new Y.Chart({
                dataProvider: r.query.results.row,
                render: "#results"
            });
        });
    });
          

    That’s it: Most of it is the url I copied from Google docs. That’s cool. It’s taken longer to write up than write the code!

  • The only ‘but’ about all of this is the original data format: you can’t really do much with a series of Excel worksheets. But if lots of similar organisations all started publishing this kind of data, in the same kind of format, you can start doing some interesting things, really easily. Like generate benchmarking data by combining different data sets.

To do this properly it’d be worth creating a YQL table definition, which isn’t hard and makes the actual YQL statement much shorter (perhaps something like "select * from ukcharity.ecdp.kpis") and would I think be faster. It would also be better to use proper dates in the original spreadsheet, so you could query by date etc. And even better, to have a real, shared repository of KPI codes and descriptions, so different organisations could use the same indicators, knowing that they were counting the same things.

What we’re thinking about all this at Lamplight

Lamplight already lets you produce these kinds of stats pretty easily. And we’ve developed a publishing module that lets you access data within Lamplight from externals sources: at the moment the idea is that, for example, infrastructure organisations could publish a member’s directory straight from Lamplight – only one source of data, for internal and external use.

But what we’re starting to think about is adding a some ‘KPI’ functionality that’d generate these overall stats automatically, and enabling these to be published too via an API call, with or without a password (up to the customer). It would then be straightforward to create a YQL table definition that pulled the data from Lamplight
and you could use it in the same way as here. Or you could use the API directly.

So a customer wanting to join this open data movement would really only need to choose their KPI’s and set a ‘publish them’ config setting, and the data would be available on the web in this way.