Warm Fusion: Mapping the HESA Data Futures Schema…

Some combinations just work: bacon and eggs, Fred and Ginger, Tom and Jerry, Sheringham and Shearer and – of course – Angus and Malcolm Young of AC/DC. Okay, the last one might not be for everyone, but it segues nicely into the idea of this article.

For universities developing solutions for in-year reporting to the Office for Students, there is a raft of material available, most of it provided by HESA as part of the ‘Data Futures’ Programme*. The most important artefact is the Data Dictionary (or schema), which defines all the entities and attributes mandated for each reference point. I’ve seen a variety of visualisations attempting to show the impact of that schema on current activities.

It’s hard to do, frankly. There are still pockets of stubbornness trotting out the well-worn anecdote that ‘we do HESA really well today, so why do we need so much change?’ Other than not having all the data you need, the current derivations not being fit for purpose, and the lack of six months to clean the data, almost no reason at all! But that’s for another post.

The second artefact that’s generated excitement, and some head scratching, in the sector is the UCISA HE capability model. Released in Spring 2018, this was the culmination of a year’s work to create a ‘map of what a university does’ at an abstract level. We’ve been using this for almost every engagement – mostly to map the impact of Data Futures on the wider university. It’s a very powerful visual tool that resonates with many audiences.

All of which got me thinking. To understand the detailed impact of Data Futures, we need to assess where data is collected and where it is used. The points of use set the quality criteria, while the points of collection should clean the data to that standard as it enters the university. Okay, the world doesn’t work like that, but with a visualisation we can begin to focus on where effort expended on process, training, systems and data management will have the most reward.

So, in a lightbulb moment, I realised a fusion of the current schema and the capability model would do just that. That was a great solution, but just the start of my problems, as I needed a way to at least partially automate it. I also know that expensive architecture and process tools are not prevalent across the sector, so we had to go with the lowest common denominator, and that means Excel. Again. Not ideal, but let’s call it a proof of concept and move on!

Creating a copy of the model in Excel wasn’t difficult, just time-consuming. A late night in a hotel later, I had a workable template, a set of definitions, a list of entities and a blank expression. My idea was to select an entity – say Disability – and ‘light up’ the model with where it was captured and used. Except I didn’t know how to do that. Thankfully the big brains of a university’s Strategic Planning and Analysis group were very kind about my lack of skills, whipping up a VLOOKUP model in the time it took me to say ‘How the heck did you manage that?’
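For readers who don’t speak spreadsheet, the essence of that VLOOKUP trick can be sketched in a few lines of Python. This is purely illustrative – the entity and capability names below are made up for the example, not taken from the real Data Futures schema or the UCISA model:

```python
# A hypothetical mapping from a schema entity to the capabilities it
# "lights up" on the model, i.e. the places it is captured or used.
# Names are illustrative only.
entity_capability_map = {
    "Disability": ["Admissions", "Student Support", "Timetabling", "Assessment"],
    "Socio-Occupational Classification": ["Admissions"],
}

def light_up(entity: str) -> list[str]:
    """Return the capabilities a given entity lights up (empty if unmapped)."""
    return entity_capability_map.get(entity, [])

print(light_up("Disability"))
# The VLOOKUP formula does the same job: look up the selected entity
# in the mapping table and highlight the matching capability cells.
```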

So we started completing our capability assessment for each entity, and it threw up some really interesting stuff: for example, how many capabilities Disability lights up in terms of support for that type of learner, while something like Socio-Occupational Classification was basically collected and exhausted with no further processing. I took this as validation of our approach, and it gave me a warm feeling about what universities rightly care about.

The problem was we had data issues all over the place. My authoritative source of capabilities was being copied and pasted into multiple cells, with all the data quality issues that brings. All a bit cobbler’s shoes. So a second mighty VLOOKUP was forged to lock the input to that authoritative source.
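The idea behind that second lookup can also be sketched outside Excel: check every entered capability name against the master list and flag anything that doesn’t match, rather than letting a mistyped copy slip through. Again, the names here are hypothetical:

```python
# A hypothetical authoritative list of capability names. In the workbook
# this lives in one sheet that all other cells must reference.
AUTHORITATIVE_CAPABILITIES = {"Admissions", "Student Support", "Timetabling"}

def validate(entered_names):
    """Split entered capability names into those matching the
    authoritative source and those that don't (e.g. paste typos)."""
    entered = set(entered_names)
    valid = entered & AUTHORITATIVE_CAPABILITIES
    invalid = entered - AUTHORITATIVE_CAPABILITIES
    return valid, invalid

# A misspelt paste ("Admisions") is caught instead of silently kept.
valid, invalid = validate(["Admissions", "Admisions"])
```

In Excel terms, the same effect comes from a lookup (or data validation rule) that rejects any value not found in the authoritative range.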

Which is where we are now. This version has space for 95% of the schema. We’ve included a mapping of the ‘Student’ entity as an example. There’s space to complete the rest, with some basic instructions in the workbook on how to do that. All I would say is that this is released free of licence and with zero warranty. If you mess about with the formulae and it breaks, then that’s on you!

I’ll pass this on to the UCISA EA Community of Practice to see if they also want to publish it on the appropriate part of the UCISA site. Finally, I’m pondering converting this into a proper web-based application that could be used for multiple scenarios. That is, however, just going to give me a whole load of new problems I don’t have time to fix!

The tool can be found in our community section.

*In the community section, there is a Data Capability plan template for Data Futures. I will update it with more current material over the next few months.

December 12th, 2018 | What I've done
