The answer – as many as you need!
Staff at iData, the company that provides Data Cookbook, recently published an analysis of the number of Functional Areas that each customer maintains. There are currently about 120 customers.
The median count of functional areas is 8, and the average is 11. The distribution is far from normal – one standard deviation (~9) in either direction covers nearly 90% of clients in the sample. The inter-quartile range runs from 6 to 13.
But almost 20% have more than 15, and at least half a dozen clients have more than 30 functional areas in their Cookbook instance.
For the record, at Rochester we maintain 12 Areas. When we started with Data Cookbook in 2013, we quickly set up about 7 Areas. In our first 6 months we added the remainder and have stayed steady at 12 ever since. In the context of the iData analysis, we are about average.
The key thing to keep in mind is that a Steward (responsible for the definitions in their area of data expertise) is associated with every Functional Area, and the Steward comes first. If a new Steward comes forward, with expertise in an area of data not represented, a Functional Area would be created for him or her.
Highlights from Day 2 of the 2014 HEDW conference.
At breakfast I met someone from Ithaca College who asked me if my school was considering a hosted cloud platform for our data warehouse. We are not. She is. The small BI unit at Ithaca is eager to lighten their burden of routine maintenance tasks. Later that day I watched this person repeatedly pose her question to anyone around her during meals and breaks. I might try her systematic networking method some day when faced with a decision to which there is no clear answer.
I attended “Effective practices for data governance & lessons learned” presented by University of Notre Dame Information Services staff. Governance, the focus of their presentation, was a trending theme at the conference. In this presentation I saw the most engaging illustration of why the same data question can have two different answers.
Every data request has these three components – time, element, and context. Time and element are usually explicitly expressed by the data requester. Context – not so much. “How many faculty do we have in the current fiscal year” answers time and element. Context is missing, and might be “…whose primary role is faculty” (n=53), which eliminates the people whose primary role is staff but who teach one class (n=5). Thus, there could be 58 faculty or 53, depending on whether the context of primary role was important. This presentation motivates me to encourage data users at Rochester to make their assumed context explicit – in conversation, metadata, and email.
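The faculty example can be sketched in a few lines of Python. This is only an illustration of the idea, not anything shown in the presentation: the `Person` record, the `count_faculty` function, and the roster itself are hypothetical, and the roster is constructed to reproduce the 53 and 5 from the example (the 20 non-teaching staff are filler so that the filter has something to exclude).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    primary_role: str  # "faculty" or "staff" (hypothetical field)
    teaches: bool      # teaches at least one class in the fiscal year


def count_faculty(people, primary_role_only):
    """Count 'faculty' with the context stated as an explicit argument."""
    if primary_role_only:
        # Context: only people whose primary role is faculty.
        return sum(1 for p in people if p.primary_role == "faculty")
    # Context: anyone with a faculty primary role or who teaches a class.
    return sum(1 for p in people if p.primary_role == "faculty" or p.teaches)


# Hypothetical roster built to match the counts in the example above.
roster = (
    [Person("faculty", True)] * 53   # primary role is faculty
    + [Person("staff", True)] * 5    # staff who teach one class
    + [Person("staff", False)] * 20  # filler: staff who do not teach
)

print(count_faculty(roster, primary_role_only=True))   # 53
print(count_faculty(roster, primary_role_only=False))  # 58
```

Same time, same element, two defensible answers. Forcing the caller to pass `primary_role_only` is the code equivalent of asking the requester to state their context up front.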
We had a rainy walk to dinner at Pod – a great pan-Asian restaurant with an all-white George Jetson decor. Within our group were colleagues from George Washington and Stanford. Both schools had signed with Collibra in the past 60 days – the first two US schools to sign. It will be interesting to see how they utilize Collibra.