“People make the world go round.” — The Stylistics
Metadata – Start with people

People make the world go round
Before we talk about people I’d like to apologize for the delay in writing this new post.
Some upcoming projects and the responses to my last post, which was the first in a series about peculiar hierarchies and other pitfalls, kept me quite busy. Plus, this post about people is quite special, and there are many ways to approach the subject. I have written and discarded several versions so far, and I’m hoping that the current version gets my points across.
Depending on how we define “metadata”, one can argue about whether information about the people involved in a data warehouse project really belongs to metadata.
The standard literature defines metadata as “data about data”, but I’d like to extend that definition. In my definition, metadata is all the data about and “around” the data that eventually and potentially appears in an end-user report or analysis.
As I stressed in the introductory post about metadata, it should be as comprehensive as possible, because it eases and guides the way through the overall project. This is especially true of information about the people involved. So it has become a habit of mine to strive for a very clear picture of all the participants in the project. They can participate actively or passively. Some of them might never explicitly get involved but could nevertheless have a significant influence on the outcome of the project.
What should you store?
As I suggested in the first post of this blog, a tool for storing and maintaining all metadata is very useful. I usually use MS Access for metadata maintenance.
In principle, all metadata should be stored at the highest possible level of detail, but in the case of data about people it is sensible to think twice. This will become clearer when we look at the data in detail. In any case, the policy for storing personal data should be in line with your own and your company’s or client’s philosophy.
The standard set
Many data warehousing textbooks address the subject of people. They cover very important issues such as
- stakeholder analysis
- stakeholders and stakeholder groups
- target groups
- sponsors and supporters
- roles and responsibilities
I don’t want to go into detail about these issues. Google is your friend! 😀 Among many others, you might find this link quite useful. Treatises on stakeholders, sponsors, and staffing can also be found in some of Kimball’s books:
- The Data Warehouse Toolkit
- The Microsoft Data Warehouse Toolkit
- The Data Warehouse Lifecycle Toolkit (this project plan task list can be quite valuable)
By the way, although they are some years old, these books are still required reading for data warehouse people.
There’s more
Besides the above-mentioned standard set, there is more important information about people.
In my experience, this information can be quite vital and neglecting it can seriously jeopardize the success of the entire project.
The main aspect of a data warehouse is integration. The DWH brings together data from many departments, subsidiaries, countries, external data services, etc. In doing so, a newly created DWH usually replaces a number of legacy systems, insular solutions, individual spreadsheets, or even other data warehouse solutions.
Bringing together data also means bringing together people. The interests, goals, and strategies of those people as a whole can be rather heterogeneous.
Pay attention to hierarchies
Among the people involved, several coexisting, overlapping, and often competing hierarchies exist. Writing down and drawing these hierarchies in full can be very rewarding. It shows how big the team actually is and how many people are involved.
Overlapping and competing hierarchies can contain some hidden explosives. Imagine members from two different departments have to cooperate, but their department leaders have other plans or priorities. Even though the respective department heads are not directly involved in the data warehouse project, they can have a big influence on it. It gets even worse when the two leaders are somehow competing against each other.
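As a minimal sketch of what writing down these hierarchies could look like, the following Python snippet stores each hierarchy as a simple person-to-superior mapping and lists everyone who sits above a given team member. All names, both hierarchies, and the helper functions `superiors` and `influencers` are invented for illustration; they are assumptions, not taken from any real project or tool.

```python
# Hypothetical example data: each hierarchy maps a person to their direct superior.
hierarchies = {
    "line organization": {
        "Anna": "Head of Sales",
        "Ben": "Head of Finance",
        "Head of Sales": "CEO",
        "Head of Finance": "CEO",
    },
    "project organization": {
        "Anna": "Project Lead",
        "Ben": "Project Lead",
        "Project Lead": "Steering Committee",
    },
}


def superiors(person, reports_to):
    """Return the chain of superiors above `person` in one hierarchy."""
    chain = []
    seen = {person}  # guards against accidental cycles in the data
    current = reports_to.get(person)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = reports_to.get(current)
    return chain


def influencers(person, all_hierarchies):
    """Collect, per hierarchy, everyone who sits above `person`."""
    return {name: superiors(person, tree) for name, tree in all_hierarchies.items()}


if __name__ == "__main__":
    for member in ("Anna", "Ben"):
        print(member, influencers(member, hierarchies))
```

In this made-up example Anna and Ben share a project lead, but their line managers differ. Those two department heads are exactly the kind of people who never join the project and can still influence it.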
Other countries, other customs
Integrating data is one thing. Integrating reporting is another. Often, the introduction of a new BI system at the enterprise level is coupled with a standardization of reporting. Once this is done, some people might start feeling deprived. End users might miss some of their most crucial and often very special reports. This is often the case for specific countries or subsidiaries. Suddenly, the people most badly affected turn from supporters into boycotters. In my opinion, it’s crucial to consider those people’s requirements and to ensure that they can continue doing their business.
Legacy systems
The new DWH and BI system usually replaces a large number of legacy systems. These can range from significant mainframe-hosted solutions to single, very specific and sophisticated Excel sheets. Those systems have been specified, developed, used, and administered by people who are now probably part of the new DWH project team. These team members are frequently against the change. It helps to identify those issues and to regard the respective team members as a special group. Try to turn their fierce criticism into constructive criticism. They usually invest extra effort in finding the flaws of the new solution. This can be an invaluable contribution to the quality and acceptance of the future system.
Soft data
The issues above are very important for the success of the project, but unlike the standard set of people metadata, they are what is often called “soft” data.
Standard data are quite straightforward. They include name, contact data, association with one or multiple groups, position, fees, etc.
Soft data can be described by classifications or quantitative measures. Classifications can be discrete values like high, medium, low. Quantitative measures can be continuous ranges like 0 to 100%.
Here are some examples of soft data:
- level of sponsorship (sponsor – neutral – boycotter)
- availability (0-100%)
- commitment (0-10)
- led by (all respective people)
- involved in legacy solution (all legacy solutions concerned)
- knowledge carrier (high – medium – low for each subject concerned)
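The examples above could be captured in a small record type. The following Python sketch is only one possible layout: the class and field names (`SoftData`, `availability_pct`, and so on) are my own assumptions, and the sample values are invented. In practice the same structure could just as well be a table in MS Access or any other metadata tool.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Sponsorship(Enum):
    """Classification: sponsor, neutral, or boycotter."""
    SPONSOR = "sponsor"
    NEUTRAL = "neutral"
    BOYCOTTER = "boycotter"


class Level(Enum):
    """Classification: high, medium, or low."""
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass
class SoftData:
    """Hypothetical soft-data record for one person."""
    sponsorship: Sponsorship
    availability_pct: int  # quantitative measure, 0-100 %
    commitment: int        # quantitative measure, 0-10
    led_by: List[str] = field(default_factory=list)
    legacy_solutions: List[str] = field(default_factory=list)
    knowledge: Dict[str, Level] = field(default_factory=dict)  # subject -> level

    def __post_init__(self):
        # Reject values outside the ranges given in the list above.
        if not 0 <= self.availability_pct <= 100:
            raise ValueError("availability_pct must be between 0 and 100")
        if not 0 <= self.commitment <= 10:
            raise ValueError("commitment must be between 0 and 10")


# Invented sample entry, for illustration only.
example = SoftData(
    sponsorship=Sponsorship.NEUTRAL,
    availability_pct=40,
    commitment=7,
    led_by=["Head of Finance"],
    legacy_solutions=["regional sales spreadsheet"],
    knowledge={"sales reporting": Level.HIGH},
)
```

Keeping the classifications as enumerations and checking the numeric ranges on creation keeps the entries comparable across people, which matters once the list grows.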
You can surely come up with many more examples of soft data. Please do not hesitate to start or join discussions about these issues.