A Poor Man’s Data Dictionary

A Data Dictionary assists in planning, controlling, and evaluating the collection, storage, and use of data. An Excel tool can be downloaded and a brief explanation and set of definitions are provided.

Austrian Parliament, Art lamenting the cold, 2023-01-21

Technology is wonderful and part of that wonderness is the ability for mortals to create some pretty amazing things via tools such as Microsoft Excel, SharePoint, Access, etc.

The problem with end users is love building but are not so fond of documenting. As a result, a summer student builds a complex spreadsheet that becomes critical to your operations, GREAT! And then she leaves and know one is sure what all the tabs are for or even what data goes in which column, not so great. Thus enter the Data Dictionary as well as a little bit of documentation.

>>> Download the Microsoft Excel Data Dictionary <<<

Data WHAT-nary

According to the “IBM Dictionary of Computing”, a Data Dictionary is a

n. A centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format. It assists management, database administrators, system analysts, and application programmers in planning, controlling, and evaluating the collection, storage, and use of data. [1]

Okay but what does that definition have to do with a spreadsheet?  It is the last sentence that is the important one: “planning, controlling, and evaluating the collection, storage, and use of data“.   

Spreadsheets, SharePoint lists, Microsoft Access databases and a host of other user tool cost money to create/maintain and are often central to the functioning of an organization.  When they become corrupted, lost or in need of duplication, they cost even more.  In other words, if it was worth the money to create the tool, and it is generating value, shouldn’t your organization at least know how the tool works?

What a Data-Dictionary Looks Like

The following tool described below is developed in Microsoft Excel.  There is no reason it could not be created in Google Docs or my favourite a SharePoint list. The ‘container’ * is important and it protects value but does not create any value.  What is inside the spreadsheet is what is important. [* see my Don’t Confuse the Container with Its Contents blog for more on this].

The following is are the minimum fields for a dictionary.  And yes, these fields have all been documented in a data-dictionary you can download below.

Minimum Data Set for a Dictionary

The Excel file has a few more including a calculated field and a few different reference columns.  You can delete these more advance columns or start populating and then go back and read my definition for the columns and say ‘OOOOHHH, that is why he created that column’ – Go Data Dictionary, You GO GIRL!

ColumnDefinitionNotes
NameWhat is being Defined, Noun only.Nouns only, separate the definition from the name of the asset
AssetThe overall asset being defined typically at a file or SP-List level.  This field allows for filtering of a specific Asset if using one dictionary to define a number of files or ‘assets’.Circular drop down list.  You will need to define the asset first, leave the cell and then select itself from the drop down.
Sub-AssetA further categorization of what is being defined.A tab within an XLS if there are a number to be defined.
ElementSelect what is being defined from a standard list of selections.Drop down list in the flag tab which can be amended or added to.
DefinitionComments and description of what the term means.Set a maximum size to as to encourage succinct responses.
ElementSourceAs applicable, where does the element come from? This field is used when an element is used from a source other than its own.Additional details that help the user understand the source of the entry.  Default = BLANK
Current?Currently used?Used to suppress no longer used entries.
Comment: Narrative not otherwise included other fields; for example, used when a de-activating an definition. This is the place to be chatty
Fields to have within a Data Dictionary.

Populating and Enforcement

At this point you might be saying something like, GROOANNNN, another thing to do!  ‘I barely have time to do things poorly, where am I going to find the time to do things well‘.  Take heart, there is a pay back on your time:

  1. Reduce Re-Do: This is likely the biggest payback when someone can’t figure out an Excel document, etc., and ends up re-creating it… of course later on, ANOTHER future person can not figure it out and recreates it.
  2. More Re-Leverage: Because you can do things like filter and search this support re-use and consistency in building your assets.
  3. Better Builds: Here is a pro tip, write the documentation first.  For example, why do we need a new spreadsheet, what will it do (ideally in a Desk Reference, see my article: “Beyond the Big Honkin’ Binder” for how to use things like Wikies for documentation). When it comes time to do the build, document things like the spreadsheet columns or VBA code modules and THEN build the thing.  It is amazing what happens when you have to think through the business problem and compare the results to this definition to improve your builds.
  4. Auditors and New Employees: Finally, people like auditors or new employees will come along and ask questions like, what is that column for or where does this information come from?  With a Data Dictionary you can not only answer the question but answer it consistently.

Beyond a Spreadsheet

A spreadsheet is only as good as the discipline to use it consistently. Creating the dictionary is only the first challenge, keeping it up to date is MORE important. 

Still, the payback on your investments of time can be considerable but in effort but in cash as well. For example, for consultants a dictionary can help you create client products faster beyond simply copying a previously used spreadsheet and creating a new one.  If you are leaving the file with the client, think of the perceived ‘value add’ including the dictionary will have in your final report.

Happy documenting and remember that Documentation is a Waste of Time. If you end up using the tool let me know particularly if you have any improvements on its use.

References and Notes

  1. IBM Dictionary of Computing.
  2. March 17, 2023: updates from the blog originally posted February, 2019 but based on work done over the past 2-5 years.

Leave a comment