Data Dictionary: A How to and Best Practices

Metadata

  • Author: Carl Anderson
  • Full Title: Data Dictionary: A How to and Best Practices
  • Category: #Type/Highlight/Article
  • URL: https://medium.com/p/a09a685dcd61

Highlights

  • Importantly, don’t ask “what is the current definition?” but “how should this be defined?” If the current implementation is not their ideal definition, this is the perfect chance for the business team to set out their ideal state. For instance, this is a chance to simplify if you have inherited an overly complex definition. Once that ideal definition is captured then there is additional pressure on the data team, tech team, or other parts of the business to deliver on that metric as defined.
  • This is a key step: root out any terms whose definitions differ among teams.
  • Publish the data dictionary as a single-page document where it is accessible to the whole company, and thus not just in a BI tool. These definitions should be widely understood and adopted, not just by execs, analysts, and decision makers, but by all staff. Therefore, visibility is crucial. If the company uses a wiki heavily, then publish there. It should be where people expect it.
  • For instance, at Warby Parker, our data dictionary was generated from a Jenkins job. If the repository was modified, it regenerated our documentation (a dedicated internal website or “data book” for all data documentation).
  • Do not let different systems get out of sync; this is why auto-generation of documentation is valuable.
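
Two of the highlights above lend themselves to a small illustration: rooting out terms that teams define differently, and generating the single-page dictionary from one source so that it cannot drift. The sketch below is only one possible shape for this, not the article's or Warby Parker's implementation. The entry format, the sample terms, and the function names `find_conflicts` and `render_dictionary` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical input: (term, team, definition) triples collected from each team.
ENTRIES = [
    ("active customer", "marketing", "Placed an order in the last 12 months."),
    ("active customer", "finance", "Placed an order in the last 12 months."),
    ("net revenue", "finance", "Gross revenue minus returns and discounts."),
    ("net revenue", "marketing", "Gross revenue minus returns."),
]


def find_conflicts(entries):
    """Return {term: {definition: [teams]}} for terms with more than one distinct definition."""
    by_term = defaultdict(lambda: defaultdict(list))
    for term, team, definition in entries:
        by_term[term][definition.strip()].append(team)
    # Keep only the terms on which teams disagree.
    return {term: dict(defs) for term, defs in by_term.items() if len(defs) > 1}


def render_dictionary(definitions):
    """Render an agreed {term: definition} mapping as one plain-text page, sorted by term."""
    lines = ["Data Dictionary", ""]
    for term in sorted(definitions):
        lines.append(f"{term}: {definitions[term]}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Terms listed here need a conversation with the business before publishing.
    for term, defs in find_conflicts(ENTRIES).items():
        print(f"Conflict on '{term}':")
        for definition, teams in defs.items():
            print(f"  {', '.join(teams)}: {definition}")
```

In a setup like the one described, a scheduled or commit-triggered job could call `render_dictionary` on the agreed definitions and publish the result to the wiki or internal site, so the published page is always regenerated from the repository rather than edited by hand.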