Documentation and Metadata
Having data available is of no use if it cannot be understood. For example, a table of numbers is useless if there are no headings that describe what the columns and rows contain. Therefore, you should ensure that open datasets include consistent core metadata and that the data is fully described. This requires that all documentation accompanying the data is written in clear, plain language. Data users should have sufficient information to understand the source, strengths, weaknesses, and analytical limitations of the data, so that they can make informed decisions when using it.
The level of documentation and metadata will vary according to the project and the range of people who need to understand the data.
It is best practice to use recognised community metadata standards to make it easier for datasets to be combined. For example, for brain data, the Brain Imaging Data Structure (BIDS) is the standard to use. Other metadata standards (reporting requirements, terminologies, and models/schemas) are searchable in FAIRsharing.
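As a rough illustration of what standard-conformant metadata can look like, the sketch below writes a small JSON "sidecar" file in the key-value style used by community standards such as BIDS. The field names and values are illustrative assumptions, not a complete or validated schema; consult the relevant standard for the required fields.

```python
import json

# Hypothetical example: a minimal JSON sidecar describing a dataset, loosely
# following the key-value style of community standards such as BIDS.
# Field names and values are illustrative only, not a validated schema.
metadata = {
    "Name": "reaction_time_study",
    "Description": "Reaction times from 24 participants, collected in 2023",
    "Authors": ["A. Researcher", "B. Collaborator"],
    "License": "CC-BY-4.0",
    "KnownLimitations": "Small sample; all participants were right-handed",
}

# Save the metadata next to the data so the two travel together
with open("dataset_description.json", "w") as f:
    json.dump(metadata, f, indent=2)
```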
Variables should be defined and explained using data dictionaries.
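A data dictionary can be as simple as a table with one row per variable, stating its name, type, meaning, and units. The sketch below writes such a table to a CSV file; the variable names and columns are illustrative assumptions, not a prescribed format.

```python
import csv

# Hypothetical example: a simple data dictionary stored alongside the data,
# one row per variable. Variable names and columns are illustrative only.
data_dictionary = [
    {"variable": "participant_id", "type": "string",
     "description": "Anonymised identifier for each participant", "units": ""},
    {"variable": "age", "type": "integer",
     "description": "Age at time of testing", "units": "years"},
    {"variable": "reaction_time", "type": "float",
     "description": "Mean reaction time across trials", "units": "milliseconds"},
]

with open("data_dictionary.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["variable", "type", "description", "units"])
    writer.writeheader()
    writer.writerows(data_dictionary)
```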
Data should be stored in a logical, hierarchical folder structure, with a README file that describes the structure.
The README file is helpful for others and will also help you find your data in the future [FK18]. See the README template from Cornell for an example.
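The sketch below shows one way to lay out such a structure programmatically, with a top-level README describing what each folder contains. The folder names (raw/processed data, docs, analysis) are a common convention but only an assumption here; adapt them to your project.

```python
from pathlib import Path

# Hypothetical example: a logical, hierarchical project layout.
# Folder names follow a common raw/processed/docs convention and are only a suggestion.
project = Path("my_project")
for folder in ["data/raw", "data/processed", "docs", "analysis"]:
    (project / folder).mkdir(parents=True, exist_ok=True)

# A top-level README that explains the purpose of each folder
(project / "README.md").write_text(
    "# my_project\n\n"
    "- data/raw: original, unmodified data files\n"
    "- data/processed: cleaned data produced by scripts in analysis/\n"
    "- docs: data dictionary and data collection protocol\n"
    "- analysis: scripts used to process and analyse the data\n"
)
```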