Tidy Data
- Good project organisation encompasses the naming, arrangement,
backing up, and documenting of files.
- Time invested in project organisation is repaid many times over
during a project.
- Avoid using multiple tables within one spreadsheet.
- Avoid spreading data across multiple tabs.
- Record zeros as zeros.
- Use an appropriate null value to record missing data.
- Don’t use formatting to convey information or to make your
spreadsheet look pretty.
- Place comments in a separate column.
- Record units in column headers.
- Include only one piece of information in a cell.
- Avoid spaces, numbers and special characters in column headers.
- Avoid special characters in your data.
- Record metadata in a separate plain text README file.
- Treating dates as multiple pieces of data (year, month, and day)
rather than one makes them easier to handle.
- Always make a copy of your original spreadsheet file and work on
the copy so you don’t alter the raw data.
- Use data validation to prevent accidentally entering invalid
data.
- Use sorting to check for invalid data.
- Use conditional formatting (cautiously) to check for invalid
data.
- Data stored in common spreadsheet formats will often not be read
correctly into data analysis software, introducing errors into your
data.
- Exporting data from spreadsheets to plain-text formats like CSV or
TSV lets most programs read it consistently.
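
Several of the points above (recording zeros as zeros, using an explicit
null value, keeping comments in their own column, splitting dates into
separate columns, and exporting to CSV) can be illustrated with a short
Python sketch. The column names, the sample rows, and the choice of `NA`
as the null marker are all assumptions made up for this example; only
the standard library is used.

```python
import csv
import io
from datetime import date

# Hypothetical survey data, as it might look after export to CSV.
# Column names and values are invented for illustration only.
CSV_TEXT = """plot_id,year,month,day,weight_g,notes
1,2013,7,16,40,
2,2013,7,16,0,empty trap
3,2013,7,18,NA,scale broken
4,2013,2,30,52,
"""

# Assumed markers for missing data; a real zero is NOT one of them.
NULL_VALUES = {"", "NA"}


def parse_weight(text):
    """Return None for a missing value, otherwise the weight as a float."""
    if text in NULL_VALUES:
        return None
    return float(text)


def parse_date(row):
    """Build a date from separate year/month/day columns.

    Returns None when the columns do not form a real calendar date.
    """
    try:
        return date(int(row["year"]), int(row["month"]), int(row["day"]))
    except ValueError:
        return None


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
weights = [parse_weight(r["weight_g"]) for r in rows]
dates = [parse_date(r) for r in rows]

# Plots whose date columns need checking against the original records.
invalid_date_plots = [r["plot_id"] for r, d in zip(rows, dates) if d is None]

print(weights)             # zero stays 0.0, "NA" becomes None
print(invalid_date_plots)  # plot 4 has the impossible date 30 February
```

Because the zero in plot 2 and the missing value in plot 3 are recorded
differently, the code can tell them apart, and the impossible date in
plot 4 is caught when the three date columns are combined.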