When working with environmental data, things can occasionally get out of hand. Data is altered in mysterious ways, computers crash, and people come and go from a project, taking critical knowledge with them. Here are my top 5 best practices in environmental data management to keep the wheels turning smoothly. This is by no means a comprehensive list – simply a summary of useful tips gathered from working on projects over the years. Most of my experience comes from using Hydro GeoAnalyst and SQL Server, but I think these practices apply in most situations.
ACCESS
Does everyone really need access to this data? Do some people only need to read the data while others need complete access to add and remove data? And who should have the permission to change the database structure? I will never forget the client who came to me because he had a summer student entering data for him and he was unable to query any data by the date field. After reviewing the student’s work, I found she had changed the date field to a string field as she was having difficulties importing the data. It turned out there were problems with the way Excel was formatting the date and instead of resolving the problem in the data set, she simply changed the date field to be a string data type!
Lesson Learned – choose wisely when granting permissions on your database.
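To make the read-only versus full-access distinction concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in, since SQL Server handles this through its own logins and roles; the file name, table and values are made up for illustration. The reader connection can query the data but cannot change the table structure.

```python
import os
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical database file and table, created only for this illustration.
db_path = os.path.join(tempfile.mkdtemp(), "env_data.db")

# Full-access connection: can change both structure and data.
admin = sqlite3.connect(db_path)
admin.execute("CREATE TABLE samples (location TEXT, sample_date TEXT, value REAL)")
admin.execute("INSERT INTO samples VALUES ('MW-01', '2020-06-15', 7.2)")
admin.commit()
admin.close()

# Read-only connection: queries work, but writes and schema changes fail.
reader = sqlite3.connect(Path(db_path).as_uri() + "?mode=ro", uri=True)
rows = reader.execute("SELECT location, value FROM samples").fetchall()

try:
    reader.execute("ALTER TABLE samples ADD COLUMN note TEXT")
    schema_change_blocked = False
except sqlite3.OperationalError:
    # SQLite refuses to write to a database opened with mode=ro.
    schema_change_blocked = True
finally:
    reader.close()
```

Someone who only needs to run queries would be handed the second kind of connection, so an accidental change to a field's data type is simply not possible from their side.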
BACKUPS
This cannot be stressed enough – it is essential that you back up your database on a regular basis. Also consider taking a backup before system or software updates, and after large data imports or data updates. Think about the changes you are about to make, or have just made, and whether you will ever need to revert them – or how upsetting it would be to lose the work that has been done.
Don’t forget to test that your backups are indeed working! Can you restore your database from your backup?
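As a rough illustration of what testing a backup can look like, the sketch below takes a backup and then checks that the copy is intact and holds the same rows as the original. It uses Python's built-in sqlite3 module and in-memory databases purely as a stand-in (SQL Server has its own backup and restore tooling), and the table and values are hypothetical.

```python
import sqlite3

# Hypothetical live database, held in memory for this illustration.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE samples (location TEXT, value REAL)")
live.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [("MW-01", 7.2), ("MW-02", 6.9)],
)
live.commit()

# Take the backup using SQLite's online backup API.
backup = sqlite3.connect(":memory:")
live.backup(backup)


def table_snapshot(conn):
    """Return all rows in a stable order so two databases can be compared."""
    return conn.execute(
        "SELECT location, value FROM samples ORDER BY location"
    ).fetchall()


# Restore test: the copy must pass an integrity check and match the original.
integrity = backup.execute("PRAGMA integrity_check").fetchone()[0]
restore_ok = integrity == "ok" and table_snapshot(backup) == table_snapshot(live)
```

The point is the last two lines: a backup only counts once you have opened it and confirmed the data is really there.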
STANDARDIZATION
Coming up with a standardized list of values for the fields in your database can increase the usability of your data. This is especially important if the people entering data are less familiar with the domain. If data is being entered manually, dropdown pick lists in the program also help to avoid typos and ensure the data is entered consistently.
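One common way to back a pick list at the database level is a lookup table that the data table must reference. The sketch below shows the idea with Python's built-in sqlite3 module; the table names and list values are hypothetical, and this is not a description of how Hydro GeoAnalyst implements its lists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

# Lookup table holding the standardized list (hypothetical values).
conn.execute("CREATE TABLE sample_matrix (name TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO sample_matrix VALUES (?)",
    [("Groundwater",), ("Surface Water",), ("Soil",)],
)
conn.execute(
    """
    CREATE TABLE samples (
        location TEXT NOT NULL,
        matrix   TEXT NOT NULL REFERENCES sample_matrix(name)
    )
    """
)
conn.commit()


def insert_sample(location, matrix):
    """Insert a sample; return False if matrix is not in the standard list."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO samples VALUES (?, ?)", (location, matrix))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = insert_sample("MW-01", "Groundwater")
rejected = insert_sample("MW-02", "Ground water")  # typo, not in the list
```

With this in place, a misspelled value is turned away at entry time instead of quietly creating a second category that later queries will miss.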
NAMING CONVENTIONS
Naming conventions become more important as time goes on or as a project grows in size. Having established conventions for things like location names, names of saved queries, etc. can help in the long-term maintenance of your project.
Imagine if someone new comes onto the project or if the project needs to be handed over to someone else – good naming conventions can help with knowledge transfer. No one is going to remember what the purpose of query_312 was.
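A convention is easier to keep if it can be checked automatically. The sketch below assumes two made-up conventions (location names like MW-01, and saved queries named qry_ followed by at least two descriptive lowercase words); your own project's rules will differ.

```python
import re

# Hypothetical conventions, for illustration only:
#   locations: two or three uppercase letters, a hyphen, two digits (MW-01)
#   saved queries: "qry_" plus two or more lowercase words joined by underscores
LOCATION_PATTERN = re.compile(r"[A-Z]{2,3}-\d{2}")
QUERY_PATTERN = re.compile(r"qry_[a-z]+(_[a-z]+)+")


def follows_convention(name, pattern):
    """Return True only if the whole name matches the pattern."""
    return pattern.fullmatch(name) is not None


checks = {
    "MW-01": follows_convention("MW-01", LOCATION_PATTERN),
    "well 1": follows_convention("well 1", LOCATION_PATTERN),
    "qry_benzene_exceedances": follows_convention(
        "qry_benzene_exceedances", QUERY_PATTERN
    ),
    "query_312": follows_convention("query_312", QUERY_PATTERN),
}
```

Here "MW-01" and "qry_benzene_exceedances" pass, while "well 1" and "query_312" are flagged.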
DATA QUALITY
If you are not sure the data you are entering into the database is quality data, do you really want to enter it in the first place? Making certain data fields required can help to ensure data quality – as can the standardization and naming conventions mentioned above. It also helps to have a simple tool that makes entering validated data into your database quick and easy. I really like the HGA Quick Checker for this purpose – a simple Excel plug-in that allows you to validate the data within Excel (where most people are comfortable) before entering it in the database. You can validate required fields, proper formatting, valid values from a list and more!
For more information about how Waterloo Hydrogeologic can help you get the most out of your data, please contact us at sales@waterloohydrogeologic.com.
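To show the kinds of checks involved, here is a generic sketch of row validation before import. It is not the HGA Quick Checker and makes no claim about how that tool works; the field names, date format and parameter list are assumptions chosen for the example.

```python
from datetime import datetime

# Hypothetical rules, for illustration only.
REQUIRED_FIELDS = ("location", "sample_date", "parameter", "value")
VALID_PARAMETERS = {"Benzene", "Toluene", "pH"}
DATE_FORMAT = "%Y-%m-%d"


def is_blank(value):
    """Treat None and whitespace-only text as missing."""
    return value is None or not str(value).strip()


def validate_row(row):
    """Return a list of problems found in one row (empty list means valid)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if is_blank(row.get(field)):
            problems.append(f"missing {field}")
    if not is_blank(row.get("sample_date")):
        try:
            datetime.strptime(row["sample_date"], DATE_FORMAT)
        except ValueError:
            problems.append("sample_date does not parse as YYYY-MM-DD")
    if not is_blank(row.get("parameter")) and row["parameter"] not in VALID_PARAMETERS:
        problems.append("parameter not in standard list")
    if not is_blank(row.get("value")):
        try:
            float(row["value"])
        except ValueError:
            problems.append("value is not numeric")
    return problems


good_row = {"location": "MW-01", "sample_date": "2020-06-15",
            "parameter": "Benzene", "value": "0.005"}
bad_row = {"location": "MW-02", "sample_date": "15/06/2020",
           "parameter": "benzene", "value": ""}

good_problems = validate_row(good_row)
bad_problems = validate_row(bad_row)
```

The second row is rejected for three reasons: the value is missing, the date is in the wrong format, and the parameter does not match the standard list. Catching the date problem here is what avoids the temptation to turn a date field into a string just to get an import through.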