Common, Persistent and Annoying Challenges: The WQFS Pilot Sensor Project and Other Data Quality Lessons from Agriculture

 

Of the four key “V” attributes of “big data” – volume, velocity, variability and veracity – variability and veracity pose unique challenges in agricultural research, as illustrated by the Purdue Plant Sciences Initiative’s recent Soil Moisture Sensor Pilot Project conducted at the Water Quality Field Station. Many of the scientific domains contributing to agricultural research warrant the designation “mature,” as their origins and foundational methods date back hundreds of years. As such, key areas of study such as soil fertility and water management have developed strong cultures of experimental practice that predate much of the automation that now enables agricultural researchers to collect “big data.” Thus, any strategy for using big data to address complex problems in agriculture must contend with a host of data quality issues. These encompass not only the common foibles of machine-generated data but also a profound lack of standardization in data streams, which can be attributed to a dated and non-curricular approach to data competencies within the agricultural sciences. This presentation will use the Sensor Pilot Project as a case study to examine critical quality aspects of agricultural research data streams, ranging from the common challenges to sensor performance in the real world to the workflow practices that must be implemented to ensure provenance.