International Journal of Reliable Information and Assurance
Volume 5, No. 2, 2017, pp. 19-24
Abstract
Data Quality Assurance for the Simulation Data Analysis in the EDISON-SDR
The EDISON platform is designed to let students easily use computational science software and high-performance computing infrastructure in a web-based environment, and to learn computational science-based research methodology. Recently, the convergence of computation, data, and artificial intelligence has emerged as an important factor enabling new discoveries. In response to this trend, we are developing the EDISON-SDR repository, which extends the existing EDISON platform so that students can also learn data-based research methodology. Efficient analysis of computational simulation data has two key requirements: a high level of quality control over the data, and extraction of sufficient, consistent metadata from heterogeneous raw simulation data files. In this paper, we discuss the quality control methods of the EDISON-SDR repository, including file verification, metadata verification, data workflow, and continuous reprocessing. This paper contributes to data quality management models for data repositories by taking data diversity and heterogeneity into account.
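As an illustration only, the file verification and metadata verification steps named above could be sketched in Python as follows. The choice of SHA-256 checksums, the field names in `REQUIRED_FIELDS`, and the function names are all assumptions made for this sketch; none of them is taken from the EDISON-SDR implementation.

```python
import hashlib
from pathlib import Path

# Hypothetical required metadata fields; the abstract does not list any.
REQUIRED_FIELDS = ("title", "solver", "created")


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_sha256):
    """File verification: the file exists, is non-empty, and matches its checksum."""
    p = Path(path)
    if not p.is_file() or p.stat().st_size == 0:
        return False
    return sha256_of(p) == expected_sha256


def missing_metadata(metadata):
    """Metadata verification: return required fields that are absent or blank."""
    missing = []
    for key in REQUIRED_FIELDS:
        value = metadata.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing
```

A dataset would pass this simplified check only if `verify_file` returns `True` and `missing_metadata` returns an empty list; a real repository would likely add format-specific validation on top.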