A big data framework to validate thermodynamic data for chemical species

30 Oct 2017

The advent of large sets of chemical and thermodynamic data has enabled the rapid investigation of increasingly complex systems. The challenge, however, is how to validate such large databases. We propose an automated framework to solve this problem by identifying which data are consistent and recommending which future experiments or calculations are required. The framework is applied to validate data for the standard enthalpy of formation of 920 gas-phase species containing carbon, oxygen and hydrogen retrieved from the NIST Chemistry WebBook. The concept of error-cancelling balanced reactions is used to calculate a distribution of possible values for the standard enthalpy of formation of each species. The method automates the identification and exclusion of inconsistent data. We find that excluding inconsistent data in this way enables the calculations to converge rapidly towards chemical accuracy. The method can exploit knowledge of the structural similarities between species and the consistency of the data to identify which species introduce the most error and to recommend which future experiments and calculations should be considered.
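
To make the idea of an error-cancelling balanced reaction concrete, the following is a minimal sketch, not the framework's implementation. It assumes that each balanced reaction contains the target species plus reference species with known enthalpies of formation, and that a calculated enthalpy is available for every species. Hess's law then gives one estimate per reaction, and several reactions give a distribution of estimates. The species labels `A` to `D`, the function name `estimate_hf`, and all numerical values are hypothetical placeholders chosen for illustration; they are not data from the NIST Chemistry WebBook.

```python
from statistics import mean, stdev

# Hypothetical calculated enthalpies (kJ/mol); placeholder values only.
CALC_H = {"A": -100.0, "B": -40.0, "C": -72.0, "D": -130.0}

# Hypothetical reference enthalpies of formation (kJ/mol) for the
# species other than the target "A"; placeholder values only.
REF_HF = {"B": -75.0, "C": -84.0, "D": -100.0}

# Balanced reactions as {species: stoichiometric coefficient};
# reactants are negative, products are positive.
REACTIONS = [
    {"A": -1, "B": -1, "C": 2},          # A + B -> 2 C
    {"A": -1, "C": -1, "B": 1, "D": 1},  # A + C -> B + D
]


def estimate_hf(target, reaction, calc_h, ref_hf):
    """Estimate the enthalpy of formation of `target` from one reaction.

    The reaction enthalpy is taken from the calculated values, then
    Hess's law is solved for the single unknown enthalpy of formation.
    """
    # Reaction enthalpy from calculated data: sum of nu_i * H_i.
    dh_rxn = sum(nu * calc_h[s] for s, nu in reaction.items())
    # Contribution of the reference species: sum of nu_i * Hf_i.
    known = sum(nu * ref_hf[s] for s, nu in reaction.items() if s != target)
    # dh_rxn = nu_target * Hf_target + known, solved for Hf_target.
    return (dh_rxn - known) / reaction[target]


# One estimate per reaction gives a (small) distribution of values.
estimates = [estimate_hf("A", rxn, CALC_H, REF_HF) for rxn in REACTIONS]
print("estimates:", estimates)
print("mean:", mean(estimates), "spread:", stdev(estimates))
```

In a sketch like this, a wide spread among the estimates would suggest that at least one reference value is inconsistent with the others, which is the kind of signal an automated validation procedure could act on.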