AARI Quality Control Methods
Documentation provided by V. Radionov
The following describes general methods of quality control used at AARI.
Stage I: Data that were not already in digital form were digitized from logbooks, bulletins, or charts. Data quality control was performed for three- or six-hourly data from North Pole drifting stations and Ice Patrol ships, for once-daily DARMS data values, and for monthly means at coastal and island stations as follows:
Each observation or monthly mean (monthly means were calculated at each station by an observer) was evaluated based on the likelihood and consistency of individual parameter values. This excluded most large errors.
For individual meteorological parameters, where the distribution is close to normal, statistical estimates of the mean and extremes can be used for testing. These parameters generally include pressure, air temperature, relative humidity, and surface temperature.
Grubbs' criterion [Grubbs, 1950] was used to detect individual extrema. If a point exceeded a threshold based on the mean, the hypothesis that it is an outlier may be accepted. Points that deviated from the monthly mean by more than 2.5 standard deviations were marked.
The modified criterion of Tietjen and Moore [1972] was sometimes used to test for outliers.
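The 2.5 standard deviation rule can be sketched in a few lines of Python. This is only an illustration, not the AARI code: the function name flag_outliers, the sample values, and the choice to compute the mean and standard deviation from all values in the month (including the suspect one) are assumptions.

```python
from statistics import mean, stdev

def flag_outliers(values, k=2.5):
    """Return indices of values that deviate from the sample mean
    by more than k sample standard deviations (k = 2.5 in the text)."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        return []  # all values identical, nothing to flag
    return [i for i, v in enumerate(values) if abs(v - m) > k * s]

# Hypothetical month of daily air temperatures (deg C) with one gross error.
temps = [-20.0 + 0.5 * (i % 5) for i in range(30)]
temps[12] = 5.0
print(flag_outliers(temps))
```

In this sketch only the value at index 12 is marked. With very short records the rule cannot trigger, because a single point can never lie more than (n - 1) / sqrt(n) sample standard deviations from the mean.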
Values exceeding the thresholds were noted as questionable. As an additional quality control, a tested value may have been temporarily replaced by a value interpolated from observations of the same parameter across the two adjacent intervals. Discrepancies between tested and interpolated values were estimated as extreme deviations (Kolmogoroff deviation), and each was evaluated by an expert. Additionally, all questionable observations were reviewed by an expert specialist from AARI, who made the ultimate decision about whether to reject them.
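The interpolation check can be illustrated with a minimal Python sketch. It assumes equally spaced observations and simple linear interpolation between the two neighbouring values; the function names, the pressure values, and the 10 hPa threshold are hypothetical, and the sketch does not attempt to reproduce the Kolmogoroff deviation estimate mentioned above.

```python
def interpolation_discrepancies(series):
    """For each interior point, return the observed value minus the value
    linearly interpolated from its two neighbours (equal spacing assumed)."""
    return [
        series[i] - 0.5 * (series[i - 1] + series[i + 1])
        for i in range(1, len(series) - 1)
    ]

def questionable_indices(series, max_abs_diff):
    """Return indices whose discrepancy exceeds max_abs_diff in magnitude.
    The threshold is a hypothetical parameter, not taken from the source."""
    diffs = interpolation_discrepancies(series)
    # diffs[0] corresponds to series[1], hence the offset of 1.
    return [i + 1 for i, d in enumerate(diffs) if abs(d) > max_abs_diff]

# Hypothetical six-hourly sea level pressure (hPa) with one spike.
pressure = [1012.0, 1011.5, 1011.0, 1025.0, 1010.0, 1009.5]
print(questionable_indices(pressure, max_abs_diff=10.0))
```

Note that the neighbours of a spike also show a discrepancy (here 7.25 hPa in magnitude), which is one reason a flagged value would still be passed to an expert rather than rejected automatically.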
Stage II: Testing during Stage I excludes crude errors. In Stage II, systematic errors connected with instrument function, improper operation, or incorrect data processing are considered. These are errors that would not necessarily be noticed on a day-to-day basis.
Within a 40-year period, changes in the climatic homogeneity of time series may appear. These may result from changes in the meteorological station location or in the station surroundings, and from natural climatic changes. The most common methods for analyzing climatic homogeneity, the difference and ratio methods [Drozdov, 1989], are used in this test. This procedure makes it possible to identify shifts in parameter values. Other cases of climatic heterogeneity of time series will not be identified during these threshold tests because it is impossible to distinguish the causes of heterogeneity without carrying out a sophisticated analysis. The data record of the 65 Russian coastal stations was tested by this method.
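As a rough illustration of the difference and ratio methods, the sketch below compares a candidate station with a reference (neighbouring) station and looks for a step change in the resulting series. The comparison against a neighbouring station, the function names, the simple split-mean search, and the sample values are all assumptions made for illustration; the procedure in [Drozdov, 1989] may differ in detail.

```python
from statistics import mean

def difference_series(candidate, reference):
    """Candidate minus reference, commonly used for additive quantities
    such as temperature or pressure."""
    return [c - r for c, r in zip(candidate, reference)]

def ratio_series(candidate, reference):
    """Candidate divided by reference, commonly used for strictly
    positive quantities; reference values must be non-zero."""
    return [c / r for c, r in zip(candidate, reference)]

def largest_mean_shift(series, min_segment=3):
    """Return (split_index, shift) for the split that maximizes the absolute
    difference between the mean after and the mean before the split."""
    best_index, best_shift = None, 0.0
    for k in range(min_segment, len(series) - min_segment + 1):
        shift = mean(series[k:]) - mean(series[:k])
        if abs(shift) > abs(best_shift):
            best_index, best_shift = k, shift
    return best_index, best_shift

# Hypothetical annual mean temperatures (deg C); the candidate station's
# offset from the reference jumps from 1.0 to 2.5 starting at index 4.
reference = [-10.0, -11.0, -9.5, -10.5, -10.0, -11.5, -9.0, -10.0]
candidate = [r + 1.0 for r in reference[:4]] + [r + 2.5 for r in reference[4:]]
diffs = difference_series(candidate, reference)
print(largest_mean_shift(diffs))
```

Because climate signals common to both stations cancel in the difference (or ratio) series, a persistent step in that series points to a change at one station rather than to regional variability. Deciding which station changed, and why, still requires station history and expert review.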
As in all preceding steps, an expert made the final decisions regarding quality control, including the advisability of testing observations at adjacent stations. Observations that passed all stages of the testing are included in the Atlas data base.