Methodology Physics authors titles recent submissions

8 take advantage of the developments at NCEP and the Joint Center for Satellite Data Assimilation JCSDA. Details can be found at the website https:gmao.gsfc.nasa.govforecasts. It has a spatial resolution of 0.25° latitude × 0.3125° longitude. We extracted relative humidity RH, , air temperature at a 2 m height TMP, K, wind speed at 10 m above ground WS, ms, surface pressure PS, kPa, and planetary boundary layer height PBL, m from the GEOS 5-FP dataset. Each variable was regridded to 0.01° to be consist with satellite observations. 2.4. Data integration Firstly, a 0.01-degree grid was created for the data integration, model establishment, and mapping. For each 0.01-degree grid, ground-level PM 2.5 measurements from multiple stations are averaged. Meanwhile, we resampled the meteorological data with a spatial resolution of 0.25°× 0.3125° to 0.01° to match with satellite observations. All the data were re-projected to the same coordinate system. Finally, we extracted satellite observations, meteorological parameters on locations where PM 2.5 measurement are available.

3. Methodology

The Ref-PM modeling was developed to estimate ground-level PM 2.5 from satellite reflectance rather than satellite-derived AOD. Generally, the structure of Ref-PM modeling is shown as Eq. 1, containing dependent variable PM 2.5 concentration and two types of predictors satellite observations and meteorological parameters. Details of Ref-PM modeling are illustrated in Fig. 2.   2.5 1, 3, 7, , , , , , , , , , PM f B B B SensorA SensorZ SolarA SolarZ RH WS TMP PBL PS NDVI  1 where 1, 3, 7 B B B top-of-atmosphere TOA reflectance and observation geometry sensor 9 azimuth, sensor zenith, solar azimuth, and solar zenith are utilized, inspired by the dark target DT algorithm Kaufman et al., 1997a; Kaufman et al., 1997b. Figure 2. The schematics of Ref-PM modeling for the estimation of PM 2.5. To model the specific relationship between PM 2.5 and predictors, a conventional statistical model multiple linear regression, MLR Gupta and Christopher 2009b, neural networks back-propagation neural network, BPNN; generalized regression neural network, GRNN Gupta and Christopher 2009a; Li et al., 2017b, and a deep learning model deep belief network, DBN Li et al., 2017a are adopted. On this basis, the spatiotemporal autocorrelation of PM 2.5 is incorporated into the Ref-PM modeling. The spatial and temporal terms in conjunction with the above variables can greatly improve the model performance. The details can be seen in somewhere else Li et al., 2017a, and the deep learning model which considers the spatiotemporal autocorrelation of PM 2.5 is denoted as Geoi-DBN. The objective of this study is to develop a new solution Ref-PM modeling to remote sensing of PM 2.5 concentration. We will thus compare the performance of Ref-PM modeling with that of AOD-PM modeling, based on the above models. 10 A 10-fold cross-validation CV technique Rodriguez et al., 2010 was used to test the potential of model overfitting and predictive power. Previous studies usually used sample-based CV Li et al., 2017b; Ma et al., 2014 or site-based CV Lee et al., 2011; Xie et al., 2015. For sample-based CV, all samples in the model dataset are randomly and equally divided into ten subsets. One subset is used as testing samples and the rest subsets are used to fit the model for each round of validation. For site-based CV, one monitoring station is used for validation and the remaining stations for model fitting in each round. In this study, we chose sample-based 10-fold CV, which is more widely used Ma et al., 2016. We adopted the statistical indicators of the coefficient of determination R 2 , the root-mean-square error RMSE, 3 g m  , the mean prediction error MPE, 3 g m  , and the relative prediction error RPE, defined as RMSE divided by the mean ground-level PM 2.5 to evaluate the model performance.

4. Results and discussion