Skip to content

Bias_Correction_demo

sixtohg edited this page Sep 6, 2014 · 1 revision

Demo

Aim of the demo

The aim of this example is to obtain a bias corrected time series of the NCEP/NCAR Reanalisys 1 in a particular location for the period 2001-2010 building upon the observed and simulated time series for the period 1991-2000, using the different bias correction techniques included in downscaleR.

All the required data for the first part of the practice are included in the built-in datasets in the downscaleR library, although note that NCEP/NCAR Reanalisys 1 data can be also accessed via the ECOMS User Data Gateway using the loadECOMS interface of the ecomsUDG.Raccess package.

The example will be done on the Navacerrada weather station, close to Madrid - Spain, and for the NCEP/NCAR Reanalysis 1, but this demo could be applied for several locations or gridded datasets, and for several members of a seasonal forecasting model.

Key inputs

  • Target variable: Mean surface daily temperature (tmean)
  • Observations: GSN station data in Navacerrada (Madrid - Spain)
  • Globel Circulation Model: Nearest point to Navacerrada from the NCEP/NCAR Reanalysis 1.
  • Target season: Boreal winter (DJF)
  • Calibration (training) period: 1991-2000
  • Test period: 2001-2010
  • Bias Correction methods: All the methods applicable for mean temperature (unbiasing, qq-map, ISI-MIP, etc.)

Demo

First of all, we must to install or load the downscaleR R-package:

# If we have not installed the "devtools" library we should do it:
install.packages("devtools")
library(devtools)
# Installing the downscaleR R-package
devtools::install_github(c("SantanderMetGroup/downscaleR.java@stable","SantanderMetGroup/downscaleR@stable"))
library(downscaleR)

In this example we will use the observations and simulations included by default in the downscaleR R-package: the GCOS Surface Network - GSN over the Iberian Peninsula and a subset created for this R-package of the NCEP/NCAR Reanalysis 1.

# Where is the dataset GSN located?
gsn.data.dir <- file.path(find.package("downscaleR"), "datasets/observations/GSN_Iberia")
# What variables includes this dataset?
# stationInfo provides a quick overview of the dataset:
stationInfo(gsn.data.dir)

Fig1:gsnStations

We use only one station of the dataset. We could choose it by name, location or identifier, but we should use the station Id as input in the loadStationData function. In our case we have chosen the Navacerrada station (stationID = "SP000008215") located near Madrid.

obs <- loadStationData(dataset = gsn.data.dir, file.format = "ascii", var="tmean", stationID = "SP000008215", season=c(12,1,2), years = 1991:2000)

To apply the bias correction methods we should have the same variable in the observational and the model datasets. Then, if we want to calibrate the mean temperature (tmean) we look for the corresponding variable in the model, 2T in our case. The downscaleR vocabulary and the inventory of the NCEP/NCAR reanalysis could help us to identify the proper variable.

data(vocabulary)
print(vocabulary) 
ncep.data.dir <- file.path(find.package("downscaleR"), "datasets/reanalysis/Iberia_NCEP/Iberia_NCEP.ncml")
ncep.inv <- dataInventory(ncep.data.dir)
str(ncep.inv)
# We look for the nearest point of the model to the station chosen:
col <- which((ncep.inv$T$Dimensions$lon$Values-obs$xyCoords[1])^2 == min((ncep.inv$T$Dimensions$lon$Values-obs$xyCoords[1])^2, na.rm = TRUE)) 
row <- which((ncep.inv$T$Dimensions$lat$Values-obs$xyCoords[2])^2 == min((ncep.inv$T$Dimensions$lat$Values-obs$xyCoords[2])^2, na.rm = TRUE)) 
# We load the data in the control period to adjust the method:
prd <- loadGridData(dataset = ncep.data.dir, var = "tas", lonLim = c(ncep.inv$T$Dimensions$lon$Values[col]-2.5,ncep.inv$T$Dimensions$lon$Values[col]+2.5), latLim= c(ncep.inv$T$Dimensions$lat$Values[row]-2.5,ncep.inv$T$Dimensions$lat$Values[row]+2.5), season=c(12,1,2), years = 1991:2000)
# We should interpolate the observations to the grid of model: we use the method "nearest" and the getGrid function to ensure spatial consistency:
obs <- interpGridData(obs, new.grid = getGrid(prd), method = "nearest")

Once both the observations and model data are loaded and defined in the same locations, we can apply the bias correction methodologies available in downscaleR. To analyze properly the effect of the different methods we apply them to adjust the same prd object used as control simulation.

prd1 <- biasCorrection (obs, prd, prd, method = "unbiasing")# bias
prd2 <- biasCorrection (obs, prd, prd, method = "qqmap")# qq-mapping
prd3 <- isimip(obs, prd, prd)# isimip

We could compare the observation (black line) with the uncorrected (red line) and adjusted (blue line) series using several graphics.

row1 <- 2
col1 <- 2
# Time Series
par(mfrow = c(1,3))
plot(obs$Data[,row1,col1], type = "l", col = "black")
lines(prd$Data[,row1,col1],  col = "red")
lines(prd1$Data[,row1,col1],  col = "blue")
plot(obs$Data[,row1,col1], type = "l", col = "black")
lines(prd$Data[,row1,col1],  col = "red")
lines(prd2$Data[,row1,col1],  col = "blue")
plot(obs$Data[,row1,col1], type = "l", col = "black")
lines(prd$Data[,row1,col1],  col = "red")
lines(prd3$Data[,row1,col1],  col = "blue")

Fig2:timeSeriesTrain

Using the quantile-quantile plots we could see clearer the effect of the different bias correction techniques.

# Note that the observations could have missing data
indNaN <-which(!is.na(obs$Data[,row1,col1]))
#  scatter-plots
par(mfrow = c(1,3))
qqplot(sort(obs$Data[indNaN,row1,col1], decreasing = FALSE, na.last = NA), sort(prd$Data[indNaN,row1,col1], decreasing = FALSE, na.last = NA), plot.it = TRUE, xlab = "Observations", ylab = "Prediction", col = "red")
points(sort(obs$Data[indNaN,row1,col1], decreasing = FALSE, na.last = NA), sort(prd1$Data[indNaN,row1,col1], decreasing = FALSE, na.last = NA), type = "p", col = "blue", asp = 1)
qqplot(sort(obs$Data[indNaN,row1,col1], decreasing = FALSE, na.last = NA), sort(prd$Data[indNaN,row1,col1], decreasing = FALSE, na.last = NA), plot.it = TRUE, xlab = "Observations", ylab = "Prediction", col = "red")
points(sort(obs$Data[indNaN,row1,col1], decreasing = FALSE, na.last = NA), sort(prd2$Data[indNaN,row1,col1], decreasing = FALSE, na.last = NA), type = "p", col = "blue", asp = 1)
qqplot(sort(obs$Data[indNaN,row1,col1], decreasing = FALSE, na.last = NA), sort(prd$Data[indNaN,row1,col1], decreasing = FALSE, na.last = NA), plot.it = TRUE, xlab = "Observations", ylab = "Prediction", col = "red")
points(sort(obs$Data[indNaN,row1,col1], decreasing = FALSE, na.last = NA), sort(prd3$Data[indNaN,row1,col1], decreasing = FALSE, na.last = NA), type = "p", col = "blue", asp = 1)

Fig3:qqPlotTrain

Finally, once we know how the bias correction method work, we could apply them to unknown conditions,

# We apply in unknown conditions:
sim <- loadGridData(dataset = ncep.data.dir, var = "tas", lonLim = c(ncep.inv$T$Dimensions$lon$Values[col]-2.5,ncep.inv$T$Dimensions$lon$Values[col]+2.5), latLim= c(ncep.inv$T$Dimensions$lat$Values[row]-2.5,ncep.inv$T$Dimensions$lat$Values[row]+2.5), season=c(12,1,2), years = 2001:2010)
sim1 <- biasCorrection (obs, prd, sim, method = "unbiasing")# bias
sim2 <- biasCorrection (obs, prd, sim, method = "qqmap")# qq-mapping
sim3 <- isimip(obs, prd, sim)# isimip
# We show the resulting as time Series
par(mfrow = c(1,3))
plot(sim$Data[,row1,col1], type = "l", col = "black")
points(sim1$Data[,row1,col1],  col = "blue")
plot(sim$Data[,row1,col1], type = "l", col = "black")
points(sim2$Data[,row1,col1],  col = "blue")
plot(sim$Data[,row1,col1], type = "l", col = "black")
points(sim3$Data[,row1,col1],  col = "blue")

Fig4:timeSeriesTest

# or scatter-plots
par(mfrow = c(1,3))
plot(sort(sim$Data[,row1,col1], decreasing = FALSE, na.last = NA), col = "red", type = "p")
points(sort(sim1$Data[,row1,col1], decreasing = FALSE, na.last = NA), col = "blue")
plot(sort(sim$Data[,row1,col1], decreasing = FALSE, na.last = NA), col = "red", type = "p")
points(sort(sim2$Data[,row1,col1], decreasing = FALSE, na.last = NA), col = "blue")
plot(sort(sim$Data[,row1,col1], decreasing = FALSE, na.last = NA), col = "red", type = "p")
points(sort(sim3$Data[,row1,col1], decreasing = FALSE, na.last = NA), col = "blue")

Fig5:qqPlotTest

Clone this wiki locally