Skip to content
Snippets Groups Projects
Commit 5011b448 authored by Frisinghelli Daniel's avatar Frisinghelli Daniel
Browse files

Added ROCSS for different thresholds.

parent 4c15d325
No related branches found
No related tags found
No related merge requests found
%% Cell type:markdown id:f2bfabd5-c8d6-46aa-b340-c169cc03bee7 tags: %% Cell type:markdown id:f2bfabd5-c8d6-46aa-b340-c169cc03bee7 tags:
# Precipitation: Sensitivity to wet day threshold # Precipitation: Sensitivity to wet day threshold
%% Cell type:markdown id:9f024696-3cb3-4637-ab30-c57df733aeed tags: %% Cell type:markdown id:9f024696-3cb3-4637-ab30-c57df733aeed tags:
We used **1981-1991 as training** period and **1991-2010 as reference** period. The results shown in this notebook are based on the model predictions on the reference period. We used **1981-1991 as training** period and **1991-2010 as reference** period. The results shown in this notebook are based on the model predictions on the reference period.
%% Cell type:markdown id:17ca3513-5757-470c-9d3c-7494b5bade56 tags: %% Cell type:markdown id:17ca3513-5757-470c-9d3c-7494b5bade56 tags:
**Predictors on pressure levels (500, 850)**: **Predictors on pressure levels (500, 850)**:
- Geopotential (z) - Geopotential (z)
- Temperature (t) - Temperature (t)
- Zonal wind (u) - Zonal wind (u)
- Meridional wind (v) - Meridional wind (v)
- Specific humidity (q) - Specific humidity (q)
**Predictors on surface**: **Predictors on surface**:
- Mean sea level pressure (msl) - Mean sea level pressure (msl)
**Auxiliary predictors**: **Auxiliary predictors**:
- Elevation from Copernicus EU-DEM v1.1 (dem) - Elevation from Copernicus EU-DEM v1.1 (dem)
- Day of the year (doy) - Day of the year (doy)
%% Cell type:markdown id:a7e732f3-a219-4317-aa42-638cae86b4af tags: %% Cell type:markdown id:a7e732f3-a219-4317-aa42-638cae86b4af tags:
Define the predictand and the model to evaluate: Define the predictand and the model to evaluate:
%% Cell type:code id:d75e3627-13c4-46b8-b749-d582e4b802ac tags: %% Cell type:code id:d75e3627-13c4-46b8-b749-d582e4b802ac tags:
``` python ``` python
# define the model parameters # define the model parameters
PREDICTAND = 'pr' PREDICTAND = 'pr'
MODEL = 'USegNet' MODEL = 'USegNet'
PPREDICTORS = 'ztuvq' PPREDICTORS = 'ztuvq'
PLEVELS = ['500', '850'] PLEVELS = ['500', '850']
SPREDICTORS = 'mslp' SPREDICTORS = 'mslp'
DEM = 'dem' DEM = 'dem'
DEM_FEATURES = '' DEM_FEATURES = ''
DOY = '' DOY = ''
WET_DAY_THRESHOLD = [0, 0.5, 1, 2, 3, 5] WET_DAY_THRESHOLD = [0, 0.5, 1, 2, 3, 5]
LOSS = 'BernoulliGammaLoss' LOSS = 'BernoulliGammaLoss'
``` ```
%% Cell type:markdown id:cda0cf8b-adce-4513-af69-31df6eb5542f tags: %% Cell type:markdown id:cda0cf8b-adce-4513-af69-31df6eb5542f tags:
### Imports ### Imports
%% Cell type:code id:01ed9fbc-49c2-4028-a3f3-692a014dcf89 tags: %% Cell type:code id:01ed9fbc-49c2-4028-a3f3-692a014dcf89 tags:
``` python ``` python
# builtins # builtins
import datetime import datetime
import warnings import warnings
import calendar import calendar
# externals # externals
import xarray as xr import xarray as xr
import numpy as np import numpy as np
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1.inset_locator import inset_axes from mpl_toolkits.axes_grid1.inset_locator import inset_axes
import scipy.stats as stats import scipy.stats as stats
from IPython.display import Image from IPython.display import Image
from sklearn.metrics import r2_score, roc_curve, auc, classification_report from sklearn.metrics import r2_score, roc_curve, auc, classification_report
# locals # locals
from climax.main.io import ERA5_PATH, OBS_PATH, TARGET_PATH, DEM_PATH from climax.main.io import ERA5_PATH, OBS_PATH, TARGET_PATH, DEM_PATH
from pysegcnn.core.utils import search_files from pysegcnn.core.utils import search_files
from pysegcnn.core.graphics import plot_classification_report from pysegcnn.core.graphics import plot_classification_report
``` ```
%% Cell type:code id:073dfc6a-8b33-4aca-ab71-c06ef92bbb80 tags: %% Cell type:code id:073dfc6a-8b33-4aca-ab71-c06ef92bbb80 tags:
``` python ``` python
# mapping from predictands to variable names # mapping from predictands to variable names
NAMES = {'tasmin': 'minimum temperature', 'tasmax': 'maximum temperature', 'pr': 'precipitation'} NAMES = {'tasmin': 'minimum temperature', 'tasmax': 'maximum temperature', 'pr': 'precipitation'}
``` ```
%% Cell type:markdown id:9410ed09-50a9-4d19-9e90-8551ab701509 tags: %% Cell type:markdown id:9410ed09-50a9-4d19-9e90-8551ab701509 tags:
### Load datasets ### Load datasets
%% Cell type:code id:030c221d-e0ad-4e67-b742-db34de549983 tags: %% Cell type:code id:030c221d-e0ad-4e67-b742-db34de549983 tags:
``` python ``` python
# construct file pattern to match # construct file pattern to match
PATTERN = '_'.join([MODEL, PREDICTAND, PPREDICTORS, *PLEVELS]) PATTERN = '_'.join([MODEL, PREDICTAND, PPREDICTORS, *PLEVELS])
PATTERN = '_'.join([PATTERN, SPREDICTORS]) if SPREDICTORS else PATTERN PATTERN = '_'.join([PATTERN, SPREDICTORS]) if SPREDICTORS else PATTERN
PATTERN = '_'.join([PATTERN, DEM]) if DEM else PATTERN PATTERN = '_'.join([PATTERN, DEM]) if DEM else PATTERN
PATTERN = '_'.join([PATTERN, DEM_FEATURES]) if DEM_FEATURES else PATTERN PATTERN = '_'.join([PATTERN, DEM_FEATURES]) if DEM_FEATURES else PATTERN
PATTERN = '_'.join([PATTERN, DOY]) if DOY else PATTERN PATTERN = '_'.join([PATTERN, DOY]) if DOY else PATTERN
``` ```
%% Cell type:code id:7106fed2-9f63-453b-a9af-b5d490efbbb4 tags: %% Cell type:code id:7106fed2-9f63-453b-a9af-b5d490efbbb4 tags:
``` python ``` python
# file pattern for different wet day thresholds # file pattern for different wet day thresholds
PATTERNS = ['_'.join([PATTERN, '{}mm_{}'.format(str(t).replace('.', ''), LOSS)]) for t in WET_DAY_THRESHOLD] PATTERNS = ['_'.join([PATTERN, '{}mm_{}'.format(str(t).replace('.', ''), LOSS)]) for t in WET_DAY_THRESHOLD]
PATTERNS PATTERNS
``` ```
%% Cell type:code id:0069c5d8-ee28-49a3-84f0-8af6b3bcb05d tags: %% Cell type:code id:0069c5d8-ee28-49a3-84f0-8af6b3bcb05d tags:
``` python ``` python
# load model predictions for different thresholds # load model predictions for different thresholds
y_pred = {k: xr.open_dataset(search_files(TARGET_PATH.joinpath('pr'), '.'.join([v, 'nc$'])).pop()) for k, v in zip(WET_DAY_THRESHOLD, PATTERNS)} y_pred = {k: xr.open_dataset(search_files(TARGET_PATH.joinpath('pr'), '.'.join([v, 'nc$'])).pop()) for k, v in zip(WET_DAY_THRESHOLD, PATTERNS)}
``` ```
%% Cell type:code id:316a35e3-49e6-4d02-b865-25710a621daa tags: %% Cell type:code id:316a35e3-49e6-4d02-b865-25710a621daa tags:
``` python ``` python
# load observations # load observations
y_true = xr.open_dataset(search_files(OBS_PATH.joinpath('pr'), 'OBS_pr(.*).nc$').pop()) y_true = xr.open_dataset(search_files(OBS_PATH.joinpath('pr'), 'OBS_pr(.*).nc$').pop())
y_true = y_true.sel(time=y_pred[0].time) y_true = y_true.sel(time=y_pred[0].time)
``` ```
%% Cell type:code id:58177982-7f86-4c9e-9980-563cf5d48852 tags: %% Cell type:code id:58177982-7f86-4c9e-9980-563cf5d48852 tags:
``` python ``` python
# align datasets and mask missing values # align datasets and mask missing values
for k, v in y_pred.items(): for k, v in y_pred.items():
_, y = xr.align(y_true, v, join='override') _, y = xr.align(y_true, v, join='override')
y = y.where(~np.isnan(y_true.precipitation), other=np.nan) y = y.where(~np.isnan(y_true.precipitation), other=np.nan)
y_pred[k] = y y_pred[k] = y
``` ```
%% Cell type:markdown id:5a72117c-2e1c-462a-968e-91db18b4aaaf tags: %% Cell type:markdown id:5a72117c-2e1c-462a-968e-91db18b4aaaf tags:
## Model sensitivity ## Model sensitivity
%% Cell type:markdown id:e505a78a-e04a-4cb3-bcdf-c68ed7c402a2 tags: %% Cell type:markdown id:e505a78a-e04a-4cb3-bcdf-c68ed7c402a2 tags:
### Coefficient of determination $R^2$ ### Coefficient of determination $R^2$
%% Cell type:code id:9758d58b-82e9-47b2-9cf7-1c93d5ed04a7 tags: %% Cell type:code id:9758d58b-82e9-47b2-9cf7-1c93d5ed04a7 tags:
``` python ``` python
# calculate true monthly mean precipitation (mm / month) # calculate true monthly mean precipitation (mm / month)
y_true_values = y_true[NAMES['pr']].resample(time='1M').sum(skipna=False).groupby('time.month').mean(dim='time').values y_true_values = y_true[NAMES['pr']].resample(time='1M').sum(skipna=False).groupby('time.month').mean(dim='time').values
``` ```
%% Cell type:code id:57ca80c6-2c02-44d1-87eb-35c8a78ec71f tags: %% Cell type:code id:57ca80c6-2c02-44d1-87eb-35c8a78ec71f tags:
``` python ``` python
# calculate predicted monthly mean precipitation (mm / month) # calculate predicted monthly mean precipitation (mm / month)
for k, v in y_pred.items(): for k, v in y_pred.items():
y_pred_values = v[NAMES['pr']].resample(time='1M').sum(skipna=False).groupby('time.month').mean(dim='time').values y_pred_values = v[NAMES['pr']].resample(time='1M').sum(skipna=False).groupby('time.month').mean(dim='time').values
# apply mask of valid pixels # apply mask of valid pixels
mask = (~np.isnan(y_pred_values) & ~np.isnan(y_true_values)) mask = (~np.isnan(y_pred_values) & ~np.isnan(y_true_values))
y_pred_values = y_pred_values[mask] y_pred_values = y_pred_values[mask]
y_true_masked = y_true_values[mask] y_true_masked = y_true_values[mask]
# calculate coefficient of determination # calculate coefficient of determination
r2 = r2_score(y_true_masked, y_pred_values) r2 = r2_score(y_true_masked, y_pred_values)
print('({:.1f} mm), r2: {:.2f}'.format(k, r2)) print('({:.1f} mm), r2: {:.2f}'.format(k, r2))
``` ```
%% Cell type:markdown id:788a97c1-b913-4782-aca0-318d4da1180d tags: %% Cell type:markdown id:788a97c1-b913-4782-aca0-318d4da1180d tags:
### Bias, MAE, and RMSE: Mean ### Bias, MAE, and RMSE: Mean
%% Cell type:markdown id:bd8971e9-1c25-4635-9ff4-8c8dea822066 tags: %% Cell type:markdown id:bd8971e9-1c25-4635-9ff4-8c8dea822066 tags:
Calculate yearly average bias over entire reference period: Calculate yearly average bias over entire reference period:
%% Cell type:code id:02c99a93-530c-4570-a7ed-4cbf1940222f tags: %% Cell type:code id:02c99a93-530c-4570-a7ed-4cbf1940222f tags:
``` python ``` python
# yearly average precipitation # yearly average precipitation
y_true_yearly_avg = y_true.groupby('time.year').mean(dim='time') y_true_yearly_avg = y_true.groupby('time.year').mean(dim='time')
``` ```
%% Cell type:code id:bc85224d-68d8-4791-8eec-ab39e16898b7 tags: %% Cell type:code id:bc85224d-68d8-4791-8eec-ab39e16898b7 tags:
``` python ``` python
# yearly average bias over reference period # yearly average bias over reference period
for k, v in y_pred.items(): for k, v in y_pred.items():
y_pred_yearly_avg = v[NAMES['pr']].groupby('time.year').mean(dim='time') y_pred_yearly_avg = v[NAMES['pr']].groupby('time.year').mean(dim='time')
bias_yearly_avg = ((y_pred_yearly_avg - y_true_yearly_avg) / y_true_yearly_avg) * 100 bias_yearly_avg = ((y_pred_yearly_avg - y_true_yearly_avg) / y_true_yearly_avg) * 100
mae_yearly_avg = np.abs(y_pred_yearly_avg - y_true_yearly_avg) mae_yearly_avg = np.abs(y_pred_yearly_avg - y_true_yearly_avg)
rmse_yearly_avg = ((y_pred_yearly_avg - y_true_yearly_avg) ** 2).mean() rmse_yearly_avg = ((y_pred_yearly_avg - y_true_yearly_avg) ** 2).mean()
print('({:.1f} mm), relative bias = {:.2f}%'.format(k, bias_yearly_avg.mean().to_array().item())) print('({:.1f} mm), relative bias = {:.2f}%'.format(k, bias_yearly_avg.mean().to_array().item()))
print('({:.1f} mm), MAE = {:.2f} mm'.format(k, mae_yearly_avg.mean().to_array().item())) print('({:.1f} mm), MAE = {:.2f} mm'.format(k, mae_yearly_avg.mean().to_array().item()))
print('({:.1f} mm), RMSE = {:.2f} mm'.format(k, rmse_yearly_avg.mean().to_array().item())) print('({:.1f} mm), RMSE = {:.2f} mm'.format(k, rmse_yearly_avg.mean().to_array().item()))
``` ```
%% Cell type:markdown id:9a8377ea-1634-48d4-ae6c-84ea07e53a15 tags: %% Cell type:markdown id:9a8377ea-1634-48d4-ae6c-84ea07e53a15 tags:
### Bias, MAE, RMSE: Extremes ### Bias, MAE, RMSE: Extremes
%% Cell type:code id:0a5d2707-70f5-4b63-8427-2e8ba52922d3 tags: %% Cell type:code id:0a5d2707-70f5-4b63-8427-2e8ba52922d3 tags:
``` python ``` python
# extreme quantile of interest # extreme quantile of interest
quantile = 0.98 quantile = 0.98
``` ```
%% Cell type:code id:5d072ecf-c9f6-48de-9407-66a67b18e3b1 tags: %% Cell type:code id:5d072ecf-c9f6-48de-9407-66a67b18e3b1 tags:
``` python ``` python
# calculate observed extreme quantile for each year # calculate observed extreme quantile for each year
with warnings.catch_warnings(): with warnings.catch_warnings():
warnings.simplefilter('ignore', category=RuntimeWarning) warnings.simplefilter('ignore', category=RuntimeWarning)
y_true_ex = y_true.groupby('time.year').quantile(quantile, dim='time') y_true_ex = y_true.groupby('time.year').quantile(quantile, dim='time')
``` ```
%% Cell type:code id:d08ee7b4-4546-4244-b1ba-6fdc31667b3f tags: %% Cell type:code id:d08ee7b4-4546-4244-b1ba-6fdc31667b3f tags:
``` python ``` python
# calculate predicted extreme quantile for each year # calculate predicted extreme quantile for each year
for k, v in y_pred.items(): for k, v in y_pred.items():
with warnings.catch_warnings(): with warnings.catch_warnings():
warnings.simplefilter('ignore', category=RuntimeWarning) warnings.simplefilter('ignore', category=RuntimeWarning)
y_ex = v.groupby('time.year').quantile(quantile, dim='time') y_ex = v.groupby('time.year').quantile(quantile, dim='time')
# calculate bias in extreme quantile for each year # calculate bias in extreme quantile for each year
bias_ex = ((y_ex - y_true_ex) / y_true_ex) * 100 bias_ex = ((y_ex - y_true_ex) / y_true_ex) * 100
# mean absolute error in extreme quantile # mean absolute error in extreme quantile
mae_ex = np.abs(y_ex - y_true_ex).mean() mae_ex = np.abs(y_ex - y_true_ex).mean()
# root mean squared error in extreme quantile # root mean squared error in extreme quantile
rmse_ex = ((y_ex - y_true_ex) ** 2).mean() rmse_ex = ((y_ex - y_true_ex) ** 2).mean()
print('({:.1f} mm), relative bias of P{:.0f} = {:.2f}%'.format(k, quantile * 100, bias_ex.mean().to_array().item())) print('({:.1f} mm), relative bias of P{:.0f} = {:.2f}%'.format(k, quantile * 100, bias_ex.mean().to_array().item()))
print('({:.1f} mm), MAE of P{:.0f} = {:.2f} mm'.format(k, quantile * 100, mae_ex.mean().to_array().item())) print('({:.1f} mm), MAE of P{:.0f} = {:.2f} mm'.format(k, quantile * 100, mae_ex.mean().to_array().item()))
print('({:.1f} mm), RMSE of P{:.0f} = {:.2f} mm'.format(k, quantile * 100, rmse_ex.mean().to_array().item())) print('({:.1f} mm), RMSE of P{:.0f} = {:.2f} mm'.format(k, quantile * 100, rmse_ex.mean().to_array().item()))
``` ```
%% Cell type:code id:26b130a6-5caa-490f-93ce-72f4a5f53412 tags: %% Cell type:markdown id:f6dfd17e-6eb4-465f-a624-bd5c8f79c99b tags:
### ROC: Receiver operating characteristics
%% Cell type:code id:92733418-8840-46c2-bf33-adf93b78c646 tags:
``` python ``` python
# true and predicted probability of precipitation
for k, v in y_pred.items():
p_true = (y_true.precipitation > k).values.flatten()
p_pred = v.prob.values.flatten()
# apply mask of valid pixels
mask = (~np.isnan(p_true) & ~np.isnan(p_pred))
p_pred = p_pred[mask]
p_true = p_true[mask].astype(float)
# calculate ROC: false positive rate vs. true positive rate
fpr, tpr, _ = roc_curve(p_true, p_pred)
area = auc(fpr, tpr) # area under ROC curve
rocss = 2 * area - 1 # ROC skill score (cf. https://journals.ametsoc.org/view/journals/clim/16/24/1520-0442_2003_016_4145_otrsop_2.0.co_2.xml)
print('({:.1f} mm), ROCSS = {:.2f}'.format(k, rocss))
``` ```
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment