Skip to content
GitLab
Explore
Sign in
Primary navigation
Search or go to…
Project
C
Climax
Manage
Activity
Members
Labels
Plan
Issues
Issue boards
Milestones
Wiki
Code
Merge requests
Repository
Branches
Commits
Tags
Repository graph
Compare revisions
Snippets
Deploy
Releases
Package registry
Container Registry
Model registry
Operate
Terraform modules
Monitor
Incidents
Analyze
Value stream analytics
Contributor analytics
Repository analytics
Model experiments
Help
Help
Support
GitLab documentation
Compare GitLab plans
Community forum
Contribute to GitLab
Provide feedback
Keyboard shortcuts
?
Snippets
Groups
Projects
Show more breadcrumbs
earth_observation_public
Climax
Commits
5011b448
Commit
5011b448
authored
3 years ago
by
Frisinghelli Daniel
Browse files
Options
Downloads
Patches
Plain Diff
Added ROCSS for different thresholds.
parent
4c15d325
No related branches found
No related tags found
No related merge requests found
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
Notebooks/eval_precipitation_sensitivity.ipynb
+26
-2
26 additions, 2 deletions
Notebooks/eval_precipitation_sensitivity.ipynb
with
26 additions
and
2 deletions
Notebooks/eval_precipitation_sensitivity.ipynb
+
26
−
2
View file @
5011b448
...
...
@@ -334,13 +334,37 @@
" print('({:.1f} mm), RMSE of P{:.0f} = {:.2f} mm'.format(k, quantile * 100, rmse_ex.mean().to_array().item()))"
]
},
{
"cell_type": "markdown",
"id": "f6dfd17e-6eb4-465f-a624-bd5c8f79c99b",
"metadata": {},
"source": [
"### ROC: Receiver operating characteristics"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "
26b130a6-5caa-490f-93ce-72f4a5f53412
",
"id": "
92733418-8840-46c2-bf33-adf93b78c646
",
"metadata": {},
"outputs": [],
"source": []
"source": [
"# true and predicted probability of precipitation\n",
"for k, v in y_pred.items():\n",
" p_true = (y_true.precipitation > k).values.flatten()\n",
" p_pred = v.prob.values.flatten()\n",
" \n",
" # apply mask of valid pixels\n",
" mask = (~np.isnan(p_true) & ~np.isnan(p_pred))\n",
" p_pred = p_pred[mask]\n",
" p_true = p_true[mask].astype(float)\n",
" \n",
" # calculate ROC: false positive rate vs. true positive rate\n",
" fpr, tpr, _ = roc_curve(p_true, p_pred)\n",
" area = auc(fpr, tpr) # area under ROC curve\n",
" rocss = 2 * area - 1 # ROC skill score (cf. https://journals.ametsoc.org/view/journals/clim/16/24/1520-0442_2003_016_4145_otrsop_2.0.co_2.xml)\n",
" print('({:.1f} mm), ROCSS = {:.2f}'.format(k, rocss))"
]
}
],
"metadata": {
...
...
%% Cell type:markdown id:f2bfabd5-c8d6-46aa-b340-c169cc03bee7 tags:
# Precipitation: Sensitivity to wet day threshold
%% Cell type:markdown id:9f024696-3cb3-4637-ab30-c57df733aeed tags:
We used
**1981-1991 as training**
period and
**1991-2010 as reference**
period. The results shown in this notebook are based on the model predictions on the reference period.
%% Cell type:markdown id:17ca3513-5757-470c-9d3c-7494b5bade56 tags:
**Predictors on pressure levels (500, 850)**
:
-
Geopotential (z)
-
Temperature (t)
-
Zonal wind (u)
-
Meridional wind (v)
-
Specific humidity (q)
**Predictors on surface**
:
-
Mean sea level pressure (msl)
**Auxiliary predictors**
:
-
Elevation from Copernicus EU-DEM v1.1 (dem)
-
Day of the year (doy)
%% Cell type:markdown id:a7e732f3-a219-4317-aa42-638cae86b4af tags:
Define the predictand and the model to evaluate:
%% Cell type:code id:d75e3627-13c4-46b8-b749-d582e4b802ac tags:
```
python
# define the model parameters
PREDICTAND
=
'
pr
'
MODEL
=
'
USegNet
'
PPREDICTORS
=
'
ztuvq
'
PLEVELS
=
[
'
500
'
,
'
850
'
]
SPREDICTORS
=
'
mslp
'
DEM
=
'
dem
'
DEM_FEATURES
=
''
DOY
=
''
WET_DAY_THRESHOLD
=
[
0
,
0.5
,
1
,
2
,
3
,
5
]
LOSS
=
'
BernoulliGammaLoss
'
```
%% Cell type:markdown id:cda0cf8b-adce-4513-af69-31df6eb5542f tags:
### Imports
%% Cell type:code id:01ed9fbc-49c2-4028-a3f3-692a014dcf89 tags:
```
python
# builtins
import
datetime
import
warnings
import
calendar
# externals
import
xarray
as
xr
import
numpy
as
np
import
matplotlib.pyplot
as
plt
from
mpl_toolkits.axes_grid1.inset_locator
import
inset_axes
import
scipy.stats
as
stats
from
IPython.display
import
Image
from
sklearn.metrics
import
r2_score
,
roc_curve
,
auc
,
classification_report
# locals
from
climax.main.io
import
ERA5_PATH
,
OBS_PATH
,
TARGET_PATH
,
DEM_PATH
from
pysegcnn.core.utils
import
search_files
from
pysegcnn.core.graphics
import
plot_classification_report
```
%% Cell type:code id:073dfc6a-8b33-4aca-ab71-c06ef92bbb80 tags:
```
python
# mapping from predictands to variable names
NAMES
=
{
'
tasmin
'
:
'
minimum temperature
'
,
'
tasmax
'
:
'
maximum temperature
'
,
'
pr
'
:
'
precipitation
'
}
```
%% Cell type:markdown id:9410ed09-50a9-4d19-9e90-8551ab701509 tags:
### Load datasets
%% Cell type:code id:030c221d-e0ad-4e67-b742-db34de549983 tags:
```
python
# construct file pattern to match
PATTERN
=
'
_
'
.
join
([
MODEL
,
PREDICTAND
,
PPREDICTORS
,
*
PLEVELS
])
PATTERN
=
'
_
'
.
join
([
PATTERN
,
SPREDICTORS
])
if
SPREDICTORS
else
PATTERN
PATTERN
=
'
_
'
.
join
([
PATTERN
,
DEM
])
if
DEM
else
PATTERN
PATTERN
=
'
_
'
.
join
([
PATTERN
,
DEM_FEATURES
])
if
DEM_FEATURES
else
PATTERN
PATTERN
=
'
_
'
.
join
([
PATTERN
,
DOY
])
if
DOY
else
PATTERN
```
%% Cell type:code id:7106fed2-9f63-453b-a9af-b5d490efbbb4 tags:
```
python
# file pattern for different wet day thresholds
PATTERNS
=
[
'
_
'
.
join
([
PATTERN
,
'
{}mm_{}
'
.
format
(
str
(
t
).
replace
(
'
.
'
,
''
),
LOSS
)])
for
t
in
WET_DAY_THRESHOLD
]
PATTERNS
```
%% Cell type:code id:0069c5d8-ee28-49a3-84f0-8af6b3bcb05d tags:
```
python
# load model predictions for different thresholds
y_pred
=
{
k
:
xr
.
open_dataset
(
search_files
(
TARGET_PATH
.
joinpath
(
'
pr
'
),
'
.
'
.
join
([
v
,
'
nc$
'
])).
pop
())
for
k
,
v
in
zip
(
WET_DAY_THRESHOLD
,
PATTERNS
)}
```
%% Cell type:code id:316a35e3-49e6-4d02-b865-25710a621daa tags:
```
python
# load observations
y_true
=
xr
.
open_dataset
(
search_files
(
OBS_PATH
.
joinpath
(
'
pr
'
),
'
OBS_pr(.*).nc$
'
).
pop
())
y_true
=
y_true
.
sel
(
time
=
y_pred
[
0
].
time
)
```
%% Cell type:code id:58177982-7f86-4c9e-9980-563cf5d48852 tags:
```
python
# align datasets and mask missing values
for
k
,
v
in
y_pred
.
items
():
_
,
y
=
xr
.
align
(
y_true
,
v
,
join
=
'
override
'
)
y
=
y
.
where
(
~
np
.
isnan
(
y_true
.
precipitation
),
other
=
np
.
nan
)
y_pred
[
k
]
=
y
```
%% Cell type:markdown id:5a72117c-2e1c-462a-968e-91db18b4aaaf tags:
## Model sensitivity
%% Cell type:markdown id:e505a78a-e04a-4cb3-bcdf-c68ed7c402a2 tags:
### Coefficient of determination $R^2$
%% Cell type:code id:9758d58b-82e9-47b2-9cf7-1c93d5ed04a7 tags:
```
python
# calculate true monthly mean precipitation (mm / month)
y_true_values
=
y_true
[
NAMES
[
'
pr
'
]].
resample
(
time
=
'
1M
'
).
sum
(
skipna
=
False
).
groupby
(
'
time.month
'
).
mean
(
dim
=
'
time
'
).
values
```
%% Cell type:code id:57ca80c6-2c02-44d1-87eb-35c8a78ec71f tags:
```
python
# calculate predicted monthly mean precipitation (mm / month)
for
k
,
v
in
y_pred
.
items
():
y_pred_values
=
v
[
NAMES
[
'
pr
'
]].
resample
(
time
=
'
1M
'
).
sum
(
skipna
=
False
).
groupby
(
'
time.month
'
).
mean
(
dim
=
'
time
'
).
values
# apply mask of valid pixels
mask
=
(
~
np
.
isnan
(
y_pred_values
)
&
~
np
.
isnan
(
y_true_values
))
y_pred_values
=
y_pred_values
[
mask
]
y_true_masked
=
y_true_values
[
mask
]
# calculate coefficient of determination
r2
=
r2_score
(
y_true_masked
,
y_pred_values
)
print
(
'
({:.1f} mm), r2: {:.2f}
'
.
format
(
k
,
r2
))
```
%% Cell type:markdown id:788a97c1-b913-4782-aca0-318d4da1180d tags:
### Bias, MAE, and RMSE: Mean
%% Cell type:markdown id:bd8971e9-1c25-4635-9ff4-8c8dea822066 tags:
Calculate yearly average bias over entire reference period:
%% Cell type:code id:02c99a93-530c-4570-a7ed-4cbf1940222f tags:
```
python
# yearly average precipitation
y_true_yearly_avg
=
y_true
.
groupby
(
'
time.year
'
).
mean
(
dim
=
'
time
'
)
```
%% Cell type:code id:bc85224d-68d8-4791-8eec-ab39e16898b7 tags:
```
python
# yearly average bias over reference period
for
k
,
v
in
y_pred
.
items
():
y_pred_yearly_avg
=
v
[
NAMES
[
'
pr
'
]].
groupby
(
'
time.year
'
).
mean
(
dim
=
'
time
'
)
bias_yearly_avg
=
((
y_pred_yearly_avg
-
y_true_yearly_avg
)
/
y_true_yearly_avg
)
*
100
mae_yearly_avg
=
np
.
abs
(
y_pred_yearly_avg
-
y_true_yearly_avg
)
rmse_yearly_avg
=
((
y_pred_yearly_avg
-
y_true_yearly_avg
)
**
2
).
mean
()
print
(
'
({:.1f} mm), relative bias = {:.2f}%
'
.
format
(
k
,
bias_yearly_avg
.
mean
().
to_array
().
item
()))
print
(
'
({:.1f} mm), MAE = {:.2f} mm
'
.
format
(
k
,
mae_yearly_avg
.
mean
().
to_array
().
item
()))
print
(
'
({:.1f} mm), RMSE = {:.2f} mm
'
.
format
(
k
,
rmse_yearly_avg
.
mean
().
to_array
().
item
()))
```
%% Cell type:markdown id:9a8377ea-1634-48d4-ae6c-84ea07e53a15 tags:
### Bias, MAE, RMSE: Extremes
%% Cell type:code id:0a5d2707-70f5-4b63-8427-2e8ba52922d3 tags:
```
python
# extreme quantile of interest
quantile
=
0.98
```
%% Cell type:code id:5d072ecf-c9f6-48de-9407-66a67b18e3b1 tags:
```
python
# calculate observed extreme quantile for each year
with
warnings
.
catch_warnings
():
warnings
.
simplefilter
(
'
ignore
'
,
category
=
RuntimeWarning
)
y_true_ex
=
y_true
.
groupby
(
'
time.year
'
).
quantile
(
quantile
,
dim
=
'
time
'
)
```
%% Cell type:code id:d08ee7b4-4546-4244-b1ba-6fdc31667b3f tags:
```
python
# calculate predicted extreme quantile for each year
for
k
,
v
in
y_pred
.
items
():
with
warnings
.
catch_warnings
():
warnings
.
simplefilter
(
'
ignore
'
,
category
=
RuntimeWarning
)
y_ex
=
v
.
groupby
(
'
time.year
'
).
quantile
(
quantile
,
dim
=
'
time
'
)
# calculate bias in extreme quantile for each year
bias_ex
=
((
y_ex
-
y_true_ex
)
/
y_true_ex
)
*
100
# mean absolute error in extreme quantile
mae_ex
=
np
.
abs
(
y_ex
-
y_true_ex
).
mean
()
# root mean squared error in extreme quantile
rmse_ex
=
((
y_ex
-
y_true_ex
)
**
2
).
mean
()
print
(
'
({:.1f} mm), relative bias of P{:.0f} = {:.2f}%
'
.
format
(
k
,
quantile
*
100
,
bias_ex
.
mean
().
to_array
().
item
()))
print
(
'
({:.1f} mm), MAE of P{:.0f} = {:.2f} mm
'
.
format
(
k
,
quantile
*
100
,
mae_ex
.
mean
().
to_array
().
item
()))
print
(
'
({:.1f} mm), RMSE of P{:.0f} = {:.2f} mm
'
.
format
(
k
,
quantile
*
100
,
rmse_ex
.
mean
().
to_array
().
item
()))
```
%% Cell type:code id:26b130a6-5caa-490f-93ce-72f4a5f53412 tags:
%% Cell type:markdown id:f6dfd17e-6eb4-465f-a624-bd5c8f79c99b tags:
### ROC: Receiver operating characteristics
%% Cell type:code id:92733418-8840-46c2-bf33-adf93b78c646 tags:
```
python
# true and predicted probability of precipitation
for
k
,
v
in
y_pred
.
items
():
p_true
=
(
y_true
.
precipitation
>
k
).
values
.
flatten
()
p_pred
=
v
.
prob
.
values
.
flatten
()
# apply mask of valid pixels
mask
=
(
~
np
.
isnan
(
p_true
)
&
~
np
.
isnan
(
p_pred
))
p_pred
=
p_pred
[
mask
]
p_true
=
p_true
[
mask
].
astype
(
float
)
# calculate ROC: false positive rate vs. true positive rate
fpr
,
tpr
,
_
=
roc_curve
(
p_true
,
p_pred
)
area
=
auc
(
fpr
,
tpr
)
# area under ROC curve
rocss
=
2
*
area
-
1
# ROC skill score (cf. https://journals.ametsoc.org/view/journals/clim/16/24/1520-0442_2003_016_4145_otrsop_2.0.co_2.xml)
print
(
'
({:.1f} mm), ROCSS = {:.2f}
'
.
format
(
k
,
rocss
))
```
...
...
This diff is collapsed.
Click to expand it.
Preview
0%
Loading
Try again
or
attach a new file
.
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Save comment
Cancel
Please
register
or
sign in
to comment