Note
An extensive walk through DownClim¶
Although all the methods and functions described below are fully functional, this is not the recommended way to use the library, because it does not take full advantage of the workflow and abstractions it provides. See the other notebooks in the `examples` section for a more efficient way to use `DownClim`.
That said, you may eventually find your own way of using `DownClim`!
Definition of the workflow¶
In this intentionally simplified example, we will use the DownClim library to:
- download CHELSA data for the baseline period 1980-1981
- download CHIRPS and GSHTD data for the evaluation period 2006-2007
- download CORDEX and CMIP6 data for the baseline, evaluation and projection periods
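The three periods recur in every download call further down, so it can help to define them once. The sketch below is an illustration and not part of DownClim: `PERIODS` and `years_in` are hypothetical names, the values are taken from the download cells, and the inclusive reading of `(start, end)` is inferred from the CHELSA log, which fetches both 1980 and 1981 for `period=(1980, 1981)`.

```python
# Hypothetical helper, not part of DownClim: keep the periods in one place.
PERIODS = {
    "baseline": (1980, 1981),
    "evaluation": (2006, 2007),
    "projection": (2071, 2072),
}


def years_in(period: tuple[int, int]) -> list[int]:
    """Expand an inclusive (start, end) period into its list of years."""
    start, end = period
    if start > end:
        raise ValueError(f"period start {start} is after its end {end}")
    return list(range(start, end + 1))


for name, period in PERIODS.items():
    print(name, years_in(period))
```

The tuples can then be passed as `period=PERIODS["baseline"]` and so on, which avoids typing a different year range by mistake in one of the calls.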
Authentication¶
First, we need to authenticate to Earth Engine to retrieve data from GSHTD, CHIRPS and CMIP6.
Although we also need to authenticate to ESGF for CORDEX data, login information can be provided in a separate file.
[ ]:
from __future__ import annotations
import ee
ee.Authenticate()
ee.Initialize(opt_url="https://earthengine-highvolume.googleapis.com", project="downclim")
To authorize access needed by Earth Engine, open the following URL in a web browser and follow the instructions:
The authorization workflow will generate a code, which you should paste in the box below.
Successfully saved authorization token.
Areas of interest¶
We first need to define the areas of interest. They set the boundaries for the downloaded data and the region over which downscaling will be performed.
There are multiple ways to define an area of interest (see the API reference for `get_aoi`).
[ ]:
from downclim.aoi import get_aoi
aoi1 = get_aoi("Vanuatu")
aoi2 = get_aoi((10, 10, 20, 20, "box"))
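Before passing a bounding box to `get_aoi`, a quick sanity check on the coordinates can catch swapped values early. The sketch below is independent of DownClim. It assumes the tuple is ordered `(xmin, ymin, xmax, ymax, name)` in degrees, which is an assumption to verify against the `get_aoi` documentation, and `check_box` is a hypothetical helper.

```python
def check_box(box):
    """Validate a box, ASSUMING the order (xmin, ymin, xmax, ymax, name)."""
    xmin, ymin, xmax, ymax, name = box
    if not xmin < xmax:
        raise ValueError(f"{name}: xmin ({xmin}) must be smaller than xmax ({xmax})")
    if not -90 <= ymin < ymax <= 90:
        raise ValueError(f"{name}: latitudes must satisfy -90 <= ymin < ymax <= 90")
    return box


# Same values as the box used for aoi2
box = check_box((10, 10, 20, 20, "box"))
print(box)
```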
Download CHELSA data¶
[3]:
from downclim.dataset.chelsa2 import get_chelsa2

get_chelsa2(
    aoi=[aoi1, aoi2],
    variable=["pr", "tas", "tasmin", "tasmax"],
    period=(1980, 1981),
    keep_tmp_dir=True,
)
Downloading CHELSA data...
Getting year "1980" for variables "pr" and areas of interest : "['Vanuatu', 'box']"
Getting year "1981" for variables "pr" and areas of interest : "['Vanuatu', 'box']"
Concatenating data for area of interest : Vanuatu
Concatenating data for area of interest : Vanuatu
saving file ./results/tmp/chelsa/chelsa_Vanuatu_pr_1981.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_Vanuatu_pr_1980.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_box_pr_1981.nc
saving file ./results/tmp/chelsa/chelsa_box_pr_1980.nc
Getting year "1980" for variables "tas" and areas of interest : "['Vanuatu', 'box']"
Getting year "1981" for variables "tas" and areas of interest : "['Vanuatu', 'box']"
Concatenating data for area of interest : Vanuatu
Concatenating data for area of interest : Vanuatu
saving file ./results/tmp/chelsa/chelsa_Vanuatu_tas_1981.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_Vanuatu_tas_1980.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_box_tas_1981.nc
saving file ./results/tmp/chelsa/chelsa_box_tas_1980.nc
Getting year "1981" for variables "tasmin" and areas of interest : "['Vanuatu', 'box']"
Getting year "1980" for variables "tasmin" and areas of interest : "['Vanuatu', 'box']"
Concatenating data for area of interest : Vanuatu
Concatenating data for area of interest : Vanuatu
saving file ./results/tmp/chelsa/chelsa_Vanuatu_tasmin_1981.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_Vanuatu_tasmin_1980.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_box_tasmin_1980.nc
saving file ./results/tmp/chelsa/chelsa_box_tasmin_1981.nc
Getting year "1980" for variables "tasmax" and areas of interest : "['Vanuatu', 'box']"
Getting year "1981" for variables "tasmax" and areas of interest : "['Vanuatu', 'box']"
Concatenating data for area of interest : Vanuatu
Concatenating data for area of interest : Vanuatu
saving file ./results/tmp/chelsa/chelsa_Vanuatu_tasmax_1981.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_Vanuatu_tasmax_1980.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_box_tasmax_1980.nc
saving file ./results/tmp/chelsa/chelsa_box_tasmax_1981.nc
Merging files by aoi...
Merging files for area Vanuatu...
Merging files for area box...
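With `keep_tmp_dir=True`, the per-year files stay on disk. The log above suggests they follow the pattern `chelsa_<aoi>_<variable>_<year>.nc` under `./results/tmp/chelsa`. The sketch below rebuilds the expected file names from that pattern so they can be compared with what is actually in the directory. `expected_chelsa_files` is a hypothetical helper, and the pattern is inferred from the log rather than from a documented DownClim guarantee.

```python
from itertools import product
from pathlib import Path


def expected_chelsa_files(aois, variables, period, tmp_dir="./results/tmp/chelsa"):
    """List the temporary files expected for an inclusive (start, end) period."""
    start, end = period
    years = range(start, end + 1)
    return [
        Path(tmp_dir) / f"chelsa_{aoi}_{var}_{year}.nc"
        for aoi, var, year in product(aois, variables, years)
    ]


files = expected_chelsa_files(
    ["Vanuatu", "box"], ["pr", "tas", "tasmin", "tasmax"], (1980, 1981)
)
# Files that are expected but not present on disk
missing = [f for f in files if not f.exists()]
print(f"{len(files)} files expected, {len(missing)} missing")
```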
Download CHIRPS and GSHTD data¶
[9]:
from downclim.dataset.chirps import get_chirps
from downclim.dataset.gshtd import get_gshtd

get_chirps(
    aoi=[aoi1, aoi2],
    period=(2006, 2007),
    project="downclim",  # project name for Earth Engine
)

get_gshtd(
    aoi=[aoi1, aoi2],
    variable=["tas", "tasmin", "tasmax"],
    period=(2006, 2007),
)
Already connected to Earth Engine with project 'downclim'.
Downloading CHIRPS data...
Getting CHIRPS data for period : "(2006, 2007)" and area of interest : "Vanuatu"
Getting CHIRPS data for period : "(2006, 2007)" and area of interest : "box"
Already connected to Earth Engine with project 'downclim'.
Downloading GSHTD data...
Getting GSHTD data for period : "(2006, 2007)" and variable : "tas" on area of interest : "Vanuatu"
Getting GSHTD data for period : "(2006, 2007)" and variable : "tasmin" on area of interest : "Vanuatu"
Getting GSHTD data for period : "(2006, 2007)" and variable : "tasmax" on area of interest : "Vanuatu"
Getting GSHTD data for period : "(2006, 2007)" and variable : "tas" on area of interest : "box"
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: earthengine-highvolume.googleapis.com. Connection pool size: 10
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: earthengine-highvolume.googleapis.com. Connection pool size: 10
Getting GSHTD data for period : "(2006, 2007)" and variable : "tasmin" on area of interest : "box"
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: earthengine-highvolume.googleapis.com. Connection pool size: 10
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: earthengine-highvolume.googleapis.com. Connection pool size: 10
Getting GSHTD data for period : "(2006, 2007)" and variable : "tasmax" on area of interest : "box"
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: earthengine-highvolume.googleapis.com. Connection pool size: 10
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: earthengine-highvolume.googleapis.com. Connection pool size: 10
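The `Connection pool is full` lines are warnings from `urllib3`, emitted through the logger named `urllib3.connectionpool` (the name is visible in the log prefix). They report that a connection was discarded instead of being returned to the pool, and the downloads above continued regardless. If they clutter the output, one option is to raise that logger's threshold before calling the download functions. The sketch below uses only the standard `logging` module.

```python
import logging

# Only hide WARNING-level messages from urllib3's connection pool;
# errors from the same logger are still reported.
logging.getLogger("urllib3.connectionpool").setLevel(logging.ERROR)
```

This changes only what is logged. It does not change the pool size or the way the requests are made.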
Download climate simulations¶
Download CORDEX data¶
[ ]:
from pathlib import Path

from downclim.dataset.cordex import (
    CORDEXContext,
    get_cordex_from_list,
    get_download_scripts,
    list_available_cordex_simulations,
)

# ESGF credential file, relative to the current working directory.
# Use the same path for every call and adjust it to your own setup.
esgf_credential = "../../config/esgf_credential.yaml"

# Define the research context for CORDEX data
cordex_context = CORDEXContext(
    domain=["AUS-22", "AFR-44"],
    experiment=["historical", "rcp26", "rcp85"],
    frequency="mon",
    variable=["pr", "tas"],
)

# Use the previously defined context to list available simulations
# ! This step requires ESGF credentials
cordex_simulations = list_available_cordex_simulations(
    cordex_context, esgf_credential=esgf_credential
)

# Retrieve download scripts for the available simulations
cordex_simulations = get_download_scripts(cordex_simulations, esgf_credential=esgf_credential)

# Save the list of simulations to a CSV file. This can be useful if you want to perform hand-selection.
# to_csv does not create missing directories, so create the output directory first.
Path("results/cordex").mkdir(parents=True, exist_ok=True)
cordex_simulations.to_csv("results/cordex/cordex_simulations.csv")

get_cordex_from_list(
    aoi=[aoi1, aoi2],
    cordex_simulations=cordex_simulations,
    historical_period=(1980, 1981),
    evaluation_period=(2006, 2007),
    projection_period=(2071, 2072),
    output_dir="./results/cordex",
    tmp_dir="./results/tmp/cordex",
    keep_tmp_dir=True,
    esgf_credential=esgf_credential,
)
If the credential file cannot be found at the given path, the connection to ESGF fails. In a recorded run where `get_download_scripts` was given `config/esgf_credential.yaml`, a path that did not exist relative to the working directory, the cell stopped with the following error (traceback abridged):
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
Cell In[7], line 19
---> 19 cordex_simulations = get_download_scripts(cordex_simulations, esgf_credential="config/esgf_credential.yaml")
File ~/Documents/Modeles/Codes/DownClim/src/downclim/dataset/cordex.py:279, in get_download_scripts(simulations, esgf_credential)
--> 279 connector = connect_to_esgf(esgf_credential, server=DataProduct.CORDEX.url)
File ~/Documents/Modeles/Codes/DownClim/src/downclim/dataset/connectors.py:79, in connect_to_esgf(esgf_credential, server)
---> 79 with Path(esgf_credential).open(encoding="utf-8") as stream:
FileNotFoundError: [Errno 2] No such file or directory: 'config/esgf_credential.yaml'
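Relative paths such as `config/esgf_credential.yaml` are resolved against the current working directory, which for a notebook is usually the folder that contains it. A small check before the ESGF calls gives a clearer message than the error shown above. This is a sketch using only the standard library, and `require_file` is a hypothetical helper, not part of DownClim.

```python
from pathlib import Path


def require_file(path: str) -> Path:
    """Return the absolute path of an existing file, or fail with a clear message."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(
            f"File not found: {resolved} (working directory: {Path.cwd()})"
        )
    return resolved


# Possible usage before the CORDEX calls (the path is an example to adapt):
# esgf_credential = str(require_file("../../config/esgf_credential.yaml"))
```

Passing the resolved absolute path to every call also removes any dependence on where the notebook is launched from.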
Download CMIP6 data¶
[ ]:
from pathlib import Path

from downclim.dataset.cmip6 import (
    CMIP6Context,
    get_cmip6,
    list_available_cmip6_simulations,
)

# Define the research context for CMIP6 data
cmip6_context = CMIP6Context(
    project=["ScenarioMIP", "CMIP"],
    institute=["NOAA-GFDL", "CMCC"],
    experiment=["ssp126", "historical"],
    ensemble="r1i1p1f1",
    frequency="mon",
    variable=["tas", "pr"],
    grid_label="gn",
)

# List the simulations matching the context
cmip6_simulations = list_available_cmip6_simulations(cmip6_context)

# Save the list of simulations to a CSV file.
# to_csv does not create missing directories, so create the output directory first.
Path("results/cmip6").mkdir(parents=True, exist_ok=True)
cmip6_simulations.to_csv("results/cmip6/cmip6_simulations.csv")

get_cmip6(
    aoi=[aoi1, aoi2],
    cmip6_simulations=cmip6_simulations,
    historical_period=(1980, 1981),
    evaluation_period=(2006, 2007),
    projection_period=(2071, 2072),
    output_dir="./results/cmip6",
)
`DataFrame.to_csv` does not create missing parent directories. In a recorded run where the `results/cmip6` directory did not exist yet, the cell stopped with the following error (traceback abridged):
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
Cell In[10], line 17
---> 17 cmip6_simulations.to_csv("results/cmip6/cmip6_simulations.csv")
File ~/miniconda3/envs/downclim/lib/python3.13/site-packages/pandas/io/common.py:616, in check_parent_directory(path)
614 parent = Path(path).parent
615 if not parent.is_dir():
--> 616 raise OSError(rf"Cannot save file into a non-existent directory: '{parent}'")
OSError: Cannot save file into a non-existent directory: 'results/cmip6'