Note

This page was generated from examples/all_functions.ipynb.

An extensive walk through DownClim

Although all the methods and functions described below are fully functional, calling them one by one is not the recommended way to use the library, because it bypasses the workflow and abstractions the library provides. See the other notebooks in the `examples` section for a more efficient way to use `DownClim`.

That said, you may eventually find your own way to use `DownClim`!

Definition of the workflow

In this intentionally simplified example, we will use the DownClim library to:

  • download CHELSA data for the baseline period 1980-1981

  • download CHIRPS and GSHTD data for the evaluation period 2006-2007

  • download CORDEX and CMIP6 data for the baseline, evaluation and projection periods
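
The plan above can be written down as a small set of constants before any download call is made. The sketch below is only an illustration: `StudyPeriods` and its field names are hypothetical and not part of DownClim. The year tuples are the ones passed to the download functions later in this notebook, and both bounds are treated as included, as the CHELSA log suggests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyPeriods:
    """Hypothetical container for (start_year, end_year) tuples; not a DownClim class."""

    baseline: tuple[int, int] = (1980, 1981)
    evaluation: tuple[int, int] = (2006, 2007)
    projection: tuple[int, int] = (2071, 2072)

    def years(self, name: str) -> list[int]:
        """Return every year covered by the named period, both bounds included."""
        start, end = getattr(self, name)
        return list(range(start, end + 1))


periods = StudyPeriods()
print(periods.years("baseline"))
```

Keeping the periods in one place makes it harder for the CHIRPS, GSHTD, CORDEX and CMIP6 calls to drift apart.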

Authentication

First, we need to authenticate to Earth Engine to retrieve data from GSHTD, CHIRPS and CMIP6.

CORDEX data also require ESGF authentication, but those credentials are read from a separate file.

[ ]:
from __future__ import annotations

import ee

ee.Authenticate()
ee.Initialize(opt_url="https://earthengine-highvolume.googleapis.com", project="downclim")

Successfully saved authorization token.

Areas of interest

We first need to define the areas of interest. They set the boundaries used to download the data and the region over which the downscaling is computed.

There are multiple ways to define the areas of interest (see the API reference for `get_aoi`).

[ ]:
from downclim.aoi import get_aoi

aoi1 = get_aoi("Vanuatu")
aoi2 = get_aoi((10, 10, 20, 20, "box"))
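
For the tuple form, it can help to check the coordinates before calling `get_aoi`. The sketch below assumes the tuple is ordered `(xmin, ymin, xmax, ymax, name)` in degrees of longitude and latitude. That ordering is an assumption and should be confirmed against the `get_aoi` documentation. `check_bbox` is a hypothetical helper, not a DownClim function.

```python
def check_bbox(
    bbox: tuple[float, float, float, float, str],
) -> tuple[float, float, float, float, str]:
    """Validate an (xmin, ymin, xmax, ymax, name) tuple; the ordering is assumed."""
    xmin, ymin, xmax, ymax, name = bbox
    if not (-180 <= xmin < xmax <= 180):
        raise ValueError(f"invalid longitudes for {name!r}: {xmin}, {xmax}")
    if not (-90 <= ymin < ymax <= 90):
        raise ValueError(f"invalid latitudes for {name!r}: {ymin}, {ymax}")
    return bbox


# Same tuple as the one passed to get_aoi above
box = check_bbox((10, 10, 20, 20, "box"))
```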

Download CHELSA data

[3]:
from downclim.dataset.chelsa2 import get_chelsa2

get_chelsa2(
    aoi=[aoi1, aoi2],
    variable=["pr", "tas", "tasmin", "tasmax"],
    period=(1980, 1981),
    keep_tmp_dir=True,
)
Downloading CHELSA data...
Getting year "1980" for variables "pr" and areas of interest : "['Vanuatu', 'box']"
Getting year "1981" for variables "pr" and areas of interest : "['Vanuatu', 'box']"
Concatenating data for area of interest : Vanuatu
Concatenating data for area of interest : Vanuatu
saving file ./results/tmp/chelsa/chelsa_Vanuatu_pr_1981.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_Vanuatu_pr_1980.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_box_pr_1981.nc
saving file ./results/tmp/chelsa/chelsa_box_pr_1980.nc
Getting year "1980" for variables "tas" and areas of interest : "['Vanuatu', 'box']"
Getting year "1981" for variables "tas" and areas of interest : "['Vanuatu', 'box']"
Concatenating data for area of interest : Vanuatu
Concatenating data for area of interest : Vanuatu
saving file ./results/tmp/chelsa/chelsa_Vanuatu_tas_1981.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_Vanuatu_tas_1980.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_box_tas_1981.nc
saving file ./results/tmp/chelsa/chelsa_box_tas_1980.nc
Getting year "1981" for variables "tasmin" and areas of interest : "['Vanuatu', 'box']"
Getting year "1980" for variables "tasmin" and areas of interest : "['Vanuatu', 'box']"
Concatenating data for area of interest : Vanuatu
Concatenating data for area of interest : Vanuatu
saving file ./results/tmp/chelsa/chelsa_Vanuatu_tasmin_1981.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_Vanuatu_tasmin_1980.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_box_tasmin_1980.nc
saving file ./results/tmp/chelsa/chelsa_box_tasmin_1981.nc
Getting year "1980" for variables "tasmax" and areas of interest : "['Vanuatu', 'box']"
Getting year "1981" for variables "tasmax" and areas of interest : "['Vanuatu', 'box']"
Concatenating data for area of interest : Vanuatu
Concatenating data for area of interest : Vanuatu
saving file ./results/tmp/chelsa/chelsa_Vanuatu_tasmax_1981.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_Vanuatu_tasmax_1980.nc
Concatenating data for area of interest : box
saving file ./results/tmp/chelsa/chelsa_box_tasmax_1980.nc
saving file ./results/tmp/chelsa/chelsa_box_tasmax_1981.nc
Merging files by aoi...
Merging files for area Vanuatu...
Merging files for area box...
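
Judging from the log above, the temporary files follow the pattern `chelsa_<aoi>_<variable>_<year>.nc` under `./results/tmp/chelsa`. The sketch below lists the files this run should leave behind when `keep_tmp_dir=True`. The pattern is inferred from the log, not taken from the DownClim source, and `expected_chelsa_tmp_files` is a hypothetical helper.

```python
from itertools import product
from pathlib import PurePosixPath


def expected_chelsa_tmp_files(aois, variables, period, tmp_dir="./results/tmp/chelsa"):
    """List temporary file paths using the naming pattern seen in the log (inferred)."""
    years = range(period[0], period[1] + 1)  # both bounds included
    return [
        PurePosixPath(tmp_dir) / f"chelsa_{aoi}_{var}_{year}.nc"
        for aoi, var, year in product(aois, variables, years)
    ]


files = expected_chelsa_tmp_files(
    ["Vanuatu", "box"], ["pr", "tas", "tasmin", "tasmax"], (1980, 1981)
)
print(len(files))  # 2 areas x 4 variables x 2 years = 16
```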

Download CHIRPS and GSHTD data

[9]:
from downclim.dataset.chirps import get_chirps
from downclim.dataset.gshtd import get_gshtd

get_chirps(
    aoi=[aoi1, aoi2],
    period=(2006, 2007),
    project="downclim",  # project name for Earth Engine
)

get_gshtd(
    aoi=[aoi1, aoi2],
    variable=["tas", "tasmin", "tasmax"],
    period=(2006, 2007),
)

Already connected to Earth Engine with project 'downclim'.
Downloading CHIRPS data...
Getting CHIRPS data for period : "(2006, 2007)" and area of interest : "Vanuatu"
Getting CHIRPS data for period : "(2006, 2007)" and area of interest : "box"
Already connected to Earth Engine with project 'downclim'.
Downloading GSHTD data...
Getting GSHTD data for period : "(2006, 2007)" and variable : "tas" on area of interest : "Vanuatu"
Getting GSHTD data for period : "(2006, 2007)" and variable : "tasmin" on area of interest : "Vanuatu"
Getting GSHTD data for period : "(2006, 2007)" and variable : "tasmax" on area of interest : "Vanuatu"
Getting GSHTD data for period : "(2006, 2007)" and variable : "tas" on area of interest : "box"
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: earthengine-highvolume.googleapis.com. Connection pool size: 10
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: earthengine-highvolume.googleapis.com. Connection pool size: 10
Getting GSHTD data for period : "(2006, 2007)" and variable : "tasmin" on area of interest : "box"
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: earthengine-highvolume.googleapis.com. Connection pool size: 10
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: earthengine-highvolume.googleapis.com. Connection pool size: 10
Getting GSHTD data for period : "(2006, 2007)" and variable : "tasmax" on area of interest : "box"
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: earthengine-highvolume.googleapis.com. Connection pool size: 10
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: earthengine-highvolume.googleapis.com. Connection pool size: 10
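
The `Connection pool is full` messages come from the `urllib3` logger. They report that an extra connection was discarded instead of being kept for reuse, which generally does not affect the downloaded data. If the noise is unwanted, one option is to raise the level of that logger with the standard `logging` module before running the download cell. This is a general Python logging technique, not a DownClim setting.

```python
import logging

# Hide WARNING-level messages from urllib3's connection pool; errors still show.
logging.getLogger("urllib3.connectionpool").setLevel(logging.ERROR)
```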

Download climate simulations

Download CORDEX data

[ ]:
from pathlib import Path

from downclim.dataset.cordex import (
    CORDEXContext,
    get_cordex_from_list,
    get_download_scripts,
    list_available_cordex_simulations,
)

# Single credential path reused by every ESGF call below
esgf_credential = "../../config/esgf_credential.yaml"

# Define the research context for CORDEX data
cordex_context = CORDEXContext(
    domain=["AUS-22", "AFR-44"],
    experiment=["historical", "rcp26", "rcp85"],
    frequency="mon",
    variable=["pr", "tas"],
)

# Use the previously defined context to list available simulations
# ! This step requires ESGF credentials
cordex_simulations = list_available_cordex_simulations(
    cordex_context, esgf_credential=esgf_credential
)

# Retrieve download scripts for the available simulations
cordex_simulations = get_download_scripts(
    cordex_simulations, esgf_credential=esgf_credential
)

# Save the list of simulations to a CSV file. This can be useful if you want to perform hand-selection.
Path("results/cordex").mkdir(parents=True, exist_ok=True)
cordex_simulations.to_csv("results/cordex/cordex_simulations.csv")

get_cordex_from_list(
    aoi=[aoi1, aoi2],
    cordex_simulations=cordex_simulations,
    historical_period=(1980, 1981),
    evaluation_period=(2006, 2007),
    projection_period=(2071, 2072),
    output_dir="./results/cordex",
    tmp_dir="./results/tmp/cordex",
    keep_tmp_dir=True,
    esgf_credential=esgf_credential,
)

Every `esgf_credential` argument must point to an existing file, resolved relative to the notebook's working directory. Passing a path that does not resolve, such as `config/esgf_credential.yaml` from this notebook's directory, makes `connect_to_esgf` fail with:

FileNotFoundError: [Errno 2] No such file or directory: 'config/esgf_credential.yaml'
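
Since a wrong relative path only surfaces once the ESGF connection is attempted, it can be worth resolving and checking the credential path up front. The helper below is a sketch using only `pathlib`. `resolve_credential` is a hypothetical name and not part of DownClim.

```python
from pathlib import Path


def resolve_credential(path: str) -> str:
    """Return the absolute path of the credential file, or fail early if it is missing."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"ESGF credential file not found: {resolved}")
    return str(resolved)
```

The returned string could then be passed as `esgf_credential=` to each CORDEX call, so that all of them use the same, already verified location.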

Download CMIP6 data

[ ]:
from pathlib import Path

from downclim.dataset.cmip6 import (
    CMIP6Context,
    get_cmip6,
    list_available_cmip6_simulations,
)

# Define the research context for CMIP6 data
cmip6_context = CMIP6Context(
    project=["ScenarioMIP", "CMIP"],
    institute=["NOAA-GFDL", "CMCC"],
    experiment=["ssp126", "historical"],
    ensemble="r1i1p1f1",
    frequency="mon",
    variable=["tas", "pr"],
    grid_label="gn",
)

# List the simulations matching the context
cmip6_simulations = list_available_cmip6_simulations(cmip6_context)

# The output directory must exist before pandas can write into it
Path("results/cmip6").mkdir(parents=True, exist_ok=True)
cmip6_simulations.to_csv("results/cmip6/cmip6_simulations.csv")

get_cmip6(
    aoi=[aoi1, aoi2],
    cmip6_simulations=cmip6_simulations,
    historical_period=(1980, 1981),
    evaluation_period=(2006, 2007),
    projection_period=(2071, 2072),
    output_dir="./results/cmip6",
)

`DataFrame.to_csv` does not create missing directories. Without the `mkdir` call, writing into a directory that does not exist yet fails with:

OSError: Cannot save file into a non-existent directory: 'results/cmip6'
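
A context with several values per facet describes a search space rather than a single simulation. As a rough picture, the sketch below enumerates the combinations implied by the lists in the `CMIP6Context` call using `itertools.product`. How `list_available_cmip6_simulations` actually builds its query is not shown in this notebook, so this is an illustration only. Not every combination exists in the archive: in CMIP6, `historical` runs belong to the `CMIP` activity and `ssp126` runs to `ScenarioMIP`.

```python
from itertools import product

# Facet values copied from the CMIP6Context call above
facets = {
    "project": ["ScenarioMIP", "CMIP"],
    "institute": ["NOAA-GFDL", "CMCC"],
    "experiment": ["ssp126", "historical"],
    "variable": ["tas", "pr"],
}

# One dict per combination of facet values
combinations = [dict(zip(facets, values)) for values in product(*facets.values())]
print(len(combinations))  # 2 * 2 * 2 * 2 = 16
```

Saving the returned table to CSV, as done in the cell above, is a convenient way to see which of these combinations are really available.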

Evaluation

Downscaling