Step-by-Step guide to building preprocessed directories from scratch

This guide is for anyone who wants to learn how to build preprocessed directories from scratch, the same way OGGM does for its users, optionally with your own data. The tutorial assumes that you are already familiar with OGGM basics. We won’t go into every detail here, but you’ll find links to more information along the way.

We’ve structured the guide into five main sections, each dedicated to a different preprocessing level. At the beginning of each section, we outline the tasks to be performed and the data used, and link to related tutorials. At the end of each section, we provide the corresponding prepro_base_url. This URL lets you start directly at that level with everything already set up, without completing the earlier steps. In addition, the tutorial “storing glacier directories for later use” shows how to save your work, so you don’t have to redo everything from the beginning every time (many steps only need to be done once).

Tags: advanced, glacier-directory, workflow

Tip: There’s a lot to learn here. If you’re curious about a specific function, add a question mark (?) right after its name in a Jupyter notebook or IPython session to display its documentation.

# Example: Getting help on the Python function 'print'
print?
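
The `?` shortcut only works in IPython and Jupyter. In a plain Python script, a rough equivalent can be built from the standard library's `inspect` module. The sketch below is only an illustration; `describe` is a hypothetical helper name and not part of OGGM.

```python
import inspect


def describe(func):
    """Return the call signature and the first docstring line of a callable."""
    try:
        signature = str(inspect.signature(func))
    except (TypeError, ValueError):
        # some built-in callables do not expose a signature
        signature = '(...)'
    doc = inspect.getdoc(func) or ''
    first_line = doc.splitlines()[0] if doc else ''
    return f'{func.__name__}{signature}', first_line


# Example: inspect the helper itself
print(describe(describe))
```

For the complete documentation text, the built-in `help(func)` prints the whole docstring.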

Set-up

First, let’s get everything ready to go. Here’s how we’ll do it:

  1. Import Functions: We’ll start by importing the functions we need. Depending on which preprocessed level you’re working with, you might not need all of them.

  2. Initialize OGGM: Next, we’ll set up OGGM and choose where we want to save our work (defining the working directory).

  3. Choose a Glacier: Lastly, we’ll pick one glacier to focus on as our example.

Remember, these steps are important no matter which level you’re starting from!

from oggm import cfg, utils, workflow, tasks, DEFAULT_BASE_URL

import geopandas as gpd
import numpy as np
import os
# we always need to initialize OGGM and define a working directory
cfg.initialize(logging_level='WARNING')
cfg.PATHS['working_dir'] = utils.gettempdir(dirname='OGGM-full_prepro_elevation_bands', reset=True)
2025-04-16 10:31:03: oggm.cfg: Reading default parameters from the OGGM `params.cfg` configuration file.
2025-04-16 10:31:03: oggm.cfg: Multiprocessing switched OFF according to the parameter file.
2025-04-16 10:31:03: oggm.cfg: Multiprocessing: using all available processors (N=4)
# Our example glacier
rgi_ids = ['RGI60-11.00897']  # Hintereisferner
rgi_region = '11'  # this must match the example glacier(s) if starting from level 0
# This section is only for future development of the tutorial (e.g. updating it for new OGGM releases)
# Test whether prepro_base_url is valid for both values of flowline_type_to_use, see level 2.
# In total, four complete executions of the notebook:
# (load_from_prepro_base_url=False/True and flowline_type_to_use = 'elevation_band'/'centerline')
load_from_prepro_base_url = False
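
Because `rgi_region` has to match the glaciers in `rgi_ids`, it can help to check this before loading any outlines. A minimal sketch, assuming RGI v6-style IDs of the form `RGI60-11.00897`; `rgi_region_from_id` is a hypothetical helper, not an OGGM function.

```python
import re


def rgi_region_from_id(rgi_id):
    """Extract the two-digit region code from an RGI v6-style ID (hypothetical helper)."""
    match = re.fullmatch(r'RGI\d{2}-(\d{2})\.\d{5}', rgi_id)
    if match is None:
        raise ValueError(f"Unexpected RGI ID format: '{rgi_id}'")
    return match.group(1)


rgi_ids = ['RGI60-11.00897']  # Hintereisferner
rgi_region = '11'

# every glacier must belong to the region we load the outlines from
mismatches = [i for i in rgi_ids if rgi_region_from_id(i) != rgi_region]
if mismatches:
    raise ValueError(f'These IDs are not in region {rgi_region}: {mismatches}')
```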

Level 0

Tasks:

  • Define the rgi_id for your glacier directory gdir.

  • Define the map projection of the glacier directory.

  • Add an outline of the glacier.

  • Optionally add intersects to other outlines.

Data used:

  • Glacier outline

  • Optionally intersects

Related Tutorials:

CAUTION: When using your own glacier outlines, note that OGGM relies on the defined RGI_ID to fetch calibration data from global datasets, which are tailored to the RGI outlines. If your glacier's outline deviates significantly from its RGI counterpart, this can introduce errors of unknown magnitude into your model results. Ideally, provide your own calibration data for custom outlines or, at the very least, be aware of the discrepancies this might cause.
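
One simple way to keep an eye on such discrepancies is to compare the area of a custom outline with the area of the RGI outline before running the model. The sketch below is an illustration only: `relative_area_difference` is a hypothetical helper, both areas are made-up numbers, and the 5 % threshold is an arbitrary choice rather than an OGGM recommendation.

```python
def relative_area_difference(custom_area_km2, rgi_area_km2):
    """Relative deviation of a custom outline area from the RGI outline area."""
    if rgi_area_km2 <= 0:
        raise ValueError('The RGI area must be positive.')
    return (custom_area_km2 - rgi_area_km2) / rgi_area_km2


# made-up areas for illustration, not real measurements
diff = relative_area_difference(custom_area_km2=7.2, rgi_area_km2=8.0)
if abs(diff) > 0.05:  # arbitrary threshold
    print(f'Custom outline differs from the RGI outline by {diff:.0%}.')
```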
# load all RGI outlines for our region and extract the example glaciers
rgidf = gpd.read_file(utils.get_rgi_region_file(rgi_region, version='62'))
rgidf = rgidf[np.isin(rgidf.RGIId, rgi_ids)]

# We also take care of intersects for this RGI version
cfg.set_intersects_db(utils.get_rgi_intersects_region_file(rgi_region, version='62'))

# set the map projection used for gdir, options are 'tmerc' or 'utm'
cfg.PARAMS['map_proj'] = cfg.PARAMS['map_proj']  # default is 'tmerc'

gdirs = workflow.init_glacier_directories(rgidf, reset=True, force=True)
2025-04-16 10:31:09: oggm.workflow: Execute entity tasks [GlacierDirectory] on 1 glaciers
# Instruction for beginning with existing OGGM's preprocessed directories
if load_from_prepro_base_url:
    # to start from level 0 you can do
    prepro_base_url_L0 = 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.6/L1-L2_files/elev_bands/'
    gdirs = workflow.init_glacier_directories(rgi_ids,
                                              from_prepro_level=0,
                                              prepro_base_url=prepro_base_url_L0,
                                              prepro_border=80,  # could be 10, 80, 160 or 240
                                              reset=True,
                                              force=True,
                                             )

Level 1

Tasks:

  • Define the border around the outline.

  • Define the local grid resolution, which will also set the resolution for the flowlines.

  • Add the digital elevation model (DEM).

  • Set up a local grid for each gdir.

Data used:

  • DEM file

Related Tutorials:

  • dem_sources: Create local topography maps from different DEM sources with OGGM

  • rgitopo_rgi6: RGI-TOPO for RGI v6.0

Please note that registration may be required to access some of the DEM sources. For more information, refer to the dem_sources tutorial.
# define the border, we keep the default here
cfg.PARAMS['border'] = cfg.PARAMS['border']

# set the method for determining the local grid resolution
cfg.PARAMS['grid_dx_method'] = cfg.PARAMS['grid_dx_method']  # The default method is 'square', which determines the grid spacing (dx) based on the glacier's outline area.
cfg.PARAMS['fixed_dx'] = cfg.PARAMS['fixed_dx']  # This allows setting a specific resolution in meters. It's applicable only when grid_dx_method is set to 'fixed'.

# set the DEM source to use
source = 'COPDEM90'  # we use COPDEM here

# this task adds the DEM and defines the local grid
workflow.execute_entity_task(tasks.define_glacier_region, 
                             gdirs,
                             source=source);
2025-04-16 10:31:09: oggm.workflow: Execute entity tasks [define_glacier_region] on 1 glaciers
# Instruction for beginning with existing OGGM's preprocessed directories
if load_from_prepro_base_url:
    # to start from level 1 you can do
    prepro_base_url_L1 = 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.6/L1-L2_files/elev_bands/'
    gdirs = workflow.init_glacier_directories(rgi_ids,
                                              from_prepro_level=1,
                                              prepro_base_url=prepro_base_url_L1,
                                              prepro_border=80,  # could be 10, 80, 160 or 240
                                              reset=True,
                                              force=True,
                                             )
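
The 'square' option of `grid_dx_method` used in this section derives the grid spacing from the glacier area. The following is a minimal sketch, assuming the relation dx = d1 * sqrt(area) + d2, clipped to the range [d2, dmax], with d1 = 14, d2 = 10 and dmax = 200. These values are believed to match the defaults in `params.cfg`, but please check `cfg.PARAMS['d1']`, `cfg.PARAMS['d2']` and `cfg.PARAMS['dmax']` in your installation. `square_grid_dx` is a hypothetical helper, not an OGGM function.

```python
import math


def square_grid_dx(area_km2, d1=14.0, d2=10.0, dmax=200.0):
    """Grid spacing in meters for a glacier area in km2 (assumed 'square' rule)."""
    if area_km2 < 0:
        raise ValueError('The area must not be negative.')
    dx = round(d1 * math.sqrt(area_km2) + d2)
    # keep the spacing between the smallest and the largest allowed value
    return float(min(max(dx, d2), dmax))


# illustrative areas in km2: a tiny, a medium and a very large glacier
for area in (0.1, 8.0, 400.0):
    print(area, square_grid_dx(area))
```

With these assumed values, small glaciers get a fine grid while very large glaciers are capped at `dmax`, which keeps the number of grid points manageable.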

Level 2

Tasks:

  • Choose the type of flowline to use.

  • Create the flowline surface structure, including surface height and width.

  • Create the downstream flowline, which starts from the glacier’s terminus and extends downstream.

  • Optionally you can bring in extra data from the OGGM-shop and bin it to the elevation band flowline.

Data used:

  • Outline

  • DEM

  • Optional: additional datasets

Related Tutorials:

Starting from this point, it's important to choose the prepro_base_url based on the type of flowline you're working with (see end of this chapter).
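
To give a feeling for what an elevation-band flowline represents, here is a strongly simplified sketch of the idea: grid cells are grouped by elevation, each band gets an area, a horizontal length derived from its mean slope, and a width equal to area divided by length. This is not OGGM's implementation. The function name `bin_into_elevation_bands`, the 30 m band height and the input values are assumptions made for illustration.

```python
import math
from collections import defaultdict


def bin_into_elevation_bands(elevations_m, slopes_deg, cell_area_m2, band_height_m=30.0):
    """Group grid cells into elevation bands; return area, length and width per band."""
    slopes_per_band = defaultdict(list)
    for z, slope in zip(elevations_m, slopes_deg):
        if slope <= 0:
            raise ValueError('This sketch needs strictly positive slopes.')
        slopes_per_band[int(z // band_height_m)].append(slope)

    bands = []
    for index in sorted(slopes_per_band, reverse=True):  # from top to bottom
        slopes = slopes_per_band[index]
        area_m2 = len(slopes) * cell_area_m2
        mean_slope_rad = math.radians(sum(slopes) / len(slopes))
        # horizontal length covered by the band along the flow direction
        length_m = band_height_m / math.tan(mean_slope_rad)
        bands.append({
            'z_bottom_m': index * band_height_m,
            'area_m2': area_m2,
            'length_m': length_m,
            'width_m': area_m2 / length_m,
        })
    return bands


# three made-up grid cells of 50 m x 50 m
bands = bin_into_elevation_bands([3010., 3020., 2990.], [45., 45., 45.], cell_area_m2=2500.)
for band in bands:
    print(band)
```

Note that the total area is preserved by construction, since every cell ends up in exactly one band.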
flowline_type_to_use = 'elevation_band'  # you can also select 'centerline' here

if flowline_type_to_use == 'elevation_band':
    elevation_band_task_list = [
        tasks.simple_glacier_masks,
        tasks.elevation_band_flowline,
        tasks.fixed_dx_elevation_band_flowline,
        tasks.compute_downstream_line,
        tasks.compute_downstream_bedshape,
    ]

    for task in elevation_band_task_list:
        workflow.execute_entity_task(task, gdirs);

elif flowline_type_to_use == 'centerline':
    # for centerline we can use parabola downstream line
    cfg.PARAMS['downstream_line_shape'] = 'parabola'

    centerline_task_list = [
        tasks.glacier_masks,
        tasks.compute_centerlines,
        tasks.initialize_flowlines,
        tasks.catchment_area,
        tasks.catchment_intersections,
        tasks.catchment_width_geom,
        tasks.catchment_width_correction,
        tasks.compute_downstream_line,
        tasks.compute_downstream_bedshape,
    ]

    for task in centerline_task_list:
        workflow.execute_entity_task(task, gdirs);
    
else:
    raise ValueError(f"Unknown flowline type '{flowline_type_to_use}'! Select 'elevation_band' or 'centerline'!")
2025-04-16 10:33:24: oggm.workflow: Execute entity tasks [simple_glacier_masks] on 1 glaciers
# Instruction for beginning with existing OGGM's preprocessed directories
if load_from_prepro_base_url:
    # to start from level 2 we need to distinguish between the flowline types
    if flowline_type_to_use == 'elevation_band':
        prepro_base_url_L2 = 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.6/L1-L2_files/2023.2/elev_bands_w_data/'
    elif flowline_type_to_use == 'centerline':
        prepro_base_url_L2 = 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.6/L1-L2_files/centerlines/'
    else:
        raise ValueError(f"Unknown flowline type '{flowline_type_to_use}'! Select 'elevation_band' or 'centerline'!")

    gdirs = workflow.init_glacier_directories(rgi_ids,
                                              from_prepro_level=2,
                                              prepro_base_url=prepro_base_url_L2,
                                              prepro_border=80,  # could be 10, 80, 160 or 240
                                              reset=True,
                                              force=True,
                                             )
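
The prepro cells for levels 0 to 2 differ only in the level and the URL. As an illustration, they could be folded into one small helper that returns the keyword arguments for `workflow.init_glacier_directories`. The URLs and border values are copied from this tutorial; `prepro_kwargs` itself is a hypothetical helper and not part of OGGM.

```python
BASE_URL = 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.6/L1-L2_files/'
VALID_BORDERS = (10, 80, 160, 240)
LEVEL_2_SUBDIRS = {
    'elevation_band': '2023.2/elev_bands_w_data/',
    'centerline': 'centerlines/',
}


def prepro_kwargs(level, flowline_type, border=80):
    """Keyword arguments for starting from a preprocessed level (levels 0 to 2 only)."""
    if flowline_type not in LEVEL_2_SUBDIRS:
        raise ValueError(f"Unknown flowline type '{flowline_type}'!")
    if border not in VALID_BORDERS:
        raise ValueError(f'border must be one of {VALID_BORDERS}')
    if level in (0, 1):
        subdir = 'elev_bands/'  # as in the level 0 and level 1 cells of this tutorial
    elif level == 2:
        subdir = LEVEL_2_SUBDIRS[flowline_type]
    else:
        raise ValueError('This sketch only covers levels 0 to 2.')
    return {
        'from_prepro_level': level,
        'prepro_base_url': BASE_URL + subdir,
        'prepro_border': border,
        'reset': True,
        'force': True,
    }


kwargs = prepro_kwargs(2, 'centerline')
# with OGGM initialized, this could then be used as:
# gdirs = workflow.init_glacier_directories(rgi_ids, **kwargs)
```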

Level 3

Tasks:

  • Add baseline climate data to gdir.

  • Calibrate the mass balance model statically (without considering glacier dynamics) using geodetic observations. This involves the calibration of melt_f, prcp_fac and temp_bias.

  • Conduct an inversion for the glacier’s bed topography, including the calibration of glen_a and fs by matching the total volume estimate.

  • Create the dynamic flowline for dynamic simulation runs.

Data used:

Related Tutorials:

For the inversion, we deviate from the standard preprocessed directories because we focus on an individual glacier instead of an entire region. In the preprocessed directories, calibrate_inversion_from_consensus matches the total regional consensus volume estimate, not the volume of each individual glacier. Since volume estimates are model-based and not directly observed, they are less reliable for calibrating individual glaciers. For our example, however, we calibrate to the consensus estimate of a single glacier, which differs from the preprocessed approach.
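
The principle behind matching a reference volume can be shown with a toy model. In the shallow-ice approximation without sliding and with Glen's exponent n = 3, the inverted ice thickness scales with the creep parameter A to the power -1/5, so a multiplicative factor on glen_a can be tuned until the volume matches. The sketch below is not what `calibrate_inversion_from_consensus` does internally; the function names, the search bounds of 0.1 to 10 and the volumes are assumptions made for illustration.

```python
import math


def toy_inversion_volume(glen_a_factor, volume_at_default_m3):
    """Toy scaling: volume is proportional to (factor on glen_a) ** (-1/5)."""
    return volume_at_default_m3 * glen_a_factor ** (-1.0 / 5.0)


def calibrate_glen_a_factor(volume_ref_m3, volume_at_default_m3,
                            bounds=(0.1, 10.0), rtol=1e-8):
    """Find the factor on glen_a for which the toy volume matches the reference."""
    lo, hi = bounds

    def misfit(factor):
        return toy_inversion_volume(factor, volume_at_default_m3) - volume_ref_m3

    if misfit(lo) * misfit(hi) > 0:
        raise ValueError('Reference volume cannot be matched within the bounds.')
    # bisection in log space, since the factor spans orders of magnitude
    while hi / lo > 1.0 + rtol:
        mid = math.sqrt(lo * hi)
        if misfit(lo) * misfit(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return math.sqrt(lo * hi)


# made-up volumes: the default parameters give 1.0 km3, the reference is 1.2 km3
factor = calibrate_glen_a_factor(volume_ref_m3=1.2e9, volume_at_default_m3=1.0e9)
print(factor)
```

A factor below 1 means stiffer ice, which yields a thicker glacier and therefore a larger volume.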
# define the climate data to use, we keep the default
cfg.PARAMS['baseline_climate'] = cfg.PARAMS['baseline_climate']

# add climate data to gdir
workflow.execute_entity_task(tasks.process_climate_data, gdirs);

# the default mb calibration
workflow.execute_entity_task(tasks.mb_calibration_from_geodetic_mb,
                             gdirs,
                             informed_threestep=True,  # only available for 'GSWP3_W5E5'
                            );

# glacier bed inversion
workflow.execute_entity_task(tasks.apparent_mb_from_any_mb, gdirs);
workflow.calibrate_inversion_from_consensus(
    gdirs,
    apply_fs_on_mismatch=True,
    error_on_mismatch=True,  # if you are running many glaciers, some might not work
    filter_inversion_output=True,  # this partly filters the overdeepening due to
    # the equilibrium assumption for retreating glaciers (see Figure 5 of Maussion et al. 2019)
    volume_m3_reference=None,  # here you could provide your own total volume estimate in m3
);

# finally create the dynamic flowlines
workflow.execute_entity_task(tasks.init_present_time_glacier, gdirs);
2025-04-16 10:33:24: oggm.workflow: Execute entity tasks [process_climate_data] on 1 glaciers
2025-04-16 10:33:24: oggm.workflow: Execute entity tasks [mb_calibration_from_geodetic_mb] on 1 glaciers

Guidance on utilizing various baseline climates:

Currently, OGGM supports a variety of baseline climates, including ‘CRU’, ‘HISTALP’, ‘W5E5’, ‘GSWP3_W5E5’ (the default), ‘ERA5’, ‘ERA5L’, ‘CERA’, ‘ERA5dr’, and ‘ERA5L-HMA’. Although switching between these datasets is straightforward, calibrating the mass balance model according to each dataset is more complex. For instance, you’ll need to choose a default precipitation factor that suits both your selected climate dataset and your specific region. Additionally, you must determine the best method to calibrate the mass balance parameters. For a comprehensive guide on the available options, explanations, and how to incorporate your own geodetic observations, please refer to the tutorial massbalance_calibration.

Here’s an example of using the ERA5 dataset:

# define the baseline climate and add it
cfg.PARAMS['baseline_climate'] = 'ERA5'
workflow.execute_entity_task(tasks.process_climate_data, gdirs);

# define the default precipitation factor
cfg.PARAMS['prcp_fac'] = 1.6  # Note: This is not a universal value!
cfg.PARAMS['use_winter_prcp_fac'] = False  # This option is only available for 'GSWP3_W5E5'
cfg.PARAMS['use_temp_bias_from_file'] = False  # This option is only available for 'GSWP3_W5E5'

# an example of static calibration for mass balance, more options are available in the tutorial
workflow.execute_entity_task(tasks.mb_calibration_from_geodetic_mb,
                             gdirs,
                             calibrate_param1='melt_f',
                             calibrate_param2='prcp_fac',
                             calibrate_param3='temp_bias')
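
If you switch baseline climates often, it may help to collect the matching settings in one place. The sketch below is only an illustration: `SUPPORTED_BASELINES` and `baseline_params` are hypothetical names, not part of OGGM. It assumes that for 'GSWP3_W5E5' you keep OGGM's defaults, and that for every other dataset you switch off the two 'GSWP3_W5E5'-only options and supply your own precipitation factor, as in the ERA5 example above.

```python
# Hypothetical helper (not part of OGGM): gathers the parameter settings
# that go with a chosen baseline climate, following the ERA5 example above.
SUPPORTED_BASELINES = ('CRU', 'HISTALP', 'W5E5', 'GSWP3_W5E5', 'ERA5',
                       'ERA5L', 'CERA', 'ERA5dr', 'ERA5L-HMA')


def baseline_params(baseline, prcp_fac):
    """Return a dict of parameter settings for the given baseline climate."""
    if baseline not in SUPPORTED_BASELINES:
        raise ValueError(f"Unknown baseline climate '{baseline}'! "
                         f"Select one of {SUPPORTED_BASELINES}.")
    params = {'baseline_climate': baseline}
    if baseline != 'GSWP3_W5E5':
        # These two options are only available for 'GSWP3_W5E5', so they
        # are switched off and a fixed precipitation factor is used instead.
        params['use_winter_prcp_fac'] = False
        params['use_temp_bias_from_file'] = False
        params['prcp_fac'] = prcp_fac
    # Assumption: for 'GSWP3_W5E5' the OGGM defaults are left untouched.
    return params


era5_params = baseline_params('ERA5', prcp_fac=1.6)  # 1.6 is not a universal value
print(era5_params)

# With OGGM initialised, the settings could then be applied like this:
# for key, value in era5_params.items():
#     cfg.PARAMS[key] = value
```

The helper only bundles the settings. You would still run tasks.process_climate_data and a mass balance calibration afterwards.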

You can also use your own climate data. To do so, you need to either convert your data into the format OGGM expects (for an example, see the file test-files/histalp_merged_hef.nc in the OGGM/oggm-sample-data repository) or write your own tasks.process_climate_data function. Here’s how you might use a custom climate file:

cfg.PARAMS['baseline_climate'] = 'CUSTOM'
cfg.PATHS['climate_file'] = path_to_the_climate_file

workflow.execute_entity_task(tasks.process_climate_data, gdirs);

# proceed with defining the default precipitation factor and mass balance calibration as shown above

# Instructions for starting from OGGM's existing preprocessed directories
if load_from_prepro_base_url:
    # to start from level 3 you can do
    if flowline_type_to_use == 'elevation_band':
        prepro_base_url_L3 = 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.6/L3-L5_files/2023.3/elev_bands/W5E5/'
    elif flowline_type_to_use == 'centerline':
        prepro_base_url_L3 = 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.6/L3-L5_files/2023.3/centerlines/W5E5/'
    else:
        raise ValueError(f"Unknown flowline type '{flowline_type_to_use}'! Select 'elevation_band' or 'centerline'!")

    gdirs = workflow.init_glacier_directories(rgi_ids,
                                              from_prepro_level=3,
                                              prepro_base_url=prepro_base_url_L3,
                                              prepro_border=80,  # could be 80 or 160
                                              reset=True,
                                              force=True,
                                             )
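
The if/elif selection above can also be written as a dictionary lookup, which keeps the two URLs in one place. This is just a sketch: `PREPRO_BASE_URLS_L3` and `get_prepro_base_url_l3` are hypothetical names introduced here, and the URLs are copied from the cell above.

```python
# Hypothetical helper (not part of OGGM): Level 3 base URLs per flowline type,
# copied from the cell above.
PREPRO_BASE_URLS_L3 = {
    'elevation_band': 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.6/L3-L5_files/2023.3/elev_bands/W5E5/',
    'centerline': 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.6/L3-L5_files/2023.3/centerlines/W5E5/',
}


def get_prepro_base_url_l3(flowline_type):
    """Return the Level 3 prepro base URL for the given flowline type."""
    try:
        return PREPRO_BASE_URLS_L3[flowline_type]
    except KeyError:
        # Same error message as in the if/elif version above
        raise ValueError(f"Unknown flowline type '{flowline_type}'! "
                         "Select 'elevation_band' or 'centerline'!") from None


print(get_prepro_base_url_l3('elevation_band'))
```

The returned string could then be passed as prepro_base_url to workflow.init_glacier_directories, exactly as in the cell above.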

Level 4#

Tasks:

  • Initialize the current state of the glacier without a dynamic spinup. This method, the default until version 1.6, is mainly for comparison purposes and can often be skipped.

  • Initialize the current glacier state with a dynamic spinup. This process includes a dynamic calibration of the mass balance. Note that the current OGGM preprocessed directories do not offer this option for centerlines, so it has not been tested or analyzed for them.

Data used:

Related Tutorials:

# set the ice dynamic solver depending on the flowline-type
if flowline_type_to_use == 'elevation_band':
    cfg.PARAMS['evolution_model'] = 'SemiImplicit'
elif flowline_type_to_use == 'centerline':
    cfg.PARAMS['evolution_model'] = 'FluxBased'
else:
    raise ValueError(f"Unknown flowline type '{flowline_type_to_use}'! Select 'elevation_band' or 'centerline'!")

# get the start and end year of the selected baseline
y0 = gdirs[0].get_climate_info()['baseline_yr_0']
ye = gdirs[0].get_climate_info()['baseline_yr_1'] + 1  # + 1 to run right to the end, i.e. until 1 January of the following year

# 'static' initialisation
workflow.execute_entity_task(tasks.run_from_climate_data, gdirs,
                             min_ys=y0, ye=ye,
                             fixed_geometry_spinup_yr=None,  # here you could add a static spinup if you want
                             output_filesuffix='_historical')

# 'dynamic' initialisation, including dynamic mb calibration
dynamic_spinup_start_year = 1979
minimise_for = 'area'  # other option would be 'volume'
workflow.execute_entity_task(
    tasks.run_dynamic_melt_f_calibration, gdirs,
    err_dmdtda_scaling_factor=0.2,  # by default we reduce the mass balance error to account for
    # correlated uncertainties on a regional scale
    ys=dynamic_spinup_start_year, ye=ye,
    kwargs_run_function={'minimise_for': minimise_for},
    ignore_errors=True,
    kwargs_fallback_function={'minimise_for': minimise_for},
    output_filesuffix='_spinup_historical',
);
2025-04-16 10:33:25: oggm.workflow: Execute entity tasks [run_from_climate_data] on 1 glaciers
2025-04-16 10:33:25: oggm.core.flowline: InvalidWorkflowError occurred during task run_from_climate_data_historical on RGI60-11.00897: Need a valid `model_flowlines` file. If you explicitly want to use `inversion_flowlines`, set use_inversion_flowlines=True.
# Instructions for starting from OGGM's existing preprocessed directories
if load_from_prepro_base_url:
    # to start from level 4 you can do
    if flowline_type_to_use == 'elevation_band':
        prepro_base_url_L4 = DEFAULT_BASE_URL
    elif flowline_type_to_use == 'centerline':
        prepro_base_url_L4 = 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.6/L3-L5_files/2023.3/centerlines/W5E5/'
    else:
        raise ValueError(f"Unknown flowline type '{flowline_type_to_use}'! Select 'elevation_band' or 'centerline'!")
    gdirs = workflow.init_glacier_directories(rgi_ids,
                                              from_prepro_level=4,
                                              prepro_base_url=prepro_base_url_L4,
                                              prepro_border=80,  # could be 80 or 160
                                              reset=True,
                                              force=True,
                                             )

Level 5#

Tasks:

  • Retain only the data necessary for future projection runs to conserve disk space. At this stage, it’s not possible to revisit the preprocessing steps from earlier levels, but all required information for conducting future projection runs is preserved.

Data used:

  • No additional data is needed for this level.

Related Tutorials:

mini_base_dir = os.path.join(cfg.PATHS['working_dir'],
                             'mini_per_glacier')
mini_gdirs = workflow.execute_entity_task(tasks.copy_to_basedir, gdirs,
                                          base_dir=mini_base_dir,
                                          setup='run/spinup')
2025-04-16 10:33:26: oggm.workflow: Execute entity tasks [copy_to_basedir] on 1 glaciers
When you're ready to access your work later, you should first remove the 'per_glacier' folder from your working directory. Then, rename the 'mini_per_glacier' folder to 'per_glacier'. Remember, proceed with these steps only if you've completed setting up your glacier directories (gdirs) and are sure you won't need to make further changes!
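
The folder swap described above can be done by hand or with a few lines of Python. Here is a minimal sketch using only the standard library. `swap_in_mini_gdirs` is a hypothetical helper, not an OGGM function, and the demo runs on a temporary folder so that nothing real is deleted.

```python
import shutil
import tempfile
from pathlib import Path


def swap_in_mini_gdirs(working_dir):
    """Replace 'per_glacier' with 'mini_per_glacier' (this cannot be undone!)."""
    working_dir = Path(working_dir)
    full_dir = working_dir / 'per_glacier'
    mini_dir = working_dir / 'mini_per_glacier'
    if not mini_dir.is_dir():
        raise FileNotFoundError(f"No 'mini_per_glacier' folder in {working_dir}")
    if full_dir.exists():
        shutil.rmtree(full_dir)  # permanently deletes the full glacier directories
    mini_dir.rename(full_dir)
    return full_dir


# Demo on a throwaway folder that mimics the layout of a working directory
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / 'per_glacier' / 'RGI60-11').mkdir(parents=True)
    (Path(tmp) / 'mini_per_glacier' / 'RGI60-11').mkdir(parents=True)
    swap_in_mini_gdirs(tmp)
    print(sorted(p.name for p in Path(tmp).iterdir()))
```

For your real data you would call the helper with cfg.PATHS['working_dir'], but only once you are sure you no longer need the full directories.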
# Instructions for starting from OGGM's existing preprocessed directories
if load_from_prepro_base_url:
    # to start from level 5 you can do
    if flowline_type_to_use == 'elevation_band':
        prepro_base_url_L5 = DEFAULT_BASE_URL
    elif flowline_type_to_use == 'centerline':
        prepro_base_url_L5 = 'https://cluster.klima.uni-bremen.de/~oggm/gdirs/oggm_v1.6/L3-L5_files/2023.3/centerlines/W5E5/'
    else:
        raise ValueError(f"Unknown flowline type '{flowline_type_to_use}'! Select 'elevation_band' or 'centerline'!")
    gdirs = workflow.init_glacier_directories(rgi_ids,
                                              from_prepro_level=5,
                                              prepro_base_url=prepro_base_url_L5,
                                              prepro_border=80,  # could be 80 or 160
                                              reset=True,
                                              force=True,
                                             )

And that’s it! We’ve successfully recreated all the preprocessed levels offered by OGGM. Remember, if you prefer, you can bypass all previous steps and jump straight into your future projections from Level 5. Happy modeling!

What’s next?#