Does cdo remapcon2 remap per-area emissions or per-gridcell emissions?

emiprep will use CDO for regridding purposes. So we need to understand how the CDO interpolation operators work.

In case of the TNO-MACC3 inventory, we need to convert from mass/time/gridcell to mass/time/area. So it is important to know the order in which we have to do things: first cdo remap and then convert from per-gridcell to per-area, or vice-versa? Let’s find out!

First, we need some imports:

[1]:
import os
import sys
sys.path.insert(0, os.path.abspath(os.path.join('..', '..')))

from cdo import Cdo

import numpy as np
import xarray as xr

from emiprep.regrid.cdo import (
    _metgrid_to_cdo_grid_info_extraction,
    metgrid_to_cdo_griddes)

Within this notebook, we need to explicitly set the cdo executable to the correct one. This should not be necessary when working in a proper shell:

[2]:
# os.environ['CDO'] = '/home2/hilboll/.local/easybuild/software/Anaconda3/4.2.0/envs/python3_rolling/bin/cdo'

We need to define some temporary file names:

[3]:
FN_SOURCE_ONES = 'data_srcgrid.nc'
FN_GRIDDES_SOURCE = 'griddes_source.txt'
FN_GRIDDES_TARGET = 'griddes_target.txt'

As source data, we take a constant field of 1.0:

[4]:
grid_source = xr.DataArray(np.ones((672, 720)),
                           dims=['grid_ysize', 'grid_xsize'],
                           name='data').to_dataset()
grid_source.to_netcdf(FN_SOURCE_ONES, format='NETCDF4_CLASSIC', encoding={'data': {'zlib': True}})
grid_source
[4]:
<xarray.Dataset>
Dimensions:  (grid_xsize: 720, grid_ysize: 672)
Dimensions without coordinates: grid_xsize, grid_ysize
Data variables:
    data     (grid_ysize, grid_xsize) float64 1.0 1.0 1.0 1.0 1.0 1.0 1.0 ...

CDO is a bit peculiar regarding file metadata; for testing purposes, it’s easiest to manually force the grid of the source (dummy) data to be a regular lat-lon grid:

[5]:
%%file griddes_source.txt
gridtype = lonlat
xsize    =  720
ysize    =  672
xfirst   =  -30
xinc     =    0.125
yfirst   =   30
yinc     =    0.0625
Writing griddes_source.txt

We load our target grid from a testdata file:

[6]:
TESTDATA_PATH = os.path.join('..', '..', 'tests', 'testdata')
fn_metgrid = os.path.join(TESTDATA_PATH, 'metgrid_coords.nc')
grid_target = _metgrid_to_cdo_grid_info_extraction(fn_metgrid)
grid_target
[6]:
<xarray.Dataset>
Dimensions:          (grid_corners: 4, grid_size: 44100, grid_xsize: 210, grid_ysize: 210)
Coordinates:
  * grid_size        (grid_size) int64 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ...
Dimensions without coordinates: grid_corners, grid_xsize, grid_ysize
Data variables:
    grid_center_lat  (grid_ysize, grid_xsize) float32 36.3869 36.4172 ...
    grid_center_lon  (grid_ysize, grid_xsize) float32 -12.1618 -12.0042 ...
    grid_corner_lat  (grid_ysize, grid_xsize, grid_corners) float32 36.3083 ...
    grid_corner_lon  (grid_ysize, grid_xsize, grid_corners) float32 -12.2217 ...
    grid_imask       (grid_size) int64 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ...
    dummydata        (grid_ysize, grid_xsize) float32 0.0 0.0 0.0 0.0 0.0 ...

Now, we create a CDO grid description file for the target grid, using emiprep functionality:

[7]:
_ = metgrid_to_cdo_griddes(fn_metgrid, FN_GRIDDES_TARGET)

Now, we can use the CDO interpolation operators for regridding. Here, we make use of operator chains to speed up calculations. Also, we directly access the data as numpy array instead of creating an output file and then reading that:

[8]:
cdo = Cdo()
[9]:
data_regridded = cdo.remapcon(
    FN_GRIDDES_TARGET, options='-f nc', returnArray='data',
    input='-setgrid,{} {}'.format(FN_GRIDDES_SOURCE, FN_SOURCE_ONES))
[10]:
data_regridded2 = cdo.remapcon2(
    FN_GRIDDES_TARGET, options='-f nc', returnArray='data',
    input='-setgrid,{} {}'.format(FN_GRIDDES_SOURCE, FN_SOURCE_ONES))

We can see that the arrays indeed have the correct shape:

[11]:
data_regridded.shape, data_regridded2.shape
[11]:
((210, 210), (210, 210))

Also, we can see that the regridded data are still all equal to 1.0:

[12]:
np.allclose(data_regridded, 1), np.allclose(data_regridded2, 1)
[12]:
(True, True)

This means that the CDO remap operators apparently regrid per-area. For emiprep, this means that we have to first convert from per-gridcell to per-area, and then regrid using cdo remapcon2.

Finally, we remove our temporary files:

[13]:
for fn in FN_SOURCE_ONES, FN_GRIDDES_SOURCE, FN_GRIDDES_TARGET:
    os.remove(fn)