Reference¶

thalassa ¶

Thalassa is a library for visualizing unstructured mesh data with a focus on large scale sea level data

crop ¶

crop(ds: xr.Dataset, bbox: shapely.Polygon) -> xr.Dataset

Crop the dataset using the provided bbox.

Examples:

import thalassa
import shapely

ds = thalassa.open_dataset("some_netcdf.nc")
bbox = shapely.box(0, 0, 1, 1)
ds = thalassa.crop(ds, bbox)

PARAMETER	DESCRIPTION
`ds`	The dataset we want to crop. TYPE: `xarray.Dataset`
`bbox`	A Shapely polygon whose boundary will be used to crop `ds`. TYPE: `shapely.Polygon`

Source code in thalassa/utils.py

def crop(
    ds: xr.Dataset,
    bbox: shapely.Polygon,
) -> xr.Dataset:
    """
    Crop the dataset using the provided `bbox`.

    Examples:
        ``` python
        import thalassa
        import shapely

        ds = thalassa.open_dataset("some_netcdf.nc")
        bbox = shapely.box(0, 0, 1, 1)
        ds = thalassa.crop(ds, bbox)
        ```

    Parameters:
        ds: The dataset we want to crop.
        bbox: A Shapely polygon whose boundary will be used to crop `ds`.
    """
    bbox = resolve_bbox(bbox)
    indices_of_nodes_in_bbox = np.where(
        True
        & (ds.lat >= bbox.bounds[1])
        & (ds.lat <= bbox.bounds[3])
        & (ds.lon >= bbox.bounds[0])
        & (ds.lon <= bbox.bounds[2]),
    )[0]
    indices_of_triface_nodes_in_bbox = np.where(
        ds.triface_nodes.isin(indices_of_nodes_in_bbox).all(axis=1),
    )[0]
    ds = ds.isel(node=indices_of_nodes_in_bbox, triface=indices_of_triface_nodes_in_bbox)
    remapped_nodes = np.arange(len(indices_of_nodes_in_bbox))
    remapped_triface_nodes = np.c_[
        npi.remap(ds.triface_nodes[:, 0], indices_of_nodes_in_bbox, remapped_nodes),
        npi.remap(ds.triface_nodes[:, 1], indices_of_nodes_in_bbox, remapped_nodes),
        npi.remap(ds.triface_nodes[:, 2], indices_of_nodes_in_bbox, remapped_nodes),
    ]
    ds["triface_nodes"] = (("triface", "three"), remapped_triface_nodes)
    return ds

normalize ¶

normalize(ds: xr.Dataset) -> xr.Dataset

Normalize the dataset i.e. convert it to the "Thalassa Schema".

Examples:

import thalassa
import xarray as xr

ds = xr.open_dataset("some_netcdf.nc")
ds = thalassa.normalize(ds)
print(ds)

PARAMETER	DESCRIPTION
`ds`	The dataset we want to convert. TYPE: `xarray.Dataset`

Source code in thalassa/normalization.py

def normalize(ds: xr.Dataset) -> xr.Dataset:
    """
    Normalize the `dataset` i.e. convert it to the "Thalassa Schema".

    Examples:
        ``` python
        import thalassa
        import xarray as xr

        ds = xr.open_dataset("some_netcdf.nc")
        ds = thalassa.normalize(ds)
        print(ds)
        ```

    Parameters:
        ds: The dataset we want to convert.

    """
    logger.debug("Dataset normalization: Started")
    fmt = infer_format(ds)
    normalizer_func = NORMALIZE_DISPATCHER[fmt]
    normalized_ds = normalizer_func(ds)
    # Handle quad elements
    # Splitting quad elements to triangles, means that the number of faces increases
    # There are two options:
    # 1. We insert new faces and we keep on using `face_nodes`
    # 2. We define a new variable and a new dimension which specifically address triangular elements
    # I'd rather avoid altering the values of the provided netcdf file therefore we go for option #2,
    # i.e. we create the `triface_nodes` variable.
    if "triface_nodes" not in ds.data_vars:
        if "max_no_vertices" in normalized_ds.dims and len(normalized_ds.max_no_vertices) == 4:
            triface_nodes = utils.split_quads(normalized_ds.face_nodes.values)
        else:
            triface_nodes = normalized_ds.face_nodes.values
        normalized_ds["triface_nodes"] = (("triface", "three"), triface_nodes)
    logger.debug("Dataset normalization: Finished")
    return normalized_ds

open_dataset ¶

open_dataset(path: str | os.PathLike[str], normalize: bool = True, **kwargs: dict[str, typing.Any]) -> xr.Dataset

Open the file specified at path using xarray and return an xarray.Dataset.

If normalize is True then convert the dataset to the "Thalassa schema", too. Additional kwargs are passed on to xarray.open_dataset().

Note

This function is just a wrapper around xarray.open_dataset(). The reason we need it is because the netcdfs files created by ADCIRC are not compatible with xarray, at least not when using the defaults. This function automatically detects the problematic variables (e.g. neta and nvel) and drops them.

Examples:

import thalassa

ds = thalassa.open_dataset("some_netcdf.nc")
print(ds)

PARAMETER	DESCRIPTION
`path`	The path to the dataset file (netCDF, zarr, grib) TYPE: `str \| os.PathLike[str]`
`normalize`	Boolean flag indicating whether the dataset should be converted/normalized to the "Thalassa schema". Normalization is currently only supported for `SCHISM` and `ADCIRC` netcdf files. TYPE: `bool` DEFAULT: `True`
`kwargs`	The `kwargs` are being passed through to `xarray.open_dataset`. TYPE: `dict[str, typing.Any]` DEFAULT: `{}`

Source code in thalassa/api.py

def open_dataset(
    path: str | os.PathLike[str],
    normalize: bool = True,
    **kwargs: dict[str, typing.Any],
) -> xr.Dataset:
    """
    Open the file specified at ``path`` using ``xarray`` and return an ``xarray.Dataset``.

    If `normalize` is `True` then convert the dataset to the "Thalassa schema", too.
    Additional `kwargs` are passed on to `xarray.open_dataset()`.

    !!! note

        This function is just a wrapper around `xarray.open_dataset()`. The reason we need
        it is because the netcdfs files created by ADCIRC are not compatible with `xarray`,
        at least not when using the defaults. This function automatically detects the
        problematic variables (e.g. `neta` and `nvel`) and drops them.

    Examples:
        ``` python
        import thalassa

        ds = thalassa.open_dataset("some_netcdf.nc")
        print(ds)
        ```

    Parameters:
        path: The path to the dataset file (netCDF, zarr, grib)
        normalize: Boolean flag indicating whether the dataset should be converted/normalized to the "Thalassa schema".
            Normalization is currently only supported for ``SCHISM`` and ``ADCIRC`` netcdf files.
        kwargs: The ``kwargs`` are being passed through to ``xarray.open_dataset``.

    """
    default_kwargs: dict[str, typing.Any] = dict(
        mask_and_scale=True,
        cache=False,
        drop_variables=ADCIRC_VARIABLES_TO_BE_DROPPED,
    )
    with warnings.catch_warnings(record=True):
        ds = xr.open_dataset(path, **(default_kwargs | kwargs))  # type: ignore[arg-type]
    if normalize:
        ds = normalization.normalize(ds)
    return ds

plot ¶

plot(ds: xr.Dataset, variable: str, bbox: shapely.Polygon | None = None, title: str = '', cmap: str = 'plasma', colorbar: bool = True, clabel: str = '', clim_min: float | None = None, clim_max: float | None = None, x_range: tuple[float, float] | None = None, y_range: tuple[float, float] | None = None, show_mesh: bool = False) -> gv.DynamicMap

Return the plot of the specified variable.

Examples:

import thalassa

ds = thalassa.open_dataset("some_netcdf.nc")
thalassa.plot(ds, variable="zeta_max")

When we plot time dependent variables we need to filter the data in such a way that time is no longer a dimension. For example to plot the map of the first timestamp of variable zeta:

import thalassa

ds = thalassa.open_dataset("some_netcdf.nc")
thalassa.plot(ds.isel(time=0), variable="zeta")

Often, it is quite useful to limit the range of the colorbar:

import thalassa

ds = thalassa.open_dataset("some_netcdf.nc")
thalassa.plot(ds, variable="zeta", clim_min=1, clim_max=3, clabel="meter")

If we want to on-the-fly crop the dataset we can pass bbox, too:

import shapely
import thalassa

bbox = shapely.box(0, 0, 1, 1)
ds = thalassa.open_dataset("some_netcdf.nc")
thalassa.plot(ds, variable="depth", bbox=bbox)

PARAMETER	DESCRIPTION
`ds`	The dataset which will get visualized. It must adhere to the "thalassa schema". TYPE: `xarray.Dataset`
`variable`	The dataset's variable which we want to visualize. TYPE: `str`
`bbox`	A Shapely polygon which will be used to (on-the-fly) crop the `dataset`. TYPE: `shapely.Polygon \| None` DEFAULT: `None`
`title`	The title of the plot. Defaults to `variable`. TYPE: `str` DEFAULT: `''`
`cmap`	The colormap to use. TYPE: `str` DEFAULT: `'plasma'`
`colorbar`	Boolean flag indicating whether the plot should have an integrated colorbar. TYPE: `bool` DEFAULT: `True`
`clabel`	A caption for the colorbar. Useful for indicating e.g. units TYPE: `str` DEFAULT: `''`
`clim_min`	The lower limit for the colorbar. TYPE: `float \| None` DEFAULT: `None`
`clim_max`	The upper limit for the colorbar. TYPE: `float \| None` DEFAULT: `None`
`x_range`	A tuple indicating the minimum and maximum longitude to be displayed. TYPE: `tuple[float, float] \| None` DEFAULT: `None`
`y_range`	A tuple indicating the minimum and maximum latitude to be displayed. TYPE: `tuple[float, float] \| None` DEFAULT: `None`
`show_mesh`	A boolean flag indicating whether the mesh should be overlayed on the rendered variable. Enabling this makes rendering slower. TYPE: `bool` DEFAULT: `False`

Source code in thalassa/plotting.py

def plot(
    ds: xr.Dataset,
    variable: str,
    bbox: shapely.Polygon | None = None,
    title: str = "",
    cmap: str = "plasma",
    colorbar: bool = True,
    clabel: str = "",
    clim_min: float | None = None,
    clim_max: float | None = None,
    x_range: tuple[float, float] | None = None,
    y_range: tuple[float, float] | None = None,
    show_mesh: bool = False,
) -> gv.DynamicMap:
    """
    Return the plot of the specified `variable`.

    Examples:
        ``` python
        import thalassa

        ds = thalassa.open_dataset("some_netcdf.nc")
        thalassa.plot(ds, variable="zeta_max")
        ```

        When we plot time dependent variables we need to filter the data
        in such a way that `time` is no longer a dimension.
        For example to plot the map of the first timestamp of variable `zeta`:

        ``` python
        import thalassa

        ds = thalassa.open_dataset("some_netcdf.nc")
        thalassa.plot(ds.isel(time=0), variable="zeta")
        ```

        Often, it is quite useful to limit the range of the colorbar:

        ``` python
        import thalassa

        ds = thalassa.open_dataset("some_netcdf.nc")
        thalassa.plot(ds, variable="zeta", clim_min=1, clim_max=3, clabel="meter")
        ```

        If we want to on-the-fly crop the dataset we can pass `bbox`, too:

        ``` python
        import shapely
        import thalassa

        bbox = shapely.box(0, 0, 1, 1)
        ds = thalassa.open_dataset("some_netcdf.nc")
        thalassa.plot(ds, variable="depth", bbox=bbox)
        ```

    Parameters:
        ds: The dataset which will get visualized. It must adhere to the "thalassa schema".
        variable: The dataset's variable which we want to visualize.
        bbox: A Shapely polygon which will be used to (on-the-fly) crop the `dataset`.
        title: The title of the plot. Defaults to `variable`.
        cmap: The colormap to use.
        colorbar: Boolean flag indicating whether the plot should have an integrated colorbar.
        clabel: A caption for the colorbar. Useful for indicating e.g. units
        clim_min: The lower limit for the colorbar.
        clim_max: The upper limit for the colorbar.
        x_range: A tuple indicating the minimum and maximum longitude to be displayed.
        y_range: A tuple indicating the minimum and maximum latitude to be displayed.
        show_mesh: A boolean flag indicating whether the mesh should be overlayed on the rendered variable.
            Enabling this makes rendering slower.

    """
    ds = normalization.normalize(ds)
    _sanity_check(ds=ds, variable=variable)
    if bbox:
        ds = utils.crop(ds, bbox)
    trimesh = api.create_trimesh(ds_or_trimesh=ds, variable=variable)
    raster = api.get_raster(
        ds_or_trimesh=trimesh,
        variable=variable,
        x_range=x_range,
        y_range=y_range,
        cmap=cmap,
        colorbar=colorbar,
        clim_min=clim_min,
        clim_max=clim_max,
        title=title,
        clabel=clabel,
    )
    tiles = api.get_tiles()
    components = [tiles, raster]
    if show_mesh:
        mesh = api.get_wireframe(ds)
        components.append(mesh)
    overlay = hv.Overlay(components)
    dmap = overlay.collate()
    # Keep a reference of the raster DynamicMap, in order to be able to retrieve it from plot_ts
    dmap._raster = raster
    return dmap

plot_mesh ¶

plot_mesh(ds: xr.Dataset, bbox: shapely.Polygon | None = None, title: str = 'Mesh') -> gv.DynamicMap

Plot the mesh of the dataset

Examples:

import thalassa

ds = thalassa.open_dataset("some_netcdf.nc")
thalassa.plot_mesh(ds)

If we want to on-the-fly crop the dataset we can pass bbox, too:

import shapely
import thalassa

bbox = shapely.box(0, 0, 1, 1)
ds = thalassa.open_dataset("some_netcdf.nc")
thalassa.plot_mesh(ds, bbox=bbox)

PARAMETER	DESCRIPTION
`ds`	The dataset whose mesh we want to visualize. It must adhere to the "thalassa schema" TYPE: `xarray.Dataset`
`bbox`	A Shapely polygon which will be used to (on-the-fly) crop the `dataset`. TYPE: `shapely.Polygon \| None` DEFAULT: `None`
`title`	The title of the plot. TYPE: `str` DEFAULT: `'Mesh'`

Source code in thalassa/plotting.py

def plot_mesh(
    ds: xr.Dataset,
    bbox: shapely.Polygon | None = None,
    title: str = "Mesh",
) -> gv.DynamicMap:
    """
    Plot the mesh of the dataset

    Examples:
        ``` python
        import thalassa

        ds = thalassa.open_dataset("some_netcdf.nc")
        thalassa.plot_mesh(ds)
        ```

        If we want to on-the-fly crop the dataset we can pass `bbox`, too:

        ``` python
        import shapely
        import thalassa

        bbox = shapely.box(0, 0, 1, 1)
        ds = thalassa.open_dataset("some_netcdf.nc")
        thalassa.plot_mesh(ds, bbox=bbox)
        ```

    Parameters:
        ds: The dataset whose mesh we want to visualize. It must adhere to the "thalassa schema"
        bbox: A Shapely polygon which will be used to (on-the-fly) crop the `dataset`.
        title: The title of the plot.

    """
    ds = normalization.normalize(ds)
    if bbox:
        ds = utils.crop(ds, bbox)
    tiles = api.get_tiles()
    mesh = api.get_wireframe(ds)
    overlay = hv.Overlay((tiles, mesh)).opts(title=title).collate()
    return overlay

plot_ts ¶

plot_ts(ds: xr.Dataset, variable: str, source_plot: gv.DynamicMap) -> gv.DynamicMap

Return a plot with the full timeseries of a specific node.

The node that will be visualized is selected by clicking on source_plot.

Examples:

import thalassa

ds = thalassa.open_dataset("some_netcdf.nc")
main_plot = thalassa.plot(ds, variable="zeta_max")
ts_plot = thalassa.plot_ts(ds, variable="zeta", source_plot=main_plot)

(main_plot.opts(width=600) + ts_plot.opts(width=600)).cols(1)

PARAMETER	DESCRIPTION
`ds`	The dataset which will get visualized. It must adhere to the "thalassa schema" TYPE: `xarray.Dataset`
`variable`	The dataset's variable which we want to visualize. TYPE: `str`
`source_plot`	The plot instance which be used to select the coordinates of the node. Normally, you get this instance by calling `plot()`. TYPE: `geoviews.DynamicMap`

Source code in thalassa/plotting.py

def plot_ts(
    ds: xr.Dataset,
    variable: str,
    source_plot: gv.DynamicMap,
) -> gv.DynamicMap:
    """
    Return a plot with the full timeseries of a specific node.

    The node that will be visualized is selected by clicking on `source_plot`.

    Examples:
        ``` python
        import thalassa

        ds = thalassa.open_dataset("some_netcdf.nc")
        main_plot = thalassa.plot(ds, variable="zeta_max")
        ts_plot = thalassa.plot_ts(ds, variable="zeta", source_plot=main_plot)

        (main_plot.opts(width=600) + ts_plot.opts(width=600)).cols(1)
        ```

    Parameters:
        ds: The dataset which will get visualized. It must adhere to the "thalassa schema"
        variable: The dataset's variable which we want to visualize.
        source_plot: The plot instance which be used to select the coordinates of the node.
            Normally, you get this instance by calling `plot()`.
    """
    ds = normalization.normalize(ds)
    ts = api.get_tap_timeseries(ds, variable, source_plot._raster)
    return ts