Contributors: Maddie Shankle
Description: An example notebook showing how to download ECCO output to the group workspace on JASMIN and access it from within your own notebooks. Based on the section “The ecco_access Python ‘package’: accessing ECCO output on PO.DAAC” of the ECCO-v4-Python-Tutorial; see that section for more information.
Introduction
ECCOv4 release 4 output is available from the Physical Oceanography Distributed Active Archive Center (PO.DAAC). The ecco_access package has been written specifically to simplify data access and includes useful functions like:
ecco_podaac_to_xrdataset(): takes as input a text query or ECCO dataset identifier, plus start and end dates, and returns an xarray Dataset. Used in this notebook.
ecco_podaac_access(): takes the same input, but returns the URLs/paths or local files where the data is located.
ecco_access is not yet available on conda or pip, so this notebook sets it up so that it can be imported and used like any other Python package. Here we will use it to download ECCO output to the group workspace (/gws/nopw/j04/co2clim/datasets/ECCOv4r4/).
Initial setup
Set up Earthdata login credentials
Follow the instructions in the Setting up Earthdata login credentials section of the ECCO-v4-Python-Tutorial.
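Before attempting a download, it can help to confirm that the credentials are in place. The following is a minimal sketch, assuming the credentials are stored in a ~/.netrc file under the Earthdata Login host urs.earthdata.nasa.gov; check the tutorial's instructions if your setup differs. The helper name has_earthdata_login is hypothetical and not part of ecco_access.

```python
import netrc
from pathlib import Path

# Assumption: Earthdata Login credentials are stored under this host name.
EARTHDATA_HOST = "urs.earthdata.nasa.gov"


def has_earthdata_login(netrc_path=None):
    """Return True if the netrc file has an entry for the Earthdata host.

    netrc_path defaults to ~/.netrc. This only checks that an entry exists;
    it does not verify that the username and password are valid.
    """
    path = Path(netrc_path) if netrc_path else Path.home() / ".netrc"
    if not path.is_file():
        return False
    try:
        entry = netrc.netrc(str(path)).authenticators(EARTHDATA_HOST)
    except (netrc.NetrcParseError, OSError):
        # Malformed or unreadable file: treat as "no usable credentials".
        return False
    return entry is not None


# Example use in a notebook cell:
# print(has_earthdata_login())
```

A False result suggests revisiting the credentials setup before running the download cells.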
Git-clone ECCO-v4-Python-Tutorial repository
Clone the entire “ECCO-v4-Python-Tutorial” repository from GitHub. It contains, among other things, the ecco_access package as a subdirectory. In your download notebook, you will add the cloned repository directory (the one containing ecco_access) to Python's search path so that the relevant functions can be imported (see below).
⚠️ A bug was detected relating to the function arguments, outlined in this issue and fixed in this pull request. If the linked PR has not yet been merged, clone Maddie’s fork of the ECCO-v4-Python-Tutorial.
Downloading ECCO Output
Specify the path to the ECCO-v4-Python-Tutorial directory and import ecco_access
# Add path to directory containing 'ecco_access'
import sys
sys.path.append('/gws/nopw/j04/co2clim/USERNAME/ECCO-v4-Python-Tutorial')
# Add ecco_access "package", and one other needed for this notebook
import ecco_access as ea
from os.path import join
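If the import fails with ModuleNotFoundError, the path probably does not point at the cloned repository. As a rough sketch, a small helper can check that the directory really contains an ecco_access subdirectory before adding it to sys.path, and avoid adding it twice when the cell is re-run. The helper name add_repo_to_path is hypothetical.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_dir):
    """Append repo_dir to sys.path if it contains an 'ecco_access' subdirectory.

    Raises FileNotFoundError if the subdirectory is missing, which usually
    means repo_dir is not the cloned ECCO-v4-Python-Tutorial directory.
    """
    repo = Path(repo_dir)
    if not (repo / "ecco_access").is_dir():
        raise FileNotFoundError(f"No 'ecco_access' directory found in {repo}")
    repo_str = str(repo)
    if repo_str not in sys.path:  # avoid duplicates when re-running the cell
        sys.path.append(repo_str)
    return repo_str


# Example use (replace USERNAME with your own workspace directory):
# add_repo_to_path('/gws/nopw/j04/co2clim/USERNAME/ECCO-v4-Python-Tutorial')
# import ecco_access as ea
```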
Set output path to /Group Workspace/datasets/ECCOv4r4/
# Specify where on gws to download to.
# The download commands below will download all relevant .nc files and place them
# in a sub-directory (named ShortName, see below) within my_download_dir/ECCO_V4r4_PODAAC/.
my_download_dir = '/gws/nopw/j04/co2clim/datasets/ECCOv4r4/'
ECCO output is organized into dataset identifiers or codes called ShortNames. Find the ShortName for your variable of interest here. All output (.nc files, one per month) will be downloaded to /my_download_dir/ECCO_V4r4_PODAAC/, within a subfolder named ShortName.
# download data and open xarray dataset
ShortName = 'ECCO_L4_OCEAN_VEL_05DEG_MONTHLY_V4R4'
ds = ea.ecco_podaac_to_xrdataset(
    ShortName,
    StartDate='1992-01', EndDate='2017-12',
    mode='download',
    download_root_dir=join(my_download_dir, 'ECCO_V4r4_PODAAC'),
)
ds
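Since the output arrives as one .nc file per month, a quick way to spot an incomplete download is to compare the number of files in the ShortName folder with the number of months requested. The sketch below assumes the folder holds only the monthly files for that dataset; the function names are hypothetical.

```python
from pathlib import Path


def count_months(start, end):
    """Number of calendar months from start to end, inclusive ('YYYY-MM')."""
    y0, m0 = (int(part) for part in start.split("-"))
    y1, m1 = (int(part) for part in end.split("-"))
    return (y1 - y0) * 12 + (m1 - m0) + 1


def check_download(download_dir, start, end):
    """Return (files_found, files_expected) for a monthly ShortName folder."""
    found = len(list(Path(download_dir).glob("*.nc")))
    return found, count_months(start, end)


# Example use, with the names from the download cell above:
# found, expected = check_download(
#     join(my_download_dir, 'ECCO_V4r4_PODAAC', ShortName), '1992-01', '2017-12')
# print(f"{found} of {expected} monthly files present")
```

For the date range used here (1992-01 to 2017-12), the expected count is 312 files.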
🤔 The occasional error
OSError: [Errno -51] NetCDF: Unknown file format: xxx.nc
seems to disappear when the cell is run a second time.
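Rather than re-running the cell by hand, the call can be wrapped in a small retry helper. This is only a sketch: it assumes the error is transient, as the observation above suggests, and re-raises it if every attempt fails. The helper name retry_on_oserror is hypothetical.

```python
import time


def retry_on_oserror(func, attempts=2, wait_seconds=1.0):
    """Call func(), retrying on OSError up to `attempts` times in total."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except OSError:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(wait_seconds)


# Example use, with the same arguments as the download cell:
# ds = retry_on_oserror(lambda: ea.ecco_podaac_to_xrdataset(
#     ShortName, StartDate='1992-01', EndDate='2017-12', mode='download',
#     download_root_dir=join(my_download_dir, 'ECCO_V4r4_PODAAC')))
```

If the error persists across retries, the named file may be corrupted or incomplete and worth inspecting directly.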
Accessing ECCO Output
Once downloaded, output can be loaded into any notebook with xarray’s open_mfdataset function, which opens multiple files as a single dataset. Just point it to the subdirectory of interest.
import xarray as xr
# Note the leading '/': the group workspace path must be absolute
ds = xr.open_mfdataset('/gws/nopw/j04/co2clim/datasets/ECCOv4r4/ECCO_V4r4_PODAAC/ECCO_L4_OCEAN_VEL_05DEG_MONTHLY_V4R4/*.nc')
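A mistyped path or ShortName leaves the glob pattern matching nothing, and the resulting error can be hard to read. As a hedged sketch, the file list can be built and checked first, then passed to xr.open_mfdataset, which also accepts a list of paths. The helper name list_nc_files is hypothetical.

```python
from glob import glob
from os.path import join


def list_nc_files(download_root, short_name):
    """Return the sorted .nc files in download_root/short_name.

    Raises FileNotFoundError with the full pattern if nothing matches,
    which makes a wrong path or ShortName easy to spot.
    """
    pattern = join(download_root, short_name, "*.nc")
    files = sorted(glob(pattern))
    if not files:
        raise FileNotFoundError(f"No files match {pattern}")
    return files


# Example use:
# files = list_nc_files(
#     '/gws/nopw/j04/co2clim/datasets/ECCOv4r4/ECCO_V4r4_PODAAC',
#     'ECCO_L4_OCEAN_VEL_05DEG_MONTHLY_V4R4')
# ds = xr.open_mfdataset(files)
```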