Sample Data

pop-tools provides some sample data through the pop_tools.datasets module.

Where are the sample data files?

The sample data files are downloaded automatically by pooch the first time you load them and are then cached on your local system, so later loads reuse the downloaded copy.
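The download-on-first-use behaviour can be illustrated with a minimal sketch that uses only the standard library. It is not how pooch or pop_tools is implemented: `fetch_cached` and `fake_download` are hypothetical names invented for this illustration, and the sketch omits the hash verification that pooch performs.

```python
import tempfile
from pathlib import Path


def fetch_cached(name, cache_dir, download):
    """Return the local path for `name`, calling `download` only on a cache miss.

    Hypothetical helper for illustration; not part of pooch or pop_tools.
    """
    target = Path(cache_dir) / name
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(download(name))
    return str(target)


calls = []


def fake_download(name):
    # Stand-in for a network download; records each call so we can count them.
    calls.append(name)
    return b"placeholder bytes for " + name.encode()


with tempfile.TemporaryDirectory() as tmp:
    first = fetch_cached("example.nc", tmp, fake_download)
    second = fetch_cached("example.nc", tmp, fake_download)
    # Same path both times, and the download ran only once.
    print(first == second, len(calls))
```

The second call finds the file already in the cache directory and returns its path without downloading again.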

import xarray as xr

import pop_tools

To find which data files are available via pop_tools, you can run the following:

pop_tools.DATASETS.registry_files
['tend_zint_100m_Fe.nc',
 'iron_tracer.nc',
 'daily_surface_potential_temperature.nc',
 'monthly_dissolved_oxygen.nc',
 'cesm_pop_monthly.T62_g17.nc',
 'lateral_fill_np_array_filled_ref.npz',
 'lateral_fill_np_array_tripole_filled_ref.npz',
 'lateral_fill_np_array_tripole_filled_ref.20200818.npz',
 'lateral_fill_np_array_filled_SOR_ref.20200820.npz',
 'lateral_fill_np_array_tripole_filled_SOR_ref.20200820.npz',
 'POP_gx3v7.nc',
 'g.e20.G.TL319_t13.control.001_hfreq.nc',
 'g.e20.G.TL319_t13.control.001_hfreq-coarsen.nc',
 'Pac_POP0.1_JRA_IAF_1993-12-6-test.nc',
 'Pac_grid_pbc_1301x305x62.tx01_62l.2013-07-13.nc',
 'comp-grid.tx9.1v3.20170718.zarr.zip']
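Since the registry is a plain list of file names, it can be filtered or grouped with ordinary Python. The sketch below groups the names by file extension, which hints at the I/O package each file needs. The list is copied from the output above so the example runs without pop_tools; `group_by_suffix` is a hypothetical helper, not part of the library.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# File names copied from the listing above. With pop_tools installed,
# pop_tools.DATASETS.registry_files would supply the same kind of list.
registry_files = [
    "tend_zint_100m_Fe.nc",
    "iron_tracer.nc",
    "daily_surface_potential_temperature.nc",
    "monthly_dissolved_oxygen.nc",
    "cesm_pop_monthly.T62_g17.nc",
    "lateral_fill_np_array_filled_ref.npz",
    "lateral_fill_np_array_tripole_filled_ref.npz",
    "lateral_fill_np_array_tripole_filled_ref.20200818.npz",
    "lateral_fill_np_array_filled_SOR_ref.20200820.npz",
    "lateral_fill_np_array_tripole_filled_SOR_ref.20200820.npz",
    "POP_gx3v7.nc",
    "g.e20.G.TL319_t13.control.001_hfreq.nc",
    "g.e20.G.TL319_t13.control.001_hfreq-coarsen.nc",
    "Pac_POP0.1_JRA_IAF_1993-12-6-test.nc",
    "Pac_grid_pbc_1301x305x62.tx01_62l.2013-07-13.nc",
    "comp-grid.tx9.1v3.20170718.zarr.zip",
]


def group_by_suffix(names):
    """Group file names by their final extension (hypothetical helper)."""
    groups = defaultdict(list)
    for name in names:
        groups[PurePosixPath(name).suffix].append(name)
    return dict(groups)


groups = group_by_suffix(registry_files)
for suffix, names in sorted(groups.items()):
    print(f"{suffix}: {len(names)} file(s)")
```

Only the final extension is used as the key, so the zipped Zarr store is grouped under `.zip`.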

Once you know which file you want, pass its name to the pop_tools.DATASETS.fetch() method. It downloads the file if it does not already exist on your local system, then returns the path to the local copy:

filepath = pop_tools.DATASETS.fetch('cesm_pop_monthly.T62_g17.nc')
print(filepath)
/home/docs/.pop_tools/data/cesm_pop_monthly.T62_g17.nc
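The returned path also shows where the sample files are stored locally. As a small sketch, the standard library's pathlib can split it into the cache directory and the file name. The path string is copied from the output above, which comes from the documentation build machine, so the directory on your system will differ.

```python
from pathlib import PurePosixPath

# Path string copied from the output above; the directory is specific to
# the machine that built the docs and will differ on your system.
filepath = "/home/docs/.pop_tools/data/cesm_pop_monthly.T62_g17.nc"

path = PurePosixPath(filepath)
cache_dir = path.parent  # directory holding the downloaded sample files

print(cache_dir)
print(path.name)
print(path.suffix)
```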

Now we can pass the file path to the appropriate I/O package to load the file's contents, here xarray for a netCDF file:

ds = xr.open_dataset(filepath)
ds
<xarray.Dataset>
Dimensions:       (time: 1, z_t: 60, nlat: 384, nlon: 320, lat_aux_grid: 395,
                   d2: 2)
Coordinates:
    TLAT          (nlat, nlon) float64 ...
    TLONG         (nlat, nlon) float64 ...
    ULAT          (nlat, nlon) float64 ...
    ULONG         (nlat, nlon) float64 ...
  * lat_aux_grid  (lat_aux_grid) float32 -79.49 -78.95 -78.42 ... 89.47 90.0
  * time          (time) object 0173-01-01 00:00:00
  * z_t           (z_t) float32 500.0 1.5e+03 2.5e+03 ... 5.125e+05 5.375e+05
Dimensions without coordinates: nlat, nlon, d2
Data variables:
    SALT          (time, z_t, nlat, nlon) float32 ...
    TEMP          (time, z_t, nlat, nlon) float32 ...
    UVEL          (time, z_t, nlat, nlon) float32 ...
    VVEL          (time, z_t, nlat, nlon) float32 ...
    time_bound    (time, d2) object ...
Attributes:
    title:             g.e21.G1850ECOIAF.T62_g17.004
    history:           Sun May 26 14:13:02 2019: ncks -4 -L 9 cesm_pop_monthl...
    Conventions:       CF-1.0; http://www.cgd.ucar.edu/cms/eaton/netcdf/CF-cu...
    time_period_freq:  month_1
    model_doi_url:     https://doi.org/10.5065/D67H1H0V
    contents:          Diagnostic and Prognostic Variables
    source:            CCSM POP2, the CCSM Ocean Component
    revision:          $Id: tavg.F90 90507 2019-01-18 20:54:19Z altuntas@ucar...
    calendar:          All years have exactly  365 days.
    start_time:        This dataset was created on 2019-05-26 at 11:20:07.5
    cell_methods:      cell_methods = time: mean ==> the variable values are ...
    NCO:               netCDF Operators version 4.7.4 (http://nco.sf.net)
%load_ext watermark
%watermark -d -iv -m -g -h
Compiler    : GCC 11.3.0
OS          : Linux
Release     : 5.15.0-1004-aws
Machine     : x86_64
Processor   : x86_64
CPU cores   : 2
Architecture: 64bit

Hostname: build-21213814-project-451810-pop-tools

Git hash: d3c80c0576ae4838c0e04a0157734eb0c977e613

xarray   : 2023.6.0
pop_tools: 2023.3.0.post2+dirty