Authors: Guoqing Wang (NASA, GSFC), Ian Carroll (NASA, UMBC), Eli Holmes (NOAA)
The following notebooks are prerequisites for this tutorial.
An Earthdata Login account is required to access data from the NASA Earthdata system, including NASA ocean color data.
This tutorial demonstrates accessing and analyzing NASA ocean color data from the NASA Ocean Biology Distributed Active Archive Center (OB.DAAC) archives. Currently, there are several ways to find and access ocean color data:
In this tutorial, we will focus on using the earthaccess
Python module to access ocean color data through NASA’s Common Metadata
Repository (CMR), a metadata system that catalogs Earth Science data and
associated metadata records.
The Level-3 datasets of Aqua/MODIS include multiple types of temporally and spatially aggregated data. We will look at 8-day averaged and monthly averaged data at 9km resolution. We will plot chlorophyll-a (chlor_a) and remote-sensing reflectance at 412 nm (Rrs_412) data.
At the end of this notebook you will know:
* How to find OB.DAAC ocean color data
* How to download files using earthaccess
* How to create a plot using xarray
We begin by importing all of the packages used in this notebook. If you have created an environment following the guidance provided with this tutorial, then the imports will be successful.
from matplotlib import pyplot as plt
import cartopy
import earthaccess
import numpy as np
import xarray as xr
auth = earthaccess.login(persist=True)
In this example, we use the MODIS Aqua L3 8-day averaged 9km
chlorophyll image for Sep 13-20, 2016 and the October 2020 monthly
average for Rrs_412. First we need to search for the data, which are
hosted by the OB.DAAC. The earthaccess.search_datasets
function queries the CMR for collections. To run this search we need to
know something about the data, in particular that we are looking for
L3m (Level-3 mapped) collections from MODIS Aqua.
results = earthaccess.search_datasets(
keyword="L3m ocean color modis aqua chlorophyll",
instrument = "MODIS",
)
Datasets found: 3
set((i.summary()["short-name"] for i in results))
{'MODISA_L3m_CHL', 'MODISA_L3m_CHL_NRT', 'MODIST_L3m_CHL'}
To read about each data collection, go to
https://search.earthdata.nasa.gov/ and enter the short names. We want to
use the MODISA_L3m_CHL data collection for our first plot. We can get
the files (granules) in that collection with earthaccess.search_data().
tspan = ("2016-09-20", "2016-09-20")
results = earthaccess.search_data(
short_name="MODISA_L3m_CHL",
temporal=tspan,
)
Granules found: 122
Clearly, that’s too many granules for a single day! The OB.DAAC
publishes multiple variants of a dataset under the same short name, and
the only way to distinguish them is by the product or granule name. The
CMR search allows a granule_name
parameter with wildcards
for this kind of filter. The strings we need to see in the granule name
are “.8D” and “.9km” (the “.” is a separator used in granule names).
results = earthaccess.search_data(
short_name="MODISA_L3m_CHL",
granule_name="*.8D*.9km*",
temporal=tspan,
)
Granules found: 1
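The effect of the wildcard pattern can be previewed locally before querying the CMR. The following is a minimal sketch, assuming the CMR wildcard behaves like a shell-style glob; it uses Python's standard fnmatch module, not the CMR itself. The first name is the granule used in this tutorial. The other two are hypothetical names, invented for illustration, that follow the same naming pattern.

```python
from fnmatch import fnmatchcase

# Same pattern as passed to granule_name above.
pattern = "*.8D*.9km*"

names = [
    "AQUA_MODIS.20160913_20160920.L3m.8D.CHL.chlor_a.9km.nc",  # from this tutorial
    "AQUA_MODIS.20160913_20160920.L3m.8D.CHL.chlor_a.4km.nc",  # hypothetical name
    "AQUA_MODIS.20160920.L3m.DAY.CHL.chlor_a.9km.nc",  # hypothetical name
]

# Keep only the names that contain ".8D" followed later by ".9km".
matches = [name for name in names if fnmatchcase(name, pattern)]
print(matches)
```

Only the 8-day, 9km name passes both parts of the pattern, which is the behavior the granule_name filter relies on.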
results[0]
We need to check if the data are cloud-hosted. If they are, and we are also working in the cloud, we can load them into memory directly without downloading. If they are not cloud-hosted, we need to download the data file.
results[0].cloud_hosted
False
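The decision described above can be written as a small helper. This is a sketch only: the function name fetch and the running_in_cloud flag are hypothetical, and the opener and downloader callables are passed in so that the example runs without network access. Stand-in objects replace real search results here.

```python
from types import SimpleNamespace


def fetch(granules, opener, downloader, running_in_cloud):
    """Stream granules when possible, otherwise download them.

    opener and downloader are callables supplied by the caller, for
    example earthaccess.open and a wrapper around earthaccess.download.
    """
    if running_in_cloud and all(g.cloud_hosted for g in granules):
        return "streamed", opener(granules)
    return "downloaded", downloader(granules)


# Stand-in granule and callables (hypothetical, for illustration only).
fake = [SimpleNamespace(cloud_hosted=False)]
mode, _ = fetch(fake, opener=list, downloader=list, running_in_cloud=True)
print(mode)
```

With real search results, a call might look like fetch(results, earthaccess.open, lambda g: earthaccess.download(g, "data"), running_in_cloud=False).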
The data are not cloud-hosted, so we download them with
earthaccess.download().
paths = earthaccess.download(results, "data")
Getting 1 granules, approx download size: 0.01 GB
File AQUA_MODIS.20160913_20160920.L3m.8D.CHL.chlor_a.9km.nc already downloaded
dataset = xr.open_dataset(paths[0])
dataset
<xarray.Dataset> Size: 37MB
Dimensions:        (lat: 2160, lon: 4320, rgb: 3, eightbitcolor: 256)
Coordinates:
  * lat            (lat) float32 9kB 89.96 89.88 89.79 89.71 ... -89.79 -89.88 -89.96
  * lon            (lon) float32 17kB -180.0 -179.9 -179.8 ... 179.8 179.9 180.0
Dimensions without coordinates: rgb, eightbitcolor
Data variables:
    chlor_a        (lat, lon) float32 37MB ...
    palette        (rgb, eightbitcolor) uint8 768B ...
Attributes: (12/62)
    product_name:         AQUA_MODIS.20160913_20160920.L3m.8D.CHL...
    instrument:           MODIS
    title:                MODISA Level-3 Standard Mapped Image
    project:              Ocean Biology Processing Group (NASA/GS...
    platform:             Aqua
    source:               satellite observations from MODIS-Aqua
    ...                   ...
    cdm_data_type:        grid
    keywords:             Earth Science > Oceans > Ocean Chemistr...
    keywords_vocabulary:  NASA Global Change Master Directory (GC...
    data_bins:            3848961
    data_minimum:         0.003795496
    data_maximum:         85.78689
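The grid dimensions in the summary can be used to check the nominal "9km" resolution and the reported variable size. The following is a rough sketch, assuming a regular global latitude-longitude grid and a spherical Earth with the WGS84 equatorial radius; the variable names are illustrative.

```python
import math

n_lat, n_lon = 2160, 4320  # grid dimensions from the dataset summary

# Angular size of one grid cell in degrees.
dlat = 180 / n_lat
dlon = 360 / n_lon

# Approximate length of one degree at the equator, assuming a
# spherical Earth with the WGS84 equatorial radius (6378.137 km).
km_per_degree = 2 * math.pi * 6378.137 / 360
cell_km = dlon * km_per_degree

# Memory needed for one float32 (4-byte) variable on this grid.
size_mb = n_lat * n_lon * 4 / 1e6

print(f"{dlat:.4f} deg x {dlon:.4f} deg, about {cell_km:.1f} km at the equator")
print(f"{size_mb:.1f} MB per float32 variable")
```

Each cell spans one twelfth of a degree, a little over 9 km at the equator, and a single float32 variable on this grid takes about 37 MB, consistent with the size shown for chlor_a.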
array = np.log10(dataset["chlor_a"])
array.attrs.update(
{
"units": f'log10({dataset["chlor_a"].attrs["units"]})',
}
)
crs_proj = cartopy.crs.Robinson()
crs_data = cartopy.crs.PlateCarree()
fig = plt.figure(figsize=(10, 5))
ax = fig.add_subplot(projection=crs_proj)
array.plot(x="lon", y="lat", cmap="jet", ax=ax, robust=True, transform=crs_data)
ax.coastlines()
ax.set_title(dataset.attrs["product_name"])
plt.show()
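The log10 transform applied before plotting is useful because chlorophyll-a concentrations span several orders of magnitude. A quick check using the data_minimum and data_maximum attributes from the dataset summary, written as a small standalone sketch:

```python
import math

# Values taken from the data_minimum and data_maximum attributes above.
chl_min = 0.003795496
chl_max = 85.78689

# The same transform that np.log10 applies to every pixel.
log_min = math.log10(chl_min)
log_max = math.log10(chl_max)

print(f"linear range: {chl_max / chl_min:.0f}x")
print(f"log10 range: {log_min:.2f} to {log_max:.2f}")
```

On a linear color scale most of the ocean would fall into the lowest few colors; on the log scale the same values spread over a range of roughly four units.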
Repeat these steps to map the monthly Rrs_412 dataset, a temporal average of cloud-free pixels, aggregated to 9km spatial resolution, for October 2020.
tspan = ("2020-10-01", "2020-10-01")
results = earthaccess.search_data(
short_name="MODISA_L3m_RRS",
granule_name="*.MO*.Rrs_412*.9km*",
temporal=tspan,
)
Granules found: 1
paths = earthaccess.download(results, "data")
Getting 1 granules, approx download size: 0.01 GB
File AQUA_MODIS.20201001_20201031.L3m.MO.RRS.Rrs_412.9km.nc already downloaded
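A one-day time span can match a monthly granule because, as far as we understand, a temporal search returns granules whose coverage period intersects the requested span. The sketch below illustrates that idea with the file name from this step. It assumes the second dot-separated field of the name holds the start and end dates, which is inferred from the two file names in this tutorial; parse_period and overlaps are hypothetical helpers, not part of earthaccess.

```python
from datetime import date, datetime


def parse_period(granule_name):
    """Return (start, end) dates from the second dot-separated field."""
    period = granule_name.split(".")[1]
    start, end = (datetime.strptime(p, "%Y%m%d").date() for p in period.split("_"))
    return start, end


def overlaps(period, span):
    """True when two inclusive date ranges share at least one day."""
    return period[0] <= span[1] and span[0] <= period[1]


name = "AQUA_MODIS.20201001_20201031.L3m.MO.RRS.Rrs_412.9km.nc"
period = parse_period(name)
tspan = (date(2020, 10, 1), date(2020, 10, 1))
print(period, overlaps(period, tspan))
```

Under the same assumption, the earlier single-day search matched every averaging period that includes that day, which is one reason it returned so many granules before the granule_name filter was added.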
dataset = xr.open_dataset(paths[0])
fig = plt.figure(figsize=(10, 5))
ax = plt.axes(projection=crs_proj)
dataset["Rrs_412"].plot(
x="lon", y="lat", cmap="jet", robust=True, ax=ax, transform=crs_data
)
ax.coastlines()
ax.set_title(dataset.attrs["product_name"])
plt.show()