library(rhdf5)
library(gdalraster)
library(stars)
library(terra)
library(rnaturalearth)
library(rnaturalearthdata)
4 HDF5 and NetCDF Files from MODIS Ocean Color Data
In these examples, we will use data from: Earth data. This website provides direct access to ocean color data products from various satellite missions, including MODIS. You can find Level-3 binned and mapped data here.
4.1 L3BIN (Level-3 Binned Data) - HDF5
Level-3 binned (L3b) files are HDF5 files containing data that has been statistically processed and binned into spatial grids. The data is organized in layers, where each layer represents a different variable. These layers are stored in the /level-3_binned_data
group within the HDF5 file. The BinList
layer is particularly important, as it contains the bin information, such as the bin’s coordinates and the bin’s weights, which are crucial for proper interpretation of the data.
In R, these files can be read using the rhdf5
package from bioconductor. If you haven’t already, you’ll need to install the rhdf5
package. The following code shows how to install the package and then reads the chlor_a
layer (chlorophyll-a concentration) from the file A2016160.L3b_DAY_CHL.nc
:
if (!require("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
::install()
BiocManager
::install("rhdf5") BiocManager
The h5ls()
function will list all the layers included in the HDF5 file. This is useful for exploring the file’s structure and identifying the available datasets.
<- here::here("data", "AQUA_MODIS.20160608.L3b.DAY.CHL.nc")
file h5ls(file)
Datatype: binDataType
Datatype: binIndexType
Datatype: binListType
group | name | otype | dclass | dim | |
---|---|---|---|---|---|
0 | / | level-3_binned_data | H5I_GROUP | ||
1 | /level-3_binned_data | BinIndex | H5I_DATASET | COMPOUND | 4320 |
2 | /level-3_binned_data | BinList | H5I_DATASET | COMPOUND | 2417047 |
3 | /level-3_binned_data | binDataDim | H5I_DATASET | FLOAT | 0 |
4 | /level-3_binned_data | binIndexDim | H5I_DATASET | FLOAT | 0 |
5 | /level-3_binned_data | binListDim | H5I_DATASET | FLOAT | 0 |
6 | /level-3_binned_data | chlor_a | H5I_DATASET | COMPOUND | 2417047 |
7 | / | processing_control | H5I_GROUP | ||
8 | /processing_control | input_parameters | H5I_GROUP |
Finally, use h5read()
to open a specific layer:
<- h5read(file, "/level-3_binned_data/chlor_a")
df head(df)
sum | sum_squared |
---|---|
0.1869691 | 0.03495743 |
0.2829849 | 0.05712159 |
0.4636284 | 0.09679225 |
0.6681932 | 0.13557214 |
0.5648066 | 0.12085202 |
0.4852219 | 0.09646683 |
The resulting data frame df
will contain the binned chlorophyll-a data. The columns in this data frame represent:
sum
: The sum of the chlorophyll-a values of all pixels within the bin.sum_squared
: The sum of the squared chlorophyll-a values within the bin. This can be used to calculate the variance or standard deviation of the data within the bin.
It is crucial to understand that the observed values in the sum
column need to be weighted to obtain the actual average chlorophyll-a concentration for each bin. The weights, representing the relative contribution of each pixel to the bin, are stored in the BinList
layer.
To apply the weights, read the BinList
layer and divide the sum
column by the weights
column:
<- h5read(file, "/level-3_binned_data/BinList")
bins
"sum"]] <- df[["sum"]] / bins[["weights"]]
df[[
head(df)
sum | sum_squared |
---|---|
0.1869691 | 0.03495743 |
0.2001005 | 0.05712159 |
0.2073409 | 0.09679225 |
0.2014678 | 0.13557214 |
0.2134768 | 0.12085202 |
0.1980910 | 0.09646683 |
The df[["sum"]]
column now contains the weighted average chlorophyll-a concentration for each bin. You can now perform further analysis or visualization with this data.
4.2 L3MAP (Level-3 Mapped Data) - NetCDF
Level-3 mapped (L3m) data represents geophysical variables projected onto a regular grid. These files are typically easier to visualize than L3b data because they are already in a gridded format.
<- here::here("data", "AQUA_MODIS.20160608.L3m.DAY.CHL.chlor_a.4km.nc")
file
<- rast(file)
r
r
class : SpatRaster
dimensions : 4320, 8640, 1 (nrow, ncol, nlyr)
resolution : 0.04166667, 0.04166667 (x, y)
extent : -180, 180, -90.00001, 90 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs
source : AQUA_MODIS.20160608.L3m.DAY.CHL.chlor_a.4km.nc:chlor_a
varname : chlor_a (Chlorophyll Concentration, OCI Algorithm)
name : chlor_a
unit : mg m^-3
This shows the raster’s metadata, such as the number of layers, the number of rows and columns, the resolution, and the extent.
<- ne_countries(scale = "medium", returnclass = "sf")
wm
plot(wm[["geometry"]], col = "lightgray", lwd = 0.5)
plot(r, add = TRUE, col = terrain.colors(10L))
4.3 L3BIN (Level-3 Binned Data) - hdf4
<- here::here("data", "MYD08_D3.A2003181.061.2018007232726.hdf")
filename
<- describe(filename, sds = TRUE, meta = FALSE, parse = FALSE)
sdss head(sdss)
id | name | var | desc | nrow | ncol | nlyr |
---|---|---|---|---|---|---|
1 | HDF4_EOS:EOS_GRID:“/media/LaCie16TB/work/projects/documentation/read_rs_product/data/MYD08_D3.A2003181.061.2018007232726.hdf”:mod08:Solar_Zenith_Mean | Solar_Zenith_Mean | [180x360] Solar_Zenith_Mean mod08 (16-bit integer) | 180 | 360 | 1 |
2 | HDF4_EOS:EOS_GRID:“/media/LaCie16TB/work/projects/documentation/read_rs_product/data/MYD08_D3.A2003181.061.2018007232726.hdf”:mod08:Solar_Zenith_Standard_Deviation | Solar_Zenith_Standard_Deviation | [180x360] Solar_Zenith_Standard_Deviation mod08 (16-bit integer) | 180 | 360 | 1 |
3 | HDF4_EOS:EOS_GRID:“/media/LaCie16TB/work/projects/documentation/read_rs_product/data/MYD08_D3.A2003181.061.2018007232726.hdf”:mod08:Solar_Zenith_Minimum | Solar_Zenith_Minimum | [180x360] Solar_Zenith_Minimum mod08 (16-bit integer) | 180 | 360 | 1 |
4 | HDF4_EOS:EOS_GRID:“/media/LaCie16TB/work/projects/documentation/read_rs_product/data/MYD08_D3.A2003181.061.2018007232726.hdf”:mod08:Solar_Zenith_Maximum | Solar_Zenith_Maximum | [180x360] Solar_Zenith_Maximum mod08 (16-bit integer) | 180 | 360 | 1 |
5 | HDF4_EOS:EOS_GRID:“/media/LaCie16TB/work/projects/documentation/read_rs_product/data/MYD08_D3.A2003181.061.2018007232726.hdf”:mod08:Solar_Zenith_Pixel_Counts | Solar_Zenith_Pixel_Counts | [180x360] Solar_Zenith_Pixel_Counts mod08 (16-bit integer) | 180 | 360 | 1 |
6 | HDF4_EOS:EOS_GRID:“/media/LaCie16TB/work/projects/documentation/read_rs_product/data/MYD08_D3.A2003181.061.2018007232726.hdf”:mod08:Solar_Azimuth_Mean | Solar_Azimuth_Mean | [180x360] Solar_Azimuth_Mean mod08 (16-bit integer) | 180 | 360 | 1 |
<- rast(filename, 142L)
r
plot(r)