The Tangled Bank
  • Home
  • Vignettes
    • Dates From netCDF Files: Two Approaches
    • Downloading and Prep: ERDDAP
    • Retrieving Chlorophyll-a Data from ERDDAP Servers
    • Downloading and Prep: Python + CDO
    • Detecting Events in Gridded Data
    • Regridding gridded data
    • Displaying MHWs and MCSs as Horizon Plots
    • heatwaveR Issues
    • Plotting the Whale Sightings and Chlorophyll-a Concentrations
    • Spatial Localisation, Subsetting, and Aggregation of the Chlorophyll-a Data
    • Wavelet Analysis of Diatom Time Series
    • Extracting gridded data within a buffer
  • High-Performace Computing
    • Using Lengau
    • PBSPro user and job management
    • Using tmux
  • Tangled Bank Blog
  • About
  • Blogroll
    • R-bloggers
    • Source Code
    • Report a Bug

On this page

  • VIIRS chlorophyll-a data
  • ERA5 reanalysis wind data
  • Regridding
    • Option 1
    • Option 2
    • Option 3
  • Edit this page
  • Report an issue

Regridding gridded data

Aligning and matching spatial grids

This vignette demonstrates basic ideas behind regridding.
Author

AJ Smit

Published

June 12, 2023

library(tidyverse) # A staple of modern data processing in R
library(tidync) # For easily dealing with NetCDF data
library(data.table)
library(rerddap) # For easily downloading subsets of data
library(lubridate)
library(reticulate)
library(doParallel) # For parallel processing

First, I define the Benguela region and time extent of interest:

lats <- c(-37.5, -20)
lons <- c(15, 20)
time <- c("2021-01-01", "2021-12-31")

Then I find some data.

VIIRS chlorophyll-a data

which_chl <- ed_search(query = "Chlorophyll-a", which = "griddap")

I select the VIIRS chl-a data. These data start in 2012 and it has a spatial resolution of ~4km lat/lon. The data, “VIIRSN, Suomi-NPP, Level-3 SMI, NASA, Global, 4km, Chlorophyll a, OCI Algorithm, R2018, 2012-present, Daily,” were retrieved from here.

browse("erdVH2018chla1day")

And now I download it for the region and time period specified earlier:

chl <- griddap(datasetx = "erdVH2018chla1day", 
               url = "https://upwell.pfeg.noaa.gov/erddap/", 
               time = c(time[1], time[2]),
               latitude = lats,
               longitude = lons,
               fields = "all")$data %>% 
  mutate(time = as.Date(stringr::str_remove(time, "T00:00:00Z"))) |> 
  as.data.table()

ERA5 reanalysis wind data

I use the “ERA5 hourly data on single levels from 1940 to present” data, which come in at an hourly resolution for the whole world, starting in 1940. The spatial resolution is an unimpressive 0.25° × 0.25° lat/lon, far coarser than the chl-a data.

The ERA5 reanalysis wind data (u and v components) were downloaded from Copernicus using a python script, which can be generated on the website using the “Show API request” option after selecting the variables and spatio-temporal ranges of interest:

import cdsapi

c = cdsapi.Client()

c.retrieve(
    'reanalysis-era5-single-levels',
    {
        'product_type': 'reanalysis',
        'variable': [
            '10m_u_component_of_wind', '10m_v_component_of_wind',
        ],
        'year': '2021',
        'month': [
            '01', '02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12',
        ],
        'day': [
            '01', '02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12',
            '13', '14', '15',
            '16', '17', '18',
            '19', '20', '21',
            '22', '23', '24',
            '25', '26', '27',
            '28', '29', '30',
            '31',
        ],
        'time': [
            '00:00', '01:00', '02:00',
            '03:00', '04:00', '05:00',
            '06:00', '07:00', '08:00',
            '09:00', '10:00', '11:00',
            '12:00', '13:00', '14:00',
            '15:00', '16:00', '17:00',
            '18:00', '19:00', '20:00',
            '21:00', '22:00', '23:00',
        ],
        'area': [
            -20, 15, -37.5,
            20,
        ],
        'format': 'netcdf',
    },
    'download.nc')

The end product of the python download is a 52.3 Mb netCDF file, which I will now load and process to produce the daily temperature values to match the daily resolution of the VIIRS chl-a data:

# time units: hours since 1900-01-01 00:00:00.0
origin <- as.POSIXct("1900-01-01 00:00:00", tz = "UTC")

ncFile <- "/Volumes/OceanData/ERA5/ERA5_2021_Benguela.nc"
# ncFile <- "~/Downloads/ERA5_2021_Benguela.nc"

era5 <- tidync(ncFile) |>
hyper_tibble() %>%
mutate(time = floor_date(time * 3600 + origin, "day")) |>
reframe(u10 = mean(u10),
v10 = mean(v10),
.by = c(time, longitude, latitude)) |> 
as.data.table()

Note that I coerce the data to a date.table object since the regridding step (Option 1) uses

Regridding

Let’s check out their respective spatial resolutions:

head(unique(chl$longitude))
[1] 14.97917 15.02084 15.06251 15.10417 15.14584 15.18751
sort(unique(era5$longitude))
 [1] 15.00 15.25 15.50 15.75 16.00 16.25 16.50 16.75 17.00 17.25 17.50 17.75
[13] 18.00 18.25 18.50 18.75 19.00 19.25 19.50 19.75 20.00
head(unique(chl$latitude))
[1] -19.97917 -20.02084 -20.06250 -20.10417 -20.14584 -20.18750
head(sort(unique(era5$latitude)))
[1] -37.50 -37.25 -37.00 -36.75 -36.50 -36.25

There is a huge difference. It is seldom a good idea to go from a low resolution like 25 km to a higher resolution like 4 km, but if you insist, you can do it with bi-linear interpolation. Here we will degrade the 4 km product to match the 25 km resolution of the wind data.

Option 1

Define a new output grid. This will be the coarsest resolution one from ERA5:

lon.out <- unique(era5$longitude)
lat.out <- sort(unique(era5$latitude))

Using the metR package and its Interpolate() function, interpolate the data to the new coarser resolution grid. I show two approaches: i) a method shown in the function’s help file that uses data.table, and ii) a dplyr (tidyverse) method using the new reframe() function. What reframe() does when used within a dplyr data pipe is reveal the column names, which can then be given to the function of interest in the usual way; it then returns a dataframe or tibble of arbitrary length. reframe() also accommodates the grouping structure within the function itself through the use of the .by = argument (i.e. no need for an a priori group_by()), making the syntax not dissimilar to that of data.table’s. A few years ago the data.table approach would have been faster, but it seems the new versions of dplyr have undergone some significant speed improvements. See the results of the system.time() function:

library(metR)

# using the data.table method
system.time(
  interp_chl <- chl[, Interpolate(chla ~ longitude + latitude, lon.out, lat.out), by = time]
)
   user  system elapsed 
  5.955   0.823   5.662 
head(interp_chl)
         time longitude latitude  chla
       <Date>     <num>    <num> <num>
1: 2021-01-01     15.00    -37.5    NA
2: 2021-01-01     15.25    -37.5    NA
3: 2021-01-01     15.50    -37.5    NA
4: 2021-01-01     15.75    -37.5    NA
5: 2021-01-01     16.00    -37.5    NA
6: 2021-01-01     16.25    -37.5    NA
# using dplyr
system.time(
  interp_chl <- chl |> 
    reframe(Interpolate(chla ~ longitude + latitude,
                        x.out = lon.out, y.out = lat.out), .by = time) |> 
    as_tibble()
) 
   user  system elapsed 
  6.008   0.892   6.086 
interp_chl
# A tibble: 544,215 × 4
   time       longitude latitude  chla
   <date>         <dbl>    <dbl> <dbl>
 1 2021-01-01      15      -37.5    NA
 2 2021-01-01      15.2    -37.5    NA
 3 2021-01-01      15.5    -37.5    NA
 4 2021-01-01      15.8    -37.5    NA
 5 2021-01-01      16      -37.5    NA
 6 2021-01-01      16.2    -37.5    NA
 7 2021-01-01      16.5    -37.5    NA
 8 2021-01-01      16.8    -37.5    NA
 9 2021-01-01      17      -37.5    NA
10 2021-01-01      17.2    -37.5    NA
# ℹ 544,205 more rows

Note that daily chl-a data are very gappy and hence there are many NAs in the interpolated dataset.

Let’s verify that the output grids are now the same:

head(unique(interp_chl$longitude))
[1] 15.00 15.25 15.50 15.75 16.00 16.25
head(unique(era5$longitude))
[1] 15.00 15.25 15.50 15.75 16.00 16.25
head(unique(interp_chl$latitude))
[1] -37.50 -37.25 -37.00 -36.75 -36.50 -36.25
head(sort(unique(era5$latitude)))
[1] -37.50 -37.25 -37.00 -36.75 -36.50 -36.25

These approaches seem to work… ‘work’ as in they don’t fail. I have not tested the output data to see if the results are believable; for example, how does the Interpolate() function handle missing values? I am unsure.

Option 2

Another commonly used function for interpolation lives in the akima package. The advantage of this package is that it can do spline interpolation in addition to linear interpolation (Interpolate() only does linear interpolation). The disadvantage is that is really does not like NAs. This is how it would work:

# Assuming 'chl' is your data.frame and 'time', 'longitude', 'latitude' and 'chla' are your columns
library(akima)

chl |> 
  reframe(interp_chl = interp(longitude, latitude, chla,
                              xo = lon.out, yo = lat.out))

Option 3

The next regridding option is done entirely within a spatial data framework. Traditionally we used the raster package, but this has been phased out in favour of sf (Simple Features for R), stars (Spatiotemporal Arrays: Raster and Vector Data Cubes), and terra (Spatial Data Analysis). Here I shall use sf and stars. Refer to Spatial Data Science for information about these spatial methods; specifically, see Chapter 7, Introduction to sf and stars.

R spatial packages are experiencing a rapid evolution and the learning curve might be steep. I think, however, that it’s well worth one’s time as a host of spatial mapping options become available, bringing R closer in functionality to GIS. I am still learning all the various features myself and I am exploring options for integrating the spatial functionality into our marine heatwave workflows.

Make stars objects from the gridded data. Let’s start with the chl-a data first:

library(stars)
library(sf)

# EPSG:4326
# WGS 84 -- WGS84 - World Geodetic System 1984, used in GPS
chl_st <- chl |>
  st_as_stars(dims = c("longitude", "latitude", "time"),
              raster = "chlorophyll") |>
  sf::st_set_crs(4326) |> 
  st_warp(crs = st_crs(4326))

Here are a few interrogation methods to see info about the data’s spatial extent:

print(chl_st)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
            Min.   1st Qu.    Median      Mean  3rd Qu.     Max.  NA's
chla  0.09295794 0.1467107 0.2234708 0.5681996 0.375212 7.863433 99166
dimension(s):
     from  to     offset      delta refsys x/y
x       1 121    14.9583  0.0416985 WGS 84 [x]
y       1 422   -19.9583 -0.0416985 WGS 84 [y]
time    1 365 2021-01-01     1 days   Date    
dim(chl_st)
   x    y time 
 121  422  365 
st_dimensions(chl_st)
     from  to     offset      delta refsys x/y
x       1 121    14.9583  0.0416985 WGS 84 [x]
y       1 422   -19.9583 -0.0416985 WGS 84 [y]
time    1 365 2021-01-01     1 days   Date    
st_bbox(chl_st)
     xmin      ymin      xmax      ymax 
 14.95834 -37.55509  20.00385 -19.95834 

We can also plot the stars data directly; here I plot the data on the 34th day in the time series:

# visualise a time step (day 34):
plot(chl_st[, , , 34])

Next we also need to get the ERA5 data into a stars format, and we print out some spatial info and make a basic map:

era5_st <- era5 |>
  st_as_stars(dims = c("longitude", "latitude", "time"),
              raster = c("u10", "v10")) |>
  sf::st_set_crs(4326) |> 
  st_warp(crs = st_crs(4326))
print(era5_st)
stars object with 3 dimensions and 2 attributes
attribute(s):
          Min.   1st Qu.     Median        Mean  3rd Qu.    Max. NA's
u10  -17.47988 -2.111889 -0.1112329 0.009983922 1.863335 17.6071 7665
v10  -11.88497 -1.194172  0.7711142 1.171964005 3.200816 15.3441 7665
dimension(s):
     from  to         offset     delta  refsys x/y
x       1  21         14.875  0.250205  WGS 84 [x]
y       1  72        -19.875 -0.250205  WGS 84 [y]
time    1 365 2021-01-01 UTC    1 days POSIXct    
st_bbox(era5_st)
     xmin      ymin      xmax      ymax 
 14.87500 -37.88974  20.12930 -19.87500 
plot(era5_st["u10", , , 34])

All of this was to bring us to a point where we can do the actual regridding. This is done with the same st_warp() function. Previously it was used to make the data conform to a specific coordinate reference system (CRS) but here I use it to perform the regridding:

chl_st_regrid <- st_warp(src = chl_st, dest = era5_st)

Let us see if it worked as advertised:

print(chl_st_regrid)
stars object with 3 dimensions and 1 attribute
attribute(s):
            Min.   1st Qu.    Median      Mean   3rd Qu.     Max.   NA's
chla  0.04547131 0.1944372 0.3054729 0.8114727 0.6934659 93.51793 488906
dimension(s):
     from  to     offset     delta refsys x/y
x       1  21     14.875  0.250205 WGS 84 [x]
y       1  72    -19.875 -0.250205 WGS 84 [y]
time    1 365 2021-01-01    1 days   Date    
st_bbox(chl_st_regrid)
     xmin      ymin      xmax      ymax 
 14.87500 -37.88974  20.12930 -19.87500 
plot(chl_st_regrid[, , , 34])

Convert the stars object back to a tibble if necessary:

chl_st_regrid_df <- as_tibble(chl_st_regrid, xy = TRUE)
chl_st_regrid_df
# A tibble: 551,880 × 4
       x     y time        chla
   <dbl> <dbl> <date>     <dbl>
 1  15.0 -20.0 2021-01-01    NA
 2  15.3 -20.0 2021-01-01    NA
 3  15.5 -20.0 2021-01-01    NA
 4  15.8 -20.0 2021-01-01    NA
 5  16.0 -20.0 2021-01-01    NA
 6  16.3 -20.0 2021-01-01    NA
 7  16.5 -20.0 2021-01-01    NA
 8  16.8 -20.0 2021-01-01    NA
 9  17.0 -20.0 2021-01-01    NA
10  17.3 -20.0 2021-01-01    NA
# ℹ 551,870 more rows

That’s it, folks!

If you have any suggestions about how to do regridding or make the spatial functions more user-friendly in a marine heatwave analysis workflow, please let me know.

Reuse

CC BY-NC-SA 4.0

Citation

BibTeX citation:
@online{smit,_a._j.,
  author = {Smit, A. J., and Smit, AJ},
  title = {Regridding Gridded Data},
  date = {},
  url = {http://tangledbank.netlify.app/vignettes/regridding.html},
  langid = {en}
}
For attribution, please cite this work as:
Smit, A. J., Smit A Regridding gridded data. http://tangledbank.netlify.app/vignettes/regridding.html.
Source Code
---
author: "AJ Smit"
title: "Regridding gridded data"
subtitle: "Aligning and matching spatial grids"
date: "`r Sys.Date()`"
description: "This vignette demonstrates basic ideas behind regridding."
bibliography: ../references.bib
csl: ../marine-biology.csl
format:
  html:
    code-fold: false
    toc-title: "On this page"
    standalone: true
    toc-location: right
    page-layout: full
---

```{r}
library(tidyverse) # A staple of modern data processing in R
library(tidync) # For easily dealing with NetCDF data
library(data.table)
library(rerddap) # For easily downloading subsets of data
library(lubridate)
library(reticulate)
library(doParallel) # For parallel processing
```

<!-- First, find some wind-related data: -->

<!-- ```{r} -->
<!-- which_ws <- ed_search(query = "All Metop ASCAT", which = "griddap") -->
<!-- ``` -->

<!-- I select the dataset "Wind Stress, All Metop ASCAT, 0.25°, Global, Near Real Time, 2013-present (3 -->
<!-- Day), Lon+/-180" with ERDDAP with the dataset ID `erdQAstress1day_LonPM180` accessible [here](https://upwell.pfeg.noaa.gov/erddap/info/erdQAstress1day_LonPM180/index.html): -->

<!-- ```{r} -->
<!-- #| eval: false -->
<!-- browse('erdQAstress1day_LonPM180') -->
<!-- ``` -->

<!-- Now I put together a function to download the data in long format: -->

<!-- ```{r} -->
<!-- # this function downloads and prepares data based on user provided start and end dates -->
<!-- # run once only, then save the downloaded data! -->
<!-- metop <- griddap(datasetx = "erdQAstress1day_LonPM180",  -->
<!--                   url = "https://upwell.pfeg.noaa.gov/erddap/",  -->
<!--                   time = c(time[1], time[2]), -->
<!--                   latitude = lats, -->
<!--                   longitude = lons, -->
<!--                   fields = "all")$data %>%  -->
<!--   mutate(time = as.Date(stringr::str_remove(time, "T00:00:00Z"))) -->
<!-- ``` -->

First, I define the Benguela region and time extent of interest:

```{r}
lats <- c(-37.5, -20)
lons <- c(15, 20)
time <- c("2021-01-01", "2021-12-31")
```

Then I find some data.

## VIIRS chlorophyll-*a* data

```{r}
which_chl <- ed_search(query = "Chlorophyll-a", which = "griddap")
```

I select the VIIRS chl-*a* data. These data start in 2012 and it has a spatial resolution of ~4km lat/lon. The data, "VIIRSN, Suomi-NPP, Level-3 SMI, NASA, Global, 4km, Chlorophyll a, OCI Algorithm, R2018, 2012-present, Daily," were retrieved from [here](https://upwell.pfeg.noaa.gov/erddap/info/erdVH2018chla1day/index.html).

```{r}
#| eval: false
browse("erdVH2018chla1day")
```

And now I download it for the region and time period specified earlier:

```{r}
#| eval: false
chl <- griddap(datasetx = "erdVH2018chla1day", 
               url = "https://upwell.pfeg.noaa.gov/erddap/", 
               time = c(time[1], time[2]),
               latitude = lats,
               longitude = lons,
               fields = "all")$data %>% 
  mutate(time = as.Date(stringr::str_remove(time, "T00:00:00Z"))) |> 
  as.data.table()
```

```{r}
#| eval: false
#| echo: false
save(chl, file = "/Volumes/OceanData/VIIRS/VIIRS_chl.RData")
```

```{r}
#| echo: false
load("/Volumes/OceanData/VIIRS/VIIRS_chl.RData")
```

## ERA5 reanalysis wind data

I use the "ERA5 hourly data on single levels from 1940 to present" data, which come in at an hourly resolution for the whole world, starting in 1940. The spatial resolution is an unimpressive 0.25° × 0.25° lat/lon, far coarser than the chl-*a* data.

The ERA5 reanalysis wind data (*u* and *v* components) were downloaded from [Copernicus](https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=overview) using a python script, which can be generated on the website using the "Show API request" option after selecting the variables and spatio-temporal ranges of interest:

```{python}
#| eval: false
import cdsapi

c = cdsapi.Client()

c.retrieve(
    'reanalysis-era5-single-levels',
    {
        'product_type': 'reanalysis',
        'variable': [
            '10m_u_component_of_wind', '10m_v_component_of_wind',
        ],
        'year': '2021',
        'month': [
            '01', '02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12',
        ],
        'day': [
            '01', '02', '03',
            '04', '05', '06',
            '07', '08', '09',
            '10', '11', '12',
            '13', '14', '15',
            '16', '17', '18',
            '19', '20', '21',
            '22', '23', '24',
            '25', '26', '27',
            '28', '29', '30',
            '31',
        ],
        'time': [
            '00:00', '01:00', '02:00',
            '03:00', '04:00', '05:00',
            '06:00', '07:00', '08:00',
            '09:00', '10:00', '11:00',
            '12:00', '13:00', '14:00',
            '15:00', '16:00', '17:00',
            '18:00', '19:00', '20:00',
            '21:00', '22:00', '23:00',
        ],
        'area': [
            -20, 15, -37.5,
            20,
        ],
        'format': 'netcdf',
    },
    'download.nc')
```

The end product of the python download is a 52.3 Mb netCDF file, which I will now load and process to produce the daily temperature values to match the daily resolution of the VIIRS chl-*a* data:

```{r}
# time units: hours since 1900-01-01 00:00:00.0
origin <- as.POSIXct("1900-01-01 00:00:00", tz = "UTC")

ncFile <- "/Volumes/OceanData/ERA5/ERA5_2021_Benguela.nc"
# ncFile <- "~/Downloads/ERA5_2021_Benguela.nc"

era5 <- tidync(ncFile) |>
hyper_tibble() %>%
mutate(time = floor_date(time * 3600 + origin, "day")) |>
reframe(u10 = mean(u10),
v10 = mean(v10),
.by = c(time, longitude, latitude)) |> 
as.data.table()
```

Note that I coerce the data to a **date.table** object since the regridding step (Option 1) uses 

## Regridding

Let's check out their respective spatial resolutions:

```{r}
head(unique(chl$longitude))
sort(unique(era5$longitude))

head(unique(chl$latitude))
head(sort(unique(era5$latitude)))
```

There is a huge difference. It is seldom a good idea to go from a low resolution like 25 km to a higher resolution like 4 km, but if you insist, you can do it with bi-linear interpolation. Here we will degrade the 4 km product to match the 25 km resolution of the wind data.

### Option 1

Define a new output grid. This will be the coarsest resolution one from ERA5:

```{r}
lon.out <- unique(era5$longitude)
lat.out <- sort(unique(era5$latitude))
```

Using the **metR** package and its `Interpolate()` function, interpolate the data to the new coarser resolution grid. I show two approaches: i) a method shown in the function's help file that uses **data.table**, and ii) a **dplyr** (tidyverse) method using the new `reframe()` function. What `reframe()` does when used within a **dplyr** data pipe is reveal the column names, which can then be given to the function of interest in the usual way; it then returns a dataframe or tibble of arbitrary length. `reframe()` also accommodates the grouping structure within the function itself through the use of the `.by = ` argument (i.e. no need for an *a priori* `group_by()`), making the syntax not dissimilar to that of **data.table**'s. A few years ago the **data.table** approach would have been faster, but it seems the new versions of **dplyr** have undergone some significant speed improvements. See the results of the `system.time()` function:

```{r}
library(metR)

# using the data.table method
system.time(
  interp_chl <- chl[, Interpolate(chla ~ longitude + latitude, lon.out, lat.out), by = time]
)

head(interp_chl)

# using dplyr
system.time(
  interp_chl <- chl |> 
    reframe(Interpolate(chla ~ longitude + latitude,
                        x.out = lon.out, y.out = lat.out), .by = time) |> 
    as_tibble()
) 

interp_chl
```

Note that daily chl-*a* data are very gappy and hence there are many `NA`s in the interpolated dataset.

Let's verify that the output grids are now the same:

```{r}
head(unique(interp_chl$longitude))
head(unique(era5$longitude))

head(unique(interp_chl$latitude))
head(sort(unique(era5$latitude)))
```

These approaches seem to work... 'work' as in they don't fail. I have not tested the output data to see if the results are believable; for example, how does the `Interpolate()` function handle missing values? I am unsure.

### Option 2

Another commonly used function for interpolation lives in the **akima** package. The advantage of this package is that it can do spline interpolation in addition to linear interpolation (`Interpolate()` only does linear interpolation). The disadvantage is that is really does not like `NA`s. This is how it would work:

```{r}
#| eval: false
# Assuming 'chl' is your data.frame and 'time', 'longitude', 'latitude' and 'chla' are your columns
library(akima)

chl |> 
  reframe(interp_chl = interp(longitude, latitude, chla,
                              xo = lon.out, yo = lat.out))
```

### Option 3

The next regridding option is done entirely within a spatial data framework. Traditionally we used the **raster** package, but this has been phased out in favour of [**sf**](https://r-spatial.github.io/sf/) (Simple Features for R), [**stars**](https://r-spatial.github.io/stars/) (Spatiotemporal Arrays: Raster and Vector Data Cubes), and [**terra**](https://rspatial.org/) (Spatial Data Analysis). Here I shall use **sf** and **stars**. Refer to [Spatial Data Science](https://r-spatial.org/book/) for information about these spatial methods; specifically, see [Chapter 7, Introduction to **sf** and **stars**](https://r-spatial.org/book/07-Introsf.html).

R spatial packages are experiencing a rapid evolution and the learning curve might be steep. I think, however, that it's well worth one's time as a host of spatial mapping options become available, bringing R closer in functionality to GIS. I am still learning all the various features myself and I am exploring options for integrating the spatial functionality into our marine heatwave workflows.

Make **stars** objects from the gridded data. Let's start with the chl-*a* data first:

```{r}
library(stars)
library(sf)

# EPSG:4326
# WGS 84 -- WGS84 - World Geodetic System 1984, used in GPS
chl_st <- chl |>
  st_as_stars(dims = c("longitude", "latitude", "time"),
              raster = "chlorophyll") |>
  sf::st_set_crs(4326) |> 
  st_warp(crs = st_crs(4326))
```

Here are a few interrogation methods to see info about the data's spatial extent:

```{r}
print(chl_st)
dim(chl_st)
st_dimensions(chl_st)
st_bbox(chl_st)
```

We can also plot the **stars** data directly; here I plot the data on the 34th day in the time series:

```{r}
# visualise a time step (day 34):
plot(chl_st[, , , 34])
```

Next we also need to get the ERA5 data into a **stars** format, and we print out some spatial info and make a basic map:

```{r}
era5_st <- era5 |>
  st_as_stars(dims = c("longitude", "latitude", "time"),
              raster = c("u10", "v10")) |>
  sf::st_set_crs(4326) |> 
  st_warp(crs = st_crs(4326))
print(era5_st)
st_bbox(era5_st)

plot(era5_st["u10", , , 34])
```

All of this was to bring us to a point where we can do the actual regridding. This is done with the same `st_warp()` function. Previously it was used to make the data conform to a specific coordinate reference system (CRS) but here I use it to perform the regridding:

```{r}
chl_st_regrid <- st_warp(src = chl_st, dest = era5_st)
```

Let us see if it worked as advertised:

```{r}
print(chl_st_regrid)
st_bbox(chl_st_regrid)

plot(chl_st_regrid[, , , 34])
```

Convert the stars object back to a tibble if necessary:

```{r}
chl_st_regrid_df <- as_tibble(chl_st_regrid, xy = TRUE)
chl_st_regrid_df
```

That's it, folks!

If you have any suggestions about how to do regridding or make the spatial functions more user-friendly in a marine heatwave analysis workflow, please let me know.

© 2025, AJ Smit

  • Edit this page
  • Report an issue
Cookie Preferences

Made with and Quarto View the source at GitHub