sgsR - Structurally Guided Sampling

Tristan Goodbody, Nicholas Coops, Martin Queinnec, Joanne White, Piotr Tompalski, Andrew Hudak, David Auty, Ruben Valbuena, Antoine LeBoeuf, Ian Sinclair, Grant McCartney, Jean-Francois Prieur, Murray Woods

University of British Columbia

2022-09-02

Take-away message

  • sgsR - R-package for structurally guided sampling for enhanced forest inventories.

sgsR stands for structurally guided sampling implemented in R

  • Stratification and sampling functions to guide primarily model-based sampling approaches
  • Focus on management-level inventories
  • Documentation and vignettes online
  • Funded by the Canadian Wood Fibre Centre, Canadian Forest Service

Overview

  • Brief inventory and sampling overview

  • Discuss using auxiliary variables within sampling frameworks

  • Structurally guided sampling using Airborne Laser Scanning

  • sgsR overview

  • Programmatic examples of the package

🌲🌳 Forest inventories

Purpose: Obtain knowledge about the population (forest area) under investigation and provide estimates of specific target attributes

Needed information: Defined by the scope & scale of the inventory, which we establish with questions like:

  • Who/what is the information for? (e.g. reporting obligations, timber production)
  • How big an area are we inventorying? (e.g. national level, operational level)

The answers dictate the sampling approaches needed to fulfill inventory obligations and objectives

🚩 Sampling

Mensuration is a cornerstone of forest management
  • Sampling drives accurate forest attribute estimates (e.g. forest area, stem volume)

Sampling can be:

  • Labour intensive
  • Logistically challenging
  • Expensive

We want to balance these challenges with attribute estimate accuracy

🎲📊 Traditional sampling approaches

🎲 Random sampling

  • Randomized sampling where the selection probability of each sample unit is equal and known
#--- simple random sampling ---#
sample_srs(raster = mr, nSamp = 100, plot = TRUE)
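
To make "equal and known" concrete, here is a minimal base R sketch that does not use sgsR. The 100 x 100 grid of candidate cells is a hypothetical stand-in for the raster, and the variable names are illustrative only.

```r
# Hypothetical population: a 100 x 100 grid of candidate cells (not sgsR data)
set.seed(1)
n_cells <- 100 * 100
n_samp <- 100

# Draw cell indices without replacement; every cell has the same chance
srs_idx <- sample.int(n_cells, size = n_samp, replace = FALSE)

# Inclusion probability of any single cell under this design
incl_prob <- n_samp / n_cells
```

Because the inclusion probability is the same for every cell and known in advance, design-based estimators can weight each sampled cell identically.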

🟩 Systematic sampling

  • Systematic sampling methods are also common, where sample units are selected based on a defined distance
#--- systematic sampling ---#
sample_systematic(raster = mr, cellsize = 1000, plot = TRUE)

🟩 Systematic sampling

  • Different tessellation shapes are common
#--- systematic sampling in hexagons ---#
sample_systematic(raster = mr, cellsize = 1000, square = FALSE, plot = TRUE)

🟩 Systematic random sampling

  • And combinations of systematic and simple random sampling also exist
#--- systematic random sampling ---#
sample_systematic(raster = mr, cellsize = 1000, 
                  square = FALSE, location = "random", plot = TRUE)
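
The difference between a purely systematic layout and a systematic random one can be sketched in base R. This is an illustration with square tiles only, under the assumption that the random variant places one sample at a random position inside each tile. The 5 km extent and all variable names are hypothetical.

```r
set.seed(1)
extent_m <- 5000   # hypothetical square study area, 5 km per side
cellsize <- 1000   # tile size in metres, as in the calls above

# Lower-left corner of every square tile
origins <- expand.grid(x0 = seq(0, extent_m - cellsize, by = cellsize),
                       y0 = seq(0, extent_m - cellsize, by = cellsize))
n_tiles <- nrow(origins)

# Systematic: one sample at each tile centre
centers <- data.frame(x = origins$x0 + cellsize / 2,
                      y = origins$y0 + cellsize / 2)

# Systematic random: one sample at a random position inside each tile
random_pts <- data.frame(x = origins$x0 + runif(n_tiles, 0, cellsize),
                         y = origins$y0 + runif(n_tiles, 0, cellsize))
```

Both layouts yield exactly one sample per tile, so spatial coverage stays even while the second adds a random component.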

🎲📊 Traditional sampling approaches

  • We now understand a bit more about probability-based sampling:
  • Randomized
  • Sampling unit probabilities are equal, known, or can be known
  • Different methods exist (e.g. simple random, systematic)
  • Time-tested
  • Simple
  • Efficient
  • Broadly used

What can we do to improve sampling efficiency and attribute estimate accuracy?

🛰👩🏻‍💻 Auxiliary data

  • Empirical distributions that are spatially explicit
  • Imagery (satellite, airborne, drone)
  • Feature-based inventories (species, management type)
  • ALS metrics (height, cover, variability)

✔️ Understand inventory attributes of interest

✔️ Associate auxiliary data correlated to those attributes

✔️ Sample across the full range of attribute variability

💠🔢 Stratification

“Our results highlight that LiDAR data integrated with field data sampling designs can provide broad-scale assessments of vegetation structure and biomass, i.e., information crucial for carbon and biodiversity science.” (Hawbaker et al. 2009)

💠🔢 Stratification

“The ALS data also provides an excellent source of prior information that may be used in the design phase of the field survey to reduce the size of the field data set.” (Gobakken, Korhonen, and Næsset 2013)

💠🔢 Stratification - 1 metric

#--- perform stratification ---#
sraster <- strat_quantiles(mraster = mr$zq90, # p90
                           nStrata = 5) # 5 strata in p90
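
The idea behind quantile stratification can be sketched in base R without sgsR. The p90 values below are simulated, and the code only illustrates equal-frequency binning; it is not a copy of how strat_quantiles() is implemented.

```r
set.seed(1)
p90 <- runif(1000, min = 2, max = 30)  # hypothetical p90 heights (m), one per pixel
n_strata <- 5

# Break points at the 0 %, 20 %, ..., 100 % quantiles of the metric
breaks <- quantile(p90, probs = seq(0, 1, length.out = n_strata + 1))

# Assign each pixel to the stratum whose quantile interval contains it
strata <- cut(p90, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Pixel count per stratum
stratum_counts <- table(strata)
```

Each stratum holds roughly the same number of pixels, so narrow strata appear where the metric is densely populated and wide strata where it is sparse.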

💠🔢 Stratification - 2 metrics

#--- perform dual metric stratification ---#
sraster <- strat_quantiles(mraster = mr[[c(1,3)]], # p90 & zsd
                           nStrata = list(10,3))
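
One plausible way to combine two metrics is to split the first into quantile classes and then split the second into quantile classes within each of those. The base R sketch below assumes that nested scheme; the slides do not state whether sgsR nests or crosses the two metrics, and the data and the helper quantile_class() are hypothetical.

```r
set.seed(1)
n <- 3000
p90 <- runif(n, min = 2, max = 30)   # hypothetical height metric
zsd <- runif(n, min = 0.5, max = 8)  # hypothetical height variability metric

# Hypothetical helper: equal-frequency class index (1..k) for a vector
quantile_class <- function(x, k) {
  b <- quantile(x, probs = seq(0, 1, length.out = k + 1))
  cut(x, breaks = b, include.lowest = TRUE, labels = FALSE)
}

# 10 classes on the first metric
class1 <- quantile_class(p90, 10)

# 3 classes on the second metric, computed separately inside each class1 group
class2 <- ave(zsd, class1, FUN = function(v) quantile_class(v, 3))

# Combine into a single stratum id from 1 to 30
strata <- (class1 - 1) * 3 + class2
```

Nesting keeps every one of the 10 x 3 strata populated, which a simple cross of two independent quantile classifications would not guarantee for correlated metrics.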

🎯 Structurally guided sampling

#--- perform stratification ---#
sraster <- strat_quantiles(mraster = mr$zq90, # p90
                           nStrata = 10)

#--- structurally guided stratified sampling ---#
sample_strat(sraster = sraster, nSamp = 100, mindist = 200, plot = TRUE)

sgsR

sgsR purpose

sgsR is a toolbox to provide primarily model-based sampling approaches for management-level forest inventories that are:

  • Transparent
  • Repeatable
  • Tuneable
  • Spatially-explicit

Algorithm structure

  • sgsR was built using the terra, sf, & tidyverse packages

  • There are 4 primary function verbs that sgsR uses:

  • strat_* - apply stratification to a metrics raster (mraster) and output a stratified raster (sraster)
  • sample_* - allocate samples using srasters produced from strat_* functions
  • calculate_* - calculate sample information or create useful intermediary sampling products
  • extract_* - extract pixel values from rasters to samples

Example 1 🌱 Stratified sampling constrained by access

  • Imagine you are a forester who needs some new plots

  • You love the idea of sgsR and want to use it

Example 1 🌱 Stratified sampling constrained by access

1️⃣ Read in some ALS metrics

#--- Stratification ---#
#--- Load ALS metrics from sgsR internal data ---#
r <- system.file("extdata", "mraster.tif", package = "sgsR")

#--- Read ALS metrics using the terra package ---#
mraster <- terra::rast(r)

Example 1 🌱 Stratified sampling constrained by access

2️⃣ Read in a linear road access network

#--- Load access network from sgsR internal data ---#
a <- system.file("extdata", "access.shp", package = "sgsR")

#--- load the access vector using the sf package ---#
access <- sf::st_read(a)

Example 1 🌱 Stratified sampling constrained by access

3️⃣ Stratify p90 into 4 strata based on quantiles

#--- perform stratification ---#
sraster <- strat_quantiles(mraster = mraster$zq90, # input ALS metric - p90
                           nStrata = 4) # desired number of strata (4)

Example 1 🌱 Stratified sampling constrained by access

4️⃣ Now let's use the sraster output

5️⃣ Request 100 samples

6️⃣ Sample proportional to stratum size

7️⃣ Bring in the access road

8️⃣ Specify we don’t want samples within 50 m of access

9️⃣ Or further than 400 m from access

#--- perform stratification ---#
sraster <- strat_quantiles(mraster = mraster$zq90, # input ALS metric - p90
                           nStrata = 4) # desired number of strata (4)

#--- perform sampling ---#
samples <- sample_strat(sraster = sraster, # step 4: stratified raster
                        nSamp = 100, # step 5: number of samples
                        allocation = "prop", # step 6: proportional; alternatives: "optim", "equal", "manual"
                        access = access, # step 7: access road network
                        buff_inner = 50, # step 8: no samples within 50 m of access
                        buff_outer = 400) # step 9: no samples beyond 400 m from access
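
The two ideas in this call, proportional allocation and the access buffers, can be sketched in base R. The stratum sizes and distances below are made up, and the rounding rule is an assumption; how sgsR resolves rounding when the shares do not sum exactly to nSamp is not shown here.

```r
# Hypothetical number of pixels in each of the 4 strata
stratum_size <- c(`1` = 4000, `2` = 3000, `3` = 2000, `4` = 1000)
n_samp <- 100

# Proportional allocation: each stratum gets a share equal to its share of pixels
n_per_stratum <- round(n_samp * stratum_size / sum(stratum_size))

# Hypothetical distance (m) from each pixel to the nearest road
set.seed(1)
dist_to_road <- runif(sum(stratum_size), min = 0, max = 1000)
buff_inner <- 50
buff_outer <- 400

# A pixel is a candidate only if it lies between the inner and outer buffers
candidate <- dist_to_road >= buff_inner & dist_to_road <= buff_outer
```

With these sizes the allocation is 40, 30, 20 and 10 samples, and only pixels flagged in candidate would be eligible in each stratum.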

Example 1 🌱 Stratified sampling constrained by access

  • Mapped result (A) and plotted result (B)
  • Note buffered access in A. Points are samples in both A & B

Example 1 🌱 Comparing distributions

  • Cumulative frequency distributions
  • access constrained vs full extent for p90 (A) and zsd (B)
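
A comparison like this can be sketched with empirical cumulative distribution functions in base R. The values and the access flag are simulated and the largest vertical gap is just one convenient summary; none of this reproduces the figure on the slide.

```r
set.seed(1)
p90_full <- runif(5000, min = 2, max = 30)  # hypothetical p90 over the full extent
near_road <- runif(5000) < 0.4               # hypothetical flag: pixel is inside the access buffer
p90_access <- p90_full[near_road]

# Empirical cumulative distribution functions for both sets of pixels
F_full <- ecdf(p90_full)
F_access <- ecdf(p90_access)

# Largest vertical gap between the two curves over a common grid of p90 values
grid <- seq(min(p90_full), max(p90_full), length.out = 200)
max_gap <- max(abs(F_full(grid) - F_access(grid)))

# To draw the curves: plot(F_full); lines(F_access, col = "red")
```

A small gap suggests the access-constrained area still covers the structural range of the full extent; a large gap flags structure that the buffers exclude.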

Example 2 🌿 Augmenting an existing sample

“I have an existing sample network, can I use those same sample locations?”

“If I go and visit those same sample units, where should I locate new samples for structural representation?”

Example 2 🌿 Augmenting an existing sample

  • Let's create an existing sample of 50 plots using simple random sampling (sample_srs)

  • We are assuming these have been measured or used previously and can be revisited

set.seed(2022)
#--- simple random sampling ---#
existing <- sample_srs(raster = mr, nSamp = 50, plot = TRUE)

Example 2 🌿 Augmenting an existing sample

Adapted Hypercube Evaluation of a Legacy Sample (AHELS) (Malone, Minasny, and Brungard 2019)

sample_ahels() works by:

  • Determining the representation of the existing sample
  • Generating quantile and covariance matrices of the ALS metrics
  • Determining the number of additional samples that can / need to be added
  • Identifying where new samples are needed to balance quantile density and sampling density
  • Iteratively locating samples
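
The balancing idea in the steps above can be sketched in base R for a single metric. This is a simplified illustration, not the sample_ahels() implementation: it ignores the covariance matrix, uses simulated data, and greedily adds one sample at a time to the most under-represented quantile bin. The helper bin_of() is hypothetical.

```r
set.seed(2022)
metric <- runif(5000, min = 2, max = 30)  # hypothetical ALS metric for all pixels
existing <- sample(metric, 50)            # hypothetical legacy sample (metric values)
n_quant <- 10
n_new <- 50

# Quantile bins of the population; each holds about 1 / n_quant of the pixels
breaks <- quantile(metric, probs = seq(0, 1, length.out = n_quant + 1))
bin_of <- function(x) cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)
pop_bin <- bin_of(metric)
pop_density <- tabulate(pop_bin, nbins = n_quant) / length(metric)

initial_counts <- tabulate(bin_of(existing), nbins = n_quant)
sample_vals <- existing

for (i in seq_len(n_new)) {
  # Ratio of sample density to population density in each bin
  counts <- tabulate(bin_of(sample_vals), nbins = n_quant)
  ratio <- (counts / length(sample_vals)) / pop_density

  # Add one pixel from the bin with the lowest ratio
  pool <- metric[pop_bin == which.min(ratio)]
  sample_vals <- c(sample_vals, pool[sample.int(length(pool), 1)])
}

final_counts <- tabulate(bin_of(sample_vals), nbins = n_quant)
```

Comparing initial_counts with final_counts shows the sparsest bins being filled first, which is the behaviour the ratio-based step aims for.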

Example 2 🌿 Augmenting an existing sample

1️⃣ We have our existing sample

2️⃣ Now we can use the sample_ahels() algorithm with our ALS metrics

3️⃣ Specify our existing sample

4️⃣ And specify we want 50 new sample units (nSamp)

#--- simple random sampling ---#
existing <- sample_srs(raster = mr, nSamp = 50, plot = TRUE)

#--- augment sample network using sample_ahels ---#
sample_ahels(mraster = mr, # step 2: ALS metrics
             existing = existing, # step 3: existing sample
             nSamp = 50) # step 4: number of new sample units

Example 2 🌿 Augmenting an existing sample

  • Mapped result (A) and plotted result (B)
  • Note ratios (black/red) and the number of added samples (e.g. n = 2)

sample_ahels() result

  • existing only (A) and addition of new samples (B)
  • We see that metric and sample density become quite even - structurally representative

Summary

  • Structurally guided sampling methods show promise for model-based sampling
  • The sgsR package provides many methods to implement SGS approaches
  • We presented a few examples of sgsR functionality
  • If interested - come and talk to me & use the link on the slides for more info!

🤲🤝🙏 Thank you!

Special thanks to the Canadian Wood Fibre Centre for funding this research!

🐤 @GoodbodyT 🐤 @IRSS_UBC

Special thanks to my collaborators

  • Martin Queinnec
  • Joanne White
  • Andrew Hudak
  • Ruben Valbuena
  • Murray Woods
  • David Auty
  • Antoine LeBoeuf
  • Ian Sinclair
  • Grant McCartney
  • Jean-Francois Prieur
  • Piotr Tompalski

🌲👀 See for yourself!

Additional resources

This presentation was made with Quarto and will be made available on GitHub following the presentation at ForestSAT.

Gobakken, Terje, Lauri Korhonen, and Erik Næsset. 2013. “Laser-Assisted Selection of Field Plots for an Area-Based Forest Inventory.” Silva Fennica 47 (5). https://doi.org/10.14214/sf.943.
Hawbaker, Todd J., Nicholas S. Keuler, Adrian A. Lesak, Terje Gobakken, Kirk Contrucci, and Volker C. Radeloff. 2009. “Improved Estimates of Forest Vegetation Structure and Biomass with a LiDAR-Optimized Sampling Design.” Journal of Geophysical Research: Biogeosciences 114 (G2). https://doi.org/10.1029/2008jg000870.
Malone, Brendan P., Budiman Minasny, and Colby Brungard. 2019. “Some Methods to Improve the Utility of Conditioned Latin Hypercube Sampling.” PeerJ 7 (February): e6451. https://doi.org/10.7717/peerj.6451.