Introduction

pavo is an R package developed with the goal of establishing a flexible and integrated workflow for working with spectral color data. It includes functions that take advantage of new data classes to work seamlessly from importing raw data to visualization and analysis.

Although the examples in pavo deal largely with spectral reflectance data from bird feathers, the package is meant to be applicable to a range of taxa. It provides flexible ways to input spectral data from a variety of equipment manufacturers, process these data, extract variables, and produce publication-quality figures.

pavo was written with the following workflow in mind:

  1. Organize spectral data by inputting files and processing spectra (e.g., to remove noise, correct negative values, or smooth curves).
  2. Analyze the resulting files, either using typical colorimetric variables (hue, saturation, brightness) or using visual models based on perceptual data from the taxon of interest.
  3. Visualize the output, with multiple options provided for exploratory analyses.

Below we demonstrate the main functions of the package in an example workflow. The development version of pavo can be found on GitHub.

Dataset Description

The data used in this example are available from GitHub. You can download and extract them to follow the vignette.

The data consist of reflectance spectra, obtained using Avantes equipment and software, from six bird species: Northern Cardinal Cardinalis cardinalis, Wattled Jacana Jacana jacana, Baltimore Oriole Icterus galbula, Peach-fronted Parakeet Aratinga aurea, American Robin Turdus migratorius, and Sayaca Tanager Thraupis sayaca. Several individuals were measured per species, and 3 spectra were collected from each individual. The number of individuals varies among species, and the data have additional peculiarities that highlight the flexibility pavo offers, as we’ll see below.

In addition, pavo includes two datasets that can be loaded with the data() function: data(teal) and data(sicalis). Both are used in this vignette. For more information, see help(package = "pavo").

Organizing and Processing Spectral Data

Importing Data

The first thing we need to do is import the spectral data into R using the function getspec(). Since the spectra were obtained using Avantes software, we need to specify that the files have the .ttt extension. Further, the data are organized in subdirectories, one per species. getspec() can search subdirectories recursively and, if desired, include the subdirectory names in the spectra names. A final issue is that the data were collected on a computer with international number formatting, which uses commas instead of periods as the decimal separator. We can specify this in the function call.

The files were downloaded and placed in a directory called ~/pavo/vignette_data. By default, getspec() searches for files in the current working directory, but a different one can be specified:

specs <- getspec("~/pavo/vignette_data/", ext = "ttt", decimal = ",", subdir = TRUE, subdir.names = FALSE)
## 213 files found; importing spectra
## ============================================================
specs[1:10,1:4]
##     wl cardinal.0001 cardinal.0002 cardinal.0003
## 1  300        5.7453        8.0612        8.0723
## 2  301        6.0181        8.3926        8.8669
## 3  302        5.9820        8.8280        9.0680
## 4  303        6.2916        8.7621        8.7877
## 5  304        6.6277        8.6819        9.3450
## 6  305        6.3347        9.6016        9.4834
## 7  306        6.3189        9.5712        9.3533
## 8  307        6.7951        9.4650        9.9492
## 9  308        7.0758        9.4677        9.8587
## 10 309        7.2126       10.6172       10.5396
dim(specs) # the data set has 213 spectra, from 300 to 700 nm, plus a 'wl' column
## [1] 401 214
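To make the recursive search and the decimal-separator handling more concrete, here is a minimal base-R sketch of the same two ideas. It does not use getspec() and is not how pavo reads files internally. The directory name, the file names, the second file's values, and the simplified two-column file layout are all assumptions made for illustration; real instrument files usually carry additional header lines.

```r
# Build a throwaway directory tree with one subfolder per (toy) species.
# The layout (two tab-separated columns, no header) is a simplifying assumption.
root <- file.path(tempdir(), "toy_spectra")
dir.create(file.path(root, "speciesA"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "speciesB"), recursive = TRUE, showWarnings = FALSE)

# Toy values written with a comma as the decimal separator.
writeLines(c("300\t5,7453", "301\t6,0181", "302\t5,9820"),
           file.path(root, "speciesA", "speciesA.0001.ttt"))
writeLines(c("300\t1,5000", "301\t2,2500", "302\t3,1250"),
           file.path(root, "speciesB", "speciesB.0001.ttt"))

# Recursive search: files inside subdirectories are found as well.
files <- list.files(root, pattern = "\\.ttt$", recursive = TRUE,
                    full.names = TRUE)

# dec = "," tells R that "5,7453" means 5.7453.
read_one <- function(f) {
  read.table(f, sep = "\t", dec = ",", col.names = c("wl", "refl"))
}
spectra <- lapply(files, read_one)
names(spectra) <- sub("\\.ttt$", "", basename(files))

spectra[["speciesA.0001"]]
```

Without dec = ",", the second column would be read as text rather than as numbers, which is the problem the decimal argument of getspec() is there to avoid.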

When pavo imports spectra, it creates an object of class rspec, which inherits attributes from the data.frame class:

is.rspec(specs)
## [1] TRUE
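In practice, inheriting from data.frame means that ordinary data frame operations still work on the object. The sketch below illustrates this general S3 mechanism with a hypothetical class called toyspec and a hypothetical checker is.toyspec(); neither is part of pavo, and the column names and values are made up.

```r
# Toy spectra: a wavelength column plus three reflectance columns.
set.seed(42)  # arbitrary seed so the toy values are reproducible
toy <- data.frame(wl = 300:305,
                  a.0001 = runif(6),
                  a.0002 = runif(6),
                  b.0001 = runif(6))

# Put a hypothetical class in front of "data.frame"; because "data.frame"
# stays in the class vector, data frame methods keep working.
class(toy) <- c("toyspec", "data.frame")

# Hypothetical checker, analogous in spirit to is.rspec().
is.toyspec <- function(x) inherits(x, "toyspec")

is.toyspec(toy)
is.data.frame(toy)
dim(toy)

# Standard data frame indexing: keep 'wl' and the columns starting with "a".
toy_a <- toy[, c("wl", grep("^a", names(toy), value = TRUE))]
names(toy_a)
```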

If you already have multiple spectra in a single data frame that you’d like to use with pavo functions, you can use as.rspec to convert it to an rspec object. The function will attempt to identify the wavelength column automatically, or you can specify it with the whichwl argument. By default, as.rspec interpolates the reflectance data to 1-nm bins, as is commonly done for spectral analyses; this can be turned off with interp = FALSE. As an example, we will create some fake reflectance data in which the wavelength column (in 0.5-nm bins) is named wavelength rather than wl (the name pavo functions require) and is placed third rather than first.

# Create some fake reflectance data with wavelength column arbitrarily titled 
# and not first in the data frame:
fakedat <- data.frame(refl1 = rnorm(n = 801), 
                      refl2 = rnorm(n = 801), 
                      wavelength = seq(300, 700, by = .5))
head(fakedat)
##        refl1       refl2 wavelength
## 1  0.7655208 -1.45010776      300.0
## 2 -0.3232796 -2.70222640      300.5
## 3 -0.4397411  0.05708485      301.0
## 4  0.3787726  0.99941093      301.5
## 5 -0.4152657  0.37113311      302.0
## 6  0.4602692 -0.36202684      302.5
is.rspec(fakedat)
## [1] FALSE
fakedat.new <- as.rspec(fakedat)
## wavelengths found in column 3
is.rspec(fakedat.new)
## [1] TRUE
head(fakedat.new)
##    wl      refl1       refl2
## 1 300  0.7655208 -1.45010776
## 2 301 -0.4397411  0.05708485
## 3 302 -0.4152657  0.37113311
## 4 303  0.6565167  0.43887199
## 5 304 -0.1444356  0.21402262
## 6 305  0.1873939 -0.21319912

As can be seen, as.rspec renames the column containing wavelengths, sets it as the first column, interpolates the data to 1-nm bins, and converts the data to an rspec object. Note that the same output is returned when specifying whichwl = 3:

head(as.rspec(fakedat, whichwl = 3))
##    wl      refl1       refl2
## 1 300  0.7655208 -1.45010776
## 2 301 -0.4397411  0.05708485
## 3 302 -0.4152657  0.37113311
## 4 303  0.6565167  0.43887199
## 5 304 -0.1444356  0.21402262
## 6 305  0.1873939 -0.21319912
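The interpolation to 1-nm bins can be illustrated with base R alone. The following is a sketch of the idea using linear interpolation via approx(); it is an assumption that this matches what as.rspec does internally, and the object names are chosen only for this example.

```r
set.seed(1)  # arbitrary seed so the toy values are reproducible
toy <- data.frame(refl1 = rnorm(801),
                  wavelength = seq(300, 700, by = 0.5))

# Target grid: every whole nanometre within the measured range.
wl_new <- seq(ceiling(min(toy$wavelength)), floor(max(toy$wavelength)), by = 1)

# Linear interpolation of reflectance onto the 1-nm grid.
interp <- approx(x = toy$wavelength, y = toy$refl1, xout = wl_new)

toy_1nm <- data.frame(wl = interp$x, refl1 = interp$y)
head(toy_1nm)

# Whole-nanometre wavelengths were already measured in this toy data set,
# so their values pass through unchanged (rows 1, 3, 5, ... of the original).
all.equal(toy_1nm$refl1, toy$refl1[seq(1, 801, by = 2)])
```

This is consistent with the output above, where the values at 300, 301, and 302 nm match rows 1, 3, and 5 of fakedat. When the measured wavelengths do not fall on whole nanometres, the interpolated values are estimates rather than copies of measured points.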

Finally, the lim argument allows you to specify the range of wavelengths contained in the input dataset. This is useful when the dataset doesn’t contain this information (so that you cannot specify the column with whichwl and as.rspec cannot find it automatically). It is also useful for focusing on a subset of wavelengths. In our example the wavelengths range from 300 to 700 nm, but you can specify a restricted range with lim:

fakedat.new2 <- as.rspec(fakedat, lim = c(300, 500))
## wavelengths found in column 3
plot(fakedat.new2[, 2] ~ fakedat.new2[, 1], type = 'l', xlab = 'wl')
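The two uses of a wavelength range described above can be sketched in base R. This is an illustration under stated assumptions, not the implementation of as.rspec: the data are made up, the 300 to 700 nm range is assumed to be known from the instrument settings, and the measurements are assumed to be evenly spaced.

```r
# Toy reflectance values with no wavelength column at all.
set.seed(2)  # arbitrary seed so the toy values are reproducible
no_wl <- data.frame(refl1 = runif(401), refl2 = runif(401))

# Use 1: the measured range is known (assumed 300-700 nm), so an evenly
# spaced wavelength vector can be reconstructed from the two limits.
lim <- c(300, 700)
with_wl <- cbind(wl = seq(lim[1], lim[2], length.out = nrow(no_wl)), no_wl)

# Use 2: restrict the spectra to a narrower window of wavelengths.
lim2 <- c(300, 500)
restricted <- with_wl[with_wl$wl >= lim2[1] & with_wl$wl <= lim2[2], ]

range(restricted$wl)
nrow(restricted)
```

Reconstructing wavelengths from a range is only as good as the stated limits and the even-spacing assumption, so the result should be checked against the instrument settings.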