Function to compute the (base 10) log ratios of the measurements relative to standard reference values. By default a reference is provided with the package.

LogRatios(
  data,
  ref = reference$Combi,
  identifiers = c("TAX", "EL"),
  refMeasuresName = "Measure",
  refValuesName = "Standard",
  thesaurusSet = zoologThesaurus,
  joinCategories = NULL,
  mergedMeasures = NULL
)

Arguments

data

A dataframe with the input measurements.

ref

A dataframe including the measurement values used as references. The default, ref = reference$Combi, is provided as data with the zoolog package.

identifiers

A vector of column names in ref identifying a type of bone. By default identifiers = c("TAX", "EL").

refMeasuresName

The column name in ref identifying the type of bone measurement.

refValuesName

The column name in ref giving the measurement value.

thesaurusSet

A thesaurus allowing datasets with different nomenclatures to be merged. By default thesaurusSet = zoologThesaurus.

joinCategories

A list of named character vectors. Each vector is named by a category in the reference and includes a set of categories in the data for which to compute the log ratios with respect to that reference. When NULL (default) no grouping is considered.

mergedMeasures

A list of character vectors or a single character vector. Each vector identifies a set of measures that the data presents merged in the same column, named as any of them. This practice only makes sense if only one of the measures can appear in each bone element.

Value

A dataframe including the input dataframe and additional columns, one for each extracted log ratio for each relevant measurement in the reference. The names of the added columns are constructed by prefixing each measurement name with the internal variable logPrefix.

If the input dataframe includes additional S3 classes (such as "tbl_df"), they are also passed to the output.

Details

Each log ratio is defined as the decimal logarithm of the ratio of the variable of interest to a corresponding reference value.
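As a minimal sketch of this definition in base R (the two values below are invented for illustration and are not taken from any zoolog reference):

```r
# Hypothetical measurement and reference value (in mm); not real zoolog data.
measurement <- 34.4
standard <- 40.0

# Decimal logarithm of the ratio of the measurement to the reference value.
logRatio <- log10(measurement / standard)

# Equivalent form: difference of the two decimal logarithms.
logRatioDiff <- log10(measurement) - log10(standard)

# A negative value means the specimen is smaller than the reference,
# zero means equal size, and a positive value means larger.
logRatio < 0
```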

The identifiers are expected to determine corresponding columns in both data and reference. Each value in these columns identifies the type of bone. By default this is determined by a taxon and a bone element. For any case in the data, the log ratios are computed with respect to the reference values in the same bone type. If the reference does not include that bone type, the corresponding log ratios are set to NA.
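The matching described above can be sketched in base R as follows. This is only an illustration of the idea, not the package's implementation: the data, the reference values, and the helper variables (refBd, keyData, keyRef) are all invented, and the column names simply follow the documented defaults.

```r
# Hypothetical data and reference; values are invented for illustration.
data <- data.frame(
  TAX = c("sheep", "sheep", "hare"),
  EL  = c("humerus", "tibia", "humerus"),
  Bd  = c(28.0, 25.0, 10.0),
  stringsAsFactors = FALSE
)
ref <- data.frame(
  TAX      = c("sheep", "sheep"),
  EL       = c("humerus", "tibia"),
  Measure  = c("Bd", "Bd"),
  Standard = c(30.0, 26.0),
  stringsAsFactors = FALSE
)

# Keep only the reference rows for the measure of interest.
refBd <- ref[ref$Measure == "Bd", c("TAX", "EL", "Standard")]

# Look up the reference row of each case by its (TAX, EL) pair.
# match() returns NA when the bone type is absent from the reference.
keyData <- paste(data$TAX, data$EL, sep = "|")
keyRef  <- paste(refBd$TAX, refBd$EL, sep = "|")
standard <- refBd$Standard[match(keyData, keyRef)]

# Cases without a matching reference (here the hare humerus) get NA.
data$logBd <- log10(data$Bd / standard)
data
```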

For some applications it can be useful to group a set of bone types into the same reference category when computing the log ratios. The parameter joinCategories allows this grouping. joinCategories must be a list of named vectors, each including the set of categories in the data which should be mapped to the reference category given by its name.

This can be applied to group different species into a single reference species. For instance, sheep, capra, and doubtful cases between both (sheep/capra) can be grouped and matched to the same reference for sheep by setting joinCategories = list(sheep = c("sheep", "capra", "oc")). Similarly, it can be applied to group different bone elements into a single reference (see the example below for undetermined phalanges).

Note that the joinCategories option does not remove the distinction between the different bone types in the data. It only indicates that, for any of them, the log ratios must be computed from the same reference.
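To illustrate the idea, the following base R sketch maps data categories to reference categories using a list in the joinCategories format. It is an assumption-laden illustration rather than the package's internal code: the taxon vector and the variables lookup and refTaxon are hypothetical.

```r
# Mapping in the joinCategories format described above.
joinCategories <- list(sheep = c("sheep", "capra", "oc"))

# Invented taxon column of a data set.
taxon <- c("sheep", "capra", "oc", "cattle")

# Build a lookup from each data category to its reference category.
lookup <- setNames(rep(names(joinCategories), lengths(joinCategories)),
                   unlist(joinCategories))

# Categories not listed keep their own name as reference category.
refTaxon <- unname(ifelse(taxon %in% names(lookup), lookup[taxon], taxon))

# The original column is untouched; only the reference used changes.
data.frame(taxon = taxon, refTaxon = refTaxon)
```

The original taxon values stay in the data, which mirrors the point that grouping affects only which reference is used.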

Some measures are restricted to a subset of bones. For instance, GLl is only relevant for the astragalus, while GL is not applicable to it. Thus, there cannot be any ambiguity between the two measures, since they can be identified by the bone element. This is why some users keep simplified datasets in which a single column records either GL or GLl. The optional parameter mergedMeasures facilitates the processing of this type of simplified dataset. For the example above, mergedMeasures = list(c("GL", "GLl")) automatically selects, for each bone element, the corresponding measure present in the reference.

Observe that if mergedMeasures is set to measures that are not mutually exclusive, the behaviour is unpredictable.
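A rough base R sketch of how merged measures could be resolved by bone element is given below. It is not the package's implementation: the reference values, the data, and the columns usedMeasure and logRatio are invented for illustration.

```r
# Hypothetical reference: GLl is defined for the astragalus, GL for the humerus.
ref <- data.frame(
  EL       = c("astragalus", "humerus"),
  Measure  = c("GLl", "GL"),
  Standard = c(30.0, 150.0),
  stringsAsFactors = FALSE
)

# Simplified data set where a single column "GL" holds either GL or GLl.
data <- data.frame(
  EL = c("astragalus", "humerus"),
  GL = c(28.5, 140.0),
  stringsAsFactors = FALSE
)
mergedMeasures <- c("GL", "GLl")

# For each case, pick the reference row whose element matches and whose
# measure is one of the merged measures.
refMerged <- ref[ref$Measure %in% mergedMeasures, ]
idx <- match(data$EL, refMerged$EL)
data$usedMeasure <- refMerged$Measure[idx]
data$logRatio <- log10(data$GL / refMerged$Standard[idx])
data
```

In this sketch, if the reference held both GL and GLl for the same element, match() would silently take the first row, which shows why the measures need to be mutually exclusive.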

Examples

## Read an example dataset:
dataFile <- system.file("extdata", "dataValenzuelaLamas2008.csv.gz",
                        package="zoolog")
dataExample <- utils::read.csv2(dataFile,
                                na.strings = "",
                                encoding = "UTF-8",
                                stringsAsFactors = TRUE)
## For illustration purposes we keep now only a subset of cases to make
## the example run sufficiently fast.
## Avoid this step if you want to process the full example dataset.
dataExample <- dataExample[145:1000, ]
## We can observe the first lines (excluding some columns for visibility):
head(dataExample)[, -c(6:20,32:64)]
#>     Site N.inv    UE Especie   Os   GL   Bp   Dp   SD   DD   Bd Dd BT GLc BFd
#> 145  ALP  2639  8351    sudo 1fal 34.4 16.4 16.6 13.6 15.7 10.8 NA NA  NA  NA
#> 146  ALP  1092 10283    sudo 1fal 35.8 15.9 16.2 12.2 14.0 10.8 NA NA  NA  NA
#> 147  ALP   585 10182    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#> 148  ALP   586 10182    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#> 149  ALP   589 10182    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#> 150  ALP  2049  7083    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#>     Dl
#> 145 NA
#> 146 NA
#> 147 NA
#> 148 NA
#> 149 NA
#> 150 NA
## Compute the log-ratios with respect to the default reference in the
## package zoolog:
dataExampleWithLogs <- LogRatios(dataExample)
## The output data frame includes new columns with the log-ratios of the
## present measurements, in both data and reference, with a "log" prefix:
head(dataExampleWithLogs)[, -c(6:20,32:64)]
#>     Site N.inv    UE Especie   Os   GL   Bp   Dp   SD   DD   Bd Dd BT GLc BFd
#> 145  ALP  2639  8351    sudo 1fal 34.4 16.4 16.6 13.6 15.7 10.8 NA NA  NA  NA
#> 146  ALP  1092 10283    sudo 1fal 35.8 15.9 16.2 12.2 14.0 10.8 NA NA  NA  NA
#> 147  ALP   585 10182    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#> 148  ALP   586 10182    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#> 149  ALP   589 10182    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#> 150  ALP  2049  7083    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#>     Dl logGL logBp logDp logSD logBd logDd logBT logGLc logBFd logDl logGB
#> 145 NA    NA    NA    NA    NA    NA    NA    NA     NA     NA    NA    NA
#> 146 NA    NA    NA    NA    NA    NA    NA    NA     NA     NA    NA    NA
#> 147 NA    NA    NA    NA    NA    NA    NA    NA     NA     NA    NA    NA
#> 148 NA    NA    NA    NA    NA    NA    NA    NA     NA     NA    NA    NA
#> 149 NA    NA    NA    NA    NA    NA    NA    NA     NA     NA    NA    NA
#> 150 NA    NA    NA    NA    NA    NA    NA    NA     NA     NA    NA    NA
#>     logSLC logGLP logBG logLG logDPA logBPC logLA logLAR logSH logSB logL logH
#> 145     NA     NA    NA    NA     NA     NA    NA     NA    NA    NA   NA   NA
#> 146     NA     NA    NA    NA     NA     NA    NA     NA    NA    NA   NA   NA
#> 147     NA     NA    NA    NA     NA     NA    NA     NA    NA    NA   NA   NA
#> 148     NA     NA    NA    NA     NA     NA    NA     NA    NA    NA   NA   NA
#> 149     NA     NA    NA    NA     NA     NA    NA     NA    NA    NA   NA   NA
#> 150     NA     NA    NA    NA     NA     NA    NA     NA    NA    NA   NA   NA
## Compute the log-ratios with respect to a different reference:
dataExampleWithLogs2 <- LogRatios(dataExample, ref = reference$Basel)
head(dataExampleWithLogs2)[, -c(6:20,32:64)]
#>     Site N.inv    UE Especie   Os   GL   Bp   Dp   SD   DD   Bd Dd BT GLc BFd
#> 145  ALP  2639  8351    sudo 1fal 34.4 16.4 16.6 13.6 15.7 10.8 NA NA  NA  NA
#> 146  ALP  1092 10283    sudo 1fal 35.8 15.9 16.2 12.2 14.0 10.8 NA NA  NA  NA
#> 147  ALP   585 10182    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#> 148  ALP   586 10182    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#> 149  ALP   589 10182    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#> 150  ALP  2049  7083    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#>     Dl logGL logBp logSD logDD logBd logBT logBFd logDl logGB logSLC logGLP
#> 145 NA    NA    NA    NA    NA    NA    NA     NA    NA    NA     NA     NA
#> 146 NA    NA    NA    NA    NA    NA    NA     NA    NA    NA     NA     NA
#> 147 NA    NA    NA    NA    NA    NA    NA     NA    NA    NA     NA     NA
#> 148 NA    NA    NA    NA    NA    NA    NA     NA    NA    NA     NA     NA
#> 149 NA    NA    NA    NA    NA    NA    NA     NA    NA    NA     NA     NA
#> 150 NA    NA    NA    NA    NA    NA    NA     NA    NA    NA     NA     NA
#>     logBG logLG logDPA logBPC logLA logLAR logSH
#> 145    NA    NA     NA     NA    NA     NA    NA
#> 146    NA    NA     NA     NA    NA     NA    NA
#> 147    NA    NA     NA     NA    NA     NA    NA
#> 148    NA    NA     NA     NA    NA     NA    NA
#> 149    NA    NA     NA     NA    NA     NA    NA
#> 150    NA    NA     NA     NA    NA     NA    NA
## Define an alternative reference that combines the reference database
## differently:
refComb <- list(cattle = "Nieto", sheep = "Davis", Goat = "Clutton",
                pig = "Albarella", redDeer = "Basel")
userReference <- AssembleReference(refComb)
## Compute the log-ratios with respect to this alternative reference:
dataExampleWithLogs3 <- LogRatios(dataExample, ref = userReference)
## We may be interested in including the first and second phalanges without
## anterior-posterior identification ("phal 1" and "phal 2"), by computing
## their log ratios with respect to the reference of the corresponding
## anterior phalanges ("phal 1 ant" and "phal 2 ant", respectively).
## For this we use the optional argument joinCategories:
categoriesPhalAnt <- list('phal 1 ant' = c("phal 1 ant", "phal 1"),
                          'phal 2 ant' = c("phal 2 ant", "phal 2"))
dataExampleWithLogs4 <- LogRatios(dataExample,
                                  joinCategories = categoriesPhalAnt)
head(dataExampleWithLogs4)[, -c(6:20,32:64)]
#>     Site N.inv    UE Especie   Os   GL   Bp   Dp   SD   DD   Bd Dd BT GLc BFd
#> 145  ALP  2639  8351    sudo 1fal 34.4 16.4 16.6 13.6 15.7 10.8 NA NA  NA  NA
#> 146  ALP  1092 10283    sudo 1fal 35.8 15.9 16.2 12.2 14.0 10.8 NA NA  NA  NA
#> 147  ALP   585 10182    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#> 148  ALP   586 10182    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#> 149  ALP   589 10182    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#> 150  ALP  2049  7083    sudo 1fal   NA   NA   NA   NA   NA   NA NA NA  NA  NA
#>     Dl logGL logBp logDp logSD logBd logDd logBT logGLc logBFd logDl logGB
#> 145 NA    NA    NA    NA    NA    NA    NA    NA     NA     NA    NA    NA
#> 146 NA    NA    NA    NA    NA    NA    NA    NA     NA     NA    NA    NA
#> 147 NA    NA    NA    NA    NA    NA    NA    NA     NA     NA    NA    NA
#> 148 NA    NA    NA    NA    NA    NA    NA    NA     NA     NA    NA    NA
#> 149 NA    NA    NA    NA    NA    NA    NA    NA     NA     NA    NA    NA
#> 150 NA    NA    NA    NA    NA    NA    NA    NA     NA     NA    NA    NA
#>     logSLC logGLP logBG logLG logDPA logBPC logLA logLAR logSH logSB logL logH
#> 145     NA     NA    NA    NA     NA     NA    NA     NA    NA    NA   NA   NA
#> 146     NA     NA    NA    NA     NA     NA    NA     NA    NA    NA   NA   NA
#> 147     NA     NA    NA    NA     NA     NA    NA     NA    NA    NA   NA   NA
#> 148     NA     NA    NA    NA     NA     NA    NA     NA    NA    NA   NA   NA
#> 149     NA     NA    NA    NA     NA     NA    NA     NA    NA    NA   NA   NA
#> 150     NA     NA    NA    NA     NA     NA    NA     NA    NA    NA   NA   NA