LogRatios.Rd
Computes the (base 10) log ratios of the measurements relative to standard reference values. The default reference and several alternative references are provided with the package, but users can supply their own references if desired.
LogRatios(
  data,
  ref = reference$Combi,
  identifiers = c("Taxon", "Element"),
  refMeasuresName = "Measure",
  refValuesName = "Standard",
  thesaurusSet = zoologThesaurus,
  taxonomy = zoologTaxonomy,
  joinCategories = NULL,
  mergedMeasures = NULL,
  useGenusIfUnambiguous = TRUE
)
data
A dataframe with the input measurements.
ref
A dataframe including the measurement values used as references. The default ref = reference$Combi and other reference sets are provided with the package zoolog.
identifiers
A vector of column names in ref identifying a type of bone. By default identifiers = c("Taxon", "Element").
refMeasuresName
The column name in ref identifying the type of bone measurement.
refValuesName
The column name in ref giving the measurement value.
thesaurusSet
A thesaurus allowing datasets with different nomenclatures to be merged. By default thesaurusSet = zoologThesaurus.
taxonomy
A taxonomy allowing the automatic detection of data and reference sharing the same genus (or higher taxonomic rank), although of different species. By default taxonomy = zoologTaxonomy.
joinCategories
A list of named character vectors. Each vector is named by a category in the reference and includes a set of categories in the data for which to compute the log ratios with respect to that reference. When NULL (default) no grouping is considered.
mergedMeasures
A list of character vectors or a single character vector. Each vector identifies a set of measures that the data presents merged in the same column, named as any of them. This practice only makes sense if only one of the measures can appear in each bone element.
useGenusIfUnambiguous
Boolean. If TRUE (default), cases in the data are matched to a reference sharing the same genus, rather than requiring the same species.
A dataframe including the input dataframe and additional columns, one for each extracted log ratio for each relevant measurement in the reference. The names of the added columns are constructed by prefixing each measurement name with the internal variable logPrefix.
If the input dataframe includes additional S3 classes (such as "tbl_df"), they are also passed to the output.
Each log ratio is defined as the decimal logarithm of the ratio of the variable of interest to a corresponding reference value.
The identifiers are expected to determine corresponding columns in both data and reference. Each value in these columns identifies the type of bone. By default this is determined by a taxon and a bone element. For any case in the data, the log ratios are computed with respect to the reference values for the same bone type. If the reference does not include that bone type, the corresponding log ratios are set to NA.
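The matching rule described above can be sketched in base R. This is only a toy illustration, not the package's implementation: the numeric values are invented, and the long-format layout of the hypothetical `measurements` table (one row per measurement) is an assumption made to keep the sketch short.

```r
# Toy reference in the layout described above: identifiers, measure, standard.
# All numeric values are invented for illustration.
ref <- data.frame(
  Taxon    = c("Ovis aries", "Ovis aries"),
  Element  = c("humerus", "humerus"),
  Measure  = c("Bd", "SD"),
  Standard = c(30, 15),
  stringsAsFactors = FALSE
)

# Toy data in long format (one row per measurement); hypothetical values.
measurements <- data.frame(
  Taxon   = c("Ovis aries", "Ovis aries", "Bos taurus"),
  Element = c("humerus", "humerus", "humerus"),
  Measure = c("Bd", "SD", "Bd"),
  Value   = c(27, 15, 80),
  stringsAsFactors = FALSE
)

# Left join on the bone type (the identifiers) and the measure name.
merged <- merge(measurements, ref,
                by = c("Taxon", "Element", "Measure"),
                all.x = TRUE, sort = FALSE)

# Decimal logarithm of the ratio; bone types absent from the reference give NA.
merged$logRatio <- log10(merged$Value / merged$Standard)
print(merged)
```

In this sketch the Bos taurus case has no counterpart in the toy reference, so its log ratio is NA, while a measurement equal to its standard gives a log ratio of 0.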
The taxonomy allows the matching of data and reference by genus, instead of by species. This is the default behaviour with useGenusIfUnambiguous = TRUE, unless there is some ambiguity, that is, unless the reference includes more than one species of the same genus. For instance, reference$Combi includes a reference for Sus scrofa. If the data includes cases of Sus domesticus, their log ratios will be computed with respect to the provided reference for Sus scrofa. However, a warning is given to inform the user of this assumption and to let them know that this can be prevented by setting useGenusIfUnambiguous = FALSE.
For some applications it can be useful to group a set of bone types into the same reference category to compute the log ratios. The parameter joinCategories allows this grouping. joinCategories must be a list of named vectors, each including the set of categories in the data which should be mapped to the reference category given by its name. This can be applied to group different species into a single reference species. For instance, sheep, goat, and doubtful cases between the two (sheep/goat) can be grouped and matched to the same reference for sheep by setting joinCategories = list(sheep = c("sheep", "goat", "oc")).
The zoologTaxonomy can also be used for that purpose, through the function SubtaxonomySet, as joinCategories = list(sheep = SubtaxonomySet("Caprini")).
Similarly, joinCategories can be applied to group different bone elements into a single reference (see the example below for undetermined phalanges).
Note that the joinCategories option does not remove the distinction between the different bone types in the data. It only indicates that, for any of them, the log ratios must be computed from the same reference.
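The mapping that joinCategories expresses can be pictured with a small base R sketch. It is an assumption-laden illustration rather than the package's code: the `taxon` vector is hypothetical, and the lookup approach is just one simple way to invert the named list.

```r
# Named list as described above: the name is the reference category,
# the values are the data categories mapped to it.
joinCategories <- list(sheep = c("sheep", "goat", "oc"))

# Invert the list into a lookup vector: data category -> reference category.
lookup <- setNames(
  rep(names(joinCategories), lengths(joinCategories)),
  unlist(joinCategories, use.names = FALSE)
)

# Hypothetical taxon column of a dataset.
taxon <- c("sheep", "goat", "oc", "cattle")

# Category used to look up the reference; the original column is left untouched,
# so the distinction between bone types in the data is preserved.
isJoined <- taxon %in% names(lookup)
refTaxon <- taxon
refTaxon[isJoined] <- unname(lookup[taxon[isJoined]])
print(data.frame(taxon, refTaxon))
```

Here the three caprine categories point to the sheep reference, while "cattle" keeps its own category.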
Using the taxonomy, the presence of cases identified by higher taxonomic ranks is also automatically detected. For instance, if some partially identified cases have been recorded as "Ovis/Capra", this is recognized as denoting the tribe Caprini, which includes several possible species. A warning is then given, informing the user that these cases have been detected and that any of the corresponding species in the reference can be used for them by setting the argument joinCategories (unless this has already been done).
There are some measures that, for the most usual taxa, are restricted to a subset of bones. For instance, for Bos, Ovis, Capra, and Sus, the measure GLl is only relevant for the astragalus, while GL is not applicable to it. Thus, there cannot be any ambiguity between the two measures, since they can be identified by the bone element. This is why some users have simplified datasets where a single column records either GL or GLl. The optional parameter mergedMeasures facilitates the processing of this type of simplified dataset. For the example above, mergedMeasures = list(c("GL", "GLl")) automatically selects, for each bone element, the corresponding measure present in the reference. Observe that if mergedMeasures is set to measures that are not mutually exclusive, the behaviour is unpredictable.
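The selection performed for merged measures can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: the `ref` and `bones` tables and all their values are invented, the taxon identifier is omitted, and the sketch relies on each bone element having exactly one of the merged measures in the reference.

```r
# Hypothetical reference: which of the merged measures applies to each element.
ref <- data.frame(
  Element  = c("astragalus", "humerus"),
  Measure  = c("GLl", "GL"),
  Standard = c(30, 150),   # invented values
  stringsAsFactors = FALSE
)

# Simplified dataset: a single "GL" column holds GL or GLl depending on the bone.
bones <- data.frame(
  Element = c("astragalus", "humerus"),
  GL      = c(28.5, 140),
  stringsAsFactors = FALSE
)

mergedMeasures <- c("GL", "GLl")

# For each bone element, pick the merged measure that the reference provides.
# match() returns the first hit only, hence the mutual-exclusivity assumption.
candidates <- ref[ref$Measure %in% mergedMeasures, ]
idx <- match(bones$Element, candidates$Element)
bones$measureUsed <- candidates$Measure[idx]
bones$logRatio <- log10(bones$GL / candidates$Standard[idx])
print(bones)
```

If two merged measures were both present in the reference for the same element, a first-match rule like this one would silently pick one of them, which illustrates why such a setting gives unreliable results.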
## Read an example dataset:
dataFile <- system.file("extdata", "dataValenzuelaLamas2008.csv.gz",
                        package = "zoolog")
dataExample <- utils::read.csv2(dataFile,
                                na.strings = "",
                                encoding = "UTF-8")
## For illustration purposes we now keep only a subset of cases, so that
## the example runs sufficiently fast.
## Skip this step if you want to process the full example dataset.
dataExample <- dataExample[1:400, ]
## We can observe the first lines (excluding some columns for visibility):
head(dataExample)[, -c(6:20,32:64)]
#> Site N.inv UE Especie Os GL Bp Dp SD DD Bd Dd BT
#> 1 ALP 4918 10364 bota 1 fal 54.0 31.3 30.6 28.1 26.3 27.5 20.0 NA
#> 2 ALP 4919 10364 bota 1 fal 54.5 27.9 31.8 26.0 22.8 25.3 19.5 NA
#> 3 ALP 3453 10410 ovar 1fal ant 27.1 9.9 12.3 17.9 9.0 9.0 NA NA
#> 4 ALP 3455 10410 ovar 1fal ant 27.6 9.6 12.2 7.6 8.9 8.3 NA NA
#> 5 ALP 4245 7036 cahi hum NA 128.3 NA 12.9 NA 27.4 26.6 23.6
#> 6 ALP 4674 10227 cahi hum NA NA NA NA NA 26.0 25.7 22.3
#> GLc BFd Dl
#> 1 NA NA NA
#> 2 NA NA NA
#> 3 NA NA NA
#> 4 NA NA NA
#> 5 NA NA NA
#> 6 NA NA NA
## Compute the log-ratios with respect to the default reference in the
## package zoolog:
dataExampleWithLogs <- LogRatios(dataExample)
#> Warning: Reference for Sus scrofa used for cases of Sus domesticus.
#> Reference for Sus scrofa used for cases of Sus.
#> Set useGenusIfUnambiguous to FALSE if this behaviour is not desired.
#> Warning: Data includes some cases recorded as
#> * Caprini (which is a Tribe)
#> for which the reference for Ovis aries or Capra hircus could be used.
#> Set joinCategories as appropriate if you want to use any of them.
## The output data frame includes new columns with the log-ratios of the
## measurements present in both data and reference, with a "log" prefix:
head(dataExampleWithLogs)[, -c(6:20,32:64)]
#> Site N.inv UE Especie Os GL Bp Dp SD DD Bd Dd BT
#> 1 ALP 4918 10364 bota 1 fal 54.0 31.3 30.6 28.1 26.3 27.5 20.0 NA
#> 2 ALP 4919 10364 bota 1 fal 54.5 27.9 31.8 26.0 22.8 25.3 19.5 NA
#> 3 ALP 3453 10410 ovar 1fal ant 27.1 9.9 12.3 17.9 9.0 9.0 NA NA
#> 4 ALP 3455 10410 ovar 1fal ant 27.6 9.6 12.2 7.6 8.9 8.3 NA NA
#> 5 ALP 4245 7036 cahi hum NA 128.3 NA 12.9 NA 27.4 26.6 23.6
#> 6 ALP 4674 10227 cahi hum NA NA NA NA NA 26.0 25.7 22.3
#> GLc BFd Dl logGL logBp logDp logSD logBd logDd
#> 1 NA NA NA NA NA NA NA NA NA
#> 2 NA NA NA NA NA NA NA NA NA
#> 3 NA NA NA NA -0.07991177 -0.07265930 0.2629585 -0.08911977 NA
#> 4 NA NA NA NA -0.09327573 -0.07620458 -0.1090810 -0.12428419 NA
#> 5 NA NA NA NA 0.40167955 NA -0.2116296 -0.15130497 -0.06787875
#> 6 NA NA NA NA NA NA NA -0.17408218 -0.08282727
#> logBT logGLc logBFd logDl logGB logSLC logGLP logBG logLG logDPA logBPC logLA
#> 1 NA NA NA NA NA NA NA NA NA NA NA NA
#> 2 NA NA NA NA NA NA NA NA NA NA NA NA
#> 3 NA NA NA NA NA NA NA NA NA NA NA NA
#> 4 NA NA NA NA NA NA NA NA NA NA NA NA
#> 5 NA NA NA NA NA NA NA NA NA NA NA NA
#> 6 NA NA NA NA NA NA NA NA NA NA NA NA
#> logLAR logSH logSB logL logH
#> 1 NA NA NA NA NA
#> 2 NA NA NA NA NA
#> 3 NA NA NA NA NA
#> 4 NA NA NA NA NA
#> 5 NA NA NA NA NA
#> 6 NA NA NA NA NA
## Compute the log-ratios with respect to a different reference:
dataExampleWithLogs2 <- LogRatios(dataExample, ref = reference$Basel)
#> Warning: Reference for Ovis orientalis used for cases of Ovis aries.
#> Reference for Sus scrofa used for cases of Sus domesticus.
#> Reference for Sus scrofa used for cases of Sus.
#> Set useGenusIfUnambiguous to FALSE if this behaviour is not desired.
#> Warning: Data includes some cases recorded as
#> * Caprini (which is a Tribe)
#> for which the reference for Ovis orientalis or Capra hircus could be used.
#> Set joinCategories as appropriate if you want to use any of them.
head(dataExampleWithLogs2)[, -c(6:20,32:64)]
#> Site N.inv UE Especie Os GL Bp Dp SD DD Bd Dd BT
#> 1 ALP 4918 10364 bota 1 fal 54.0 31.3 30.6 28.1 26.3 27.5 20.0 NA
#> 2 ALP 4919 10364 bota 1 fal 54.5 27.9 31.8 26.0 22.8 25.3 19.5 NA
#> 3 ALP 3453 10410 ovar 1fal ant 27.1 9.9 12.3 17.9 9.0 9.0 NA NA
#> 4 ALP 3455 10410 ovar 1fal ant 27.6 9.6 12.2 7.6 8.9 8.3 NA NA
#> 5 ALP 4245 7036 cahi hum NA 128.3 NA 12.9 NA 27.4 26.6 23.6
#> 6 ALP 4674 10227 cahi hum NA NA NA NA NA 26.0 25.7 22.3
#> GLc BFd Dl logGL logBp logSD logDD logBd logBT logBFd logDl
#> 1 NA NA NA NA NA NA NA NA NA NA NA
#> 2 NA NA NA NA NA NA NA NA NA NA NA
#> 3 NA NA NA NA NA NA NA NA NA NA NA
#> 4 NA NA NA NA NA NA NA NA NA NA NA
#> 5 NA NA NA NA NA -0.2064284 NA -0.09308922 -0.1260874 NA NA
#> 6 NA NA NA NA NA NA NA -0.11586643 -0.1506945 NA NA
#> logGB logSLC logGLP logBG logLG logDPA logBPC logLA logLAR logSH
#> 1 NA NA NA NA NA NA NA NA NA NA
#> 2 NA NA NA NA NA NA NA NA NA NA
#> 3 NA NA NA NA NA NA NA NA NA NA
#> 4 NA NA NA NA NA NA NA NA NA NA
#> 5 NA NA NA NA NA NA NA NA NA NA
#> 6 NA NA NA NA NA NA NA NA NA NA
## Define an alternative reference by combining the references' database
## differently:
refComb <- list(cattle = "Nieto", sheep = "Davis", Goat = "Clutton",
                pig = "Albarella", redDeer = "Basel")
userReference <- AssembleReference(refComb)
## Compute the log-ratios with respect to this alternative reference:
dataExampleWithLogs3 <- LogRatios(dataExample, ref = userReference)
#> Warning: Reference for Sus domesticus used for cases of Sus.
#> Set useGenusIfUnambiguous to FALSE if this behaviour is not desired.
#> Warning: Data includes some cases recorded as
#> * Caprini (which is a Tribe)
#> for which the reference for Ovis aries or Capra hircus could be used.
#> Set joinCategories as appropriate if you want to use any of them.
## We can be interested in including the first and second phalanges without
## anterior-posterior identification ("phal 1" and "phal 2"), by computing
## their log ratios with respect to the reference of the corresponding
## anterior phalanges ("phal 1 ant" and "phal 2 ant", respectively).
## For this we use the optional argument joinCategories:
categoriesPhalAnt <- list('phal 1 ant' = c("phal 1 ant", "phal 1"),
                          'phal 2 ant' = c("phal 2 ant", "phal 2"))
dataExampleWithLogs4 <- LogRatios(dataExample,
                                  joinCategories = categoriesPhalAnt)
#> Warning: Reference for Sus scrofa used for cases of Sus domesticus.
#> Reference for Sus scrofa used for cases of Sus.
#> Set useGenusIfUnambiguous to FALSE if this behaviour is not desired.
#> Warning: Data includes some cases recorded as
#> * Caprini (which is a Tribe)
#> for which the reference for Ovis aries or Capra hircus could be used.
#> Set joinCategories as appropriate if you want to use any of them.
head(dataExampleWithLogs4)[, -c(6:20,32:64)]
#> Site N.inv UE Especie Os GL Bp Dp SD DD Bd Dd BT
#> 1 ALP 4918 10364 bota 1 fal 54.0 31.3 30.6 28.1 26.3 27.5 20.0 NA
#> 2 ALP 4919 10364 bota 1 fal 54.5 27.9 31.8 26.0 22.8 25.3 19.5 NA
#> 3 ALP 3453 10410 ovar 1fal ant 27.1 9.9 12.3 17.9 9.0 9.0 NA NA
#> 4 ALP 3455 10410 ovar 1fal ant 27.6 9.6 12.2 7.6 8.9 8.3 NA NA
#> 5 ALP 4245 7036 cahi hum NA 128.3 NA 12.9 NA 27.4 26.6 23.6
#> 6 ALP 4674 10227 cahi hum NA NA NA NA NA 26.0 25.7 22.3
#> GLc BFd Dl logGL logBp logDp logSD logBd
#> 1 NA NA NA NA 0.048386306 0.008600172 0.08697848 0.02435935
#> 2 NA NA NA NA -0.001553828 0.025305865 0.05324551 -0.01185283
#> 3 NA NA NA NA -0.079911767 -0.072659295 0.26295847 -0.08911977
#> 4 NA NA NA NA -0.093275728 -0.076204576 -0.10908097 -0.12428419
#> 5 NA NA NA NA 0.401679554 NA -0.21162958 -0.15130497
#> 6 NA NA NA NA NA NA NA -0.17408218
#> logDd logBT logGLc logBFd logDl logGB logSLC logGLP logBG logLG logDPA
#> 1 -0.006466042 NA NA NA NA NA NA NA NA NA NA
#> 2 -0.017461427 NA NA NA NA NA NA NA NA NA NA
#> 3 NA NA NA NA NA NA NA NA NA NA NA
#> 4 NA NA NA NA NA NA NA NA NA NA NA
#> 5 -0.067878752 NA NA NA NA NA NA NA NA NA NA
#> 6 -0.082827266 NA NA NA NA NA NA NA NA NA NA
#> logBPC logLA logLAR logSH logSB logL logH
#> 1 NA NA NA NA NA NA NA
#> 2 NA NA NA NA NA NA NA
#> 3 NA NA NA NA NA NA NA
#> 4 NA NA NA NA NA NA NA
#> 5 NA NA NA NA NA NA NA
#> 6 NA NA NA NA NA NA NA