`CondenseLogs.Rd`

This function condenses the calculated log-ratio values into a reduced number
of features by grouping log-ratio values and selecting or calculating a
feature value. By default, each selected group represents a single dimension,
i.e. `Length` and `Width`. Only one feature is extracted per group.
Currently, two methods are possible: priority (default) or average.

    CondenseLogs(
      data,
      grouping = list(
        Length = c("GL", "GLl", "GLm", "HTC"),
        Width = c("BT", "Bd", "Bp", "SD", "Bfd", "Bfp")
      ),
      method = "priority"
    )

Argument | Description
---|---
`data` | A dataframe with the input measurements.
`grouping` | A list of named character vectors, one vector per selected group. Each vector gives the measurements of the group in order of priority. By default the groups are `Length = c("GL", "GLl", "GLm", "HTC")` and `Width = c("BT", "Bd", "Bp", "SD", "Bfd", "Bfp")`.
`method` | Character string indicating which method to use for extracting the condensed features. Currently accepted methods: `"priority"` (default) and `"average"`.

A dataframe including the input dataframe and additional columns, one
for each extracted condensed feature, with the corresponding name given in
`grouping`.

This operation is motivated by two circumstances. First, not all measurements are available for every bone specimen, which obstructs their direct comparison and statistical analysis. Second, several measurements can be strongly correlated (e.g. SD and Bd both represent bone width). Thus, considering them as independent would produce an over-representation of bone remains with more measurements per axis. Condensing each group of measurements into a single feature (e.g. one measure per axis) palliates both problems.

Observe that an important property of the log-ratios from a reference is that
they make the different measures comparable. For instance, if a bone is
scaled with respect to the reference, so that it homogeneously doubles its
width, then all width-related measures
(*BT*, *Bd*, *Bp*, *SD*, ...) will give the
same log-ratio (`log(2)`). In contrast, the
absolute measures are not directly comparable.
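
The scaling property can be checked with a small base R sketch. The reference values below are invented for illustration, and base 10 is an assumption made only for this example; the equality of the log-ratios holds for any base.

```r
# Hypothetical reference measurements (mm) for the width group of one element
reference <- c(BT = 30, Bd = 32, Bp = 35, SD = 15)

# A specimen that is homogeneously twice as wide as the reference
specimen <- 2 * reference

# Log-ratios with respect to the reference (base 10 assumed for illustration)
logRatios <- log10(specimen / reference)

# The absolute measures differ, but every log-ratio equals log10(2)
print(specimen)
print(logRatios)
```

Because all four log-ratios coincide, any one of them can stand in for the width of the specimen, which is what makes condensing a group into a single feature meaningful.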

The measurement names in the grouping list are given without the
`logPrefix`, but the selection is made from the log-ratios.
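
As a rough illustration of this naming convention, the sketch below maps the measurement names of each group to log-ratio column names. The prefix value `"log"` is an assumption inferred from the column names in the examples (`logGL`, `logBd`, ...), and `availableColumns` is a made-up set of columns; this is not the package's internal code.

```r
# Default grouping, with measurement names given without the prefix
grouping <- list(
  Length = c("GL", "GLl", "GLm", "HTC"),
  Width = c("BT", "Bd", "Bp", "SD", "Bfd", "Bfp")
)

# Assumed prefix, inferred from the example column names
logPrefix <- "log"

# Names of the log-ratio columns that each group refers to
logColumns <- lapply(grouping, function(measures) paste0(logPrefix, measures))

# Hypothetical log-ratio columns present in a dataset
availableColumns <- c("logGL", "logBp", "logSD", "logBd")

# Columns actually usable per group, kept in order of priority
selected <- lapply(logColumns, intersect, availableColumns)
print(selected)
```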

The default method is `"priority"`, which selects the first available
log-ratio in each group. The method `"average"` extracts the
mean per group, ignoring the non-available measures.

We provide the following default groups and prioritization:
for lengths, the order of priority is GL, GLl, GLm, HTC;
for widths, the order of priority is BT, Bd, Bp, SD, Bfd, Bfp.
This order maximises the robustness and reliability of the measurements,
as priority is given to the most abundant, most replicable, and least
age-dependent measurements.

This method was first used in: Trentacoste, A., Nieto-Espinet, A., & Valenzuela-Lamas, S. (2018). Pre-Roman improvements to agricultural production: Evidence from livestock husbandry in late prehistoric Italy. PloS one, 13(12), e0208109.
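
The two built-in methods described above can be sketched in base R as follows. This is a minimal illustration on made-up log-ratio values whose columns are already in order of priority, not the package's own implementation.

```r
# Toy log-ratio columns for the width group, in order of priority
logs <- data.frame(
  logBT = c(NA, NA, 0.10),
  logBd = c(-0.09, NA, 0.20),
  logBp = c(-0.08, 0.40, NA),
  logSD = c(0.26, -0.21, NA)
)

# "priority": first available value per row, scanning columns left to right
priorityFeature <- apply(as.matrix(logs), 1, function(x) {
  available <- x[!is.na(x)]
  if (length(available) == 0) NA_real_ else available[[1]]
})

# "average": mean of the available values per row
averageFeature <- rowMeans(logs, na.rm = TRUE)
# rowMeans gives NaN for rows with no available value; report NA instead
averageFeature[is.nan(averageFeature)] <- NA_real_

print(unname(priorityFeature))
print(unname(averageFeature))
```

In the first row the priority rule picks `logBd` because `logBT` is missing, whereas the average combines the three available values.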

Alternatively, a user-defined `method` can be provided as a function
with a single argument (a data.frame) assumed to have as columns the measure
log-ratios determined by the `grouping`.
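
A user-defined method might look like the sketch below, which takes the median of the available log-ratios in each row. The function name `rowMedian` is hypothetical, and returning one value per row is an assumption about what the function is expected to produce.

```r
# Hypothetical user-defined method: median of the available log-ratios per row.
# The single argument is a data.frame holding the log-ratio columns of a group.
rowMedian <- function(logData) {
  apply(as.matrix(logData), 1, function(x) {
    if (all(is.na(x))) NA_real_ else median(x, na.rm = TRUE)
  })
}

# Made-up log-ratios: one row with values, one row without any
toy <- data.frame(
  logBd = c(-0.09, NA),
  logBp = c(-0.08, NA),
  logSD = c(0.26, NA)
)

medians <- unname(rowMedian(toy))
print(medians)
```

Under that assumption, it would be passed as `CondenseLogs(dataExampleWithLogs, method = rowMedian)`.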

    ## Read an example dataset:
    dataFile <- system.file("extdata", "dataValenzuelaLamas2008.csv.gz",
                            package = "zoolog")
    dataExample <- utils::read.csv2(dataFile,
                                    na.strings = "",
                                    encoding = "UTF-8",
                                    stringsAsFactors = TRUE)
    ## Compute the log-ratios and select the cases with available log ratios:
    dataExampleWithLogs <- RemoveNACases(LogRatios(dataExample))
    ## We can observe the first lines (excluding some columns for visibility):
    head(dataExampleWithLogs)[, -c(6:20,32:63)]
    #>   Site N.inv    UE Especie       Os    GL    Bp   Dp   SD   DD   Bd   Dd   BT
    #> 1  ALP  3453 10410    ovar 1fal ant  27.1   9.9 12.3 17.9  9.0  9.0   NA   NA
    #> 2  ALP  3455 10410    ovar 1fal ant  27.6   9.6 12.2  7.6  8.9  8.3   NA   NA
    #> 3  ALP  4245  7036    cahi      hum    NA 128.3   NA 12.9   NA 27.4 26.6 23.6
    #> 4  ALP  4674 10227    cahi      hum    NA    NA   NA   NA   NA 26.0 25.7 22.3
    #> 5  ALP  4085 10253    cahi      hum    NA    NA   NA   NA   NA 27.9 27.3 23.2
    #> 6  TFC    24   407    ceel       mc 262.7  41.3 30.8 25.0 21.2 41.1 27.1   NA
    #>   GLc BFd Dl HmandM3       logGL       logBp       logDp      logSD       logBd
    #> 1  NA  NA NA      NA -0.10786052 -0.07991177 -0.07265930  0.2629585 -0.08911977
    #> 2  NA  NA NA      NA -0.09992073 -0.09327573 -0.07620458 -0.1090810 -0.12428419
    #> 3  NA  NA NA      NA          NA  0.40167955          NA -0.2116296 -0.15130497
    #> 4  NA  NA NA      NA          NA          NA          NA         NA -0.17408218
    #> 5  NA  NA NA      NA          NA          NA          NA         NA -0.14345133
    #> 6  NA  NA NA      NA          NA          NA          NA         NA -0.03354115
    #>         logDd logBT logGLc logBFd logDl logGB logSLC logGLP logBG logLG logDPA
    #> 1          NA    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 2          NA    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 3 -0.06787875    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 4 -0.08282727    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 5 -0.05659774    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 6          NA    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #>   logBPC logLA logLAR logSH logSB logL logH
    #> 1     NA    NA     NA    NA    NA   NA   NA
    #> 2     NA    NA     NA    NA    NA   NA   NA
    #> 3     NA    NA     NA    NA    NA   NA   NA
    #> 4     NA    NA     NA    NA    NA   NA   NA
    #> 5     NA    NA     NA    NA    NA   NA   NA
    #> 6     NA    NA     NA    NA    NA   NA   NA

    ## Extract the default condensed features with the default "priority" method:
    dataExampleWithSummary <- CondenseLogs(dataExampleWithLogs)
    head(dataExampleWithSummary)[, -c(6:20,32:63)]
    #>   Site N.inv    UE Especie       Os    GL    Bp   Dp   SD   DD   Bd   Dd   BT
    #> 1  ALP  3453 10410    ovar 1fal ant  27.1   9.9 12.3 17.9  9.0  9.0   NA   NA
    #> 2  ALP  3455 10410    ovar 1fal ant  27.6   9.6 12.2  7.6  8.9  8.3   NA   NA
    #> 3  ALP  4245  7036    cahi      hum    NA 128.3   NA 12.9   NA 27.4 26.6 23.6
    #> 4  ALP  4674 10227    cahi      hum    NA    NA   NA   NA   NA 26.0 25.7 22.3
    #> 5  ALP  4085 10253    cahi      hum    NA    NA   NA   NA   NA 27.9 27.3 23.2
    #> 6  TFC    24   407    ceel       mc 262.7  41.3 30.8 25.0 21.2 41.1 27.1   NA
    #>   GLc BFd Dl HmandM3       logGL       logBp       logDp      logSD       logBd
    #> 1  NA  NA NA      NA -0.10786052 -0.07991177 -0.07265930  0.2629585 -0.08911977
    #> 2  NA  NA NA      NA -0.09992073 -0.09327573 -0.07620458 -0.1090810 -0.12428419
    #> 3  NA  NA NA      NA          NA  0.40167955          NA -0.2116296 -0.15130497
    #> 4  NA  NA NA      NA          NA          NA          NA         NA -0.17408218
    #> 5  NA  NA NA      NA          NA          NA          NA         NA -0.14345133
    #> 6  NA  NA NA      NA          NA          NA          NA         NA -0.03354115
    #>         logDd logBT logGLc logBFd logDl logGB logSLC logGLP logBG logLG logDPA
    #> 1          NA    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 2          NA    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 3 -0.06787875    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 4 -0.08282727    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 5 -0.05659774    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 6          NA    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #>   logBPC logLA logLAR logSH logSB logL logH      Length       Width
    #> 1     NA    NA     NA    NA    NA   NA   NA -0.10786052 -0.08911977
    #> 2     NA    NA     NA    NA    NA   NA   NA -0.09992073 -0.12428419
    #> 3     NA    NA     NA    NA    NA   NA   NA          NA -0.15130497
    #> 4     NA    NA     NA    NA    NA   NA   NA          NA -0.17408218
    #> 5     NA    NA     NA    NA    NA   NA   NA          NA -0.14345133
    #> 6     NA    NA     NA    NA    NA   NA   NA          NA -0.03354115

    ## Extract only width with "average" method:
    dataExampleWithSummary2 <- CondenseLogs(dataExampleWithLogs,
                                            grouping = list(Width = c("BT", "Bd", "Bp", "SD")),
                                            method = "average")
    head(dataExampleWithSummary2)[, -c(6:20,32:63)]
    #>   Site N.inv    UE Especie       Os    GL    Bp   Dp   SD   DD   Bd   Dd   BT
    #> 1  ALP  3453 10410    ovar 1fal ant  27.1   9.9 12.3 17.9  9.0  9.0   NA   NA
    #> 2  ALP  3455 10410    ovar 1fal ant  27.6   9.6 12.2  7.6  8.9  8.3   NA   NA
    #> 3  ALP  4245  7036    cahi      hum    NA 128.3   NA 12.9   NA 27.4 26.6 23.6
    #> 4  ALP  4674 10227    cahi      hum    NA    NA   NA   NA   NA 26.0 25.7 22.3
    #> 5  ALP  4085 10253    cahi      hum    NA    NA   NA   NA   NA 27.9 27.3 23.2
    #> 6  TFC    24   407    ceel       mc 262.7  41.3 30.8 25.0 21.2 41.1 27.1   NA
    #>   GLc BFd Dl HmandM3       logGL       logBp       logDp      logSD       logBd
    #> 1  NA  NA NA      NA -0.10786052 -0.07991177 -0.07265930  0.2629585 -0.08911977
    #> 2  NA  NA NA      NA -0.09992073 -0.09327573 -0.07620458 -0.1090810 -0.12428419
    #> 3  NA  NA NA      NA          NA  0.40167955          NA -0.2116296 -0.15130497
    #> 4  NA  NA NA      NA          NA          NA          NA         NA -0.17408218
    #> 5  NA  NA NA      NA          NA          NA          NA         NA -0.14345133
    #> 6  NA  NA NA      NA          NA          NA          NA         NA -0.03354115
    #>         logDd logBT logGLc logBFd logDl logGB logSLC logGLP logBG logLG logDPA
    #> 1          NA    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 2          NA    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 3 -0.06787875    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 4 -0.08282727    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 5 -0.05659774    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #> 6          NA    NA     NA     NA    NA    NA     NA     NA    NA    NA     NA
    #>   logBPC logLA logLAR logSH logSB logL logH       Width
    #> 1     NA    NA     NA    NA    NA   NA   NA  0.03130898
    #> 2     NA    NA     NA    NA    NA   NA   NA -0.10888030
    #> 3     NA    NA     NA    NA    NA   NA   NA  0.01291500
    #> 4     NA    NA     NA    NA    NA   NA   NA -0.17408218
    #> 5     NA    NA     NA    NA    NA   NA   NA -0.14345133
    #> 6     NA    NA     NA    NA    NA   NA   NA -0.03354115