ThesaurusManagement.Rd
Functions to modify and check thesauri.
NewThesaurus(
  caseSensitive = FALSE,
  accentSensitive = FALSE,
  punctuationSensitive = FALSE
)
AddToThesaurus(thesaurus, newName, category = NULL)
RemoveRepeatedNames(thesaurus)
ThesaurusAmbiguity(thesaurus)
caseSensitive, accentSensitive, punctuationSensitive: Logical. They set the case, accent, and punctuation sensitivity of the thesaurus (FALSE by default).
thesaurus: A thesaurus object.
newName: Character vector or (named) list of character vectors with new names to be added to the thesaurus.
category: Character vector identifying the categories where the new names should be included.
NewThesaurus returns an empty thesaurus. This can then be populated by AddToThesaurus.
AddToThesaurus returns the input thesaurus complemented with new names in the categories identified. If any of the categories is not present in the input thesaurus, new categories are added as required.
RemoveRepeatedNames returns the input thesaurus pruned of redundant names in each category. Redundancy is evaluated according to the case and accent sensitivity of the thesaurus.
ThesaurusAmbiguity returns FALSE if no ambiguity is present. When any ambiguity is found, it returns TRUE with an attribute errmessage listing the names present in more than one category and the categories involved. This is used internally by ReadThesaurus and AddToThesaurus to raise an error if they attempt to read or generate an ambiguous thesaurus.
In the function AddToThesaurus, the categories in which to add new names can be specified either as the names of a named list given as argument newName or explicitly in the argument category. See the examples below illustrating both alternatives.
As of version 1.2.0, AddToThesaurus directly removes repeated names in the resulting thesaurus.
zoologThesaurus for a description of the thesaurus and thesaurus set structure.
## Load an example thesaurus:
thesaurus <- ReadThesaurus(system.file("extdata", "taxonThesaurus.csv",
                                       package="zoolog"))
## with categories
names(thesaurus)
#> [1] "Bos taurus" "Bos primigenius" "Bos"
#> [4] "Ovis aries" "Ovis orientalis" "Ovis"
#> [7] "Capra hircus" "Capra aegagrus" "Capra"
#> [10] "Caprini" "Sus domesticus" "Sus scrofa"
#> [13] "Sus" "Cervus elaphus" "Cervus"
#> [16] "Dama mesopotamica" "Dama" "Gazella gazella"
#> [19] "Gazella" "Equus asinus" "Equus caballus"
#> [22] "Equus" "Oryctolagus cuniculus" "Oryctolagus"
#> [25] "Canis familiaris" "Canis lupus" "Canis"
## Add names to several categories:
thesaurusExtended <- AddToThesaurus(thesaurus,
                                    c("Kuh", "Schwein"),
                                    c("bos taurus","sus domesticus"))
## This adds the name "Kuh" to the category "bos taurus" and
## the name "Schwein" to the category "sus domesticus".
## Generate a new thesaurus and populate it with two categories
## ("red" and "blue"):
thesaurusNew <- NewThesaurus()
thesaurusNew <- AddToThesaurus(thesaurusNew,
                               c("scarlet", "vermilion", "ruby", "cherry",
                                 "carmine", "wine"),
                               "red")
thesaurusNew
#>         red
#> 1       red
#> 2   scarlet
#> 3 vermilion
#> 4      ruby
#> 5    cherry
#> 6   carmine
#> 7      wine
thesaurusNew <- AddToThesaurus(thesaurusNew,
                               c("sky blue", "azure", "sapphire", "cerulean",
                                 "navy"),
                               "blue")
thesaurusNew
#>         red     blue
#> 1       red     blue
#> 2   scarlet sky blue
#> 3 vermilion    azure
#> 4      ruby sapphire
#> 5    cherry cerulean
#> 6   carmine     navy
#> 7      wine
## Categories and names can also be given as a named list:
thesaurusNew <- AddToThesaurus(thesaurusNew, list(
  blue = c("lapis lazuli", "indigo", "cyan"),
  brown = c("hazel", "chocolate-coloured", "brunette", "mousy", "beige")))
thesaurusNew
#>         red         blue              brown
#> 1       red         blue              brown
#> 2   scarlet     sky blue              hazel
#> 3 vermilion        azure chocolate-coloured
#> 4      ruby     sapphire           brunette
#> 5    cherry     cerulean              mousy
#> 6   carmine         navy              beige
#> 7      wine lapis lazuli
#> 8                 indigo
#> 9                   cyan
## Attempt to generate an ambiguous thesaurus
try(AddToThesaurus(thesaurusNew, "scarlet", "blue"))
#> Error in AddToThesaurus(thesaurusNew, "scarlet", "blue") :
#> The resulting thesaurus would be ambiguous.
#> Ambiguity in pair ("red", "blue"). Shared names: scarlet
## As of version 1.2.0, AddToThesaurus directly removes repeated names:
AddToThesaurus(thesaurusNew, c("scarlet", "ruby"), "red")
#>         red         blue              brown
#> 1       red         blue              brown
#> 2   scarlet     sky blue              hazel
#> 3 vermilion        azure chocolate-coloured
#> 4      ruby     sapphire           brunette
#> 5    cherry     cerulean              mousy
#> 6   carmine         navy              beige
#> 7      wine lapis lazuli
#> 8                 indigo
#> 9                   cyan
## Remove repeated names in the same category:
## If we included any repetitions
thesaurusNew[8:9,1] <- c("scarlet", "ruby")
thesaurusNew
#>         red         blue              brown
#> 1       red         blue              brown
#> 2   scarlet     sky blue              hazel
#> 3 vermilion        azure chocolate-coloured
#> 4      ruby     sapphire           brunette
#> 5    cherry     cerulean              mousy
#> 6   carmine         navy              beige
#> 7      wine lapis lazuli
#> 8   scarlet       indigo
#> 9      ruby         cyan
## they can be removed with
RemoveRepeatedNames(thesaurusNew)
#>         red         blue              brown
#> 1       red         blue              brown
#> 2   scarlet     sky blue              hazel
#> 3 vermilion        azure chocolate-coloured
#> 4      ruby     sapphire           brunette
#> 5    cherry     cerulean              mousy
#> 6   carmine         navy              beige
#> 7      wine lapis lazuli
#> 8                 indigo
#> 9                   cyan
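## Check a thesaurus for ambiguity directly. This is an illustrative
## sketch, not part of the original examples: the object names `ambiguous`
## and `ambiguity` are hypothetical. Placing "scarlet" by hand in the
## category "blue" makes it appear in two categories:
ambiguous <- RemoveRepeatedNames(thesaurusNew)
ambiguous[2, 2] <- "scarlet"
ambiguity <- ThesaurusAmbiguity(ambiguous)
## According to the description above, the result should be TRUE, with the
## shared names and the categories involved reported in the attribute
## errmessage:
attr(ambiguity, "errmessage")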