Find or replace row or column labels that match a regular expression
regex_funcs.Rd
match_by_pattern() tells whether row or column labels match a regular expression. Internally, grepl() decides whether a match occurs.
replace_by_pattern() replaces portions of row or column labels when a regular expression is matched. Internally, gsub() performs the replacements.
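The two base R functions named above can be tried directly on label strings. The following is a minimal sketch using only base R; it does not call the package and only illustrates what grepl() and gsub() return for whole labels. The variable names matched and replaced are chosen here for illustration.

```r
# Illustrative labels in bracket notation.
labels <- c("Production [of b in c]", "d [of Coal in f]", "g [of h in USA]")

# grepl() returns one logical per label: TRUE where the pattern matches.
matched <- grepl("Coal", labels)
print(matched)

# gsub() replaces every match of the pattern and leaves other labels unchanged.
replaced <- gsub("Coal", "Oil", labels)
print(replaced)
```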
Usage
match_by_pattern(
  labels,
  regex_pattern,
  pieces = "all",
  prepositions = RCLabels::prepositions_list,
  notation = RCLabels::bracket_notation,
  inf_notation = TRUE,
  choose_most_specific = FALSE,
  ...
)
replace_by_pattern(
  labels,
  regex_pattern,
  replacement,
  pieces = "all",
  prepositions = RCLabels::prepositions_list,
  notation = RCLabels::bracket_notation,
  ...
)
Arguments
- labels
The row and column labels to be checked or modified.
- regex_pattern
The regular expression pattern to determine matches and replacements. Consider using Hmisc::escapeRegex() to escape regex_pattern before calling this function.
- pieces
The pieces of row or column labels to be checked for matches or replacements. See details.
- prepositions
A vector of strings that count as prepositions. Default is prepositions_list. Used to detect prepositional phrases if pieces are to be interpreted as prepositions.
- notation
The notation used in labels. Default is bracket_notation.
- inf_notation
A boolean that tells whether to infer notation for labels. Default is TRUE. See infer_notation() for details.
- choose_most_specific
A boolean that tells whether to choose the most specific notation from notation when inferring notation. Default is FALSE so that a less specific notation can be inferred. In combination with notations_list, the default value of FALSE means that bracket_notation will be selected instead of anything more specific, such as from_notation.
- ...
Other arguments passed to grepl() or gsub(), such as ignore.case, perl, fixed, or useBytes. See examples.
- replacement
For replace_by_pattern(), the string that replaces all matches to regex_pattern.
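Because labels in bracket notation contain "[" and "]", which are special characters in regular expressions, escaping matters in practice. The sketch below uses base R grepl() directly to show the kind of options that the argument list says are forwarded through `...`; it assumes the forwarding behaves as described and does not call the package itself. The variable names are illustrative.

```r
labels <- c("Production [of b in c]", "d [of Coal in f]", "g [of h in USA]")

# "[" opens a character class in a regular expression, so it is escaped here
# to match a literal bracket.
escaped <- grepl("\\[of Coal", labels)

# fixed = TRUE treats the pattern as a literal string, so no escaping is needed.
literal <- grepl("[of Coal", labels, fixed = TRUE)

# ignore.case = TRUE makes matching case-insensitive.
any_case <- grepl("production", labels, ignore.case = TRUE)

print(escaped)
print(literal)
print(any_case)
```

Escaping by hand and using fixed = TRUE give the same result here; Hmisc::escapeRegex() is an alternative when the pattern is built from arbitrary label text.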
Value
A logical vector of the same length as labels, where TRUE indicates a match was found and FALSE indicates otherwise.
Details
By default (pieces = "all"), complete labels (as strings) are checked for matches and replacements. If pieces = "pref" or pieces = "suff", only the prefix or the suffix is checked for matches and replacements. Alternatively, pieces = "noun" or pieces = <<preposition>> indicates that only specific pieces of labels are to be checked for matches and replacements. When pieces = <<preposition>>, only the object of <<preposition>> is checked for matches and replacements.
pieces can be a vector, indicating multiple pieces to be checked for matches and replacements. But if any of the pieces are "all", all pieces are checked and replaced. If pieces is "pref" or "suff", only one can be specified.
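To make the difference between checking a whole label and checking only its prefix or suffix concrete, here is a rough base R sketch. It is not the package's implementation: it assumes a single label in bracket notation of the form "prefix [suffix]" and splits it with hand-written patterns. The names label, pref, and suff are illustrative.

```r
label <- "Production [of b in c]"

# Prefix: everything before " [".
pref <- sub(" \\[.*$", "", label)

# Suffix: everything between "[" and the closing "]".
suff <- sub("^.*\\[(.*)\\]$", "\\1", label)

# The same pattern matches or not depending on which piece is checked.
in_whole <- grepl("Production", label)
in_pref <- grepl("Production", pref)
in_suff <- grepl("Production", suff)

print(c(pref = pref, suff = suff))
print(c(whole = in_whole, pref = in_pref, suff = in_suff))
```

Restricting the check to one piece avoids accidental matches elsewhere in the label, for example a pattern meant for the prefix that also occurs in the suffix.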
Examples
labels <- c("Production [of b in c]", "d [of Coal in f]", "g [of h in USA]")
# With default `pieces` argument, matching is done for whole labels.
match_by_pattern(labels, regex_pattern = "Production")
#> [1] TRUE FALSE FALSE
match_by_pattern(labels, regex_pattern = "Coal")
#> [1] FALSE TRUE FALSE
match_by_pattern(labels, regex_pattern = "USA")
#> [1] FALSE FALSE TRUE
# Check beginnings of labels
match_by_pattern(labels, regex_pattern = "^Production")
#> [1] TRUE FALSE FALSE
# Check at ends of labels: no match.
match_by_pattern(labels, regex_pattern = "Production$")
#> [1] FALSE FALSE FALSE
# Can match on nouns or prepositions.
match_by_pattern(labels, regex_pattern = "Production", pieces = "noun")
#> [1] TRUE FALSE FALSE
# Gives FALSE, because "Production" is a noun.
match_by_pattern(labels, regex_pattern = "Production", pieces = "in")
#> [1] FALSE FALSE FALSE