
match_by_pattern() tells whether row or column labels match a regular expression. Internally, grepl() decides whether a match occurs. replace_by_pattern() replaces portions of row or column labels when a regular expression is matched. Internally, gsub() performs the replacements.

Usage

match_by_pattern(
  labels,
  regex_pattern,
  pieces = "all",
  prepositions = RCLabels::prepositions_list,
  notation = RCLabels::bracket_notation,
  inf_notation = TRUE,
  choose_most_specific = FALSE,
  ...
)

replace_by_pattern(
  labels,
  regex_pattern,
  replacement,
  pieces = "all",
  prepositions = RCLabels::prepositions_list,
  notation = RCLabels::bracket_notation,
  ...
)

Arguments

labels

The row and column labels to be modified.

regex_pattern

The regular expression pattern to determine matches and replacements. Consider using Hmisc::escapeRegex() to escape regex_pattern before calling this function.

pieces

The pieces of row or column labels to be checked for matches or replacements. See details.

prepositions

A vector of strings that count as prepositions. Default is prepositions_list. Used to detect prepositional phrases if pieces are to be interpreted as prepositions.

notation

The notation used in labels. Default is bracket_notation.

inf_notation

A boolean that tells whether to infer notation for labels. Default is TRUE. See infer_notation() for details.

choose_most_specific

A boolean that tells whether to choose the most specific notation from notation when inferring notation. Default is FALSE so that a less specific notation can be inferred. In combination with notations_list, the default value of FALSE means that bracket_notation will be selected instead of anything more specific, such as from_notation.

...

Other arguments passed to grepl() or gsub(), such as ignore.case, perl, fixed, or useBytes. See examples.

replacement

For replace_by_pattern(), the string that replaces all matches to regex_pattern.

Value

For match_by_pattern(), a logical vector of the same length as labels, where TRUE indicates a match was found and FALSE indicates otherwise. For replace_by_pattern(), the labels with matched portions replaced.

Details

By default (pieces = "all"), complete labels (as strings) are checked for matches and replacements. If pieces = "pref" or pieces = "suff", only the prefix or the suffix is checked for matches and replacements. Alternatively, pieces = "noun" or pieces = <<preposition>> indicates that only specific pieces of labels are to be checked for matches and replacements. When pieces = <<preposition>>, only the object of <<preposition>> is checked for matches and replacements.

pieces can be a vector, indicating multiple pieces to be checked for matches and replacements. But if any of the pieces is "all", all pieces are checked and replaced. If pieces is "pref" or "suff", it must be the only piece specified.
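The effect of restricting a match to one piece can be sketched with base R alone. The sketch below is not the package's implementation: it assumes labels in bracket notation ("prefix [suffix]") and uses hand-written regular expressions (the variables prefix, suffix, and in_object are illustrative names) to isolate each piece before calling grepl() or gsub() on it.

```r
labels <- c("Production [of b in c]", "d [of Coal in f]", "g [of h in USA]")

# Assumption: every label follows bracket notation, "prefix [suffix]".
# Prefix: everything before " [".
prefix <- sub(" \\[.*$", "", labels)
# Suffix: everything between "[" and the closing "]".
suffix <- sub("^.* \\[(.*)\\]$", "\\1", labels)
# Object of the preposition "in": the text after " in " within the suffix.
in_object <- sub("^.* in ", "", suffix)

# Whole-label matching, analogous to pieces = "all".
match_all <- grepl("Coal", labels)
# Prefix-only and suffix-only matching, analogous to "pref" and "suff".
match_pref <- grepl("Coal", prefix)
match_suff <- grepl("Coal", suffix)
# Matching only the object of "in", analogous to pieces = "in".
match_in <- grepl("USA", in_object)

# Replacement restricted to the suffix, then reassembly of the label.
replaced <- paste0(prefix, " [", gsub("Coal", "Oil", suffix), "]")
```

Here "Coal" is found when the whole label or the suffix is checked, but not when only the prefix is checked, which is the distinction the pieces argument controls.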

Examples

labels <- c("Production [of b in c]", "d [of Coal in f]", "g [of h in USA]")
# With default `pieces` argument, matching is done for whole labels.
match_by_pattern(labels, regex_pattern = "Production")
#> [1]  TRUE FALSE FALSE
match_by_pattern(labels, regex_pattern = "Coal")
#> [1] FALSE  TRUE FALSE
match_by_pattern(labels, regex_pattern = "USA")
#> [1] FALSE FALSE  TRUE
# Check beginnings of labels
match_by_pattern(labels, regex_pattern = "^Production")
#> [1]  TRUE FALSE FALSE
# Check at ends of labels: no match.
match_by_pattern(labels, regex_pattern = "Production$")
#> [1] FALSE FALSE FALSE
# Can match on nouns or objects of prepositions.
match_by_pattern(labels, regex_pattern = "Production", pieces = "noun")
#> [1]  TRUE FALSE FALSE
# Gives FALSE, because "Production" is the noun, not the object of "in".
match_by_pattern(labels, regex_pattern = "Production", pieces = "in")
#> [1] FALSE FALSE FALSE
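Because the extra arguments in ... are passed to grepl() or gsub(), their effect can be previewed with those base R functions directly. The following is a minimal sketch on whole labels only; it calls base R rather than the package functions, so it shows what ignore.case and fixed do, not how the package forwards them.

```r
labels <- c("Production [of b in c]", "d [of Coal in f]", "g [of h in USA]")

# Matching is case-sensitive by default, so lowercase "production" fails.
case_sensitive <- grepl("production", labels)
# ignore.case = TRUE makes the same pattern match the first label.
case_insensitive <- grepl("production", labels, ignore.case = TRUE)

# "[" is a regex metacharacter. fixed = TRUE treats the pattern as a
# literal string, an alternative to escaping the pattern beforehand.
literal_bracket <- grepl("[of", labels, fixed = TRUE)

# gsub() replaces every match and leaves non-matching labels untouched.
renamed <- gsub("Production", "Manufacture", labels)
```

Passing fixed = TRUE is often simpler than escaping when the pattern is meant to be a literal string, while an escaped pattern keeps the ability to add anchors such as ^ and $.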