Title: | Error Handling Made Easy |
---|---|
Description: | Set of tools to facilitate package development and make R a more user-friendly place. Mostly for developers (or anyone who writes/shares functions). Provides a simple, powerful and flexible way to check the arguments passed to functions. The developer can easily describe the type of argument needed. If the user provides a wrong argument, then an informative error message is prompted with the requested type and the problem clearly stated--saving the user a lot of time in debugging. |
Authors: | Laurent Berge [aut, cre] |
Maintainer: | Laurent Berge <[email protected]> |
License: | GPL-3 |
Version: | 1.5.0 |
Built: | 2024-11-09 05:53:48 UTC |
Source: | https://github.com/lrberge/dreamerr |
The main purpose of this package is twofold: i) to facilitate the developer's life, and ii) to provide to the users meaningful, useful error messages. These objectives are accomplished with a single function: check_arg
. That function checks the arguments given by the user: it offers a compact syntax such that complex arguments can be simply stated by the developer. In turn, if the user provides an argument of the wrong type then an informative error message will be buit, stating the expected type and where the error comes from–saving the user quite some time in debugging.
Thus you can very easily make your package look professional with check_arg
(checking arguments properly is professional).
It also offers a set of small tools to provide informative messages to the users. See stop_up
and warn_up
to throw errors and warnings in the appropriate location. There are many tools to form messages: enumerate_items
to form textual list of elements (with many options including conjugating verbs, etc...), plural
to conjugate verbs depending on the argument, and n_letter
, n_th
, n_times
to write integers in words (which usually looks nicer).
To sum up in a few words, this package was created to enhance the user experience and facilitate package development.
Maintainer: Laurent Berge [email protected]
Useful links:
Report bugs at https://github.com/lrberge/dreamerr/issues
Full-fledged argument checking. Checks that the user provides arguments of the requested type (even complex) in a very simple way for the developer. Provides detailed and informative error messages for the user.
check_arg( .x, .type, .x1, .x2, .x3, .x4, .x5, .x6, .x7, .x8, .x9, ..., .message, .choices = NULL, .data = list(), .value, .env, .up = 0 ) check_set_arg( .x, .type, .x1, .x2, .x3, .x4, .x5, .x6, .x7, .x8, .x9, ..., .message, .choices = NULL, .data = list(), .value, .env, .up = 0 ) check_value( .x, .type, .message, .arg_name, .prefix, .choices = NULL, .data = list(), .value, .env, .up = 0 ) check_set_value( .x, .type, .message, .arg_name, .prefix, .choices = NULL, .data = list(), .value, .env, .up = 0 ) check_arg_plus check_value_plus
check_arg( .x, .type, .x1, .x2, .x3, .x4, .x5, .x6, .x7, .x8, .x9, ..., .message, .choices = NULL, .data = list(), .value, .env, .up = 0 ) check_set_arg( .x, .type, .x1, .x2, .x3, .x4, .x5, .x6, .x7, .x8, .x9, ..., .message, .choices = NULL, .data = list(), .value, .env, .up = 0 ) check_value( .x, .type, .message, .arg_name, .prefix, .choices = NULL, .data = list(), .value, .env, .up = 0 ) check_set_value( .x, .type, .message, .arg_name, .prefix, .choices = NULL, .data = list(), .value, .env, .up = 0 ) check_arg_plus check_value_plus
.x |
An argument to be checked. Must be an argument name. Can also be the type, see details/examples. |
.type |
A character string representing the requested type(s) of the arguments.
This is a bit long so please look at the details section or the vignette for explanations.
Each type is composed of one main class and restrictions (optional).
Types can be separated with pipes ( |
.x1 |
An argument to be checked. Must be an argument name. Can also be the type, see details/examples. |
.x2 |
An argument to be checked. Must be an argument name. Can also be the type, see details/examples. |
.x3 |
An argument to be checked. Must be an argument name. Can also be the type, see details/examples. |
.x4 |
An argument to be checked. Must be an argument name. Can also be the type, see details/examples. |
.x5 |
An argument to be checked. Must be an argument name. Can also be the type, see details/examples. |
.x6 |
An argument to be checked. Must be an argument name. Can also be the type, see details/examples. |
.x7 |
An argument to be checked. Must be an argument name. Can also be the type, see details/examples. |
.x8 |
An argument to be checked. Must be an argument name. Can also be the type, see details/examples. |
.x9 |
An argument to be checked. Must be an argument name. Can also be the type, see details/examples. |
... |
Only used to check |
.message |
A character string, optional. By default, if the user provides a
wrong argument, the error message stating what type of argument is required is
automatically formed. You can alternatively provide your own error message,
maybe more tailored to your function. The reason of why there is a problem is
appended in the end of the message. You can use the special character
|
.choices |
Only if one of the types (in argument |
.data |
Must be a data.frame, a list or a vector. Used in three situations. 1) if the global keywords |
.value |
An integer scalar or a named list of integers scalars. Used when the keyword |
.env |
An environment defaults to the frame where the user called the original function. Only used in two situations. 1) if the global keywords |
.up |
Integer, default is 0. If the user provides a wrong argument, the error message will integrate the call of the function from which |
.arg_name |
A character scalar. If |
.prefix |
A character scalar. If |
An object of class function
of length 1.
An object of class function
of length 1.
In case the type
is "match"
, it returns the matched value. In any other case, NULL
is returned.
check_set_arg()
: Same as check_arg
, but includes in addition: i) default setting, ii) type conversion, iii) partial matching, and iv) checking list elements. (Small drawback: cannot be turned off.)
check_value()
: Checks if a (single) value is of the appropriate type
check_set_value()
: Same as check_value
, but includes in addition: i) default setting, ii) type conversion, iii) partial matching, and iv) checking list elements. (Small drawback: cannot be turned off.)
To write the expected type of an argument, you need to write the main class in combination with the class's options and restrictions (if any).
The syntax is: "main_class option(s) restriction(s)"
A type MUST have at least one main class. For example: in the type "logical vector len(,2) no na"
, vector
is the main class, no na
is the option, and logical
and len(,2)
are restrictions
There are 14 main classes that can be checked. On the left the keyword, on the right what is expected from the argument, and in square brackets the related section in the examples:
scalar
: an atomic vector of length 1 [Section I)]
vector
: an atomic vector [Section IV)]
matrix
: a matrix [Section IV)]
vmatrix
: a matrix or vector [Section IV)]
data.frame
: a data.frame [Section VI)]
vdata.frame
: a data.frame or vector [Section VI)]
list
: a list [Section V)]
formula
: a formula [Section VIII)]
function
: a function [Section V)]
charin
: a character vector with values in a vector of choices [Section III)]
match
: a character vector with values in a vector of choices, partial matching enabled and only available in check_set_arg
[Section III)]
path
: a character scalar pointing to a file or a directory
class
: a custom class [Section VI)]
NA
: a vector of length 1 equal to NA–does not support options nor restrictions, usually combined with other main classes (see Section on combining multiple types) [Section VI)]
There are eight type options, they are not available for each types. Here what they do and the types to which they are associated:
NA OK
(or NAOK
): Tolerates the presence of NA values. Available for scalar
.
NO NA
(or NONA
): Throws an error if NAs are present. Available for vector
, matrix
, vmatrix
, data.frame
, and vdata.frame
.
square
: Enforces the matrix to be square. Available for matrix
, vmatrix
.
named
: Enforces the object to have names. Available for vector
, list
.
multi
: Allows multiple matches. Available for charin
, match
.
strict
: Makes the matching case-sensitive. Available for match
.
os
and ts
: Available for formula
. Option os
(resp. ts
) enforces that the formula is one-sided (resp. two-sided).
dir
, read
and create
: Available for path
. Option dir
checks whether the path points to a directory and not a file. Option read
checks whether the file actually exists and has read permission. Option create
creates up to the grand-parent folder if it was not yet existing.
You can further add restrictions. There are roughly six types of restrictions. Here what they do and the types to which they are associated:
sub-type restriction: For atomic types (scalar
, vector
, matrix
or vmatrix
), you can restrict the underlying data to be of a specific sub-type. The simple sub-types are: i) integer
(numeric without decimals and logicals), i') strict integer
(numeric that can be converted to integer with as.integer
, and not logicals), ii) numeric
, iii) factor
, iv) logical
and iv') loose logical
(0/1 are also OK). Simply add the sub-type in the type string (e.g. "integer scalar"
), or if you allow multiple types, put them in parentheses rigth after the main class: e.g. "scalar(character, integer)"
. See Section XI) in the examples. See also the section below for more information on the sub-types. Some types (character
, integer
, numeric
, logical
and factor
) also support the keyword "conv"
in check_set_arg
.
GE
/GT
/LE
/LT
: For atomic types with numeric data, you can check the values in the object. The GE/GT/LE/LT mean respectively greater or equal/greater than/lower or equal/lower than. The syntax is GE{expr}
, with expr any expression. See Section IV) in the examples.
len(a, b)
: You can restrict the length of objects with len(a, b)
(with a
and b
integers). Available for vector
and list
. Then the length must be in between a
and b
. Either a
or b
can be missing which means absence of restriction. If len(a)
, this means must be equal to a
. You can also use the keywords len(data) which ensures that the length is the same as the length of the object given in the argument .data
, or len(value)
which ensures the length is equal to the value given in .value
. See Section IV) in the examples.
nrow(a, b)
, ncol(a, b)
: To restrict the number of rows and columns. Available for matrix
, vmatrix
, data.frame
, vdata.frame
. Tolerates the data
and value
keywords (see in len
). See Section IV) in the examples.
var(data, env): Available only for formula
. var(data)
ensures that the variables in the formula are present in the data set given by the extra argument .data
. var(env)
ensures they are present in the environment, and var(data, env)
in either the environment or the data set. See Section VIII) in the examples.
arg(a, b)
: Available only for function
. Ensures that the function has a number of arguments between a
and b
, both integers (possibly missing). Tolerates the value
keyword (see in len
). See Section V) in the examples.
left(a, b)
and right(a, b)
: Only available for formula
. Restricts the number of parts in the left-hand-side or in the right-hand-side of the formula. Tolerates the value
keyword (see in len
). See Section VIII) in the examples.
[Section I)]: R:Section%20I) [Section IV)]: R:Section%20IV) [Section IV)]: R:Section%20IV) [Section IV)]: R:Section%20IV) [Section VI)]: R:Section%20VI) [Section VI)]: R:Section%20VI) [Section V)]: R:Section%20V) [Section VIII)]: R:Section%20VIII) [Section V)]: R:Section%20V) [Section III)]: R:Section%20III) [Section III)]: R:Section%20III) [Section VI)]: R:Section%20VI) [Section VI)]: R:Section%20VI)
There are eight global keywords that can be placed anywhere in the type. They are described in Section II) in the examples.
NULL
: allows the argument to be equal to NULL
.
safe NULL
: allows the argument to be equal to NULL
, but an error is thrown if the argument is of the type base$variable
or base[["variable"]]
. This is to prevent oversights from the user, especially useful when the main class is a vector.
NULL{expr}
: allows the argument to be equal to NULL
, if the argument is NULL
, then it assigns the value of expr to the argument.
MBT
: (means "must be there") an error is thrown if the argument is not provided by the user.
eval
: used in combination with the extra argument .data
. Evaluates the value of the argument both in the data set and in the environment (this means the argument can be a variable name).
evalset
: like eval
, but after evaluation, assigns the obtained value to the argument. Only available in check_set_arg
.
dotnames
: only when checking '...'
argument (see the related section below). Enforces that each object in '...'
has a name.
match
and charin
typesThe main classes match
and charin
are similar to match.arg
. These two types are detailed in the examples Section III).
By default, the main class match
expects a single character string whose value is in a set of choices. By default, there is no case sensitity (which can be turned on with the option strict
) and there is always partial matching. It can expect a vector (instead of a single element) if the option multi
is present.
You have three different ways to set the choices:
by setting the argument default: e.g. fun = function(x = c("Tom", "John")) check_arg(x, "match")
by providing the argument .choices
: e.g. fun = function(x) check_arg(x, "match", .choices = c("Tom", "John"))
by writing the choices in parentheses: e.g. fun = function(x) check_arg(x, "match(Tom, John)")
When the user doesn't provide the argument, the default is set to the first choice.
Since the main class match
performs a re-assignment of the variable, it is only available in check_set_arg
.
The main class charin
is similar to match
in that it expects a single character string in a set of choices. The main differences are: i) there is no partial matching, ii) the choices cannot be set by setting the argument default, and iii) its checking can be turned off with setDreamer_check(FALSE) (that's the main difference between check_arg
and check_set_arg
).
You can combine multiple types with a pipe: '|'. The syntax is as follows:
"main_type option(x) restriction(s) | main_type option(x) restriction(s) | main_type option(x) restriction(s)"
You can combine as many types as you want. The behavior is as follows: if the argument matches any of the types, then that's fine.
For example, say you require an argument to be either a logical scalar, either a data.frame, then you can write: check_arg(x, "logical scalar | data.frame")
. See Section X) in the examples for a more complex example.
The type MUST be a character string of length 1. Two main classes must be separated by a pipe. Otherwise the order of the keywords, the spaces, or the case don't matter. Further the global keywords can be placed anywhere and need not be separated by a pipe.
Note that a rare but problematic situation is when you set a default with the global NULL{default}
and that default contains a keyword. For example in the type "NULL{list()} numeric matrix"
list
should not be considered as a main class, but only matrix
. To be on the safe side, then just separate them with a pipe: "NULL{list()} | numeric matrix"
would work appropriately.
You can check multiple arguments at once provided they are of the same type. Say variables x1
to x5
should be logical scalars. Just use: check_arg(x1, x2, x3, x4, x5, "logical scalar")
. It is always more efficient to check multiple arguments of the same type at once.
It is important to note that in case of multiple arguments, you can place the type anywhere you want provided it is a character literal (and not in a variable!). This means that check_arg("logical scalar", x1, x2, x3, x4, x5)
would also work.
If your type is in a variable, then you must explicitly provide the argument .type
(like in check_arg(x, .type = my_type)
).
.up
)When you develop several functions that share common features, it is usually good practice to pool the common computations into an internal function (to avoid code duplication).
When you do so, you can do all the argument checking in the internal function. Then use the argument .up = 1
so that if the user provdes a wrong argument, the error message will refer to the user-level function and NOT to the internal function, making it much clearer for the user.
This is detailed in Section XII) in the examples.
...
(dot-dot-dot) argumentcheck_arg
offers the possibility to check the ...
, provided each expected object in ...
should be of the same type. To do that, just add ...
as the first argument in check_arg
, that's it! For example, you want all elements of ...
to be numeric vectors, then use check_arg(..., "numeric vector")
.
When checking ...
, you have the special global argument dotnames
which enforces that each element in ...
has a name. Further, the other global MBT
(must be there) now means that at least one element in ...
must be provided.
This is detailed in Section XIV) in the examples.
check_arg
and check_set_arg
?The function check_set_arg
extends check_arg
in several ways. First it offers new keywords:
evalset
: evaluates the argument in a data set (i.e. the argument can be variables names of a data set), then re-assigns back its value.
NULL{default}
: if the argument is NULL
, then the value in curly brackets is assigned to the argument.
match
: if the argument partially matches the choices, then the matches are assigned to the argument.
conv
: in atomic main classes (scalar
, vector
and matrix
), the data can be converted to a given sub-type (currently integer
, numeric
, logical
, character
and factor
), then assigned back to the argument.
As you can see, it's all about assignment: these special keywords of check_set_arg
will modify the arguments in place. You have such examples in Section II), III) and XI) of the examples.
Second, it allows to check arguments that are themselves list of arguments (note that conv
also works in that case). For example, one argument of your function is plot.opts
, a list of arguments to be passed to plot. You can check the elements of plot.opts
(e.g. plot.opts$main
) with check_set_arg
. It also re-assigns the values of the list given the special keywords just described. List element checking is described in Section XIII) of the examples.
Then why creating two functions? If the user runs a function in which the arguments were checked with check_arg
and it works, then argument checking can be safely disabled, and it would also work. On the other hand, since check_set_arg
does value re-assignment, it cannot be safely turned-off–therefore cannot be disabled with setDreamerr_check
. Distinguishing between the two allows the user to disable argument checking and gain (although very modest) perfomance in large loops. Therefore, when you create functions, I suggest to use always check_arg
, unless you need the extra features of check_set_arg
.
check_value
The functions check_value
and check_set_value
are almost identical to the respective functions check_arg
and check_set_arg
. The key differences are as follows:
They can check values instead of arguments. Indeed, if you try to check a value with check_arg
, nothing will happen (provided the name of the value is not an argument). Why? Because it will consider it as a missing argument. Therefore, you are can check anything with check_value
.
You can check only one item at a time (whereas you can check up to 10 arguments in check_arg
).
The main reason for using check_value
is that sometimes you only know if an argument is valid after having perfomed some modifications on it. For instance, the argument may be a formula, but you also require that the variables in the formula are numeric. You cannot check all that at once with check_arg
, but you can first check the formula with it, then extract the values from the formula and use check_value
to ensure that the variables from the formula are numeric.
check_value
is detailed in Section XVI) in the examples.
Although the argument checking offered by check_arg
is highly optimized and fast (it depends on the type (and your computer), but it is roughly of the order of 80 micro seconds for non-missing arguments, 20 micro seconds for missing arguments), you may want to disable it for small functions in large loops (>100K iterations although this practice is not really common in R). If so, just use the function setDreamerr_check
, by typing setDreamerr_check(FALSE)
. This will disable any call to check_arg
.
Note that the argument checking of check_set_arg
cannot be disabled because the special types it allows perform reassignment in the upper frame. That's the main difference with check_arg
.
If you're new to check_arg, given the many types available, it's very common to make mistakes when creating check_arg calls. But no worry, the developer mode is here to help!
The developer mode ensures that any problematic call is spotted and the problem is clearly stated. It also refers to the related section in the examples if appropriate. To turn the developer mode on, use setDreamerr_dev.mode(TRUE)
.
Note that since this mode ensures a detailed cheking of the call it is thus a strain on performance and should be always turned off otherwise needed. See Section XV) in the examples.
Laurent Berge
# check_arg is only used within functions # # I) Example for the main class "scalar" # test_scalar = function(xlog, xnum, xint, xnumlt, xdate){ # when forming the type: you can see that case, order and spaces don't matter check_arg(xlog, "scalarLogical") check_arg(xnum, "numeric scalar") check_arg(xint, " scalar Integer GE{0} ") check_arg(xnumlt, "numeric scalar lt{0.15}") # Below it is critical that there's no space between scalar and the parenthesis check_arg(xdate, "scalar(Date)") invisible(NULL) } # Following is OK test_scalar() test_scalar(xlog = FALSE, xnum = 55, xint = 5, xnumlt = 0.11, xdate = Sys.Date()) # # Now errors, all the following are wrong arguments, leading to errors # Please note the details in the error messages. # logical try(test_scalar(xlog = NA)) try(test_scalar(xlog = 2)) try(test_scalar(xlog = sum)) try(test_scalar(xlog = faefeaf5)) try(test_scalar(xlog = c(TRUE, FALSE))) try(test_scalar(xlog = c())) # numeric try(test_scalar(xnum = NA)) try(test_scalar(xnum = 1:5)) try(test_scalar(xnum = Sys.Date())) # integer try(test_scalar(xint = 5.5)) try(test_scalar(xint = -1)) # num < 0.15 try(test_scalar(xnumlt = 0.15)) try(test_scalar(xnumlt = 0.16)) try(test_scalar(xnumlt = Sys.Date())) # Date try(test_scalar(xdate = 0.15)) # # II) Examples for the globals: NULL, MBT, eval, evalset # test_globals = function(xnum, xlog = TRUE, xint){ # Default setting with NULL is only available in check_set_arg # MBT (must be there) throws an error if the user doesn't provide the argument check_set_arg(xnum, "numeric vector NULL{1} MBT") # NULL allows NULL values check_arg(xlog, "logical scalar safe NULL") check_arg(xint, "integer vector") list(xnum = xnum, xlog = xlog) } # xnum is required because of MBT option try(test_globals()) # NULL{expr} sets the value of xnum to expr if xnum = NULL # Here NULL{1} sets xnum to 1 test_globals(xnum = NULL) # NULL (not NULL{expr}) does not reassign: xlog remains NULL test_globals(xnum = NULL, xlog = NULL) # safe NULL: doesn't accept NULL from data.frame (DF) subselection # ex: the variable 'log' does not exist in the iris DF try(test_globals(5, xlog = iris$log)) # but xnum accepts it test_globals(iris$log) # # eval and evalset # test_eval = function(x1, x2, data = list(), i = c()){ check_arg(x1, "eval numeric vector", .data = data) # evalset is in check_set_arg check_set_arg(x2, "evalset numeric vector", .data = data) # We show the variables if(1 %in% i){ cat("x1:\n") print(as.character(try(x1, silent = TRUE))) } if(2 %in% i){ cat("x2:\n") print(as.character(try(x2, silent = TRUE))) } } # eval: evaluates the argument both in the environment and the data test_eval(x1 = Sepal.Length, data = iris) # OK # if we use a variable not in the environment nor in the data => error try(test_eval(x1 = Sopal.Length, data = iris)) # but eval doesn't reassign back the value of the argument: test_eval(x1 = Sepal.Length, data = iris, i = 1) # evaset does the same as eval, but also reasssigns the value obtained: test_eval(x2 = Sepal.Length, data = iris, i = 2) # # III) Match and charin # # match => does partial matching, only available in check_set_arg # charin => no partial matching, exact values required, but in check_arg # # match # # Note the three different ways to provide the choices # # If the argument has no default, it is kept that way (see x2) # If the argument is not provided by the user, # it is left untouched (see x3) test_match = function(x1 = c("bonjour", "Au revoir"), x2, x3 = "test"){ # 1) choices set thanks to the argument default (like in match.arg) check_set_arg(x1, "strict match") # 2) choices set with the argument .choices check_set_arg(x2, "match", .choices = c("Sarah", "Santa", "Santa Fe", "SANTA")) # 3) choices set with the parentheses check_set_arg(x3, "multi match(Orange, Juice, Good)") cat("x1:", x1, "\nx2:", tryCatch(x2, error = function(e) "[missing]"), "\nx3:", x3, "\n") } # Everything below is OK test_match() test_match(x1 = "Au", x2 = "sar", x3 = c("GOOD", "or")) test_match(x2 = "Santa") # Errors caught: try(test_match(x1 = c("Au", "revoir"))) try(test_match(x1 = "au")) try(test_match(x1 = sum)) try(test_match(x1 = list(a = 1:5))) try(test_match(x2 = "san")) try(test_match(x2 = "santa")) # Same value as x3's default, but now provided by the user try(test_match(x3 = "test")) try(test_match(x3 = c("or", "ju", "bad"))) # You can check multiple arguments at once # [see details for multiple arguments in Section X)] # Note that now the choices must be set in the argument # and they must have the same options (ie multi, strict) test_match_multi = function(x1 = c("bonjour", "Au revoir"), x2 = c("Sarah", "Santa"), x3 = c("Orange", "Juice", "Good")){ # multiple arguments at once check_set_arg(x1, x2, x3, "match") cat("x1:", x1, "\nx2:", x2, "\nx3:", x3, "\n") } test_match_multi() # # charin # # charin is similar to match but requires the user to provide the exact value # only the multi option is available test_charin = function(x1 = "bonjour", x2 = "Sarah"){ # 1) set the choices with .choices check_arg(x1, "charin", .choices = c("bonjour", "au revoir")) # 2) set the choices with the parentheses check_arg(x2, "multi charin(Sarah, Santa, Santa Fe)") cat("x1:", x1, "\nx2:", x2, "\n") } # Now we need the exact values test_charin("au revoir", c("Santa", "Santa Fe")) # Errors when partial matching tried try(test_charin("au re")) # # IV) Vectors and marices, equalities, dimensions and lengths # # You can restrict the length of objects with len(a, b) # - if len(a, b) length must be in between a and b # - if len(a, ) length must be at least a # - if len(, b) length must be at most b # - if len(a) length must be equal to a # You can also use the special keywords len(data) or len(value), # but then the argument .data or .value must also be provided. # (the related example comes later) # # You can restrict the number of rows/columns with nrow(a, b) and ncol(a, b) # # You can restrict a matrix to be square with the 'square' keyword # # You can restrict the values an element can take with GE/GT/LE/LT, # respectively greater or equal/greater than/lower or equal/lower than # The syntax is GE{expr}, with expr any expression # Of course, it only works for numeric values # # By default NAs are tolerated in vector, matrix and data.frame. # You can refuse NAs using the keyword: 'no na' or 'nona' # test_vmat = function(xvec, xmat, xvmat, xstmat, xnamed){ # vector of integers with values between 5 and exp(3) check_arg(xvec, "integer Vector GE{5} LT{exp(3)}") # logical matrix with at least two rows and with 3 columns check_arg(xmat, "logicalMatrix NROW(2,) NCOL(3)") # vector or matrix (vmatrix) of integers or character strings # with at most 3 observations # NAs are not allowed check_arg(xvmat, "vmatrix(character, integer) nrow(,3) no na") # square matrix of integers, logicals reports errors check_arg(xstmat, "strict integer square Matrix") # A vector with names of length 2 check_arg(xnamed, "named Vector len(2)") invisible(NULL) } # OK test_vmat(xvec = 5:20, xmat = matrix(TRUE, 3, 3), xvmat = c("abc", 4, 3), xstmat = matrix(1:4, 2, 2), xnamed = c(bon=1, jour=2)) # Vector checks: try(test_vmat(xvec = 2)) try(test_vmat(xvec = 21)) try(test_vmat(xvec = 5.5)) # Matrix checks: try(test_vmat(xmat = matrix(TRUE, 3, 4))) try(test_vmat(xmat = matrix(2, 3, 3))) try(test_vmat(xmat = matrix(FALSE, 1, 3))) try(test_vmat(xmat = iris)) try(test_vmat(xvmat = iris)) try(test_vmat(xvmat = c(NA, 5))) try(test_vmat(xstmat = matrix(1, 1, 3))) try(test_vmat(xstmat = matrix(c(TRUE, FALSE, NA), 3, 3))) # Named vector checks: try(test_vmat(xnamed = 1:3)) try(test_vmat(xnamed = c(bon=1, jour=2, les=3))) # # Illustration of the keywords 'data', 'value' # # 'value' # Matrix multiplication X * Y * Z test_dynamic_restriction = function(x, y, z){ check_arg(x, "mbt numeric matrix") check_arg(y, "mbt numeric matrix nrow(value)", .value = ncol(x)) check_arg(z, "mbt numeric matrix nrow(value)", .value = ncol(y)) # An alternative to the previous two lines: # check_arg(z, "mbt numeric matrix") # check_arg(y, "mbt numeric matrix nrow(value) ncol(value)", # .value = list(nrow = ncol(x), ncol = nrow(z))) x %*% y %*% z } x = matrix(1, 2, 3) y = matrix(2, 3, 5) z = matrix(rnorm(10), 5, 2) test_dynamic_restriction(x, y, z) # Now error try(test_dynamic_restriction(x, matrix(5, 1, 2), z)) # 'data' # Computing maximum difference between two matrices test_dynamic_bis = function(x, y){ check_arg(x, "mbt numeric matrix") # we require y to be of the same dimension as x check_arg(y, "mbt numeric matrix nrow(data) ncol(data)", .data = x) max(abs(x - y)) } test_dynamic_bis(x, x) # Now error try(test_dynamic_bis(x, y)) # # V) Functions and lists # # You can restrict the number of arguments of a # function with arg(a, b) [see Section IV) for details] test_funlist = function(xfun, xlist){ check_arg(xfun, "function arg(1,2)") check_arg(xlist, "list len(,3)") invisible(NULL) } # OK test_funlist(xfun = sum, xlist = iris[c(1,2)]) # function checks: try(test_funlist(xfun = function(x, y, z) x + y + z)) # list checks: try(test_funlist(xlist = iris[1:4])) try(test_funlist(xlist = list())) # # VI) Data.frame and custom class # test_df = function(xdf, xvdf, xcustom){ # data.frame with at least 100 observations check_arg(xdf, "data.frame nrow(100,)") # data.frame or vector (vdata.frame) check_arg(xvdf, "vdata.frame") # Either: i) object of class glm or lm # ii) NA # iii) NULL check_arg(xcustom, "class(lm, glm)|NA|null") invisible(NULL) } # OK m = lm(Sepal.Length~Species, iris) test_df(xdf = iris, xcustom = m) test_df(xvdf = iris$Sepal.Length) test_df(xcustom = NULL) # data.frame checks: try(test_df(xdf = iris[1:50,])) try(test_df(xdf = iris[integer(0)])) try(test_df(xdf = iris$Sepal.Length)) # Note that the following works: test_df(xvdf = iris$Sepal.Length) # Custom class checks: try(test_df(xcustom = iris)) # # VIII) Formulas # # The keyword is 'formula' # You can restrict the formula to be: # - one sided with 'os' # - two sided with 'ts' # # You can restrict that the variables of a forumula must be in # a data set or in the environment with var(data, env) # - var(data) => variables must be in the data set # - var(env) => variables must be in the environment # - var(data, env) => variables must be in the data set or in the environment # Of course, if var(data), you must provide a data set # # Checking multipart formulas is included. You can use left(a, b) # and right(a, b) to put restrictions in the number of parts allowed # in the left and right-hand-sides # test_formulas = function(fml1, fml2, fml3, fml4, data = iris){ # Regular formula, variables must be in the data set check_arg(fml1, "formula var(data)", .data = data) # One sided formula, variables in the environment check_arg(fml2, "os formula var(env)") # Two sided formula, variables in the data set or in the env. check_arg(fml3, "ts formula var(data, env)", .data = data) # One or two sided, at most two parts in the RHS, at most 1 in the LHS check_arg(fml4, "formula left(,1) right(,2)") invisible(NULL) } # We set x1 in the environment x1 = 5 # Works test_formulas(~Sepal.Length, ~x1, Sepal.Length~x1, a ~ b, data = iris) # Now let's see errors try(test_formulas(Sepal.Length~x1, data = iris)) try(test_formulas(fml2 = ~Sepal.Length, data = iris)) try(test_formulas(fml2 = Sepal.Length~x1, data = iris)) try(test_formulas(fml3 = ~x1, data = iris)) try(test_formulas(fml3 = x1~x555, data = iris)) try(test_formulas(fml4 = a ~ b | c | d)) try(test_formulas(fml4 = a | b ~ c | d)) # # IX) Multiple types # # You can check multiple types using a pipe: '|' # Note that global keywords (like NULL, eval, etc) need not be # separated by pipes. They can be anywhere, the following are identical: # - "character scalar | data.frame NULL" # - "NULL character scalar | data.frame" # - "character scalar NULL | data.frame" # - "character scalar | data.frame | NULL" # test_mult = function(x){ # x must be either: # i) a numeric vector of length at least 2 # ii) a square character matrix # iii) an integer scalar (vector of length 1) check_arg(x, "numeric vector len(2,) | square character matrix | integer scalar") invisible(NULL) } # OK test_mult(1) test_mult(1:2) test_mult(matrix("ok", 1, 1)) # Not OK, notice the very detailed error messages try(test_mult(matrix("bonjour", 1, 2))) try(test_mult(1.1)) # # X) Multiple arguments # # You can check multiple arguments at once if they have the same type. # You can add the type where you want but it must be a character literal. # You can check up to 10 arguments with the same type. test_multiarg = function(xlog1, xlog2, xnum1, xnum2, xnum3){ # checking the logicals check_arg(xlog1, xlog2, "logical scalar") # checking the numerics # => Alternatively, you can add the type first check_arg("numeric vector", xnum1, xnum2, xnum3) invisible(NULL) } # Let's throw some errors try(test_multiarg(xlog2 = 4)) try(test_multiarg(xnum3 = "test")) # # XI) Multiple sub-stypes # # For atomic arguments (like vector or matrices), # you can check the type of underlying data: is it integer, numeric, etc? # There are five simple sub-types: # - integer # - numeric # - factor # - logical # - loose logical: either TRUE/FALSE, either 0/1 # # If you require that the data is of one sub-type only: # - a) if it's one of the simple sub-types: add the keyword directly in the type # - b) otherwise: add the sub-type in parentheses # # Note that the parentheses MUST follow the main class directly. # # Example: # - a) "integer scalar" # - b) "scalar(Date)" # # If you want to check multiple sub-types: you must add them in parentheses. # Again, the parentheses MUST follow the main class directly. # Examples: # "vector(character, factor)" # "scalar(integer, logical)" # "matrix(Date, integer, logical)" # # In check_set_arg, you can use the keyword "conv" to convert to the # desired type # test_multi_subtypes = function(x, y){ check_arg(x, "scalar(integer, logical)") check_arg(y, "vector(character, factor, Date)") invisible(NULL) } # What follows doesn't work try(test_multi_subtypes(x = 5.5)) # Note that it works if x = 5 # (for check_arg 5 is integer although is.integer(5) returns FALSE) test_multi_subtypes(x = 5) try(test_multi_subtypes(y = 5.5)) # Testing the "conv" keyword: test_conv = function(x, type){ check_set_arg(x, .type = type) x } class(test_conv(5L, "numeric scalar conv")) class(test_conv(5, "integer scalar conv")) class(test_conv(5, "integer scalar")) # You can use the "conv" keyword in multi-types # Remember that types are checked in ORDER! (see the behavior) test_conv(5:1, "vector(logical, character conv)") test_conv(c(TRUE, FALSE), "vector(logical, character conv)") # # XII) Nested checking: using .up # # Say you have two user level functions # But you do all the computation in an internal function. # The error message should be at the level of the user-level function # You can use the argument .up to do that # sum_fun = function(x, y){ my_internal(x, y, sum = TRUE) } diff_fun = function(x, y){ my_internal(x, y, sum = FALSE) } my_internal = function(x, y, sum){ # The error messages will be at the level of the user-level functions # which are 1 up the stack check_arg(x, y, "numeric scalar mbt", .up = 1) if(sum) return(x + y) return(x - y) } # we check it works sum_fun(5, 6) diff_fun(5, 6) # Let's throw some errors try(sum_fun(5)) try(diff_fun(5, 1:5)) # The errors are at the level of sum_fun/diff_fun although # the arguments have been checked in my_internal. # => much easier for the user to understand the problem # # XIII) Using check_set_arg to check and set list defaults # # Sometimes it is useful to have arguments that are themselves # list of arguments. # Witch check_set_arg you can check the arguments nested in lists # and easily set default values at the same time. # # When you check a list element, you MUST use the syntax argument$element # # Function that performs a regression then plots it plot_cor = function(x, y, lm.opts = list(), plot.opts = list(), line.opts = list()){ check_arg(x, y, "numeric vector") # First we ensure the arguments are lists check_arg(lm.opts, plot.opts, line.opts, "named list") # The linear regression lm.opts$formula = y ~ x reg = do.call("lm", lm.opts) # plotting the correlation, with defaults check_set_arg(plot.opts$main, "character scalar NULL{'Correlation between x and y'}") # you can use variables created in the function when setting the default x_name = deparse(substitute(x)) check_set_arg(plot.opts$xlab, "character scalar NULL{x_name}") check_set_arg(plot.opts$ylab, "character scalar NULL{'y'}") # we restrict to only two plotting types: p or h check_set_arg(plot.opts$type, "NULL{'p'} match(p, h)") plot.opts$x = x plot.opts$y = y do.call("plot", plot.opts) # with the fit check_set_arg(line.opts$col, "NULL{'firebrick'}") # no checking but default setting check_set_arg(line.opts$lwd, "integer scalar GE{0} NULL{2}") # check + default line.opts$a = reg do.call("abline", line.opts) } sepal_length = iris$Sepal.Length ; y = iris$Sepal.Width plot_cor(sepal_length, y) plot_cor(sepal_length, y, plot.opts = list(col = iris$Species, main = "Another title")) # Now throwing errors try(plot_cor(sepal_length, y, plot.opts = list(type = "l"))) try(plot_cor(sepal_length, y, line.opts = list(lwd = -50))) # # XIV) Checking '...' (dot-dot-dot) # # You can also check the '...' argument if you expect all objects # to be of the same type. # # To do so, you MUST place the ... in the first argument of check_arg # sum_check = function(...){ # we want each element of ... to be numeric vectors without NAs # we want at least one element to be there (mbt) check_arg(..., "numeric vector mbt") # once the check is done, we apply sum sum(...) } sum_check(1:5, 5:20) # Now let's compare the behavior of sum_check() with that of sum() # in the presence of errors x = 1:5 ; y = pt try(sum_check(x, y)) try(sum(x, y)) # As you can see, in the first call, it's very easy to spot and debug the problem # while in the second call it's almost impossible # # XV) Developer mode # # If you're new to check_arg, given the many types available, # it's very common to make mistakes when creating check_arg calls. # The developer mode ensures that any problematic call is spotted # and the problem is clearly stated # # Note that since this mode ensures a detailed cheking of the call # it is thus a strain on performance and should be always turned off # otherwise needed. # # Setting the developer mode on: setDreamerr_dev.mode(TRUE) # Creating some 'wrong' calls => the problem is pinpointed test_err1 = function(x) check_arg(x, "integer scalar", "numeric vector") try(test_err1()) test_err2 = function(...) check_arg("numeric vector", ...) try(test_err2()) test_err3 = function(x) check_arg(x$a, "numeric vector") try(test_err3()) test_err4 = function(x) check_arg(x, "numeric vector integer") try(test_err4()) # Setting the developer mode off: setDreamerr_dev.mode(FALSE) # # XVI) Using check_value # # The main function for checking arguments is check_arg. # But sometimes you only know if an argument is valid after # having perfomed some modifications on it. # => that's when check_value kicks in. # # It's better with an example. # # In this example we'll construct a plotting function # using a formula, with a rock-solid argument checking. # # Plotting function, but using a formula # You want to plot only numeric values plot_fml = function(fml, data, ...){ # We first check the arguments check_arg(data, "data.frame mbt") check_arg(fml, "ts formula mbt var(data)", .data = data) # We extract the values of the formula y = fml[[2]] x = fml[[3]] # Now we check that x and y are valid => with check_value # We also use the possibility to assign the value of y and x directly # We add a custom message because y/x are NOT arguments check_set_value(y, "evalset numeric vector", .data = data, .message = "In the argument 'fml', the LHS must be numeric.") check_set_value(x, "evalset numeric vector", .data = data, .message = "In the argument 'fml', the RHS must be numeric.") # The dots => only arguments to plot are valid args_ok = c(formalArgs(plot.default), names(par())) validate_dots(valid_args = args_ok, stop = TRUE) # We also set the xlab/ylab dots = list(...) # dots has a special meaning in check_value (no need to pass .message) check_set_value(dots$ylab, "NULL{deparse(fml[[2]])} character vector conv len(,3)") check_set_value(dots$xlab, "NULL{deparse(fml[[3]])} character vector conv len(,3)") dots$y = y dots$x = x do.call("plot", dots) } # Let's check it works plot_fml(Sepal.Length ~ Petal.Length + Sepal.Width, iris) plot_fml(Sepal.Length ~ Petal.Length + Sepal.Width, iris, xlab = "Not the default xlab") # Now let's throw some errors try(plot_fml(Sepal.Length ~ Species, iris)) try(plot_fml(Sepal.Length ~ Petal.Length, iris, xlab = iris)) try(plot_fml(Sepal.Length ~ Petal.Length, iris, xlab = iris$Species))
# check_arg is only used within functions # # I) Example for the main class "scalar" # test_scalar = function(xlog, xnum, xint, xnumlt, xdate){ # when forming the type: you can see that case, order and spaces don't matter check_arg(xlog, "scalarLogical") check_arg(xnum, "numeric scalar") check_arg(xint, " scalar Integer GE{0} ") check_arg(xnumlt, "numeric scalar lt{0.15}") # Below it is critical that there's no space between scalar and the parenthesis check_arg(xdate, "scalar(Date)") invisible(NULL) } # Following is OK test_scalar() test_scalar(xlog = FALSE, xnum = 55, xint = 5, xnumlt = 0.11, xdate = Sys.Date()) # # Now errors, all the following are wrong arguments, leading to errors # Please note the details in the error messages. # logical try(test_scalar(xlog = NA)) try(test_scalar(xlog = 2)) try(test_scalar(xlog = sum)) try(test_scalar(xlog = faefeaf5)) try(test_scalar(xlog = c(TRUE, FALSE))) try(test_scalar(xlog = c())) # numeric try(test_scalar(xnum = NA)) try(test_scalar(xnum = 1:5)) try(test_scalar(xnum = Sys.Date())) # integer try(test_scalar(xint = 5.5)) try(test_scalar(xint = -1)) # num < 0.15 try(test_scalar(xnumlt = 0.15)) try(test_scalar(xnumlt = 0.16)) try(test_scalar(xnumlt = Sys.Date())) # Date try(test_scalar(xdate = 0.15)) # # II) Examples for the globals: NULL, MBT, eval, evalset # test_globals = function(xnum, xlog = TRUE, xint){ # Default setting with NULL is only available in check_set_arg # MBT (must be there) throws an error if the user doesn't provide the argument check_set_arg(xnum, "numeric vector NULL{1} MBT") # NULL allows NULL values check_arg(xlog, "logical scalar safe NULL") check_arg(xint, "integer vector") list(xnum = xnum, xlog = xlog) } # xnum is required because of MBT option try(test_globals()) # NULL{expr} sets the value of xnum to expr if xnum = NULL # Here NULL{1} sets xnum to 1 test_globals(xnum = NULL) # NULL (not NULL{expr}) does not reassign: xlog remains NULL test_globals(xnum = NULL, xlog = NULL) # safe NULL: doesn't accept NULL from data.frame (DF) subselection # ex: the variable 'log' does not exist in the iris DF try(test_globals(5, xlog = iris$log)) # but xnum accepts it test_globals(iris$log) # # eval and evalset # test_eval = function(x1, x2, data = list(), i = c()){ check_arg(x1, "eval numeric vector", .data = data) # evalset is in check_set_arg check_set_arg(x2, "evalset numeric vector", .data = data) # We show the variables if(1 %in% i){ cat("x1:\n") print(as.character(try(x1, silent = TRUE))) } if(2 %in% i){ cat("x2:\n") print(as.character(try(x2, silent = TRUE))) } } # eval: evaluates the argument both in the environment and the data test_eval(x1 = Sepal.Length, data = iris) # OK # if we use a variable not in the environment nor in the data => error try(test_eval(x1 = Sopal.Length, data = iris)) # but eval doesn't reassign back the value of the argument: test_eval(x1 = Sepal.Length, data = iris, i = 1) # evaset does the same as eval, but also reasssigns the value obtained: test_eval(x2 = Sepal.Length, data = iris, i = 2) # # III) Match and charin # # match => does partial matching, only available in check_set_arg # charin => no partial matching, exact values required, but in check_arg # # match # # Note the three different ways to provide the choices # # If the argument has no default, it is kept that way (see x2) # If the argument is not provided by the user, # it is left untouched (see x3) test_match = function(x1 = c("bonjour", "Au revoir"), x2, x3 = "test"){ # 1) choices set thanks to the argument default (like in match.arg) check_set_arg(x1, "strict match") # 2) choices set with the argument .choices check_set_arg(x2, "match", .choices = c("Sarah", "Santa", "Santa Fe", "SANTA")) # 3) choices set with the parentheses check_set_arg(x3, "multi match(Orange, Juice, Good)") cat("x1:", x1, "\nx2:", tryCatch(x2, error = function(e) "[missing]"), "\nx3:", x3, "\n") } # Everything below is OK test_match() test_match(x1 = "Au", x2 = "sar", x3 = c("GOOD", "or")) test_match(x2 = "Santa") # Errors caught: try(test_match(x1 = c("Au", "revoir"))) try(test_match(x1 = "au")) try(test_match(x1 = sum)) try(test_match(x1 = list(a = 1:5))) try(test_match(x2 = "san")) try(test_match(x2 = "santa")) # Same value as x3's default, but now provided by the user try(test_match(x3 = "test")) try(test_match(x3 = c("or", "ju", "bad"))) # You can check multiple arguments at once # [see details for multiple arguments in Section X)] # Note that now the choices must be set in the argument # and they must have the same options (ie multi, strict) test_match_multi = function(x1 = c("bonjour", "Au revoir"), x2 = c("Sarah", "Santa"), x3 = c("Orange", "Juice", "Good")){ # multiple arguments at once check_set_arg(x1, x2, x3, "match") cat("x1:", x1, "\nx2:", x2, "\nx3:", x3, "\n") } test_match_multi() # # charin # # charin is similar to match but requires the user to provide the exact value # only the multi option is available test_charin = function(x1 = "bonjour", x2 = "Sarah"){ # 1) set the choices with .choices check_arg(x1, "charin", .choices = c("bonjour", "au revoir")) # 2) set the choices with the parentheses check_arg(x2, "multi charin(Sarah, Santa, Santa Fe)") cat("x1:", x1, "\nx2:", x2, "\n") } # Now we need the exact values test_charin("au revoir", c("Santa", "Santa Fe")) # Errors when partial matching tried try(test_charin("au re")) # # IV) Vectors and marices, equalities, dimensions and lengths # # You can restrict the length of objects with len(a, b) # - if len(a, b) length must be in between a and b # - if len(a, ) length must be at least a # - if len(, b) length must be at most b # - if len(a) length must be equal to a # You can also use the special keywords len(data) or len(value), # but then the argument .data or .value must also be provided. # (the related example comes later) # # You can restrict the number of rows/columns with nrow(a, b) and ncol(a, b) # # You can restrict a matrix to be square with the 'square' keyword # # You can restrict the values an element can take with GE/GT/LE/LT, # respectively greater or equal/greater than/lower or equal/lower than # The syntax is GE{expr}, with expr any expression # Of course, it only works for numeric values # # By default NAs are tolerated in vector, matrix and data.frame. # You can refuse NAs using the keyword: 'no na' or 'nona' # test_vmat = function(xvec, xmat, xvmat, xstmat, xnamed){ # vector of integers with values between 5 and exp(3) check_arg(xvec, "integer Vector GE{5} LT{exp(3)}") # logical matrix with at least two rows and with 3 columns check_arg(xmat, "logicalMatrix NROW(2,) NCOL(3)") # vector or matrix (vmatrix) of integers or character strings # with at most 3 observations # NAs are not allowed check_arg(xvmat, "vmatrix(character, integer) nrow(,3) no na") # square matrix of integers, logicals reports errors check_arg(xstmat, "strict integer square Matrix") # A vector with names of length 2 check_arg(xnamed, "named Vector len(2)") invisible(NULL) } # OK test_vmat(xvec = 5:20, xmat = matrix(TRUE, 3, 3), xvmat = c("abc", 4, 3), xstmat = matrix(1:4, 2, 2), xnamed = c(bon=1, jour=2)) # Vector checks: try(test_vmat(xvec = 2)) try(test_vmat(xvec = 21)) try(test_vmat(xvec = 5.5)) # Matrix checks: try(test_vmat(xmat = matrix(TRUE, 3, 4))) try(test_vmat(xmat = matrix(2, 3, 3))) try(test_vmat(xmat = matrix(FALSE, 1, 3))) try(test_vmat(xmat = iris)) try(test_vmat(xvmat = iris)) try(test_vmat(xvmat = c(NA, 5))) try(test_vmat(xstmat = matrix(1, 1, 3))) try(test_vmat(xstmat = matrix(c(TRUE, FALSE, NA), 3, 3))) # Named vector checks: try(test_vmat(xnamed = 1:3)) try(test_vmat(xnamed = c(bon=1, jour=2, les=3))) # # Illustration of the keywords 'data', 'value' # # 'value' # Matrix multiplication X * Y * Z test_dynamic_restriction = function(x, y, z){ check_arg(x, "mbt numeric matrix") check_arg(y, "mbt numeric matrix nrow(value)", .value = ncol(x)) check_arg(z, "mbt numeric matrix nrow(value)", .value = ncol(y)) # An alternative to the previous two lines: # check_arg(z, "mbt numeric matrix") # check_arg(y, "mbt numeric matrix nrow(value) ncol(value)", # .value = list(nrow = ncol(x), ncol = nrow(z))) x %*% y %*% z } x = matrix(1, 2, 3) y = matrix(2, 3, 5) z = matrix(rnorm(10), 5, 2) test_dynamic_restriction(x, y, z) # Now error try(test_dynamic_restriction(x, matrix(5, 1, 2), z)) # 'data' # Computing maximum difference between two matrices test_dynamic_bis = function(x, y){ check_arg(x, "mbt numeric matrix") # we require y to be of the same dimension as x check_arg(y, "mbt numeric matrix nrow(data) ncol(data)", .data = x) max(abs(x - y)) } test_dynamic_bis(x, x) # Now error try(test_dynamic_bis(x, y)) # # V) Functions and lists # # You can restrict the number of arguments of a # function with arg(a, b) [see Section IV) for details] test_funlist = function(xfun, xlist){ check_arg(xfun, "function arg(1,2)") check_arg(xlist, "list len(,3)") invisible(NULL) } # OK test_funlist(xfun = sum, xlist = iris[c(1,2)]) # function checks: try(test_funlist(xfun = function(x, y, z) x + y + z)) # list checks: try(test_funlist(xlist = iris[1:4])) try(test_funlist(xlist = list())) # # VI) Data.frame and custom class # test_df = function(xdf, xvdf, xcustom){ # data.frame with at least 100 observations check_arg(xdf, "data.frame nrow(100,)") # data.frame or vector (vdata.frame) check_arg(xvdf, "vdata.frame") # Either: i) object of class glm or lm # ii) NA # iii) NULL check_arg(xcustom, "class(lm, glm)|NA|null") invisible(NULL) } # OK m = lm(Sepal.Length~Species, iris) test_df(xdf = iris, xcustom = m) test_df(xvdf = iris$Sepal.Length) test_df(xcustom = NULL) # data.frame checks: try(test_df(xdf = iris[1:50,])) try(test_df(xdf = iris[integer(0)])) try(test_df(xdf = iris$Sepal.Length)) # Note that the following works: test_df(xvdf = iris$Sepal.Length) # Custom class checks: try(test_df(xcustom = iris)) # # VIII) Formulas # # The keyword is 'formula' # You can restrict the formula to be: # - one sided with 'os' # - two sided with 'ts' # # You can restrict that the variables of a forumula must be in # a data set or in the environment with var(data, env) # - var(data) => variables must be in the data set # - var(env) => variables must be in the environment # - var(data, env) => variables must be in the data set or in the environment # Of course, if var(data), you must provide a data set # # Checking multipart formulas is included. You can use left(a, b) # and right(a, b) to put restrictions in the number of parts allowed # in the left and right-hand-sides # test_formulas = function(fml1, fml2, fml3, fml4, data = iris){ # Regular formula, variables must be in the data set check_arg(fml1, "formula var(data)", .data = data) # One sided formula, variables in the environment check_arg(fml2, "os formula var(env)") # Two sided formula, variables in the data set or in the env. check_arg(fml3, "ts formula var(data, env)", .data = data) # One or two sided, at most two parts in the RHS, at most 1 in the LHS check_arg(fml4, "formula left(,1) right(,2)") invisible(NULL) } # We set x1 in the environment x1 = 5 # Works test_formulas(~Sepal.Length, ~x1, Sepal.Length~x1, a ~ b, data = iris) # Now let's see errors try(test_formulas(Sepal.Length~x1, data = iris)) try(test_formulas(fml2 = ~Sepal.Length, data = iris)) try(test_formulas(fml2 = Sepal.Length~x1, data = iris)) try(test_formulas(fml3 = ~x1, data = iris)) try(test_formulas(fml3 = x1~x555, data = iris)) try(test_formulas(fml4 = a ~ b | c | d)) try(test_formulas(fml4 = a | b ~ c | d)) # # IX) Multiple types # # You can check multiple types using a pipe: '|' # Note that global keywords (like NULL, eval, etc) need not be # separated by pipes. They can be anywhere, the following are identical: # - "character scalar | data.frame NULL" # - "NULL character scalar | data.frame" # - "character scalar NULL | data.frame" # - "character scalar | data.frame | NULL" # test_mult = function(x){ # x must be either: # i) a numeric vector of length at least 2 # ii) a square character matrix # iii) an integer scalar (vector of length 1) check_arg(x, "numeric vector len(2,) | square character matrix | integer scalar") invisible(NULL) } # OK test_mult(1) test_mult(1:2) test_mult(matrix("ok", 1, 1)) # Not OK, notice the very detailed error messages try(test_mult(matrix("bonjour", 1, 2))) try(test_mult(1.1)) # # X) Multiple arguments # # You can check multiple arguments at once if they have the same type. # You can add the type where you want but it must be a character literal. # You can check up to 10 arguments with the same type. test_multiarg = function(xlog1, xlog2, xnum1, xnum2, xnum3){ # checking the logicals check_arg(xlog1, xlog2, "logical scalar") # checking the numerics # => Alternatively, you can add the type first check_arg("numeric vector", xnum1, xnum2, xnum3) invisible(NULL) } # Let's throw some errors try(test_multiarg(xlog2 = 4)) try(test_multiarg(xnum3 = "test")) # # XI) Multiple sub-stypes # # For atomic arguments (like vector or matrices), # you can check the type of underlying data: is it integer, numeric, etc? # There are five simple sub-types: # - integer # - numeric # - factor # - logical # - loose logical: either TRUE/FALSE, either 0/1 # # If you require that the data is of one sub-type only: # - a) if it's one of the simple sub-types: add the keyword directly in the type # - b) otherwise: add the sub-type in parentheses # # Note that the parentheses MUST follow the main class directly. # # Example: # - a) "integer scalar" # - b) "scalar(Date)" # # If you want to check multiple sub-types: you must add them in parentheses. # Again, the parentheses MUST follow the main class directly. # Examples: # "vector(character, factor)" # "scalar(integer, logical)" # "matrix(Date, integer, logical)" # # In check_set_arg, you can use the keyword "conv" to convert to the # desired type # test_multi_subtypes = function(x, y){ check_arg(x, "scalar(integer, logical)") check_arg(y, "vector(character, factor, Date)") invisible(NULL) } # What follows doesn't work try(test_multi_subtypes(x = 5.5)) # Note that it works if x = 5 # (for check_arg 5 is integer although is.integer(5) returns FALSE) test_multi_subtypes(x = 5) try(test_multi_subtypes(y = 5.5)) # Testing the "conv" keyword: test_conv = function(x, type){ check_set_arg(x, .type = type) x } class(test_conv(5L, "numeric scalar conv")) class(test_conv(5, "integer scalar conv")) class(test_conv(5, "integer scalar")) # You can use the "conv" keyword in multi-types # Remember that types are checked in ORDER! (see the behavior) test_conv(5:1, "vector(logical, character conv)") test_conv(c(TRUE, FALSE), "vector(logical, character conv)") # # XII) Nested checking: using .up # # Say you have two user level functions # But you do all the computation in an internal function. # The error message should be at the level of the user-level function # You can use the argument .up to do that # sum_fun = function(x, y){ my_internal(x, y, sum = TRUE) } diff_fun = function(x, y){ my_internal(x, y, sum = FALSE) } my_internal = function(x, y, sum){ # The error messages will be at the level of the user-level functions # which are 1 up the stack check_arg(x, y, "numeric scalar mbt", .up = 1) if(sum) return(x + y) return(x - y) } # we check it works sum_fun(5, 6) diff_fun(5, 6) # Let's throw some errors try(sum_fun(5)) try(diff_fun(5, 1:5)) # The errors are at the level of sum_fun/diff_fun although # the arguments have been checked in my_internal. # => much easier for the user to understand the problem # # XIII) Using check_set_arg to check and set list defaults # # Sometimes it is useful to have arguments that are themselves # list of arguments. # Witch check_set_arg you can check the arguments nested in lists # and easily set default values at the same time. # # When you check a list element, you MUST use the syntax argument$element # # Function that performs a regression then plots it plot_cor = function(x, y, lm.opts = list(), plot.opts = list(), line.opts = list()){ check_arg(x, y, "numeric vector") # First we ensure the arguments are lists check_arg(lm.opts, plot.opts, line.opts, "named list") # The linear regression lm.opts$formula = y ~ x reg = do.call("lm", lm.opts) # plotting the correlation, with defaults check_set_arg(plot.opts$main, "character scalar NULL{'Correlation between x and y'}") # you can use variables created in the function when setting the default x_name = deparse(substitute(x)) check_set_arg(plot.opts$xlab, "character scalar NULL{x_name}") check_set_arg(plot.opts$ylab, "character scalar NULL{'y'}") # we restrict to only two plotting types: p or h check_set_arg(plot.opts$type, "NULL{'p'} match(p, h)") plot.opts$x = x plot.opts$y = y do.call("plot", plot.opts) # with the fit check_set_arg(line.opts$col, "NULL{'firebrick'}") # no checking but default setting check_set_arg(line.opts$lwd, "integer scalar GE{0} NULL{2}") # check + default line.opts$a = reg do.call("abline", line.opts) } sepal_length = iris$Sepal.Length ; y = iris$Sepal.Width plot_cor(sepal_length, y) plot_cor(sepal_length, y, plot.opts = list(col = iris$Species, main = "Another title")) # Now throwing errors try(plot_cor(sepal_length, y, plot.opts = list(type = "l"))) try(plot_cor(sepal_length, y, line.opts = list(lwd = -50))) # # XIV) Checking '...' (dot-dot-dot) # # You can also check the '...' argument if you expect all objects # to be of the same type. # # To do so, you MUST place the ... in the first argument of check_arg # sum_check = function(...){ # we want each element of ... to be numeric vectors without NAs # we want at least one element to be there (mbt) check_arg(..., "numeric vector mbt") # once the check is done, we apply sum sum(...) } sum_check(1:5, 5:20) # Now let's compare the behavior of sum_check() with that of sum() # in the presence of errors x = 1:5 ; y = pt try(sum_check(x, y)) try(sum(x, y)) # As you can see, in the first call, it's very easy to spot and debug the problem # while in the second call it's almost impossible # # XV) Developer mode # # If you're new to check_arg, given the many types available, # it's very common to make mistakes when creating check_arg calls. # The developer mode ensures that any problematic call is spotted # and the problem is clearly stated # # Note that since this mode ensures a detailed cheking of the call # it is thus a strain on performance and should be always turned off # otherwise needed. # # Setting the developer mode on: setDreamerr_dev.mode(TRUE) # Creating some 'wrong' calls => the problem is pinpointed test_err1 = function(x) check_arg(x, "integer scalar", "numeric vector") try(test_err1()) test_err2 = function(...) check_arg("numeric vector", ...) try(test_err2()) test_err3 = function(x) check_arg(x$a, "numeric vector") try(test_err3()) test_err4 = function(x) check_arg(x, "numeric vector integer") try(test_err4()) # Setting the developer mode off: setDreamerr_dev.mode(FALSE) # # XVI) Using check_value # # The main function for checking arguments is check_arg. # But sometimes you only know if an argument is valid after # having perfomed some modifications on it. # => that's when check_value kicks in. # # It's better with an example. # # In this example we'll construct a plotting function # using a formula, with a rock-solid argument checking. # # Plotting function, but using a formula # You want to plot only numeric values plot_fml = function(fml, data, ...){ # We first check the arguments check_arg(data, "data.frame mbt") check_arg(fml, "ts formula mbt var(data)", .data = data) # We extract the values of the formula y = fml[[2]] x = fml[[3]] # Now we check that x and y are valid => with check_value # We also use the possibility to assign the value of y and x directly # We add a custom message because y/x are NOT arguments check_set_value(y, "evalset numeric vector", .data = data, .message = "In the argument 'fml', the LHS must be numeric.") check_set_value(x, "evalset numeric vector", .data = data, .message = "In the argument 'fml', the RHS must be numeric.") # The dots => only arguments to plot are valid args_ok = c(formalArgs(plot.default), names(par())) validate_dots(valid_args = args_ok, stop = TRUE) # We also set the xlab/ylab dots = list(...) # dots has a special meaning in check_value (no need to pass .message) check_set_value(dots$ylab, "NULL{deparse(fml[[2]])} character vector conv len(,3)") check_set_value(dots$xlab, "NULL{deparse(fml[[3]])} character vector conv len(,3)") dots$y = y dots$x = x do.call("plot", dots) } # Let's check it works plot_fml(Sepal.Length ~ Petal.Length + Sepal.Width, iris) plot_fml(Sepal.Length ~ Petal.Length + Sepal.Width, iris, xlab = "Not the default xlab") # Now let's throw some errors try(plot_fml(Sepal.Length ~ Species, iris)) try(plot_fml(Sepal.Length ~ Petal.Length, iris, xlab = iris)) try(plot_fml(Sepal.Length ~ Petal.Length, iris, xlab = iris$Species))
This functions checks the evaluation of an expression and, if an error is thrown, captures it and integrates the captured message after a custom error message.
check_expr(expr, ..., clean, up = 0, arg_name, verbatim = FALSE) check_expr_hook(expr, ..., clean, arg_name, verbatim = FALSE) generate_check_expr_hook(namespace)
check_expr(expr, ..., clean, up = 0, arg_name, verbatim = FALSE) check_expr_hook(expr, ..., clean, arg_name, verbatim = FALSE) generate_check_expr_hook(namespace)
expr |
An expression to be evaluated. |
... |
Character scalars. The values of If argument |
clean |
Character vector, default is missing. If provided, the function
|
up |
Integer, default is 0. It is used to construct the call in the error message.
By default the call reported is the function containing |
arg_name |
Character scalar, default is missing. Used when the expression in
|
verbatim |
Logical scalar, default is |
namespace |
Character scalar giving the namespace for which the hooks are valid. Only useful when hook functions are used in a package. |
The purpose of this functions is to provide useful error messages to the user.
check_expr_hook()
: As check_expr
but sets the error call at the level of the hooked function
generate_check_expr_hook()
: Generates a package specific check_expr_hook
function
Laurent Berge
For general argument checking, see check_arg()
and check_set_arg()
.
test = function(x, y){ check_expr(mean(x, y), "Computing the mean didn't work:") }
test = function(x, y){ check_expr(mean(x, y), "Computing the mean didn't work:") }
Transforms a vector into a single character string enumerating the values of the vector. Many options exist to customize the result. The main purpose of this function is to ease the creation of user-level messages.
enumerate_items( x, type, verb = FALSE, s = FALSE, past = FALSE, or = FALSE, start_verb = FALSE, quote = FALSE, enum = FALSE, other = "", nmax = 7 )
enumerate_items( x, type, verb = FALSE, s = FALSE, past = FALSE, or = FALSE, start_verb = FALSE, quote = FALSE, enum = FALSE, other = "", nmax = 7 )
x |
A vector. |
type |
A single character string, optional. If this argument is used, it supersedes all other arguments. It compactly provides the arguments of the function: it must be like |
verb |
Default is |
s |
Logical, default is |
past |
Logical, default is |
or |
Logical, default is |
start_verb |
Logical, default is |
quote |
Logical, default is |
enum |
Logical, default is |
other |
Character scalar, defaults to the empty string: |
nmax |
Integer, default is 7. If |
It returns a character string of lentgh one.
type
The argument type
is a "super argument". When provided, it supersedes all other arguments. It offers a compact way to give the arguments to the function.
Its sytax is as follows: "arg1.arg2.arg2"
, where argX
is an argument code. The codes are "s", "past", "or", "start", "quote", "enum" – they refer to the function arguments. If you want to add a verb, since it can have a free-form, it is deduced as the argument not equal to the previous codes. For example, if you have type = "s.contain"
, this is identical to calling the function with s = TRUE
and verb = "contain"
.
A note on enum
. The argument enum
can be equal to "i", "I", "a", "A" or "1". When you include it in type
, by default "i" is used. If you want another one, add it in the code. For example type = "is.enum a.past"
is identical to calling the function with verb = "is"
, past = TRUE
and enum = "a"
.
Laurent Berge
# Let's say you write an error/information message to the user # I just use the "type" argument but you can obtain the # same results by using regular arguments x = c("x1", "height", "width") message("The variable", enumerate_items(x, "s.is"), " not in the data set.") # Now just the first item message("The variable", enumerate_items(x[1], "s.is"), " not in the data set.") # Past message("The variable", enumerate_items(x, "s.is.past"), " not found.") message("The variable", enumerate_items(x[1], "s.is.past"), " not found.") # Verb first message("The problematic variable", enumerate_items(x, "s.is.start.quote"), ".") message("The problematic variable", enumerate_items(x[1], "s.is.start.quote"), ".") # covid times todo = c("wash your hands", "stay home", "code") message("You should: ", enumerate_items(todo[c(1, 1, 2, 3)], "enum 1"), "!") message("You should: ", enumerate_items(todo, "enum.or"), "?")
# Let's say you write an error/information message to the user # I just use the "type" argument but you can obtain the # same results by using regular arguments x = c("x1", "height", "width") message("The variable", enumerate_items(x, "s.is"), " not in the data set.") # Now just the first item message("The variable", enumerate_items(x[1], "s.is"), " not in the data set.") # Past message("The variable", enumerate_items(x, "s.is.past"), " not found.") message("The variable", enumerate_items(x[1], "s.is.past"), " not found.") # Verb first message("The problematic variable", enumerate_items(x, "s.is.start.quote"), ".") message("The problematic variable", enumerate_items(x[1], "s.is.start.quote"), ".") # covid times todo = c("wash your hands", "stay home", "code") message("You should: ", enumerate_items(todo[c(1, 1, 2, 3)], "enum 1"), "!") message("You should: ", enumerate_items(todo, "enum.or"), "?")
Utility to display long messages with nice formatting. This function cuts the message to fit the current screen width of the R console. Words are never cut in the middle.
fit_screen(msg, width = NULL, leading_ws = TRUE, leader = "")
fit_screen(msg, width = NULL, leading_ws = TRUE, leader = "")
msg |
Text message: character vector. |
width |
A number between 0 and 1, or an integer. The maximum width of the screen the message should take.
Numbers between 0 and 1 represent a fraction of the screen. You can also refer to the
screen width with the special variable |
leading_ws |
Logical, default is |
leader |
Character scalar, default is the empty string. If provided, this value will be placed in front of every line. |
This function does not handle tabulations.
It returns a single character vector with line breaks at the appropriate width.
# A long message of two lines with a few leading spaces msg = enumerate_items(state.name, nmax = Inf) msg = paste0(" ", gsub("Michigan, ", "\n", msg)) # by default the message takes 95% of the screen cat(fit_screen(msg)) # Now we reduce it to 50% cat(fit_screen(msg, 0.5)) # we add leading_ws = FALSE to avoid the continuation of leading WS cat(fit_screen(msg, 0.5, FALSE)) # We add "#> " in front of each line cat(fit_screen(msg, 0.5, leader = "#> "))
# A long message of two lines with a few leading spaces msg = enumerate_items(state.name, nmax = Inf) msg = paste0(" ", gsub("Michigan, ", "\n", msg)) # by default the message takes 95% of the screen cat(fit_screen(msg)) # Now we reduce it to 50% cat(fit_screen(msg, 0.5)) # we add leading_ws = FALSE to avoid the continuation of leading WS cat(fit_screen(msg, 0.5, FALSE)) # We add "#> " in front of each line cat(fit_screen(msg, 0.5, leader = "#> "))
Formatting of numbers, when they are to appear in messages. Displays only significant digits in a "nice way" and adds commas to separate thousands. It does much less than the format
function, but also a bit more though.
fsignif(x, s = 2, r = 0, commas = TRUE) signif_plus
fsignif(x, s = 2, r = 0, commas = TRUE) signif_plus
x |
A numeric vector. |
s |
The number of significant digits to be displayed. Defaults to 2. All digits not in the decimal are always shown. |
r |
For large values, the number of digits after the decimals to be displayed (beyond the number of significant digits). Defaults to 0. It is useful to suggest that a number is not an integer. |
commas |
Whether or not to add commas to separate thousands. Defaults to |
An object of class function
of length 1.
It returns a character vector of the same length as the input.
x = rnorm(1e5) x[sample(1e5, 1e4, TRUE)] = NA # Dumb function telling the number of NA values tell_na = function(x) message("x contains ", fsignif(sum(is.na(x))), " NA values.") tell_na(x) # Some differences with signif: show_diff = function(x, d = 2) cat("signif(x, ", d, ") -> ", signif(x, d), " vs fsignif(x, ", d, ") -> ", fsignif(x, d), "\n", sep = "") # Main difference is for large numbers show_diff(95123.125) show_diff(95123.125, 7) # Identical for small numbers show_diff(pi / 500)
x = rnorm(1e5) x[sample(1e5, 1e4, TRUE)] = NA # Dumb function telling the number of NA values tell_na = function(x) message("x contains ", fsignif(sum(is.na(x))), " NA values.") tell_na(x) # Some differences with signif: show_diff = function(x, d = 2) cat("signif(x, ", d, ") -> ", signif(x, d), " vs fsignif(x, ", d, ") -> ", fsignif(x, d), "\n", sep = "") # Main difference is for large numbers show_diff(95123.125) show_diff(95123.125, 7) # Identical for small numbers show_diff(pi / 500)
When devising complex functions, errors or warnings can be deeply nested in internal function calls while the user-relevant call is way up the stack. In such cases, these "hook" functions facilitate the creation of error/warnings informative for the user.
generate_set_hook(namespace) generate_stop_hook(namespace) generate_warn_hook(namespace) set_hook() generate_get_hook(namespace) stop_hook(..., msg = NULL, envir = parent.frame(), verbatim = FALSE) warn_hook(..., envir = parent.frame(), immediate. = FALSE, verbatim = FALSE)
generate_set_hook(namespace) generate_stop_hook(namespace) generate_warn_hook(namespace) set_hook() generate_get_hook(namespace) stop_hook(..., msg = NULL, envir = parent.frame(), verbatim = FALSE) warn_hook(..., envir = parent.frame(), immediate. = FALSE, verbatim = FALSE)
namespace |
Character scalar giving the namespace for which the hooks are valid. Only useful when hook functions are used in a package. |
... |
Objects that will be coerced to character and will compose the error message. |
msg |
A character vector, default is |
envir |
An environment, default is |
verbatim |
Logical scalar, default is |
immediate. |
Whether the warning message should be prompted directly. Defaults to |
These functions are useful when developing complex functions relying on nested internal functions. It is important for the user to know where the errors/warnings come from for quick debugging. This "_hook" family of functions write the call of the user-level function even if the errors happen at the level of the internal functions.
If you need these functions within a package, you need to generate the set_hook
, stop_hook
and
warn_hook
functions so that they set, and look up for, hooks speficic to your function. This ensures that
if other functions outside your package also use hooks, there will be no conflict. The only thing to do
is to write this somewhere in the package files:
set_hook = generate_set_hook("pkg_name") stop_hook = generate_stop_hook("pkg_name") warn_hook = generate_warn_hook("pkg_name")
generate_set_hook()
: Generates a package specific set_hook
function
generate_stop_hook()
: Generates a package specific stop_hook
function
generate_warn_hook()
: Generates a package specific warn_hook
function
set_hook()
: Marks the function as the hook
generate_get_hook()
: Generates the function giving the number of frames we
need to go up the call stack to find the hooked function
warn_hook()
: Warning with a call located at a hook location
Laurent Berge
Regular stop functions with interpolation: stop_up()
. Regular argument checking
with check_arg()
and check_set_arg()
.
# The example needs to be complex since it's about nested functions, sorry # Let's say you have an internal function that is dispatched into several # user-level functions my_mean = function(x, drop_na = FALSE){ set_hook() my_mean_internal(x = x, drop_na = drop_na) } my_mean_skip_na = function(x){ set_hook() my_mean_internal(x = x, drop_na = TRUE) } my_mean_internal = function(x, drop_na){ # simple check if(!is.numeric(x)){ # note that we use string interpolation with stringmagic. stop_hook("The argument `x` must be numeric. PROBLEM: it is of class {enum.bq ? class(x)}.") } if(drop_na){ return(mean(x, na.rm = TRUE)) } else { return(mean(x, na.rm = FALSE)) } } # Let's run the function with a wrong argument x = "five" try(my_mean(x)) # => the error message reports that the error comes from my_mean # and *not* my_mean_internal
# The example needs to be complex since it's about nested functions, sorry # Let's say you have an internal function that is dispatched into several # user-level functions my_mean = function(x, drop_na = FALSE){ set_hook() my_mean_internal(x = x, drop_na = drop_na) } my_mean_skip_na = function(x){ set_hook() my_mean_internal(x = x, drop_na = TRUE) } my_mean_internal = function(x, drop_na){ # simple check if(!is.numeric(x)){ # note that we use string interpolation with stringmagic. stop_hook("The argument `x` must be numeric. PROBLEM: it is of class {enum.bq ? class(x)}.") } if(drop_na){ return(mean(x, na.rm = TRUE)) } else { return(mean(x, na.rm = FALSE)) } } # Let's run the function with a wrong argument x = "five" try(my_mean(x)) # => the error message reports that the error comes from my_mean # and *not* my_mean_internal
Tiny functions shorter, and hopefully more explicit, than ifelse
.
ifsingle(x, yes, no) ifunit(x, yes, no)
ifsingle(x, yes, no) ifunit(x, yes, no)
x |
A vector ( |
yes |
Something of length 1. Result if the condition is fulfilled. |
no |
Something of length 1. Result if the condition is not fulfilled. |
Yes, ifunit
is identical to ifelse(test == 1, yes, no)
. And regarding ifsingle
, it is identical to ifelse(length(test) == 1, yes, no)
.
Why writing these functions then? Actually, I've found that they make the code more explicit, and this helps!
Returns something of length 1.
ifunit()
: Conditional element selection depending on whether x
is equal to unity or not.
Laurent Berge
# Let's create an error message when NAs are present my_crossprod = function(mat){ if(anyNA(mat)){ row_na = which(rowSums(is.na(mat)) > 0) n_na = length(row_na) stop("In argument 'mat': ", n_letter(n_na), " row", plural(n_na, "s.contain"), " NA values (", ifelse(n_na<=3, "", "e.g. "), "row", enumerate_items(head(row_na, 3), "s"), "). Please remove ", ifunit(n_na, "it", "them"), " first.") } crossprod(mat) } mat = matrix(rnorm(30), 10, 3) mat4 = mat1 = mat mat4[c(1, 7, 13, 28)] = NA mat1[7] = NA # Error raised because of NA: informative (and nice) messages try(my_crossprod(mat4)) try(my_crossprod(mat1))
# Let's create an error message when NAs are present my_crossprod = function(mat){ if(anyNA(mat)){ row_na = which(rowSums(is.na(mat)) > 0) n_na = length(row_na) stop("In argument 'mat': ", n_letter(n_na), " row", plural(n_na, "s.contain"), " NA values (", ifelse(n_na<=3, "", "e.g. "), "row", enumerate_items(head(row_na, 3), "s"), "). Please remove ", ifunit(n_na, "it", "them"), " first.") } crossprod(mat) } mat = matrix(rnorm(30), 10, 3) mat4 = mat1 = mat mat4[c(1, 7, 13, 28)] = NA mat1[7] = NA # Error raised because of NA: informative (and nice) messages try(my_crossprod(mat4)) try(my_crossprod(mat1))
Set of (tiny) functions that convert integers into words.
n_times(n) n_th(n) n_letter(n)
n_times(n) n_th(n) n_letter(n)
n |
An integer vector. |
It returns a character vector of length one.
n_th()
: Transforms the integer n
to nth
appropiately.
n_letter()
: Transforms small integers to words.
Laurent Berge
find = function(v, x){ if(x %in% v){ message("The number ", n_letter(x), " appears ", n_times(sum(v == x)), ", the first occurrence is the ", n_th(which(v==x)[1]), " element.") } else message("The number ", n_letter(x), " was not found.") } v = sample(100, 500, TRUE) find(v, 6)
find = function(v, x){ if(x %in% v){ message("The number ", n_letter(x), " appears ", n_times(sum(v == x)), ", the first occurrence is the ", n_th(which(v==x)[1]), " element.") } else message("The number ", n_letter(x), " was not found.") } v = sample(100, 500, TRUE) find(v, 6)
Summary statistics of a packages: number of lines, number of functions, etc...
package_stats()
package_stats()
This function looks for files in the R/
and src/
folders and gives some stats. If there is no R/
folder directly accessible from the working directory, there will be no stats displayed.
Why this function? Well, it's just some goodies for package developers trying to be user-friendly!
The number of documentation lines (and number of words) corresponds to the number of non-empty roxygen documentation lines. So if you don't document your code with roxygen, well, this stat won't prompt.
Code lines correspond to non-commented, non-empty lines (by non empty: at least one letter must appear).
Comment lines are non-empty comments.
Doesn't return anything, just a prompt in the console.
package_stats()
package_stats()
Utilities to write user-level messages. These functions add an ‘s’ or a verb at the appropriate form depending on whether the argument is equal to unity (plural
) or of length one (plural_len
).
plural(x, type, s, verb = FALSE, past = FALSE) plural_len(x, type, s, verb = FALSE, past = FALSE)
plural(x, type, s, verb = FALSE, past = FALSE) plural_len(x, type, s, verb = FALSE, past = FALSE)
x |
An integer of length one ( |
type |
Character string, default is missing. If |
s |
Logical, used only if the argument type is missing. Whether to add an "s" if the form of |
verb |
Character string or |
past |
Logical, used only if the argument type is missing. Whether the verb should be in past tense. Default is |
Returns a character string of length one.
plural_len()
: Adds an s and conjugate a verb depending on the length of x
Laurent Berge
# Let's create an error message when NAs are present my_crossprod = function(mat){ if(anyNA(mat)){ row_na = which(rowSums(is.na(mat)) > 0) n_na = length(row_na) stop("In argument 'mat': ", n_letter(n_na), " row", plural(n_na, "s.contain"), " NA values (", ifelse(n_na<=3, "", "e.g. "), "row", enumerate_items(head(row_na, 3), "s"), "). Please remove ", ifunit(n_na, "it", "them"), " first.") } crossprod(mat) } mat = matrix(rnorm(30), 10, 3) mat4 = mat1 = mat mat4[c(1, 7, 13, 28)] = NA mat1[7] = NA # Error raised because of NA: informative (and nice) messages try(my_crossprod(mat4)) try(my_crossprod(mat1))
# Let's create an error message when NAs are present my_crossprod = function(mat){ if(anyNA(mat)){ row_na = which(rowSums(is.na(mat)) > 0) n_na = length(row_na) stop("In argument 'mat': ", n_letter(n_na), " row", plural(n_na, "s.contain"), " NA values (", ifelse(n_na<=3, "", "e.g. "), "row", enumerate_items(head(row_na, 3), "s"), "). Please remove ", ifunit(n_na, "it", "them"), " first.") } crossprod(mat) } mat = matrix(rnorm(30), 10, 3) mat4 = mat1 = mat mat4[c(1, 7, 13, 28)] = NA mat1[7] = NA # Error raised because of NA: informative (and nice) messages try(my_crossprod(mat4)) try(my_crossprod(mat1))
You can allow your users to turn off argument checking within your function by using set_check
. Only the functions check_arg
nd check_value
can be turned off that way.
set_check(x)
set_check(x)
x |
A logical scalar, no default. |
This function can be useful if you develop a function that may be used in large range loops (>100K). In such situations, it may be good to still check all arguments, but to offer the user to turn this checking off with an extra argument (named arg.check
for instance). Doing so you would achieve the feat of i) having a user-friendly function thanks to argument checking and, ii) still achieve high performance in large loops (although the computational footprint of argument checking is quite low (around 30 micro seconds for missing arguments to 80 micro seconds for non-missing arguments of simple type)).
# Let's give an example test_check = function(x, y, arg.check = TRUE){ set_check(arg.check) check_arg(x, y, "numeric scalar") x + y } # Works: argument checking on test_check(1, 2) # If mistake, nice error msg try(test_check(1, "a")) # Now argument checking turned off test_check(1, 2, FALSE) # But if mistake: "not nice" error message try(test_check(1, "a", FALSE))
# Let's give an example test_check = function(x, y, arg.check = TRUE){ set_check(arg.check) check_arg(x, y, "numeric scalar") x + y } # Works: argument checking on test_check(1, 2) # If mistake, nice error msg try(test_check(1, "a")) # Now argument checking turned off test_check(1, 2, FALSE) # But if mistake: "not nice" error message try(test_check(1, "a", FALSE))
When check_arg
(or stop_up
) is used in non user-level functions, the argument .up
is used to provide an appropriate error message referencing the right function.
set_up(.up = 1)
set_up(.up = 1)
.up |
An integer greater or equal to 0. |
To avoid repeating the argument .up
in each check_arg
call, you can set it (kind of) "globally" with set_up
.
The function set_up
does not set the argument up
globally, but only for all calls to check_arg
and check_value
within the same function.
# Example with computation being made within a non user-level function sum_fun = function(x, y){ my_internal(x, y, sum = TRUE) } diff_fun = function(x, y){ my_internal(x, y, sum = FALSE) } my_internal = function(x, y, sum){ set_up(1) # => errors will be at the user-level function check_arg(x, y, "numeric scalar mbt") # Identical to calling # check_arg(x, y, "numeric scalar mbt", .up = 1) if(sum) return(x + y) return(x - y) } # we check it works sum_fun(5, 6) diff_fun(5, 6) # Let's throw some errors try(sum_fun(5)) try(sum_fun(5, 1:5))
# Example with computation being made within a non user-level function sum_fun = function(x, y){ my_internal(x, y, sum = TRUE) } diff_fun = function(x, y){ my_internal(x, y, sum = FALSE) } my_internal = function(x, y, sum){ set_up(1) # => errors will be at the user-level function check_arg(x, y, "numeric scalar mbt") # Identical to calling # check_arg(x, y, "numeric scalar mbt", .up = 1) if(sum) return(x + y) return(x - y) } # we check it works sum_fun(5, 6) diff_fun(5, 6) # Let's throw some errors try(sum_fun(5)) try(sum_fun(5, 1:5))
This function allows to disable, or re-enable, all calls to check_arg
within any function. Useful only when running (very) large loops (>100K iter.) over small functions that use dreamerr's check_arg
.
setDreamerr_check(check = TRUE)
setDreamerr_check(check = TRUE)
check |
Strict logical: either |
Laurent Berge
# Let's create a small function that returns the argument # if it is a single character string, and throws an error # otherwise: test = function(x){ check_arg(x, "scalar character") x } # works: test("hey") # error: try(test(55)) # Now we disable argument checking setDreamerr_check(FALSE) # works (although it shouldn't!): test(55) # re-setting argument checking on: setDreamerr_check(TRUE)
# Let's create a small function that returns the argument # if it is a single character string, and throws an error # otherwise: test = function(x){ check_arg(x, "scalar character") x } # works: test("hey") # error: try(test(55)) # Now we disable argument checking setDreamerr_check(FALSE) # works (although it shouldn't!): test(55) # re-setting argument checking on: setDreamerr_check(TRUE)
Turns on/off a full fledged checking of calls to check_arg
. If on, it enables the developer mode which checks extensively calls to check_arg, allowing to find any problem. If a problem is found, it is pinpointed and the associated help is referred to.
setDreamerr_dev.mode(dev.mode = FALSE)
setDreamerr_dev.mode(dev.mode = FALSE)
dev.mode |
A logical, default is |
Since this mode ensures a detailed cheking of all check_arg
calls, it is thus a strain on performance and should be always turned off otherwise needed.
Laurent Berge
# If you're new to check_arg, given the many types available, # it's very common to make mistakes when creating check_arg calls. # The developer mode ensures that any problematic call is spotted # and the problem is clearly stated # # Note that since this mode ensures a detailed cheking of the call # it is thus a strain on performance and should be always turned off # otherwise needed. # # Setting the developer mode on: setDreamerr_dev.mode(TRUE) # Creating some 'wrong' calls => the problem is pinpointed test = function(x) check_arg(x, "integer scalar", "numeric vector") try(test()) test = function(...) check_arg("numeric vector", ...) try(test()) test = function(x) check_arg(x$a, "numeric vector") try(test()) test = function(x) check_arg(x, "numeric vector integer") try(test()) test = function(x) check_arg(x, "vector len(,)") try(test()) # etc... # Setting the developer mode off: setDreamerr_dev.mode(FALSE)
# If you're new to check_arg, given the many types available, # it's very common to make mistakes when creating check_arg calls. # The developer mode ensures that any problematic call is spotted # and the problem is clearly stated # # Note that since this mode ensures a detailed cheking of the call # it is thus a strain on performance and should be always turned off # otherwise needed. # # Setting the developer mode on: setDreamerr_dev.mode(TRUE) # Creating some 'wrong' calls => the problem is pinpointed test = function(x) check_arg(x, "integer scalar", "numeric vector") try(test()) test = function(...) check_arg("numeric vector", ...) try(test()) test = function(x) check_arg(x$a, "numeric vector") try(test()) test = function(x) check_arg(x, "numeric vector integer") try(test()) test = function(x) check_arg(x, "vector len(,)") try(test()) # etc... # Setting the developer mode off: setDreamerr_dev.mode(FALSE)
Errors generated with dreamerr functions only shows the call to which
the error should point to. If setDreamerr_show_stack
is set to TRUE,
error will display the full call stack instead.
setDreamerr_show_stack(show_full_stack = FALSE)
setDreamerr_show_stack(show_full_stack = FALSE)
show_full_stack |
Logical scalar, default is |
Laurent Berge
# Let's create a toy example of a function relying on an internal function # for the heavy lifting (although here it's pretty light!) make_sum = function(a, b){ make_sum_internal(a, b) } make_sum_internal = function(a, b){ if(!is.numeric(a)) stop_up("arg. 'a' must be numeric!") a + b } # By default if you feed stg non numeric, the call shown is # make_sum, and not make_sum_internal, since the user could not # care less of the internal structure of your functions try(make_sum("five", 55)) # Now with setDreamerr_show_stack(TRUE), you would get the full call stack setDreamerr_show_stack(TRUE) try(make_sum("five", 55))
# Let's create a toy example of a function relying on an internal function # for the heavy lifting (although here it's pretty light!) make_sum = function(a, b){ make_sum_internal(a, b) } make_sum_internal = function(a, b){ if(!is.numeric(a)) stop_up("arg. 'a' must be numeric!") a + b } # By default if you feed stg non numeric, the call shown is # make_sum, and not make_sum_internal, since the user could not # care less of the internal structure of your functions try(make_sum("five", 55)) # Now with setDreamerr_show_stack(TRUE), you would get the full call stack setDreamerr_show_stack(TRUE) try(make_sum("five", 55))
Fills a string vector with a user-provided symbol, up to the required length.
sfill(x = "", n = NULL, symbol = " ", right = FALSE, anchor, na = "NA")
sfill(x = "", n = NULL, symbol = " ", right = FALSE, anchor, na = "NA")
x |
A character vector. |
n |
A positive integer giving the total expected length of each character string. Can be NULL (default). If |
symbol |
Character scalar, default to |
right |
Logical, default is |
anchor |
Character scalar, can be missing. If provided, the filling is done up to this anchor. See examples. |
na |
Character that will replace any NA value in input. Default is "NA". |
Returns a character vector of the same length as x
.
# Some self-explaining examples x = c("hello", "I", "am", "No-one") cat(sep = "\n", sfill(x)) cat(sep = "\n", sfill(x, symbol = ".")) cat(sep = "\n", sfill(x, symbol = ".", n = 15)) cat(sep = "\n", sfill(x, symbol = ".", right = TRUE)) cat(sep = "\n", paste(sfill(x, symbol = ".", right = TRUE), ":", 1:4)) # Argument 'anchor' can be useful when using numeric vectors x = c(-15.5, 1253, 32.52, 665.542) cat(sep = "\n", sfill(x)) cat(sep = "\n", sfill(x, anchor = "."))
# Some self-explaining examples x = c("hello", "I", "am", "No-one") cat(sep = "\n", sfill(x)) cat(sep = "\n", sfill(x, symbol = ".")) cat(sep = "\n", sfill(x, symbol = ".", n = 15)) cat(sep = "\n", sfill(x, symbol = ".", right = TRUE)) cat(sep = "\n", paste(sfill(x, symbol = ".", right = TRUE), ":", 1:4)) # Argument 'anchor' can be useful when using numeric vectors x = c(-15.5, 1253, 32.52, 665.542) cat(sep = "\n", sfill(x)) cat(sep = "\n", sfill(x, anchor = "."))
Useful if you employ non-user level sub-functions within user-level functions or
if you want string interpolation in error messages. When an error is thrown in the sub
function, the error message will integrate the call of the user-level function, which
is more informative and appropriate for the user. It offers a similar functionality for warning
.
stop_up(..., up = 1, msg = NULL, envir = parent.frame(), verbatim = FALSE) stopi(..., envir = parent.frame()) warni(..., envir = parent.frame(), immediate. = FALSE) warn_up( ..., up = 1, immediate. = FALSE, envir = parent.frame(), verbatim = FALSE )
stop_up(..., up = 1, msg = NULL, envir = parent.frame(), verbatim = FALSE) stopi(..., envir = parent.frame()) warni(..., envir = parent.frame(), immediate. = FALSE) warn_up( ..., up = 1, immediate. = FALSE, envir = parent.frame(), verbatim = FALSE )
... |
Objects that will be coerced to character and will compose the error message. |
up |
The number of frames up, default is 1. The call in the error message will be based on the function |
msg |
A character vector, default is |
envir |
An environment, default is |
verbatim |
Logical scalar, default is |
immediate. |
Whether the warning message should be prompted directly. Defaults to |
These functions are really made for package developers to facilitate the good practice of providing informative user-level error/warning messages.
The error/warning messages allow variable interpolation by making use of stringmagic's interpolation.
stopi()
: Error messages with string interpolation
warni()
: Warnings with string interpolation
warn_up()
: Warnings at the level of user-level functions
Laurent Berge
For general argument checking, see check_arg()
and check_set_arg()
.
# We create a main user-level function # The computation is done by an internal function # Here we compare stop_up with a regular stop main_function = function(x = 1, y = 2){ my_internal_function(x, y) } my_internal_function = function(x, y){ if(!is.numeric(x)){ stop_up("Argument 'x' must be numeric but currently isn't.") } # Now regular stop if(!is.numeric(y)){ stop("Argument 'y' must be numeric but currently isn't.") } nx = length(x) ny = length(y) if(nx != ny){ # Note that we use string interpolation with {} warn_up("The lengths of x and y don't match: {nx} vs {ny}.") } x + y } # Let's compare the two error messages # stop_up: try(main_function(x = "a")) # => the user understands that the problem is with x # Now compare with the regular stop: try(main_function(y = "a")) # Since the user has no clue of what my_internal_function is, # s/he will be puzzled of what to do to sort this out # Same with the warning => much clearer with warn_up main_function(1, 1:2)
# We create a main user-level function # The computation is done by an internal function # Here we compare stop_up with a regular stop main_function = function(x = 1, y = 2){ my_internal_function(x, y) } my_internal_function = function(x, y){ if(!is.numeric(x)){ stop_up("Argument 'x' must be numeric but currently isn't.") } # Now regular stop if(!is.numeric(y)){ stop("Argument 'y' must be numeric but currently isn't.") } nx = length(x) ny = length(y) if(nx != ny){ # Note that we use string interpolation with {} warn_up("The lengths of x and y don't match: {nx} vs {ny}.") } x + y } # Let's compare the two error messages # stop_up: try(main_function(x = "a")) # => the user understands that the problem is with x # Now compare with the regular stop: try(main_function(y = "a")) # Since the user has no clue of what my_internal_function is, # s/he will be puzzled of what to do to sort this out # Same with the warning => much clearer with warn_up main_function(1, 1:2)
Compares a character scalar to the elements from a character vector and returns the elements that are the closest to the input.
suggest_item( x, items, msg.write = FALSE, msg.newline = TRUE, msg.item = "variable" )
suggest_item( x, items, msg.write = FALSE, msg.newline = TRUE, msg.item = "variable" )
x |
Character scalar, must be provided. This reference will be compared
to the elements of the string vector in the argument |
items |
Character vector, must be provided. Elements to which the value in
argument |
msg.write |
Logical scalar, default is |
msg.newline |
Logical scalar, default is |
msg.item |
Character scalar, default is |
This function is useful when used internally to guide the user to relevant choices.
The choices to which the user is guided are in decreasing quality. First light mispells are checked. Then more important mispells. Finally very important mispells. Completely off potential matches are not reported.
If the argument msg.write
is TRUE
, then a character scalar is returned containing
a message suggesting the matches.
It returns a vector of matches. If no matches were found
Laurent Berge
# function reporting the sum of a variable sum_var = function(data, var){ # var: a variable name, should be part of data if(!var %in% names(data)){ suggestion = suggest_item(var, names(data), msg.write = TRUE) stopi("The variable `{var}` is not in the data set. {suggestion}") } sum(data[[var]]) } # The error message guides us to a suggestion try(sum_var(iris, "Petal.Le"))
# function reporting the sum of a variable sum_var = function(data, var){ # var: a variable name, should be part of data if(!var %in% names(data)){ suggestion = suggest_item(var, names(data), msg.write = TRUE) stopi("The variable `{var}` is not in the data set. {suggestion}") } sum(data[[var]]) } # The error message guides us to a suggestion try(sum_var(iris, "Petal.Le"))
This function informs the user of arguments passed to a method but which are not used by the method.
validate_dots( valid_args = c(), suggest_args = c(), message, warn, stop, call. = FALSE, immediate. = TRUE )
validate_dots( valid_args = c(), suggest_args = c(), message, warn, stop, call. = FALSE, immediate. = TRUE )
valid_args |
A character vector, default is missing. Arguments that are not in the definition of the function but which are considered as valid. Typically internal arguments that should not be directly accessed by the user. |
suggest_args |
A character vector, default is missing. If the user provides invalid arguments, he might not be aware of the main arguments of the function. Use this argument to inform the user of these main arguments. |
message |
Logical, default is |
warn |
Logical, default is |
stop |
Logical, default is |
call. |
Logical, default is |
immediate. |
Logical, default is |
This function returns the message to be displayed. If no message is to be displayed because all the arguments are valid, then NULL
is returned.
# The typical use of this function is within methods # Let's create a 'my_class' object and a summary method my_obj = list() class(my_obj) = "my_class" # In the summary method, we add validate_dots # to inform the user of invalid arguments summary.my_class = function(object, arg_one, arg_two, ...){ validate_dots() # CODE of summary.my_class invisible(NULL) } # Now let's test it, we add invalid arguments summary(my_obj, wrong = 3) summary(my_obj, wrong = 3, info = 5) # Now let's : # i) inform the user that argument arg_one is the main argument # ii) consider 'info' as a valid argument (but not shown to the user) # iii) show a message instead of a warning summary.my_class = function(object, arg_one, arg_two, ...){ validate_dots(valid_args = "info", suggest_args = "arg_one", message = TRUE) # CODE of summary.my_class invisible(NULL) } # Let's retest it summary(my_obj, wrong = 3) # not OK => suggestions summary(my_obj, info = 5) # OK
# The typical use of this function is within methods # Let's create a 'my_class' object and a summary method my_obj = list() class(my_obj) = "my_class" # In the summary method, we add validate_dots # to inform the user of invalid arguments summary.my_class = function(object, arg_one, arg_two, ...){ validate_dots() # CODE of summary.my_class invisible(NULL) } # Now let's test it, we add invalid arguments summary(my_obj, wrong = 3) summary(my_obj, wrong = 3, info = 5) # Now let's : # i) inform the user that argument arg_one is the main argument # ii) consider 'info' as a valid argument (but not shown to the user) # iii) show a message instead of a warning summary.my_class = function(object, arg_one, arg_two, ...){ validate_dots(valid_args = "info", suggest_args = "arg_one", message = TRUE) # CODE of summary.my_class invisible(NULL) } # Let's retest it summary(my_obj, wrong = 3) # not OK => suggestions summary(my_obj, info = 5) # OK