---
title: "Package miscset introduction"
author: "Sven E. Templer"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{miscset}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# About

R package **miscset** version ``r packageVersion("miscset")``.

A collection of miscellaneous methods to simplify various tasks, including [plotting](#plot-methods), [data.frame and matrix](#data-frame-and-matrix-methods) transformations, [environment](#environment-functions) functions, [regular expression](#regular-expression-methods) methods, and [string and logical](#string-and-logical-methods) operations, as well as [numerical and statistical](#numeric-methods) tools.

Most of the methods are simple but useful wrappers of common base `R` functions, which extend S3 generics or provide default values for important parameters.

# Index

* [Installation and Introduction](#installation-and-introduction)
* [Plots](#plot-methods)
    - [ciplot](#function-ciplot) - Barplot with Confidence Intervals
    - [ggplotGrid](#function-ggplotgrid) - Arrange a List of ggplots
    - [gghcl](#function-gghcl) - HTML Colours Like ggplot2
    - [plotn](#function-plotn) - Plot Nothing (but a Plot)
* [Data Frame and Matrix](#data-frame-and-matrix-methods)
    - [sort](#function-sort) - Sort data.frame Objects
    - [do.rbind](#function-do.rbind) - Bind data.frames in a List by Rows
    - [enpaire](#function-enpaire) - Create a Pairwise List from a Matrix
    - [squarematrix](#function-squarematrix) - Create a Square Matrix
    - [textable](#function-textable) - Table to Latex
* [Environment](#environment-functions)
    - [help.index](#function-help.index) - Open The Package Help Index Page
    - [lload](#function-lload) - Load RData Objects to a List
    - [lsall](#function-lsall) - List Object Details
    - [rmall](#function-rmall) - Remove All Objects from Global Environment
* [Regular Expression](#regular-expression-methods)
    - [mgrepl](#function-mgrepl) - Multiple Pattern Matching
    - [gregexprind](#function-gregexprind) - Pattern Matching and Extraction
* [String and Logical](#string-and-logical-methods)
    - [collapse](#function-collapse) - Collapse objects
    - [leading0](#function-leading0) - Numeric to Character with Leading Zero(s)
    - [strextr](#function-strextr) - Extract a Substring
    - [str_part](#function-str_part) - Split String and Return Part
    - [str_rev](#function-str_rev) - Reverse Text Strings
    - [duplicates and duplicatei](#function-duplicates-and-duplicatei) - Determine Duplicates
* [Numerical and Statistical](#numeric-methods)
    - [p2star](#function-p2star) - P Value Significance Level Indicator
    - [confint.numeric](#function-confint.numeric) - Confidence Intervals for Numeric Vectors
    - [ntri](#function-ntri) - Return Triangular Numbers
    - [scale0 and scaler](#function-scale0-and-scaler) - Scale Numeric Values to Defined Ranges
    - [nunique and uniquei](#function-nunique-and-uniquei) - Amount and Index of Unique Values

# Installation and Introduction

Install the latest version from **CRAN** via:

```{r eval=FALSE}
install.packages('miscset')
```

Install the development version from **GitHub** via:

```{r eval=FALSE}
install.packages('devtools')
devtools::install_github('setempler/miscset@develop', build_vignettes = TRUE)
```

After installation, load the package via

```{r}
library(miscset)
```

If you would like to contribute to the development of the package, please

* find the source code at [GitHub](https://github.com/setempler/miscset)
* contribute by filing an [issue](https://github.com/setempler/miscset/issues)

Get help in an R session via

* `help.index(miscset)`
* `?` + function name

# Plot methods

([back to top](#about))

### Function `ciplot`

Plot a bar graph with error bars. Input data is a list of numeric vectors. The functions that calculate the bar heights (`mean` by default) and the error bar sizes (`confint.numeric` by default) can be replaced (e.g. with `sd` for the error bars).

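
To make the idea of bar heights plus error bars concrete, the following is a rough base R sketch of a comparable plot. It does not reproduce the internals of `ciplot`: the helper `ci_halfwidth` is hypothetical, and the t-based 95% interval it computes is an assumption that may differ from what `confint.numeric` returns.

```r
# Example input: a list of numeric vectors (a data.frame is also a list)
d <- list(a = c(2, 1, 3, NA, 1), b = 2:6, c = 5:1)

# Hypothetical helper (not part of miscset): half-width of a
# t-based confidence interval, ignoring missing values
ci_halfwidth <- function(x, level = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
}

# Bar heights and error bar sizes, one value per list element
heights <- sapply(d, mean, na.rm = TRUE)
errors  <- sapply(d, ci_halfwidth)

# barplot() returns the bar midpoints, used to position the error bars
mids <- barplot(heights, ylim = c(0, max(heights + errors)))
arrows(mids, heights - errors, mids, heights + errors,
       angle = 90, code = 3, length = 0.05)
```

Swapping `ci_halfwidth` for `sd` in the second `sapply` call would draw standard deviations instead, which mirrors the kind of customisation described above.
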
```{r fig.height=3, fig.width=4, fig.align='center'}
d <- data.frame(a=c(2,1,3,NA,1), b=2:6, c=5:1)
ciplot(d)
```

### Function `ggplotGrid`

Arrange ggplots on a grid (plot window or pdf file). Supply a list of `ggplot` objects and define the number of rows and/or columns. If a `path` is supplied, the plot is written to that file instead of the internal graphics device.

```{r fig.height=2, fig.width=6, fig.align='center'}
library(ggplot2)
plots <- list(
  ggplot(d, aes(x = b, y = -c, col = b)) + geom_line(),
  ggplot(d, aes(x = b, y = -c, shape = factor(b))) + geom_point())
ggplotGrid(plots, ncol = 2)
```

The function `ggplotGridA4` supports direct output to DIN A4 sized pdfs.

### Function `gghcl`

Generate a character vector of HTML color values from a color hue, as in `ggplot2`.

```{r comment=NA, fig.height=3, fig.width=4, fig.align='center'}
d <- data.frame(a=c(2,1,3,NA,1), b=2:6, c=5:1)
n <- length(d)
gghcl(n)
ciplot(d, col = gghcl(n))
```

### Function `plotn`

Create an empty plot. Useful to fill a `layout`.

```{r fig.height=2, fig.width=2, fig.align='center'}
plotn()
```

# Data Frame and Matrix Methods

([back to top](#about))

### Function `sort`

Sort `data.frame` objects. This extends the generic `sort` distributed with base R. Define multiple columns by column names, given as a character vector or an expression.

```{r, comment=NA}
d <- data.frame(a=c(2,1,3,NA,1), b=2:6, c=5:1)
print(d)
sort(d, by = c("a", "c"))
```

### Function `do.rbind`

Note: This function is now deprecated. It is recommended to use `rbindlist` from the [data.table](https://cran.r-project.org/package=data.table) package.

A wrapper function to row-bind `data.frame` objects in a list with `do.call` and `rbind`. Object names from the list are inserted as an additional column.

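
The `do.call` and `rbind` combination mentioned above can be sketched in plain base R as follows. This is only an illustration of the idea, not the package's implementation; the column name `name` is chosen for this sketch and may differ from what `do.rbind` uses.

```r
# Two data frames in a named list
d <- data.frame(a = c(2, 1, 3, NA, 1), b = 2:6, c = 5:1)
l <- list(first = d[1:2, ], second = d[1:3, ])

# Row-bind all list elements in one call
bound <- do.call(rbind, l)

# Add the list names as an extra column, repeated once per row of
# each element ("name" is a column name assumed for this sketch)
bound$name <- rep(names(l), sapply(l, nrow))

# Drop the row names generated by rbind
rownames(bound) <- NULL
bound
```
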
```{r comment=NA}
d <- data.frame(a=c(2,1,3,NA,1), b=2:6, c=5:1)
print(d[1:3,])
do.rbind(list(first=d[1:2,], second=d[1:3,]))
```

### Function `enpaire`

Generate a pairwise list (`data.frame`) from a matrix, containing the row and column ids and the upper and lower triangle values.

```{r comment=NA}
m <- matrix(letters[1:9], 3, 3, dimnames = list(1:3,1:3))
print(m)
enpaire(m)
```

### Function `squarematrix`

Generate a symmetric (square) matrix from an unsymmetric one using column and row names. Fills empty cells with `NA`.

```{r comment=NA}
m <- matrix(letters[1:9], 3, 3, dimnames = list(1:3,1:3))
print(m[-1,])
squarematrix(m[-1,])
```

### Function `textable`

Print a `data.frame` as a LaTeX table. Extends `xtable` by optionally including a LaTeX header and, if desired, writing the output directly to a file and calling a system command on it, for example to convert it to a `.pdf` file.

```{r comment=NA}
d <- data.frame(a=c(2,1,3,NA,1), b=2:6, c=5:1)
textable(d, caption = 'miscset vignette example data.frame', as.document = TRUE)
```

# Environment Functions

([back to top](#about))

### Function `help.index`

Show the help index page of a package (the list of all help pages of that package).

```r
help.index(miscset)
```

### Function `lload`

Load multiple R data objects into a list. The list has the same length as the number of files provided. Sublists contain all respective objects. Simplification is possible if all names are unique.

```r
lload("folder/with/rdata/", "test*.RData")
```

### Function `lsall`

Return the names, lengths, classes, modes and sizes of all objects in the current workspace (or any custom environment) as a `data.frame`.

```{r comment=NA}
lsall()
```

### Function `rmall`

Remove all objects from the current or a custom environment.

```r
rmall()
```

# Regular Expression Methods

([back to top](#about))

### Function `mgrepl`

Search for multiple patterns in a character vector. Merge results by (custom) logical functions (e.g. `any`, `all`) and use multicore support from the `parallel` package.
Optionally return the index (as with `which`). Use `identity` to return a matrix with the results of each pattern per row.

```{r comment=NA}
s <- c("ab","ac","bc", NA)
mgrepl(c("a","b"), s)
mgrepl(c("a","b"), s, any) # similar to: grepl("a|b", s)
mgrepl(c("a", "b"), s, sum)
mgrepl(c("a","b"), s, identity)
```

### Function `gregexprind`

Retrieve the `n`th or `"last"` index of an expression found in a character string.

```{r comment=NA}
gregexprind(c("a"), c("ababa","ab","xyz",NA), 1)
gregexprind(c("a"), c("ababa","ab","xyz",NA), 2)
gregexprind(c("a"), c("ababa","ab","xyz",NA), "last")
```

# String and Logical Methods

([back to top](#about))

### Function `collapse`

Vectors are usually collapsed by calling `paste` or `paste0` with the `collapse` argument set. The `collapse` function is a wrapper of this functionality applied to a single vector. It can be extended with the `.unique`, `.sort` and `.decreasing` arguments to return only unique and sorted values.

```{r comment=NA}
paste(letters, collapse = "")
collapse(letters)
```

The `data.frame` method collapses a data frame by identifier/grouping columns (specified with `by`). Within each group, all value columns are collapsed with the default method. Alternatively, the value columns can be collapsed to vectors by selecting `sep = NULL`, which keeps a list of vectors for each value column in the returned data frame. `.sortby` controls whether the result is sorted by the grouping columns. `.unlist` provides a way to unlist value columns per group, which is useful if the input has list columns.

```{r comment=NA}
# create example data
set.seed(12)
s <- s2 <- sample(LETTERS[1:4], 9, replace = TRUE)
s2[1:2] <- rev(s2[1:2])
d <- data.frame(group = rep(letters[c(3,1,2)], each = 3),
                value = s,
                level = factor(s2),
                stringsAsFactors = FALSE)
print(d)
```

The following call (default settings) collapses by all columns, which results in an output similar to `unique(d)`, but the row names are not kept.

```{r comment=NA}
collapse(d)
```

Specifying no grouping columns (setting `by` to `0` or `NULL`) collapses all columns.

```{r comment=NA}
collapse(d, by = NULL)
```

Specifying at least one grouping column, but fewer than the total number of columns, groups the `data.frame`, splits it into group pieces, and applies the collapsing to all remaining columns.

```{r comment=NA}
collapse(d, "/", 1)
```

If the separator `sep` is set to `NULL`, the `data.frame` method returns list columns containing vectors of values per group. With the `.sortby` argument, the output can be sorted by the grouping values.

```{r comment=NA}
# by first and third column, but keep values as vectors
collapse(d, NULL, c(1,3), .sortby = TRUE)
```

The `data.frame` method also works on `data.table` objects, since it uses the methods from the package of the same name to split the input into group pieces. If the input inherits from `data.table`, the class is retained.

### Function `leading0`

Prepend `0` characters to numbers to generate equally sized strings.

```{r comment=NA}
leading0(c(9, 112, 5009))
```

### Function `strextr`

Note: This function is now deprecated. It is recommended to use `str_extract` or `str_extract_all` from the [stringr](https://cran.r-project.org/package=stringr) package.

Split strings by a separator (`sep`) and extract all substrings matching a `pattern`. Optionally allow multiple matches, and use multicore support from the `parallel` package.

```{r comment=NA}
s <- "xa,xb,xn,ya,yb"
strextr(s, "n$", ",")
strextr(s, "^x", ",", mult = TRUE)
library(stringr)
str_extract(s, "[^,]*n")
str_extract_all(s, "x[^,]*")
```

### Function `str_part`

Similar to `strextr`, but substrings are extracted by setting an index value `n`. Optionally roll the last value to `n` if its index is less.

```{r comment=NA}
s <- "xa,xb,xn,ya,yb"
str_part(s, ",", 3)
```

### Function `str_rev`

Create reversed versions of the strings in a `character` vector.

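
For comparison, string reversal can be sketched in base R by splitting each string into characters, reversing them and pasting them back together. The helper `rev_string` is hypothetical and only illustrates the idea; it is not how `str_rev` is necessarily implemented, and it does not treat `NA` values specially.

```r
# Hypothetical helper (not part of miscset): reverse each string
rev_string <- function(x) {
  vapply(
    strsplit(x, ""),                                    # split into single characters
    function(chars) paste(rev(chars), collapse = ""),   # reverse and re-join
    character(1)
  )
}

rev_string(c("olleH", "!dlroW"))
# returns "Hello" "World!"
```
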
```{r comment=NA}
str_rev(c("olleH", "!dlroW"))
```

### Function `duplicates` and `duplicatei`

Determine duplicates. Return either a logical vector (`duplicates`) or an integer index (`duplicatei`). Extends the base method `duplicated` by also returning `TRUE` for the first occurrence of a value.

```{r comment=NA}
# redefine the numeric example data (d was overwritten in the collapse section)
d <- data.frame(a=c(2,1,3,NA,1), b=2:6, c=5:1)
data.frame(
  duplicate = d$a,
  ".d" = duplicated(d$a), # standard R function
  ".s" = duplicates(d$a),
  ".i" = duplicatei(d$a))
```

# Numeric Methods

([back to top](#about))

### Function `p2star`

Assign range symbols to values, e.g. convert p-values to significance characters.

```{r comment=NA}
p2star(c(0.003, 0.049, 0.092, 0.431))
```

### Function `confint.numeric`

Calculate confidence intervals. Extends the generic `confint` to numeric vectors.

```{r comment=NA}
n <- c(2,1,3,NA,1)
confint(n, ret.attr = FALSE)
```

### Function `ntri`

Generate a series of triangular numbers of length `n` according to [OEIS#A000217](https://oeis.org/A000217). The series for 12 rows of a triangle, for example, can be returned as in the following example.

```{r comment=NA}
ntri(12)
```

### Function `scale0` and `scaler`

Scale numeric vectors to a range of _0_ to _1_ with `scale0` or to a custom output range `r` and input range `b` with `scaler`.

```{r comment=NA}
n <- 5:1
scale0(n)
scaler(n, c(2, 6), b = c(1, 10))
```

### Function `nunique` and `uniquei`

Return the number (with `nunique`) or index (with `uniquei`) of unique values in a vector. Extends `plyr::nunique` by allowing `NA` values to be counted as a 'level'.

```{r comment=NA}
n <- c(2,1,3,NA,1)
nunique(n)
nunique(n, FALSE)
uniquei(n)
uniquei(n, FALSE)
```

# Legal

> Copyright (c) 2017 - Sven E. Templer

([back to top](#about))