Functions to create a cache that accelerates many operations
Usage
hashcache(x, nunique = NULL, ...)
sortcache(x, has.na = NULL)
sortordercache(x, has.na = NULL, stable = NULL)
ordercache(x, has.na = NULL, stable = NULL, optimize = "time")
Arguments
- x
an atomic vector (note that currently only integer64 is supported)
- nunique
giving the correct number of unique elements can help reduce the size of the hashmap
- ...
passed to hashmap()
- has.na
boolean scalar defining whether the input vector might contain NAs. If we know there are no NAs, this may speed things up. Note that you risk a crash if there are unexpected NAs with has.na=FALSE.
- stable
boolean scalar defining whether stable sorting is needed. Allowing non-stable sorting may speed things up.
- optimize
by default ramsort optimizes for 'time', which requires more RAM; set to 'memory' to minimize RAM requirements and sacrifice speed.
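As a rough illustration of how these arguments combine, the following sketch assumes the bit64 package is installed and attached. The input vector is made up for the example and is not taken from the package documentation.

```r
library(bit64)

# Illustrative input that is known to contain NAs
x <- as.integer64(c(3, NA, 1, 3, 2, NA, 1))

# Declare that NAs may be present, request a stable ordering,
# and trade speed for a lower memory footprint
ordercache(x, has.na = TRUE, stable = TRUE, optimize = "memory")

# The cache attached to x can be inspected with cache()
ls(cache(x))
```

Setting has.na = TRUE is the safe choice whenever the presence of NAs cannot be ruled out.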
Value
x with a cache() that contains the result of the expensive operations,
possibly together with small derived information (such as nunique.integer64())
and previously cached results.
Details
The results of the relatively expensive operations hashmap(), bit::ramsort(),
bit::ramsortorder(), and bit::ramorder() can be stored in a cache in
order to avoid repeated executions. Except in very specific situations, the
recommended method is hashsortorder only.
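A minimal sketch of what storing results in a cache looks like in practice, assuming bit64 is attached. It only checks that a cache is present and lists its contents. Which later calls actually consult the cache depends on the method being used.

```r
library(bit64)

x <- as.integer64(sample(c(rep(NA, 9), 1:9), 32, TRUE))

# No cache is attached to a freshly created vector
is.null(cache(x))

# Sort and order once, storing both results in the cache of x
sortordercache(x)

# List what has been stored
ls(cache(x))

# Later calls on x, such as unique(), may reuse the cached results
unique(x)
```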
Note
Note that we consider storing the big results from sorting and/or ordering as a relevant side-effect, and therefore storing them in the cache should require a conscious decision of the user.
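Because these caches are created deliberately, they can also be dropped deliberately. The sketch below assumes bit64 is attached and uses remcache() to remove the cache once it is no longer needed.

```r
library(bit64)

x <- as.integer64(sample(c(rep(NA, 9), 1:9), 32, TRUE))

# Conscious decision: keep the sorting and ordering results with x
sortordercache(x)

# Release the cached results again
remcache(x)

# The vector no longer carries a cache
is.null(cache(x))
```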
See also
cache() for caching functions and nunique.integer64() for methods benefiting
from small caches
Examples
library(bit64)
# integer64 vector of length 32 drawn from nine NAs and the values 1 to 9
x <- as.integer64(sample(c(rep(NA, 9), 1:9), 32, TRUE))
sortordercache(x)