8.12 Caching

In patRoon, the results of lengthy processing operations, such as finding features and generating annotation data, are cached. When you run such a calculation again without changing any parameters, the results are loaded from the cache instead of being re-generated. This is very useful, for instance, if you have closed your R session and want to continue data processing at a later stage.

The cache data is stored in an SQLite database file. By default this file is named cache.sqlite and is placed in the current working directory (for this reason it is very important to always restore your working directory!). However, the name and location can be changed by setting a global package option:

options(patRoon.cache.fileName = "~/myCacheFile.sqlite")

For instance, this might be useful if you want to use a shared cache file between projects.
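To confirm which cache file is configured in the current session, the option can be read back with base R. The following is a minimal sketch that only uses base R functions; the path is the illustrative one from above, and the size check is merely a convenience, not something patRoon requires.

```r
# Point patRoon at a shared cache file (illustrative path from the text above)
options(patRoon.cache.fileName = "~/myCacheFile.sqlite")

# Read the option back; getOption() returns NULL if it was never set
activeCache <- getOption("patRoon.cache.fileName")
print(activeCache)

# Expand "~" and report the size if the file already exists
fullPath <- path.expand(activeCache)
if (file.exists(fullPath)) {
    cat(sprintf("Cache size: %.1f MB\n", file.size(fullPath) / 1024^2))
} else {
    cat("No cache file yet at", fullPath, "\n")
}
```

Since options() only affects the running R session, the option has to be set again (for instance at the top of your processing script) every time a new session is started.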

After a while you may notice that your cache file has become quite large. This is especially true when testing different parameters to optimize your workflow. Furthermore, you may want to clear the cache after updating patRoon, to make sure that the latest code is used to generate the data. At any point you can simply remove the cache file. A more fine-tuned approach, which does not wipe all of your cached data, is to use the clearCache() function. With this function you can selectively remove parts of the cache file. The function has two arguments: what, which specifies what should be removed, and file, which specifies the path to the cache file. The latter only needs to be specified if you want to manage a cache file other than the currently configured one.
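Both approaches can be sketched as follows. removeCacheFile() is a hypothetical helper written for this example (it is not part of patRoon) and simply deletes the file with base R. The clearCache() call uses the what and file arguments described above; it is guarded so that the snippet also runs when patRoon is not installed, and the file path is the illustrative one used earlier.

```r
# Hypothetical helper (not part of patRoon): delete a cache file outright.
# Falls back to "cache.sqlite" in the working directory if the option is unset.
removeCacheFile <- function(path = getOption("patRoon.cache.fileName", "cache.sqlite")) {
    path <- path.expand(path)
    if (!file.exists(path)) {
        message("No cache file at ", path)
        return(invisible(FALSE))
    }
    invisible(file.remove(path)) # TRUE if the file was deleted
}

# Selective alternative: clear one category from a non-default cache file
if (requireNamespace("patRoon", quietly = TRUE)) {
    patRoon::clearCache(what = "featuresOpenMS", file = "~/myCacheFile.sqlite")
}
```

Deleting the file is the bluntest option; everything is re-calculated on the next run. Passing file explicitly is mainly useful for cleaning up a shared cache without changing the patRoon.cache.fileName option first.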

To find out what is in the cache, run clearCache() without any arguments:

clearCache()
#> Please specify which cache you want to remove. Available are:
#> - EICData (3 rows)
#> - LC50_SMILES (23 rows)
#> - MS2QMD (1 rows)
#> - MSLibraryJSON (1 rows)
#> - MSLibraryMSP (1 rows)
#> - MSPeakListsAvg (4 rows)
#> - MSPeakListsMzR (97 rows)
#> - MSPeakListsSetAvg (2 rows)
#> - RF_SMILES (5 rows)
#> - TPsLib (1 rows)
#> - annotateSuspects (1 rows)
#> - calculatePeakQualities (3 rows)
#> - componentsCAMERA (1 rows)
#> - componentsNontarget (1 rows)
#> - componentsTPs (1 rows)
#> - compoundsCluster (1 rows)
#> - compoundsMetFrag (30 rows)
#> - dataCentroided (12 rows)
#> - featureGroupsOpenMS (6 rows)
#> - featuresOpenMS (69 rows)
#> - filterFGroups_blank (4 rows)
#> - filterFGroups_intensity (11 rows)
#> - filterFGroups_minAnalyses (1 rows)
#> - filterFGroups_minReplicates (83 rows)
#> - filterFGroups_replicateAbundance (8 rows)
#> - filterFGroups_replicate_group (11 rows)
#> - filterFGroups_retention (3 rows)
#> - filterMSPeakLists (4 rows)
#> - formulasFGroupConsensus (2 rows)
#> - formulasGenForm (89 rows)
#> - formulasSIRIUS (5 rows)
#> - generateTPsBT (74 rows)
#> - loadIntensities (69 rows)
#> - mzREIC (3426 rows)
#> - reportHTMLCompounds (1 rows)
#> - reportHTMLFormulas (1 rows)
#> - screenSuspects (7 rows)
#> - screenSuspectsPrepList (8 rows)
#> - specData (12 rows)
#> - all (removes complete cache database)

Using this output you can then call the function again and specify what to remove, for instance:

clearCache("featuresOpenMS")
clearCache(c("featureGroupsOpenMS", "formulasGenForm")) # clear multiple
clearCache("OpenMS") # clear all with OpenMS in name (i.e. partially matched)
clearCache("all") # same as simply removing the file
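Before clearing with a partial name, it can help to preview which categories would be affected. The sketch below does this with base R on a few names copied from the listing above. It assumes that a partial name is matched as a plain substring; the exact matching rules used by clearCache() are not described here, so treat the result as an approximation.

```r
# A few category names copied from the clearCache() listing above
categories <- c("featureGroupsOpenMS", "featuresOpenMS",
                "formulasGenForm", "formulasSIRIUS")

# Assumption: partial names are matched as literal substrings
matched <- grep("OpenMS", categories, value = TRUE, fixed = TRUE)
print(matched)
#> [1] "featureGroupsOpenMS" "featuresOpenMS"
```

If the preview contains more than intended, pass the full category names as a character vector instead, as in the second example above.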