Working with directories and lists of files

tools

A short description of the post.

if (!require(pacman)){install.packages('pacman')}
p_load(tidyverse)

A few useful R code chunks to execute common tasks when I compile a complex research project with several sub-folders organizing the content. A very common goal is to load in the environment all the files stored in a certain sub-directory of the project.

1) Using {base}dir + {base}load(wrapped in {base}lapply) to LOAD several files

For example base::dir produces a character vector of the names of files or directories in the named dir:

dir(path = ".", pattern = NULL, all.files = FALSE,full.names = FALSE, recursive = FALSE, > ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)

… and I can use it to obtain a list of files which I then base::readRDS using base::lapply:

# Get the list of FILES 
file_names  <-  as.list(dir(
  path ="./_posts/2018-09-28-dirs-files/files", # no final {/}
  pattern = "*.Rds",
  full.names = TRUE ))
file_names
list()
# read them in the environment
lapply( file_names, readRDS  )
list()

2) Selectively load the list of FILES ending with “”

Similarly, I can use base::list.files, with the addition of the argument pattern = <some regex> to screen certain files… again followed by base::lapply(load)

list.files(path = ".", pattern = NULL, all.files = FALSE, full.names = FALSE, recursive = FALSE, ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)

# List of files ending with <.Rds>
file_names2 <- list.files(path = "./_posts/2018-09-28-dirs-files/files",
               pattern = ".Rds$",
                      full.names = T # "xx$" = "ends with  xx"
                      # all.files = FALSE,
                      # full.names = FALSE, recursive = FALSE,
                      # ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)
                      )   
# read them in the environment
lapply( file_names2, readRDS  )
list()

3) Now, a variation on the theme from tidyverse package purrr to EXECUTE several R scripts

Again, I use base::list.files, but combined with purrr::walk(source) - which is a more sophisticated function to loop through things and execute something.

Plus, instead of load I use source because I intend to EXECUTE several R scripts contained in a project’s sub folder.

(*) function purrr:walk is specifically used for functions that don’t produce an output (as opposed to purrr:map)

# Get the list of R SCRIPTS [!!!]
file_names3 <- list.files(
  path = "./_posts/2018-09-28-dirs-files/files",
  pattern = ".R$",
  all.files = FALSE,  # def (= only visible)
  full.names = TRUE,  # I NEED dir name prepended
  recursive = FALSE,  # def  (= no inside sub-dir )
  ignore.case = TRUE, # (= pattern-matching be case-insensitive)
  include.dirs = FALSE, # def (subdirectory names NOT be included in recursive listings)
  no.. = FALSE) %>% # def (both "." and ".." be excluded also from non-recursive listings) 
  sort(decreasing = FALSE
      )  
file_names3
character(0)
# Execute them them in the environment
purrr::walk(.x = file_names3,
        .f = source, 
        local = FALSE, 
        echo = TRUE, 
        verbose = TRUE)