Note: Before installing and using the cori.db package, please refer to the coriverse wiki for instructions on setting up your local environment to install packages from private Github repositories.
cori.db is an internal R package for all things database connection, SQL function related and AWS S3.
It can be installed with (requires GITHUB_PAT):
# install.packages("devtools")
devtools::install_github("ruralinnovation/cori.db")If you are connecting from a personal computer, your IP address will need to be whitelisted before you will be able to connect. Contact Olivier Leroy or John Hall.
Before using cori.db, you must set the environment variables for our Database (Postgresql hosted in RDS from AWS) and accessing S3.
For the database you need: DATABASE_USER_AD and DATABASE_PASSWORD_AD with the values of your database username and password, respectively, in either a local .Renviron file (or your shell's profile i.e. ~/.bash_profile or ~/.profile).
To use any of the S3 functions, you will need to set environment variables that represent your AWS credentials, including AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION. The instructions in the following section provide a convenient function to do this from within R.
- Run
set_db_credentials('user_name_here', 'password_here') - Restart R
- Run
Sys.getenv('DATABASE_USER_AD'). If the above steps were successful, it should return the value of DB_USER you set in Step 1
- Run
library(cori.db)~~2. Runset_aws_credentials(keyID = "###", accessKey = "###")(the region is set by default) . ~~ - Use the
awsCLI to configure standard AWS credentials (see Configure the AWS CLI) - Quick test
cori.db:::has_aws_credentials()should returnTRUE - Restart R
So, this is how the functions work...
# connect to schema metadata
con <- connect_to_db("metadata")
# always end scripts by disconnecting from the database!
DBI::dbDisconnect(con)TODO: Improve/optimize list_s3_buckets() funciton logic.
S3 interactions is still in a prototype mode, some functions and their arguments could change as needed.
First you need to list what are our s3 buckets:
list_s3_buckets()You can then ask is inside an s3 buckets:
list_s3_objects(bucket_name = "test-coridata") # argument is the bucket nameTo download a file you need to specify (1) its bucket, (2) its key and (optional) where you want to download it:
(get_s3_object("test-coridata", "blabla.txt"))Alternatively if the file is a csv you can read it in your R session:
dbtables <- read_s3_object(bucket_name = "test-coridata", key = "data-1715776270877.csv")Attention: it is only working with .csv!
A data.frame or a tibble (but not an sf object!) can be directly send as a csv in a bucket:
# Write the data.frame to a CSV file
write_s3_object(bucket_name = "test-coridata", s3_key_path = "tests/data/cars.csv", data_frame = cars)You can either send a file:
write.csv(cars, file = "tests/data/cars.csv", row.names = FALSE)
put_s3_object("test-coridata", s3_key_path = "tests/data/cars.csv", file_path = "tests/data/cars.csv")... or you can send a directory of files:
put_s3_objects_recursive("test-coridata", s3_key_prefix = "tests/data", dir_path ="tests/data")Here you can see an example of how a fake file hiearchy is implemented in S3 (just prefix with some_fake_dir/). It's fake because the s3_key_path is actually represented in S3 as a string that is prefixed to the file name, but for all intents and purposes we can think of it as a diretory structure.
For now (cori.db Version: 0.3.0) sending to an S3 bucket is limited to specific buckets (To be discussed!).
Once you have all of the dependencies installed, to build and install this package from the local project directory, run:
pkgbuild::clean_dll(); pkgbuild::compile_dll(); devtools::document(); devtools::check(); devtools::install();