This function reads an RCDF (Reusable Data Container Format) archive, decrypts its contents using the specified decryption key, and loads it into R as an RCDF object. The data files within the archive (usually Parquet files) are decrypted, and any metadata supplied (such as a data dictionary and value sets) is applied to the data.

Usage

read_rcdf(
  path,
  decryption_key,
  ...,
  password = NULL,
  metadata = list(),
  ignore_duplicates = TRUE,
  as_arrow_table = TRUE,
  recursive = FALSE,
  return_meta = FALSE
)

Arguments

path

A string specifying the path to the RCDF archive (zip file).

decryption_key

The key used to decrypt the RCDF contents. This can be an RSA or AES key, depending on how the RCDF was encrypted.

...

Additional parameters passed to other functions, if needed.

password

An optional password for RSA decryption, needed when the private key is password protected.

metadata

An optional list of metadata objects containing data dictionaries, value sets, and primary key constraints used as a data integrity measure (a data.frame or tibble that includes at least two columns: file and pk_field_name). This metadata is applied to the data if provided.

ignore_duplicates

A logical flag. If TRUE, a warning is issued when duplicates are found. If FALSE, the function stops with an error.

as_arrow_table

Logical. If TRUE, the function returns the result as an Arrow table. If FALSE, a regular data frame is returned. Default is TRUE.

recursive

Logical. If TRUE and path is a directory, the function will search recursively for .rcdf files.

return_meta

Logical. If TRUE, metadata extracted from the RCDF (excluding sensitive parts like encryption keys) is returned.

Value

An RCDF object, which is a list of Parquet files (one for each record) along with attached metadata.

Examples

dir <- system.file("extdata", package = "rcdf")
rcdf_path <- file.path(dir, 'mtcars.rcdf')
private_key <- file.path(dir, 'sample-private-key.pem')

rcdf_data <- read_rcdf(path = rcdf_path, decryption_key = private_key)
#> Error in check_dbplyr(): The package "dbplyr" is required to communicate with database backends.
rcdf_data
#> Error: object 'rcdf_data' not found

# Using encrypted/password protected private key
rcdf_path_pw <- file.path(dir, 'mtcars-pw.rcdf')
private_key_pw <- file.path(dir, 'sample-private-key-pw.pem')
pw <- '1234'

rcdf_data_with_pw <- read_rcdf(
  path = rcdf_path_pw,
  decryption_key = private_key_pw,
  password = pw
)
#> Error in check_dbplyr(): The package "dbplyr" is required to communicate with database backends.

rcdf_data_with_pw
#> Error: object 'rcdf_data_with_pw' not found
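
The errors shown above report that the dbplyr package was not available when these examples were rendered. As a minimal sketch, assuming that installing dbplyr is all that is missing (which the source does not confirm), the first example can be guarded so that it runs only when both packages are present. The variable name not_installed is introduced here for illustration only.

```r
# Packages this sketch depends on. dbplyr is listed because the error
# messages in the examples above name it (assumption: nothing else is missing).
required <- c("rcdf", "dbplyr")

# requireNamespace() returns FALSE instead of stopping when a package is absent.
not_installed <- required[
  !vapply(required, requireNamespace, logical(1), quietly = TRUE)
]

if (length(not_installed) > 0) {
  message("Install before running: ", paste(not_installed, collapse = ", "))
} else {
  dir <- system.file("extdata", package = "rcdf")

  # Same call as the first example, using the sample files shipped with rcdf.
  rcdf_data <- rcdf::read_rcdf(
    path = file.path(dir, "mtcars.rcdf"),
    decryption_key = file.path(dir, "sample-private-key.pem")
  )
  rcdf_data
}
```

If a package is reported as missing, install.packages("dbplyr") (or the equivalent for rcdf) is the usual remedy before re-running the examples.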