Request for creating H5R_OBJECT access #225

@mvfki

I'm trying to create an H5AD file from R relying solely on hdf5r (i.e. without reticulate/Python). H5AD (encoding-version 0.1.0) stores a categorical variable as a zero-based integer representation in a 1D H5D array, plus an attribute pointing to another 1D H5D array that stores the "categories", analogous to R's factor levels. That attribute appears to be an H5R_OBJECT when I load an existing H5AD file via hdf5r and access it with h5attr(file[['obs/clusters']], "categories"), where file is the H5File object and "obs/clusters" is the path to the integer 1D array.
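
To make the layout described above concrete, here is a minimal base-R sketch of how a factor splits into the two pieces (zero-based codes and categories). The `clusters` data is hypothetical, and the sketch only prepares the values; it does not write anything to an HDF5 file or create the reference.

```r
# Hypothetical example data; any factor would do.
clusters <- factor(c("B", "A", "B", "C"), levels = c("A", "B", "C"))

# Zero-based integer codes (R factor codes are one-based, so subtract 1).
codes <- as.integer(clusters) - 1L

# Category labels in level order; this is what the attribute would point to.
categories <- levels(clusters)

# Round trip: rebuild the factor from codes + categories.
restored <- factor(categories[codes + 1L], levels = categories)
stopifnot(identical(restored, clusters))
```

Here `codes` would go into the integer 1D array (e.g. "obs/clusters") and `categories` into the separate array that the attribute references.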

I tried many approaches, but currently the only exported way of creating an H5R reference seems to be calling file[['path']]$create_reference(), which is hardcoded to return an H5R_DATASET_REGION object, and the H5AD library (i.e. AnnData in Python) does not recognize that. I also tried a rather dirty workaround, shown below, which simply copies the create_reference() source code and swaps in the other reference class. This works perfectly (no errors, and the H5 file I write loads smoothly with the Python AnnData library), but my R package then fails R CMD check because it calls something internal to hdf5r.

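# Copy of hdf5r's create_reference(), changed to build an H5R_OBJECT
# instead of an H5R_DATASET_REGION. `self` is the H5D object to reference.
.H5.create_reference <- function(self, ...) {
    space <- self$get_space()
    # Apply any selection in `...` to the dataspace (kept from the original)
    do.call("[", c(list(space), list(...)))
    ref_type <- hdf5r::h5const$H5R_OBJECT
    ref_obj <- hdf5r::H5R_OBJECT$new(1, self)
    # Unexported C entry point of hdf5r; this call triggers the R CMD check issue
    res <- .Call("R_H5Rcreate", ref_obj$ref, self$id, ".", ref_type,
                 space$id, FALSE, PACKAGE = "hdf5r")
    if (res$return_val < 0) {
        stop("Error creating object reference")
    }
    ref_obj$ref <- res$ref
    return(ref_obj)
}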

I would appreciate any suggestions for accomplishing this cleanly, or to hear whether you plan to add this as a feature in a future version.

Best,
Yichen
