Merged
40 changes: 20 additions & 20 deletions NEWS.md
@@ -1,31 +1,31 @@
# galah 2.2.0

### Improved organisational support
* `filter()` now builds predicate queries natively when atlas is set to `GBIF`. Filter now uses an object-oriented workflow.
* DOIs now supported for `GBIF`
* Kew gardens and Flanders living atlases added
* authentication supported for ALA users within `galah_config()`
* `filter()` now builds predicate queries natively when atlas is set to `GBIF`. `filter()` now uses an object-oriented workflow.
* DOIs now supported for `GBIF`.
* Kew gardens and Flanders living atlases added (#256).
* Authentication supported for ALA users within `galah_config()` (#189).

### New & amended functions
* `dplyr::distinct()` can be used to find grouped data and summaries, generalising `atlas_species()`
* new functions `as_query()` and `compound()` as prequels to `collapse()`
* `dplyr::distinct()` can be used to find grouped data and summaries, generalising `atlas_species()` (#284).
* New functions `as_query()` and `compound()` as prequels to `collapse()` (#278).
* `galah_call()` is now synonymous with `request_data()` rather than wrapping all `request_` functions; `method` argument is removed.

## Changes to metadata functions
* all metadata requests now accept `select()`
* metadata types that support `unnest()` now also support `filter()` when unnest is not supplied
* all `show_all()` and `search_all()` functions gain an `all_fields` argument
* metadata now supports list-columns where the API returns nested data
* metadata functions now return columns names in `snake_case` rather than `camelCase`
* all metadata functions support caching, and are affected by re-introduced `caching` argument in `galah_config()` (set to `TRUE` by default)
* media metadata now uses a different API to return more relevant information

### Minor and internal changes, bug fixes
* Move to `testthat` 3rd edition for improved test functionality
* move to `{cli}` for `print()` calls, not `cat()`
* reduce usage of `@importFrom` in favour of `pkg::fun()` syntax, as per R style guide
* `basisOfRecord` now included as default field (i.e. with `select(group = "basic")`) (#281)
* `query` objects now have a `request` slot showing the request that generated them
* All metadata requests now accept `select()`.
* Metadata types that support `unnest()` now also support `filter()` when `unnest()` is not supplied.
* All `show_all()` and `search_all()` functions gain an `all_fields` argument.
* Metadata now supports list-columns where the API returns nested data.
* Metadata functions now return column names in `snake_case` rather than `camelCase`.
* All metadata functions support caching, and are affected by re-introduced `caching` argument in `galah_config()` (set to `TRUE` by default).
* Media metadata now uses a different API to return more relevant information.

### Minor improvements and bug fixes
* Move to `testthat` 3rd edition for improved test functionality.
* Move to `{cli}` for `print()` calls, not `cat()`.
* Reduce usage of `@importFrom` in favour of `pkg::fun()` syntax, as per R style guide.
* `basisOfRecord` now included as default field (i.e. with `select(group = "basic")`) (#281).
* `query` objects now have a `request` slot showing the request that generated them.


# galah 2.1.2
29 changes: 20 additions & 9 deletions R/capture.R
@@ -1,25 +1,36 @@
#' Capture a request
#'
#' @description
#' The first step in evaluating a request is to capture and parse the
#' information it contains. The resulting object has class `prequery`
#' for those requiring further processing or `query` for those that don't.
#' for those requiring further processing or `query` for those that don't.
#' A `prequery` object shows the basic structure of what has been requested by
#' a user in a given [galah_call()].
#'
#' @name capture.data_request
#' @param x A `_request` object to convert to a `prequery`.
#' @param ... Other arguments, currently ignored
#' @details
#' Typically, queries in galah are piped using [galah_call()], which builds
#' an object of class `"data_request"`; or [request_metadata()] or
#' [request_files()]. All these objects can be converted to class `"query"`
#' using \code{\link[=collapse.data_request]{collapse()}}. However,
#' properly evaluating a query often requires building and running
#' additional queries to populate or validate the requested information.
#' [request_files()]. Under the hood, [galah_call()] consists of a series of
#' step-wise functions that run in order:
#'
#' [capture()] → [compound()] →
#' \code{\link[=collapse.data_request]{collapse()}} →
#' \code{\link[=compute.data_request]{compute()}} →
#' \code{\link[=collect.data_request]{collect()}}
#'
#' [capture()] is the first step of the [galah_call()] workflow, and it parses the
#' basic structure of a user request, returned as a `prequery` object.
#' A `prequery` object shows what has been requested, before those
#' calls are built by [compound()] and evaluated by
#' \code{\link[=collapse.data_request]{collapse()}}.
#' For simple cases, this gives the same result as running
#' \code{\link[=collapse.data_request]{collapse()}} while the `run_checks`
#' argument of [galah_config()] is set to `FALSE`, but is slightly faster.
#' In complex cases, it is simply a precursor to [compound()]
#' @name capture.data_request
#' @param x A `_request` object to convert to a `prequery`.
#' @param ... Other arguments, currently ignored
#' In complex cases, it is simply a precursor to [compound()].
#'
#' @order 1
#' @return Either an object of class `prequery` when further processing is
#' required; or `query` when it is not. Both classes are structurally identical,
3 changes: 2 additions & 1 deletion R/check_queue.R
@@ -29,7 +29,8 @@ check_queue_loop <- function(.query){
iter <- 1
verbose <- potions::pour("package", "verbose", .pkg = "galah")
if(verbose){
cli::cli_text("Current queue length: {current_queue}")
position <- glue::glue("Current queue length: {current_queue}")
cli::cli_text(position)
}
while(continue == TRUE){
.query <- check_occurrence_status(.query)
37 changes: 34 additions & 3 deletions R/collect_occurrences.R
@@ -44,13 +44,15 @@ collect_occurrences_default <- function(.query, wait, file, call){
# check queue
download_response <- check_queue(.query, wait = wait)
if(is.null(download_response)){
cli::cli_abort("No response from selected atlas",
cli::cli_abort("No response from selected atlas.",
call = call)
}
# get data
if(potions::pour("package", "verbose", .pkg = "galah") &
download_response$status == "complete") {
cli::cli_text("Downloading")

scrolly_dots_message("Downloading")
# cli::cli_par()
}
# sometimes lookup info critical, but not others - unclear when/why!
if(any(names(download_response) == "download_url")){
@@ -143,4 +145,33 @@ download_failed_message <- function(call){
i = "This usually suggests a problem with the download itself, rather than the API.",
i = "Consider checking that a file has been created in the expected location.") |>
cli::cli_abort(call = call)
}
}



#' Theatrics
#' @noRd
#' @keywords internal
scrolly_dots_message <- function(message) {

spinny <- cli::make_spinner(
which = "simpleDotsScrolling",
template = paste0(message, " {spin}")
)

# update the spinner 100 times
lapply(1:100, function(x) {
spinny$spin()
wait(.001)
})

# clear the spinner from the status bar
# spinny$finish()
}

#' Wait time
#' @noRd
#' @keywords internal
wait <- function(seconds = 1) {
Sys.sleep(seconds)
}
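
The spinner helper added in this diff can be sketched in isolation as follows. This is a minimal variant under stated assumptions, not the package's code: `spin_message()`, `n` and `delay` are hypothetical names, and unlike the helper above (where `spinny$finish()` is commented out) this version clears the spinner on exit.

```r
library(cli)

# Hypothetical helper (not part of galah): show a scrolling-dots spinner
# next to `message` for `n` ticks, then clear it from the status bar.
spin_message <- function(message, n = 20, delay = 0.001) {
  spinner <- cli::make_spinner(
    which = "simpleDotsScrolling",
    template = paste0(message, " {spin}")
  )
  # Assumption: clearing the spinner is desired; finish() runs even on error.
  on.exit(spinner$finish(), add = TRUE)
  for (i in seq_len(n)) {
    spinner$spin()   # advance the animation by one frame
    Sys.sleep(delay) # brief pause so the frames are visible
  }
  invisible(n)
}

ticks <- spin_message("Downloading")
```

Using `on.exit()` keeps the terminal clean if the loop is interrupted; whether the status line should persist after the download message is a design choice the diff leaves open.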
22 changes: 19 additions & 3 deletions R/compound.R
@@ -1,8 +1,10 @@
#' Force evaluation of a database query
#'
#' [compound()] is an S3 generic function intended to be called before
#' [collapse()]. It is important as it shows the full set of queries
#' required to properly evaluate the user's request. This is often broader
#' @description
#' [compound()] shows the full set of queries
#' required to properly evaluate the user's request, run prior to [collapse()].
#'
#' The set of queries needed for a single data request is often broader
#' than the single query returned by [collapse()]. If, for example,
#' the user's query includes a call to
#' \code{\link[=identify.data_request]{identify()}}, then a taxonomic query
@@ -14,6 +16,20 @@
#' @param x An object to be compounded. Works for `data_request`,
#' `metadata_request`, `file_request`, `query` or `prequery`.
#' @param ... Other arguments passed to [capture()].
#' @details
#' Typically, queries in galah are piped using [galah_call()], which builds
#' an object of class `"data_request"`; or [request_metadata()] or
#' [request_files()]. Under the hood, [galah_call()] consists of a series of
#' step-wise functions that run in order:
#'
#' [capture()] → [compound()] →
#' \code{\link[=collapse.data_request]{collapse()}} →
#' \code{\link[=compute.data_request]{compute()}} →
#' \code{\link[=collect.data_request]{collect()}}
#'
#' [compound()] is the second step of the [galah_call()] workflow. It collates
#' the complete list of queries that must be sent to meet the user's
#' data request, ahead of \code{\link[=collapse.data_request]{collapse()}}.
#' @order 1
#' @return An object of class `query_set`, which is simply a list of all `query`
#' objects required to properly evaluate the specified request. Objects are
4 changes: 1 addition & 3 deletions R/compute_occurrences.R
@@ -44,12 +44,10 @@ compute_occurrences_la <- function(.query){
check_occurrence_response()
if(potions::pour("package", "verbose")){
n_records <- status_code$total_records
cli::cli_par()
if(!is.null(.query$request$authenticate)){
cli::cli_text("Query sent including JWT token")
}
cli::cli_text("Request for {n_records} occurrences placed in queue")
cli::cli_end()
cli::cli_text("Request for {n_records} occurrences placed in queue.")
}
# return a useful object
c(list(type = "data/occurrences"),
26 changes: 22 additions & 4 deletions R/dplyr-collapse.R
@@ -1,18 +1,36 @@
#' Generate a query
#'
#' This function constructs a query so it can be inspected before being sent. It
#' is typically called at the end of a pipe begun with [galah_call()]. Objects
#' Constructs a query so it can be inspected before being sent. `collapse()` can
#' be called at the end of a pipe that begins with [galah_call()] to return the
#' query built from the user's data request
#' (a `query` object). Objects
#' of class `data_request` (created using [request_data()]), `metadata_request`
#' (from [request_metadata()]) or `files_request` (from [request_files()]) are
#' all supported. Any of these objects can be created using [galah_call()] via
#' the `method` argument.
#' all supported.
#' @name collapse.data_request
#' @order 1
#' @param x An object to run `collapse()` on. Classes supported by `galah`
#' include `data_request`, `metadata_request` and `files_request` for building
#' queries; and `prequery`, `query` or `query_set` once constructed (via
#' [capture()] or [compound()]).
#' @param ... Arguments passed on to [capture()].
#' @details
#' Typically, queries in galah are piped using [galah_call()], which builds
#' an object of class `"data_request"`; or [request_metadata()] or
#' [request_files()]. Under the hood, [galah_call()] consists of a series of
#' step-wise functions that run in order:
#'
#' [capture()] → [compound()] →
#' \code{\link[=collapse.data_request]{collapse()}} →
#' \code{\link[=compute.data_request]{compute()}} →
#' \code{\link[=collect.data_request]{collect()}}
#'
#' \code{\link[=collapse.data_request]{collapse()}} constructs a complete
#' user query, ready to be sent by
#' \code{\link[=compute.data_request]{compute()}}.
#' The information required to construct a complete user query is
#' provided by [capture()] and [compound()], the preceding functions that
#' parse and combine the API calls needed to build a user's query.
#' @return An object of class `query`, which is a list-like object containing
#' two or more of the following slots:
#'
21 changes: 18 additions & 3 deletions R/dplyr-collect.R
@@ -1,7 +1,8 @@
#' Retrieve a database query
#'
#' This function retrieves the specified query from the server. It is the
#' default way to end a piped query begun with [galah_call()].
#'
#' @description
#' Retrieve the result of a query from the server. It is the
#' default way to end a piped query that begins with [galah_call()].
#' @name collect.data_request
#' @order 1
#' @param x An object of class `data_request`, `metadata_request` or
@@ -15,6 +16,20 @@
#' @param file (Optional) file name. If not given, will be set to `data` with
#' date and time added. The file path (directory) is always given by
#' `galah_config()$package$directory`.
#' @details
#' Typically, queries in galah are piped using [galah_call()], which builds
#' an object of class `"data_request"`; or [request_metadata()] or
#' [request_files()]. Under the hood, [galah_call()] consists of a series of
#' step-wise functions that run in order:
#'
#' [capture()] → [compound()] →
#' \code{\link[=collapse.data_request]{collapse()}} →
#' \code{\link[=compute.data_request]{compute()}} →
#' \code{\link[=collect.data_request]{collect()}}
#'
#' \code{\link[=collect.data_request]{collect()}} is the final step of the
#' [galah_call()] workflow, and it retrieves the result of a
#' query once it is processed by the server.
#' @return In most cases, `collect()` returns a `tibble` containing requested
#' data. Where the requested data are not yet ready (i.e. for occurrences when
#' `wait` is set to `FALSE`), this function returns an object of class `query`
36 changes: 28 additions & 8 deletions R/dplyr-compute.R
@@ -1,19 +1,39 @@
#' Compute a query
#'
#' This function sends a request for information to a server. This is only
#' useful for processes that run a server-side process, as it separates the
#' submission of the request from its' retrieval. Within galah, this is used
#' exclusively for generating occurrence queries, where calling
#' \code{\link[=compute.data_request]{compute()}} and then passing
#' the resulting `query` object to \code{\link[=collect.data_request]{collect()}}
#' at a later time can be preferable to calling [atlas_occurrences()], which
#' @description
#' Sends a request for information to a server. This is useful
#' for requests that run a server-side process, as it separates the
#' submission of the request from its retrieval.
#'
#' Within galah, `compute()` is generally hidden as it is one part of the overall
#' process to complete a `data_request`,
#' `metadata_request` or `file_request`. However, calling
#' \code{\link[=compute.data_request]{compute()}} at the
#' end of a [galah_call()] sends a request to be completed server-side
#' (i.e., outside of R), and the result can be returned in R by
#' calling \code{\link[=collect.data_request]{collect()}}
#' at a later time. This can be preferable to calling [atlas_occurrences()], which
#' prevents execution of new code until the server-side process is complete.
#' @name compute.data_request
#' @order 1
#' @param x An object of class `data_request`, `metadata_request` or
#' `files_request` (i.e. constructed using a pipe) or `query`
#' (i.e. constructed by `collapse()`)
#' (i.e. constructed by \code{\link[=collapse.data_request]{collapse()}})
#' @param ... Arguments passed on to other methods
#' @details
#' Typically, queries in galah are piped using [galah_call()], which builds
#' an object of class `"data_request"`; or [request_metadata()] or
#' [request_files()]. Under the hood, [galah_call()] consists of a series of
#' step-wise functions that run in order:
#'
#' [capture()] → [compound()] →
#' \code{\link[=collapse.data_request]{collapse()}} →
#' \code{\link[=compute.data_request]{compute()}} →
#' \code{\link[=collect.data_request]{collect()}}
#'
#' \code{\link[=compute.data_request]{compute()}} sends a query to a server.
#' Once the server has processed the query, the result can be retrieved using
#' \code{\link[=collect.data_request]{collect()}}.
#' @return An object of class `computed_query`, which is identical to class
#' `query` except for occurrence data, where it also contains information on the
#' status of the request.
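
The staged workflow described in the roxygen blocks above (capture, compound, collapse, compute, collect) can be illustrated with a toy model in base R. This is a sketch under loud assumptions: every `toy_*` function and the request fields are hypothetical stand-ins, no server is contacted, and none of this is galah's implementation. Only the class names (`prequery`, `query_set`, `query`, `computed_query`) are taken from the documentation in this diff.

```r
# Toy model only: hypothetical stand-ins, not galah's code. No network access.

# Stage 1: record what was asked for, without resolving anything.
toy_capture <- function(request) {
  structure(list(type = "data/occurrences", request = request),
            class = "prequery")
}

# Stage 2: list every query needed, e.g. a taxonomic lookup placed
# before the data query when the request names a taxon.
toy_compound <- function(prequery) {
  queries <- list(prequery)
  if (!is.null(prequery$request$identify)) {
    lookup <- structure(
      list(type = "metadata/taxa", request = prequery$request["identify"]),
      class = "prequery"
    )
    queries <- c(list(lookup), queries)
  }
  structure(queries, class = "query_set")
}

# Stage 3: reduce the set to the single final query.
toy_collapse <- function(query_set) {
  final <- query_set[[length(query_set)]]
  class(final) <- "query"
  final
}

# Stage 4: "submit" the query; here that only records a status.
toy_compute <- function(query) {
  query$status <- "queued"
  class(query) <- "computed_query"
  query
}

# Stage 5: "retrieve" the result; a placeholder data frame stands in
# for the downloaded data.
toy_collect <- function(computed_query) {
  data.frame(taxon = computed_query$request$identify, count = NA_integer_)
}

request <- list(identify = "Litoria", filter = "year >= 2020")
query_set <- toy_compound(toy_capture(request))
result <- request |>
  toy_capture() |>
  toy_compound() |>
  toy_collapse() |>
  toy_compute() |>
  toy_collect()
```

Splitting submission (`toy_compute()`) from retrieval (`toy_collect()`) mirrors the motivation given for `compute()`: the intermediate object can be kept and collected later, rather than blocking until the data are ready.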
31 changes: 18 additions & 13 deletions R/galah_config.R
@@ -1,10 +1,13 @@
#' View or set package behaviour
#'
#'
#' @description
#' The `galah` package supports queries to a number of different data providers,
#' and once selected, it is desirable that all later queries are sent to that
#' organisation. Rather than supply this information separately in each
#' query, therefore, it is more parsimonious to cache that information centrally
#' and call it as needed, which is what this function supports. Beyond choosing
#' query, it is more parsimonious to cache it centrally
#' and call it as needed, which is what this function supports.
#'
#' Beyond choosing
#' an organisation, there are several other use cases for caching. Many
#' GBIF nodes require the user to supply a registered email address,
#' password, and (in some cases) a reason for downloading data, all stored via
@@ -19,23 +22,25 @@
#' Valid arguments to this function are:
#'
#' * `atlas` string: Living Atlas to point to, Australia by default. Can be
#' an organisation name, acronym, or region (see [show_all_atlases()] for
#' admissible values)
#' * `authenticate` logical: should `galah` authenticate your queries using
#' JWT tokens? Defaults to `FALSE`.
#' an organisation name, acronym, or region (see [show_all_atlases()] for
#' admissible values)
#' * `authenticate` logical: Should `galah` authenticate your queries using
#' JWT tokens? Defaults to `FALSE`. If `TRUE`, user credentials are
#' verified prior to sending a query. This can allow users with special
#' access to download additional information in `galah`.
#' * `caching` logical: should metadata query results be cached in `options()`?
#' Defaults to `TRUE` for improved stability and speed.
#' * `directory` string: the directory to use for the disk cache.
#' * `directory` string: The directory to use for the disk cache.
#' By default this is a temporary directory, which means that results will
#' only be cached within an R session and cleared automatically when the user
#' exits R. The user may wish to set this to a non-temporary directory for
#' caching across sessions. The directory must exist on the file system.
#' * `download_reason_id` numeric or string: the "download reason" required
#' by some ALA services, either as a numeric ID (currently 0--13)
#' or a string (see `show_all(reasons)` for a list of valid ID codes and
#' names). By default this is NA. Some ALA services require a valid
#' download_reason_id code, either specified here or directly to the
#' associated R function.
#' by some ALA services, either as a numeric ID (currently 0--13)
#' or a string (see `show_all(reasons)` for a list of valid ID codes and
#' names). By default this is NA. Some ALA services require a valid
#' download_reason_id code, either specified here or directly to the
#' associated R function.
#' * `email` string: An email address that has been registered with the chosen
#' atlas. For the ALA, you can register at
#' [this address](https://auth.ala.org.au/userdetails/registration/createAccount).