50 changes: 43 additions & 7 deletions examples/GCP/BigQuery/README.md
@@ -1,23 +1,33 @@
# BigQuery Flex Integration

This directory contains two complementary Flex integrations for BigQuery, each using a different approach.

| File | Approach | Best for |
|---|---|---|
| `bq.yml` | `bq` CLI tool | Hosts with GCP CLI installed |
| `bq-rest.yml` | BigQuery REST API | Containers or hosts without `bq` CLI |

The most useful information is found under the [INFORMATION_SCHEMA](https://cloud.google.com/bigquery/docs/information-schema-intro) views.


## CLI Approach (`bq.yml`)

### Prerequisites

* [Infrastructure Agent](https://docs.newrelic.com/docs/infrastructure/infrastructure-agent/linux-installation/package-manager-install/) installed
* [GCP CLI](https://cloud.google.com/sdk/docs/install#linux) installed and configured on the host running the infrastructure agent
* [GCP Service Account](https://developers.google.com/identity/protocols/oauth2/service-account#creatinganaccount) created, with its JSON key file downloaded

The service account must have the following permissions:

* `bigquery.tables.get`
* `bigquery.tables.list`
* `bigquery.routines.get`
* `bigquery.routines.list`
* `bigquery.jobs.listAll`
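If no predefined role covers exactly these permissions, one way to grant them is a custom role. A minimal sketch, assuming hypothetical role, project, and service-account names (`bqFlexReader`, `my-project-id-123456`, `flex-sa@...`):

```bash
# Create a custom role carrying the permissions listed above.
# Role ID, project, and service-account email are placeholders.
gcloud iam roles create bqFlexReader \
  --project=my-project-id-123456 \
  --title="BigQuery Flex Reader" \
  --permissions=bigquery.tables.get,bigquery.tables.list,bigquery.routines.get,bigquery.routines.list,bigquery.jobs.listAll

# Bind the custom role to the service account used by the integration.
gcloud projects add-iam-policy-binding my-project-id-123456 \
  --member="serviceAccount:flex-sa@my-project-id-123456.iam.gserviceaccount.com" \
  --role="projects/my-project-id-123456/roles/bqFlexReader"
```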

### Installation


1. Copy the service account JSON key file to the host running the integration
2. Copy `bq.yml` to `/etc/newrelic-infra/integrations.d/`
3. Authenticate the CLI with the service account key file:
@@ -36,9 +46,35 @@ Comment out the `bq-auth` block after this is done successfully.

5. [Restart the infrastructure agent](https://docs.newrelic.com/docs/infrastructure/infrastructure-agent/manage-your-agent/start-stop-restart-infrastructure-agent/)
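The authentication step can also be run by hand before enabling the integration; a quick check that the key file works, with placeholder path and project ID:

```bash
# Activate the service account for both gcloud and the bq CLI.
gcloud auth activate-service-account --key-file=/path/to/key.json

# Sanity check: list datasets in the project using the new credentials.
bq ls --project_id=my-project-id-123456
```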

### Configuration


The `bq.yml` configuration requires the service account email, GCP project ID, and region. These values are substituted dynamically into each `bq` CLI command that runs, so any additional queries you add can follow the same format as the provided examples.

Additionally, the polling interval (in seconds) can be set at the top, and the `INSIGHTS*` environment variables can be used to strip the infrastructure agent metadata attached to each payload forwarded to New Relic. These are configured with an ingest key and an account ID embedded in the URL variable.
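As a sketch of that standalone mode, assuming the `INSIGHTS_API_KEY` and `INSIGHTS_URL` variable names from the Flex docs (the key and account ID below are placeholders):

```bash
# Hypothetical values; substitute your own ingest key and account ID.
export INSIGHTS_API_KEY="<YOUR_INGEST_KEY>"
export INSIGHTS_URL="https://insights-collector.newrelic.com/v1/accounts/<YOUR_ACCOUNT_ID>/events"
```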


## REST API Approach (`bq-rest.yml`)

Use this approach when the `bq` CLI is not available on the host (e.g., containers, restricted environments). It calls the BigQuery REST API directly using an OAuth2 bearer token obtained via `gcloud`.

### Pre-requirements

* [Infrastructure Agent](https://docs.newrelic.com/docs/infrastructure/infrastructure-agent/linux-installation/package-manager-install/) installed
* `gcloud` available in PATH with [application-default credentials](https://cloud.google.com/docs/authentication/application-default-credentials) configured

### Installation

1. Configure application-default credentials on the host:

```bash
gcloud auth application-default login
```

2. Copy `bq-rest.yml` to `/etc/newrelic-infra/integrations.d/`
3. [Restart the infrastructure agent](https://docs.newrelic.com/docs/infrastructure/infrastructure-agent/manage-your-agent/start-stop-restart-infrastructure-agent/)
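Before relying on the integration, it can help to confirm that a token can be minted and that the API is reachable. A minimal check, assuming placeholder project and dataset IDs:

```bash
# Mint a short-lived OAuth2 token from application-default credentials.
TOKEN=$(gcloud auth application-default print-access-token)

# Call the same tables endpoint the integration uses.
curl -s -H "Authorization: Bearer ${TOKEN}" \
  "https://bigquery.googleapis.com/bigquery/v2/projects/my-project-id-123456/datasets/my_dataset/tables"
```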

### Configuration

Set `project_id` and `dataset_id` in the `variable_store` section of `bq-rest.yml`. These are substituted into the API URLs and query payloads at runtime.

The integration uses the Flex `lookup` feature to chain the token-fetch step into subsequent API calls, so no manual token management is required.
41 changes: 41 additions & 0 deletions examples/GCP/BigQuery/bq-rest.yml
@@ -0,0 +1,41 @@
integrations:
  - name: nri-flex
    interval: 300s
    config:
      name: gcp-bq-rest-flex
      variable_store:
        project_id: my-project-id-123456 # Required
        dataset_id: my_dataset # Required
      apis:
        ## Fetch an OAuth2 access token using application-default credentials.
        ## gcloud caches the token locally (valid for ~1 hour) and only
        ## refreshes it near expiry, so this is not a network call on every poll.
        - name: gettoken
          ignore_output: true
          commands:
            - run: echo "access_token:$(gcloud auth application-default print-access-token)"
              split_by: ':'
        ## List all tables in the dataset with metadata (rows, size, type)
        - name: bq-rest-table-list
          event_type: bq_rest_table_stats
          method: GET
          url: https://bigquery.googleapis.com/bigquery/v2/projects/${var:project_id}/datasets/${var:dataset_id}/tables
          headers:
            Authorization: Bearer ${lookup.gettokenSample:access_token}
            Content-Type: application/json
          jq: '.tables[]'
        ## Run a query against INFORMATION_SCHEMA to get table storage stats
        - name: bq-rest-storage
          event_type: bq_rest_storage_stats
          method: POST
          url: https://bigquery.googleapis.com/bigquery/v2/projects/${var:project_id}/queries
          headers:
            Authorization: Bearer ${lookup.gettokenSample:access_token}
            Content-Type: application/json
          payload: >
            {
              "query": "SELECT table_name, total_rows, total_logical_bytes, total_physical_bytes FROM `${var:project_id}`.`${var:dataset_id}`.INFORMATION_SCHEMA.TABLE_STORAGE",
              "useLegacySql": false,
              "timeoutMs": 30000
            }
          jq: '.rows[].f | {table_name: .[0].v, total_rows: .[1].v, total_logical_bytes: .[2].v, total_physical_bytes: .[3].v}'
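The final `jq` expression flattens BigQuery's positional row encoding (`rows[].f[].v`) into named fields. Run against a hypothetical single-row `jobs.query` response, the same filter behaves like this:

```bash
# Hypothetical response fragment in BigQuery's rows/f/v encoding.
cat <<'EOF' | jq -c '.rows[].f | {table_name: .[0].v, total_rows: .[1].v, total_logical_bytes: .[2].v, total_physical_bytes: .[3].v}'
{"rows":[{"f":[{"v":"events"},{"v":"1200"},{"v":"4096"},{"v":"2048"}]}]}
EOF
# → {"table_name":"events","total_rows":"1200","total_logical_bytes":"4096","total_physical_bytes":"2048"}
```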