| 1 | +# Enrichment Agent |
| 2 | + |
| 3 | +The enrichment agent for Knowledge Catalog provides a customizable agentic |
| 4 | +workflow for extracting information from various sources to build metadata |
| 5 | +about data assets, which can then be used as context. |
| 6 | + |
| 7 | +## Usage |
| 8 | + |
| 9 | +### Prerequisites |
| 10 | + |
| 11 | +The enrichment agent depends on the [Metadata as Code](../mdcode/README.md) capability. |
| 12 | +Follow the instructions on that page to use the `kcmd` tool.
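
Before running the commands in the next section, it can help to confirm that both tools are installed. The following is a minimal sketch, assuming `kcmd` and `kcenrich` are installed as executables on your `PATH`; the `missing_tools` helper is a hypothetical name used only for illustration and is not part of the toolbox.

```bash
# Print the name of every tool in the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumption: both CLIs are expected to be reachable through PATH.
missing="$(missing_tools kcmd kcenrich)"
if [ -n "$missing" ]; then
  echo "Missing required tools: $missing" >&2
fi
```

An empty result means both tools were found; otherwise the missing names are reported on stderr.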
| 13 | + |
| 14 | +### CLI |
| 15 | + |
| 16 | +The package provides the `kcenrich` CLI tool, distributed as a standalone binary.
| 17 | + |
| 18 | +```bash |
| 19 | +# Initialize a new catalog snapshot for a BigQuery dataset
| 20 | +kcmd init --bigquery-dataset <projectId>.<datasetId>
| 21 | +
| 22 | +# Initialize a new catalog snapshot for a BigQuery dataset with specific types
| 23 | +kcmd init --bigquery-dataset <projectId>.<datasetId> |
| 24 | + |
| 25 | +# Pull the latest catalog snapshot from the Knowledge Catalog service |
| 26 | +kcmd pull |
| 27 | + |
| 28 | +# Run the enrichment tool |
| 29 | +kcenrich catalog --path . --config-path ../demo |
| 30 | +``` |
| 31 | +## Developer Workflow |
| 32 | + |
| 33 | +### Setup |
| 34 | + |
| 35 | +```bash |
| 36 | +git clone https://github.com/googlecloudplatform/knowledge-catalog
| 37 | +cd knowledge-catalog/toolbox/mac
| 38 | +npm install |
| 39 | +``` |
| 40 | + |
| 41 | +### Build |
| 42 | + |
| 43 | +```bash |
| 44 | +npm run build |
| 45 | +``` |
| 46 | + |
| 47 | +### Test |
| 48 | + |
| 49 | +```bash |
| 50 | +npm run test |
| 51 | +``` |
| 52 | + |
| 53 | +### Demo |
| 54 | + |
| 55 | +The repository contains a self-contained demo. Running the demo involves creating a BigQuery dataset and a Dataplex EntryGroup within your cloud project. |
| 56 | + |
| 57 | +**Initialize Environment** |
| 58 | +```bash |
| 59 | +export DEMO_CLOUD_PROJECT="<your-gcp-project-id>" |
| 60 | +``` |
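
The remaining demo steps all read `DEMO_CLOUD_PROJECT`, so a quick guard can catch an unset variable or a leftover placeholder early. This is a minimal sketch; `check_project_id` is a hypothetical helper, not part of the toolbox, and it only checks that the value is non-empty and no longer the `<...>` placeholder.

```bash
# Return success only if the argument is non-empty and not a "<...>" placeholder.
check_project_id() {
  case "$1" in
    ""|"<"*) return 1 ;;
    *) return 0 ;;
  esac
}

if ! check_project_id "${DEMO_CLOUD_PROJECT:-}"; then
  echo "DEMO_CLOUD_PROJECT is unset or still a placeholder" >&2
fi
```

The guard does not verify that the project exists or that you have access to it; the `gcloud` and `bq` commands will report those errors themselves.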
| 61 | + |
| 62 | +**Initialize gcloud** |
| 63 | +```bash |
| 64 | +gcloud auth application-default login |
| 65 | +gcloud config set project $DEMO_CLOUD_PROJECT |
| 66 | +gcloud config set compute/region us |
| 67 | +``` |
| 68 | + |
| 69 | +**Set up demo resources**
| 70 | +```bash
| 71 | +# Create a BigQuery dataset and table (dataset IDs allow only letters, digits, and underscores)
| 72 | +bq mk -d ${DEMO_CLOUD_PROJECT}:demo_dataset
| 73 | +bq mk -t ${DEMO_CLOUD_PROJECT}:demo_dataset.demo_table name:string,value:string
| 74 | +```
| 75 | +
| 76 | +**Create and populate a catalog snapshot**
| 77 | +```bash
| 78 | +mkdir -p catalog
| 79 | +cd catalog
| 80 | +kcmd init --bigquery-dataset ${DEMO_CLOUD_PROJECT}.demo_dataset
| 81 | +kcmd pull
| 82 | +```
| 83 | +
| 84 | +**Enrich the metadata**
| 85 | +```bash
| 86 | +kcenrich catalog --path . --config-path ../config
| 87 | +```
| 88 | +
| 89 | +**Clean up**
| 90 | +```bash
| 91 | +bq rm -r ${DEMO_CLOUD_PROJECT}:demo_dataset
| 92 | +``` |