Merged

48 commits
c1f8e0f
draft commit of contract llm
Jul 29, 2025
140ac2f
draft codebase with experiment code
Aug 4, 2025
415f87d
experiment script
Aug 4, 2025
0f1985c
updated draft contract mapping tool
Aug 7, 2025
f214130
make llm deterministic and reproducible
DavisAgyemang Aug 12, 2025
be17098
new experiment and readme edited and removed estate services from prompt
DavisAgyemang Aug 13, 2025
3c282ed
improved readme
DavisAgyemang Aug 13, 2025
c6eab3e
Add pandas and dependencies
SamuelHLewis Aug 15, 2025
b24ca04
resolving feedback
DavisAgyemang Aug 18, 2025
7af0dff
Merge pull request #1 from Crown-Commercial-Service/AI-105-contractmap
SamuelHLewis Aug 19, 2025
db33698
api feature draft completed
DavisAgyemang Aug 27, 2025
823c14b
added requirements.txt which is needed for deployment and added uvico…
DavisAgyemang Sep 4, 2025
04d3f41
Merge pull request #3 from Crown-Commercial-Service/AI-108-api
DavisAgyemang Sep 23, 2025
5905b1d
added the api for version 2 llm
DavisAgyemang Sep 23, 2025
f9afef4
Merge pull request #4 from Crown-Commercial-Service/AI-108-api
DavisAgyemang Sep 23, 2025
a9df023
optimised model 91%
DavisAgyemang Oct 7, 2025
cb0f124
notebook experiment
DavisAgyemang Oct 7, 2025
abaafcd
Add config and instructions for jupyter notebook
SamuelHLewis Oct 8, 2025
3e37017
Merge pull request #5 from Crown-Commercial-Service/AI-117-optmise-model
DavisAgyemang Oct 8, 2025
8231292
Add Githib actions and pre commit hook
georges1996 Feb 18, 2026
117ad22
Add tests
ccs-gs Feb 18, 2026
115def1
Add linting fixes
SamuelHLewis Feb 24, 2026
4204b7b
Merge pull request #10 from Crown-Commercial-Service/add-github-actions
SamuelHLewis Feb 24, 2026
855561d
async is working now
Mar 10, 2026
2fdd06b
Merge pull request #11 from Crown-Commercial-Service/AI-164-Async-con…
DavisAgyemang Mar 10, 2026
d8a38ba
Restructure codebase to clarify which bit does what
SamuelHLewis Mar 12, 2026
ca121be
Reformat prompts to markdown
SamuelHLewis Mar 12, 2026
d3c8c8d
Change prompt imports to work with markdown
SamuelHLewis Mar 12, 2026
c8b3a2c
Add ignore for snyk
SamuelHLewis Mar 12, 2026
8a89d95
Fix import statement error
SamuelHLewis Mar 12, 2026
efe053c
Merge pull request #12 from Crown-Commercial-Service/ai-222-mlflow
ccs-gs Mar 13, 2026
bc00c70
Refactor Eval scripts
ccs-gs Mar 16, 2026
9fef3b3
Update README
ccs-gs Mar 16, 2026
af413d5
Update classification_v2
ccs-gs Mar 16, 2026
66ca219
Remove leftover poetry files
SamuelHLewis Mar 17, 2026
93fb1f5
Merge pull request #13 from Crown-Commercial-Service/refactor-eval-sc…
ccs-gs Mar 17, 2026
43b5c28
Add ML Flow
ccs-gs Mar 19, 2026
de7fd4b
Add configuration for AzureML-MLFlow
SamuelHLewis Mar 20, 2026
1c1acc2
Always send data to azure mlflow
ccs-gs Mar 20, 2026
e92cbea
Merge pull request #14 from Crown-Commercial-Service/add-ml-flow
SamuelHLewis Mar 20, 2026
566a612
fixed errors in new version of contract map due mlflow changes that w…
Mar 25, 2026
30bd893
fixed the bug ready for deployment
Mar 25, 2026
755a37a
new branch to merge to branch_for_appservice_fix
Mar 25, 2026
831f1f9
Merge pull request #17 from Crown-Commercial-Service/brach_for_appser…
DavisAgyemang Mar 25, 2026
2e3371e
Add instructions for local installation
SamuelHLewis Mar 27, 2026
bed0b01
Fix import statements to remove src prefix
SamuelHLewis Mar 27, 2026
b4472a2
Merge pull request #16 from Crown-Commercial-Service/branch_for_appse…
DavisAgyemang Apr 13, 2026
bc88d60
updated how it libraries are imported
Apr 13, 2026
14 changes: 14 additions & 0 deletions .env.example
@@ -0,0 +1,14 @@
############################
# Azure OpenAI (required)
############################
AZURE_OPENAI_API_VERSION=2024-02-15-preview
AZURE_OPENAI_ENDPOINT=https://your-resource-name.openai.azure.com/
AZURE_OPENAI_KEY=your-azure-openai-api-key
DEPLOYMENT_NAME=your-chat-model-deployment

############################
# MLflow (required for evaluation)
############################
# Required by evaluation/run_evaluation.py
MLFLOW_TRACKING_URI=azureml://<region>.api.azureml.ms/mlflow/v1.0/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace-name>
MLFLOW_EXPERIMENT_NAME=ContractMap-Evaluation
49 changes: 49 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,49 @@
name: CI

on:
push:
branches:
- "**"

jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
cache: "pip"

- name: Install Ruff
run: python -m pip install --upgrade pip ruff

- name: Ruff (syntax / undefined-name checks)
run: ruff check .

pytest:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
cache: "pip"

- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y unixodbc-dev

- name: Install Python dependencies
run: |
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
python -m pip install -e .

- name: Run tests
run: pytest -q
14 changes: 14 additions & 0 deletions .gitignore
@@ -0,0 +1,14 @@
.env
venv/*
*.csv
*.xlsx
**/__pycache__
**egg-info
*.DS_Store
*ipynb_checkpoints
*.vscode
.github/instructions/
.idea/
Davis_atamis_test.xlsx
nhs_data_evaluation.py
tenders_experiment.ipynb
10 changes: 10 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,10 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.1.14
hooks:
# Run the linter.
- id: ruff
args: [ --fix ]
# Run the formatter.
- id: ruff-format
1 change: 1 addition & 0 deletions .python-version
@@ -0,0 +1 @@
3.11
151 changes: 150 additions & 1 deletion README.md
@@ -1 +1,150 @@
# ccs-contract-map


This repository provides a tool to automatically label contract descriptions using the CCS categories. It leverages a Large Language Model (LLM) to classify contract descriptions into predefined categories accurately and consistently.

## Project Structure

```
ccs-contract-map/
├── src/ # Source code
│ ├── core/ # Core classification modules
│ │ ├── classification_v1.py # Version 1 (SystemMessage/HumanMessage)
│ │ └── classification_v2.py # Version 2 (string concatenation)
│ └── api/ # FastAPI endpoints
│ ├── v1_endpoint.py # API for version 1
│ └── v2_endpoint.py # API for version 2
├── prompts/ # System prompts
│ ├── system_prompts.py # Python prompt definitions
│ ├── new_system_prompt.txt # Text-based prompt
│ └── contractmap_prompt_with_descriptions.txt
├── evaluation/ # Evaluation scripts
│ ├── run_evaluation.py # Unified evaluation CLI (v1 or v2)
│ └── prompt_engineering_experiment.ipynb
├── utils/ # Utility modules
│ └── file_io/ # File I/O utilities
│ └── file_to_string.py
├── data/ # Data files
│ ├── input/ # Input datasets
│ └── results/ # Evaluation results
└── tests/ # Unit tests
```

## Features

- Uses a gpt-4.1-mini LLM for classification
- There are two LLM architectures: Version 1 (`src/core/classification_v1.py`) and Version 2 (`src/core/classification_v2.py`)
- Version 1 uses role-tagged messages (SystemMessage + HumanMessage), so instructions are treated as high priority and protected from user input, whereas Version 2 sends one raw string with explicit newlines that mixes instructions with content
- On a benchmark of 74 descriptions, Version 1 achieved 87.7% accuracy and Version 2 achieved 89.0%. However, Version 1 is safer: its SystemMessage separates instructions from user input, so the model treats those instructions as higher priority and user text finds them harder to override. This reduces the risk of prompt injection.
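
The architectural difference can be sketched as follows. This is a minimal illustration using OpenAI-style role dicts to stand in for SystemMessage/HumanMessage; the prompt text is a placeholder, not the project's actual prompt:

```python
SYSTEM_PROMPT = "You classify contract descriptions into CCS categories."  # placeholder prompt


def build_messages_v1(description: str) -> list[dict]:
    # Version 1: role-tagged messages keep instructions separate from user
    # input, so the model treats the system content as higher priority.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": description},
    ]


def build_prompt_v2(description: str) -> str:
    # Version 2: one raw string, mixing instructions and content with newlines.
    return f"{SYSTEM_PROMPT}\n\nContract description:\n{description}"
```

In the Version 1 layout, user-supplied text can never occupy the system slot, which is what makes injected "ignore previous instructions" content easier for the model to resist.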

Note: According to Microsoft, it is not possible to obtain 100% deterministic
results from LLMs: when you repeat an experiment, the model's outputs can vary
by a few percentage points. This is because many queries are subjective or
admit multiple valid answers, so the model may produce different responses on
different runs. Setting temperature to 0 reduces randomness but does not
guarantee identical outputs. For more information, see:
https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/reproducible-output?tabs=python
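
As a sketch, the request parameters that reduce (but do not eliminate) run-to-run variation look like this. The values are illustrative, and `seed` is a best-effort reproducibility hint on Azure OpenAI, not a hard guarantee:

```python
# Illustrative chat-completion parameters for more reproducible output.
request_params = {
    "temperature": 0,  # minimise sampling randomness
    "top_p": 1,        # no nucleus truncation; rely on temperature alone
    "seed": 42,        # best-effort reproducibility hint, not a guarantee
}
```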

A recent experiment comparing system prompts for the version 2 LLM can be seen
here: https://docs.google.com/document/d/1faUwE-W7Eh3n6qg4sblHMjMtkmziJCG9/edit#heading=h.vigkhlj1brjf
The experiment was run using `evaluation/run_demo_v2.py`

## How It Works

1. Input: the LLM is given a contract description text
2. Processing: the LLM uses the system prompt to work out how to categorise the given contract description
3. Output: the LLM returns the single CCS category label that best fits the contract description
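
The three steps above can be sketched as a single function. Here `llm_call` stands in for whatever chat-completion client the project actually uses; it is an assumption for illustration, not the repository's real API:

```python
def classify(description: str, system_prompt: str, llm_call) -> str:
    """Steps 1-3: build the messages, send them to the LLM, return one label."""
    messages = [
        {"role": "system", "content": system_prompt},  # step 2: categorisation rules
        {"role": "user", "content": description},      # step 1: contract description
    ]
    # Step 3: the model's reply is the single best-fitting CCS category label.
    return llm_call(messages).strip()


# Usage with a stubbed LLM call standing in for the real client:
label = classify(
    "Supply of office desks", "Return exactly one CCS category.", lambda m: " Furniture\n"
)
```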

## How To Install Locally

1. Create a venv:
```
python -m venv venv
```
2. Load the venv:
```
source venv/bin/activate
```
3. Update pip:
```
python -m pip install --upgrade pip
```
4. Install the dependencies:
```
python -m pip install -r requirements.txt
```

## Developer Tooling (Pre-commit, Ruff, pytest)

This project uses:

- [pre-commit](https://pre-commit.com/) for running checks automatically before each commit.
- [Ruff](https://docs.astral.sh/ruff/) for fast linting.
- [pytest](https://docs.pytest.org/) for unit testing.

### Set up pre-commit hooks

Install hooks locally:

```bash
pre-commit install
```

Run all hooks manually across the repository:

```bash
pre-commit run --all-files
```

### Run Ruff and pytest manually

Run Ruff:

```bash
ruff check .
```

Run tests:

```bash
pytest -q
```

## How to run on your own PC

### From the terminal

1. Go into the repo folder on your command line using `cd`
2. Create a `.env` file to load your Azure credentials (name your credentials as shown below):
 - AZURE_OPENAI_API_VERSION
 - AZURE_OPENAI_ENDPOINT
 - AZURE_OPENAI_KEY
 - DEPLOYMENT_NAME
3. Make sure you have an AI Category Mapping CSV containing descriptions and their categories, with the columns labelled `Description` and `Category`. Place this file at `data/input/AI Category Mapping - Category Desc Examples_new.csv`
4. Run the unified evaluation script:
```bash
python evaluation/run_evaluation.py --mapper v2
```
Optional arguments:
- `--truth-set /path/to/truth.csv` to use a different truth set file.
- `--prompt system_prompt_v2.md` to choose a prompt from `prompts/`.
- `--list-prompts` to print available prompt files.
- `--mlflow-tracking-uri <azureml://...>` to set tracking server (or use `MLFLOW_TRACKING_URI`).
- `--mlflow-experiment-name ContractMap-Evaluation` to set experiment (or use `MLFLOW_EXPERIMENT_NAME`).
- `--mlflow-run-name my-run` to set a custom run name.

`run_evaluation.py` always logs to MLflow (params, metrics, prompt, and results CSV).

### From a jupyter notebook

1. Install the environment as a jupyter kernel by running `poetry run python -m ipykernel install --user --name="ccs-contract-map"`
2. Launch a jupyter lab session by running `poetry run jupyter lab`
3. In the jupyter lab landing page that launches in your browser, select the kernel `ccs-contract-map`
4. Open the notebook `prompt_engineering_experiment.ipynb`

**Note:** if your browser doesn't automatically load the jupyter lab landing page, you may need to follow the link that is displayed in the terminal instead