Add grounded information extraction rewards for GRPO #722

Open
ALI-AL-MARJANI wants to merge 1 commit into huggingface:main from
ALI-AL-MARJANI:feat/grounded-extraction-rewards

@ALI-AL-MARJANI

Motivation

GRPO's reward machinery is domain-agnostic, so there is no reason to limit it to math. This
PR adds five composable reward functions for training grounded, citation-backed information
extraction with GRPO.

Core idea: instead of rewarding correct math answers, reward the model for verbatim
citations. Every exact_quote in extracted_quotes must be an exact substring of the source
context, so a fabricated quote is caught by a plain string check and penalized at training
time.
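
As a rough illustration of the substring check, here is a minimal sketch and not the PR's actual implementation. The function name `quote_grounding_score`, the per-quote fractional scoring, and the zero score for an empty quote list are all assumptions made for this example; how the real reward handles abstention is not shown here.

```python
import json


def quote_grounding_score(completion: str, context_raw: str) -> float:
    """Return the fraction of quotes that appear verbatim in context_raw.

    Hypothetical helper: scoring by fraction is an assumption of this sketch.
    """
    try:
        parsed = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0  # unparseable output cannot be grounded
    if not isinstance(parsed, dict):
        return 0.0
    quotes = parsed.get("extracted_quotes")
    if not isinstance(quotes, list) or not quotes:
        return 0.0  # assumption: no quotes means no grounding credit
    grounded = 0
    for quote in quotes:
        text = quote.get("exact_quote") if isinstance(quote, dict) else None
        # Exact, case-sensitive substring match against the raw context.
        if isinstance(text, str) and text and text in context_raw:
            grounded += 1
    return grounded / len(quotes)
```

Because the match is exact and case-sensitive, a paraphrase or a quote with altered whitespace scores the same as a fabricated one in this sketch.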

Output format

{
  "reasoning_path": "<chain-of-thought>",
  "is_context_sufficient": true,
  "final_answer": "<answer>",
  "extracted_quotes": [
    {"chunk_id": "doc_0", "exact_quote": "<verbatim substring from context>"}
  ]
}
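
The PR describes the grounded_format reward as an additive schema check over this output format. One way such a staged check could look is sketched below. The name `grounded_format_score`, the four equal 0.25 increments, and the early return on the first failed stage are assumptions for illustration, not the PR's code.

```python
import json

# Expected top-level keys and their Python types, taken from the output format above.
REQUIRED_TYPES = {
    "reasoning_path": str,
    "is_context_sufficient": bool,
    "final_answer": str,
    "extracted_quotes": list,
}


def grounded_format_score(completion: str) -> float:
    """Additive format score in [0, 1]; the stage weights are assumed."""
    score = 0.0
    try:
        parsed = json.loads(completion)
    except json.JSONDecodeError:
        return score
    if not isinstance(parsed, dict):
        return score
    score += 0.25  # stage 1: valid JSON object
    if not all(key in parsed for key in REQUIRED_TYPES):
        return score
    score += 0.25  # stage 2: all required keys present
    if not all(isinstance(parsed[k], t) for k, t in REQUIRED_TYPES.items()):
        return score
    score += 0.25  # stage 3: every value has the expected type
    quotes_ok = all(
        isinstance(q, dict)
        and isinstance(q.get("chunk_id"), str)
        and isinstance(q.get("exact_quote"), str)
        for q in parsed["extracted_quotes"]
    )
    if quotes_ok:
        score += 0.25  # stage 4: each quote has chunk_id and exact_quote strings
    return score
```

Partial credit of this kind gives the policy a gradient toward the schema even before it produces fully valid outputs. An empty extracted_quotes list passes stage 4 here, which leaves room for abstention.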

New reward functions

| Reward | Signal |
|---|---|
| grounded_format | Additive JSON schema check: valid JSON → keys → types → quote structure |
| quote_grounding | exact_quote ∈ context_raw; the core anti-hallucination signal |
| chunk_routing | Quote must come from the correct gold chunk, not a distractor |
| answer_faithfulness | Token overlap between final_answer and extracted quotes |
| reasoning_quality | Encourages substantive CoT |
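
To make the chunk_routing and answer_faithfulness signals concrete, here is a hedged sketch of how each could be computed. The function names, the `gold_chunk_ids` argument, the lowercase word tokenizer, and the choice to normalize overlap by the number of answer tokens are all assumptions of this example; the PR may tokenize and normalize differently.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased word tokens; a deliberately simple tokenizer for illustration."""
    return set(re.findall(r"\w+", text.lower()))


def answer_faithfulness_score(final_answer: str, quotes: list[str]) -> float:
    """Share of answer tokens that also occur in the extracted quotes."""
    answer_tokens = _tokens(final_answer)
    if not answer_tokens:
        return 0.0
    quote_tokens = _tokens(" ".join(quotes))
    return len(answer_tokens & quote_tokens) / len(answer_tokens)


def chunk_routing_score(quotes: list[dict], gold_chunk_ids: set[str]) -> float:
    """Share of quotes whose chunk_id is one of the gold chunks."""
    if not quotes:
        return 0.0
    hits = sum(1 for q in quotes if q.get("chunk_id") in gold_chunk_ids)
    return hits / len(quotes)
```

Normalizing by answer tokens penalizes answers that introduce words not supported by any quote, while long quotes are not penalized for containing extra material.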

All functions are registered in REWARD_FUNCS_REGISTRY. Activate them by listing their names
under reward_funcs: [grounded_format, quote_grounding, ...] in any GRPO recipe YAML. No
changes to the training loop are required.
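
The name-based activation can be pictured with the sketch below. It assumes the registry behaves like a mapping from name to callable, and that a reward function takes a batch of completions plus dataset columns as keyword arguments and returns one float per completion. The `register` decorator, `resolve_reward_funcs`, the plain-string completions, and the placeholder reward body are hypothetical and stand in for the project's real code.

```python
from typing import Callable

RewardFn = Callable[..., list[float]]

# Stand-in for the project's registry; its real structure is assumed here.
REWARD_FUNCS_REGISTRY: dict[str, RewardFn] = {}


def register(name: str) -> Callable[[RewardFn], RewardFn]:
    """Hypothetical decorator that stores a reward function under a name."""
    def decorator(fn: RewardFn) -> RewardFn:
        REWARD_FUNCS_REGISTRY[name] = fn
        return fn
    return decorator


@register("quote_grounding")
def quote_grounding(completions: list[str], context_raw: list[str], **kwargs) -> list[float]:
    # Placeholder body: 1.0 when the whole completion appears verbatim in its context.
    return [1.0 if c and c in ctx else 0.0 for c, ctx in zip(completions, context_raw)]


def resolve_reward_funcs(names: list[str]) -> list[RewardFn]:
    """Map names from a recipe's reward_funcs list to callables, failing on typos."""
    missing = [n for n in names if n not in REWARD_FUNCS_REGISTRY]
    if missing:
        raise KeyError(f"Unknown reward functions: {missing}")
    return [REWARD_FUNCS_REGISTRY[n] for n in names]
```

Resolving names up front means a misspelled entry in the recipe fails before training starts instead of silently dropping a reward.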

Tests

42 tests, all passing. They cover hallucination detection, correct abstention, partial
grounding, batch processing, and multi-chunk routing.
