FinSight Assurance

Safe AI reporting, proven before use.

FinSight Assurance is a multi-agent certification system for finance teams adopting AI-assisted financial reporting workflows.

The product helps a finance manager answer a high-risk operational question:

Which team members are ready to use AI in financial reporting, which ones need supervision, which ones must be blocked, and what evidence is needed next?

The submission is built for the Reasoning Agents track of the Agents League Hackathon. It is a Foundry-aligned agentic product build that combines Foundry validation evidence, a Foundry IQ-ready synthetic source set, a seven-specialist-agent workflow record, a deterministic multi-agent orchestration core, a synthetic enterprise tool gateway, a dual-role review workspace, and an evaluation pack.

All source data is synthetic. Learners are represented by anonymous synthetic IDs such as EMP-001; no human names, real employee records, financial data, company records, customer data, supplier details, credentials, or confidential information are included.

Team and Contributors

Jako Xie (@Jako0309) - project owner, product direction, implementation, validation, and submission packaging.
Lingyu Gu (@guccilucyg97-wq) - team member, finance-domain review, responsible AI framing, and learner-facing communication review.

30-Second Judge Proof

FinSight Assurance is not a generic dashboard. It is a controlled reasoning workflow for deciding whether AI-generated work is safe enough to enter financial reporting.

Problem: finance teams need to know who can safely use AI for variance explanations, Power BI narratives, and reporting commentary.
Why agents: the decision requires role mapping, evidence retrieval, policy gating, workload-aware remediation, final certification, and manager rollout insight. A single chatbot or scorecard would merge those steps and hide which one caused a case to fail.
Workflow: seven specialist agents produce a verdict, blocked scope, missing evidence, remediation plan, and manager action.
Tools: read-only synthetic tools ground the agents in learner profiles, role requirements, module evidence, policy signals, workload calendars, thresholds, and team readiness.
Safety: the Policy and Privacy Gatekeeper can restrict or block use even when other evidence is positive.
Human control: the system recommends certification status; managers approve, supervise, or block AI use.
Reviewability: every case includes source files, tool-call trace, policy gate result, and audit metadata.

Why Multi-Agent Reasoning

The project decomposes a real enterprise decision into specialist responsibilities:

Case Intake Router identifies the synthetic staff case and reporting scope.
Certification Requirement Mapper maps the role to mandatory modules, thresholds, and fail conditions.
Evidence Curator checks module scores and workflow proof through approved synthetic sources.
Policy and Privacy Gatekeeper applies reporting, privacy, and AI-output verification gates.
Workload-Aware Remediation Planner schedules realistic next steps around close and review windows.
Certification Assessment Agent returns the readiness verdict and practice checks.
Manager Insight Agent converts individual decisions into team rollout guidance.

Each agent has a visible input, tool call, finding, and decision impact in the product workspace.
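The stage order and per-stage record described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in src/finsight: the StageRecord class, the run_case function, the placeholder findings, and the one-tool-per-agent pairing are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StageRecord:
    """One visible row in the trace: who ran, what it called, what it found."""
    agent: str
    tool_call: str
    finding: str
    decision_impact: str


# Agent order follows the list above. Pairing each agent with exactly one
# tool is an ASSUMPTION for this sketch, not a documented mapping.
STAGES = [
    ("Case Intake Router", "get_learner_profile"),
    ("Certification Requirement Mapper", "resolve_role_requirements"),
    ("Evidence Curator", "evaluate_module_evidence"),
    ("Policy and Privacy Gatekeeper", "classify_policy_signals"),
    ("Workload-Aware Remediation Planner", "get_workload_calendar"),
    ("Certification Assessment Agent", "read_certification_thresholds"),
    ("Manager Insight Agent", "summarize_team_readiness"),
]


def run_case(learner_id: str) -> list[StageRecord]:
    """Walk the stages in a fixed order and record one trace row per stage."""
    trace = []
    for agent, tool in STAGES:
        trace.append(
            StageRecord(
                agent=agent,
                tool_call=f"{tool}({learner_id!r})",
                finding="(placeholder finding)",
                decision_impact="(placeholder impact)",
            )
        )
    return trace
```

Running the stages in a fixed order is what makes the trace repeatable: the same synthetic ID always yields the same sequence of rows.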

Reviewer Quick Path

Use this path to verify the project quickly:

Run the local synthetic tool API, then open app/index.html to review the live manager and learner workspace.
Run PYTHONPATH=src python3 -m finsight.evaluation.runner and confirm PASS 5/5 evaluation cases.
Open foundry/foundry_workflow_saved_v2.yaml to verify the seven-agent Foundry workflow record.
Open docs/architecture_diagram.md or docs/assets/architecture_diagram.svg to review the submission architecture diagram.
Open screenshots/ to review the clean seven-screen evidence set for the product, trace, controls, architecture, and live agent workbench.
Open docs/rubric_alignment.md to map the project to the public judging rubric.
Open docs/tool_api_demo.md and evaluations/tool_gateway_results.md to verify the synthetic OpenAPI tool layer.
Open docs/user_journey_and_accessibility.md to review user stories, use cases, internal communication, and accessibility coverage.
Use the root README, architecture diagram, screenshots, demo video, and public GitHub URL for the Innovation Studio project form.

The repository is demoable without a continuously running Azure environment. Foundry validation evidence is preserved in evidence/foundry-validation/, and the local orchestration core makes the same reasoning flow repeatable for review.

Product Demo

Open the review workspace:

app/index.html

The repository also includes a root index.html redirect so the product demo opens cleanly from the project root.

The review workspace shows:

whether each learner is approved, supervised only, or not allowed yet
chart-led readiness, gate, risk, and workflow coverage cards
queue filters for ready, review-only, and blocked cases
an interactive assistant for plain-language case questions, using the live local Python reasoning engine when it is running and packaged synthetic evidence only as the offline fallback
guided reading, high-contrast, keyboard, and read-aloud controls for accessible review
privacy, AI verification, and workflow evidence gates
visible multi-agent reasoning stages
seven-day workload-aware remediation plans
learning support cards for newer staff, guided explanations, and module-level training gaps
Microsoft IQ layer mapping
source-backed dashboard definitions and readiness metrics
grounded practice checks for the learner
manager action queue, approval metadata, and cohort rollout controls
learner safe-use guidance focused on what can be done now and what is blocked

Run the orchestration core:

PYTHONPATH=src python3 -m finsight.cli assess EMP-001
PYTHONPATH=src python3 -m finsight.cli assess EMP-002
PYTHONPATH=src python3 -m finsight.cli assess EMP-004
PYTHONPATH=src python3 -m finsight.cli team
PYTHONPATH=src python3 -m finsight.cli workflow

Run the evaluation suite:

PYTHONPATH=src python3 -m finsight.evaluation.runner

Expected result:

PASS 5/5 evaluation cases

The evaluation runner checks verdict accuracy, required safety terms, seven specialist stages, tool-call evidence on every learner decision stage, and approved synthetic source citations.
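The five kinds of check listed above could be expressed roughly as follows. This is a sketch under stated assumptions: the check_case function and the result dictionary keys (verdict, briefing, stages, sources) are hypothetical and may not match finsight.evaluation.runner. For simplicity it requires a tool call on every stage.

```python
from __future__ import annotations

# The six approved synthetic documents named in this README.
APPROVED_SOURCES = {
    "data/01_certification_requirements.md",
    "data/02_role_profiles.md",
    "data/03_reporting_policy.md",
    "data/04_ai_usage_and_privacy_policy.md",
    "data/05_learner_progress_and_workload.md",
    "data/06_team_readiness_dataset.md",
}
EXPECTED_STAGE_COUNT = 7


def check_case(result: dict, expected_verdict: str, required_terms: list[str]) -> list[str]:
    """Return failure messages; an empty list means the case passes."""
    failures = []
    # 1. Verdict accuracy.
    if result.get("verdict") != expected_verdict:
        failures.append(f"verdict: expected {expected_verdict!r}, got {result.get('verdict')!r}")
    # 2. Required safety terms must appear in the briefing text.
    briefing = result.get("briefing", "").lower()
    for term in required_terms:
        if term.lower() not in briefing:
            failures.append(f"missing safety term: {term}")
    # 3. All seven specialist stages must be present.
    stages = result.get("stages", [])
    if len(stages) != EXPECTED_STAGE_COUNT:
        failures.append(f"expected {EXPECTED_STAGE_COUNT} stages, got {len(stages)}")
    # 4. Every stage must carry tool-call evidence.
    for stage in stages:
        if not stage.get("tool_call"):
            failures.append(f"no tool call recorded for stage: {stage.get('agent')}")
    # 5. Citations must come from the approved synthetic source set.
    for source in result.get("sources", []):
        if source not in APPROVED_SOURCES:
            failures.append(f"unapproved source: {source}")
    return failures
```

Returning a list of messages rather than a single boolean keeps a failing case reviewable, since every broken check is reported at once.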

The tool gateway validation in evaluations/tool_gateway_results.md confirms that all seven synthetic enterprise tools return source-bounded results and that the privacy-critical blocking case is enforced.

Regenerate the review workspace from the multi-agent engine:

PYTHONPATH=src python3 -m finsight.cli export-workspace

This writes app/workspace-data.js. The browser workspace reads this exported trace, so the interface is not a separate static screen; it is generated from the same seven-agent workflow used by the CLI and evaluation runner.

Generate the OpenAPI schema for the synthetic tool gateway:

PYTHONPATH=src python3 -m finsight.cli openapi-tools

Run the local synthetic tool API:

PYTHONPATH=src python3 -m finsight.cli serve-tools --host 127.0.0.1 --port 8787

Open http://127.0.0.1:8787/health to check the API health status, or http://127.0.0.1:8787/ for the API landing JSON. The product workspace is separate from the tool API: it is at http://127.0.0.1:8790/ when served locally, or at app/index.html when opened from the file system.

Use the same local service as the live data engine for the product workspace. When app/index.html opens, it automatically checks GET /workspace on http://127.0.0.1:8787. If the service is running, the interface refreshes from the current seven-agent reasoning payload and the floating Ask FinSight assistant calls POST /chat for scoped, server-side answers about the active synthetic case. If the service is not running, the workspace clearly shows an offline fallback badge and keeps the checked-in review data so the submission remains easy to open and use.
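The live-or-fallback decision described above runs in the browser, but the same logic can be illustrated in Python. This is a minimal sketch, not the workspace code: fetch_json, load_workspace, and the returned mode labels are hypothetical names, and only the GET /workspace address comes from this README.

```python
import json
from urllib.request import urlopen

# Address taken from the paragraph above.
WORKSPACE_URL = "http://127.0.0.1:8787/workspace"


def fetch_json(url, timeout=2.0):
    """GET a URL and decode the body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def load_workspace(fallback, fetch=fetch_json):
    """Prefer the live payload; fall back to checked-in data if the service is down.

    `fetch` is injectable so the fallback path can be exercised without a server.
    """
    try:
        return {"mode": "live", "data": fetch(WORKSPACE_URL)}
    except (OSError, ValueError):
        # OSError covers connection failures (urllib's URLError subclasses it);
        # ValueError covers a body that is not valid JSON.
        return {"mode": "offline-fallback", "data": fallback}
```

A caller could pass the checked-in review data as `fallback` and show a badge whenever the returned mode is "offline-fallback".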

Reviewers do not need to provide their own API key to run the repository. The local review service runs without a key by default so the product workspace, CLI evaluation, and synthetic tool gateway can be checked immediately.

Optional live-model testing can be enabled by the project owner without changing repository files. Run the helper script in the terminal that starts serve-tools; it collects the live-model settings as environment variables for that session:

bash scripts/start_live_model.sh

The helper tries to list Azure deployments through Azure CLI if it is available. If Azure CLI is not available, paste the full chat-completions endpoint from the Foundry or Azure deployment page. For Azure OpenAI, the endpoint should include the deployment path and api-version query string. The key stays in the local Python process, is not sent to the browser, is not written to disk, and is not needed by reviewers. A live model call happens only when the floating Ask FinSight assistant sends a chat request while those environment variables are configured.

Review the local API demo and Foundry attachment path:

docs/tool_api_demo.md

If the OpenAPI tool service is hosted publicly later, set FINSIGHT_TOOL_API_KEY as a platform environment variable and configure the same value in Foundry as the OpenAPI tool credential. Do not commit the key.

That key is a project-side hosted-service credential, not a reviewer requirement. It exists only for optional hosted OpenAPI deployment and credential-boundary validation.
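A hosted deployment could enforce that credential boundary along these lines. This is a sketch only: the is_authorized function is hypothetical, how the key reaches the service (for example, which request header carries it) is not specified here, and the actual behaviour of src/finsight/tool_api.py may differ. Only the FINSIGHT_TOOL_API_KEY variable name comes from this README.

```python
import hmac
import os


def is_authorized(presented_key, env=os.environ):
    """Decide whether a request may proceed, given the key it presented."""
    expected = env.get("FINSIGHT_TOOL_API_KEY")
    if not expected:
        # No key configured: local review mode, so no credential is required.
        return True
    if presented_key is None:
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(presented_key.encode(), expected.encode())
```

Reading the key from the environment on each call keeps it out of the repository and out of any exported workspace data.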

Demo Cases

| Case | Expected Verdict | Why It Matters |
| --- | --- | --- |
| `EMP-001` | Conditionally ready | Shows supervised-use logic when AI verification and Power BI evidence are incomplete. |
| `EMP-002` | Not certified yet | Shows privacy-critical blocking logic for unsafe supplier payment prompting. |
| `EMP-004` | Certified | Shows the positive path for a strong finance analyst with manager approval. |
| `EMP-005` | Not certified yet | Shows that missing workflow evidence cannot be invented. |
| Team briefing | Manager rollout blocked | Shows cohort-level readiness, risk, and training priorities. |

System Architecture

Approved synthetic source set
  (prepared for Foundry IQ knowledge grounding)
  |
  v
Foundry seven-agent workflow record
  |
  v
Local synthetic enterprise tool gateway
  |
  +--> get_learner_profile
  +--> resolve_role_requirements
  +--> evaluate_module_evidence
  +--> classify_policy_signals
  +--> get_workload_calendar
  +--> read_certification_thresholds
  +--> summarize_team_readiness
  |
  +--> Case Intake Router
  +--> Certification Requirement Mapper
  +--> Evidence Curator
  +--> Policy and Privacy Gatekeeper
  +--> Workload-Aware Remediation Planner
  +--> Certification Assessment Agent
  +--> Manager Insight Agent
  |
  v
Semantic model
  |
  v
FinSight multi-agent orchestration core
  |
  +--> Case Intake Router
  +--> Certification Requirement Mapper
  +--> Evidence Curator
  +--> Policy and Privacy Gatekeeper
  +--> Workload-Aware Remediation Planner
  +--> Certification Assessment Agent
  +--> Manager Insight Agent
  |
  v
Certification decision + role-specific briefing
  |
  +--> Foundry validation transcript
  +--> Dual-role review workspace
  +--> Evaluation results

Microsoft Technology Used

Microsoft Foundry: agent build, workflow design, and validation environment.
Foundry Agent Service concepts: portal-first agent implementation path represented through agent instructions, workflow records, and validation evidence.
Foundry IQ: Microsoft IQ target for approved synthetic certification sources. The same six sources are also embedded in the Foundry agent instructions when the selected model or region does not expose file search or knowledge attachment.
Foundry Workflow: validated seven-specialist-agent workflow record in foundry/foundry_workflow_saved_v2.yaml plus the reusable blueprint in foundry/workflow_blueprint.yaml.
Azure validation environment: used for Foundry validation evidence.
GitHub: public project submission.

The Foundry validation build used the Japan East environment and gpt-oss-120b, which was the usable model available during validation. The Foundry agent instructions embed the approved synthetic source set when file search or uploaded knowledge attachment is unavailable. The repository preserves the Foundry instructions, workflow record, and validation evidence, and it also includes a local orchestration core to make the multi-agent workflow explicit, repeatable, and evaluable without requiring a continuously running Azure environment.

Microsoft IQ Integration

Foundry IQ

Foundry IQ is the Microsoft IQ layer targeted by the submission build. The approved source set contains six synthetic documents:

data/01_certification_requirements.md
data/02_role_profiles.md
data/03_reporting_policy.md
data/04_ai_usage_and_privacy_policy.md
data/05_learner_progress_and_workload.md
data/06_team_readiness_dataset.md

Work IQ Concept

The system models work context through workload calendars, month-end close pressure, reporting pack deadlines, manager review windows, and available training capacity.
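Workload-aware planning of this kind can be illustrated with a small scheduler. The sketch below is an assumption-laden example, not the project's planner: the plan_remediation function, the one-task-per-free-day rule, and the representation of close and review windows as a set of busy dates are all hypothetical.

```python
from __future__ import annotations

from datetime import date, timedelta


def plan_remediation(
    start: date,
    tasks: list[str],
    busy_days: set[date],
    horizon_days: int = 7,
) -> tuple[dict[date, str], list[str]]:
    """Place at most one task on each free day inside the planning horizon.

    Returns the day-to-task plan and any tasks that did not fit.
    """
    plan: dict[date, str] = {}
    pending = list(tasks)
    for offset in range(horizon_days):
        day = start + timedelta(days=offset)
        # Skip days blocked by close pressure, deadlines, or review windows.
        if day in busy_days or not pending:
            continue
        plan[day] = pending.pop(0)
    return plan, pending
```

Returning the unscheduled tasks alongside the plan makes a capacity shortfall explicit instead of silently dropping work.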

Fabric IQ Concept

The system includes a semantic model in data/semantic_model.json covering learners, roles, modules, risks, thresholds, managers, workload slots, and certification decisions.

Tool and API Integration

FinSight Assurance includes a local synthetic enterprise tool gateway rather than a paid third-party API. The tools are intentionally limited to approved synthetic sources:

get_learner_profile
resolve_role_requirements
evaluate_module_evidence
classify_policy_signals
get_workload_calendar
read_certification_thresholds
summarize_team_readiness

The implemented local tool gateway is in src/finsight/tools.py. The local OpenAPI-compatible service is in src/finsight/tool_api.py, and the generated OpenAPI 3.0 schema is in foundry/finsight_tools_openapi.json. Every local agent trace records the tool call used by that specialist stage.

The Foundry Tools UI exposes custom tool connections through OpenAPI, MCP, or A2A endpoints. Because those routes require an externally hosted endpoint, this submission does not claim that the local Python functions are attached inside the Foundry portal. Instead, foundry/finsight_tools_openapi.json documents the endpoint contract that can be deployed later without changing the reasoning design. foundry/tool_definitions.json keeps the same tool contract in function-definition form for agent-framework style runtimes.
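The trace-recording behaviour described in this section could look roughly like the following. This is a sketch, not the contents of src/finsight/tools.py: the ToolGateway class, its call method, and the trace entry shape are hypothetical, and the example tool body returns made-up synthetic values.

```python
class ToolGateway:
    """Dispatch calls to a fixed set of read-only tools and log each call."""

    def __init__(self, tools):
        # Copy the mapping so the approved set cannot be changed from outside.
        self._tools = dict(tools)
        self.trace = []

    def call(self, name, **kwargs):
        if name not in self._tools:
            raise KeyError(f"Tool not in approved set: {name}")
        result = self._tools[name](**kwargs)
        # Record the call only after the tool has returned successfully.
        self.trace.append({"tool": name, "arguments": kwargs})
        return result


def get_learner_profile(learner_id):
    # Hypothetical stand-in that returns synthetic data only.
    return {"learner_id": learner_id, "role": "(synthetic role)"}


gateway = ToolGateway({"get_learner_profile": get_learner_profile})
profile = gateway.call("get_learner_profile", learner_id="EMP-001")
```

Routing every call through one gateway means a reviewer can audit tool use from the trace without reading each agent's code.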

Repository Structure

app/
  index.html
  styles.css
  app.js
  workspace-data.js
index.html
DISCLAIMER.md
LICENSE
SECURITY.md
data/
  01_certification_requirements.md
  02_role_profiles.md
  03_reporting_policy.md
  04_ai_usage_and_privacy_policy.md
  05_learner_progress_and_workload.md
  06_team_readiness_dataset.md
  semantic_model.json
docs/
  agent_instructions.md
  architecture_and_orchestration.md
  architecture_diagram.md
  assets/
    architecture_diagram.svg
  challenge_alignment_checklist.md
  evaluation_plan.md
  foundry_alignment.md
  logic_walkthrough.md
  repository_safety_notes.md
  rubric_alignment.md
  source_manifest.md
  tool_api_demo.md
  user_journey_and_accessibility.md
foundry/
  README.md
  agent_cards.md
  finsight_tools_openapi.json
  foundry_workflow_saved_v2.yaml
  specialist_agent_instructions.md
  tool_definitions.json
  workflow_blueprint.yaml
evaluations/
  evaluation_cases.json
  evaluation_results.md
  tool_gateway_results.md
evidence/
  foundry-validation/
screenshots/
src/
  finsight/
tests/
  test_tool_api.py

Why This Fits the Reasoning Agents Challenge

The Reasoning Agents starter kit asks for a multi-agent learning and certification-readiness system: understand requirements, build study plans, generate practice checks, assess readiness, provide feedback, and explain the agent workflow. FinSight Assurance adapts that pattern to a high-stakes internal AI-use certification programme for finance teams:

maps certification requirements to organisational roles
creates role-based and workload-aware study plans
generates grounded practice questions from approved sources
evaluates individual readiness with conservative fail conditions
provides manager-level insight across cohort readiness and rollout risk
uses Microsoft Foundry validation evidence, Foundry Agent Service implementation patterns, a Foundry IQ-ready source set, and a documented Foundry Workflow design
uses seven read-only synthetic tools for learner profiles, requirements, evidence, policy signals, workload, thresholds, and team readiness
uses synthetic data only
includes evaluation and responsible AI controls

The project is intentionally framed as a certification-readiness system rather than a generic finance assistant. The domain is financial reporting AI-use approval, but the workflow remains the challenge workflow: requirements, learning evidence, study plan, practice checks, readiness verdict, feedback, and manager guidance.

Safety Position

FinSight Assurance does not replace finance managers, accounting judgment, reporting controls, or human review. AI-assisted commentary is treated as draft material. The system blocks certification when privacy, source traceability, AI verification, workflow evidence, or manager approval is insufficient.
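The blocking rule in the paragraph above amounts to requiring every gate to pass. The sketch below illustrates that idea under stated assumptions: the certification_decision function and the gate key names are hypothetical, and treating a missing gate result as a failure is a conservative choice made for this example.

```python
from __future__ import annotations

# Gate names mirror the five conditions listed above; the keys are assumed.
GATES = (
    "privacy",
    "source_traceability",
    "ai_verification",
    "workflow_evidence",
    "manager_approval",
)


def certification_decision(gate_results: dict[str, bool]) -> tuple[bool, list[str]]:
    """Allow certification only if every gate passed.

    A gate with no recorded result counts as failed, so missing evidence
    can never be read as a pass.
    """
    failed = [gate for gate in GATES if not gate_results.get(gate, False)]
    return (not failed, failed)
```

Because the result is a conjunction of gates, one failed gate blocks certification regardless of how strong the other evidence is.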

Originality and Public Boundary

The submitted solution, architecture, code, data, UI, documentation, and demo materials are original team work for this challenge. No copied third-party repository, private data, credentials, or unlicensed assets are included.

Evidence Pack

The final package is intentionally limited to the product, source code, synthetic data, Foundry workflow evidence, screenshots, and verification assets.

Product walkthrough: app/index.html
Foundry validation evidence: evidence/foundry-validation/
Full screenshot evidence set: screenshots/
Tool/API demo guide: docs/tool_api_demo.md
Architecture diagram: docs/architecture_diagram.md
Uploadable architecture diagram asset: docs/assets/architecture_diagram.svg
Foundry workflow blueprint: foundry/workflow_blueprint.yaml
Saved Foundry workflow shape: foundry/foundry_workflow_saved_v2.yaml
Judging rubric alignment: docs/rubric_alignment.md
Evaluation result table: evaluations/evaluation_results.md

Public team attribution is limited to names, GitHub handles, and project roles. Do not publish private contact details, student IDs, tenant details, subscription details, account screenshots, or credentials in the repository.
