MCP Tool Responses Do Not Match API Documentation #72

@nikopuf

Summary

The Security Analysis and Compliance & Reporting MCP tools return minimal raw Wazuh data instead of the rich, structured responses described in the API documentation. Every tool tested behaves as a thin wrapper over the Wazuh API, with no enrichment, scoring, correlation, or framework-specific mapping.

Environment

  • Wazuh Version: 4.x (the manager API reports 4.14.1, revision rc2); 2 agents: xxx, xxxx
  • MCP Server: http://127.0.0.1:3000
  • Date Tested: 2026-03-31

Tools Tested

1. get_top_security_threats

Called with: { "limit": 5, "time_range": "24h" }

Expected (per docs): Ranked threats with threat_score, threat_name, severity, affected_systems, indicators (source IPs, target ports, attack patterns), impact_assessment (CIA triad), timeline (first detected, peak, status), mitigation_status, and ranking_criteria.

Actual response:

{
  "data": {
    "time_range": "24h",
    "threats": [
      {
        "rule_id": "100200",
        "description": "File modified in /root directory.",
        "level": 7,
        "count": 533,
        "groups": ["syscheck"]
      }
    ],
    "total_unique_rules": 10
  }
}

Missing: threat_score, threat_name, affected_systems, indicators, impact_assessment, timeline, mitigation_status, ranking_criteria, ranking_timestamp.


2. perform_risk_assessment

Called with: {} (environment-wide)

Expected (per docs): overall_risk_score, risk_categories (vulnerability, threat exposure, configuration, compliance with individual scores), critical_findings, risk_trends, mitigation_priorities, and executive_summary.

Actual response:

{
  "data": {
    "total_agents": 2,
    "risk_factors": [],
    "risk_level": "low"
  }
}

Missing: overall_risk_score, risk_categories, critical_findings, risk_trends, mitigation_priorities, executive_summary, confidence.


3. run_compliance_check

Called with: { "framework": "PCI-DSS" } and { "framework": "NIST" }

Expected (per docs): overall_compliance (score, status, requirements met/total), requirement_categories with per-category scores, detailed_findings with per-requirement status/severity/remediation, risk_assessment, remediation_roadmap, and compliance_trends.

Actual response (PCI-DSS call shown; the NIST call returned the same SCA results):

{
  "data": {
    "framework": "PCI-DSS",
    "agents_checked": 2,
    "results": [
      {
        "agent_id": "000",
        "agent_name": "xxxx",
        "sca": {
          "pass": 51,
          "fail": 45,
          "invalid": 87,
          "total_checks": 183,
          "score": 53,
          "policy_id": "cis_amazon_linux_2023",
          "name": "CIS Benchmark for Amazon Linux 2023 Benchmark v1.0.0."
        }
      },
      {
        "agent_id": "002",
        "agent_name": "xxxx",
        "sca": {
          "pass": 118,
          "fail": 119,
          "invalid": 42,
          "total_checks": 279,
          "score": 49,
          "policy_id": "cis_ubuntu24-04",
          "name": "CIS Ubuntu Linux 24.04 LTS Benchmark v1.0.0."
        }
      }
    ]
  }
}

Issues:

  • PCI-DSS and NIST return identical data — no framework-specific filtering or mapping
  • Returns raw SCA (CIS Benchmark) results, not compliance requirement assessments
  • No per-requirement breakdown, no detailed findings, no remediation roadmap

Additionally, ISO27001 and FISMA are listed in the docs but are rejected as invalid values. For ISO27001:

Invalid parameter 'framework': invalid value 'ISO27001'. Use one of: GDPR, HIPAA, NIST, PCI-DSS, SOX

4. generate_security_report

Expected (per docs): Full report with executive_summary, threat_landscape_analysis, vulnerability_management, compliance_status, security_metrics, risk_assessment, financial_analysis, recommendations. Seven report types documented: daily, weekly, monthly, quarterly, incident, compliance, executive.

4a. daily report

Called with: { "report_type": "daily", "include_recommendations": true }

Actual response:

{
  "data": {
    "report_type": "daily",
    "generated_at": "2026-03-31T05:36:33.624128+00:00",
    "sections": {
      "agents": {
        "total": 2,
        "active": 2,
        "disconnected": 0
      },
      "manager": {
        "title": "Wazuh API REST",
        "api_version": "4.14.1",
        "revision": "rc2",
        "hostname": "xxx"
      },
      "vulnerabilities": {
        "total_vulnerabilities": 0,
        "affected_agents": 0,
        "by_severity": {},
        "critical": 0,
        "high": 0,
        "medium": 0,
        "low": 0
      }
    }
  }
}

Missing: executive_summary, threat_landscape_analysis, security_metrics (MTTD, MTTR), incident_summary, recommendations, risk_assessment. The include_recommendations: true parameter has no effect — no recommendations are returned.

4b. incident report

Called with: { "report_type": "incident", "include_recommendations": true }

Actual response: Identical to daily — same agent count, manager info, and vulnerability summary. No incident-specific data (incident timeline, affected systems, root cause analysis, lessons learned, containment actions).

4c. compliance report type — NOT SUPPORTED

Called with: { "report_type": "compliance" }

Response:

Invalid parameter 'report_type': invalid value 'compliance'. Use one of: daily, incident, monthly, weekly

4d. executive report type — NOT SUPPORTED

Called with: { "report_type": "executive" }

Response:

Invalid parameter 'report_type': invalid value 'executive'. Use one of: daily, incident, monthly, weekly

4e. quarterly report type — NOT SUPPORTED (documented but not listed in allowed values)

Report type support summary:

| Report Type | Documented | Supported | Returns Unique Data |
|---|---|---|---|
| daily | Yes | Yes | No (generic agent/vuln summary) |
| weekly | Yes | Yes | Not tested, likely same |
| monthly | Yes | Yes | Not tested, likely same |
| quarterly | Yes | No | N/A |
| incident | Yes | Yes | No (identical to daily) |
| compliance | Yes | No | N/A |
| executive | Yes | No | N/A |

Root Cause

The MCP tools are thin wrappers over the Wazuh Manager API:

  • get_top_security_threats → single terms aggregation on rule.id sorted by count
  • perform_risk_assessment → agent count + basic threshold check
  • run_compliance_check → SCA scan results from /sca/{agent_id} endpoint, same data regardless of framework
  • generate_security_report → agent summary + manager info + vulnerability counts, identical across all supported report types

No enrichment, correlation, scoring, or framework mapping is performed.
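
For reference, the aggregation described for get_top_security_threats is roughly equivalent to the request body built below. This is a sketch to illustrate the point, not the server's actual code: the helper name `build_top_rules_query` and the field names `rule.id` and `timestamp` are assumptions and may differ from what the tool really sends.

```python
import json


def build_top_rules_query(limit: int = 5, time_range: str = "24h") -> dict:
    """Build a terms-aggregation body that counts alerts per rule ID.

    Hypothetical helper: the field names ("rule.id", "timestamp") are
    assumed and may differ in a given index template.
    """
    return {
        "size": 0,  # only aggregation buckets are needed, not the hits
        "query": {"range": {"timestamp": {"gte": f"now-{time_range}"}}},
        "aggs": {
            "top_rules": {
                # terms buckets are ordered by document count (descending) by default
                "terms": {"field": "rule.id", "size": limit}
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_top_rules_query(), indent=2))
```

A query of this shape yields only a rule ID and a count per bucket, which matches the fields seen in the actual response; anything beyond that (IPs, timelines, scores) would need sub-aggregations or follow-up queries.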

Gap Summary

| Capability | Documented | Implemented |
|---|---|---|
| **Security Analysis** | | |
| Threat scoring (0-100) | Yes | No |
| Source IP / indicator extraction | Yes | No |
| Impact assessment (CIA triad) | Yes | No |
| Attack timeline construction | Yes | No |
| Mitigation status tracking | Yes | No |
| Risk score calculation | Yes | No (returns only "low"/"medium"/"high") |
| Risk category breakdown | Yes | No |
| Critical findings with remediation | Yes | No |
| **Compliance** | | |
| Framework-specific requirement mapping | Yes | No (all frameworks return same SCA data) |
| Per-requirement compliance status | Yes | No |
| Remediation roadmap | Yes | No |
| Compliance trends / history | Yes | No |
| ISO27001 / FISMA support | Yes | No (returns error) |
| **Reporting** | | |
| Executive summaries | Yes | No |
| Financial impact analysis | Yes | No |
| compliance report type | Yes | No (returns "invalid value") |
| executive report type | Yes | No (returns "invalid value") |
| quarterly report type | Yes | No (not in allowed values) |
| Incident-specific report content | Yes | No (identical to daily report) |
| Report type differentiation | Yes | No (daily/incident return same data) |
| include_recommendations parameter | Yes | No effect (no recommendations returned) |
| Threat landscape analysis in reports | Yes | No |
| Security metrics (MTTD, MTTR) | Yes | No |
| Team performance metrics | Yes | No |

Suggested Implementation Path

Option A: Direct Elasticsearch Enrichment (No LLM)

Build enrichment in the MCP tool layer by running additional Elasticsearch queries:

  1. get_top_security_threats: After getting top rule IDs, run sub-queries to extract src.ip, agent.id, @timestamp ranges per rule. Calculate threat score from level * log(count). Build timeline from min/max timestamps.

  2. perform_risk_assessment: Combine SCA scores + alert severity distribution + vulnerability counts + failed auth rates into a weighted risk score. Break down by category.

  3. run_compliance_check: Map CIS benchmark check IDs to framework requirements (PCI-DSS requirement numbers, NIST functions). Return per-requirement pass/fail. This requires a static mapping table.
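
The three steps above could be sketched roughly as follows. Everything here is illustrative: the function names, the risk weights, the use of `log1p` instead of a plain log, the "passed"/"failed" result strings, and the placeholder check and requirement IDs are all assumptions, not existing code or real CIS-to-PCI-DSS mappings.

```python
import math
from typing import Dict, Iterable


def threat_score(level: int, count: int) -> float:
    """Score a rule as level * log(count).

    log1p (natural log of 1 + count) is an assumption made here so that
    a rule with a single alert does not score zero. The result is not
    scaled to 0-100.
    """
    return level * math.log1p(count)


def timeline(timestamps: Iterable[str]) -> Dict[str, str]:
    """First/last seen, assuming ISO 8601 UTC strings in one uniform format
    (so that string order equals time order)."""
    ts = sorted(timestamps)
    return {"first_detected": ts[0], "last_seen": ts[-1]}


# Hypothetical weights; real values would need tuning.
RISK_WEIGHTS = {"sca": 0.4, "alerts": 0.3, "vulnerabilities": 0.2, "failed_auth": 0.1}


def weighted_risk(category_scores: Dict[str, float]) -> float:
    """Combine per-category risk scores (each 0-100) into one 0-100 score.
    Categories missing from the input count as 0."""
    return sum(w * category_scores.get(name, 0.0) for name, w in RISK_WEIGHTS.items())


# Hypothetical static mapping table: check ID -> framework requirement.
# These IDs are placeholders, not real mappings.
CHECK_TO_REQUIREMENT = {
    "PCI-DSS": {"check-001": "req-A", "check-002": "req-A", "check-003": "req-B"},
}


def per_requirement_status(framework: str, check_results: Dict[str, str]) -> Dict[str, str]:
    """A requirement passes only if every check mapped to it passed."""
    status: Dict[str, str] = {}
    for check_id, requirement in CHECK_TO_REQUIREMENT[framework].items():
        if check_results.get(check_id) != "passed":
            status[requirement] = "fail"
        else:
            status.setdefault(requirement, "pass")  # never overwrite an earlier fail
    return status
```

With the data from this report, `threat_score(7, 533)` would score the rule 100200 entry, and an SCA compliance score of 53 could be fed to `weighted_risk` as a risk of `100 - 53` under the `"sca"` key. A separate mapping table per framework is what would make PCI-DSS and NIST results differ.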

Option B: LLM Enrichment Layer

Pass raw data through an LLM to generate:

  • Executive summaries and recommendations
  • Threat scoring with reasoning
  • Impact assessments
  • Remediation roadmaps

Adds latency (~2-5s) and cost per call, but produces the richest output.

Option C: Hybrid

  • Factual data (IPs, counts, timelines, scores): Direct ES queries
  • Analysis (summaries, recommendations, impact narrative): LLM
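
One possible shape for the hybrid split, as a minimal sketch: facts come from query results and the narrative comes from a pluggable callable. The `Summarizer` type, `build_threat_entry`, and the output field names are hypothetical; the LLM call itself is left as a stub so no particular provider is implied.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical LLM hook: receives the factual entry, returns narrative text.
Summarizer = Callable[[Dict], str]


def build_threat_entry(
    rule: Dict,
    agent_ids: List[str],
    summarize: Optional[Summarizer] = None,
) -> Dict:
    """Merge query-derived facts with an optional generated narrative."""
    entry = {
        "rule_id": rule["rule_id"],
        "severity_level": rule["level"],
        "event_count": rule["count"],
        "affected_systems": sorted(set(agent_ids)),
    }
    if summarize is not None:
        # The narrative lives in its own field and gets a copy of the facts,
        # so generated text can never overwrite factual values.
        entry["analysis"] = summarize(dict(entry))
    return entry


if __name__ == "__main__":
    rule = {"rule_id": "100200", "level": 7, "count": 533}
    stub = lambda facts: f"Rule {facts['rule_id']} fired {facts['event_count']} times."
    print(build_threat_entry(rule, ["000", "002", "000"], summarize=stub))
```

Keeping the summarizer optional means the tool can still return the factual fields with no added latency or cost when the LLM is disabled or unavailable.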

Reproduction

# Get auth token
TOKEN=$(curl -s -X POST http://127.0.0.1:3000/auth/token \
  -H "Content-Type: application/json" \
  -d '{"api_key":"YOUR_API_KEY"}' | python3 -c "import sys,json; print(json.load(sys.stdin)['access_token'])")

# Test any tool
curl -s -X POST http://127.0.0.1:3000/mcp \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "MCP-Protocol-Version: 2024-11-05" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "get_top_security_threats",
      "arguments": { "limit": 5, "time_range": "24h" }
    }
  }'
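
To make the gap easy to re-check after a fix, a small script along these lines could compare a response item against the documented field list. This is a sketch: `missing_fields` is a hypothetical helper, and the field names are the ones quoted from the docs in the get_top_security_threats section above.

```python
from typing import Dict, Iterable, List


def missing_fields(item: Dict, documented: Iterable[str]) -> List[str]:
    """Return documented field names that are absent from a response item."""
    return [name for name in documented if name not in item]


# Field names as listed under "Expected (per docs)" for get_top_security_threats.
DOCUMENTED_THREAT_FIELDS = [
    "threat_score", "threat_name", "severity", "affected_systems", "indicators",
    "impact_assessment", "timeline", "mitigation_status", "ranking_criteria",
]

if __name__ == "__main__":
    # One entry from data.threats in the actual response shown earlier.
    actual = {
        "rule_id": "100200",
        "description": "File modified in /root directory.",
        "level": 7,
        "count": 533,
        "groups": ["syscheck"],
    }
    print(missing_fields(actual, DOCUMENTED_THREAT_FIELDS))
```

The same helper can be pointed at the other tools by swapping in their documented field lists.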

Thanks
@nikopuf
