A4IHackathon - Guardians Of Data: Security hardening, column masking & injection detection, REST/Swagger integration, validated bug fixes, and full docs/onboarding #16
Open
nakultripathi-lab wants to merge 25 commits into
Implements the validated bug backlog (from REPLICATION_REPORT.md / REPRODUCTION_STEPS.md), each covered by regression tests in test/test_security_fixes_unit.py.
Security:
- S1 escape generated restriction values via sqlglot literals; quote IN members
- S2 numeric (not string) comparison for < > <= >=; inject the correct operator
- S3 close column allow-list bypass when all FROM tables are dynamic
- S4 verify EXCEPT/INTERSECT (dispatch on SetOperation), not just UNION
- S7 explicitly reject stacked / multi-statement SQL
Config validation (no more caller-facing crashes / HTTP 500):
- C1 IN accepts a non-empty list of consistently-typed values
- C2 require a value for =, >, <, <=, >=
- C3 case-insensitive operations
- C4 verify_sql returns a result dict on invalid config (also catches ValueError)
Reliability / build / opt-in hardening:
- Q1 errors are an ordered, de-duplicated list
- Q2 remove dead double find(Where)
- Q3 make the MCP wrapper import-safe
- Q4 valid PEP 440 dev version
- S5 opt-in max_risk hard-block; S6 opt-in X-API-Key auth
Full unit suite: 326 passed, 1 skipped.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…y-and-config-bugs
Add three config-driven SQL safety features:
- F1: allowed_functions / blocked_functions enforcement
- F2: force_limit to inject/clamp a mandatory LIMIT
- F3: per-table denied_columns with deny-aware * expansion
Includes the injection-detection module, unit tests for each feature, per-feature docs, and a REST port/debug update.
column-level masking
Integrate new feature work from main (injection_detection module, function allow/deny-list, mandatory LIMIT, denied columns) with the validated bug fixes on this branch.
Conflict resolution:
- restriction_validation.py: keep the fixed per-restriction validation (C1-C4) and add the new force_limit / function-list / denied_columns config checks.
- sql_data_guard.py: keep parsed=None + pre-parse scan_raw_sql; keep the None-safe fixed assignment alongside _enforce_force_limit; unify the stacked-query message with injection_detection's intent-revealing text.
- Adapt the new tests' `errors == set()` assertions to the ordered-list errors (Q1).
Full unit suite: 378 passed, 1 skipped.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Integrate column-level masking (and manual/doc updates) from main with the validated bug fixes on this branch. Conflict resolution (sql_data_guard.py): call both validate_restrictions and validate_column_masks, and keep catching ValueError (C4) so an invalid config returns a result dict instead of crashing. Masking, injection detection, function policy, force_limit, and the S6 API key all coexist. Full unit suite: 395 passed, 1 skipped. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ig-bugs Fix validated security and config-validation bugs
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ions
- Add §4 intake questions (all optional) mapped to quadrants, with use-answer-else-derive-from-repo precedence
- Reframe §5 as the repo-derived fallback; generalize repo signals beyond Python (package.json, pom.xml, go.mod, .agents/.claude)
- Add §8 end-to-end flow; works with zero, partial, or full answers
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Bundle template.pptx (official A4I 4-quadrant deck) with the skill
- Rework Method B to clone the template and swap only bullet text, preserving exact design (cards, hero border, fonts, colours)
- Add fill_template.py helper + sample.html / sample.png / one-pager.pptx as worked examples
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Generates the A4I Hackathon submission write-up following the official template format exactly (sections 1-8). Enforces no-hallucination rules: every claim sourced from repo or user, unknowns left as explicit NEEDS INPUT placeholders. Written for a management/reviewer audience. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replaces the README with the production-grade rewrite (overview, architecture diagram, quick start, REST/MCP/Dify integration, project structure, testing, security, FAQ). Verified README.md had no team edits today before replacing. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Documents the features the team shipped today, each validated against the code and the passing test suite (395 passed):
- Column masking (redact/hash/partial), denied_columns
- Injection detection (stacked/dangerous-func/system-catalog, opt-in comments)
- Function allow/block lists, force_limit row cap, max_risk hard-block
- New 'Policy and security controls' config reference section
- AI-assisted development section (.agents/skills, .clinerules)
- Refresh architecture diagram, how-it-works, project structure, testing (13 suites)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Security hardening, column masking & injection detection, REST/Swagger integration, validated bug fixes, and full docs/onboarding
Summary
This MR delivers a major round of work on sql-data-guard: five new security
capabilities, a backlog of validated security/config bug fixes, a REST API with
Swagger, an extensive unit-test suite, and complete technical + onboarding
documentation.
Net change: +5,644 / −172 across 45 files (24 commits).
New security features
- Injection detection (`src/sql_data_guard/injection_detection.py`): pre-parse raw-string scan (catches `--`, `/* */`, `#` comment-evasion) plus an AST scan for stacked statements, dangerous functions, and system-catalog probing, with intent-revealing messages and risk weights. Makes good on the README's long-standing "built-in detection module" claim that previously had no implementation.
- Column masking (`src/sql_data_guard/column_masking.py`): per-table `column_masks` rewrite sensitive columns instead of dropping them: `redact`, `hash` (MD5), and `partial` (show last N) policies, preserving result shape.
- Function allow/block lists (`_verify_functions`): config-driven function policy on top of F1's always-on deny-list, so deployments can permit/forbid specific functions (e.g. block `LOAD_FILE`).
- `LIMIT` / max-rows enforcement (`_enforce_force_limit`): injects or clamps `LIMIT` on the outermost query via the existing auto-fix path, closing the full-table-exfiltration gap.
- Denied columns: `denied_columns` with deny-wins precedence and `SELECT *` / `table.*` expansion that never leaks a denied column.
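To make the three masking policies concrete, here is a minimal sketch in plain Python. It is not the code in `column_masking.py`: the function names `mask_value` and `mask_row`, the `show_last` parameter, and the `"****"` placeholder are all hypothetical, and the sketch masks Python values rather than rewriting SQL. It only illustrates what `redact`, `hash` (MD5), and `partial` (show last N) might do to a single value while leaving the row's columns in place.

```python
import hashlib


def mask_value(value, policy, show_last=4):
    """Apply one masking policy to a single cell value (illustrative only)."""
    if value is None:
        return None  # keep NULLs as NULLs so the result shape is unchanged
    text = str(value)
    if policy == "redact":
        return "****"  # hypothetical fixed placeholder
    if policy == "hash":
        # MD5 is the digest named in the PR description
        return hashlib.md5(text.encode("utf-8")).hexdigest()
    if policy == "partial":
        # text[-0:] would return the whole string, so handle 0 explicitly
        visible = text[-show_last:] if show_last > 0 else ""
        return "*" * (len(text) - len(visible)) + visible
    raise ValueError(f"unknown mask policy: {policy!r}")


def mask_row(row, column_masks):
    """Mask only the configured columns; every column stays in the row."""
    return {
        col: mask_value(val, column_masks[col]) if col in column_masks else val
        for col, val in row.items()
    }


if __name__ == "__main__":
    row = {"id": 7, "email": "alice@example.com", "card": "4111111111111111"}
    print(mask_row(row, {"email": "hash", "card": "partial"}))
```

Because masked columns are kept rather than dropped, a caller that expects a fixed set of columns keeps working, which is the "preserving result shape" property described above.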
Validated bug fixes (regression-tested)
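Security
- S1: escape generated restriction values via sqlglot literals; quote `IN` members.
- S2: numeric (not string) comparison for `< > <= >=`; inject the correct operator.
- S3: close the column allow-list bypass when all FROM tables are dynamic.
- S4: verify `EXCEPT`/`INTERSECT` (dispatch on SetOperation), not just `UNION`.
- S7: explicitly reject stacked / multi-statement SQL.
Config validation (no more caller-facing HTTP 500s)
- C1: `IN` accepts a non-empty list of consistently-typed values.
- C2: require a value for `=`, `>`, `<`, `<=`, `>=`.
- C3: case-insensitive operations.
- C4: `verify_sql` returns a result dict on invalid config (also catches `ValueError`).
Reliability
- Q1: errors are an ordered, de-duplicated list.
- Q2: remove dead double find(Where).
- Q3: make the MCP wrapper import-safe.
- Q4: valid PEP 440 dev version.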
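The config-validation fixes (C1 to C4) and the ordered, de-duplicated error list (Q1) can be pictured with a short sketch. It is written from the descriptions in this PR, not from `restriction_validation.py`. The helper names, the error wording, the restriction keys (`operation`, `value`, `values`), the `=` default, and the `allowed`/`errors` result keys are assumptions made for illustration. Only the operations named in this PR are handled.

```python
VALUE_OPERATIONS = {"=", ">", "<", "<=", ">="}


def validate_restriction(restriction):
    """Return a list of error strings for one restriction (empty if valid)."""
    errors = []
    # C3: compare operations case-insensitively ("in" and "IN" are the same)
    operation = str(restriction.get("operation", "=")).upper()
    if operation == "IN":
        values = restriction.get("values")
        # C1: IN needs a non-empty list whose members share one type
        if not isinstance(values, list) or not values:
            errors.append("IN requires a non-empty list of values")
        elif len({type(v) for v in values}) > 1:
            errors.append("IN values must all have the same type")
    elif operation in VALUE_OPERATIONS:
        # C2: comparison operations need a value to compare against
        if restriction.get("value") is None:
            errors.append(f"operation {operation} requires a value")
    else:
        errors.append(f"unsupported operation: {operation}")
    return errors


def validate_config(restrictions):
    """C4: report problems in a result dict instead of raising to the caller."""
    collected = []
    for restriction in restrictions:
        collected.extend(validate_restriction(restriction))
    # Q1: dict.fromkeys keeps first-seen order while dropping duplicates
    errors = list(dict.fromkeys(collected))
    return {"allowed": not errors, "errors": errors}


if __name__ == "__main__":
    print(validate_config([
        {"column": "amount", "operation": ">"},
        {"column": "price", "operation": ">"},
        {"column": "status", "operation": "in", "values": []},
    ]))
```

Returning a dict rather than raising is what keeps an invalid config from surfacing as an HTTP 500 in a REST caller. Keeping the errors in a list (rather than a set) makes the output deterministic for tests.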
REST API & integration
- REST API with Swagger: `src/sql_data_guard/rest/sql_data_guard_rest.py`.
- MCP wrapper (`src/sql_data_guard/mcpwrapper/mcp_wrapper.py`).
- Build and dependency files (`pyproject.toml`, `requirements.txt`).
Tests
New unit suites covering:
- `test/test_injection_detection_unit.py`
- `test/test_column_masking_unit.py`
- `test/test_denied_columns_unit.py`
- `test/test_function_policy_unit.py`
- `test/test_limit_enforcement_unit.py`
- `test/test_rest_api_unit.py`
- `test/test_security_fixes_unit.py` (validated security-fix backlog, ~395 lines)
Documentation & onboarding
- Technical documentation (`docs/technical-documentation/`): index, code explanation, HLD, LLD, architecture, coding guidelines.
- `ONBOARDING.md` and a comprehensive `README.md` rewrite.
- `.agents/skills`, `.clinerules/Security-Rule.md`, and a technical-documentation workflow.