Add PostgreSQL support to core analysis classes by antarr · Pull Request #21 · antarr/sql_genius

antarr · 2026-05-18T03:00:54Z

Summary

This PR adds full PostgreSQL support to the mysql_genius-core analysis layer, enabling the dashboard to work against PostgreSQL servers in addition to MySQL/MariaDB. Dialect detection is automatic and transparent to callers.

Key Changes

Architecture

Dialect-aware query builders — New QueryBuilders factory module dispatches to Mysql or Postgresql builders based on connection.server_version.dialect. Each builder is a stateless module exposing class methods that return raw SQL strings, with a stable column-name contract so downstream transformation logic remains dialect-agnostic.
Dialect-specific analysis implementations — ServerOverview now dispatches to nested Mysql or Postgresql classes; other analyses (TableSizes, QueryStats, UnusedIndexes, StatsCollector) delegate SQL generation to the appropriate builder.

New Files

lib/mysql_genius/core/query_builders.rb — Factory and interface definition
lib/mysql_genius/core/query_builders/mysql.rb — MySQL/MariaDB SQL generation (extracted from existing analysis classes)
lib/mysql_genius/core/query_builders/postgresql.rb — PostgreSQL SQL generation using pg_stat_statements, pg_stat_activity, pg_class, and pg_settings

Modified Analysis Classes

ServerOverview — Refactored to dispatch to Mysql or Postgresql nested classes; PostgreSQL implementation populates the innodb block with shared_buffers and buffer cache hit rate (blks_hit/blks_read ratio) so the existing UI continues to render.
QueryStats — Uses builder to generate SQL; transforms both MySQL performance_schema.events_statements_summary_by_digest and PostgreSQL pg_stat_statements rows into a uniform output schema.
TableSizes — Uses builder to query information_schema.tables (MySQL) or pg_class + pg_total_relation_size (PostgreSQL).
UnusedIndexes — Uses builder to query performance_schema.table_io_waits_summary_by_index_usage (MySQL) or pg_stat_user_indexes (PostgreSQL); generates dialect-appropriate DROP INDEX statements.
StatsCollector — Uses builder for snapshot SQL generation.
QueryRunner — Timeout strategy now dialect-aware: MySQL uses /*+ MAX_EXECUTION_TIME(ms) */ hint, MariaDB uses SET STATEMENT max_statement_time=s FOR, PostgreSQL uses SET statement_timeout = ms with reset in ensure block.

Supporting Changes

ServerInfo — Added #dialect method that collapses :mysql and :mariadb vendors into :mysql dialect, and :postgresql vendor into :postgresql dialect. Enhanced #parse to recognize PostgreSQL version strings (case-insensitive).
SqlValidator — Added POSTGRESQL_SYSTEM_SCHEMAS constant; validation now checks the appropriate schema list based on dialect.
FakeAdapter — Updated quote_table_name to use double quotes for PostgreSQL, backticks for MySQL; added PostgreSQL version string detection.

Test Coverage

New test suite spec/mysql_genius/core/query_builders_spec.rb validates builder dispatch and SQL generation for both dialects.
Existing analysis specs extended with PostgreSQL context blocks that stub pg_stat_statements, pg_stat_activity, pg_class, and pg_settings queries.
QueryRunner, SqlValidator, and ServerInfo specs updated to cover PostgreSQL-specific behavior.

Notable Implementation Details

PostgreSQL pg_stat_statements requires the extension to be installed; if unavailable, the query raises just as it does on MySQL when performance_schema is disabled.
PostgreSQL schema/table identifiers use double quotes; MySQL uses backticks. The builders and FakeAdapter handle this transparently.
Lock-related fields in the innodb block (e.g., row_lock_waits) fall back to 0 on PostgreSQL since there is no direct equivalent of InnoDB row lock waits.
Fragmentation metrics

https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE

Introduces a dialect-aware QueryBuilders layer (Mysql + Postgresql) so the existing analysis classes — TableSizes, QueryStats, UnusedIndexes, ServerOverview, StatsCollector — run unchanged against either MySQL, MariaDB, or PostgreSQL. Dialect selection is automatic via Core::ServerInfo, which now recognises :postgresql as a vendor. Beyond the SQL: SqlValidator blocks the right system schemas for each dialect and parses double-quoted identifiers; QueryRunner enforces statement timeouts via SET statement_timeout on PostgreSQL with an ensure-block reset, and recognises PG's cancellation message as a timeout error; FakeAdapter quotes identifiers with double-quotes when stubbed as PostgreSQL so tests exercise the right dialect path. PostgreSQL query stats require the pg_stat_statements extension; the ServerOverview innodb block is repurposed to hold the closest PG analogues (shared_buffers, blks_hit/blks_read ratio, deadlocks). MySQL-only AI features (InnoDB status, slow log capture) remain unchanged and are not exposed on PostgreSQL connections. https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE

Follow-up to the core gem PG support: the dashboard tabs visibly broke when pointed at a PG database because several Rails-side controllers, views, and AI features still emit MySQL-only SQL (performance_schema reads, SHOW commands, backticked ALTER TABLE DROP INDEX). This pass extends the dialect-aware layer across the rest of the stack. Adds: - Core::Analysis::QueryHistory — single-digest stats lookup used by the query detail page; reads pg_stat_statements on PG. - Core::Analysis::DuplicateIndexes now emits drop_sql in each result hash via the dialect builder, so the dashboard JS no longer assembles ALTER TABLE statements client-side. - Core::UnsupportedDialect error class; raised by VariableReviewer, ConnectionAdvisor, and InnodbInterpreter when called on a PG connection so the UI gets a clear MySQL-only message instead of a raw PG::SyntaxError. Refactors: - queries_controller.rb#query_history delegates to Core::Analysis:: QueryHistory rather than inlining performance_schema SQL. - database_analysis.rb's friendly error messages pick the right stats source name based on dialect (pg_stat_statements vs performance_schema). - ai_features.rb gates root_cause and anomaly_detection — both run performance_schema/SHOW queries directly — behind a connected_to_ postgresql? guard that returns a JSON "MySQL/MariaDB-only" error. - ActiveRecordAdapter#server_version memoizes; dialect detection now costs one SELECT VERSION() per request rather than one per analysis. - Cosmetic tab headers no longer mention performance_schema / information_schema by name. Test plumbing: - FakeConnectionHelper default-stubs SELECT VERSION() to "8.0.35" and routes unstubbed select_value calls through the exec_query matchers, so request specs don't need to grow per-controller version stubs. https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE

The split was originally added to support a planned desktop app (mysql_genius-desktop). With that direction discontinued, the two-gem layout creates friction with no payoff — lockstep version bumps, two gemspecs/Gemfiles/CHANGELOGs/test suites, and host apps having to declare a path dependency on both gems during development (which the PG branch surfaced when several controllers initially appeared not to load the new dialect-aware code). What moved: - gems/mysql_genius-core/lib/mysql_genius/core/** → lib/mysql_genius/core/** - gems/mysql_genius-core/spec/** → spec/mysql_genius/core/** - Views under core/views/ continue to live where Core.views_path expects. - Git rename detection caught 89 of these so file history is preserved. What was removed: - gems/mysql_genius-core/ entirely (gemspec, Gemfile, Rakefile, CHANGELOG). - gems/mysql_genius-desktop/ entirely (the stub that justified the split). - packaging/macos/ (build-dmg.sh + icon generator that only packaged the desktop gem). - mysql_genius.gemspec's runtime dependency on `mysql_genius-core`. - The path declaration in the top-level Gemfile. - Makefile's test-core/test-desktop/lint-core/lint-desktop/sidecar targets. - CI's separate `desktop` job and the publish workflow's separate mysql_genius-core build/push step. - The unused `MysqlGenius::Core::VERSION` constant. Use MysqlGenius::VERSION. What stayed identical: - The `MysqlGenius::Core::*` namespace and every public require path (`require "mysql_genius/core/..."`). No host-app code changes needed. - Single rspec run from the repo root now executes both formerly-split suites (401 examples, was 94 + 307). https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE

A user reported ~231 indexes flagged as unused on a real PG database, basically every non-PK/non-unique index. Investigation: idx_scan is cumulative since the stats source was last reset, so a recent PG restart or pg_stat_reset() makes everything read as "0 scans" until real traffic accumulates. We were emitting a bare JSON array of matches, with no context for the user to tell "truly unused" apart from "stats are fresh." Matches PgHero's approach: - The route now returns { indexes, stats_reset_at, min_scans } so the dashboard can render "Stats last reset 2h ago · threshold: scans ≤ 0" above the table. `stats_reset_at` comes from pg_stat_database.stats_reset on PG and is nil on MySQL (whose table_io_waits counters don't surface a reset time cheaply). - New config.min_unused_index_scans (default 0 to match PgHero). Both builders thread it through to idx_scan <= N / COUNT_READ <= N so users can ignore indexes that are technically used but rarely enough to be drop candidates. - PG results now sort by pg_relation_size(indexrelid) DESC and carry a size_bytes field; the table shows a Size column with the biggest candidates at the top (where most wins live). MySQL emits a nil size_bytes — individual index size isn't cheap to surface from information_schema.STATISTICS. - Dropped the `c.reltuples > 0` filter on PG. PgHero doesn't have it and an empty table with unused indexes is still actionable noise. The view + count widget were updated to read data.indexes; existing JSON-array consumers will need the small shape change but that's called out in CHANGELOG. IndexPlanner (which consumed UnusedIndexes directly) now pulls .indexes off the Result struct. https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE

The dashboard JS ROUTES.query_detail was hardcoded to '/queries/' without the engine mount prefix, so query stats rows rendered <a href="/queries/${digest}"> instead of "/mysql_genius/queries/...", 404'ing in host apps. Every other route in the ROUTES table goes through path_for(...) which generates engine-aware paths; query_detail was special-cased because path_for(:query_detail) requires a digest param it doesn't have on the dashboard page. Switching to '<%= mysql_genius.root_path %>queries/' picks up the mount point from the engine's URL helpers without needing a placeholder digest. Added a regression spec on GET /mysql_genius/ that asserts the rendered prefix is /mysql_genius/queries/, not /queries/. https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE

A user hit `NameError: undefined local variable or method 'mysql_genius'` in production when first rendering the dashboard. The `mysql_genius` URL helpers proxy is supposed to be injected into controllers/views when the engine is mounted, but in production with eager loading the proxy isn't always available in time — the failure surfaced on the first path_for() call. The robust fix is to bypass the proxy and reference the engine's URL helpers module directly: MysqlGenius::Engine.routes.url_helpers.public_send("#{name}_path") This always resolves once the engine routes are drawn, with no dependency on Rails injecting `mysql_genius` onto each controller instance. Also folds the query_detail link prefix into path_for as a named `:query_detail_prefix` lookup, replacing the inline `mysql_genius.root_path` ERB from the previous commit (which would have hit the same proxy problem). The existing regression spec from c7c350f already asserts the rendered JS reads `query_detail: '/mysql_genius/queries/'`, so it now also guards this fix. https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE

EXPLAIN on captured digest text was failing on PostgreSQL because pg_stat_statements normalizes literals to $1, $2, ... bind placeholders, and PG can't plan a parameterized query without bind values ("there is no parameter $1"). MySQL has the same issue with ? in performance_schema digest text but the PG report was the visible one. QueryExplainer now rewrites placeholders to NULL before wrapping the SQL in EXPLAIN: - PG: $N → NULL (unambiguous; $N doesn't appear in hand-written SQL outside dollar-quoted strings, which are an edge case) - MySQL: ? → NULL, but only outside single-quoted string literals, so user-typed `WHERE name = '?'` from the Query Explorer is left alone. The walker also handles escaped quotes (''). Tradeoff: selectivity-dependent plan choices (e.g. index scan vs. seq scan based on estimated row counts) may differ from production, but the shape of the plan — join order, indexes used, scan types — is still useful for diagnostics. A future enhancement could prefer EXPLAIN (GENERIC_PLAN) on PG 16+ for a more accurate plan; for now NULL substitution works on every supported PG version. The substitution happens BEFORE the existing truncation heuristic, since a bare `?` doesn't match the "value-like trailing char" pattern and would otherwise short-circuit as Truncated. https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE

OpenAI's GPT-5 and o-series ("reasoning") models renamed max_tokens to max_completion_tokens and only accept the default temperature (1). Sending the legacy params returns: Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead. Detect the model family from @config.model and pick the right shape: - Legacy chat models (gpt-4o, gpt-4, gpt-3.5, etc.): body[:max_tokens] = N body[:temperature] = T - Reasoning models (gpt-5*, o1*, o3*, o4*): body[:max_completion_tokens] = N (no temperature — default is 1, anything else is rejected) The regex `\b(gpt-5|o1|o3|o4)(-|\b)/i` matches the bare model name (`gpt-5`, `o1`, `o3-mini`, `o4`) plus the common variants (`gpt-5-mini`, `o1-mini`, `o3-pro`). gpt-4o is *not* matched (the trailing `o` doesn't sit on a word boundary the way `o1` does), so the existing default model keeps working unchanged. Anthropic and the custom callable path are untouched. https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE

Four issues surfaced in actual PG dashboard use: 1. EXPLAIN false-positive on Rails query annotations SQL like `SELECT COUNT(*) FROM "taggings" /*action='table_sizes', ...*/` was rejected as "truncated" because the closing `*/` ends in `/` and tripped the trailing-operator check. looks_complete? now strips block comments before inspecting the trailing token. 2. AI assistant generates MySQL backticks on a PG database `Suggestion`, `RewriteQuery`, `Optimization`, `IndexAdvisor`, `IndexPlanner`, `MigrationRisk`, `SchemaReview`, `WorkloadDigest`, and `PatternGrouper` all hardcoded "MySQL" in their system prompts; Suggestion also told the model to "use backticks for table and column names." On PG this produced unparseable SQL. New Ai::DialectHints module exposes name_for(conn) and identifier_quoting_rule(conn). Every prompt now interpolates the dialect name; Suggestion's quoting rule swaps to double quotes on PG. IndexPlanner's prompt also specifies dialect-appropriate DROP INDEX syntax. 3. MySQL-only AI buttons visible on PG dashboard "Why is it slow?", "Anomaly Detection", "Variable Review", "Connection Advisor", "InnoDB Health" all require MySQL-only data sources. The buttons were rendered, clicking them returned a friendly "MySQL/MariaDB-only" error — but the noise was the user experience. capability?(:mysql_only_ai) now returns false on PG, and the Server tab gates those five buttons on it. The controller- side friendly-error guard stays as a defensive backstop. 4. Misleading server labels on PG - "InnoDB Buffer Pool" card → renamed "Buffer Cache" (works for both InnoDB and shared_buffers). - PG ServerOverview was reporting tmp_disk_pct = 100% whenever temp_files > 0, flagging a red badge on every dashboard. PG's temp_files is a single count with no in-memory denominator, so it now reports 0% and shows N / N for the count. - Was mapping xact_rollback to "Aborted Clients" which is a different concept; PG's connections block now reports 0 for aborted_clients/aborted_connects until a proper PG-side mapping exists. https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE

claude added 9 commits May 18, 2026 02:51

antarr force-pushed the claude/add-postgresql-support-eEEGJ branch 2 times, most recently from 035ee42 to a5512ba Compare May 19, 2026 03:48

Fix query explorer quoting on PostgreSQL

cca2f72

antarr self-assigned this May 19, 2026

antarr merged commit f3d9e9e into main May 30, 2026
30 checks passed

antarr deleted the claude/add-postgresql-support-eEEGJ branch May 30, 2026 23:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add PostgreSQL support to core analysis classes#21

Add PostgreSQL support to core analysis classes#21
antarr merged 10 commits into
mainfrom
claude/add-postgresql-support-eEEGJ

antarr commented May 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

antarr commented May 18, 2026

Summary

Key Changes

Architecture

New Files

Modified Analysis Classes

Supporting Changes

Test Coverage

Notable Implementation Details

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants