Add PostgreSQL support to core analysis classes#21
Merged
Conversation
Introduces a dialect-aware QueryBuilders layer (Mysql + Postgresql) so the existing analysis classes — TableSizes, QueryStats, UnusedIndexes, ServerOverview, StatsCollector — run unchanged against either MySQL, MariaDB, or PostgreSQL. Dialect selection is automatic via Core::ServerInfo, which now recognises :postgresql as a vendor. Beyond the SQL: SqlValidator blocks the right system schemas for each dialect and parses double-quoted identifiers; QueryRunner enforces statement timeouts via SET statement_timeout on PostgreSQL with an ensure-block reset, and recognises PG's cancellation message as a timeout error; FakeAdapter quotes identifiers with double-quotes when stubbed as PostgreSQL so tests exercise the right dialect path. PostgreSQL query stats require the pg_stat_statements extension; the ServerOverview innodb block is repurposed to hold the closest PG analogues (shared_buffers, blks_hit/blks_read ratio, deadlocks). MySQL-only AI features (InnoDB status, slow log capture) remain unchanged and are not exposed on PostgreSQL connections. https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE
Follow-up to the core gem PG support: the dashboard tabs visibly broke when pointed at a PG database because several Rails-side controllers, views, and AI features still emit MySQL-only SQL (performance_schema reads, SHOW commands, backticked ALTER TABLE DROP INDEX). This pass extends the dialect-aware layer across the rest of the stack. Adds: - Core::Analysis::QueryHistory — single-digest stats lookup used by the query detail page; reads pg_stat_statements on PG. - Core::Analysis::DuplicateIndexes now emits drop_sql in each result hash via the dialect builder, so the dashboard JS no longer assembles ALTER TABLE statements client-side. - Core::UnsupportedDialect error class; raised by VariableReviewer, ConnectionAdvisor, and InnodbInterpreter when called on a PG connection so the UI gets a clear MySQL-only message instead of a raw PG::SyntaxError. Refactors: - queries_controller.rb#query_history delegates to Core::Analysis:: QueryHistory rather than inlining performance_schema SQL. - database_analysis.rb's friendly error messages pick the right stats source name based on dialect (pg_stat_statements vs performance_schema). - ai_features.rb gates root_cause and anomaly_detection — both run performance_schema/SHOW queries directly — behind a connected_to_ postgresql? guard that returns a JSON "MySQL/MariaDB-only" error. - ActiveRecordAdapter#server_version memoizes; dialect detection now costs one SELECT VERSION() per request rather than one per analysis. - Cosmetic tab headers no longer mention performance_schema / information_schema by name. Test plumbing: - FakeConnectionHelper default-stubs SELECT VERSION() to "8.0.35" and routes unstubbed select_value calls through the exec_query matchers, so request specs don't need to grow per-controller version stubs. https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE
The split was originally added to support a planned desktop app (mysql_genius-desktop). With that direction discontinued, the two-gem layout creates friction with no payoff — lockstep version bumps, two gemspecs/Gemfiles/CHANGELOGs/test suites, and host apps having to declare a path dependency on both gems during development (which the PG branch surfaced when several controllers initially appeared not to load the new dialect-aware code). What moved: - gems/mysql_genius-core/lib/mysql_genius/core/** → lib/mysql_genius/core/** - gems/mysql_genius-core/spec/** → spec/mysql_genius/core/** - Views under core/views/ continue to live where Core.views_path expects. - Git rename detection caught 89 of these so file history is preserved. What was removed: - gems/mysql_genius-core/ entirely (gemspec, Gemfile, Rakefile, CHANGELOG). - gems/mysql_genius-desktop/ entirely (the stub that justified the split). - packaging/macos/ (build-dmg.sh + icon generator that only packaged the desktop gem). - mysql_genius.gemspec's runtime dependency on `mysql_genius-core`. - The path declaration in the top-level Gemfile. - Makefile's test-core/test-desktop/lint-core/lint-desktop/sidecar targets. - CI's separate `desktop` job and the publish workflow's separate mysql_genius-core build/push step. - The unused `MysqlGenius::Core::VERSION` constant. Use MysqlGenius::VERSION. What stayed identical: - The `MysqlGenius::Core::*` namespace and every public require path (`require "mysql_genius/core/..."`). No host-app code changes needed. - Single rspec run from the repo root now executes both formerly-split suites (401 examples, was 94 + 307). https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE
A user reported ~231 indexes flagged as unused on a real PG database,
basically every non-PK/non-unique index. Investigation: idx_scan is
cumulative since the stats source was last reset, so a recent PG
restart or pg_stat_reset() makes everything read as "0 scans" until
real traffic accumulates. We were emitting a bare JSON array of
matches, with no context for the user to tell "truly unused" apart
from "stats are fresh."
Matches PgHero's approach:
- The route now returns { indexes, stats_reset_at, min_scans } so the
dashboard can render "Stats last reset 2h ago · threshold: scans
≤ 0" above the table. `stats_reset_at` comes from
pg_stat_database.stats_reset on PG and is nil on MySQL (whose
table_io_waits counters don't surface a reset time cheaply).
- New config.min_unused_index_scans (default 0 to match PgHero). Both
builders thread it through to idx_scan <= N / COUNT_READ <= N so
users can ignore indexes that are technically used but rarely
enough to be drop candidates.
- PG results now sort by pg_relation_size(indexrelid) DESC and carry
a size_bytes field; the table shows a Size column with the biggest
candidates at the top (where most wins live). MySQL emits a nil
size_bytes — individual index size isn't cheap to surface from
information_schema.STATISTICS.
- Dropped the `c.reltuples > 0` filter on PG. PgHero doesn't have it
and an empty table with unused indexes is still actionable noise.
The view + count widget were updated to read data.indexes; existing
JSON-array consumers will need the small shape change but that's
called out in CHANGELOG. IndexPlanner (which consumed UnusedIndexes
directly) now pulls .indexes off the Result struct.
https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE
The dashboard JS ROUTES.query_detail was hardcoded to '/queries/'
without the engine mount prefix, so query stats rows rendered
<a href="/queries/${digest}"> instead of "/mysql_genius/queries/...",
404'ing in host apps. Every other route in the ROUTES table goes
through path_for(...) which generates engine-aware paths; query_detail
was special-cased because path_for(:query_detail) requires a digest
param it doesn't have on the dashboard page.
Switching to '<%= mysql_genius.root_path %>queries/' picks up the
mount point from the engine's URL helpers without needing a placeholder
digest. Added a regression spec on GET /mysql_genius/ that asserts the
rendered prefix is /mysql_genius/queries/, not /queries/.
https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE
A user hit `NameError: undefined local variable or method 'mysql_genius'`
in production when first rendering the dashboard. The `mysql_genius`
URL helpers proxy is supposed to be injected into controllers/views
when the engine is mounted, but in production with eager loading the
proxy isn't always available in time — the failure surfaced on the
first path_for() call.
The robust fix is to bypass the proxy and reference the engine's URL
helpers module directly:
MysqlGenius::Engine.routes.url_helpers.public_send("#{name}_path")
This always resolves once the engine routes are drawn, with no
dependency on Rails injecting `mysql_genius` onto each controller
instance.
Also folds the query_detail link prefix into path_for as a named
`:query_detail_prefix` lookup, replacing the inline `mysql_genius.root_path`
ERB from the previous commit (which would have hit the same proxy
problem).
The existing regression spec from c7c350f already asserts the rendered
JS reads `query_detail: '/mysql_genius/queries/'`, so it now also
guards this fix.
https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE
EXPLAIN on captured digest text was failing on PostgreSQL because
pg_stat_statements normalizes literals to $1, $2, ... bind
placeholders, and PG can't plan a parameterized query without bind
values ("there is no parameter $1"). MySQL has the same issue with ?
in performance_schema digest text but the PG report was the visible
one.
QueryExplainer now rewrites placeholders to NULL before wrapping the
SQL in EXPLAIN:
- PG: $N → NULL (unambiguous; $N doesn't appear in hand-written SQL
outside dollar-quoted strings, which are an edge case)
- MySQL: ? → NULL, but only outside single-quoted string literals,
so user-typed `WHERE name = '?'` from the Query Explorer is left
alone. The walker also handles escaped quotes ('').
Tradeoff: selectivity-dependent plan choices (e.g. index scan vs.
seq scan based on estimated row counts) may differ from production,
but the shape of the plan — join order, indexes used, scan types —
is still useful for diagnostics. A future enhancement could prefer
EXPLAIN (GENERIC_PLAN) on PG 16+ for a more accurate plan; for now
NULL substitution works on every supported PG version.
The substitution happens BEFORE the existing truncation heuristic,
since a bare `?` doesn't match the "value-like trailing char" pattern
and would otherwise short-circuit as Truncated.
https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE
OpenAI's GPT-5 and o-series ("reasoning") models renamed max_tokens
to max_completion_tokens and only accept the default temperature (1).
Sending the legacy params returns:
Unsupported parameter: 'max_tokens' is not supported with this
model. Use 'max_completion_tokens' instead.
Detect the model family from @config.model and pick the right shape:
- Legacy chat models (gpt-4o, gpt-4, gpt-3.5, etc.):
body[:max_tokens] = N
body[:temperature] = T
- Reasoning models (gpt-5*, o1*, o3*, o4*):
body[:max_completion_tokens] = N
(no temperature — default is 1, anything else is rejected)
The regex `\b(gpt-5|o1|o3|o4)(-|\b)/i` matches the bare model name
(`gpt-5`, `o1`, `o3-mini`, `o4`) plus the common variants
(`gpt-5-mini`, `o1-mini`, `o3-pro`). gpt-4o is *not* matched (the
trailing `o` doesn't sit on a word boundary the way `o1` does), so
the existing default model keeps working unchanged.
Anthropic and the custom callable path are untouched.
https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE
Four issues surfaced in actual PG dashboard use:
1. EXPLAIN false-positive on Rails query annotations
SQL like `SELECT COUNT(*) FROM "taggings" /*action='table_sizes',
...*/` was rejected as "truncated" because the closing `*/` ends
in `/` and tripped the trailing-operator check. looks_complete?
now strips block comments before inspecting the trailing token.
2. AI assistant generates MySQL backticks on a PG database
`Suggestion`, `RewriteQuery`, `Optimization`, `IndexAdvisor`,
`IndexPlanner`, `MigrationRisk`, `SchemaReview`, `WorkloadDigest`,
and `PatternGrouper` all hardcoded "MySQL" in their system
prompts; Suggestion also told the model to "use backticks for
table and column names." On PG this produced unparseable SQL.
New Ai::DialectHints module exposes name_for(conn) and
identifier_quoting_rule(conn). Every prompt now interpolates the
dialect name; Suggestion's quoting rule swaps to double quotes
on PG. IndexPlanner's prompt also specifies dialect-appropriate
DROP INDEX syntax.
3. MySQL-only AI buttons visible on PG dashboard
"Why is it slow?", "Anomaly Detection", "Variable Review",
"Connection Advisor", "InnoDB Health" all require MySQL-only data
sources. The buttons were rendered, clicking them returned a
friendly "MySQL/MariaDB-only" error — but the noise was the user
experience. capability?(:mysql_only_ai) now returns false on PG,
and the Server tab gates those five buttons on it. The controller-
side friendly-error guard stays as a defensive backstop.
4. Misleading server labels on PG
- "InnoDB Buffer Pool" card → renamed "Buffer Cache" (works for
both InnoDB and shared_buffers).
- PG ServerOverview was reporting tmp_disk_pct = 100% whenever
temp_files > 0, flagging a red badge on every dashboard. PG's
temp_files is a single count with no in-memory denominator, so
it now reports 0% and shows N / N for the count.
- Was mapping xact_rollback to "Aborted Clients" which is a
different concept; PG's connections block now reports 0 for
aborted_clients/aborted_connects until a proper PG-side mapping
exists.
https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE
035ee42 to
a5512ba
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR adds full PostgreSQL support to the mysql_genius-core analysis layer, enabling the dashboard to work against PostgreSQL servers in addition to MySQL/MariaDB. Dialect detection is automatic and transparent to callers.
Key Changes
Architecture
QueryBuildersfactory module dispatches toMysqlorPostgresqlbuilders based onconnection.server_version.dialect. Each builder is a stateless module exposing class methods that return raw SQL strings, with a stable column-name contract so downstream transformation logic remains dialect-agnostic.ServerOverviewnow dispatches to nestedMysqlorPostgresqlclasses; other analyses (TableSizes,QueryStats,UnusedIndexes,StatsCollector) delegate SQL generation to the appropriate builder.New Files
lib/mysql_genius/core/query_builders.rb— Factory and interface definitionlib/mysql_genius/core/query_builders/mysql.rb— MySQL/MariaDB SQL generation (extracted from existing analysis classes)lib/mysql_genius/core/query_builders/postgresql.rb— PostgreSQL SQL generation usingpg_stat_statements,pg_stat_activity,pg_class, andpg_settingsModified Analysis Classes
MysqlorPostgresqlnested classes; PostgreSQL implementation populates theinnodbblock withshared_buffersand buffer cache hit rate (blks_hit/blks_read ratio) so the existing UI continues to render.performance_schema.events_statements_summary_by_digestand PostgreSQLpg_stat_statementsrows into a uniform output schema.information_schema.tables(MySQL) orpg_class+pg_total_relation_size(PostgreSQL).performance_schema.table_io_waits_summary_by_index_usage(MySQL) orpg_stat_user_indexes(PostgreSQL); generates dialect-appropriate DROP INDEX statements./*+ MAX_EXECUTION_TIME(ms) */hint, MariaDB usesSET STATEMENT max_statement_time=s FOR, PostgreSQL usesSET statement_timeout = mswith reset in ensure block.Supporting Changes
#dialectmethod that collapses:mysqland:mariadbvendors into:mysqldialect, and:postgresqlvendor into:postgresqldialect. Enhanced#parseto recognize PostgreSQL version strings (case-insensitive).POSTGRESQL_SYSTEM_SCHEMASconstant; validation now checks the appropriate schema list based on dialect.quote_table_nameto use double quotes for PostgreSQL, backticks for MySQL; added PostgreSQL version string detection.Test Coverage
spec/mysql_genius/core/query_builders_spec.rbvalidates builder dispatch and SQL generation for both dialects.pg_stat_statements,pg_stat_activity,pg_class, andpg_settingsqueries.Notable Implementation Details
pg_stat_statementsrequires the extension to be installed; if unavailable, the query raises just as it does on MySQL whenperformance_schemais disabled.innodbblock (e.g.,row_lock_waits) fall back to 0 on PostgreSQL since there is no direct equivalent of InnoDB row lock waits.https://claude.ai/code/session_01Xy96nK3Ron2NqukBiSgtzE