Skip to content

bug(trino): SQL formatter rewrites Trino-native functions to invalid names (date_diff→datediff, to_hex→hex) via sqlglot dialect mismatch #39100

@shimonp

Description

@shimonp

Bug description

When "Format SQL" is enabled in SQL Lab, Superset passes Trino queries through
sqlglot for pretty-printing but does not thread the Trino dialect through to
sqlglot's generator. This causes sqlglot to fall back to its generic SQL
generator, which renames several Trino-native functions to names that Trino
does not recognise:

date_diff → datediff (FUNCTION_NOT_FOUND)
to_hex → hex (FUNCTION_NOT_FOUND)
from_hex → unhex (FUNCTION_NOT_FOUND)
date_trunc → (argument order swapped in some paths)

The error from Trino is always the same pattern:
TrinoUserError(type=USER_ERROR, name=FUNCTION_NOT_FOUND,
message="line N:N: Function 'datediff' not registered")

sqlglot itself is NOT at fault. When the dialect is passed correctly, it
round-trips Trino functions perfectly:

sqlglot.transpile("SELECT date_diff('day', a, b)", read="trino", write="trino")
→ ["SELECT date_diff('day', a, b)"] ✅

sqlglot.transpile("SELECT date_diff('day', a, b)") # no dialect
→ ["SELECT DATEDIFF(b, a)"] ❌ ← what Superset produces

Root cause: Superset's SQL formatting path calls sqlglot without specifying
the write dialect for TrinoEngineSpec, so the generic generator is used.
The fix belongs in superset/db_engine_specs/presto.py (PrestoBaseEngineSpec)
or trino.py, adding explicit sqlglot TRANSFORMS overrides for the affected
exp.* AST nodes so they emit the correct Trino function names regardless of
how the generator is invoked.

Screenshots/recordings

  1. Connect Superset 6.0.0 (or 6.1rc1) to a Trino database (any version)
  2. Open SQL Lab, ensure "Format SQL" is ON
  3. Run:
    SELECT date_diff('day', DATE '2023-01-01', current_date)
  4. Observe error:
    TrinoUserError: Function 'datediff' not registered
  5. Disable "Format SQL" — same query runs successfully
  6. Same issue reproducible with: to_hex(), from_hex(), sha256(), md5()

Superset version

master / latest-dev

Python version

3.9

Node version

16

Browser

Chrome

Additional context

Expected result:
Trino-native functions (date_diff, to_hex, from_hex, md5, sha256, date_add)
are passed through to Trino unchanged, regardless of whether "Format SQL"
is enabled.

Checklist

  • I have searched Superset docs and Slack and didn't find a solution to my problem.
  • I have searched the GitHub issue tracker and didn't find a similar bug report.
  • I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

Metadata

Metadata

Assignees

No one assigned

    Labels

    data:connect:trinoRelated to TrinosqllabNamespace | Anything related to the SQL Lab

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions