Merged
Changes from 2 commits
4 changes: 2 additions & 2 deletions go.mod
@@ -10,7 +10,7 @@ require (
)

require (
-github.com/itchyny/gojq v0.12.18
+github.com/itchyny/gojq v0.12.19
github.com/santhosh-tekuri/jsonschema/v5 v5.3.1
github.com/stretchr/testify v1.11.1
github.com/tetratelabs/wazero v1.11.0
@@ -30,7 +30,7 @@ require (
github.com/google/uuid v1.6.0 // indirect
github.com/grpc-ecosystem/grpc-gateway/v2 v2.28.0 // indirect
github.com/inconshreveable/mousetrap v1.1.0 // indirect
-github.com/itchyny/timefmt-go v0.1.7 // indirect
+github.com/itchyny/timefmt-go v0.1.8 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/segmentio/asm v1.1.3 // indirect
github.com/segmentio/encoding v0.5.4 // indirect
8 changes: 4 additions & 4 deletions go.sum
@@ -26,10 +26,10 @@ github.com/grpc-ecosystem/grpc-gateway/v2 v2.28.0 h1:HWRh5R2+9EifMyIHV7ZV+MIZqgz
github.com/grpc-ecosystem/grpc-gateway/v2 v2.28.0/go.mod h1:JfhWUomR1baixubs02l85lZYYOm7LV6om4ceouMv45c=
github.com/inconshreveable/mousetrap v1.1.0 h1:wN+x4NVGpMsO7ErUn/mUI3vEoE6Jt13X2s0bqwp9tc8=
github.com/inconshreveable/mousetrap v1.1.0/go.mod h1:vpF70FUmC8bwa3OWnCshd2FqLfsEA9PFc4w1p2J65bw=
-github.com/itchyny/gojq v0.12.18 h1:gFGHyt/MLbG9n6dqnvlliiya2TaMMh6FFaR2b1H6Drc=
-github.com/itchyny/gojq v0.12.18/go.mod h1:4hPoZ/3lN9fDL1D+aK7DY1f39XZpY9+1Xpjz8atrEkg=
-github.com/itchyny/timefmt-go v0.1.7 h1:xyftit9Tbw+Dc/huSSPJaEmX1TVL8lw5vxjJLK4GMMA=
-github.com/itchyny/timefmt-go v0.1.7/go.mod h1:5E46Q+zj7vbTgWY8o5YkMeYb4I6GeWLFnetPy5oBrAI=
+github.com/itchyny/gojq v0.12.19 h1:ttXA0XCLEMoaLOz5lSeFOZ6u6Q3QxmG46vfgI4O0DEs=
+github.com/itchyny/gojq v0.12.19/go.mod h1:5galtVPDywX8SPSOrqjGxkBeDhSxEW1gSxoy7tn1iZY=
+github.com/itchyny/timefmt-go v0.1.8 h1:1YEo1JvfXeAHKdjelbYr/uCuhkybaHCeTkH8Bo791OI=
+github.com/itchyny/timefmt-go v0.1.8/go.mod h1:5E46Q+zj7vbTgWY8o5YkMeYb4I6GeWLFnetPy5oBrAI=
github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
10 changes: 5 additions & 5 deletions internal/middleware/README.md
@@ -96,19 +96,19 @@ with open(payload_path) as f:
The middleware uses the same jq filter logic as the gh-aw jqschema utility:

```jq
-def walk(f):
+def walk_schema:
   . as $in |
   if type == "object" then
-    reduce keys[] as $k ({}; . + {($k): ($in[$k] | walk(f))})
+    reduce keys[] as $k ({}; . + {($k): ($in[$k] | walk_schema)})
   elif type == "array" then
-    if length == 0 then [] else [.[0] | walk(f)] end
+    if length == 0 then [] else [.[0] | walk_schema] end
   else
     type
   end;
-walk(.)
+walk_schema
```

-This recursively walks the JSON structure and replaces values with their type names.
+This recursively walks the JSON structure and replaces values with their type names. The function is named `walk_schema` to avoid shadowing gojq's built-in `walk/1`.

### Go Implementation

29 changes: 15 additions & 14 deletions internal/middleware/jqschema.go
@@ -43,7 +43,7 @@ type PayloadMetadata struct {
}

// jqSchemaFilter is the jq filter that transforms JSON to schema
-// This filter leverages gojq v0.12.18 features including:
+// This filter leverages gojq v0.12.19 features including:
// - Enhanced array handling (supports up to 536,870,912 elements / 2^29)
// - Improved concurrent execution performance
// - Better error messages for type errors
@@ -56,25 +56,26 @@ type PayloadMetadata struct {
// For arrays, only the first element's schema is retained to represent the array structure.
// Empty arrays are preserved as [].
//
-// NOTE: This defines a custom walk function rather than using gojq's built-in walk(f).
+// NOTE: This defines a custom walk_schema function rather than using gojq's built-in walk(f).
// The built-in walk(f) applies f to every node but preserves the original structure.
-// Our custom walk does two things the built-in cannot:
+// Our custom walk_schema does two things the built-in cannot:
// 1. Replaces leaf values with their type name (e.g., "test" → "string")
// 2. Collapses arrays to only the first element for schema inference
//
// These behaviors are incompatible with standard walk(f) semantics, which would
// apply f post-recursion without structural changes to arrays.
+// Using a distinct name avoids shadowing gojq's built-in walk/1.
const jqSchemaFilter = `
-def walk(f):
+def walk_schema:
   . as $in |
   if type == "object" then
-    reduce keys[] as $k ({}; . + {($k): ($in[$k] | walk(f))})
+    reduce keys[] as $k ({}; . + {($k): ($in[$k] | walk_schema)})
   elif type == "array" then
-    if length == 0 then [] else [.[0] | walk(f)] end
+    if length == 0 then [] else [.[0] | walk_schema] end
   else
     type
   end;
-walk(.)
+walk_schema
`

// Pre-compiled jq query code for performance
@@ -108,7 +109,7 @@ func init() {
return
}

-logMiddleware.Printf("Successfully compiled jq schema filter at init (gojq v0.12.18)")
+logMiddleware.Printf("Successfully compiled jq schema filter at init (gojq v0.12.19)")
Copilot AI Apr 9, 2026

Hardcoding the gojq version in logs tends to become stale as dependencies are upgraded. Consider removing the version from this message (or sourcing it from build/module metadata) so the log stays accurate without needing manual updates.

Suggested change
-logMiddleware.Printf("Successfully compiled jq schema filter at init (gojq v0.12.19)")
+logMiddleware.Printf("Successfully compiled jq schema filter at init")

logger.LogInfo("startup", "jq schema filter compiled successfully - array limit: 2^29 elements, timeout: %v", DefaultJqTimeout)
}

@@ -117,7 +118,7 @@ func generateRandomID() string {
bytes := make([]byte, 16)
if _, err := rand.Read(bytes); err != nil {
// Fallback to timestamp-based ID if random fails
-return fmt.Sprintf("fallback-%d", os.Getpid())
+return fmt.Sprintf("fallback-%d-%d", os.Getpid(), time.Now().UnixNano())
}
return hex.EncodeToString(bytes)
}
@@ -136,7 +137,7 @@ func generateRandomID() string {
// Error handling:
// - Returns compilation errors if init() failed
// - Returns context.DeadlineExceeded if query times out
-// - Returns enhanced error messages for type errors (gojq v0.12.18+)
+// - Returns enhanced error messages for type errors (gojq v0.12.19+)
// - Properly handles gojq.HaltError for clean halt conditions
func applyJqSchema(ctx context.Context, jsonData interface{}) (interface{}, error) {
// Check if compilation succeeded at init time
@@ -153,7 +154,7 @@ func applyJqSchema(ctx context.Context, jsonData interface{}) (interface{}, erro
}

// Run the pre-compiled query with context support (much faster than Parse+Run)
-// The iterator is consumed only once because the walk(.) filter produces exactly
+// The iterator is consumed only once because the walk_schema filter produces exactly
// one output value (the fully-transformed schema). There is no need to drain it.
iter := jqSchemaCode.RunWithContext(ctx, jsonData)
v, ok := iter.Next()
@@ -178,7 +179,7 @@ func applyJqSchema(ctx context.Context, jsonData interface{}) (interface{}, erro
return nil, fmt.Errorf("jq schema filter halted with error (exit code %d): %w", haltErr.ExitCode(), err)
}

-// Generic error case (includes enhanced v0.12.18+ type error messages)
+// Generic error case (includes enhanced v0.12.19+ type error messages)
return nil, fmt.Errorf("jq schema filter error: %w", err)
}

@@ -210,13 +211,13 @@ func savePayload(baseDir, pathPrefix, sessionID, queryID string, payload []byte)
logger.LogInfo("payload", "Writing large payload to filesystem: path=%s, size=%d bytes (%.2f KB, %.2f MB)",
filePath, payloadSize, float64(payloadSize)/1024, float64(payloadSize)/(1024*1024))

-if err := os.WriteFile(filePath, payload, 0644); err != nil {
+if err := os.WriteFile(filePath, payload, 0600); err != nil {
logger.LogError("payload", "Failed to write payload file: path=%s, size=%d bytes, error=%v",
filePath, payloadSize, err)
return "", fmt.Errorf("failed to write payload file: %w", err)
}

-logger.LogInfo("payload", "Successfully saved large payload to filesystem: path=%s, size=%d bytes, permissions=0644",
+logger.LogInfo("payload", "Successfully saved large payload to filesystem: path=%s, size=%d bytes, permissions=0600",
Copilot AI Apr 9, 2026

os.WriteFile only applies the provided file mode when the file is created; if payload.json already exists (e.g., from a prior run or an ID collision), its existing permissions are preserved. That can leave a world-readable file even after switching to 0600. Consider explicitly enforcing the mode after writing (e.g., chmod to 0600) and/or opening with flags that ensure the file is created with the intended permissions.

Suggested change
-logger.LogInfo("payload", "Successfully saved large payload to filesystem: path=%s, size=%d bytes, permissions=0600",
+if err := os.Chmod(filePath, 0600); err != nil {
+logger.LogError("payload", "Failed to enforce payload file permissions: path=%s, size=%d bytes, error=%v",
+filePath, payloadSize, err)
+return "", fmt.Errorf("failed to set payload file permissions: %w", err)
+}
+logger.LogInfo("payload", "Successfully saved large payload to filesystem: path=%s, size=%d bytes, enforcedPermissions=0600",

filePath, payloadSize)
Copilot AI Apr 9, 2026

The success log hardcodes permissions=0600, but the resulting mode can still differ (e.g., if the file already existed, or if the OS applies different semantics). Consider logging the actual stat.Mode().Perm() after the write so logs match reality.

See below for a potential fix:

	// Verify file was written correctly and log the actual resulting mode
	if stat, err := os.Stat(filePath); err != nil {
		logger.LogWarn("payload", "Could not verify payload file after write: path=%s, error=%v", filePath, err)
		logger.LogInfo("payload", "Successfully saved large payload to filesystem: path=%s, size=%d bytes",
			filePath, payloadSize)
	} else {
		logger.LogInfo("payload", "Successfully saved large payload to filesystem: path=%s, size=%d bytes, permissions=%#o",
			filePath, payloadSize, stat.Mode().Perm())


// Verify file was written correctly
29 changes: 28 additions & 1 deletion internal/middleware/jqschema_test.go
@@ -108,6 +108,33 @@ func TestApplyJqSchema(t *testing.T) {
}
}

+// TestApplyJqSchema_SingleOutputContract verifies that the walk_schema filter produces
+// exactly one output value. This documents the invariant that the iterator yields a single
+// result, catching any future filter changes that accidentally produce multiple outputs.
+func TestApplyJqSchema_SingleOutputContract(t *testing.T) {
+require := require.New(t)
+
+inputs := []interface{}{
+map[string]interface{}{"name": "test", "count": 42},
+[]interface{}{map[string]interface{}{"id": 1}},
+map[string]interface{}{"nested": map[string]interface{}{"a": []interface{}{1, 2, 3}}},
+}
+
+for _, input := range inputs {
+iter := jqSchemaCode.RunWithContext(context.Background(), input)
+
Comment on lines +126 to +128
Copilot AI Apr 9, 2026

This test uses jqSchemaCode directly without asserting compilation succeeded first. If init-time compilation ever fails (or changes in the future), jqSchemaCode may be nil and this will panic rather than fail with a helpful assertion. Consider checking jqSchemaCompileErr/jqSchemaCode != nil up front (or calling applyJqSchema and validating the iterator behavior indirectly).

+// First call must return a value
+v, ok := iter.Next()
+require.True(ok, "walk_schema should produce at least one output")
+_, isErr := v.(error)
+require.False(isErr, "walk_schema should not produce an error: %v", v)
+
+// Second call must signal exhaustion (no more values)
+v2, ok2 := iter.Next()
+require.False(ok2, "walk_schema should produce exactly one output, got second value: %v", v2)
+}
+}

func TestSavePayload(t *testing.T) {
// Create temporary directory for test
baseDir := filepath.Join(os.TempDir(), "test-jq-payloads")
@@ -517,7 +544,7 @@ func TestPayloadStorage_FilePermissions(t *testing.T) {
// Check file permissions
fileInfo, err := os.Stat(filePath)
require.NoError(t, err)
-assert.Equal(t, os.FileMode(0644), fileInfo.Mode().Perm(), "File should have 0644 permissions")
+assert.Equal(t, os.FileMode(0600), fileInfo.Mode().Perm(), "File should have 0600 permissions")
}

// TestPayloadStorage_DefaultSessionID verifies behavior when session ID is empty