refactor(api): 重新引入业务错误重构,并收敛 diff lint 作用域#373
Conversation
|
不能先合这个求求了,这个合了我们的都合不进去了 |
There was a problem hiding this comment.
Pull request overview
该 PR 旨在重新引入后端“业务错误体系(bizerr)+ 统一错误出口(resputil.HandleError)”的重构,并同步完成前端错误码双轨兼容与 lint-diff 约束收敛,确保迁移方向清晰可执行。
Changes:
- 后端新增
bizerr分组错误码体系,并引入resputil.HandleError作为统一错误出口(旧接口保留但标记 deprecated)。 - 前端错误码生成与使用方式升级:新增 error group 常量、新旧错误码双轨(legacy)与请求层策略化处理(含
markApiErrorHandled)。 - 工具链调整:后端 lint 拆分为
lint-full+lint-diff,并在 CI 中准备 upstream base;补充中英文 README 规范说明。
Reviewed changes
Copilot reviewed 33 out of 34 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| frontend/src/utils/toast.ts | toast 侧增加“已处理标记”以避免冲突类错误 fallback 重复提示 |
| frontend/src/services/generator.py | 错误码生成脚本增强:支持 bizerr 结构体 tag、deprecated-only、prefix、group 常量输出 |
| frontend/src/services/error_code_legacy.ts | 新增:生成的 legacy 错误码常量文件(供前端双轨兼容) |
| frontend/src/services/error_code.ts | 更新:新 bizerr 错误码常量与分组常量输出 |
| frontend/src/services/client.ts | 请求层错误处理策略化:冲突组 delegate + fallback、双轨错误码识别、markApiErrorHandled |
| frontend/src/routes/auth/-components/login-form.tsx | 登录页对新旧错误码兼容,并在页面自行处理时报错去重 |
| frontend/src/routes/admin/more/index.tsx | 管理端系统配置场景适配新错误码并去重 toast |
| frontend/src/components/job/detail/s-s-h-port-dialog.tsx | SSHD not found 场景兼容 legacy 错误码并去重 toast |
| frontend/README.zh-CN.md | 补充前端错误码使用规范与示例(中文) |
| frontend/README.md | 补充前端错误码使用规范与示例(英文) |
| frontend/Makefile | generate-error-code 改为从 bizerr/groups.go 生成新码,并生成 legacy 文件 |
| backend/internal/service/statistics.go | service 层将参数/DB 错误改为 bizerr 返回(便于统一 HandleError) |
| backend/internal/resputil/response.go | Response 的 code 类型对齐 bizerr.BizCode;保留旧接口并标记 deprecated |
| backend/internal/resputil/handle.go | 新增统一错误出口 HandleError:从 error 提取 BizError 并映射 HTTP 状态码 |
| backend/internal/resputil/code.go | 旧错误码常量迁移为 bizerr.BizCode,并整体标记 deprecated(兼容历史代码) |
| backend/internal/handler/system_config.go | system-config 部分场景切换到 bizerr + HandleError |
| backend/internal/handler/statistics.go | statistics handler 全面切换到 bizerr + HandleError |
| backend/internal/handler/gpu_analysis.go | gpu-analysis 部分场景切换到 bizerr + HandleError |
| backend/internal/handler/account.go | account handler 大量路径切换到 bizerr + HandleError,并新增若干 gorm 错误抬升工具 |
| backend/internal/bizerr/init.go | 新增:bizerr 分组初始化与 FromError 提取工具 |
| backend/internal/bizerr/groups.go | 新增:bizerr 分组与 code tag 声明(生成 TS 错误码的来源) |
| backend/internal/bizerr/common/bizerrdb.go | 新增:gorm 常见错误抬升为 bizerr 的通用方法 |
| backend/internal/bizerr/bizerr.go | 新增:BizCode/BizError 定义与 New/Wrap 能力(基于 cockroachdb/errors) |
| backend/go.mod | 引入 cockroachdb/errors 等依赖 |
| backend/go.sum | 依赖校验更新 |
| backend/docs/swagger.yaml | swagger 中 Response.code 改为 integer 描述(移除旧 ErrorCode enum 定义) |
| backend/docs/swagger.json | 同步 swagger.json 更新 |
| backend/docs/docs.go | 同步 swagger docs.go 更新 |
| backend/README.zh-CN.md | 补充后端错误处理规则(中文) |
| backend/README.md | 补充后端错误处理规则(英文) |
| backend/Makefile | lint 拆分 full/diff,并增加缓存目录设置 |
| backend/.golangci-full.yml | full lint 增加对 deprecated resputil 使用的静态检查排除规则 |
| backend/.golangci-diff.yml | 新增 diff lint 配置:forbidigo 仅约束 handler/service 并禁止 fmt.Errorf/errors.New 等 |
| .github/workflows/backend-pr.yml | CI 调整:为 lint-diff 准备 upstream/main,并改为直接运行 make lint |
Comments suppressed due to low confidence (2)
backend/internal/handler/system_config.go:136
- 【核心规范】UpdateLLMConfig 这里对 "validation failed" 分支返回的是 bizerr.Conflict.ResourceStatusError(HTTP 409),但 swagger 注释目前仍写的是
@Failure400;另外非该分支仍在用旧的 resputil.Error(..., ServiceError)(会统一按旧逻辑返回 500),与“顶层统一 HandleError”迁移方向不一致。建议:1) swagger 补充/修正 409;2) 将非 validation 分支也改为 resputil.HandleError(c, err) 或用 bizerr.Internal.* 进行 Wrap。
// Service 内部会判断:如果 Key == "********" 则不更新 DB 中的 Key
err := mgr.service.UpdateLLMConfig(c.Request.Context(), serviceCfg, req.Validate)
if err != nil {
if strings.Contains(err.Error(), "validation failed") {
resputil.HandleError(c, bizerr.Conflict.ResourceStatusError.New("LLM connection check failed. Please verify your settings."))
return
}
resputil.Error(c, err.Error(), resputil.ServiceError)
return
backend/internal/handler/gpu_analysis.go:97
- 【核心规范】TriggerPodAnalysis 在功能未开启时改为返回 409(bizerr.Conflict.ResourceStatusError),但 swagger 仍只声明了 400/500;同时后续 err != nil 仍走旧的 resputil.Error(..., ServiceError),会丢失 service 层可能返回的更精确业务码/HTTP 状态。建议补充
@Failure409,并将错误出口统一切到 resputil.HandleError(c, err)。
// @Success 200 {object} resputil.Response[model.GpuAnalysis] "分析成功,返回创建的分析记录"
// @Failure 400 {object} resputil.Response[any] "请求参数错误"
// @Failure 500 {object} resputil.Response[any] "分析过程中发生错误"
// @Router /v1/admin/gpu-analysis/trigger [post]
func (mgr *GpuAnalysisMgr) TriggerPodAnalysis(c *gin.Context) {
if !mgr.configService.IsGpuAnalysisEnabled(c.Request.Context()) {
resputil.HandleError(c, bizerr.Conflict.ResourceStatusError.New("GPU Analysis feature is currently disabled."))
return
}
var req TriggerAnalysisReq
if err := c.ShouldBindJSON(&req); err != nil {
resputil.BadRequestError(c, err.Error())
return
}
analysisResult, err := mgr.service.AnalyzePodByName(c.Request.Context(), req.Namespace, req.PodName)
if err != nil {
klog.Errorf("Manual analysis trigger failed for pod %s: %v", req.PodName, err)
resputil.Error(c, err.Error(), resputil.ServiceError)
return
You can also share your feedback on Copilot code review. Take the survey.
| func (c BizCode) Wrap(cause error, msg string) error { | ||
| if cause == nil { | ||
| return nil | ||
| } | ||
| be := &BizError{Code: c, Message: msg} | ||
| // 1. errors.Mark 将 be 标记为 cause 的一部分,使得 errors.Is(err, be) 成立 | ||
| // 2. errors.WithStack 加上堆栈 | ||
| // 3. errors.WithMessage 加上新的描述 | ||
| err := errors.Mark(cause, be) | ||
| err = errors.WithMessage(err, be.Error()) | ||
| return errors.WithStack(err) |
| func handleUnexpectedError(c *gin.Context, err error) { | ||
| // 如果完全不是业务错误,记录详细堆栈到日志(%+v 是关键) | ||
| fmt.Printf("Unexpected Error: %+v\n", err) | ||
| respond(c, http.StatusInternalServerError, &bizerr.BizError{ | ||
| Code: 50000, | ||
| Message: "Internal Server Error", | ||
| }) | ||
| } | ||
|
|
||
| func logServerError(err error, bErr *bizerr.BizError) { | ||
| if bErr.Code.Group() >= firstServerErrorGroup { | ||
| fmt.Printf("Business 5xx Error: %+v\n", err) | ||
| } |
| // Success 200 OK - 常规查询、修改成功 | ||
| func Success(c *gin.Context, data any) { | ||
| wrapResponse(c, "", data, OK) | ||
| emit(c, http.StatusOK, OK, "success", data) |
| const markToastHandled = (error: unknown) => { | ||
| if (!error || typeof error !== 'object') { | ||
| return | ||
| } | ||
|
|
||
| const handledError = error as { | ||
| isHandledByBiz?: boolean | ||
| fallbackLogTimer?: ReturnType<typeof setTimeout> | ||
| } | ||
| handledError.isHandledByBiz = true | ||
|
|
||
| if (handledError.fallbackLogTimer) { | ||
| clearTimeout(handledError.fallbackLogTimer) | ||
| handledError.fallbackLogTimer = undefined | ||
| } |
| .PHONY: lint-diff | ||
| lint-diff: golangci-lint | ||
| @echo "$(BLUE)Linting changed code (ignore deprecated)...$(RESET)" | ||
| $(GOLANGCI_LINT) run \ | ||
| -c .golangci-diff.yml\ | ||
| --new-from-rev=upstream/main \ | ||
| --timeout 5m |
| token := util.GetToken(c) | ||
| // Check if current user is in account (does not require admin role) | ||
| if err := mgr.checkUserInAccount(c, token.UserID, req.ID); err != nil { | ||
| resputil.HTTPError(c, httpStatusForbidden, "Forbidden: User is not in account", resputil.NotSpecified) | ||
| resputil.HandleError(c, err) | ||
| return |
| action: 'toast', | ||
| match: (code) => code === LEGACY_ERROR_LEGACY_TOKEN_NOT_SUPPORTED, | ||
| message: '不再支持这种登陆方式,直接通过 LDAP 登录即可', | ||
| }, | ||
| { | ||
| action: 'toast', | ||
| match: (code) => code === ERROR_INVALID_REQUEST || code === ERROR_PARAMETER_ERROR, | ||
| message: (errorResponse) => `请求参数有误, ${errorResponse.msg}`, | ||
| }, |
| ``` | ||
|
|
||
| ```go | ||
| // 新错误码写这里 |
There was a problem hiding this comment.
按照会上说的,新增错误码应该谨慎或者不新增,现在覆盖的已经比较全面了
AkashiSensei
left a comment
There was a problem hiding this comment.
神,代码没细看,看看 Copilot 怎么说好了
|
该合这个了 |
…bility"" This reverts commit 241a3a7. Also scope diff forbidigo checks to internal/handler and internal/service only.
| // 1. errors.Mark 将 be 标记为 cause 的一部分,使得 errors.Is(err, be) 成立 | ||
| // 2. errors.WithStack 加上堆栈 | ||
| // 3. errors.WithMessage 加上新的描述 | ||
| err := errors.Mark(cause, be) |
There was a problem hiding this comment.
BizCode.Wrap uses cockroachdb/errors.Mark(cause, be), but resputil.HandleError later extracts the business error with errors.As(err, &bErr). The marker reference is not part of the unwrap/cause chain, so wrapped errors such as bizerr.Internal.DatabaseError.Wrap(...) are not recognized by bizerr.FromError and fall into handleUnexpectedError, returning HTTP 500 with code 50000 and message Internal Server Error instead of the intended biz code/message.
This affects the newly migrated wrapped paths, for example loadResourceMetadata returning DatabaseError.Wrap(...); it should preserve 50002 and the wrapped message, but currently loses both. It will be worse for future BadRequest/NotFound wraps because the HTTP status will also become 500.
Suggested fix: make the wrapped error explicitly carry *BizError in the unwrap chain or implement an As method on the wrapper, instead of relying on Mark for errors.As. Alternatively, change FromError and response extraction to use a single marker-aware mechanism that can recover both code and message. Please add a unit test like BadRequest.ParameterError.Wrap(errors.New("x"), "bad input") -> HandleError returns HTTP 400/code 40002/msg bad input.
| }, | ||
| { | ||
| action: 'toast', | ||
| match: (code) => code === ERROR_INVALID_REQUEST || code === ERROR_PARAMETER_ERROR, |
There was a problem hiding this comment.
This global policy treats code 40002 as the new ERROR_PARAMETER_ERROR, but 40002 is also still LEGACY_ERROR_BUSINESS_LOGIC_ERROR. That means legacy business-logic errors from endpoints not migrated in this PR are now shown as 请求参数有误, ... before the page-level handler gets a chance to handle them.
A concrete existing path is job creation billing precheck: backend/internal/handler/vcjob/vcjob.go still returns resputil.BusinessLogicError (40002) with a billing precheck blocked: message, and the frontend still has useJobCreateBillingBlock checking LEGACY_ERROR_BUSINESS_LOGIC_ERROR for that special dialog. With this policy, users will see an incorrect parameter-error toast in addition to the billing block handling.
Suggested fix: do not reuse 40002 for the new ParameterError while legacy BusinessLogicError is still emitted, or add a legacy-aware policy before this branch that preserves the old silent/delegate behavior for legacy business-logic errors. The safer option is to assign ParameterError a non-conflicting new code and keep 40002 reserved for the legacy meaning until all emitters are migrated.
|
@Cx330-502 我在代码附近留了两条需要优先处理的 inline review,都是运行时行为回归风险:
建议这两处修掉后再合。 |
| // Success 200 OK - 常规查询、修改成功 | ||
| func Success(c *gin.Context, data any) { | ||
| wrapResponse(c, "", data, OK) | ||
| emit(c, http.StatusOK, OK, "success", data) |
There was a problem hiding this comment.
This changes every successful response from msg: "" to msg: "success". That looks harmless for generic consumers, but some existing frontend success paths intentionally use response.msg || <localized/default message> so the backend can optionally provide a specific message while the frontend falls back to a localized business message.
For example, frontend/src/components/model/model-download-dialog.tsx uses const message = response.msg || defaultMessage and currently shows 已提交模型下载任务 / 已提交数据集下载任务 unless the backend returns a specific message. After this change, the generic success string always wins, so users will see success instead of the localized/business-specific toast for normal model/dataset download creation. Similar patterns exist in a few other consumers that treat empty msg as "no backend message".
Suggested fix: keep Success emitting an empty message for compatibility, or only set msg: "success" for endpoints that explicitly need it. If the response contract should change globally, update the frontend call sites to ignore generic success and keep their localized defaults.
|
@Cx330-502 我重新拉了最新 commit
GitHub 上 backend/frontend/storage 的 lint 和 build checks 也都过了。 剩下几个是非阻塞建议,不影响这次合入判断,可以后续 follow-up:
以上不阻塞合入。 |
背景
此前
refactor API error handling and sync frontend compatibility曾被误推到main后又被回滚。本 PR 以正常评审流程重新引入该重构(revert the revert),并补充 lint 作用域调整,保证迁移方向清晰且可落地。
主要变更
bizerr分组错误体系(400/401/403/404/405/409/413/429/500/502/503/504)resputil.HandleErrorresputil旧接口保留兼容(deprecated),支持平滑迁移markApiErrorHandled,避免页面已处理后再次全局重复 toastmake lint保持lint-full + lint-diff双阶段lint-diff中forbidigo作用域收敛为仅检查internal/handler与internal/service兼容性与风险
code/msg/data开发注意文档位置
backend/README.md中Error Handling Rulesbackend/README.zh-CN.md中错误处理规则frontend/README.md中Error Code Usagefrontend/README.zh-CN.md中错误码使用frontend/Makefile的generate-error-code