Enforce data structure and data type consistency for JSON metadata#1421
Open
Implements JSON Schema for titles
jrhoads
approved these changes
Nov 11, 2025
…led vocabulary term because checking happens prior to save and the API adds null as the value.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/models/schemas/doi/title.json`:
- Around lines 6-8: The "title" schema currently only enforces type:string, so
empty or whitespace-only titles are allowed. Update the "title" property (in
app/models/schemas/doi/title.json) to require non-empty, non-blank values by
adding constraints such as "minLength": 1 and a "pattern" that requires at least
one non-whitespace character (e.g., `\S`, written as "\\S" inside the JSON
file), and ensure the existing required declaration that references "title"
still applies. Apply the same constraints to any other title occurrence
mentioned in the schema.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1c82f3ce-0cef-4020-b4ee-a276dd35c1ef
📒 Files selected for processing (12)
- app/models/schemas/doi/controlled_vocabularies/contributor_type.json
- app/models/schemas/doi/controlled_vocabularies/date_type.json
- app/models/schemas/doi/controlled_vocabularies/description_type.json
- app/models/schemas/doi/controlled_vocabularies/funder_identifier_type.json
- app/models/schemas/doi/controlled_vocabularies/name_type.json
- app/models/schemas/doi/controlled_vocabularies/number_type.json
- app/models/schemas/doi/controlled_vocabularies/related_identifier_type.json
- app/models/schemas/doi/controlled_vocabularies/related_item_type.json
- app/models/schemas/doi/controlled_vocabularies/relation_type.json
- app/models/schemas/doi/controlled_vocabularies/resource_type_general.json
- app/models/schemas/doi/controlled_vocabularies/title_type.json
- app/models/schemas/doi/title.json
🚧 Files skipped from review as they are similar to previous changes (8)
- app/models/schemas/doi/controlled_vocabularies/funder_identifier_type.json
- app/models/schemas/doi/controlled_vocabularies/resource_type_general.json
- app/models/schemas/doi/controlled_vocabularies/title_type.json
- app/models/schemas/doi/controlled_vocabularies/number_type.json
- app/models/schemas/doi/controlled_vocabularies/date_type.json
- app/models/schemas/doi/controlled_vocabularies/related_identifier_type.json
- app/models/schemas/doi/controlled_vocabularies/relation_type.json
- app/models/schemas/doi/controlled_vocabularies/contributor_type.json
Comment on lines +6 to +8
    "title": {
      "type": "string"
    },
Disallow empty/blank required titles.
Lines 6-8 only enforce `type: "string"`, so `""` (or a whitespace-only value) still passes validation even though lines 17-19 mark `title` as required. Please add content constraints.
Suggested schema patch
 "title": {
-  "type": "string"
+  "type": "string",
+  "minLength": 1,
+  "pattern": ".*\\S.*"
 },
Also applies to: 17-19
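To make the effect of the two suggested keywords concrete, here is a minimal Ruby sketch that mirrors them with plain string checks. It does not use a JSON Schema validator and is not code from this PR: `acceptable_title?`, `TITLE_MIN_LENGTH`, and `TITLE_PATTERN` are hypothetical names, and it assumes Ruby's `\S` behaves like the ECMA-262 `\S` for the ASCII inputs shown (the two can differ on some Unicode whitespace).

```ruby
# Hypothetical stand-in for the proposed schema keywords; not part of the PR.
TITLE_MIN_LENGTH = 1

# JSON Schema "pattern" is unanchored, so searching for a single \S accepts
# the same strings as the suggested ".*\\S.*".
# Assumption: Ruby's \S is ASCII-oriented, unlike ECMA-262's \S.
TITLE_PATTERN = /\S/

def acceptable_title?(value)
  return false unless value.is_a?(String)          # "type": "string"
  return false if value.length < TITLE_MIN_LENGTH  # "minLength": 1
  value.match?(TITLE_PATTERN)                      # at least one non-whitespace character
end

# Print the verdict for a few representative inputs.
["Dataset A", "", "   ", "\t\n", nil].each do |sample|
  puts "#{sample.inspect} => #{acceptable_title?(sample)}"
end
```

With these checks, `minLength` alone would still admit `"   "`; it is the pattern that rejects whitespace-only titles, which is why the suggestion includes both.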
…IdentifierType be required?
…'uri' for uris, use long form of dependentRequired (if)
…, xml xs:anyURI, dependencyRequired (documentation 'suggestions' vs xsd.
…, xml xs:anyURI, dependencyRequired (documentation 'suggestions' vs xsd. Also, just use default minitems in the array of objects instead of explicitly specifying minitems: 0
…lupo uses in check_language.
…1 date ranges, and the standard vocab for unknown information do validate date fields (as is in our documentation).
…ts.json. (Breaks spec/requests/repositories_spec.rb:458 otherwise).
…d accept string ''
Purpose
closes: https://github.com/datacite/product-backlog/issues/325
Approach
See #1341 for the approach
Open Questions and Pre-Merge TODOs
Learning
Types of changes
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Reviewer, please remember our guidelines:
Summary by CodeRabbit
New Features
Refactor
Chore
Tests