Fix: Resolve #155 by extracting original author and timestamp for re-uploaded entries #453
Open
ayushshukla1807 wants to merge 1 commit into hatnote:master from
…estamp for reuploaded images
Author
I am closing this PR to reduce repository noise. The core fixes relevant to my GSoC Proposal are being manually consolidated into PR #454 and PR #415 to make it substantially easier for the maintainers to review my code. The larger concepts discussed here will be implemented incrementally and manually if my proposal is accepted.
Author
I have stripped the AI formatting from the description and reopened this PR so I can manually improve its code over the coming days, fulfilling my promise.
Fixes #155 and #448.
Resolves the critical metadata flaw where the application attributed photo authorship to the most recent Commons re-uploader rather than the original photographer, directly compromising the integrity of Wiki Loves competition results.
Root Cause
The Commons API response for `revisions` returns the full edit history ordered by recency. The original `loaders.py` implementation blindly captured the first element of the `revisions` array, which is the most recent editor, not the original uploader. For photos that had been technically re-uploaded (format conversion, resolution fix, metadata correction), this meant competition coordinators were crediting the wrong person.
Technical Solution
Extended the WMF API query in `loaders.py` to request the complete revision history sorted in ascending chronological order. The parser now validates the timestamp index and takes the `user` field from the first element of the ascending list, which is the oldest revision, so the original uploader is captured regardless of subsequent edits.
Verification
Tested against known multi-editor WLM files. Original author extracted correctly in all cases.
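For reviewers, the approach described under Technical Solution can be sketched roughly as follows. This is a minimal illustration, not the code in `loaders.py`: the names `ASCENDING_REVISIONS_PARAMS`, `parse_timestamp`, and `original_uploader` are hypothetical, and the sample data is invented. It assumes each revision is a dict with `user` and `timestamp` keys, as the MediaWiki `prop=revisions` query returns when `rvprop=user|timestamp` is requested. The sketch selects the oldest revision by timestamp instead of by list position, so it does not depend on the sort order of the response.

```python
from datetime import datetime

# Hypothetical query parameters (an assumption, not copied from loaders.py).
# rvdir=newer asks the MediaWiki API to list the oldest revision first.
ASCENDING_REVISIONS_PARAMS = {
    "action": "query",
    "format": "json",
    "prop": "revisions",
    "rvprop": "user|timestamp",
    "rvdir": "newer",
    "rvlimit": "max",
}


def parse_timestamp(value):
    """Parse an ISO 8601 UTC timestamp such as 2012-09-14T17:05:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def original_uploader(revisions):
    """Return (user, timestamp) of the oldest revision in the list."""
    if not revisions:
        raise ValueError("no revisions returned for this page")
    # Pick by timestamp, not by index, so the result is the same
    # whether the list arrives newest-first or oldest-first.
    oldest = min(revisions, key=lambda rev: parse_timestamp(rev["timestamp"]))
    return oldest["user"], oldest["timestamp"]


# Invented sample data in newest-first order, as in the buggy case.
sample = [
    {"user": "ReUploader", "timestamp": "2015-03-02T08:30:00Z"},
    {"user": "Photographer", "timestamp": "2012-09-14T17:05:00Z"},
]

print(original_uploader(sample))  # ('Photographer', '2012-09-14T17:05:00Z')
```

Taking `sample[0]` here would return `ReUploader`, which is the behaviour described under Root Cause.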