Skip to content

Improve Visio authoring safety and round-trip fidelity#1756

Merged
PrzemyslawKlys merged 13 commits intomasterfrom
codex/visio-bugfixes-worktree
Apr 8, 2026
Merged

Improve Visio authoring safety and round-trip fidelity#1756
PrzemyslawKlys merged 13 commits intomasterfrom
codex/visio-bugfixes-worktree

Conversation

@PrzemyslawKlys
Copy link
Copy Markdown
Member

Summary

  • harden Visio authoring flows so page-managed IDs, hierarchy edits, connector reconnects, and save-time validation behave deterministically
  • preserve significantly more Visio-authored XML during load/save round-trips across document, page, master, shape, connector, and style metadata
  • expand Visio regression coverage for grouping, connector retargeting, validation, and fidelity scenarios

Why

The Visio layer still had a few authoring paths that could bypass page-managed state and produce invalid or inconsistent IDs, hierarchy links, or connector behavior. It also rebuilt several parts of Visio-authored XML too aggressively, which caused loss of document fidelity on round-trip even when OfficeIMO did not intend to change that metadata.

Impact

This makes OfficeIMO.Visio safer for programmatic authoring and much closer to the larger OfficeIMO goal of reliable editing without dropping existing document metadata. It improves compatibility with Visio-authored files, preserves more native behavior/layout metadata, and adds first-class APIs for grouped-shape and connector maintenance workflows.

Validation

  • dotnet test OfficeIMO.Tests/OfficeIMO.Tests.csproj -f net8.0 --filter "FullyQualifiedName~VisioLoadFidelity"
  • dotnet test OfficeIMO.Tests/OfficeIMO.Tests.csproj -f net8.0 --filter "Visio"

@claude
Copy link
Copy Markdown

claude bot commented Apr 8, 2026

Claude finished @PrzemyslawKlys's task —— View job


Pull Request Review

I've completed my comprehensive review of this significant Visio authoring safety and fidelity improvement PR. Here are my findings:

Summary

This PR implements substantial improvements to OfficeIMO's Visio processing capabilities, focusing on preserving XML fidelity during load/save operations and hardening shape hierarchy management. The changes are well-architected and address important round-trip fidelity issues.

Code Quality Assessment

Strengths:

  • Comprehensive preservation logic: Excellent implementation of XML element/attribute preservation across document structure (VisioDocument.LoadCore.cs:42-114, VisioDocument.SaveCore.cs:164-183)
  • Robust validation: Strong shape hierarchy validation preventing cycles and orphaned references (VisioPage.cs:941-973, VisioShape.cs:296-318)
  • Improved connector management: Enhanced connector endpoint handling with proper glue preservation (VisioPage.cs:1026-1099)
  • Systematic testing: Extensive load/save fidelity tests validating XML round-trip behavior (Visio.LoadFidelity.cs)

Code Organization:

  • Clean separation between load and save logic
  • Well-structured preservation data classes
  • Consistent error handling patterns
  • Good use of internal visibility for implementation details

Critical Issues to Address

1. Shape Indexer Safety Issue (VisioShape.cs:392-409)
The shape indexer has a race condition where the existing child is detached before validation, potentially leaving inconsistent state if PrepareChildForParent throws:

// Current problematic code
_owner.PrepareChildForParent(value);
try {
    _owner._children[index] = value;
} catch {
    _owner.DetachChild(value); // Old child already detached!
    throw;
}
_owner.DetachChild(existing);

Recommendation: Validate first, then swap atomically.

2. Connector Ordering Issue (Per Codex review)
The connector indexer lacks duplicate/self-assignment guards, which could corrupt connector collections during reordering operations.

3. Missing Page Cell Filter (Per Codex review)
ShouldPreservePageCell only excludes 5 cell names, but SaveCore generates many additional PageSheet cells, potentially causing duplicate Cell entries.

Performance & Memory Considerations

Positive:

  • Efficient ID generation using HashSet lookup (VisioPage.cs:340-374)
  • Lazy XML element cloning only when needed
  • Smart delta-only serialization for master-backed shapes

Areas for Optimization:

  • Large preservation collections could benefit from capacity hints
  • Consider pooling XElement instances for frequently cloned elements
  • Shape hierarchy traversal could use iterative approach for deep nesting

Security Assessment

No security concerns identified. The code:

  • Properly validates XML namespace handling
  • Uses safe XElement parsing
  • Implements appropriate input validation
  • No unsafe operations or external dependencies

Test Coverage Analysis

Excellent coverage of:

  • Load/save fidelity scenarios (84% patch coverage per Codecov)
  • Shape hierarchy operations
  • Connector retargeting
  • XML round-trip preservation

Missing coverage areas (227 lines uncovered):

  • Error handling paths in preservation logic
  • Edge cases in master reference resolution
  • Validation failure scenarios

Automated Review Response

The Codex reviews identified several legitimate issues:

  1. Same-parent reordering bug: Forward moves calculate wrong insertion index
  2. Master metadata source selection: Risk of using generated master for preserved root metadata
  3. Stale formula preservation: Data row updates don't clear old formulas
  4. Style sheet base attribute exclusion: Inconsistent filtering for ID "0"

These should be addressed in a follow-up commit.

Final Recommendations

Approve with follow-up required - This is high-quality work that significantly improves Visio fidelity, but the shape indexer race condition should be fixed promptly.

Priority fixes:

  1. Fix shape indexer atomic update pattern
  2. Address connector duplicate assignment guards
  3. Expand page cell exclusion filter
  4. Review same-parent reordering logic

Future enhancements:

  • Consider performance optimizations for large documents
  • Add more edge case test coverage
  • Document preservation strategy for future maintainers

The architectural improvements and comprehensive preservation logic make this a valuable contribution to OfficeIMO's capabilities.


Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5ffca9d5e4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +890 to +891
if (childIndex < 0 || childIndex == currentIndex) {
return;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Honor append index when reparenting within same parent

The ReparentShape contract says childIndex = -1 should append, but in the same-parent branch this path returns immediately, so calling ReparentShape(child, parent, -1) does nothing instead of moving the child to the end. This breaks the documented behavior and prevents callers from reordering children within a group using the advertised append mode.

Useful? React with 👍 / 👎.

Comment on lines +384 to +385
_owner.DetachChild(existing);
_owner.PrepareChildForParent(value);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve parent links when child replacement throws

The indexer detaches the existing child before validating the replacement, so if PrepareChildForParent throws (for example, replacement already has another parent or would create a cycle), the old child remains in _children but its Parent is now null. That leaves the hierarchy internally inconsistent and can later trigger validation/save failures after what should have been a rejected no-op update.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e677ce6b5e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1212 to +1214
set {
_page.PrepareConnectorForPage(value, _page._connectors[index]);
_page._connectors[index] = value;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Guard connector indexer against duplicate/self assignments

The connector indexer setter writes value directly into the slot after PrepareConnectorForPage without checking whether value is already stored elsewhere (or is the same instance already at this index). If callers try to reorder connectors with index assignments (for example swapping two entries), this can duplicate object references in _connectors and also mutate automatic IDs during the assignment, leaving duplicate connector IDs and causing subsequent validation/save failures. The setter should mirror ShapeCollection behavior by short-circuiting same-reference updates and rejecting connectors already present at another index.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8169e10bf1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +313 to +315
VisioMaster firstMaster = masters[0].Master;
WritePreservedAttributes(writer, firstMaster.PreservedMastersRootAttributes);
WritePreservedElements(writer, firstMaster.PreservedMastersRootElements);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Select preserved master before writing Masters root metadata

The <Masters> root-level preserved attributes/elements are taken only from masters[0].Master, so mixed documents can lose fidelity: if the first emitted master is a newly generated one (for example a dynamic-connector master from an earlier page) and a loaded master with preserved root metadata appears later, those preserved root fragments are never written back. Since load stores this metadata on loaded masters, writing from index 0 can silently strip masters.xml root extensions on save.

Useful? React with 👍 / 👎.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 8, 2026

Codecov Report

❌ Patch coverage is 84.73436% with 227 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.05%. Comparing base (e6b0074) to head (e79fbec).
⚠️ Report is 17 commits behind head on master.

Files with missing lines Patch % Lines
OfficeIMO.Visio/VisioPage.cs 67.04% 58 Missing and 30 partials ⚠️
OfficeIMO.Visio/VisioDocument.SaveCore.cs 88.46% 33 Missing and 22 partials ⚠️
OfficeIMO.Visio/VisioDocument.LoadCore.cs 91.86% 9 Missing and 31 partials ⚠️
OfficeIMO.Visio/VisioDocument.SaveHelpers.cs 74.35% 4 Missing and 16 partials ⚠️
OfficeIMO.Visio/VisioShape.cs 77.10% 16 Missing and 3 partials ⚠️
OfficeIMO.Visio/VisioDocument.Validation.cs 78.26% 3 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1756      +/-   ##
==========================================
+ Coverage   69.68%   70.05%   +0.37%     
==========================================
  Files         983      988       +5     
  Lines      115229   117875    +2646     
  Branches    23598    24294     +696     
==========================================
+ Hits        80300    82583    +2283     
- Misses      24839    24983     +144     
- Partials    10090    10309     +219     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e79fbec249

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1316 to +1321
return !string.IsNullOrWhiteSpace(cellName) &&
!string.Equals(cellName, "PageWidth", StringComparison.OrdinalIgnoreCase) &&
!string.Equals(cellName, "PageHeight", StringComparison.OrdinalIgnoreCase) &&
!string.Equals(cellName, "PageScale", StringComparison.OrdinalIgnoreCase) &&
!string.Equals(cellName, "DrawingScale", StringComparison.OrdinalIgnoreCase) &&
!string.Equals(cellName, "InhibitSnap", StringComparison.OrdinalIgnoreCase);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Exclude generated PageSheet cells from preservation

The new ShouldPreservePageCell filter only skips five names, but VisioDocument.SaveCore always regenerates many additional PageSheet cells (for example DrawingSizeType, PageLockReplace, UIVisibility, DrawingResizeType, PageShapeSplit). In a load→save round-trip, those cells are preserved and then written again via WritePreservedElements, producing duplicate Cell entries with the same N value in the same PageSheet. This can create conflicting page metadata and unstable Visio behavior for otherwise unchanged documents.

Useful? React with 👍 / 👎.

new XAttribute("N", "Value"));
clone.Add(valueCell);
}
valueCell.SetAttributeValue("V", value);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Clear stale formulas when updating preserved data rows

When a shape data row is preserved, WriteDataSection clones the original row and only updates the Value cell's V attribute. If that preserved Value cell carries a formula (F), the stale formula is kept, so Visio can recompute from F and ignore the new V value; user updates to shape.Data[...] then fail to persist after reopen. Updating a modeled value should also remove or rewrite formula attributes on the Value cell.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ef677623b8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +967 to +968
if (childIndex > currentIndex) {
childIndex--;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Honor target index when reordering siblings forward

When reparenting within the same parent, decrementing childIndex after RemoveAt shifts forward moves one slot too early. For example, with children [A,B,C], calling ReparentShape(B, parent, 2) should place B at index 2 ([A,C,B]), but this logic decrements to index 1 and leaves order effectively unchanged. Since the API documents childIndex as the insertion index in the target group, this breaks same-parent reordering for non-append forward moves.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 40dc5ee6b1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1575 to +1579
if ((string.Equals(styleSheetId, "1", StringComparison.Ordinal) || string.Equals(styleSheetId, "2", StringComparison.Ordinal)) &&
(string.Equals(localName, "BasedOn", StringComparison.OrdinalIgnoreCase) ||
string.Equals(localName, "LineStyle", StringComparison.OrdinalIgnoreCase) ||
string.Equals(localName, "FillStyle", StringComparison.OrdinalIgnoreCase) ||
string.Equals(localName, "TextStyle", StringComparison.OrdinalIgnoreCase))) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Exclude base style attrs when preserving style sheet 0

The style-sheet preservation filter currently excludes BasedOn/LineStyle/FillStyle/TextStyle only for IDs 1 and 2, so ID 0 preserves those attributes. On save, CreateGeneratedStyleSheet("0", ...) already creates the same attributes and then appends preserved ones again, which produces duplicate attribute names on the same StyleSheet element and can throw during save (including load→save of OfficeIMO-generated files).

Useful? React with 👍 / 👎.

Comment on lines +485 to +487
if (entry.Shape != null && page.Shapes.Contains(entry.Shape)) {
WriteShapeElement(writer, ns, entry.Shape, persistedIds, pageMasters, masters);
emittedShapes.Add(entry.Shape);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Honor runtime page shape order instead of stale preserved order

When a page was loaded from disk, save emits each preserved shape entry as long as that shape is still present, without consulting its current index in page.Shapes. This means reordering existing shapes/connectors in memory after load (e.g., remove/insert to change z-order) is ignored because the old PreservedShapesChildren order wins, so authoring changes to stacking/order do not persist.

Useful? React with 👍 / 👎.

@PrzemyslawKlys PrzemyslawKlys merged commit 5baa169 into master Apr 8, 2026
10 checks passed
@PrzemyslawKlys PrzemyslawKlys deleted the codex/visio-bugfixes-worktree branch April 8, 2026 20:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant