Fix PDF scraper #18

### Context

The PDF extraction logic does not understand PagerDuty's two-column layout. This leads to artifacts like the report author's name being spliced into parts of the document:

<img width="665" height="470" alt="Image" src="https://github.com/user-attachments/assets/b0d38194-dfb1-4663-8736-c9c210b34e97" />

### Proposal

I wonder whether we should just author our own reports, and provide some useful constructs for things like timelines, etc., rather than relying on PagerDuty output — I don't find the PagerDuty UI all that useful.

However, shorter term, we can probably parse this in separate passes:
1. Find the right-hand column with `extract_text_lines` and locate the `Owner of review process` marker to determine `x0`.
2. Find the bottom of the multi-column layout with `extract_text_lines` and locate `Timeline` to determine `y1`.
3. Parse `(0, 0, x0, y1)` as a single column (walk over `extract_text_lines` scoped to this bounding box, and regex match headings to delineate sections)
4. Parse the timeline as-is.
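
A rough sketch of steps 1 to 3, to make the idea concrete. It assumes each line is a dict with `text`, `x0`, `x1`, `top` and `bottom` keys, which is the shape I would expect from pdfplumber's `extract_text_lines` (the issue does not say which library we use, so treat that as an assumption). The helper names and the heading pattern are hypothetical. One caveat: if the extractor merges the left and right columns into a single line, the marker will not sit at the start of a line, and we would need character-level positions instead; this sketch raises an error in that case rather than guessing.

```python
import re

OWNER_MARKER = "Owner of review process"
TIMELINE_MARKER = "Timeline"


def find_line(lines, marker):
    """Return the first line dict whose text starts with `marker`, or None."""
    for line in lines:
        if line["text"].strip().startswith(marker):
            return line
    return None


def left_column_bbox(lines):
    """Steps 1 and 2: derive (0, 0, x0, y1) from the two layout markers."""
    owner = find_line(lines, OWNER_MARKER)
    timeline = find_line(lines, TIMELINE_MARKER)
    if owner is None or timeline is None:
        raise ValueError("layout markers not found")
    # x0 is the left edge of the right-hand column,
    # y1 is the top of the Timeline heading.
    return (0, 0, owner["x0"], timeline["top"])


def lines_in_bbox(lines, bbox):
    """Keep lines that lie fully inside bbox.

    Stand-in for re-extracting lines from a page cropped to bbox.
    """
    x0, top, x1, bottom = bbox
    return [
        line for line in lines
        if line["x0"] >= x0 and line["x1"] <= x1
        and line["top"] >= top and line["bottom"] <= bottom
    ]


def split_sections(lines, heading_re):
    """Step 3: group line texts under the most recent matching heading."""
    sections = {}
    current = None
    for line in lines:
        text = line["text"].strip()
        if heading_re.fullmatch(text):
            current = text
            sections[current] = []
        elif current is not None:
            sections[current].append(text)
    return sections
```

With pdfplumber, I would expect `lines` to come from `page.extract_text_lines()`, and the left column to be re-extracted with something like `page.crop(bbox).extract_text_lines()` rather than filtered, so that merged lines get split at the column edge. The heading regex passed to `split_sections` would need to list the section titles that actually appear in the reports.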

### Updates and actions

_No response_
