Generates a formatted RTF publication list for a tenure packet, driven by two files:
- A BibTeX file exported from Google Scholar (publications + patents).
- A YAML side file for everything Scholar doesn't track — invited talks, leadership roles, media appearances, conference presentations, grants, students, CVEs, security disclosures, key works.
Output is ready to paste into Word with formatting preserved: bold for you, superscripts for student / advisor / last-author roles, italic venue + tier, clickable hyperlinks to DOIs / NVD / USPTO / publisher pages, hanging-indent layout, RTF tables for patents + student lists.
- BibTeX is the canonical publication store. This repo assumes you are using Google Scholar as the master reference for your publications, and that you use the "James Davis (TM)" venue notation: prefix every journal / booktitle with [ACRONYM'YY]. The acronyms are required because Purdue wants to see venue rankings in the packet, and the acronym is what we look up in assets/config.yaml.
- Everything else lives in non-scholar-work.yaml. Scholar can't carry talks, grants, students, CVEs, etc., so we keep them in one hand-edited YAML file.
- Cross-references are automatic. CVE entries link back to the originating paper ("(see C.4.7)"); Key Works cite the paper's canonical location ("(listed as C.4.7)"); the conference-presentation section name-drops the linked paper's section ref; student tables auto-populate the "Related Publications" column from the bib.
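As a rough illustration of how such cross-references can be resolved, the sketch below numbers one section's entries after sorting and maps each title to its section ref. The names normalize_title and build_xref_index are hypothetical and are not taken from the package; sorting on a plain year field is also an assumption.

```python
import re


def normalize_title(title: str) -> str:
    """Case- and whitespace-insensitive key for matching titles."""
    return re.sub(r"\s+", " ", title).strip().casefold()


def build_xref_index(section_code: str, entries: list[dict]) -> dict[str, str]:
    """Map each normalized title to its numbered ref, e.g. 'C.4.2'.

    Entries are numbered after sorting oldest first, so the refs match
    the order in which the section would be rendered.
    """
    ordered = sorted(entries, key=lambda e: e["year"])
    return {
        normalize_title(e["title"]): f"{section_code}.{i}"
        for i, e in enumerate(ordered, start=1)
    }


papers = [
    {"title": "A Newer Paper", "year": 2025},
    {"title": "An Older  Paper", "year": 2021},
]
index = build_xref_index("C.4", papers)
ref = index[normalize_title("an older paper")]
print(f"(see {ref})")
```

Building the index only after the sort is what keeps a back-pointer such as "(see C.4.N)" in step with the rendered numbering.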
Tip: A GenAI tool (Codex, Claude, etc.) can help you populate non-scholar-work.yaml from your CV. Just give it screenshots of each section.
./setup.sh # one-time: venv + editable install + dev extras
./pubs-emitter.py --bib my_papers.bib \
  --non-scholar non-scholar-work.yaml
Or, once setup.sh has run, the shorter form works from anywhere in the venv:
source .venv/bin/activate
pubs-emitter --bib my_papers.bib --non-scholar non-scholar-work.yaml
This writes publications.rtf. Open it in Word, then copy the contents and use "Paste Special → Unformatted Text" to bring them into your tenure-packet template if you want the text to inherit the host doc's font.
publications/
├── pubs-emitter.py # root driver (delegates to src/pubs_emitter/cli.py)
├── setup.sh # one-command bootstrap (venv + editable install)
├── pyproject.toml # build + pylint + mypy + pytest config + project deps
├── README.md
├── CLAUDE.md # editor-facing notes; non-derivable rules + pitfalls
├── .gitignore
├── assets/
│   ├── config.example.yaml # committed schema + starter venue rankings
│   └── config.yaml # ME, ADVISORS, STUDENTS, RANKS (gitignored; your data)
├── src/
│   └── pubs_emitter/
│       ├── __init__.py
│       ├── types.py # NamedTuples + type aliases
│       ├── config.py # loads assets/config.yaml + code-side constants
│       ├── latex.py # decode_latex + rtf_escape_unicode
│       ├── db.py # SQLite cache + LOOKUP_STATS
│       ├── network.py # RateLimiter, polite_get, try_{crossref,dblp,nvd,patentsview}
│       ├── authors.py # name parsing + format_author / format_inventors
│       ├── venue.py # parse_venue, lookup_rank, classify_entry, ID extractors
│       ├── lookup.py # plan / dispatch / commit + cache-aware fetchers
│       ├── builders.py # build_* + load_non_scholar + validate_non_scholar
│       ├── rtf.py # RtfTable, render_*, write_rtf
│       └── cli.py # parse_args + main()
└── tests/ # pytest; ~190 tests, full coverage of the public API
    ├── conftest.py
    ├── fixtures/
    │   ├── config.yaml
    │   ├── sample.bib
    │   └── non-scholar.yaml
    └── test_*.py
The bib (my_papers.bib), the YAML (non-scholar-work.yaml), the live
user config (assets/config.yaml), the cache (lookup_cache.sqlite),
and the output (publications.rtf) are all gitignored. Only the example
config (assets/config.example.yaml) is committed.
main() runs five phases:
1. plan_lookups: walks every entry + CVE, emits NetworkTasks for cache misses only.
2. dispatch_parallel: runs all tasks concurrently via ThreadPoolExecutor. Each host has its own RateLimiter (Crossref ~10/s, DBLP ~5/s, NVD 5/30s without key or 50/30s with, PatentsView ~2/s). Retries transient HTTP failures with exponential backoff (1s, 2s, 4s) and honors Retry-After headers.
3. commit_results: persists results into the appropriate cache table (doi_cache, patent_cache, cve_cache).
4. build_*: assembles Citation / Patent / KeyWork / InvitedTalk / LeadershipRole / MediaAppearance / ConferencePresentation / Grant / Student records. By now the cache is warm; build is local-only.
5. write_rtf: emits the final RTF, sections rendered in SECTION_ORDER, with cross-reference indexes built between sort and render.
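The retry behavior in the dispatch phase can be sketched roughly as follows. This is a minimal illustration, not the package's code: TransientError and fetch_with_retry are hypothetical names, and letting a server-supplied Retry-After value replace the backoff step is an assumption about what "honors" means here.

```python
import time
from typing import Callable, Optional


class TransientError(Exception):
    """Hypothetical stand-in for a retryable HTTP failure (e.g. 429 or 503)."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("transient failure")
        self.retry_after = retry_after


def fetch_with_retry(
    fetch: Callable[[], str],
    delays: tuple[float, ...] = (1.0, 2.0, 4.0),
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(); on TransientError wait and retry, once per delay step."""
    for delay in delays:
        try:
            return fetch()
        except TransientError as err:
            # Assumption: a Retry-After value replaces the backoff step.
            sleep(err.retry_after if err.retry_after is not None else delay)
    return fetch()  # final attempt; any error now propagates to the caller


# Demo with a fake fetcher and a recording "sleep", so nothing really waits.
waits: list[float] = []
attempts = {"n": 0}


def flaky() -> str:
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise TransientError()  # no header: use the 1s step
    if attempts["n"] == 2:
        raise TransientError(retry_after=7.0)  # header replaces the 2s step
    return "ok"


result = fetch_with_retry(flaky, sleep=waits.append)
print(result, waits)
```

Passing sleep in as a parameter keeps the sketch testable without real delays, in the same spirit as the suite's no-network rule.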
- Citations: journal / booktitle MUST begin with a bracketed acronym + year tag. The acronym is looked up in assets/config.yaml under ranks:. Examples:
  journal = {[JSS'25] The Journal of Systems and Software}
  booktitle = {[ICSE'25] Proceedings of the International Conference on Software Engineering}
  journal = {[arXiv'26] arXiv preprint arXiv:2605.10712}
- Patents: @misc whose publisher or note contains "patent". note = {US Patent 11,176,090} carries the number; USPTO date lookup is attempted via PatentsView when PATENTSVIEW_API_KEY is set.
- Book chapters: @incollection or @inbook. No bracket-tag needed (no venue field for it). DOI / URL comes from manual_links: in assets/config.yaml.
- Theses: @phdthesis / @mastersthesis. Built internally but not emitted in any section yet; held for future cross-references.
- CVEs are NOT in the bib. Bib stays Scholar-canonical. CVEs go in non-scholar-work.yaml.
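To make the bracket-tag and patent-note conventions concrete, here is a small sketch of how such fields could be parsed. The helpers parse_venue_tag and extract_patent_number and both regular expressions are illustrative assumptions; the package's own parse_venue and ID extractors may behave differently.

```python
import re
from typing import NamedTuple, Optional


class VenueTag(NamedTuple):
    acronym: str
    year_suffix: str  # two-digit year exactly as written in the tag
    name: str         # remainder of the field after the tag


# [ACRONYM'YY] at the very start of the field, then the venue name.
_TAG_RE = re.compile(r"^\[(?P<acronym>[^'\]]+)'(?P<yy>\d{2})\]\s*(?P<name>.*)$")

# Patent numbers such as "US Patent 11,176,090" in the note field.
_PATENT_RE = re.compile(r"US Patent\s+([\d,]+)")


def parse_venue_tag(field: str) -> Optional[VenueTag]:
    """Return the parsed tag, or None when the field has no leading tag."""
    m = _TAG_RE.match(field.strip())
    if m is None:
        return None
    return VenueTag(m["acronym"], m["yy"], m["name"])


def extract_patent_number(note: str) -> Optional[str]:
    """Return the patent number with commas stripped, or None."""
    m = _PATENT_RE.search(note)
    return m.group(1).replace(",", "") if m else None


tag = parse_venue_tag("[JSS'25] The Journal of Systems and Software")
number = extract_patent_number("US Patent 11,176,090")
print(tag)
print(number)
```

A field without a leading tag yields None, which is the point at which a real pipeline would presumably flag the entry as missing its acronym.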
The renderer always emits sections in this order (skipped if the corresponding YAML / bib source is empty):
| Code | Heading | Source |
|---|---|---|
| C.1 | Key Scholarly Publications or Patents | key_works: in YAML |
| C.2 | Journals | bib @article |
| C.3 | Books and chapters in books | bib @incollection / @inbook |
| C.4 | Conferences and Workshops | bib @inproceedings |
| C.5 | Other publications and products | bib arXiv + magazines + YAML cves: + security_disclosures: |
| C.6 | Invited Talks | invited_talks: |
| C.7 | Leadership Roles | leadership_roles: |
| C.8 | Media Appearances | media_appearances: |
| C.9 | Conference Presentations | conference_presentations: |
| C.10 | Externally sponsored grants as PI | grants_as_pi: |
| C.11 | Externally sponsored grants as Co-PI or Co-I | grants_as_co_pi: |
| C.12 | External gifts and voluntary support | gifts: |
| C.13 | Internal competitive grants as PI or Co-PI | internal_grants: |
| C.14 | Graduate students advised | graduate_students: (RTF table) |
| C.16 | Undergraduate students advised | undergraduate_students: (RTF table) |
| C.19 | Issued U.S. and International Patents | bib @misc with note = US Patent ... (RTF table) |
| C.23 | Service to Purdue | university_service: |
| C.24 | Service to the profession through professional societies | profession_service: |
| C.25 | Service to State, Nation, or International Organizations | national_service: |
| C.26 | Other external service activities to the profession not noted above | other_service: |
Within each section, entries are sorted chronologically (oldest first).
For peer-reviewed venues (C.2, C.4), the tier marker renders inline as
Venue rank: Tier 1 / Tier 2 / etc. — italicized + underlined. Other
sections use a bare label (Preprint, CVE, Magazine).
Grant sections C.10–C.13 each render a Total amount of ...: $X,XXX,XXX
line above the numbered list.
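The ordering and the dollar formatting of the total can be illustrated with a few lines of Python. This is only a sketch: format_usd is a hypothetical helper, the grant records are made up, and sorting on start_year is an assumption about which field drives the chronological order.

```python
def format_usd(amount: int) -> str:
    """Render a whole-dollar amount with thousands separators."""
    return f"${amount:,}"


grants = [
    {"title": "Later award", "start_year": 2025, "amount": 600000},
    {"title": "Earlier award", "start_year": 2022, "amount": 1250000},
]

ordered = sorted(grants, key=lambda g: g["start_year"])  # oldest first
total = format_usd(sum(g["amount"] for g in grants))

print(total)
print([g["title"] for g in ordered])
```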
The YAML file is one big map; each top-level key drives one or more sections. Every section's schema is validated at load time; ALL errors are batched into one report (no first-fail crashes).
key_works:
  - paper_title: Engineering Patterns for Trust and Safety on Social Media Platforms
    impact: |
      A 100-word-ish impact statement on why this paper matters.
cves:
  # CVE attached to a paper. C.5 entry inherits the paper's author list
  # (with student markers) and gets a "(see C.4.N)" back-pointer.
  - cve_id: CVE-2024-38373
    organization: FreeRTOS # optional; auto-derived from NVD's CPE
    paper_title: Engineering Patterns for Trust and Safety on Social Media Platforms
  # Stand-alone CVE: no associated paper. Provide the disclosers explicitly.
  - cve_id: CVE-2025-1675
    organization: Zephyr-RTOS
    disclosers:
      - Davis, James C
      - Amusuo, Paschal C
security_disclosures:
  # Vendor-acknowledged disclosure with no CVE assigned. Always linked to a paper.
  - paper_title: Engineering Patterns for Trust and Safety on Social Media Platforms
    vendor: VendorName
    description: One-sentence description of the disclosure.
    year: 2024
invited_talks:
  - topic: Regular Expression Denial of Service
    subtitle: "" # optional
    venue: Some University, City
    year: 2024
leadership_roles:
  - role: Co-Chair
    description: The 12th International Workshop on X
    society: ACM SIGSOFT
    year: 2023
media_appearances:
  - title: How software supply chains break
    venue: Some Podcast
    year: 2024
    url: "https://example.com/episode"
conference_presentations:
  # Links back to the bib via paper_title. Renders as
  # "Talk at <venue> in <year>. Associated with publication C.4.N."
  - paper_title: Engineering Patterns for Trust and Safety on Social Media Platforms
grants_as_pi:
  - title: NSF CAREER PROJECT
    agency: US National Science Foundation
    agency_short: NSF
    grant_number: "2541917"
    role: PI
    start_year: 2025
    end_year: 2030
    amount: 600000
    activities: Optional multi-sentence description.
    responsibility: Optional free-text role/percent statement.
    # Validated against the bib (titles must match) but NOT rendered in
    # C.10. Reserved for C.1 Key Works cross-link wiring (TODO).
    inspired_by: []
    publication_outcomes: []
grants_as_co_pi: []
gifts: []
internal_grants: []
graduate_students:
  - name: Paschal C. Amusuo
    degree: PhD # PhD / DEng / MS-Thesis / MS-Non-Thesis
    role: Chair # Chair / Co-Chair / Committee member
    grad_year: 2025 # 9999 for ongoing; sorts to bottom
    graduation: "2025 Spring" # display label
    position: Software Engineer @ Some Co. # optional
undergraduate_students: []
# C.23-C.26 service entries. Same shape across all four sections:
# a free-form description (full role + venue string) and an optional year.
# Year may be int (2025), string for multi-year ("2025, 2026, 2027"), or
# range ("2024-2025" / "2023-present"). Omit `year:` entirely for ongoing
# service with no fixed date (typically journal reviewing); the renderer
# emits description only and sorts such entries to the bottom.
university_service:
  - description: "Member, Purdue ECE ABET Committee"
    year: "2023-present"
profession_service: # conference PC roles
  - description: "PC Member, ICSE"
    year: "2025, 2026, 2027"
national_service:
  - description: "US National Science Foundation, Panelist"
    year: 2025
other_service:
  - description: "Reviewer, IEEE Transactions on Software Engineering (TSE)" # journal, no year
  - description: "PC Member, EuroSec"
    year: "2024, 2025, 2026"
Validation rules (all checked at YAML load time, batched into one report):
- cves[].cve_id is required; must look like CVE-YYYY-NNNN.
- Each cves[] entry must have at least one of paper_title or disclosers.
- Every paper_title across cves / security_disclosures / conference_presentations / key_works AND every title in each grant's inspired_by / publication_outcomes MUST match a bib entry (case + whitespace-insensitive).
- Grants must have title, agency, role, start_year, end_year, amount. agency_short is optional: when present it prefixes the bolded head line as {agency_short}[ #grant_number]: {title}; when empty the head renders as just the title (use this for fellowships and internal Purdue programs that don't carry a canonical funder-name prefix).
- Students must have name, degree, role, grad_year.
- Service entries (C.23–C.26) must have description. year is optional: when omitted, only the description is rendered (used for journal reviewing).
- Invited talks must have venue, year, and at least one of topic / subtitle.
A failed validation prints every error and exits 1 — fix them all and re-run.
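The batching behavior can be sketched for the cves: rules alone. Everything below is an illustrative assumption rather than the package's validate_non_scholar: the function name validate_cves, the error wording, and the exact CVE pattern (four-digit year plus four or more sequence digits).

```python
import re

# Assumed shape of a CVE identifier: CVE-YYYY-NNNN with 4+ sequence digits.
CVE_RE = re.compile(r"^CVE-\d{4}-\d{4,}$")


def _norm(title: str) -> str:
    """Case- and whitespace-insensitive comparison key."""
    return " ".join(title.split()).casefold()


def validate_cves(cves: list[dict], bib_titles: set[str]) -> list[str]:
    """Collect every problem instead of stopping at the first one."""
    known = {_norm(t) for t in bib_titles}
    errors: list[str] = []
    for i, entry in enumerate(cves):
        where = f"cves[{i}]"
        cve_id = entry.get("cve_id")
        if not cve_id:
            errors.append(f"{where}: cve_id is required")
        elif not CVE_RE.match(cve_id):
            errors.append(f"{where}: cve_id {cve_id!r} does not look like CVE-YYYY-NNNN")
        title = entry.get("paper_title")
        if not title and not entry.get("disclosers"):
            errors.append(f"{where}: need paper_title or disclosers")
        if title and _norm(title) not in known:
            errors.append(f"{where}: paper_title not found in bib: {title!r}")
    return errors


bib = {"Engineering Patterns for Trust and Safety on Social Media Platforms"}
entries = [
    # Matches the bib despite different case and spacing.
    {"cve_id": "CVE-2024-38373",
     "paper_title": "engineering patterns for  trust and safety on social media platforms"},
    {"cve_id": "CVE-24-1"},  # malformed id, and neither paper_title nor disclosers
    {"cve_id": "CVE-2025-1675", "paper_title": "Not In The Bib"},
]
errors = validate_cves(entries, bib)
for line in errors:
    print(line)
# A caller would exit with status 1 here if errors is non-empty.
```

Returning a list rather than raising on the first failure is what allows one run to report every problem at once.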
| Variable | Effect |
|---|---|
| LOG_LEVEL | DEBUG / INFO (default) / WARNING / ERROR |
| PATENTSVIEW_API_KEY | Enables USPTO issue-date lookup for patents |
| NVD_API_KEY | Raises NVD rate limit ~10x (5/30s → 50/30s) |
| PUBS_EMITTER_CONFIG | Override path to assets/config.yaml |
| PUBS_EMITTER_USER_AGENT | Override the HTTP User-Agent (mailto:) |
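One way to picture how these variables feed the program is a small settings loader. The Settings tuple and load_settings function are hypothetical names for this sketch; only the variable names and the INFO and assets/config.yaml defaults come from the table above.

```python
import os
from typing import Mapping, NamedTuple, Optional


class Settings(NamedTuple):
    log_level: str
    config_path: str
    patentsview_api_key: Optional[str]
    nvd_api_key: Optional[str]
    user_agent: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, treating unset or empty as absent."""
    return Settings(
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        config_path=env.get("PUBS_EMITTER_CONFIG", "assets/config.yaml"),
        patentsview_api_key=env.get("PATENTSVIEW_API_KEY") or None,
        nvd_api_key=env.get("NVD_API_KEY") or None,
        user_agent=env.get("PUBS_EMITTER_USER_AGENT") or None,
    )


defaults = load_settings({})
custom = load_settings({"LOG_LEVEL": "debug", "NVD_API_KEY": "example-key"})
print(defaults)
print(custom.log_level, custom.nvd_api_key)
```

Taking the environment as a parameter makes it easy to exercise the loader with a plain dict instead of mutating os.environ.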
.venv/bin/pylint src/pubs_emitter # 9.95/10 baseline
.venv/bin/mypy # type-clean
.venv/bin/pytest # ~190 tests, sub-second
Pylint, mypy, and pytest configs all live in pyproject.toml. The
pylint config disables the noisy style/code-org categories
(missing-docstring, too-many-arguments, etc.) but keeps every
substantive check (unused imports, broad except, dangerous default
value, etc.).
The test suite is structured as one file per package module
(test_latex.py, test_venue.py, …) plus test_e2e.py which drives
cli.main end-to-end with the three network entry points monkey-patched.
No real HTTP, no real network. Fixtures live in tests/fixtures/:
- config.yaml: minimal config used by all tests (loaded via PUBS_EMITTER_CONFIG set in conftest.py).
- sample.bib: one of each entry kind (article / inproceedings / arXiv / patent / incollection / phdthesis), including a Çakar coauthor for Unicode-escape coverage.
- non-scholar.yaml: one entry under every YAML key.
Run a single file with pytest tests/test_latex.py; a single class with
pytest tests/test_rtf.py::TestRtfTable; a single test with pytest tests/test_e2e.py::TestE2eContentInvariants::test_unicode_round_trip_through_rtf_escape.
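For context on what a Unicode round-trip test has to exercise, here is a minimal sketch of RTF escaping. It is not the package's rtf_escape_unicode; the function name rtf_escape and the "?" fallback character are assumptions. It relies on the RTF \uN control word, which takes a signed 16-bit decimal value followed by a fallback character.

```python
def rtf_escape(text: str) -> str:
    """Escape RTF specials and encode non-ASCII characters as \\uN? words."""
    out: list[str] = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)  # RTF control characters need a backslash
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Work in UTF-16 code units so characters outside the BMP
            # become a surrogate pair, one \uN word per unit.
            data = ch.encode("utf-16-be")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "big")
                if unit > 32767:
                    unit -= 65536  # \uN expects a signed 16-bit value
                out.append(f"\\u{unit}?")
    return "".join(out)


escaped = rtf_escape("Çakar")
print(escaped)
print(rtf_escape("{a\\b}"))
```

The Çakar coauthor in sample.bib is the kind of input this path handles: Ç is U+00C7, so it becomes \u199? while the ASCII letters pass through unchanged.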