Treat leading period-abbreviations as titles (#109) #196

Merged
derek73 merged 6 commits into master from worktree-leading-period-title on Jul 2, 2026

derek73 (Owner) commented Jul 2, 2026

Summary

  • An unrecognized, multi-letter token ending in a period (e.g. "Major."), appearing before the first name is set, is now parsed as title instead of first, across all three parse formats (no-comma, suffix-comma, lastname-comma).
  • Adds a period_abbreviation regex (^[^\W\d_]{2,}\.$) and an is_leading_title() helper that composes it with the existing is_title() check, without mutating C.titles or any other Constants collection — so the periodless form (e.g. "Major") is never affected in later parses.
  • Single-letter initials ("J.") and internal-period abbreviations ("E.T.") are unaffected; a period-word after the first name is still parsed as a middle name.
  • Default-on behavior change: two pre-existing tests (test_suffix_in_parenthesis_with_period, test_brute_force.test16-18) had their expectations updated because they exercised the same leading-title code path with the old (now superseded) behavior — see docs/release_log.rst for the versioning note.
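A minimal sketch of how the regex and helper described above could compose. Only the pattern itself is quoted from this PR; the KNOWN_TITLES set, the simplified is_title() stand-in, and the first_name_set parameter are assumptions made for illustration and are not the library's actual data or signatures.

```python
import re

# Pattern quoted in the PR summary: two or more letter characters
# followed by exactly one trailing period.
PERIOD_ABBREVIATION = re.compile(r"^[^\W\d_]{2,}\.$")

# Hypothetical stand-in for the configured titles; not the real Constants data.
KNOWN_TITLES = frozenset({"dr", "mr", "mrs", "ms"})


def is_title(token: str) -> bool:
    """Simplified stand-in for the existing check: known title, ignoring case and periods."""
    return token.lower().replace(".", "") in KNOWN_TITLES


def is_leading_title(token: str, first_name_set: bool) -> bool:
    """Treat a token as a title if it is a known title or a leading period-abbreviation.

    The first_name_set flag is an assumed way of modelling "before the first
    name is set"; the real helper may track position differently.
    """
    if first_name_set:
        return False
    # Nothing is added to KNOWN_TITLES, so the periodless form stays unaffected.
    return is_title(token) or PERIOD_ABBREVIATION.match(token) is not None
```

Because the check is computed per token and nothing is written back to the title collection, "Major." can be accepted as a leading title in one parse while "Major" is still treated as an ordinary name part in the next.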

Closes #109.

Test plan

  • uv run pytest — 1070 passed, 22 xfailed
  • uv run mypy nameparser/ — clean
  • uv run ruff check nameparser/ tests/ — clean
  • New tests in tests/test_titles.py cover the happy path across all three parse formats, chained leading abbreviations, exclusions (single-letter initial, internal-period), post-first-name placement, and interaction with known titles/middle initials

derek73 added 5 commits July 1, 2026 20:27
…109)

Add period_abbreviation regex and is_leading_title() helper. In the
no-comma parse path, an unrecognized multi-letter token ending in a
period before the first name is set (e.g. "Major.") is now parsed as
title instead of first. Update test_suffix_in_parenthesis_with_period,
which documented the old behavior as a known limitation, to match.
…path (#109)

Complete the leading-title wiring across all three parse_full_name()
paths. Update test_brute_force test16-18, which documented the old
behavior for "Doe, John. A. Kenneth..." (unrecognized "John." parsed
as first name); it's now correctly recognized as a leading title, same
as the no-comma and suffix-comma paths.
derek73 added this to the v1.3.0 milestone Jul 2, 2026
derek73 self-assigned this Jul 2, 2026
…ocs (#109)

Add tests for the leading-period-abbreviation feature's remaining
untested boundaries per PR #196 review: digit/apostrophe exclusion,
case-insensitivity, and interaction with parenthetical nicknames.
Also tighten doc wording that implied only the literal first token in
a name could become a title, when the rule applies to the whole
leading title run (chained abbreviations included).
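The boundaries named in this commit message can be checked against the regex in isolation. The sketch below is not the project's test code: the helper name and the sample tokens are made up for illustration, and it exercises only the pattern quoted in the PR summary, not the full parser.

```python
import re

# Pattern quoted in the PR summary.
PERIOD_ABBREVIATION = re.compile(r"^[^\W\d_]{2,}\.$")


def looks_like_period_abbreviation(token: str) -> bool:
    """Hypothetical helper: True when the whole token matches the pattern."""
    return PERIOD_ABBREVIATION.match(token) is not None


# Illustrative tokens grouped by the boundary they exercise.
accepted = ["Major.", "MAJOR.", "major."]   # letter case does not matter
rejected = [
    "J.",        # single-letter initial
    "E.T.",      # internal period
    "No1.",      # contains a digit
    "O'Neil.",   # contains an apostrophe
    "Major",     # no trailing period
]

for token in accepted + rejected:
    print(f"{token!r}: {looks_like_period_abbreviation(token)}")
```

The character class [^\W\d_] accepts word characters that are neither digits nor underscores, so the pattern needs no case flag, and digits or apostrophes anywhere in the token prevent a match.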
derek73 merged commit 7728747 into master Jul 2, 2026
8 checks passed
derek73 deleted the worktree-leading-period-title branch July 2, 2026 04:23
Development

Successfully merging this pull request may close these issues.

Consider names followed by a period as titles or suffixes

1 participant