Full citations are skipped on overlap with misidentified nominative reporter citations #221

Closed
sentry-io bot opened this issue Feb 19, 2025 · 2 comments

sentry-io bot commented Feb 19, 2025

This Sentry issue may help understand problems with eyecite.

The problem with "volume_nominative"
I am picking just a single example for each "reporter", but there are many for each one:

  • Cooke

    • example: "In re Cooke, 93 Wn. App. 526, 529"
  • Holmes

    • example: "Connecticut v. Holmes, 221 A.3d 407"
  • Add.

    • example: "misconduct.” Add. 219 (CMEEC Bylaws)"
  • Taney

    • example: "Kern v Taney, 11 Pa D & C 5th 558 [2010])"
  • Olcott

    • example: "MacArdell v. Olcott, 82 N.E. 161"

Cases without "volume_nominative" that still give weird results. EDIT: in the end, these are actually nominative reporters; we just don't track that in reporters-db.

Filed by @grossir


Sentry Issue: COURTLISTENER-96B

Unexpected null value in FullCaseCitation FullCaseCitation(' Cooke, 93', groups={'volume_nominative': None, 'reporter': 'Cooke', 'page': '93'}, metadata=FullCaseCitation.Metadata(parenthetical=None, pin_cite=None, year='1999', court=None, plaintiff=None, defendant=None, extra='Wn. App. 526, 529, 969 P.2d 127', antecedent_guess=None, resolved_case_name_short=None, resolved_case_name=None))
@flooie flooie moved this to Backlog Feb 24 to March 7 in Case Law Sprint Feb 24, 2025
@flooie flooie moved this from Backlog Feb 24 to March 7 to Future... in Case Law Sprint Feb 24, 2025
@grossir grossir self-assigned this Mar 5, 2025

grossir commented Mar 6, 2025

We are not only failing to create UnmatchedCitations and to link real citations; we are also generating bad citations.
Example where Thompson 181 links to this. All the opinions in its "cited by" tab are wrong...

Image

grossir added a commit that referenced this issue Mar 6, 2025
Solves #221 and #174

Uses a list of problematic nominative reporters to resolve overlaps

Due to the way we tokenize, an overlap was always resolved in favor
of the first token. In the case of nominative reporters, this caused
a CitationToken to be found when a party name matched the
reporter's name, discarding the actual citation.

This could be solved in a cleaner way by consistently
tagging nominative reporters in reporters-db.
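
The overlap problem described in the commit message can be illustrated with a small, self-contained sketch. This is not eyecite's actual tokenizer or the code from the commit: `Span`, `find_spans`, `resolve_first_wins`, `resolve_with_blocklist`, and `PROBLEMATIC_NOMINATIVE` are hypothetical names, and the two regexes are toy patterns that cover only the examples quoted in this issue. The sketch assumes that candidates are sorted by start offset and that, on overlap, the earlier candidate is kept unless its reporter is on the list of problematic nominative reporters.

```python
import re
from dataclasses import dataclass

# Hypothetical list of problematic nominative reporters (names from this issue).
PROBLEMATIC_NOMINATIVE = {"Cooke", "Holmes", "Taney", "Olcott", "Thompson"}


@dataclass
class Span:
    start: int
    end: int
    reporter: str
    text: str


def find_spans(text):
    """Toy tokenizer: collect nominative-style and regular citation candidates."""
    spans = []
    # Nominative-style candidate: a reporter name followed by a page, e.g. "Cooke, 93".
    for m in re.finditer(r"\b(Cooke|Holmes|Taney|Olcott|Thompson),? (\d+)", text):
        spans.append(Span(m.start(), m.end(), m.group(1), m.group(0)))
    # Regular candidate: volume, reporter, page, e.g. "93 Wn. App. 526".
    for m in re.finditer(r"\b(\d+) (Wn\. App\.|A\.3d|N\.E\.) (\d+)", text):
        spans.append(Span(m.start(), m.end(), m.group(2), m.group(0)))
    return sorted(spans, key=lambda s: s.start)


def resolve_first_wins(spans):
    """Old behavior as described: on overlap, always keep the first candidate."""
    kept = []
    for s in spans:
        if kept and s.start < kept[-1].end:
            continue  # drop the later, overlapping candidate
        kept.append(s)
    return kept


def resolve_with_blocklist(spans):
    """On overlap, prefer the later candidate if the earlier one is a
    problematic nominative reporter (likely a party name)."""
    kept = []
    for s in spans:
        if kept and s.start < kept[-1].end:
            if kept[-1].reporter in PROBLEMATIC_NOMINATIVE:
                kept[-1] = s  # replace the likely false positive
            continue
        kept.append(s)
    return kept


text = "In re Cooke, 93 Wn. App. 526, 529"
spans = find_spans(text)
print([s.text for s in resolve_first_wins(spans)])
print([s.text for s in resolve_with_blocklist(spans)])
```

With first-wins resolution the party name "Cooke" captures the volume number as a page and the real Wn. App. citation is dropped; with the list-based check the real citation survives.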
@grossir grossir moved this from Future... to PR'd Issues 🤞 in Case Law Sprint Mar 6, 2025
@grossir grossir changed the title Malformed FullCaseCitations Full citations are skipped on overlap with misidentified nominative reporter citations Mar 6, 2025

grossir commented Mar 6, 2025

I downloaded all the Sentry events and got the full list of nominative reporters that are causing problems, and their counts

{
    'Thompson': 8592,
    'Holmes': 1852,
    'Chase': 1314,
    'Cooke': 607,
    'Olcott': 298,
    'Gilmer': 260,
    'Bee': 134,
    'Chase.': 85,
    'Taney': 34,
    'Crabbe': 30,
}
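
A tally like the one above could be produced roughly as follows. This is only a sketch, not the script that was actually used: it assumes each downloaded Sentry event has already been reduced to its message string, in the same shape as the "Unexpected null value in FullCaseCitation" message quoted earlier in this issue. `count_reporters` and `REPORTER_RE` are hypothetical names, and the sample messages are shortened illustrations, not real event data.

```python
import re
from collections import Counter

# Extracts the reporter name from the repr of the citation groups,
# e.g. "... groups={'volume_nominative': None, 'reporter': 'Cooke', ...}".
REPORTER_RE = re.compile(r"'reporter': '([^']*)'")


def count_reporters(messages):
    """Count how often each reporter appears across event messages."""
    counts = Counter()
    for message in messages:
        match = REPORTER_RE.search(message)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative, shortened messages (assumed format, not real Sentry data).
sample_messages = [
    "Unexpected null value in FullCaseCitation FullCaseCitation(' Cooke, 93', "
    "groups={'volume_nominative': None, 'reporter': 'Cooke', 'page': '93'})",
    "Unexpected null value in FullCaseCitation FullCaseCitation(' Holmes, 221', "
    "groups={'volume_nominative': None, 'reporter': 'Holmes', 'page': '221'})",
    "Unexpected null value in FullCaseCitation FullCaseCitation(' Cooke, 12', "
    "groups={'volume_nominative': None, 'reporter': 'Cooke', 'page': '12'})",
]

for reporter, count in count_reporters(sample_messages).most_common():
    print(reporter, count)
```

Because the reporter string is counted verbatim, variants such as 'Chase' and 'Chase.' stay separate entries, which matches the list above.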

@grossir grossir closed this as completed Mar 6, 2025
@github-project-automation github-project-automation bot moved this from PR'd Issues 🤞 to Done in Case Law Sprint Mar 6, 2025