@jheuel jheuel commented Jun 1, 2025

Summary

Adds an arXiv identifier ARXIV similar to PMID, PMCID and DOI.

Makes it possible to add the recommended form of citing arXiv submissions to a bibliography template with <text variable="ARXIV" prefix="arXiv:"/>.

Context

arXiv references come in two forms:

  • pre 2007: arXiv:hep-th/9603067
  • post 2007: arXiv:2412.11645 [hep-ex]

where the identifier itself can also have a version, e.g. 2412.11645v2.
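To make the two schemes concrete, here is a rough sketch of how an identifier could be classified. It is not code from this PR: the `ArxivId` type and the `parse_arxiv_id` and `split_version` functions are hypothetical names, and the sketch only handles the identifier itself, not a trailing `[class]`.

```rust
/// Hypothetical type for illustration only; not part of hayagriva.
#[derive(Debug, PartialEq)]
enum ArxivId {
    /// Old scheme, e.g. `hep-th/9603067`: archive name, slash, seven digits.
    Old { archive: String, number: String, version: Option<u32> },
    /// New scheme, e.g. `2412.11645`: four digits (YYMM), dot, four or five digits.
    New { number: String, version: Option<u32> },
}

/// Splits a trailing `vN` version suffix off an identifier, if present.
fn split_version(s: &str) -> (&str, Option<u32>) {
    if let Some(pos) = s.rfind('v') {
        let digits = &s[pos + 1..];
        if !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit()) {
            if let Ok(v) = digits.parse() {
                return (&s[..pos], Some(v));
            }
        }
    }
    (s, None)
}

/// Classifies an identifier, accepting an optional `arXiv:` prefix.
fn parse_arxiv_id(input: &str) -> Option<ArxivId> {
    let s = input.trim();
    let s = s.strip_prefix("arXiv:").unwrap_or(s);
    let (base, version) = split_version(s);
    let all_digits = |t: &str| !t.is_empty() && t.bytes().all(|b| b.is_ascii_digit());

    if let Some((archive, number)) = base.split_once('/') {
        // Old scheme: the archive may contain a dot, e.g. `math.GT`.
        if !archive.is_empty() && number.len() == 7 && all_digits(number) {
            return Some(ArxivId::Old {
                archive: archive.to_string(),
                number: number.to_string(),
                version,
            });
        }
    } else if let Some((yymm, seq)) = base.split_once('.') {
        // New scheme: sequence numbers are four or five digits long.
        let seq_ok = (seq.len() == 4 || seq.len() == 5) && all_digits(seq);
        if yymm.len() == 4 && all_digits(yymm) && seq_ok {
            return Some(ArxivId::New { number: base.to_string(), version });
        }
    }
    None
}
```

With this sketch, `parse_arxiv_id("2412.11645v2")` yields the new-scheme variant with version 2, and `parse_arxiv_id("arXiv:hep-th/9603067")` yields the old-scheme variant with archive `hep-th`.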

See also the biblatex manual, section "3.14.7 Electronic Publishing Information", which specifies that arXiv submissions are given as

eprint = {identifier},
eprinttype = {arxiv},
eprintclass = {class},

with a few aliases, such as primaryclass for eprintclass, which are already implemented in typst/biblatex#75 but are not yet in a published release.

Related discussions

The actual change

Before this PR, a hayagriva import of a BibTeX file ignores the eprintclass and saves only the identifier as a serial number. Neither the class nor the identifier is accessible from CSL styles. After this PR, the class is also added to the arXiv serial number, and entries without a class still work.

serial-number:
  - arxiv: '{identifier} [{class}]'
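A minimal sketch of the formatting rule described above, assuming the class is simply appended in square brackets when present. The function name `arxiv_serial` is hypothetical and this is not the PR's actual implementation.

```rust
/// Hypothetical helper: builds the string stored under `serial-number.arxiv`.
/// A missing or blank class yields the bare identifier.
fn arxiv_serial(identifier: &str, class: Option<&str>) -> String {
    match class.map(str::trim).filter(|c| !c.is_empty()) {
        Some(c) => format!("{identifier} [{c}]"),
        None => identifier.to_string(),
    }
}
```

Under these assumptions, `arxiv_serial("2412.11645", Some("hep-ex"))` gives `2412.11645 [hep-ex]`, while an old-style identifier without a class, such as `hep-th/9603067`, is stored unchanged.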

For the tests, I added it to the end of the APS style, which looks like this:

[1] R. Aaij others, Test of lepton flavor universality with B^+ arrow K^+ pi^+ pi^- ell^+ ell^- decays, Phys. Rev. Lett. 134, 181803 (2025), arXiv:2412.11645 [hep-ex].
[2] N. Itzhaki, Some remarks on 't Hooft's S-matrix for black holes, (1996), arXiv:hep-th/9603067.

The first two commits contain the changes discussed above, and I sneaked in two "cleanup" commits. One deletes an unused file in the tests folder and the other adds custom error types for the CLI.

Fixes #302.
Depends on typst/biblatex#75 and typst/citationberg#24.

Contributor

Drodt commented Jun 1, 2025

The problem is that the CSL spec has no variable for arXiv and therefore no CSL style will use it. I propose opening an issue on the CSL spec repo as well.

Author

jheuel commented Jun 1, 2025

> The problem is that the CSL spec has no variable for arXiv and therefore no CSL style will use it. I propose opening an issue on the CSL spec repo as well.

I opened a PR to add the identifier to their schema as well.

@PgBiel PgBiel added the csl-upstream This is an upstream problem with a CSL style or spec. label Jun 3, 2025
Contributor

PgBiel commented Jun 3, 2025

Thank you for the contribution. For the time being, I'll consider this blocked on upstream (CSL schema). I'll note, however, that I've seen some styles that seem to use unofficial CSL variables, including ARXIV (as well as identifiers for some lesser-known journals and publishers), which is also why hayagriva doesn't support them, so this is not unheard of.
I would like to see the CSL team's position on the matter first though (if this is intended to be added to a future CSL version or not).

Author

jheuel commented Jun 3, 2025

You make a great point for custom bibliography variables and styles. It feels weird that hayagriva has to know of a journal in order to display an entry. Custom bibliography styles exist, but one cannot add variables without compiling a custom hayagriva. Maybe it would be nice to add a way to configure a mapping from serial-number keys to custom variables.

I think the CSL styles should be a large collection of styles that users can rely on for professional output, and should never be a constraint. That sounds dramatic, but look at this lovely open issue from 2016, citation-style-language/schema#131, where the proposed workaround is to use another field (PMID). Or this one from 2020, citation-style-language/schema#350. However, the identifier proposed there is exactly what we already have in hayagriva's serial-number dictionary, except that we cannot use the data saved in those fields.

Current write-only implementation of the arXiv serial-number

| **Description:** | Any serial number, including article numbers. If you have serial numbers of well-known schemes like `doi`, you should put them into the serial number as a dictionary like in the second example. Hayagriva will recognize and specially treat `doi`, `isbn` `issn`, `pmid`, `pmcid`, and `arxiv`. You can also include `serial` for the serial number when you provide other formats as well. |

- Interpret the `eprint` BibTeX key as `serial-number.arxiv` if the `eprinttype` is set to `arxiv`

hayagriva/src/lib.rs

Lines 716 to 724 in 0c3c700

    /// ArXiv identifier.
    pub fn arxiv(&self) -> Option<&str> {
        self.keyed_serial_number("arxiv")
    }
    /// Set the `arxiv` field.
    pub fn set_arxiv(&mut self, arxiv: String) {
        self.set_keyed_serial_number("arxiv", arxiv);
    }

hayagriva/src/interop.rs

Lines 454 to 456 in 0c3c700

    if eprint_type == Some("arxiv") {
        item.set_arxiv(eprint);
    } else if eprint_type == Some("pubmed") {

Contributor

Enivex commented Jun 9, 2025

I find it highly unlikely that CSL would introduce a new variable just for a single preprint server. Even if it's large.

Author

jheuel commented Jun 9, 2025

> I find it highly unlikely that CSL would introduce a new variable just for a single preprint server. Even if it's large.

I already linked the discussion on the topic: citation-style-language/schema#350 (comment)
They said that they want to put it into an identifier array variable (essentially hayagriva's serial-number dictionary). However, the CSL schema repository seems abandoned.

Author

jheuel commented Jun 10, 2025

@PgBiel

> I'll note, however, that I've seen some styles that seem to use unofficial CSL variables, including ARXIV (as well as identifiers for some lesser-known journals and publishers), which is also why hayagriva doesn't support them, so this is not unheard of.

I have given it some thought.

Proposition

What do you think about adding an extra-variables key to the hayagriva spec that holds a mapping of name -> content, and then making those entries available in templates as variables called name, maybe with some prefix, or even via a separate extravariable attribute instead of variable?

This would allow

  • user-defined fields available in templates
  • using it (as staging area) for not officially CSL supported fields (e.g. for bibtex compatibility)

both without interfering with the CSL spec. We could even go for something longer like custom-extra-variables to make collisions with CSL even less likely, and add a warning message if a template definition uses a variable that is not in the official CSL spec.
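A minimal sketch of how such a lookup could behave, assuming official variables are resolved first and anything found only in the extra map comes with a warning. Everything here is hypothetical: `Entry`, `Lookup`, `resolve`, and the short `OFFICIAL_CSL_VARIABLES` list are illustrative names, not hayagriva or citationberg APIs, and the list is only a small subset of the CSL standard variables.

```rust
use std::collections::HashMap;

/// Small illustrative subset of official CSL standard variables (assumption:
/// a real implementation would take the full list from the CSL schema).
const OFFICIAL_CSL_VARIABLES: &[&str] =
    &["DOI", "ISBN", "ISSN", "PMID", "PMCID", "URL", "title"];

/// Hypothetical entry: standard fields plus the proposed `extra-variables` map.
struct Entry {
    standard: HashMap<String, String>,
    extra: HashMap<String, String>,
}

/// Result of resolving a template variable against an entry.
#[derive(Debug, PartialEq)]
enum Lookup<'a> {
    /// The variable is part of the official CSL subset and has a value.
    Official(&'a str),
    /// The variable is user-defined; a warning is attached for the template author.
    Extra { value: &'a str, warning: String },
    /// No value is available under that name.
    Missing,
}

/// Resolves `name`, never letting an extra variable shadow an official one.
fn resolve<'a>(entry: &'a Entry, name: &str) -> Lookup<'a> {
    if OFFICIAL_CSL_VARIABLES.iter().any(|v| *v == name) {
        return match entry.standard.get(name) {
            Some(v) => Lookup::Official(v.as_str()),
            None => Lookup::Missing,
        };
    }
    match entry.extra.get(name) {
        Some(v) => Lookup::Extra {
            value: v.as_str(),
            warning: format!("variable `{name}` is not part of the official CSL spec"),
        },
        None => Lookup::Missing,
    }
}
```

Checking the official list first is a deliberate choice in this sketch: it keeps user-defined names from silently overriding CSL variables, which is the collision concern mentioned above.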

If you are interested I can throw together a PR.

Contributor

PgBiel commented Jun 18, 2025

I can see the value in having something like that, but I would hold off from adding this for now. Hayagriva's format so far has attempted to be fairly independent from CSL itself (there was also a period where it didn't use CSL at all, though it was much more limited then).

I wouldn't discard the idea entirely, but for now, I think we'll want to focus on developing Typst-side solutions that allow full customization of rendered bibliography entries. The PR typst/typst#5932 is the closest we have gotten so far to making this work, though some technical challenges kept it from progressing. I hope we can find a solution to them soon.

Development

Successfully merging this pull request may close these issues.

How to render a arxiv number in typst?