Skip to content

Export hygiene, Unicode use, and code structure #97

@goerz

Description

@goerz

Even though it is not strictly part of the JOSS acceptance criteria in openjournals/joss-reviews#8545 (comment), I have some concerns about the package's API and code structure.

There are two particular points that I feel very strongly about:

  • Do not use single-letter names for functions
  • Do not require Unicode as part of the public API

I will detail these below, in the context of some more general recommendations regarding the API and code structure.

Obviously, this is ultimately up to you, although I would very strongly encourage you to resolve the above two items in whatever way you feel most comfortable with. Presumably, this will be a rather big "breaking change"; but it seems to me that the context of the JOSS review is the best opportunity to revise the API of the package in a way that can then be "stable" and tagged as v1.0.0, see #91.

API for eigenvalues, eigenfunctions, potentials

From the README

using Antique

The energy E(), the wave function ψ(), the potential V(), and some other functions will be exported.

I find these names highly problematic:

  • This clashes with most guidelines and commonly accepted best practices. E.g. from the common "blue" style:

    Use lower case with underscores for method names

  • Single-letter function names are "jargon". Probably, E and V are both familiar symbols for eigenvalues and potential energies, but they are hardly universal. A much more common understanding of E(x) might be "expectation value". The lowercase ψ might be less common. Wave functions might be more commonly associated with an uppercase Ψ. What makes matters worse is that Ψ and ψ are very hard to distinguish in most monospaced fonts.

    For the non-exported functions R, L, Y associated with some models, I would have no idea what these are without looking at the documentation.

  • The single-letter function names have an extremely high likelihood of clashing with single-letter variable names that a user might use. In fact, if you naively try to run through the code on the "Home" page without restarting the REPL for the "Demonstration", you get a conflict between Antique and the line E, C = eigen(Symmetric(H),Symmetric(S)). You've clearly encountered this, since you spend a whole paragraph on the front page on this:

    There are two ways to avoid function name conflicts. Run import Antique instead of using Antique, and use the energy Antique.E(), the wave function Antique.ψ() and the potential Antique.V(). Or try giving other function names like using Antique: V as potential, E as energy, ψ as wavefuntion, HydrogenAtom.

    This is really the wrong way around! If anything, that should be using Antique: potential as V, energy as E, wavefunction as ψ. That is, use longer, more descriptive names for the function names, and then let people who want a shorthand import it under whatever name they like (including using Antique: wavefunction as Ψ with the uppercase Ψ).

The bottom line is, the exported functions should be named eigenvalue (or energy), wavefunction (or eigenfunction), potential.

Also, for the models that have those, it should be radial_function instead of R, laguerre instead of L, spherical_harmonic instead of Y, rodrigues_formula instead of P, etc.

Unicode

I think there is a pretty wide consensus in the Julia community that Unicode should not be used in public APIs. See, e.g., the SciML Style Guide:

Unicode is fine within code where it increases legibility, but in no case should Unicode be used in public APIs. This is to allow support for terminals which cannot use Unicode: if a keyword argument must be η, then it can be exclusionary to uses on clusters which do not support Unicode inputs.

See also some of the lively discussions on Discourse.

I say this as someone who is extremely supportive of Unicode, and finds Julia's support for Unicode one of the most attractive features of the language: don't require the user of your packages to be able to type Unicode. Besides the "old terminal" argument from the SciML guide, you'll also exclude any student wanting to write a Julia script in a basic text editor without a Julia plugin. Also: Unicode completion in Jupyter, for example, doesn't work while the kernel is busy. You're also asking people to have a monospace font with good Unicode support installed, like JuliaMono. If I'm not mistaken, the default Windows terminal did not have Unicode support until very recently. It's great to allow your users to use Unicode, but making it impossible to use the library without Unicode is a mistake.

This very much applies to the exported function ψ, but also in other places, you really should have an ASCII fallback. Unicode symbols are fine for positional arguments, but not for (public) attributes or keyword arguments without an ASCII alternative. So, take

@kwdef struct HydrogenAtom
  Z = 1
  mₑ = 1.0
  a₀ = 1.0
  Eₕ = 1.0= 1.0
end

This could be redefined as

"""
    H = HydrogenAtom(Z = 1, m_e = 1.0, a_0 = 1.0, E_h = 1.0, hbar = 1.0)
    H = HydrogenAtom(Z = 1, mₑ = 1.0, a₀ = 1.0, Eₕ = 1.0, ħ = 1.0)

# Arguments

* `Z`: …
* `m_e` (`mₑ`): …
* `a_0` (`a₀`): …
* `E_h` (`Eₕ`): …
* `hbar` (`ħ`): …

"""
struct HydrogenAtom
    Z::Int64
    m_e::Float64
    a_0::Float64
    E_h::Float64
    hbar::Float64
end

function HydrogenAtom(;
    Z = 1,
    m_e = 1.0,
    a_0 = 1.0,
    E_h = 1.0,
    hbar = 1.0,
    mₑ = m_e,
    a₀ = a_0,
    Eₕ = E_h,
    ħ = hbar
)
    return HydrogenAtom(Z, mₑ, a₀, Eₕ, ħ)
end

to support both a Unicode and ASCII interface. If you want to go all the way, you could add

function Base.getproperty(H::HydrogenAtom, sym::Symbol)
    if sym == :mₑ
        return getfield(H, :m_e)
    elseif sym == :a₀
        return getfield(H, :a_0)
    elseif sym == :Eₕ
        return getfield(H, :E_h)
    elseif sym == 
        return getfield(H, :hbar)
    else
        return getfield(H, sym)
    end
end

function Base.propertynames(H::HydrogenAtom)
    return (:Z, :m_e, :a_0, :E_h, :hbar, :mₑ, :a₀, :Eₕ, )
end

to that.

You may also decide to avoid this complication and stick to ASCII for field names and keyword arguments. Since you're expecting contributions for new models, not requiring contributors to have parallel ASCII and Unicode interfaces, like in the code above (and not screw up something!) lowers the burden significantly.

You may also run into some trouble by wanting every ASCII symbol to have a Unicode equivalent: what if you ever add a model that has a symbol with a subscript "q"? Unfortunately, Unicode does not provide a subscript "q".

Even if you use ASCII in field names, you can still use Unicode equivalents internally to make equations look nice:

function V(model::HydrogenAtom, r; Z = model.Z, a_0 = model.a_0, E_h = model.E_h, a₀ = a_0, Eₕ = E_h)
  # Or, put the definitions inside the function body, but making them keyword arguments adds flexibility
  if !(0  r)
    throw(DomainError("r = $r", "r must be non-negative: 0 ≤ r."))
  end
  return -Z / abs(r / a₀) * Eₕ
end

Model Organization and Documentation

Antique's current "flat" code organization makes using the library more difficult than necessary. Fundamentally, you have global functions E, H, L, P, R, V, Y, nₘₐₓ, ψ (all of which should be renamed, see above), but each of these may or may not be applicable to a specific model. In an interactive context such as the REPL, or a Jupyter/Pluto notebook, there is no good way (via tab completion) to know which of these methods applies to a given model.

An easy way around this is to wrap each model in a module:

  • Define the possible functions of the API as empty generic functions in Antique, e.g., eigenvalue, potential, wavefunction, etc.

  • For each model, create a submodule with a pluralized name (Antique.HydrogenAtoms for the HydrogenAtom model). Define the HydrogenAtom struct and methods for all the functions that are applicable to the object. The methods should be re-exported (i.e., made public) in the submodule.

  • If you want a show method for the keyword-constructors by default, implement the 2-arg show method for an Antique.AbstractModel abstract type from which the individual models should subtype (you don't need, and probably shouldn't have, a method for string.

I made a proof of concept for this approach at https://github.com/goerz-testing/AntiqueModules.jl. Feel free to copy any of the setup.

The major benefit of this is that

using Antique.HydrogenAtoms

only imports those functions that are actually relevant for the model: for HydrogenAtoms you would get laguerre (what is now L), but for

using Antique.HarmonicOscillators

you wouldn't get laguerre, but only potential, eigenvalue, eigenfunction, and hermite.

Also, in the REPL or in a notebook, Antique.HydrogenAtoms.<tab> would tab-complete the functions applicable (and only those applicable) to the model.

To preserve the current API, you could still re-export the AbstractModel subtypes (alongside all the possible functions) from the submodules, as I've done in the proof of concept.

This also makes the documentation easier to organize, as you can see in the proof of concept: you can use a simple @autodocs block as a starting point for a model-specific page, like the current one for the Hydrogen Atom. You'd still have to add the Usage & Examples, or course.

Metadata

Metadata

Assignees

Labels

jossIssues related to JOSS reviews

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions