RFC: `val` declarations in structures #59

gasche · 2025-11-25T21:59:48Z

Code examples in this PR. No drooling!

val map : ('a -> 'b) -> 'a list -> 'b list
let rec map f li = ...

val even : int -> bool
val odd : int -> bool
let rec even n = ...
and odd n = ...

let rev li =
  let val loop : 'a list -> 'a list -> 'a list in
  let rec loop li acc = ... in
  loop li []

val mem : type a . a -> a list -> bool
let mem elt li =
  let rec loop : a list -> bool = function
  | [] -> false
  | x :: xs -> (x = elt) || loop xs
  in loop li

val map : _
let rec map f li = ...

gasche · 2025-11-25T22:01:29Z

(cc @Octachron, @garrigue, @t6s, @samsa1, @goldfirere with whom I discussed this during a synchronous meeting.)

kit-ty-kate · 2025-11-25T22:23:23Z

I think the val … [val …] let … [and …] and let val syntax fits reasonably well within the OCaml syntax, however i think the extensions proposed are a bit too alien looking.

Locally abstract types in val declarations […]

What about simply introducing all the new locally abstract types by default instead of requiring a separate syntax?
Meaning this would be valid:

val mem : type a . a -> a list -> bool
let mem elt li =
  let rec loop : a list -> bool = function
  | [] -> false
  | x :: xs -> (x = elt) || loop xs
  in loop li

gasche · 2025-11-26T08:27:17Z

I mention the val : type a . ... syntax because it is something that has been discussed, and a viable plan forward (among others). But in my mind it is not part of the RFC (that is, I wouldn't consider "acceptance" of the RFC to imply acceptance of this particular feature), only a possible "Extension" to be discussed on its own later. Mentioning possible extensions helps by making sure that the feature will be compatible with other desirable features (and sometimes can generate positive interactions, as with val foo : _), instead of painting us into a corner later on.

yallop · 2025-11-26T09:29:28Z

I like this idea, but I think we should also consider another possible interpretation of val declarations in structures before committing to this one.

In some languages forward declarations are used to introduce recursion. For example, in Agda types can be forward-declared to introduce recursion, like this:

-- forward-declare the `B` type
data B : Set -- Now `B` is in scope

data A where
  a : B → A -- okay: `B` has been declared (but not yet defined)

data B where
  b : A → B

We could give the same interpretation to val in OCaml, so that we could define the even and odd functions in the proposal as follows:

val even : int -> bool
val odd : int -> bool
let even n = ...
let odd n = ...

One motivation for this alternative interpretation of val bindings is that it naturally extends to solve a separate longstanding problem in OCaml: it is currently awkward to introduce recursion between different kinds of items. For example, suppose that we wanted to define a tree with unordered children. If OCaml had support for forward declarations then we could define the types for tree and the set of children as follows:

type unordered_tree                                             (* forward declaration *)
module Tree_set : Set.OrderedType with type t = unordered_tree  (* forward declaration *)

type unordered_tree = Empty | Node of int * Tree_set.t     (* okay: Tree_set.t in scope *)
module Tree_set = Set.Make(struct type t = unordered_tree
                                  let compare = ...        (* also, Tree_set.compare in scope here *)
                           end)

The current alternative is to use a multiple-binding module rec, which is more cumbersome (due, for example, to the need to give full signatures):

module rec Unordered_tree :
sig
  type t = Empty | Node of int * Tree_set.t
  val compare : t -> t -> int
end = 
struct
  type t = Empty | Node of int * Tree_set.t
  let compare = ...
end
and
Tree_set : Set.S with type elt = Unordered_tree.t
  = Set.Make(Unordered_tree)

Adding support for full forward declarations would require addressing various additional design and implementation questions. But I think we should explicitly decide whether we might like to give val that alternative interpretation before cutting off the possibility by adopting the incompatible interpretation in the current proposal.

gasche · 2025-11-26T09:37:30Z

I have myself been thinking about more generalized forms of forward declarations, but none of them felt completely right so I haven't submitted a RFC about them. The difficulty in my view is the interaction between forward declarations and shadowing of definitions.

But is it actually an incompatible interpretation? Naively I would expect that everything allowed with my proposal would keep the same interpretation with your proposal. Is that not the case, are there examples of programs that would be accepted here and rejected in your world, or whose semantics would change?

yallop · 2025-11-26T09:42:05Z

Here's a distinguishing example:

let f x = 1 + x
val f : int -> int
let f x = 2 + f x
let y = f 3

If a forward declaration introduces recursion then this program doesn't terminate. If it instead serves as an ascription for the following definition then it terminates with y bound to 6

kit-ty-kate · 2025-11-26T13:21:35Z

If a forward declaration introduces recursion then this program doesn't terminate.

Not necessarily. We could simply say that the forward definition declares the name everywhere after it except in the next definition. I think that it makes sense this way, otherwise things could get too complicated too quickly i think.
If we take that definition, the following would now need let rec instead of just let:

val even : int -> bool
val odd : int -> bool
let rec even n = ...
let rec odd n = ...

which i think is probably sensible

yallop · 2025-11-26T13:34:52Z

I don't understand that proposal.

I can imagine two interpretations of "the next definition". Perhaps you mean "the next definition of the same name" (i.e. a binding let x that occurs at some point after a declaration val x), which would mean that in the following code

let f x = 1 + x
val f : int -> int
let g x = 2 + f x
let f x = 3 + f x

2 + f x would refer to the ~~first~~ second [edit: corrected] definition of f but 3 + f x would refer to the ~~second~~ first [edit: corrected] definition.

Alternatively you might mean "the immediately following definition of any name", which would mean that in the following code

val even : int -> bool
val odd : int -> bool
let even n = ...
let odd n = ...

both even and odd would be in scope in odd, but neither would be in scope in even.

Both of these seem perverse, so perhaps you mean something else.

kit-ty-kate · 2025-11-26T15:00:23Z

Sorry if i wasn't clear. Finding the right names for new things is a bit hard ^^

I meant the following (rough) semantics:

               [...]
----------------------------------  :: declaration
Γ & Δ |- val x : T ▷ Γ & Δ, x : T

              x : T ∈ Δ
            Γ & Δ \ x |- t
-------------------------------------- :: definition
Γ & Δ |- let x = t ▷ Γ, x : T & Δ \ x

              x : T ∈ Δ
            Γ, x : T & Δ |- t
------------------------------------------ :: rec_definition
Γ & Δ |- let rec x = t ▷ Γ, x : T & Δ \ x

  x ∉ Δ
  x ∈ Γ
---------- :: var_from_definition
Γ & Δ |- x

  x ∈ Δ
  x ∉ Γ
---------- :: var_from_declaration
Γ & Δ |- x

I believe the difference between your semantics and mine is that your definition would instead be:

              x : T ∈ Δ
              Γ & Δ |- t
-------------------------------------- :: definition
Γ & Δ |- let x = t ▷ Γ, x : T & Δ \ x

where x isn't removed from Δ.

Legend:

Γ: the classical OCaml environment
Δ: a new environment keeping track of forward declarations (this can probably be implemented as part of Γ in the real implementation but i'm using a different environment to make writing the semantics a bit simpler)
T: types
t: terms
x: identifier/variable

Side note: sorry if i made some gruesome mistakes/oversight when writing the semantics, i'm doing it from some memory/notes from 10 years ago.

yallop · 2025-11-26T15:29:28Z

Okay, you mean roughly "the next definition of the same name", so that in the following code:

let f x = 1 + x
val f : int -> int
let g x = 2 + f x
let f x = 3 + f x

the body of the last f refers to the definition of f above, and the body of g would refer to the definition of f below, except that the var rules don't support resolving a variable that is both declared and defined.

Niols · 2025-11-26T17:41:19Z

I would love to see this supported, indeed, under one form or another. The proposal about forward declaration is more involved and I'm not a big fan of this type of feature because I find they make the code harder to reason with, although I have been frustrated occasionally by the module rec example.

The original RFC, however, is not very far from the existing code base. In fact, the parser is already accepting val at the structure level and would accept let val ... in if ocaml/ocaml#14040 was merged. I'd written an experiment a while ago to do this sort of things in a PPX; I've never used it but I've cleaned it up and put it on GitHub for the occasion.

gasche · 2025-11-26T19:34:19Z

Here's a distinguishing example:
let f x = 1 + x
val f : int -> int
let f x = 2 + f x
let y = f 3
If a forward declaration introduces recursion then this program doesn't terminate. If it instead serves as an ascription for the following definition then it terminates with y bound to 6

My reaction to this example is that forward declarations should not implicitly make things recursive, so let rec would still be required to allow resolving names that are forward-declared-but-not-defined. In other words, in a let .. and ... block (non-rec), identifiers in bound position refer to their latest definition in scope (ignoring declarations), and in a let rec .. and .., identifiers in bound position refer to their latest definition, or declaration, or the current definition, whichever is closest.

I think this is roughly similar to the intuition of @kit-ty-kate.

yallop · 2025-11-26T20:19:05Z

As far as I can see, the point of forward declarations, whether in Agda or C or any other language, is that they make names available in the scope following, to allow referring to those names before the associated definitions are available.

I can see the use for the separate signatures proposed in this PR (which are not forward declarations). But I don't understand why you would want forward declarations that don't bring names into scope. What am I missing?

gasche · 2025-11-26T20:30:58Z

In OCaml (unlike C or Agda I believe) we have a distinction between recursive and non-recursive definitions, where "recursive" can be understood as "has access to things that are in the process of being defined". Note that only term-level definitions are non-recursive by defaults, others are recursive by default and some (but not all) have a nonrec marker to tweak the default.

So the less liberal interpretation of forward declarations that Kate and myself propose in fact probably coincides exactly with what you want for the cases of type-level recursion that you actually care about, it is only on terms that it is more debatable, and the semantics we propose have the benefits of aligning with the simpler interpretation in this PR (in particular it makes your proposal a secondary aspect that could be brought later on).

gasche · 2025-11-26T20:36:21Z

Thinking about this a bit more, I can still see examples where the semantics is a bit tricky.

let g = ...

val f : ...
val g : ...
let rec f = ... g ...
let rec g = ...

With the restriction of your semantics that I propose, let rec f = ... let rec g = ... is equivalent to let rec f = ... and g = ... when both f and g are forward-declared. In particular the occurrence of g in let rec f = .... g ... resolves to the definition below which is forward-declared. With the semantics I propose in this PR, this example is rejected (because val g is not correctly attached to let rec g), and this is very fortunate because the semantics that I would otherwise expect (where val are basically ignored in the dynamic semantics) is that g would be resolved to the older definition, despite the forward declaration.

This example is close to having incompatible interpretations with my PR and with your proposal, even with the restriction that we proposed, but it isn't actually a counter-example.

yallop · 2025-11-26T20:44:08Z

My view is that the "less liberal" design for forward declarations is simply not viable from a usability perspective. It's not palatable to have forward declarations that bring names into scope in some definitions but not others. So I personally don't really mind whether or not this PR is compatible with the "less liberal" interpretation, because I don't think there's any prospect of adopting that feature.

lpw25 · 2025-11-27T12:40:28Z

I think you don't need the "Locally abstract types in val declarations" if we merge ocaml/ocaml#12732, so I wouldn't bother with that.

I didn't read through the entirity of the discussion on forward declarations, so I could have missed it, but I don't think I saw anyone suggest one of the possibilities, which is that for a forward definition to be used before it is defined you need to do:

val rec foo : int -> int
let bar i = foo (i + 1)
let foo i = i * 7
end

with a rec keyword on the val rather than on the let.

Note that val would always bind the name, but if you don't do rec on the val, then you using the variable is just an error:

val foo : int -> int
let foo i = foo (i + 1)
(* Error: `foo` cannot be used here as it is not recursive and has not yet been defined. *)

You can of course still use rec on the let to bind the name recursively in the body of the definition itself:

val foo : int -> int
let rec foo i = foo (i + 1)

In terms of how this affects this current proposal it would change it so that Jeremy's example was an error:

let f x = 1 + x
val f : int -> int
let f x = 2 + f x
(* Error: `f` cannot be used here as it is not recursive and has not yet been defined. *)

lpw25 · 2025-11-27T12:45:15Z

(While were are in this area, I'd really love to support:

val foo x = x + 1

too, with an additional restriction compared to let that a val definition cannot be shadowed -- like other explicit structure items it promises that it will actually be part of the module being defined.)

yallop · 2025-11-27T13:27:46Z

I like the val rec proposal (perhaps excepting end)

gasche · 2025-11-27T13:51:24Z

I have updated the RFC to fix a minor mistake (and mem li ... comes from a previous iteration of the proposal), to clarify the non-binding status of the "Discussions and extensions" section, to mention @lpw25's val f = <def> proposal (as implicitly implying val f : _) as well as @yallop's forward-declaration questions.

diremy · 2025-11-27T14:52:35Z

Why should we constraint val to immediately preceed the let declaration? In many cases, we never rebind declarations and we could introduce all the (usesul/important) val declarations at the beginning of the file let bindings later on.

Perhaps, even more generally, we could let a val declaration just apply to the next declaration of that name. In case of rebinding the same name, it would not apply to the second binding.

Niols · 2025-11-27T15:06:43Z

In many cases, we never rebind declarations

I feel, depending on the codebase, it's either never done or done everywhere. Personally, I shadow my top-level declarations all the time.

Then I would argue we could consider doing the opposite, and apply the val to the last declaration, seeing as it is the one that will actually make it in the signature of the module.

gasche · 2025-11-27T15:15:16Z

Why should we constraint val to immediately preceed the let declaration?

The reason for this restriction is that it is simple and predictable, and all my previous attempts to lift it resulted in weird corner cases (due to shadowing) that people do not agree on, or that I myself felt unsatisfying. This is a minimal proposal that has more chances of gathering consensus than something more complex.

xavierleroy · 2025-11-27T16:04:03Z

I agree with @gasche's message above, and I would go even further. For me, the one case where a val declaration really improves legibility of the code is if the declaration occurs just before the definition, whether it's a let, a let rec, or the and part of a let rec. Consider:

val f : ...
val g : ...
let rec f x y =
  ...
and g z w =
  ...

This is not optimal because val g is separated from and g by arbitrarily many lines, and val f from let rec f by one line. I'd prefer

val f : ...
let rec f x y =
  ...
val g : ...
and g z w =
  ...

An extra benefit of this val/let - val/and style is that it can be explained as syntactic sugar for type constraints on the definitions, assuming we had a good consistent story to tell about type constraints:

val f : ...
let rec f = ...
val g: ...
and g = ...

is morally like

let rec f : ... = ...
and g : ... = ...

In contrast, treating val as a more-or-less general forward declaration raises all sorts of questions regarding the scoping of the declaration, its influence on the type-checking, and perhaps its influence on the compilation of let bindings.

gasche · 2025-11-27T16:19:29Z

The suggestion is interesting but I would worry that it makes the syntax ambiguous or non-regular. For example, I am not sure how to adapt it to the case of local declarations (let rec ... and ... in ...). If the syntax turns out to work out (in terms of regularity, and of it being accepted by our LR parser without too many conflicts), I am happy to go with that.

goldfirere · 2025-11-29T21:50:38Z

I'm largely in favor of this direction. I really miss being able to write type signatures, having come over from Haskell. A few points, mostly from my Haskell experience (numbered only for easy reference, not ordering):

I mostly don't like making a new top-level structure item val f : ... ;; let f = ... for this. Even if we require the val to be right before the let (mumble mumble except mutual recursion), it creates an awkward case for the type checker, in that it means that one structure item has to be remembered as the most recent. Or there's some lookup facility and other check for whether the ordering constraint is satisfied. If we don't require proximity, then that simplifies things in one way (no proximity check!) but now the val may be remote from the let, something Haskellers have generally drifted away from. (Occasionally one comes across some Haskell code with a bunch of type signatures at the top describing an interface implemented below... but OCaml has proper module types for this!) To me, the value in writing out the type signature is in its proximity to the definition, so that (as a reader) I can understand the definition more easily.

One other small challenge: how would this work in the ocaml toplevel? Could a val declaration be entered in separately? If "no", then it shouldn't be an independent structure item.

Using and in the right way easily fixes this problem. (And I would support lax rules within a mutually recursive block.)
We have to figure out the relationship between mutual recursion, generalization, and type signatures. Consider this example (written in Haskell so as to avoid nerd-sniping folks about OCaml syntax):
```
f x y = g x x                                                                                             
g :: a -> b -> c                                                                                          
g x y = f y x
```
Without the type signature, f is inferred to have type a -> a -> b. With the type signature for g, f is inferred to have type a -> b -> c. Generalization (in both Haskell and OCaml) must be done for an entire mutually recursive group. This is what effectively avoids polymorphic recursion. Haskell identifies a mutually recursive group by using a topological sort, identifying dependencies and strongly connected components. After this initial analysis, GHC repeats it, snipping off edges where there is a type signature. In so doing, components become smaller, increasing the possibility for polymorphism (as seen in this example -- f is generalized before we look at the body of g). Should OCaml behave similarly? Probably. And maybe this already happens? When I translate this into OCaml using today's polymorphic signatures, I indeed get the more polymorphic type for f. So it's possible this point is already accounted for, but the repeated analysis is subtle and so I wanted to make sure it was in folks' heads.
I'm dubious of allowing e.g. val f : type a. ... have a be in scope in let f = .... Haskell has supported this for decades, via its -XScopedTypeVariables extension. But there is an effort to remove (well, discourage) this feature, in favor of something much more akin to OCaml's let f (type a) ... = ... syntax. (I've lost track of that effort recently -- maybe it's all done now.) Firstly, having scope cross from a type signature to a definition is a bit of spooky action at a distance. Maybe that's just taste? But it does taste bad. The second problem is more pressing. What do we make of val f : type a. ... and f (type b) = ... ? Are a and b both in scope? Can they unify? What about let f = if flip () then (fun (type c) -> ...) else some_polymorphic_function?

All that said, I see why it's tempting to allow this, as the alternative is very duplicative. (Haskell avoids this duplication by having a stable ordering for type variables, allowing the equivalent of let f (type a) just to bring the first variable into scope. OCaml, alas, does not order its type variables in this way and so this option is closed to us.) So I'm dubious... but there may be no better option.

Please do not let these subtleties or this post suggest that I want to slow this down. On the contrary: I love this idea and want it to be amazing!

OlivierNicole · 2025-11-30T16:52:54Z

I am also in favor of having a syntax for signatures, and I like the suggested one with val, and I hope we will be able to address the points raised by people above.

One other small challenge: how would this work in the ocaml toplevel? Could a val declaration be entered in separately? If "no", then it shouldn't be an independent structure item.

Using and in the right way easily fixes this problem. (And I would support lax rules within a mutually recursive block.)

I’m not sure what you mean by “using and in the right way” here. Are you referring to one of the alternatives that were suggested?

goldfirere · 2025-11-30T20:29:49Z

I’m not sure what you mean by “using and in the right way” here. Are you referring to one of the alternatives that were suggested?

I cleverly left this as an exercise to the reader. :)

Without thinking too too hard about it, I think I'd like and to join a bunch of unordered declarations, where each declaration is a val, a let, or a let rec. So we would have val ... and let ... and let .... and let rec .... and val ... and let .... val are type declarations. let introduces an identifier in scope after the end of the block (but not within the block). let rec introduces an identifier in scope within the block. It is an error for a val to be in a block without a corresponding let or let rec. For backward compatibility, an and without a let or let rec following it is treated as if it has either let or let rec there (choosing which by using the most recent written let or let rec). This form is usable both as a structure item and as a local definition within an expression (paired with in, as today).

I don't see this proposed above. And maybe there are holes in it? But something along these lines feels right to me.

gasche · 2025-11-30T20:46:54Z

The RFC includes a proposal for an Alternative syntax that has the benefit of forming a single structure item, but that is probably going to lose because it looks a bit uglier and no one likes it.

The following examples give the idea:

let rec
  val map : ('a -> 'b) -> 'a list -> 'b list
  and map f li = ...

let rec val f : <type>
and val g : <type>
and f <def>
and g <def>
and val h : <type>
and h <def>
and i <def>

Niols · 2025-12-01T00:20:48Z

Would it be an option to introduce a grammar rule like (in very pseudo code):

option(VAL separated_list(AND, IDENT COLUM core_type)) LET value_bindings

? As in allow a bunch of val before a let in one structure item, eg:

val f : <type>
and g : <type>

let rec f <def>
and g <def>

That probably means a breaking change in the parser because it wouldn't behave well with the current parsing of standalone val.

gasche force-pushed the val-annotations branch from 68150a7 to 4f4d77e Compare November 26, 2025 08:49

gasche added 2 commits November 27, 2025 14:49

revised RFC based on discussions at the type-system meeting

d0c25ec

update RFCs following discussions

3db5562

gasche force-pushed the val-annotations branch from 4f4d77e to 3db5562 Compare November 27, 2025 13:49

RFC: val declarations in structures #59

Are you sure you want to change the base?

RFC: val declarations in structures #59

Conversation

gasche commented Nov 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gasche commented Nov 25, 2025

Uh oh!

kit-ty-kate commented Nov 25, 2025

Uh oh!

gasche commented Nov 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yallop commented Nov 26, 2025

Uh oh!

gasche commented Nov 26, 2025

Uh oh!

yallop commented Nov 26, 2025

Uh oh!

kit-ty-kate commented Nov 26, 2025

Uh oh!

yallop commented Nov 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kit-ty-kate commented Nov 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yallop commented Nov 26, 2025

Uh oh!

Niols commented Nov 26, 2025

Uh oh!

gasche commented Nov 26, 2025

Uh oh!

yallop commented Nov 26, 2025

Uh oh!

gasche commented Nov 26, 2025

Uh oh!

gasche commented Nov 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yallop commented Nov 26, 2025

Uh oh!

lpw25 commented Nov 27, 2025

Uh oh!

lpw25 commented Nov 27, 2025

Uh oh!

yallop commented Nov 27, 2025

Uh oh!

gasche commented Nov 27, 2025

Uh oh!

diremy commented Nov 27, 2025

Uh oh!

Niols commented Nov 27, 2025

Uh oh!

gasche commented Nov 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

xavierleroy commented Nov 27, 2025

Uh oh!

gasche commented Nov 27, 2025

Uh oh!

goldfirere commented Nov 29, 2025

Uh oh!

OlivierNicole commented Nov 30, 2025

Uh oh!

goldfirere commented Nov 30, 2025

Uh oh!

gasche commented Nov 30, 2025

Uh oh!

Niols commented Dec 1, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

RFC: `val` declarations in structures #59

RFC: `val` declarations in structures #59

gasche commented Nov 25, 2025 •

edited

Loading

gasche commented Nov 26, 2025 •

edited

Loading

yallop commented Nov 26, 2025 •

edited

Loading

kit-ty-kate commented Nov 26, 2025 •

edited

Loading

gasche commented Nov 26, 2025 •

edited

Loading

gasche commented Nov 27, 2025 •

edited

Loading