<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://www.dra27.uk/feed.xml" rel="self" type="application/atom+xml" /><link href="https://www.dra27.uk/" rel="alternate" type="text/html" /><updated>2026-02-01T12:28:54+00:00</updated><id>https://www.dra27.uk/feed.xml</id><title type="html">Notes from the Windows corner</title><subtitle>David Allsopp’s blog featuring musings on O(x)Caml, Windows and other vaguely-related things. Views expressed are my own, rather than my employer’s.
</subtitle><entry><title type="html">open Core</title><link href="https://www.dra27.uk/blog/platform/2026/01/19/plotting-a-new-course.html" rel="alternate" type="text/html" title="open Core" /><published>2026-01-19T00:00:00+00:00</published><updated>2026-01-19T00:00:00+00:00</updated><id>https://www.dra27.uk/blog/platform/2026/01/19/plotting-a-new-course</id><content type="html" xml:base="https://www.dra27.uk/blog/platform/2026/01/19/plotting-a-new-course.html"><![CDATA[<p>On 16 December 2000, a young @dra<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> stepped out on to the stage of <a href="https://www.stmartin-in-the-fields.org/">St Martin-in-the-Fields</a>
making what would be the first of many performances of Johann Sebastian Bach’s
great <a href="https://en.wikipedia.org/wiki/Mass_in_B_minor">Mass in B minor</a>. On 16
November last year, just under 25 years later, a slightly greyer @dra27 stepped
out on the stage of <a href="https://www.newcastlebachchoir.org.uk/dbpage.php?pg=view&amp;dbase=events&amp;id=202182">King’s Hall</a>
at Newcastle University for what, for now at least<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>, would be his last
performance of this great work<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>. As I write this in the 9-hour window of
unemployment between finishing at the <a href="https://www.cam.ac.uk">University of Cambridge</a>
and <a href="https://tarides.com">Tarides</a> and commuting down to 2½ Devonshire Square to
start at <a href="https://www.janestreet.com">Jane Street</a>, it’s a new year and a change
of course.</p>

<p>My professional life to now has always been a balancing act between the arts and
technology (perhaps a rollercoaster would be a better analogy; balancing act
somehow evokes the elegance and skill of a trapeze artist). I’ve <a href="/blog/platform/2025/07/18/through-a-glass-darkly.html">mused before</a>
on some of the common threads that drive me; more recently I’ve been musing on
more fundamental similarities. Many years ago, I remember in some Cathedral or
other being told of the various carvings which exist in hidden parts of these
buildings; art created not to be seen, at least by human eyes. Amongst others,
restorations at <a href="https://www.salisburycathedral.org.uk">Salisbury Cathedral</a>
uncovered <a href="https://www.theguardian.com/commentisfree/2023/sep/12/the-guardian-view-on-the-hidden-carvings-of-salisbury-cathedral-messages-to-the-future">such carvings</a>.
Effort expended not for human reward, but for its own worth or, one could say,
<a href="https://en.wikipedia.org/wiki/Ad_maiorem_Dei_gloriam">ad maiorem Dei gloriam</a>.
Or, in technology, The Right Thing™. The right thing is what instantly drew me
to functional programming way back before it was cool and shortly after the “27”
had been added to “dra”. The pragmatic approach of OCaml trying to balance the
safety, correctness, and Right Thing of functional programming with the need to
write performant programs in a less Right Thing-like world made it a natural
choice for a young professional singer writing and maintaining small systems
written on trains, planes and hotels around the world! It’s continued to draw me
in over the last 9 years.</p>

<p>But for me the art <em>is</em> made to be seen. During the COVID-19 pandemic, when live
performance became impossible, I remember spending many months at home unable,
or at least unwilling, to sing. Without even the <a href="https://open.spotify.com/track/6nQeKKpvUFJE0S6gwem2JN">colleagues to perform with</a>,
let alone the audience to consume the result, there was no purpose. And so too
the perfect software is without purpose without users<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup>. In championing and
furthering Windows OCaml, I chose the niche of a niche, but I am hugely proud
that today <em>every</em> Windows user of OCaml benefits (hopefully!) from the work I
<a href="https://github.com/ocaml-multicore/ocaml-multicore/pull/351">both</a> <a href="https://github.com/ocaml/ocaml/pull/11642">did</a>,
and <a href="https://github.com/ocaml/ocaml/pull/12954">spearheaded others to do too</a>.
Likewise, <em>every</em> Windows user running <code class="language-plaintext highlighter-rouge">winget install opam</code> begins their
journey in OCaml following my vision of how <a href="https://github.com/ocaml/opam/issues/246#issuecomment-2166133625">it should work</a>,
thanks to the seemingly boundless patience and efforts of my opam
co-maintainers!</p>

<p>Behind all this, though, are the companies which allowed this to happen: first
at OCaml Labs at the University of Cambridge and then spinning out into Tarides.
And behind all that is Jane Street. After I started at OCaml Labs back in 2016,
I explained to (mainly musical) colleagues that I was carrying on doing the open
source work I’d been doing for the previous 10 years, but that somehow that had
become work one could be <em>paid</em> to do (I can’t underscore enough how
inconceivable the idea of that would have felt in 2006). That inevitably led to
the question “so what do they get out of it?” - and the inevitable surprise that
the answer was, directly at least, nothing. Windows opam represented work on a
platform with no business case on an unused tool. The OCaml community was - is -
the benefit.</p>

<p>Which makes this week feel less a change of course and more a continuation.
<a href="https://oxcaml.org">OxCaml</a> represents to me the evolution of the pragmatism
that drew me to OCaml in the first place. And we’ve definitely got users!
This year, like last year, is clearly going to be <a href="https://en.wikipedia.org/wiki/May_you_live_in_interesting_times">interesting</a>.
I think as we continue to navigate the whirlwind of agentic engineering, we’re
going to care more and more about the foundation it’s all sat on, especially the
compilers. Let’s hope they continue to strive to do the Right Thing™. And let’s
see how this looks next year! So, here goes…</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">open</span> <span class="nc">Core</span>
</code></pre></div></div>

<p><img src="https://www.dra27.uk/assets/2026-01-19/2026-01-19-tori-from-a-bag.jpg" alt="The cat’s out of the bag" /></p>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>the “27” wouldn’t be allocated until the next October <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>never say never… <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p>I’ve never recorded the work, although I recorded <a href="https://open.spotify.com/track/7kZypUPNwPu0JKCCbD57X2">Ach, bleibe doch</a> from <a href="https://en.wikipedia.org/wiki/Lobet_Gott_in_seinen_Reichen,_BWV_11#4">Himmelfahrtsoratorium BWV 11</a> with <a href="https://musikpodium.de">Musik Podium Stuttgart</a> ten years ago, which is one of the source arias for the famous <em>Agnus Dei</em> of the mass <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:4" role="doc-endnote">
      <p>Perhaps I should adopt <em>sine usoribus sine proposito</em> as a motto <a href="#fnref:4" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name></name></author><category term="blog" /><category term="platform" /><category term="ocaml" /><category term="oxcaml" /><summary type="html"><![CDATA[On 16 December 2000, a young @dra1 stepped out on to the stage of St Martin-in-the-Fields making what would be the first of many performances of Johann Sebastian Bach’s great Mass in B minor. On 16 November last year, just under 25 years later, a slightly greyer @dra27 stepped out on the stage of King’s Hall at Newcastle University for what, for now at least2, would be his last performance of this great work3. As I write this in the 9 hour window of unemployment between finishing at the University of Cambridge and Tarides and commuting down to 2½ Devonshire Square to start at Jane Street, it’s a new year and a change of course. the “27” wouldn’t be allocated until the next October &#8617; never say never… &#8617; I’ve never recorded the work, although I recorded Ach, bleibe doch from Himmelfahrtsoratorium BWV 11 with Musik Podium Stuttgart ten years ago, which is one of the source arias for the famous Agnus Dei of the mass &#8617;]]></summary></entry><entry><title type="html">Opening up old release branches</title><link href="https://www.dra27.uk/blog/platform/2026/01/16/dusting-off-the-branches.html" rel="alternate" type="text/html" title="Opening up old release branches" /><published>2026-01-16T00:00:00+00:00</published><updated>2026-01-16T00:00:00+00:00</updated><id>https://www.dra27.uk/blog/platform/2026/01/16/dusting-off-the-branches</id><content type="html" xml:base="https://www.dra27.uk/blog/platform/2026/01/16/dusting-off-the-branches.html"><![CDATA[<p>The spring-cleaning continues! When I originally prototyped Relocatable OCaml,
it was during the OCaml 4.13 development cycle. The focus of the work was
always on multiple versions of the compiler co-existing
without interfering with each other, so even the early prototypes were done on
both OCaml 4.12 and OCaml 4.13. In fact, I see that in <a href="https://discuss.ocaml.org/t/ocaml-cafe-wed-oct-13-1pm-u-s-central/8610">this talk</a>,
<a href="https://youtu.be/RHSdlH4el0g?t=1969">I even demo’d it on both versions</a>. My
intention from the start had always been to be able to provide either backports
or re-releases of older compilers, on the basis that it would be tedious to have
only the latest releases of OCaml supporting the various fixes, given that the
failing CI systems which had motivated the project would continue to test older
versions for several/many years after completion. In 2021, OCaml 4.08 (from
June 2019) was still a recent memory. From a technical perspective, OCaml 4.08
was a very important release. It’s the first version of OCaml with a reliably
namespaced Standard Library (the <code class="language-plaintext highlighter-rouge">Stdlib</code> module, though introduced in 4.07, had
various issues with shadowing modules which weren’t completely addressed until
4.08). For my work, it was the version where we switched the configuration
system to autoconf, and thus introduced a configuration system for the Windows
ports. It provided a natural baseline in 2022 for the backports, and thus the
workshop demonstration I gave in Ljubljana featured Windows and Linux for OCaml
4.08-4.14 as well as a preview of OCaml 5.0.</p>

<p>Wind forwards 3¼ years, and OCaml 4.08 is certainly a distant memory, but the
features in it mean it is very much a baseline at the moment. So, partly
because the work is mostly already done, partly because it’s the last piece of
the Relocatable puzzle, and mainly because I think it will be useful, I’ve been
working towards proposed branches to allow us to make some exceptional releases
in these old branches.</p>

<p>OCaml 5.x means that we have already been maintaining OCaml 4.14 as an LTS
release of OCaml (although the changes coming in OCaml 5.5 mean that we really
will be looking to end that long-term support before too much longer). I’ve
previously talked about <a href="/blog/week-that-was/2025/05/24/wtw-21.html#gha">synchronising the GitHub Actions workflows</a>
and <a href="/blog/platform/2025/07/29/taming-buildkit.html#stack">the crazy shell script that manages the backporting</a>.</p>

<p>So, how do you maintain some 20-or-so in-flight PR branches being occasionally
rebased <em>forwards</em> onto updated versions of OCaml and simultaneously
<em>backported</em> on to, initially, 7 compilers and, by the end, 12 compilers? The
answer is: carefully, and with a lot of automation! ‘Carefully’ entailed setting
some important ground rules: for example, the golden rule was that either the
work was rebased <em>or</em> it was being amended. Never both simultaneously. This rule
meant that while effort may be required to rebase <em>forwards</em>, the backports
remained fixed, which allowed the patch resolutions to re-settle. In order to
ease the automation, it also occasionally required over-backporting, i.e. taking
some changes back to old compilers which were not strictly necessary, but which
meant that the main branches did not need to be extensively re-written.</p>

<p>Anyway, included in this back-porting was enough work to be able to get CI
running, since once we’ve finished support for a given version, the branch is
mostly left to gather dust.</p>

<p><img src="https://www.dra27.uk/assets/2026-01-16/2026-01-16-covered-furniture.jpg" alt="Opening up old release branches" width="75%" height="75%" /></p>

<p><em>Image from <a href="https://uk.pinterest.com/pin/828732768911599733/">Pinterest</a></em>.</p>

<p>Like getting a stately home ready for visitors, the task I spent this week
accomplishing was removing the dust sheets from the CI configurations on these
old branches (4.08-4.13) so that the final backported changes could be properly
tested. The stacks for Relocatable OCaml had these patches ordered as:</p>

<ol>
  <li>Unify CI workflows (affecting 4.08-4.13). We started switching over to GitHub
Actions during the 4.12 dev cycle, but only completed the migration during
the 4.13 dev cycle, so there were long-dead Travis CI scripts in the
4.08-4.12 branches, and the 4.13 branch’s GitHub Actions scripts hadn’t of
course received the updates which the 4.14 ones had.</li>
  <li>macOS arm64 support (4.08, 4.09 and 4.11). The curious gap is because Apple
released the M1 during the 4.11 dev cycle, and it was decided that it was too
late to alter the 4.11.0 release, so a re-release of 4.10 was done with Apple
Silicon support, and then 4.12.0 “naturally” had the support, as the version
of trunk to which it had originally been merged. As it happens, the 4.12
patches apply completely trivially to 4.11 and the 4.10 version of the
patches apply trivially to 4.08 and 4.09, which means that you can get the
complete support matrix when working on Apple Silicon.</li>
  <li>Windows FlexDLL bootstrap overhaul (affecting 4.08-4.12). The main thing here
is <a href="https://github.com/ocaml/ocaml/pull/10135">ocaml/ocaml#10135</a>, which made
the ability to build the Windows ports of OCaml somewhat easier. The lack of
this branch is the reason why the support for Windows in opam-repository at
the moment from <a href="https://github.com/ocaml/opam-repository/pull/25861">ocaml/opam-repository#25861</a>
only goes back as far as OCaml 4.13. It’s not because it’s impossible, it just
requires a lot of hardening in the opam packages to work around various
deficiencies, and it’s actually just a configuration/build system change
which backports really very easily as far as 4.08 (getting 4.07 and earlier
working in opam on Windows is a <a href="https://github.com/metastack/ocaml-legacy">daftness for another day</a>).</li>
  <li>Required backports for Relocatable OCaml (actual changes in the
runtime/compiler which must be present to support the main PRs).</li>
  <li>runtime-launch-info backport (from OCaml 5.2; <a href="https://github.com/ocaml/ocaml/pull/12751">ocaml/ocaml#12751</a>)</li>
  <li>compiled-primitives backport (from OCaml 5.3; <a href="https://github.com/ocaml/ocaml#12893">ocaml/ocaml#12896</a>)</li>
  <li>in-prefix-test harness (from OCaml 5.5; <a href="https://github.com/ocaml/ocaml/pull/14014">ocaml/ocaml#14014</a>)</li>
</ol>

<p>Number 1 is exactly where I wanted it, but unfortunately the test harness sits
on top of a large number of other changes, mainly because it was actually
written well after all of them! So I got out my Git <a href="https://en.wikipedia.org/wiki/Mj%C3%B6lnir">hammer</a>
and began to re-forge the commit series. It’s quite a set, but at the end of it,
the difference between 4.08 and 4.14 in <code class="language-plaintext highlighter-rouge">appveyor.yml</code>, <code class="language-plaintext highlighter-rouge">.github/workflows</code> and
<code class="language-plaintext highlighter-rouge">tools/ci</code> is a small set of trivially justifiable changes relating to actual
differences between those two releases (for example, OCaml 4.08 still had the
graphics library to test!). Ignoring changes that affect the build, configure
and test systems only, it requires just three PRs to resurrect that branch:</p>
<ul>
  <li><a href="https://github.com/ocaml/ocaml/pull/9557">ocaml/ocaml#9557</a> - a necessary
compilation change on IBM POWER</li>
  <li><a href="https://github.com/ocaml/ocaml/pull/9981">ocaml/ocaml#9981</a> - a necessary
compilation change for the BSDs</li>
  <li><a href="https://github.com/ocaml/ocaml/pull/12577">ocaml/ocaml#12577</a> - a series of
C17 compatibility changes needed for compilation with recent GCC/Clang</li>
</ul>

<p>… and that’s it!</p>

<p>On top of that unified CI, it’s then possible to backport the test harness,
which then provides reliable installation testing for all the relocatable
features. I’m almost certainly strange, but this was kind of fun to backport. By
the time it gets back to 4.08, the commit message for the harness itself reads:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Deviations from original:
  - C stub added to the harness instead of backporting
    caml_sys_proc_self_exe
  - Compmisc.reinit_path used instead of updating Compmisc.init_path
  - Config.c_compiler_vendor avoided by instead exposing the constant as
    Toolchain.c_compiler_vendor
  - Config.asm_dwarf_version folded into toolchain.ml.in
  - Config.shebangscripts avoided by instead adding an additional
    --with- option to the harness and threading $(SHEBANGSCRIPTS) from
    the build system
  - Required parts of Bytelink.read_runtime_launch_info directly added
    to the test harness, instead of exposing the machinery in Bytelink.
  - Misc.Stdlib.String.to_utf_8_seq added directly to TestRelocation.
  - ocamlmklib -M exits with code 0
  - Standard Library functions added:
    - Bytes.{get_utf_8_uchar,set_utf_16le_uchar} (4.14)
    - Char.Ascii.is_letter (5.4)
    - In_channel.{fold,input}_lines (5.1)
    - In_channel.input_all (4.14)
    - In_channel.really_input_bigarray (5.2)
    - In_channel.{set_binary_mode,with_open_bin} (4.14)
    - Int.max (4.13)
    - List.{drop,take}_while (5.3)
    - List.find_map (4.10)
    - List.fold_left_map (4.11)
    - Option.exists (5.5)
    - Out_channel.{with_open_bin,with_open_text} (4.14)
    - Result.Syntax (5.4)
    - String.{starts,ends}_with (4.13)
    - Sys.{mkdir,rmdir} (4.12)
    - Sys.signal_to_string (5.4)
    - Uchar.utf_{decode_uchar,decode_length,16_byte_length} (4.14)
    - Unix.realpath (4.13)
  - OCaml 5.4 labelled tuples removed
  - Added the ability for Build/Prefix/Relative to be optional in
    TestRelocation
  - Prior to 5.4.0, ocamlrun -config didn't include entries in the
    search path coming from CAML_LD_LIBRARY_PATH
  - Prior to 5.3.0, the compiler added +flexdll to the search path even
    when -nostdlib is specified, which affects the relocation test when
    bootstrapping FlexDLL.
  - Prior to 5.3.0, ocamlcmt and ocamlobjinfo didn't support -vnum. The
    simplest thing in this instance is simply to skip the test of their
    binaries.
  - Prior to 5.3.0, the dynlink library included a complete copy of the
    Config module, which affects the relocation test.
  - Prior to 5.2.0, RNTM isn't always written to bytecode images when
    compiling with -custom. Heuristic altered to analyse the DLLS
    section instead.
  - Prior to 4.14.0, errors in #load don't create detectable script
    errors: toplevel test adapted to cope with this
  - Prior to 4.13.0, the MSVC port didn't translate -lfoo automatically
  - Adapt the toplevel test to deal with the presence of compiler
    plugins in ocamlnat
  - Link test updated to deal with different runtime behaviour for
    errors and the lack of -output-complete-exe, both of which changed
    in 4.10.0
  - Build system adapted (Makefile.common macros didn't support C files
    until OCaml 5.2; VPATH wasn't used until OCaml 5.1; win32unix/unix
    distinction until OCaml 5.0)
  - Misc.Style changed to Misc.Color (changed in OCaml 5.2) and uses of
    the inline_code and hint formatting changed to loc (since
    inline_code was also added in OCaml 5.2 and hint OCaml 5.1)
  - Removed compressed marshalling support, since that was added in
    OCaml 5.1
  - Read camlheader file instead of runtime-launch-info, and adapt the
    relocation test to deal with incorrect naming of two of the header
    files (fixed in 4.09.0)
  - OCAML_FLEXLINK scrubbed from the environment (removed in 5.2)
  - Updated to deal with all bytecode binaries being installed with
    debug information
  - Updated to deal with all bytecode/native versions of tools being
    installed (changed in 5.1)
  - Updated to deal with old installation layout for otherlibs
  - Updated handling for the bigarray library
  - Updated relocation test to deal with different use of debug flags
    from OCaml 4.12+
  - Order of rules in testsuite/in_prefix/Makefile.test updated for
    compatibility with the aged version of GNU make shipped on macOS
    (pattern rules triggered in the wrong order otherwise)
</code></pre></div></div>
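
<p>To give a flavour of the “Standard Library functions added” entries, here is a minimal sketch of the kind of shim involved. It is not the code from the harness itself: the <code class="language-plaintext highlighter-rouge">Compat</code> module name is an assumption made purely for illustration, and the two functions simply mirror <code class="language-plaintext highlighter-rouge">List.find_map</code> (4.10) and <code class="language-plaintext highlighter-rouge">String.starts_with</code> (4.13) using only what OCaml 4.08 already provides:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hypothetical shim module: the name Compat is illustrative only. *)
module Compat = struct
  (* List.find_map has been in the Standard Library since OCaml 4.10 *)
  let rec find_map f = function
    | [] -&gt; None
    | x :: rest -&gt;
        (match f x with
         | Some _ as result -&gt; result
         | None -&gt; find_map f rest)

  (* String.starts_with has been in the Standard Library since OCaml 4.13 *)
  let starts_with ~prefix s =
    let len = String.length prefix in
    String.length s &gt;= len &amp;&amp; String.sub s 0 len = prefix
end

let () =
  assert (Compat.starts_with ~prefix:"lib" "libunix.a");
  assert (Compat.find_map int_of_string_opt ["a"; "42"; "7"] = Some 42)
</code></pre></div></div>

<p>Keeping the same names and signatures as the later Standard Library means that code written against the shim reads the same on every branch, which is presumably part of what keeps this sort of backport mechanical.</p>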

<p>Next-up, having removed all those dust sheets: finalising the actual backports!</p>]]></content><author><name></name></author><category term="blog" /><category term="platform" /><category term="ocaml" /><category term="relocatable" /><summary type="html"><![CDATA[The spring-cleaning continues! When I originally prototyped Relocatable OCaml, it was during the OCaml 4.13 development cycle. The focus for the work originally was always about multiple versions of the compiler co-existing without interfering with each other, so even the early prototypes were done on both OCaml 4.12 and OCaml 4.13. In fact, I see that in this talk, I even demo’d it on both versions. My intention from the start had always been to be able to provide either backports or re-releases of older compilers, on the basis that it would be tedious to have only the latest releases of OCaml supporting the various fixes, given that the failing CI systems which had motivated the project would continue to test older versions for several/many years after completion. In 2021, OCaml 4.08 (from June 2019) was still a recent memory. From a technical perspective, OCaml 4.08 was a very important release. It’s the first version of OCaml with a reliably namespaced Standard Library (the Stdlib module, though introduced in 4.07, had various issues with shadowing modules which weren’t completely addressed until 4.08). For my work, it was the version where we switched the configuration system to autoconf, and thus introduced a configuration system for the Windows ports. 
It provided a natural baseline in 2022 for the backports, and thus the workshop demonstration I gave in Ljubljana featured Windows and Linux for OCaml 4.08-4.14 as well as preview of OCaml 5.0.]]></summary></entry><entry><title type="html">Forging compilers in opam</title><link href="https://www.dra27.uk/blog/platform/2026/01/11/ocaml-config.html" rel="alternate" type="text/html" title="Forging compilers in opam" /><published>2026-01-11T00:00:00+00:00</published><updated>2026-01-11T00:00:00+00:00</updated><id>https://www.dra27.uk/blog/platform/2026/01/11/ocaml-config</id><content type="html" xml:base="https://www.dra27.uk/blog/platform/2026/01/11/ocaml-config.html"><![CDATA[<p>As we settle into 2026, I have been doing a little early spring-cleaning. A few
years ago, we had a slightly chaotic time in opam-repository over what should
have been a migration from gforge.inria.fr to a new <a href="https://gitlab.inria.fr/">GitLab instance</a>.
Unfortunately, some release archives effectively disappeared from official
locations, and although the content was available elsewhere, the precise
archives weren’t generally available, which is a problem for the checksums in
opam files. We’ve had <a href="https://github.com/ocaml/opam-repository/pull/23849">similar problems</a>
<a href="https://github.com/ocaml/opam-repository/pull/10182">with GitHub</a> in the past.
As a ‘temporary solution’, <a href="https://github.com/avsm">@avsm</a> created <a href="https://github.com/ocaml/opam-source-archives/commit/bad1a54d3e64267a3e5b5e9e13083d55cc176307">ocaml/opam-source-archives</a>
to house copies of these archives (I think it’s a somewhat prescient sha for
that first commit!). As is so often the case with temporary solutions, it’s
grown somewhat. Rather against my personal better judgement, the repo got used
to house files which used to be shipped as part of <a href="https://github.com/ocaml/opam-repository">ocaml/opam-repository</a>.
Removing the files from the repository was a good change, because they were
always being shipped as part of <code class="language-plaintext highlighter-rouge">opam update</code>, but unfortunately moving them to
an “archive” repository has made it rather too tempting to add <em>new</em> files,
making an archive repository a primary source.</p>

<p>Back in September, as part of Relocatable OCaml, I needed to update the
ocaml-config package, which houses one of the plumbing scripts used for the
<a href="https://opam.ocaml.org/packages/opam"><code class="language-plaintext highlighter-rouge">ocaml</code></a> package in opam. Separately, for
opam’s CI systems, we wanted to be able to test against trunk OCaml, which
implied <a href="https://github.com/ocaml/opam-repository/issues/23515">some updates to the plumbing</a>
for <a href="https://opam.ocaml.org/packages/ocaml-system"><code class="language-plaintext highlighter-rouge">ocaml-system</code></a> as well. The
right thing to do with these scripts which had lived in opam-repository is to
push them back upstream, which was what I did with a cute piece of Git
spelunking in <a href="https://github.com/ocaml/ocaml/pull/14351/commits">ocaml/ocaml#14351</a>,
which contains commits with files cherry-picked from <a href="https://github.com/ocaml/opam-repository">ocaml/opam-repository</a>
PRs. Each commit contains a reference to an opam-repository commit, which in
turn leads to the original PR. For example, <a href="https://github.com/ocaml/ocaml/commit/d0272f845e90f8280e821f90382b3c189af00ea6">ocaml/ocaml#d0272f8</a>
copies the original files from <a href="https://github.com/ocaml/opam-repository/commit/1bab4537a2f3b2328d771e1bb50b0d0268b0798a">ocaml/opam-repository#1bab453</a>.
The neat thing with putting them into a series under a single path in OCaml now
is what then happens with subsequent changes. For example, <a href="https://github.com/ocaml/opam-repository/pull/17541">ocaml/opam-repository#17541</a>
introduced the <code class="language-plaintext highlighter-rouge">ocaml-config.2</code> package, which was a completely fresh script,
but <a href="https://github.com/ocaml/ocaml/commit/749a918bb85033b0a6370b85a7c6a4be33620c58">ocaml/ocaml#749a918</a>
is instead able to show the actual diff of the script, allowing for much easier
review and so forth. Of course, git doesn’t store patches, so the really useful
part is that although the history is different, the <em>file</em> in each commit is
exactly as in the original commit, which allowed <a href="https://github.com/ocaml/opam-repository/pull/29080">ocaml/opam-repository#29080</a>
just to update the URLs to point to these <em>authoritative</em> upstream sources.</p>

<p>So far, so good - that had all been merged before Christmas. The support for
explicit-relative paths in <code class="language-plaintext highlighter-rouge">ld.conf</code> added in <a href="https://github.com/ocaml/ocaml/pull/14243">ocaml/ocaml#14243</a>
required an update to this script, as noted in <a href="https://github.com/ocaml/opam-repository/pull/29085">ocaml/opam-repository#29085</a>,
since <code class="language-plaintext highlighter-rouge">opam var ocaml:stubsdir</code> for OCaml 5.5 and onwards was giving an
erroneous <code class="language-plaintext highlighter-rouge">../stublibs:./stublibs:.</code> as it didn’t know to translate the paths
read from <code class="language-plaintext highlighter-rouge">ld.conf</code> as being relative to <code class="language-plaintext highlighter-rouge">ld.conf</code> itself. That an update was
required anyway gave me opportunity to fix three other oddities in that script:</p>

<ol>
  <li>The script gets installed to users’ opam switches, rather than used directly
as part of the build.</li>
  <li>There were several versions of it, and there didn’t need to be.</li>
  <li>The use of opam’s <code class="language-plaintext highlighter-rouge">substs</code> mechanism made it difficult to debug outside of
opam.</li>
</ol>
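
<p>As for the <code class="language-plaintext highlighter-rouge">ld.conf</code> translation itself, the idea can be sketched roughly as follows. This is an illustration under stated assumptions rather than the code in the real script: the function names are hypothetical, “explicit-relative” is taken to mean a relative path which begins with <code class="language-plaintext highlighter-rouge">./</code> or <code class="language-plaintext highlighter-rouge">../</code> (which is what <code class="language-plaintext highlighter-rouge">Filename.is_implicit</code> rejects), the <code class="language-plaintext highlighter-rouge">:</code> separator assumes a Unix-like system, and no path normalisation is attempted:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Illustrative only: these names are hypothetical and are not taken from
   the real script. *)

(* Assumption: explicit-relative means relative and beginning ./ or ../ *)
let is_explicit_relative path =
  Filename.is_relative path &amp;&amp; not (Filename.is_implicit path)

(* Interpret an ld.conf entry relative to the directory containing ld.conf;
   absolute and implicit entries are returned unchanged. *)
let resolve_entry ~ld_conf_dir entry =
  if is_explicit_relative entry then Filename.concat ld_conf_dir entry
  else entry

(* Join the resolved entries into a single search path (Unix separator). *)
let stubsdir ~ld_conf_dir entries =
  String.concat ":" (List.map (resolve_entry ~ld_conf_dir) entries)

let () =
  print_endline
    (stubsdir ~ld_conf_dir:"/home/me/.opam/5.5/lib/ocaml"
       ["../stublibs"; "/usr/lib/ocaml/stublibs"])
</code></pre></div></div>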

<p>This work had all been left in <a href="https://github.com/ocaml/ocaml/pull/14250">ocaml/ocaml#14250</a>
<a href="https://github.com/ocaml/ocaml/commits/78ca20415620060c279ed74a0ea5b22c06d7e548">in December</a>
and was straightforward enough. The ocaml-config package had only come into
existence in <a href="https://github.com/ocaml/opam-repository/pull/11928">ocaml/opam-repository#11928</a>
in order to stop the same script being stored multiple times in the same
repository. Once <a href="https://github.com/ocaml/opam-repository/pull/25960">ocaml/opam-repository#25960</a>
reduced that single copy to zero copies, it made more sense for each ocaml
package just to download the script, and do away with the ocaml-config
package completely. Unifying the scripts and turning them into regular OCaml
scripts flows logically from that, although the final lines amuse me somewhat:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">oc</span> <span class="o">=</span> <span class="n">open_out</span> <span class="n">package_config_file</span> <span class="k">in</span>
  <span class="c">(* Quoted strings need OCaml 4.02; "\ " needs OCaml 3.09! *)</span>
  <span class="nn">Printf</span><span class="p">.</span><span class="n">fprintf</span> <span class="n">oc</span> <span class="s2">"\
    opam-version: </span><span class="se">\"</span><span class="s2">2.0</span><span class="se">\"\n</span><span class="s2">\
    variables {</span><span class="se">\n</span><span class="s2">  \
</span></code></pre></div></div>
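
<p>The comment in that snippet refers to two lexical conventions for string literals: a backslash at the end of a line continues the literal and skips the leading blanks on the next line, while a backslash followed by a space is a literal space which survives that skipping. A small sketch of both (the names <code class="language-plaintext highlighter-rouge">banner</code> and <code class="language-plaintext highlighter-rouge">indented</code> are purely illustrative and do not come from the script):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Backslash-newline joins the lines and drops the indentation. *)
let banner = "opam-version: \"2.0\"\n\
              variables {\n\
              }\n"

(* "\ " keeps one leading space which would otherwise be skipped. *)
let indented = "key:\
                \ value"

let () =
  assert (banner = "opam-version: \"2.0\"\nvariables {\n}\n");
  assert (indented = "key: value")
</code></pre></div></div>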

<p>This week, I updated that commit series to do the same kind of thing to the
ocaml-system script. These are now just regular OCaml scripts (in <a href="https://github.com/ocaml/ocaml/tree/trunk/tools/opam">tools/opam</a>
in <a href="https://github.com/ocaml/ocaml">ocaml/ocaml</a>) which can be run directly and
that allowed me to update <a href="https://github.com/ocaml/ocaml/pull/14354">ocaml/ocaml#14354</a>
which should allow us to offer an ocaml-system package <em>during</em> development
and release cycles. The autoconf stuff might get revisited in that: cute, but
potentially annoying (the joy of higher order meta-programming… autoconf
generates and updates those files as part of the process of generating
<code class="language-plaintext highlighter-rouge">configure</code>, rather than as part of running <code class="language-plaintext highlighter-rouge">configure</code> itself… what could
possibly go wrong!). There’s also <a href="https://github.com/ocaml/ocaml/pull/14355">ocaml/ocaml#14355</a>
which allows a similar trick to be done with custom compilers. The point here is
that you build a non-standard compiler and install it outside of opam and this
then provides a relatively simple mechanism for opam to be able to use it.
Although “system” compilers are a bit awkward to use from distributions,
because opam interacts very badly with system-installed OCaml packages, if the
<em>only</em> thing you’re installing at a system-level is the compiler, the support is
very good.</p>

<p>The final thing to deal with was a very silly ToDo item I’d added:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>- [ ] `tools/opam/Layout.md`
</code></pre></div></div>

<p>In the excitement of getting the other Relocatable OCaml PRs opened in
September, in a brief departure from sanity, I had decided some kind of documentation
ought to be in place before <a href="https://github.com/ocaml/ocaml/pull/14250">ocaml/ocaml#14250</a>
was considered for merging 😱 Days turned into weeks, weeks turned into months.
And then, one not-so-very special day, I went to my laptop, I sat down, and I
wrote our compiler packaging story. A story about package names, a story about
compiler variants, a story about ancient libraries, long forgotten. But above
all things, a story about OCaml.</p>

<p>Which is mercifully now merged.</p>]]></content><author><name></name></author><category term="blog" /><category term="platform" /><category term="ocaml" /><category term="opam" /><category term="relocatable" /><summary type="html"><![CDATA[As we settle into 2026, I have been doing a little early spring-cleaning. A few years ago, we had a slightly chaotic time in opam-repository over what should have been a migration from gforge.inria.fr to a new GitLab instance. Unfortunately, some release archives effectively disappeared from official locations, and although the content was available elsewhere, the precise archives weren’t generally available, which is a problem for the checksums in opam files. We’ve had similar problems with GitHub in the past. As a ‘temporary solution’, @avsm created ocaml/opam-source-archives to house copies of these archives (I think it’s a somewhat prescient sha for that first commit!). As is so often the case with temporary solutions, it’s grown somewhat. Rather against my personal better judgement, the repo got used to house files which used to be shipped as part of ocaml/opam-repository. Removing the files from the repository was a good change, because they were always being shipped as part of opam update, but unfortunately moving them to an “archive” repository has made it rather too tempting to add new files, making an archive repository a primary source.]]></summary></entry><entry><title type="html">It’s merged!!</title><link href="https://www.dra27.uk/blog/platform/2025/12/17/its-merged.html" rel="alternate" type="text/html" title="It’s merged!!" 
/><published>2025-12-17T00:00:00+00:00</published><updated>2025-12-17T00:00:00+00:00</updated><id>https://www.dra27.uk/blog/platform/2025/12/17/its-merged</id><content type="html" xml:base="https://www.dra27.uk/blog/platform/2025/12/17/its-merged.html"><![CDATA[<p>As I was very happy to <a href="https://discuss.ocaml.org/t/volunteers-to-review-the-relocatable-ocaml-work/16667/11">announce on Discuss</a>
on 12 December, <a href="https://github.com/ocaml/ocaml/commit/7e71861a1ac1ea472104f40168f42133da02096c">OCaml is Relocatable</a>!
Today, the final piece of the puzzle was merged, which is the necessary support
to allow opam to take advantage of all this to be able to clone switches instead
of recompiling them. Before this, you could <em>rename</em> a local switch, for
example, but opam would still be compiling them from scratch. The fruits of this
continue to be available <a href="https://discuss.ocaml.org/t/relocatable-ocaml/17253">through my opam-repository fork</a>,
as <a href="/blog/platform/2025/09/15/relocatable-ocaml.html">announced in September</a>, and
will be available out-of-the-box for the first alpha release of OCaml 5.5 early
next year.</p>

<p>I am particularly thrilled, and just a little surprised, that Relocatable OCaml
has happened as I envisioned, with nothing needing to be dropped. There were of
course many useful suggestions for re-working to improve clarity and so forth
during review, but the test harnesses, RFCs, talks, and so forth appear to have
provided me with the required knife to work out correctly what was <em>necessary</em>
to achieve all this!</p>

<p>For posterity, here’s the timeline! On 4 May, <a href="https://github.com/ocaml/ocaml/pull/14014">ocaml/ocaml#14014</a>
introduced a comprehensive test harness for the <a href="https://github.com/ocaml/RFCs/pull/53">Relocatable OCaml RFC</a>,
which was reviewed by <a href="https://github.com/nojb">Nicolás Ojeda Bär</a> and
<a href="https://github.com/OlivierNicole">Olivier Nicole</a> and merged on 21 June.</p>

<p>Finalising the PRs took me a little while longer than I’d expected, and the set
of 4 PRs, along with a fifth “combined” demonstration <a href="https://github.com/ocaml/ocaml/pulls?q=is%3Apr+author%3Adra27+created%3A2025-09-15">were opened on
15 September</a>.
Over the coming weeks, <a href="https://github.com/jonahbeckford">Jonah Beckford</a>,
<a href="https://github.com/MisterDA">Antonin Décimo</a>, <a href="https://github.com/hhugo">Hugo Heuzard</a>,
<a href="https://github.com/shym">Samuel Hym</a> and <a href="https://github.com/lthls">Vincent Laviron</a>
pored over the details. Their invaluable review then set the scene for a
marathon synchronous <a href="https://bsky.app/profile/dra27.uk/post/3m6htzdhxok2f">defence of the branches</a>
in Paris with <a href="https://github.com/damiendoligez">Damien Doligez</a> on 25 November.
I’m very grateful to everyone involved in these reviews: a lot of code; and a
lot of gnarly details!</p>

<p>That left a small todo list for each PR. I managed to coerce a different core
maintainer for each, with <a href="https://github.com/Octachron">Florian Angeletti</a>
merging <a href="https://github.com/ocaml/ocaml/pull/14243">ocaml/ocaml#14243</a> on
27 November, <a href="https://github.com/gasche">Gabriel Scherer</a> merging
<a href="https://github.com/ocaml/ocaml/pull/14244">ocaml/ocaml#14244</a> on 8 December,
Nicolás merging <a href="https://github.com/ocaml/ocaml/pull/14245">ocaml/ocaml#14245</a>
on 12 December and then finally <a href="https://github.com/NickBarnes">Nick Barnes</a>
merging <a href="https://github.com/ocaml/ocaml/pull/14246">ocaml/ocaml#14246</a> today.</p>

<p>That means that OCaml’s trunk branch now matches the bulk of <a href="https://github.com/ocaml/ocaml/pull/14247">ocaml/ocaml#14247</a>.
What’s next? There’s a small amount of work still to do on the scripts which
plumb the ocaml opam package together. Then, having now got everything merged,
it’ll be time to finalise the backports for proposed re-releases of the older
compilers.</p>

<p>But, for now, the future’s very definitely relocatable! 🥳</p>]]></content><author><name></name></author><category term="blog" /><category term="platform" /><category term="ocaml" /><category term="relocatable" /><summary type="html"><![CDATA[As I was very happy to announce on Discuss on 12 December, OCaml is Relocatable! Today, the final piece of the puzzle was merged, which is the necessary support to allow opam to take advantage of all this to be able to clone switches instead of recompiling them. Before this, you could rename a local switch, for example, but opam would still be compiling them from scratch. The fruits of this continue to be available through my opam-repository fork, as announced in September, and will be available out-of-the-box for the first alpha release of OCaml 5.5 early next year.]]></summary></entry><entry><title type="html">If it ain’t broke, …</title><link href="https://www.dra27.uk/blog/platform/2025/12/01/if-it-aint-broke.html" rel="alternate" type="text/html" title="If it ain’t broke, …" /><published>2025-12-01T00:00:00+00:00</published><updated>2025-12-01T00:00:00+00:00</updated><id>https://www.dra27.uk/blog/platform/2025/12/01/if-it-aint-broke</id><content type="html" xml:base="https://www.dra27.uk/blog/platform/2025/12/01/if-it-aint-broke.html"><![CDATA[<p>Someone was kind enough a few weeks ago to comment on <a href="https://news.ycombinator.com/item?id=45855726">Hacker News</a>
that “Opam on Windows is a masterpiece of engineering”. There’s certainly a lot
which has to go on under the hood to create what I believe is the necessary
baseline experience. Unfortunately, it doesn’t <em>always</em> work out.</p>

<p>10 days ago, a new user popped up on our <a href="https://discuss.ocaml.org/t/when-following-install-instructions-for-windows-topkg-and-ocp-indent-fail-to-compile/17514">Discourse forum</a>
with a failure of some core packages to work on Windows. The OP was using MSYS2,
which at the moment has slightly less support than the recommended Cygwin, but
it looked to me like either my <a href="https://github.com/ocaml/opam-repository/pull/28897">ocaml/opam-repository#28897</a>
or <a href="https://github.com/ocaml/ocamlfind/pull/112">ocaml/ocamlfind#112</a> ought to
fix things.</p>

<p>Except it didn’t, and it wasn’t until today that I was able to reproduce the
problem, and figure out what was going on. As it happens, once I could reproduce
it, this was easy for <em>me</em> to work out, but it shines an interesting light on to
a subtle bit of opam’s Windows internals, and consequently on to something that
either the OP or a packager has misunderstood. It also finally goads me into
standing on my soapbox and shouting from the rafters:</p>

<p><strong>Code should be changed if and only if a bug is being fixed or an actual
user-facing feature is being implemented! The risk of regression ALWAYS
outweighs the benefit of refactoring UNLESS you’re actually changing
something else.</strong></p>

<p>The target of my particular ire is the wonderful <a href="https://www.shellcheck.net/">ShellCheck</a>
tool. If you don’t use it already, then:</p>

<ol>
  <li>You don’t have to write shell scripts, and should continue not having to
write shell scripts for as long as possible! However, you may have to one
day, so:</li>
  <li>You should absolutely start using it <em>on the next shell script you have to
write</em>. You should also use it to check changes you make to existing scripts,
too.</li>
</ol>

<p>But what I’m finally going to come out and just say, is that you should <em>never</em>
<strong>ever</strong> change a shell script just to please ShellCheck unless you actually
have a bug report. If ShellCheck grumbles and warns something <em>might</em> cause a
bug, then take it as a challenge - figure out the scenario in which that could
be triggered, test it, fix it. If, as is often the case, it’s warning about an
unhygienic construct which <em>could</em> go wrong but actually never will because, for
example, the file names concerned don’t contain spaces and never will, then
leave that script alone!</p>

<p>What was actually going on here? The OP was seeing this effect when trying to
install <a href="https://ocaml.org/p/topkg/latest">topkg</a>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C:\Users\DRA&gt;opam install topkg
The following actions will be performed:
=== install 1 package
  ∗ topkg      1.1.1

&lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;  🐫
⬇ retrieved topkg.1.1.1  (https://opam.ocaml.org/cache)
[ERROR] The compilation of topkg.1.1.1 failed at "ocaml pkg/pkg.ml build --pkg-name topkg --dev-pkg false".

#=== ERROR while compiling topkg.1.1.1 ========================================#
# context     2.5.0 | win32/x86_64 | ocaml.4.14.2 | https://opam.ocaml.org#9bc1691da5d727ede095f83c019b92087a7da33e
# path        C:\Devel\Roots\msys2-testing-native\default\.opam-switch\build\topkg.1.1.1
# command     C:\Devel\Roots\msys2-testing-native\default\bin\ocaml.exe pkg/pkg.ml build --pkg-name topkg --dev-pkg false
# exit-code   125
# env-file    C:\Devel\Roots\msys2-testing-native\log\topkg-8308-179b77.env
# output-file C:\Devel\Roots\msys2-testing-native\log\topkg-8308-179b77.out
### output ###
# Exception: Fl_package_base.No_such_package ("findlib", "").
</code></pre></div></div>

<p>This error is readily seen when using OCaml 5.x on MSYS2. It’s not a problem
with topkg, it’s actually that the <a href="https://ocaml.org/p/ocamlfind/latest">ocamlfind</a>
library manager is not installed correctly. There are fixes pending for that in
<a href="https://github.com/ocaml/ocamlfind/pull/112">ocaml/ocamlfind#112</a>, but the
issue there is to do with OCaml <strong>5.x</strong> - this build is OCaml 4.14.2.</p>

<p>The actual issue is readily apparent if we rebuild ocamlfind:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C:\Users\DRA&gt;opam reinstall ocamlfind -v
The following actions will be performed:
=== recompile 1 package
  ↻ ocamlfind 1.9.8

&lt;&gt;&lt;&gt; Processing actions &lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;&lt;&gt;  🐫
⬇ retrieved ocamlfind.1.9.8  (cached)
+ C:\Devel\Roots\msys2-testing-native\default\.opam-switch\build\ocamlfind.1.9.8\./configure "-bindir" "C:\\Devel\\Roots\\msys2-testing-native\\default\\bin" "-sitelib" "C:\\Devel\\Roots\\msys2-testing-native\\default\\lib" "-mandir" "C:\\Devel\\Roots\\msys2-testing-native\\default\\man" "-config" "C:\\Devel\\Roots\\msys2-testing-native\\default\\lib/findlib.conf" "-no-custom" "-no-camlp4" (CWD=C:\Devel\Roots\msys2-testing-native\default\.opam-switch\build\ocamlfind.1.9.8)
- Welcome to findlib version 1.9.8
- Configuring core...
&lt;snip snip snip&gt;
- Configuration for threads written to site-lib-src/threads/META
- Configuration for str written to site-lib-src/str/META
- Configuration for bytes written to site-lib-src/bytes/META
- Access denied - SRC
- File not found - -NAME
- File not found - META.IN
- File not found - -TYPE
- File not found - F
- File not found - -EXEC
- File not found - SH
- File not found - -C
- File not found - SED -E 'S/@VERSION@/1.9.8/G'         -E 'S/@REQUIRES@//G'    \$1\
- File not found -  &gt; \${1%.IN}\
- File not found - SH
- File not found - {}
- File not found - ;
- Detecting compiler arguments: (extractor built) ok
</code></pre></div></div>

<p>What’s going on at the end of that snippet? It turns out ocamlfind’s <code class="language-plaintext highlighter-rouge">configure</code>
script contains a call to what it expects to be POSIX <a href="https://pubs.opengroup.org/onlinepubs/9799919799/utilities/find.html"><code class="language-plaintext highlighter-rouge">find</code></a>,
but what it’s actually getting is the Windows <code class="language-plaintext highlighter-rouge">find</code> utility. This problem is as
old as the hills, and affects <code class="language-plaintext highlighter-rouge">sort</code> as well:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C:\Users\DRA&gt;find /?
Searches for a text string in a file or files.

FIND [/V] [/C] [/N] [/I] [/OFF[LINE]] "string" [[drive:][path]filename[ ...]]

  /V         Displays all lines NOT containing the specified string.
  /C         Displays only the count of lines containing the string.
  /N         Displays line numbers with the displayed lines.
  /I         Ignores the case of characters when searching for the string.
  /OFF[LINE] Do not skip files with offline attribute set.
  "string"   Specifies the text string to find.
  [drive:][path]filename
             Specifies a file or files to search.

If a path is not specified, FIND searches the text typed at the prompt
or piped from another command.
</code></pre></div></div>

<p>versus:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C:\Users\DRA&gt;C:\msys64\usr\bin\find --version
find (GNU findutils) 4.10.0
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later &lt;https://gnu.org/licenses/gpl.html&gt;.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Eric B. Decker, James Youngman, and Kevin Dalley.
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION FTS(FTS_CWDFD) CBO(level=1)
</code></pre></div></div>

<p>Now, it turns out that the OP’s system is misconfigured - which is part of the
reason why <a href="https://cygwin.com">Cygwin</a>, <a href="https://www.msys2.org">MSYS2</a> and
<a href="https://gitforwindows.org/">Git for Windows</a> don’t put their Unix shell
utilities into <code class="language-plaintext highlighter-rouge">PATH</code> by default. The Unix utilities are in <code class="language-plaintext highlighter-rouge">C:\msys64\usr\bin</code>,
but this entry in <code class="language-plaintext highlighter-rouge">PATH</code> appears after <code class="language-plaintext highlighter-rouge">C:\WINDOWS\system32</code>, so a few
utilities get shadowed. For things like <code class="language-plaintext highlighter-rouge">curl</code> and <code class="language-plaintext highlighter-rouge">tar</code>, this is <em>usually</em>
benign, but for <code class="language-plaintext highlighter-rouge">find</code> and <code class="language-plaintext highlighter-rouge">sort</code> it’s pretty terminal!</p>
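<p>To see which copy of a utility wins on a given machine, a small check along
these lines can help. This is only a sketch: <code class="language-plaintext highlighter-rouge">first_in_path</code> is a hypothetical
helper (it is not part of opam or ocamlfind), and it assumes it is being run
from <code class="language-plaintext highlighter-rouge">sh</code>, where <code class="language-plaintext highlighter-rouge">PATH</code> is presented as a colon-separated list.</p>

```shell
# first_in_path NAME PATHLIST
# Prints the first directory in the colon-separated PATHLIST which contains an
# executable file called NAME; returns 1 if no directory does.
# Hypothetical helper for illustration. POSIX sh has no local variables, so
# name, dir and old_ifs are globals.
first_in_path() {
  name=$1
  old_ifs=$IFS
  IFS=:
  set -f                        # no globbing while splitting on colons
  for dir in $2; do             # deliberately unquoted: split on IFS
    [ -n "$dir" ] || dir=.      # an empty entry means the current directory
    if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
      IFS=$old_ifs; set +f
      printf '%s\n' "$dir"
      return 0
    fi
  done
  IFS=$old_ifs; set +f
  return 1
}

# Which find would a configure script actually pick up?
first_in_path find "$PATH" || echo 'no find in PATH'
```

<p>If that reports a Windows system directory rather than the Cygwin or MSYS2
<code class="language-plaintext highlighter-rouge">bin</code> directory, the shadowing described above is in play.</p>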

<p>In its default mode, opam manages the Cygwin or MSYS2 environment for you, and
handles this itself. It’s part of a mildly complex process in <a href="https://github.com/ocaml/opam/pull/5832">ocaml/opam#5832</a>:
opam ensures that either the Cygwin or MSYS2 <code class="language-plaintext highlighter-rouge">bin</code> directory is just “left”
enough in <code class="language-plaintext highlighter-rouge">PATH</code> that some key utilities are not shadowed, but not so
far to the left that it shadows things like Git for Windows!</p>
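<p>The idea of being just “left” enough can be sketched in a few lines of shell.
To be clear, this is <em>not</em> opam’s logic (that is written in OCaml and is rather
more involved): <code class="language-plaintext highlighter-rouge">insert_before</code> is a hypothetical helper, the paths are made
up, and the assumption that the right place is immediately before the Windows
system directory is mine, purely for illustration.</p>

```shell
# insert_before NEW ANCHOR PATHLIST
# Prints PATHLIST with NEW inserted immediately before the first entry equal
# to ANCHOR, or appended at the end if ANCHOR is not present.
# Hypothetical helper: the comparison is exact (real Windows paths are
# case-insensitive) and empty PATH entries are not handled specially.
insert_before() {
  new=$1 anchor=$2 result= placed=no
  old_ifs=$IFS
  IFS=:
  set -f                        # no globbing while splitting on colons
  for dir in $3; do             # deliberately unquoted: split on IFS
    if [ "$placed" = no ] && [ "$dir" = "$anchor" ]; then
      result=${result:+$result:}$new
      placed=yes
    fi
    result=${result:+$result:}$dir
  done
  IFS=$old_ifs; set +f
  [ "$placed" = yes ] || result=${result:+$result:}$new
  printf '%s\n' "$result"
}

# Entries to the left of system32 (here, Git) keep their priority.
insert_before /usr/bin /c/WINDOWS/system32 \
  '/c/Program Files/Git/cmd:/c/WINDOWS/system32:/c/WINDOWS'
# prints /c/Program Files/Git/cmd:/usr/bin:/c/WINDOWS/system32:/c/WINDOWS
```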

<p>However, if you’ve set-up <code class="language-plaintext highlighter-rouge">PATH</code> yourself, as the OP had, you’re on your own,
I’m afraid!</p>

<p>So why is all this making me rant about ShellCheck? Well, the problem is that
<em>downgrading</em> ocamlfind to a previous version works just fine, and so it’s the
regression in ocamlfind 1.9.8 that I want to focus on. The failing part of the
script is:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># create META from META.in in POSIX-compatible &amp; safe way</span>
<span class="c"># see: https://www.shellcheck.net/wiki/SC2044</span>
<span class="nv">meta_subst</span><span class="o">=</span><span class="s2">"sed -e 's/@VERSION@/</span><span class="nv">$version</span><span class="s2">/g' </span><span class="se">\</span><span class="s2">
        -e 's/@REQUIRES@/</span><span class="k">${</span><span class="nv">req_bytes</span><span class="k">}</span><span class="s2">/g' </span><span class="se">\</span><span class="s2">
        </span><span class="se">\"\$</span><span class="s2">1</span><span class="se">\"</span><span class="s2"> &gt; </span><span class="se">\"\$</span><span class="s2">{1%.in}</span><span class="se">\"</span><span class="s2">"</span>
find src <span class="nt">-name</span> <span class="s1">'META.in'</span> <span class="nt">-type</span> f <span class="nt">-exec</span> sh <span class="nt">-c</span> <span class="s2">"</span><span class="nv">$meta_subst</span><span class="s2">"</span> sh <span class="o">{}</span> <span class="se">\;</span>
</code></pre></div></div>

<p>but which was originally (from <a href="https://github.com/ocaml/ocamlfind/commit/cbc4a7db226670c12ee2891213593559bd694bbf">cbc4a7d</a>):</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for </span>part <span class="k">in</span> <span class="sb">`</span><span class="nb">cd </span>src<span class="p">;</span> <span class="nb">echo</span> <span class="k">*</span><span class="sb">`</span><span class="p">;</span> <span class="k">do
    if</span> <span class="o">[</span> <span class="nt">-f</span> <span class="s2">"src/</span><span class="nv">$part</span><span class="s2">/META.in"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
        </span><span class="nb">sed</span> <span class="nt">-e</span> <span class="s2">"s/@VERSION@/</span><span class="nv">$version</span><span class="s2">/g"</span> <span class="se">\</span>
            <span class="nt">-e</span> <span class="s2">"s/@REQUIRES@/</span><span class="k">${</span><span class="nv">req_bytes</span><span class="k">}</span><span class="s2">/g"</span> <span class="se">\</span>
            src/<span class="nv">$part</span>/META.in <span class="o">&gt;</span>src/<span class="nv">$part</span>/META
    <span class="k">fi
done</span>
</code></pre></div></div>

<p>ShellCheck reports two issues with that original loop:</p>
<ol>
  <li>It dislikes the <a href="https://www.shellcheck.net/wiki/SC2006">use of backticks rather than <code class="language-plaintext highlighter-rouge">$(...)</code></a></li>
  <li>It worries that the <a href="https://www.shellcheck.net/wiki/SC2164">exit code of <code class="language-plaintext highlighter-rouge">cd</code> is unchecked</a></li>
</ol>

<p>The first is because this script is very old, and it was written at a time when
there were not only versions of <code class="language-plaintext highlighter-rouge">sh</code> which didn’t support <code class="language-plaintext highlighter-rouge">$(...)</code> notation,
there were people using OCaml on them! In fact, it’s only comparatively recently
that the default version of <code class="language-plaintext highlighter-rouge">sh</code> in (Open)Solaris/Illumos, etc. actually
supports it.</p>

<p>The second is true, but:</p>
<ol>
  <li>It’s never going to happen (the <code class="language-plaintext highlighter-rouge">src</code> directory be missing from the
tarball?!)</li>
  <li>Even if it does, the loop will <em>correctly</em> do nothing, and the build will
fail later</li>
</ol>
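<p>That second point is easy enough to check in a scratch directory. The
following is a sketch rather than the real script: <code class="language-plaintext highlighter-rouge">count_processed</code> is a
hypothetical wrapper which keeps the original loop header, but just counts
where <code class="language-plaintext highlighter-rouge">configure</code> would have run <code class="language-plaintext highlighter-rouge">sed</code>.</p>

```shell
# count_processed DIR
# Runs the original loop header inside DIR (in a subshell, so the caller's
# working directory is untouched) and prints how many META.in files it would
# have processed. Hypothetical helper for this demonstration only.
count_processed() (
  cd "$1" || exit 1
  processed=0
  # If src is missing, cd fails (its complaint is silenced here) and echo *
  # lists the current directory instead...
  for part in `cd src 2>/dev/null; echo *`; do
    # ...but none of those names has a META.in beneath src, so nothing runs.
    if [ -f "src/$part/META.in" ]; then
      processed=$((processed + 1))    # configure would run sed here
    fi
  done
  echo "$processed"
)

scratch=$(mktemp -d)
touch "$scratch/unrelated-file"       # no src directory at all
count_processed "$scratch"            # prints 0
mkdir -p "$scratch/src/findlib"
touch "$scratch/src/findlib/META.in"
count_processed "$scratch"            # prints 1
rm -rf "$scratch"
```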

<p>So, while <code class="language-plaintext highlighter-rouge">for part in `cd src; echo *`; do</code> is somewhat showing its age, in the
context of this specific fragment of this specific shell script, that will
<strong>never ever</strong> cause a problem. Writing a new script? Don’t write that! Adding a
new part to an existing script? Don’t write that!</p>

<p>However, it is now broken, and my proposed fix is that actually the use of the
wildcard was completely unnecessary, as we already compute the relevant list of
directories which can possibly contain a <code class="language-plaintext highlighter-rouge">META.in</code> file slightly later in the
script (taken from <a href="https://github.com/ocaml/ocamlfind/pull/112/commits/129a259141fa3f1af157eef3100b74866bf90795">129a259</a>):</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for </span>part <span class="k">in</span> <span class="nv">$parts</span><span class="p">;</span> <span class="k">do
  if</span> <span class="o">[</span> <span class="nt">-f</span> src/<span class="s2">"</span><span class="nv">$part</span><span class="s2">"</span>/META.in <span class="o">]</span><span class="p">;</span> <span class="k">then
    </span><span class="nb">sed</span> <span class="nt">-e</span> <span class="s2">"s/@VERSION@/</span><span class="nv">$version</span><span class="s2">/g"</span> <span class="se">\</span>
        <span class="nt">-e</span> <span class="s2">"s/@REQUIRES@/</span><span class="k">${</span><span class="nv">req_bytes</span><span class="k">}</span><span class="s2">/g"</span> <span class="se">\</span>
        src/<span class="s2">"</span><span class="nv">$part</span><span class="s2">"</span>/META.in <span class="o">&gt;</span> src/<span class="s2">"</span><span class="nv">$part</span><span class="s2">"</span>/META
  <span class="k">fi
done</span>
</code></pre></div></div>
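<p>For anyone wanting to poke at that loop away from a real build, here is a
sketch which runs it against a scratch tree. The wrapper <code class="language-plaintext highlighter-rouge">render_meta</code>, the
template contents and the value given for <code class="language-plaintext highlighter-rouge">parts</code> are all made up for the
demonstration (the real <code class="language-plaintext highlighter-rouge">configure</code> computes <code class="language-plaintext highlighter-rouge">parts</code> itself), and
<code class="language-plaintext highlighter-rouge">req_bytes</code> is assumed to be empty.</p>

```shell
# render_meta DIR VERSION PARTS
# Runs the loop above inside DIR, in a subshell so that the caller's working
# directory and variables are left alone. Hypothetical wrapper.
render_meta() (
  cd "$1" || exit 1
  version=$2
  parts=$3
  req_bytes=                    # assumed empty for this demonstration
  for part in $parts; do
    if [ -f src/"$part"/META.in ]; then
      sed -e "s/@VERSION@/$version/g" \
          -e "s/@REQUIRES@/${req_bytes}/g" \
          src/"$part"/META.in > src/"$part"/META
    fi
  done
)

scratch=$(mktemp -d)
mkdir -p "$scratch/src/findlib" "$scratch/src/bytes"
# Only findlib gets a template; bytes should simply be skipped.
printf 'version = "@VERSION@"\nrequires = "@REQUIRES@"\n' \
  > "$scratch/src/findlib/META.in"
render_meta "$scratch" 1.9.8 'findlib bytes'
cat "$scratch/src/findlib/META"
rm -rf "$scratch"
```

<p>That prints <code class="language-plaintext highlighter-rouge">version = "1.9.8"</code> followed by <code class="language-plaintext highlighter-rouge">requires = ""</code>, and at no
point does it need <code class="language-plaintext highlighter-rouge">find</code>, so whichever <code class="language-plaintext highlighter-rouge">find</code> happens to be first in
<code class="language-plaintext highlighter-rouge">PATH</code> no longer matters for this step.</p>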

<p>A piece of code got changed neither to fix a bug nor to add a feature, and
something which worked before became broken. It took a non-trivial amount of
(wallclock) time to diagnose and fix the issue. That’s a lot of negative
consequences, and so far at least, there are <em>no</em> positive consequences.</p>

<p>If it ain’t broke, don’t fix it!</p>]]></content><author><name></name></author><category term="blog" /><category term="platform" /><category term="ocaml" /><category term="windows" /><summary type="html"><![CDATA[Someone was kind enough a few weeks ago to comment on Hacker News that “Opam on Windows is a masterpiece of engineering”. There’s certainly a lot which has to go on under the hood to create what I believe is the necessary baseline experience. Unfortunately, it doesn’t always work out.]]></summary></entry><entry><title type="html">Reflections on ICFP 2025</title><link href="https://www.dra27.uk/blog/platform/2025/10/18/icfp-2025.html" rel="alternate" type="text/html" title="Reflections on ICFP 2025" /><published>2025-10-18T00:00:00+00:00</published><updated>2025-10-18T00:00:00+00:00</updated><id>https://www.dra27.uk/blog/platform/2025/10/18/icfp-2025</id><content type="html" xml:base="https://www.dra27.uk/blog/platform/2025/10/18/icfp-2025.html"><![CDATA[<p>I spent last week at <a href="https://icfp25.sigplan.org/">ICFP 2025</a>. A nice (if
exhausting!) week, as ever. Amusingly, the most reflections were actually
sparked by <a href="https://conf.researchr.org/details/icfp-splash-2025/rebase-2025-papers/2/The-Saga-of-Multicore-OCaml">Yaron’s talk</a>
which was right at the end (you can see the talk itself <a href="https://www.youtube.com/watch?v=UI1wApT2t1w&amp;list=PLyrlk8Xaylp5ihrTVeOSaylaBe4ZORkSI&amp;t=20694s">on YouTube</a>).</p>

<p>I’ve been involved from both a Tarides perspective and from general day-to-day
upstream work on the OCaml runtime with some of the experiences Jane Street had
switching to OCaml 5 (because they’re using <a href="https://oxcaml.org">OxCaml</a>, you’ll
more often hear this referred to as “Runtime 5”, but it essentially means
Multicore OCaml). It’s interesting to reflect on the decisions we made when
merging Multicore OCaml in the light of these subsequent experiences not, of
course, as a navel-gazing exercise in the benefits of hindsight, but in terms of
what we can potentially learn for the road towards OxCaml becoming OCaml.</p>

<p><span id="multicore-upstreaming">Unsurprisingly</span>, there was <em>tons</em> of preparation, planning, and hard work leading
up to opening and merging of Multicore in <a href="https://github.com/ocaml/ocaml/pull/10831">ocaml/ocaml#10831</a>
in January 2022. One of the early ideas was to merge just the runtime changes as
a separate runtime, leaving all the language changes to a subsequent update. The
main thing here would have been to upstream the immense changes to the allocator
and garbage collector along with the domains and fibers machinery, while not yet
exposing it. I remember the concern being that having essentially a runtime
variant (not unlike the debug runtime) might lead to very slow uptake at
actually testing it and possibly a maintenance burden. i.e. we were concerned at
maintaining two runtimes. This would probably have resulted in something like
OCaml 4.15.0, with an experimental <em>official</em> multicore-aware runtime.</p>

<p>The decisions moved on to being more, “no, it’s all or nothing - let’s take it”.
From there gradually we moved more towards this being OCaml 5.00.0; a major
version bump. The previous rebase of Multicore had been on 4.12, and that had
intentionally separated off the concept of effects to a separate version, as
this had the benefit that the surface syntax (especially where keywords were
concerned) was unaltered. During the various discussions, there was (slightly
unexpected!) enthusiasm to go the whole hog, and bring the effect system in as
well, but leaving the changing of the surface syntax to a subsequent release.
That this would be OCaml 5 was cemented, and we had a plan.</p>

<p>As a side-note, bumping the major release comes with a few other things. In
particular, it was suggested we should deal with the accumulation of
deprecations in the Standard Library, finally removing the unsafe string mode,
and so forth. Potential risks, but the practical idea was that programs which
were being updated to work with OCaml 5 which still hadn’t updated to these, in
some cases, very old deprecations would have bigger worries.</p>

<p>During 2022, until the release of OCaml 5.0 on 15 December, I had the role of
release manager for all of Tarides’ efforts towards OCaml 5. The simile between
OCaml 4 / OCaml 5 and Python 2 / Python 3 weighed heavily on a lot of what we
did, coupled with the very real fear that we <em>might</em> still find a fundamental
problem that could cause the multicore change to need to be reverted, with more
work required. A key thing which was introduced was that apart from those
deprecations, and a few bits of installation tidying, we froze <em>all</em> other work
on OCaml during the OCaml 5.0 dev cycle. This was such a big deal, we named it
the “Sequential Glaciation”, and it meant an unprecedented level of
compatibility for OCaml 4.14 programs when running on OCaml 5.0. Essentially, it
meant that <em>any</em> program which compiled without warnings on OCaml 4.14 should
compile without change on OCaml 5.0 (with some corner cases for Standard Library
replacements, etc.), an unusual level of commitment.</p>

<p>This gave an escape hatch: it meant that <em>any</em> code could be being tested on
OCaml 5 to check for performance regressions.</p>

<p>So we invested a phenomenal amount of engineering effort into ensuring that the
public OCaml ecosystem was compatible at launch - that so-called “sequential”
code (i.e. written for OCaml 4) would successfully execute single-domain on
OCaml 5.</p>

<p><span id="unexpected-twists">And then follows an</span> inadvertent mistake, to bring this back round to Jane
Street’s experience.</p>

<p>OCaml 5.1’s Changelog includes <a href="https://github.com/ocaml/ocaml/blob/5.1.0/Changes#L21">25 Standard Library entries</a>
of which 17 add new functions, 5 are non-trivial performance improvements and,
crucially, three <em>breaking changes</em> and non-trivial bug-fixes. That’s the
Standard Library alone!</p>

<p>No such escape hatch now: if you’re, as Jane Street were, only just
able to <em>think</em> about investigating OCaml 5 (also bear in mind that by OCaml 5.1
we still hadn’t restored feature parity with OCaml 4.14), there’s no easy way
back if you do hit a problem. Even if you just upgrade a codebase to OCaml 5.1
without using domains and effects, it’s very hard, bordering on impossible, to
go back to 4.14.</p>

<p>It’s important not to put this down to the benefits of hindsight. I don’t
remember us ever discussing the idea of actively supporting OCaml 5 (i.e. with
all the front-end changes included) still being able to run on the OCaml 4.14
runtime. It’s the kind of suggestion I would have expected in a core developers’
meeting to say, “yeah, I can do that”, but to have met with some resistance from
others at the idea of us all having to maintain it. 🙂</p>

<p>However, from a technical perspective, it really wasn’t as difficult as I’d have
expected. In the middle of 2023, I prepared a branch of OCaml 5 for Jane Street
which replayed the OCaml 5.x changes, skipping all of the alterations to the
runtime (a snapshot of it sits in branch <a href="https://github.com/dra27/ocaml/tree/backported-trunk-to-5.1-20230929">backported-trunk-to-5.1-20230929</a>
on my OCaml fork). The work in this formed the basis for Jane Street’s
OxCaml-on-runtime4 changes (it wasn’t called OxCaml back then, of course) - and
when compared with its basis commit of <a href="https://github.com/ocaml/ocaml/commit/5000b93cad2d126d5949c952eb9a93e47bf901a8">5000b93cad</a>,
there’s no OCaml 5 runtime changes in there, just various minor things added to
the 4.x runtime up to its release and a smattering of non-multicore related
things.</p>

<p>With this work, Jane Street managed to get their internal compiler to being the
OCaml 5.1 frontend but still running on their altered version of “Runtime 4”. As
they started investigating switching all the exciting things in OxCaml to be
available on “Runtime 5”, they also considered trying the runtime-variant
approach we’d ruled out earlier in the merge. Turns out our ruling it out was
correct - it didn’t work easily for them, as it requires too many extra flags
and things to be being plumbed through build systems and so forth.</p>

<p>The trick was to make it a <em>configuration</em> option (which still exists in OxCaml
at the moment, although it’s on notice!). That provides a lot of simplification
in the deployment (we never have to build both runtimes at once, for example),
and making that “escape hatch” slightly more awkward to get to is probably no
bad thing, either. For a very large codebase, I can certainly see the value it
could have provided.</p>

<p>Maybe that’s also something we should look to as OxCaml’s development continues:
focusing in the ecosystem on being able to have that escape hatch back to OCaml
itself, possibly more for benchmarking comparison, than anything else. Although
so much of OxCaml is visible through the front-end (the modes systems, et al),
stack allocation, unboxed types, SIMD, and so forth are all also runtime
changes, and perhaps we need to be considering these escape hatches, <em>before</em> as
well as during any upstreaming effort. If nothing else, it would help
benchmarking. It <em>might</em> also be interesting for things like data-race freedom,
which don’t affect the runtime, to be able to have programs which are data-race
free on OCaml 5.x, even if the final form of DRF in OxCaml isn’t yet known. And
maybe it’ll be a bit too hard to maintain, but I’m always up for a challenge,
and musing is musing!</p>

<p><span id="modes-upstreaming">Sticking with OxCaml</span>, I was particularly interested, both from hallway
conversations and from talks, to muse on how the modes system in OxCaml might
make its way upstream. It’s a known design decision of <code class="language-plaintext highlighter-rouge">ocamlopt</code>, which
Richard alluded to in his <a href="https://conf.researchr.org/details/icfp-splash-2025/haskellsymp-2025-papers/2/-A-Tale-of-Two-Lambdas-A-Haskeller-s-Journey-into-OCaml">Haskell (!!) keynote</a>,
that it prefers predictability and a certain measure of simplicity over a
kitchen-sink of optimisations. I want to be able to write high-performance
programs in OCaml, and I think others should be able to as well, but here are
three possible takes on why we might end up with a lot of front-end “complexity”
(I put it in quotes, because I tire of “complexity” often being used as an
attack on solutions to problems) of OxCaml even without performance in mind:</p>

<ol>
  <li>
    <p>Last year at ICFP, we saw in <a href="https://icfp24.sigplan.org/details/icfp-2024-papers/19/Oxidizing-OCaml-with-Modal-Memory-Management">Oxidizing OCaml with Modal Memory Management</a>
the introduction of modes to increase performance by reducing heap
allocation. There’s another neat use of locals, and the regions that come
with them: the same technique allows us to stop accidentally writing programs like</p>

    <div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">uh_oh</span> <span class="n">file</span> <span class="o">=</span> <span class="nn">In_channel</span><span class="p">.</span><span class="n">with_open_text</span> <span class="n">file</span> <span class="nn">In_channel</span><span class="p">.</span><span class="n">really_input_string</span>
</code></pre></div>    </div>

    <p><em>Performance or no, I think we can all
agree we’d like to write programs where the type checker assures us that not
only are we well-typed but also that we don’t leak resources!</em></p>
  </li>
  <li>
    <p>Thanks to <a href="https://pldi18.sigplan.org/details/pldi-2018-papers/15/Bounding-Data-Races-in-Space-and-Time"><em>Bounding Data Races in Space and Time</em></a>,
OCaml has a much nicer approach to data-races than many languages (certainly
than the C language its runtime sits on!). We can say that OCaml programs
with data races do not “catch fire”, but it’s still chaos! <a href="https://ocaml.org/manual/5.4/api/Stdlib.Atomic.html">Atomics</a>
give us one way of being able to manage this, but we have to actually do it.
The work presented earlier this year in <a href="https://popl25.sigplan.org/details/POPL-2025-popl-research-papers/23/Data-Race-Freedom-la-Mode">Data Race Freedom à la Mode</a>
gives us OxCaml’s <a href="https://github.com/janestreet/portable">capsules</a>… now
we can declare we want to “do it”, and have the type checker ensure we
actually did.</p>

    <p><em>Performance or no, I think we can all agree we’d like to write programs
where the type checker assures us that not only are we well-typed but also
data-race free!</em></p>
  </li>
  <li>
    <p>Thanks to <a href="https://2025.splashcon.org/details/OOPSLA/62/Modal-Effect-Types"><em>Modal Effect Types</em></a>
this year (SPLASH and ICFP ran at the same time this year), we can look
towards typed effects in OCaml.</p>

    <p><em>Performance or no, I think we can all agree
we’d like to write programs where the type checker assures us that not only are we
well-typed but also that we handle all our effects!</em></p>
  </li>
</ol>
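
<p>The first of those can be seen going wrong in plain OCaml today. This is only a
minimal sketch: <code class="language-plaintext highlighter-rouge">uh_oh</code> is the function from the list above, while <code class="language-plaintext highlighter-rouge">demo</code> and
the temporary file are scaffolding I’ve invented for illustration. The partial
application returned by <code class="language-plaintext highlighter-rouge">uh_oh</code> captures a channel which has already been closed,
so I’d expect the read to fail with <code class="language-plaintext highlighter-rouge">Sys_error</code> at runtime rather than being
rejected by the type checker:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Type-checks happily: string -&gt; int -&gt; string option *)
let uh_oh file = In_channel.with_open_text file In_channel.really_input_string

(* Hypothetical scaffolding: write a small file, then use the leaked closure *)
let demo () =
  let file = Filename.temp_file "uh_oh" ".txt" in
  Out_channel.with_open_text file (fun oc -&gt; Out_channel.output_string oc "hello");
  (* with_open_text has already closed the channel by the time uh_oh returns *)
  let read = uh_oh file in
  let outcome =
    match read 5 with
    | Some s -&gt; "read " ^ s
    | None -&gt; "end of file"
    | exception Sys_error msg -&gt; "Sys_error: " ^ msg
  in
  Sys.remove file;
  outcome

let () = print_endline (demo ())
</code></pre></div></div>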

<p>There’re other approaches to each of these problems as well (indeed, even in
OxCaml, where modal effects are concerned), but I think it’s interesting to see
what’s happening here as not <em>just</em> about high-performance.</p>

<p>That’s enough musing for now… I’ll endeavour to write some more and maybe in a
less musy way about other talks I went to another day!</p>]]></content><author><name></name></author><category term="blog" /><category term="platform" /><category term="oxcaml" /><category term="ocaml" /><category term="windows" /><summary type="html"><![CDATA[I spent last week at ICFP 2025. A nice (if exhausting!) week, as ever. Amusingly, the most reflections were actually sparked by Yaron’s talk which was right at the end (you can see the talk itself on YouTube).]]></summary></entry><entry><title type="html">File descriptors as integers considered harmful</title><link href="https://www.dra27.uk/blog/platform/2025/10/01/int-file-descriptors-considered-harmful.html" rel="alternate" type="text/html" title="File descriptors as integers considered harmful" /><published>2025-10-01T00:00:00+00:00</published><updated>2025-10-01T00:00:00+00:00</updated><id>https://www.dra27.uk/blog/platform/2025/10/01/int-file-descriptors-considered-harmful</id><content type="html" xml:base="https://www.dra27.uk/blog/platform/2025/10/01/int-file-descriptors-considered-harmful.html"><![CDATA[<p>Continuing <a href="/blog/platform/2025/09/30/file-descriptors-are-not-integers.html">the previous foray into file descriptors</a>,
and trying to remove <code class="language-plaintext highlighter-rouge">Obj.magic</code> both from ocaml-uring’s code itself, and from
the recommended way for using it.</p>

<p>I expected this to be a fairly simple application of the C API features added in
<a href="https://github.com/dra27/ocaml/commits/file_descr-interop">dra27/ocaml#file_descr-interop</a>.
It’s not quite what happened, though - as I don’t actually think that any of the
changes I propose for the Unix library in OCaml are needed for ocaml-uring. What
piqued my interest in this rather deep rabbit-hole was that what I would contend
are slightly poor design decisions stem from the corrupting notion in C that
file descriptors are integers - i.e. because file descriptors in C <em>are</em>
integers, it becomes dangerously tempting to keep them as that in OCaml, because
OCaml <a href="https://ocaml.org/manual/5.3/intfc.html#s%3Ac-value">treats integers specially</a>.</p>

<p><a href="https://ocaml.org/p/uring/latest/doc/index.html">ocaml-uring’s documentation</a>
states that it “aims to provide a thin type-safe layer for use in higher-level
interfaces”. I think adopting Xavier’s <em>veto</em> on manipulating file descriptors
as integers <em>in OCaml</em> (i.e. <em>never</em> doing it) leads to greater type safety,
even in thin bindings, without compromising performance. Let’s see! 🤓</p>

<p>The <a href="https://github.com/ocaml-multicore/ocaml-uring">sources for ocaml-uring</a>
use <code class="language-plaintext highlighter-rouge">Obj.magic</code> both in the library itself in one place, and at various points
in both the tests and the documentation, exclusively for duping the type system
into believing an <code class="language-plaintext highlighter-rouge">int</code> is a <code class="language-plaintext highlighter-rouge">Unix.file_descr</code>.</p>

<p>Consider <a href="https://man7.org/linux/man-pages/man2/mkdirat.2.html">mkdirat(2)</a>.
This is part of a family of *at syscalls which accept a file descriptor as an
argument and which will use the directory name of the file referred to by that
file descriptor as the base for a relative path. There is a special value
<code class="language-plaintext highlighter-rouge">AT_FDCWD</code> which can be passed instead of an open file descriptor (it instructs
<code class="language-plaintext highlighter-rouge">mkdirat</code> instead to interpret paths relative to the current working directory),
and it’s the representation of this <code class="language-plaintext highlighter-rouge">AT_FDCWD</code> which causes the <code class="language-plaintext highlighter-rouge">Obj.magic</code>.
When used in C, <code class="language-plaintext highlighter-rouge">AT_FDCWD</code> is an illegal file descriptor value (it’s -100, as it
happens). Continuing with <code class="language-plaintext highlighter-rouge">mkdirat</code> as an example, we have this C declaration:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">io_uring_prep_mkdirat</span><span class="p">(</span><span class="k">struct</span> <span class="n">io_uring_sqe</span> <span class="o">*</span><span class="n">sqe</span><span class="p">,</span>
                           <span class="kt">int</span> <span class="n">dirfd</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">path</span><span class="p">,</span> <span class="n">mode_t</span> <span class="n">mode</span><span class="p">);</span>
</code></pre></div></div>

<p>and <a href="https://github.com/ocaml-multicore/ocaml-uring/blob/main/lib/uring/uring.ml#L341">this</a>
corresponding primitive in ocaml-uring:</p>
<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">external</span> <span class="n">submit_mkdirat</span> <span class="o">:</span> <span class="n">t</span> <span class="o">-&gt;</span> <span class="n">id</span>
                          <span class="o">-&gt;</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">file_descr</span> <span class="o">-&gt;</span> <span class="nn">Sketch</span><span class="p">.</span><span class="n">ptr</span> <span class="o">-&gt;</span> <span class="kt">int</span>
                          <span class="o">-&gt;</span> <span class="kt">bool</span> <span class="o">=</span> <span class="s2">"ocaml_uring_submit_mkdirat"</span> <span class="p">[</span><span class="o">@@</span><span class="n">noalloc</span><span class="p">]</span>
</code></pre></div></div>

<p>There’s a close correspondence between the two - the OCaml <code class="language-plaintext highlighter-rouge">t</code> gets passed to
<a href="https://man7.org/linux/man-pages/man3/io_uring_get_sqe.3.html"><code class="language-plaintext highlighter-rouge">io_uring_get_sqe</code></a>
which gives us the first parameter for the prep call. The <code class="language-plaintext highlighter-rouge">id</code> is separately
passed to <a href="https://man7.org/linux/man-pages/man3/io_uring_sqe_set_data.3.html"><code class="language-plaintext highlighter-rouge">io_uring_sqe_set_data</code></a>
(that’s part of a mechanism in liburing to be able to identify which syscalls
have completed - more on that later). The remaining arguments in this primitive
(the <code class="language-plaintext highlighter-rouge">Unix.file_descr</code>, the <code class="language-plaintext highlighter-rouge">Sketch.ptr</code> and the <code class="language-plaintext highlighter-rouge">int</code>) correspond to the
<code class="language-plaintext highlighter-rouge">dirfd</code>, <code class="language-plaintext highlighter-rouge">path</code> and <code class="language-plaintext highlighter-rouge">mode</code> arguments to the syscall (the liveness of buffers is
complicated here - the <code class="language-plaintext highlighter-rouge">Sketch.ptr</code> type is essentially allowing a string to be
passed from OCaml to C without subsequently having to worry about it being moved
by the GC).</p>

<p>It looks like a good thin layer over the syscall. Every one of these prep calls
will necessarily have an <code class="language-plaintext highlighter-rouge">io_uring_sqe_set_data</code> call, and it’s always good to
minimise the number of crossings between C and OCaml when writing C bindings.</p>

<p>It naturally becomes tempting to have a <code class="language-plaintext highlighter-rouge">Unix.file_descr</code> value for <code class="language-plaintext highlighter-rouge">AT_FDCWD</code>,
and that’s indeed <a href="https://github.com/ocaml-multicore/ocaml-uring/blob/main/lib/uring/uring.ml#L468">the implementation</a>:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">at_fdcwd</span> <span class="o">:</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">file_descr</span> <span class="o">=</span> <span class="nn">Obj</span><span class="p">.</span><span class="n">magic</span> <span class="nn">Config</span><span class="p">.</span><span class="n">at_fdcwd</span>
</code></pre></div></div>

<p>However, there really is no need for this, and if we step back from the fact
it’s an <code class="language-plaintext highlighter-rouge">int</code>, and instead consider what <code class="language-plaintext highlighter-rouge">AT_FDCWD</code> is <em>modelling</em>, a more
obvious solution emerges which doesn’t require magic. <code class="language-plaintext highlighter-rouge">AT_FDCWD</code> is essentially
being used to mean “no file descriptor” (and there’s then an interpretation for
what that means <code class="language-plaintext highlighter-rouge">mkdirat</code> should do). i.e. <code class="language-plaintext highlighter-rouge">AT_FDCWD</code> means <code class="language-plaintext highlighter-rouge">None</code>, which is
then being modelled by -100. An actual file descriptor is <code class="language-plaintext highlighter-rouge">Some fd</code>, which is
being modelled by its number (which can never be -100). In other words, we could
pass a <code class="language-plaintext highlighter-rouge">Unix.file_descr option</code> to the C stub, instead of a <code class="language-plaintext highlighter-rouge">Unix.file_descr</code>.
This falls out even more naturally when we look at the actual <a href="https://github.com/ocaml-multicore/ocaml-uring/blob/main/lib/uring/uring.ml#L498-L502"><code class="language-plaintext highlighter-rouge">Uring.mkdirat</code></a>
exposed to the library user:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">mkdirat</span> <span class="n">t</span> <span class="o">~</span><span class="n">mode</span> <span class="o">?</span><span class="p">(</span><span class="n">fd</span><span class="o">=</span><span class="n">at_fdcwd</span><span class="p">)</span> <span class="n">path</span> <span class="n">user_data</span> <span class="o">=</span>
</code></pre></div></div>

<p>It actually already has a <code class="language-plaintext highlighter-rouge">Unix.file_descr option</code> because it’s using an
optional argument to convey it! It’s quite an <a href="https://github.com/dra27/ocaml-uring/commit/00ca9ecd88216ae4fdd039d92c58619686e4b4af">easy adjustment</a>
just to allow the C stub to resolve this: it has the benefit that the value of
<code class="language-plaintext highlighter-rouge">AT_FDCWD</code> no longer has to be determined from OCaml, and it doesn’t involve any
additional allocations of memory on the OCaml side (because the optional
argument meant there was already one).</p>
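
<p>As a rough sketch of that shape (not the code in the commit linked above): here
<code class="language-plaintext highlighter-rouge">submit_mkdirat</code> is a hypothetical pure-OCaml stand-in for the C stub which
merely reports which base it was given, so that the example is self-contained. It
needs the <code class="language-plaintext highlighter-rouge">unix</code> library to be linked. The point is simply that dropping the
default from the optional argument leaves <code class="language-plaintext highlighter-rouge">fd</code> as a <code class="language-plaintext highlighter-rouge">Unix.file_descr option</code>
which can be handed straight down:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hypothetical stand-in for the C stub. The real stub would substitute
   AT_FDCWD for None on the C side, so OCaml never sees the number. *)
let submit_mkdirat (dirfd : Unix.file_descr option) (path : string) =
  match dirfd with
  | None -&gt; "AT_FDCWD:" ^ path
  | Some _ -&gt; "dirfd:" ^ path

(* No default value: inside the function, fd : Unix.file_descr option *)
let mkdirat ?fd path = submit_mkdirat fd path

let () =
  assert (mkdirat "a" = "AT_FDCWD:a");
  assert (mkdirat ~fd:Unix.stdin "a" = "dirfd:a")
</code></pre></div></div>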

<p>So far, so hopefully good. Now on to the one in the README:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="k">let</span> <span class="n">fd</span> <span class="o">=</span>
    <span class="k">if</span> <span class="n">result</span> <span class="o">&lt;</span> <span class="mi">0</span> <span class="k">then</span> <span class="n">failwith</span> <span class="p">(</span><span class="s2">"Error: "</span> <span class="o">^</span> <span class="n">string_of_int</span> <span class="n">result</span><span class="p">);</span>
    <span class="p">(</span><span class="nn">Obj</span><span class="p">.</span><span class="n">magic</span> <span class="n">result</span> <span class="o">:</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">file_descr</span><span class="p">);;</span>
<span class="k">val</span> <span class="n">fd</span> <span class="o">:</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">file_descr</span> <span class="o">=</span> <span class="o">&lt;</span><span class="n">abstr</span><span class="o">&gt;</span>
</code></pre></div></div>

<p>Well, it wasn’t my intention, but the proposed <code class="language-plaintext highlighter-rouge">Unix.fdopen</code> <a href="https://github.com/dra27/ocaml-uring/commit/6c14ffd68be586e48fa4293df76338db32a28c12">could deal with
that</a>.
Except that this is not “<code class="language-plaintext highlighter-rouge">fdopen</code>” in the spirit it was added for - in fact,
it’s precisely what was <em>not</em> supposed to be done. We’ve got an <code class="language-plaintext highlighter-rouge">int</code> in OCaml
that is <em>actually</em> a file descriptor, and we’re trying to coerce it to be a
<code class="language-plaintext highlighter-rouge">Unix.file_descr</code>. We’ve got a layering violation - we’ve allowed an abstract
value from the world of C that is represented in C as an <code class="language-plaintext highlighter-rouge">int</code> to leak up to the
world of OCaml as an OCaml <code class="language-plaintext highlighter-rouge">int</code>, which should be a number!</p>

<p>Funnily enough, ocaml-uring has another minor layering violation like this in
the translation of Unix error values to <code class="language-plaintext highlighter-rouge">Unix.error</code> values in the
<a href="https://github.com/ocaml-multicore/ocaml-uring/blob/main/lib/uring/uring.ml#L651-L652"><code class="language-plaintext highlighter-rouge">Uring.error_of_errno</code> function</a>:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">error_of_errno</span> <span class="n">e</span> <span class="o">=</span>
  <span class="nn">Uring</span><span class="p">.</span><span class="n">error_of_errno</span> <span class="p">(</span><span class="n">abs</span> <span class="n">e</span><span class="p">)</span>
</code></pre></div></div>

<p>I didn’t want to abuse my proposed <code class="language-plaintext highlighter-rouge">Unix.fdopen</code> this way, so soon after
conceiving it for a more noble and even slightly-typed purpose, so instead I
opted for bouncing these numbers back to C by instead adding <a href="https://github.com/dra27/ocaml-uring/pull/3/files"><code class="language-plaintext highlighter-rouge">Uring.file_descr_of_result</code></a>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">value</span> <span class="nf">ocaml_uring_file_descr_of_result</span><span class="p">(</span><span class="n">value</span> <span class="n">v_result</span><span class="p">)</span>
<span class="p">{</span>
  <span class="n">CAMLparam0</span><span class="p">();</span>
  <span class="n">CAMLlocal2</span><span class="p">(</span><span class="n">result</span><span class="p">,</span> <span class="n">val</span><span class="p">);</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">Int_val</span><span class="p">(</span><span class="n">v_result</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">val</span> <span class="o">=</span> <span class="n">unix_error_of_code</span><span class="p">(</span><span class="o">-</span><span class="n">Int_val</span><span class="p">(</span><span class="n">v_result</span><span class="p">));</span>
    <span class="n">result</span> <span class="o">=</span> <span class="n">caml_alloc_small</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">);</span>
  <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
    <span class="n">val</span> <span class="o">=</span> <span class="n">caml_unix_file_descr_of_fd</span><span class="p">(</span><span class="n">Int_val</span><span class="p">(</span><span class="n">v_result</span><span class="p">));</span>
    <span class="n">result</span> <span class="o">=</span> <span class="n">caml_alloc_small</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
  <span class="p">}</span>
  <span class="n">Field</span><span class="p">(</span><span class="n">result</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="o">=</span> <span class="n">val</span><span class="p">;</span>
  <span class="n">CAMLreturn</span><span class="p">(</span><span class="n">result</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>which handles the pattern in C, where it belongs. The result certainly looks
nicer in some of the test code, for example, the numeric matching in one of the
tests:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="n">result</span> <span class="o">&gt;=</span> <span class="mi">0</span> <span class="k">then</span>
  <span class="k">let</span> <span class="n">fd</span> <span class="o">=</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">fdopen</span> <span class="n">result</span> <span class="k">in</span>
  <span class="nn">Unix</span><span class="p">.</span><span class="n">close</span> <span class="n">fd</span>
<span class="k">else</span>
  <span class="k">let</span> <span class="n">error</span> <span class="o">=</span> <span class="nn">Uring</span><span class="p">.</span><span class="n">error_of_errno</span> <span class="n">result</span> <span class="k">in</span>
  <span class="k">raise</span> <span class="p">(</span><span class="nn">Unix</span><span class="p">.</span><span class="nc">Unix_error</span><span class="p">(</span><span class="n">error</span><span class="o">,</span> <span class="c">(* ... *)</span><span class="p">))</span>
</code></pre></div></div>

<p>becomes a slightly more idiomatic:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">match</span> <span class="nn">Uring</span><span class="p">.</span><span class="n">file_descr_of_result</span> <span class="n">result</span> <span class="k">with</span>
<span class="o">|</span> <span class="nc">Ok</span> <span class="n">fd</span> <span class="o">-&gt;</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">close</span> <span class="n">fd</span>
<span class="o">|</span> <span class="nc">Error</span> <span class="n">error</span> <span class="o">-&gt;</span> <span class="k">raise</span> <span class="p">(</span><span class="nn">Unix</span><span class="p">.</span><span class="nc">Unix_error</span><span class="p">(</span><span class="n">error</span><span class="o">,</span> <span class="c">(* ... *)</span><span class="p">))</span>
</code></pre></div></div>

<p>but there are two things that still feel bad - those <code class="language-plaintext highlighter-rouge">int</code>s are <em>still there</em>
in OCaml! Worse, we’re now incurring an extra C call to process them.</p>

<p>The veto tells us that the processing belongs in C. Is there a way to do this
where we don’t <em>ever</em> have the general <code class="language-plaintext highlighter-rouge">int</code> result exposed in OCaml, but we’re
still a thin wrapper around the API <em>and</em> at no cost?</p>

<p>Not without breaking the API, but I think we can have our API cake and also eat
it in this case. A first stab sits on <a href="https://github.com/dra27/ocaml-uring/pull/2/files">my fork</a>.
The result of an io_uring operation is always a (C) <code class="language-plaintext highlighter-rouge">int</code>. Each of the syscalls
uses one of three encodings of that integer. In all three cases, a negative
number corresponds to an error, where the absolute value of the number is the
Unix error number (from <code class="language-plaintext highlighter-rouge">errno.h</code>). Some functions then just return 0 on
success. The other functions return a number ≥ 0 and for some of those, that
number is a file descriptor.</p>
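
<p>To make the three encodings concrete, here’s a small model of the decoding. It’s
written in OCaml purely for illustration - the whole point of the veto is that
the real decoding lives in C - and all of the names (<code class="language-plaintext highlighter-rouge">encoding</code>, <code class="language-plaintext highlighter-rouge">decoded</code>,
<code class="language-plaintext highlighter-rouge">decode</code>) are invented for the sketch, with the error number and the descriptor
left as plain numbers:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* How an operation encodes success; known when the call is prepped *)
type encoding = Zero_on_success | Count | File_descriptor

(* Simplified stand-ins: a real binding would carry Unix.error and
   Unix.file_descr values here rather than numbers *)
type decoded = Unit | Int of int | Fd of int | Errno of int

let decode encoding res =
  if res &lt; 0 then Errno (-res) (* negative: absolute value is the errno *)
  else
    match encoding with
    | Zero_on_success -&gt; Unit
    | Count -&gt; Int res
    | File_descriptor -&gt; Fd res

let () =
  assert (decode Zero_on_success 0 = Unit);
  assert (decode Count 4096 = Int 4096);
  assert (decode File_descriptor 7 = Fd 7);
  assert (decode Count (-2) = Errno 2)
</code></pre></div></div>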

<p>Now, the choice of encoding is known <em>when we prep the call</em>. So the trick I
tried instead was to encode that in the user-data attached to the ring entry.
The ocaml-uring implementation is using an array index as the user-data, so even
on a 32-bit system we can comfortably steal two bits to encode how to treat the
result. The three wait functions now instead of returning:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">type</span> <span class="n">cqe_option</span> <span class="o">=</span> <span class="n">private</span>
<span class="o">|</span> <span class="nc">Cqe_none</span>
<span class="o">|</span> <span class="nc">Cqe_some</span> <span class="k">of</span> <span class="p">{</span> <span class="n">user_data_id</span> <span class="o">:</span> <span class="n">id</span><span class="p">;</span> <span class="n">res</span><span class="o">:</span> <span class="kt">int</span> <span class="p">}</span>
</code></pre></div></div>

<p>return a slightly richer:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">type</span> <span class="n">cqe_option</span> <span class="o">=</span> <span class="n">private</span>
<span class="o">|</span> <span class="nc">Cqe_none</span>
<span class="o">|</span> <span class="nc">Cqe_unit</span> <span class="k">of</span> <span class="p">{</span> <span class="n">user_data_id</span> <span class="o">:</span> <span class="n">id</span><span class="p">;</span> <span class="n">result</span><span class="o">:</span> <span class="kt">unit</span> <span class="p">}</span>
<span class="o">|</span> <span class="nc">Cqe_int</span> <span class="k">of</span> <span class="p">{</span> <span class="n">user_data_id</span> <span class="o">:</span> <span class="n">id</span><span class="p">;</span> <span class="n">result</span><span class="o">:</span> <span class="kt">int</span> <span class="p">}</span>
<span class="o">|</span> <span class="nc">Cqe_fd</span> <span class="k">of</span> <span class="p">{</span> <span class="n">user_data_id</span> <span class="o">:</span> <span class="n">id</span><span class="p">;</span> <span class="n">result</span><span class="o">:</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">file_descr</span> <span class="p">}</span>
<span class="o">|</span> <span class="nc">Cqe_error</span> <span class="k">of</span> <span class="p">{</span> <span class="n">user_data_id</span> <span class="o">:</span> <span class="n">id</span><span class="p">;</span> <span class="n">result</span><span class="o">:</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">error</span> <span class="p">}</span>
</code></pre></div></div>
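
<p>The bit-stealing itself is nothing exotic. As a sketch, assuming purely for
illustration that it’s the low two bits which get used (the names <code class="language-plaintext highlighter-rouge">kind</code>,
<code class="language-plaintext highlighter-rouge">pack</code> and <code class="language-plaintext highlighter-rouge">unpack</code> are mine, not the patch’s), the array index and the result
kind can share one number like this:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Which of the three result encodings the prepped call will use *)
type kind = Unit_result | Int_result | Fd_result

let tag_of_kind = function
  | Unit_result -&gt; 0
  | Int_result -&gt; 1
  | Fd_result -&gt; 2

let kind_of_tag = function
  | 0 -&gt; Unit_result
  | 1 -&gt; Int_result
  | 2 -&gt; Fd_result
  | _ -&gt; invalid_arg "kind_of_tag"

(* index is a non-negative array index; the low two bits carry the kind *)
let pack ~index kind = (index lsl 2) lor tag_of_kind kind

let unpack user_data = (user_data lsr 2, kind_of_tag (user_data land 3))

let () =
  assert (pack ~index:41 Fd_result = 166);
  assert (unpack (pack ~index:41 Fd_result) = (41, Fd_result));
  assert (unpack (pack ~index:0 Unit_result) = (0, Unit_result))
</code></pre></div></div>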

<p>A little bit of additional plumbing is needed for cancellation. I opted to have
the C functions themselves return the amended user-data value, as it’s only
needed for cancellation. That keeps the considerations of C in C-land, but it
might be better to push that into the <code class="language-plaintext highlighter-rouge">Heap</code> implementation which backs the IDs
and allow both the job ID and heap pointer to be the same number.</p>

<p>I didn’t push this all the way back to Eio, but I did update all the tests and
documentation, and I have to say the results looked nice to me (and still
low-level) - in particular, the copying tests seem to get a clearer delineation
of error path. I’m not sure if the separation of <code class="language-plaintext highlighter-rouge">unit</code>/<code class="language-plaintext highlighter-rouge">int</code> was strictly
buying much, and given that various of the error codes which may come back
aren’t POSIX and so map to <code class="language-plaintext highlighter-rouge">Unix.EUNKNOWNERR</code> (which allocates), it would
possibly be worth looking at a richer error type which could itself be converted
to a <code class="language-plaintext highlighter-rouge">Unix.error</code> expensively, but on-demand. However, what was nice was that
the return from the wait functions still allocates just one two-field OCaml
block, and the file descriptor - if there is one - incurs no further conversion,
the value comes from C with the correct types already.</p>

<p>Entertainingly, the final version of the patch, while breaking the API, does not
actually need any alterations to OCaml’s Unix library at all.</p>

<p>There’s one last possibly fun OxCaml follow-up to look at. The value which comes
back from the C wait functions is a <code class="language-plaintext highlighter-rouge">cqe_option</code> (see above) whose shape is a
single constant constructor and four 2-field constructors. The first field in
these cases is the io_uring user-data, which we then translate back to the
OCaml value to yield one of these:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">type</span> <span class="k">'</span><span class="n">a</span> <span class="n">completion_option</span> <span class="o">=</span>
  <span class="o">|</span> <span class="nc">None</span>
  <span class="o">|</span> <span class="nc">Unit</span> <span class="k">of</span> <span class="p">{</span> <span class="n">result</span><span class="o">:</span> <span class="kt">unit</span><span class="p">;</span> <span class="n">data</span><span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="p">}</span>
  <span class="o">|</span> <span class="nc">Int</span> <span class="k">of</span> <span class="p">{</span> <span class="n">result</span><span class="o">:</span> <span class="kt">int</span><span class="p">;</span> <span class="n">data</span><span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="p">}</span>
  <span class="o">|</span> <span class="nc">FD</span> <span class="k">of</span> <span class="p">{</span> <span class="n">result</span><span class="o">:</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">file_descr</span><span class="p">;</span> <span class="n">data</span><span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="p">}</span>
  <span class="o">|</span> <span class="nc">Error</span> <span class="k">of</span> <span class="p">{</span> <span class="n">result</span><span class="o">:</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">error</span><span class="p">;</span> <span class="n">data</span><span class="o">:</span> <span class="k">'</span><span class="n">a</span> <span class="p">}</span>
</code></pre></div></div>

<p>Which has the same shape! Now, we could do an evil <code class="language-plaintext highlighter-rouge">Obj.magic</code> trick here to
re-use the block and simply replace the field which contained the io_uring user
data with the OCaml value (we’d need to swap the fields around in the types, but
that’s a minor detail). However, this kind of lying can be dangerous when the
OCaml optimiser comes into play… but can the uniqueness modes be used here to
allow <em>the compiler</em> to determine that the <code class="language-plaintext highlighter-rouge">cqe_option</code> block is now available
and actually compile it down to just that field assignment? We’d then halve the
allocations arising from a ring wait operation 🤔</p>
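
<p>For reference, a minimal sketch of the translation as it stands, with the
second allocation in plain sight. The <code class="language-plaintext highlighter-rouge">cqe_option</code> here is a hypothetical
stand-in which only follows the shape described above (the constructor and field
names are my invention), <code class="language-plaintext highlighter-rouge">lookup</code> is an assumed function from user data to the
OCaml value, and it needs the Unix library for the two <code class="language-plaintext highlighter-rouge">Unix</code> types:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hypothetical stand-in for the value returned from C: one constant
   constructor and four two-field constructors, user data first. *)
type cqe_option =
  | Cqe_none
  | Cqe_unit of { user_data : int; result : unit }
  | Cqe_int of { user_data : int; result : int }
  | Cqe_fd of { user_data : int; result : Unix.file_descr }
  | Cqe_error of { user_data : int; result : Unix.error }

type 'a completion_option =
  | None
  | Unit of { result : unit; data : 'a }
  | Int of { result : int; data : 'a }
  | FD of { result : Unix.file_descr; data : 'a }
  | Error of { result : Unix.error; data : 'a }

(* [lookup] (an assumed parameter) maps the io_uring user data back to the
   OCaml value. Every case except [Cqe_none] allocates a fresh two-field
   block, on top of the one which C already allocated. *)
let translate (lookup : int -&gt; 'a) (cqe : cqe_option) : 'a completion_option =
  match cqe with
  | Cqe_none -&gt; None
  | Cqe_unit { user_data; result } -&gt; Unit { result; data = lookup user_data }
  | Cqe_int { user_data; result } -&gt; Int { result; data = lookup user_data }
  | Cqe_fd { user_data; result } -&gt; FD { result; data = lookup user_data }
  | Cqe_error { user_data; result } -&gt; Error { result; data = lookup user_data }
</code></pre></div></div>

<p>The in-place version would keep the block received from C and overwrite only
the user-data field, which is exactly the step this sketch cannot express
safely.</p>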

<p>Anyway, file descriptors are not integers. I’m a new disciple. I think we should
make T-Shirts.</p>]]></content><author><name></name></author><category term="blog" /><category term="platform" /><category term="ocaml" /><category term="windows" /><summary type="html"><![CDATA[Continuing the previous foray into file descriptors, and trying to remove Obj.magic both from ocaml-uring’s code itself, and from the recommended way for using it.]]></summary></entry><entry><title type="html">File descriptors are not integers</title><link href="https://www.dra27.uk/blog/platform/2025/09/30/file-descriptors-are-not-integers.html" rel="alternate" type="text/html" title="File descriptors are not integers" /><published>2025-09-30T00:00:00+00:00</published><updated>2025-09-30T00:00:00+00:00</updated><id>https://www.dra27.uk/blog/platform/2025/09/30/file-descriptors-are-not-integers</id><content type="html" xml:base="https://www.dra27.uk/blog/platform/2025/09/30/file-descriptors-are-not-integers.html"><![CDATA[<p>There was a flurry of activity on <a href="https://github.com/ocaml-multicore/ocaml-uring">ocaml-multicore/ocaml-uring</a>
this month leading to a release (<a href="https://github.com/ocaml/opam-repository/pull/28604">ocaml/opam-repository#28604</a>).
ocaml-uring provides bindings to Linux’s <a href="https://github.com/axboe/liburing">io_uring</a>,
which allows batching various syscalls to the kernel for it to execute
out-of-order, and in parallel. Its principal use at the moment is for the
high-performance Linux backend of <a href="https://github.com/ocaml-multicore/eio">Eio</a>.</p>

<p>Various of the syscalls available in io_uring return Unix file descriptors, and
the design of ocaml-uring as a low-level interface to it leads to some slightly
unfortunate recommendations in its instructions:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">#</span> <span class="k">let</span> <span class="n">fd</span> <span class="o">=</span>
    <span class="k">if</span> <span class="n">result</span> <span class="o">&lt;</span> <span class="mi">0</span> <span class="k">then</span> <span class="n">failwith</span> <span class="p">(</span><span class="s2">"Error: "</span> <span class="o">^</span> <span class="n">string_of_int</span> <span class="n">result</span><span class="p">);</span>
    <span class="p">(</span><span class="nn">Obj</span><span class="p">.</span><span class="n">magic</span> <span class="n">result</span> <span class="o">:</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">file_descr</span><span class="p">);;</span>
<span class="k">val</span> <span class="n">fd</span> <span class="o">:</span> <span class="nn">Unix</span><span class="p">.</span><span class="n">file_descr</span> <span class="o">=</span> <span class="o">&lt;</span><span class="n">abstr</span><span class="o">&gt;</span>
</code></pre></div></div>

<p>Anil was wondering if we could dust off some of the code in an old failed OCaml
pull request of mine (<a href="https://github.com/ocaml/ocaml/pull/1990">ocaml/ocaml#1990</a>)
to get rid of some of the magic. Pulling out the C code from that PR wasn’t
entirely mechanical, and Anil wondered if we might put it in a library. As it
happened, there’d been some discussions about the PR at developer meetings, and
there was a consensus that we ought to have an official C API for converting a
C file descriptor (i.e. an <code class="language-plaintext highlighter-rouge">int</code>) to an OCaml <code class="language-plaintext highlighter-rouge">Unix.file_descr</code>. I wasn’t that
keen on producing an external library without sorting out the compiler’s library
at the same time, so I figured I’d dust that change off and see where it went.</p>

<p>That original PR attempted to add primitives to the Unix module to allow OCaml
code to convert an <em>OCaml</em> <code class="language-plaintext highlighter-rouge">int</code> to <code class="language-plaintext highlighter-rouge">Unix.file_descr</code>, so the ocaml-uring
example would instead just be <code class="language-plaintext highlighter-rouge">Unix.descr_of_fd result</code> with no <code class="language-plaintext highlighter-rouge">Obj.magic</code>.
However, that approach had an absolute veto from <a href="https://github.com/ocaml/ocaml/pull/1990#issuecomment-413464113">Xavier</a>:</p>

<p><em>“File descriptors are not integers”</em></p>

<p>I started off on this little rabbit-hole of changes agreeing philosophically,
but not really in practice, and ended up in total agreement, with the feeling
that the fact they are integers - and that OCaml treats integers specially -
encourages possibly poorer library design.</p>

<p>On Unix, a <code class="language-plaintext highlighter-rouge">Unix.file_descr</code> is just the OCaml representation of the C <code class="language-plaintext highlighter-rouge">int</code>,
but it’s absolutely not that on Windows, where the implementation is much more
complicated. So the <code class="language-plaintext highlighter-rouge">Obj.magic</code> “trick” is a route to a segfault on Windows.
That’s obviously not important for a Linux-only library like ocaml-uring, but
<code class="language-plaintext highlighter-rouge">Obj.magic</code> is a wart. The Windows complexity exists because we have both CRT
file descriptors (which are just C <code class="language-plaintext highlighter-rouge">int</code>s, the same as on Unix) and also OS
file handles (which are Win32 <code class="language-plaintext highlighter-rouge">HANDLE</code>s - a pointer). There’s some added
book-keeping complexity needed, but that’s not important right now. The key
thing is that we have a notion of an “Operating System” file descriptor and a
“C Runtime Library” file descriptor. On Unix, they happen to be the same thing;
on Windows, they’re not. On Windows, given one, it is always possible to obtain
the other (the functions <a href="https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/get-osfhandle"><code class="language-plaintext highlighter-rouge">_get_osfhandle</code></a>
and <a href="https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/open-osfhandle"><code class="language-plaintext highlighter-rouge">_open_osfhandle</code></a>
are provided for this).</p>

<p>That gives a straightforward portable C API: <code class="language-plaintext highlighter-rouge">caml_unix_file_descr_of_os</code> which
takes an <code class="language-plaintext highlighter-rouge">int</code> (on Unix) or a <code class="language-plaintext highlighter-rouge">HANDLE</code> (on Windows) and returns an OCaml
<code class="language-plaintext highlighter-rouge">Unix.file_descr</code> representing it. It’s similarly straightforward to provide
<code class="language-plaintext highlighter-rouge">caml_unix_file_descr_of_fd</code> and <code class="language-plaintext highlighter-rouge">caml_unix_fd_of_file_descr</code> for getting the
CRT file descriptor on both, and we then have the portable primitives required
(<code class="language-plaintext highlighter-rouge">caml_unix_os_of_file_descr</code> is not necessary, but the reasons are left as an
exercise for the avid portable C stub code author).</p>

<p>To this, I then added a quick commit to solve another issue in this area, which
is tracked in <a href="https://github.com/ocaml/ocaml/issues/9052">ocaml/ocaml#9052</a>.
While there’s the veto on <em>converting</em> a <code class="language-plaintext highlighter-rouge">Unix.file_descr</code> to an <code class="language-plaintext highlighter-rouge">int</code>, being
able to debug the values in logs and so forth is useful.</p>

<p>At this point, the scab of this old PR well and truly picked, I paused and
thought a bit about the original problem I’d been trying to solve in <a href="https://github.com/ocaml/ocaml/pull/1990">#1990</a>,
and which remained not-entirely-satisfactorily solved. It’s also described in
<a href="https://github.com/ocaml/ocaml/issues/6948">ocaml/ocaml#6948</a>, and while
looking through it, I realised <a href="/blog/platform/2025/04/03/cloexec.html">some of my previous work</a>
was related to this. In particular, that passing specific file descriptors from
one process to another is <em>not</em> a Unix-specific operation, and you can do it on
Windows as well, it’s just less common (i.e. you can start a Windows process
with file descriptor 3 connected to a control channel if you want, just as you
can on Unix).</p>

<p>Which got me thinking some more about having an OCaml function for this, and
about “file descriptors are not integers”, and I realised that while this is
<em>mostly</em> true, it’s not <em>always</em> true. For a start, there are three well-known
values, 0 for “input”, 1 for “output”, and 2 for “logging”, known as the
“Standard Input, Output and Error” handles. These are exposed in the Unix
library as <code class="language-plaintext highlighter-rouge">Unix.stdin</code>, <code class="language-plaintext highlighter-rouge">Unix.stdout</code> and <code class="language-plaintext highlighter-rouge">Unix.stderr</code> (and in the Standard
Library itself, for the channels API), and you can also specify them when
spawning processes. Stepping back, let’s consider if instead of treating file
descriptors as <code class="language-plaintext highlighter-rouge">int</code>, <a href="http://cs.bell-labs.co/who/ken/">Thompson</a> and <a href="https://9p.io/who/dmr/index.html">Ritchie</a>
had instead used <a href="https://en.wikipedia.org/wiki/Opaque_pointer#C">opaque pointers</a><sup id="fnref:note" role="doc-noteref"><a href="#fn:note" class="footnote" rel="footnote">1</a></sup>
for <code class="language-plaintext highlighter-rouge">open</code>, <code class="language-plaintext highlighter-rouge">read</code>, and so forth in the first version of Unix.</p>

<p>In this hypothetical scenario, virtually all C programs would be just fine:
<code class="language-plaintext highlighter-rouge">STDIN_FILENO</code> and so forth would still be there, and virtually all C code using
file descriptors doesn’t actually care about the precise value. However, the
implementation would face the same issue as we have in OCaml when trying to
spawn other processes if we needed to <em>pass</em> file descriptors to a process. At
that point, it dawned on me: “file descriptors are <strong>indeed</strong> not integers<strong>,
but at program startup, there is a partial map from integers to file
descriptors</strong>”. Most processes are created with at least <code class="language-plaintext highlighter-rouge">0</code>, <code class="language-plaintext highlighter-rouge">1</code> and <code class="language-plaintext highlighter-rouge">2</code> added
to that map, and they can then be retrieved with <code class="language-plaintext highlighter-rouge">Unix.stdin</code>/<code class="language-plaintext highlighter-rouge">STDIN_FILENO</code>,
etc. Given that Windows <em>does</em> support the notion of passing additional file
descriptors, that means that this operation <strong>is portable</strong>, and the inability
to pass <em>abstract</em> file descriptors <strong>using</strong> <em>integers</em> seems like a gap. This
fits with a TODO item I noted <a href="/blog/platform/2025/04/03/cloexec.html">in April</a>:
once <code class="language-plaintext highlighter-rouge">Unix.create_process</code> et al are correctly <em>inheriting</em> file descriptors on
Windows, it makes even more sense to be able to do this too.</p>

<p><a href="https://github.com/ocaml/ocaml/issues/6948">#6948</a> proposed adding a
<code class="language-plaintext highlighter-rouge">file_descr list</code> argument to functions like <code class="language-plaintext highlighter-rouge">create_process</code>, but that’s the
wrong API here. It <em>assumes</em> that each descriptor is being passed using its
current file descriptor number, which is wrong for two reasons. Firstly, it
breaks the abstraction (we have just treated a <code class="language-plaintext highlighter-rouge">Unix.file_descr</code> as an <code class="language-plaintext highlighter-rouge">int</code>).
Secondly, as noted in the C API above, if the <code class="language-plaintext highlighter-rouge">Unix.file_descr</code> was created from
an OS file descriptor on Windows, it doesn’t necessarily <em>have</em> a CRT descriptor
associated with it. The fix is easy: pass a <code class="language-plaintext highlighter-rouge">(int * file_descr) list</code>. We
preserve the abstraction, and specify the mapping.</p>
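
<p>As a rough sketch of what such a mapping looks like from the caller’s side:
<code class="language-plaintext highlighter-rouge">valid_mapping</code> below is a hypothetical helper (nothing like it exists in the
Unix library) which checks the two properties I’d expect any implementation to
insist on, namely that target numbers are non-negative and that none is
requested twice. It needs the Unix library:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hypothetical helper, not part of the Unix library: a mapping is usable if
   every target number is non-negative and no target number appears twice. *)
let valid_mapping (fds : (int * Unix.file_descr) list) : bool =
  let numbers = List.map fst fds in
  List.for_all (fun n -&gt; n &gt;= 0) numbers
  &amp;&amp; List.length (List.sort_uniq compare numbers) = List.length numbers

let () =
  let (read_end, write_end) = Unix.pipe () in
  (* The child would see the usual three descriptors, plus the read end of
     the pipe as descriptor 3, whatever number it has in this process. *)
  let mapping =
    [ (0, Unix.stdin); (1, Unix.stdout); (2, Unix.stderr); (3, read_end) ]
  in
  assert (valid_mapping mapping);
  Unix.close read_end;
  Unix.close write_end
</code></pre></div></div>

<p>The point is that the integers only ever appear on the left of each pair: the
<code class="language-plaintext highlighter-rouge">file_descr</code> values stay abstract throughout.</p>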

<p>With this extra list, when we spawn a process, we now have the ability to pass,
say, the input portion of a pipe (created with <code class="language-plaintext highlighter-rouge">Unix.pipe</code>) as file
descriptor 3. How can we pick that up on the OCaml side? As it happens, on
Windows we really could trivially enumerate them, and so have
<code class="language-plaintext highlighter-rouge">Unix.startup_file_descrs : (int * file_descr) list</code> or some such, but there’s
no way to do that portably on Unix. For this specific case, there is a concrete
reason to have a function <code class="language-plaintext highlighter-rouge">int -&gt; Unix.file_descr</code> (indeed, this is exactly what
the Unix library does internally to create <code class="language-plaintext highlighter-rouge">stdin</code>, <code class="language-plaintext highlighter-rouge">stdout</code> and <code class="language-plaintext highlighter-rouge">stderr</code>).
However, there is <em>no case</em> for a function <code class="language-plaintext highlighter-rouge">Unix.file_descr -&gt; int</code>.</p>

<p>What to call this function, therefore? Having <code class="language-plaintext highlighter-rouge">file_descr_of_fd</code> (as in C) but
not <code class="language-plaintext highlighter-rouge">fd_of_file_descr</code> seems very strange. But what does the type tell us? We’re
taking an <code class="language-plaintext highlighter-rouge">int</code> - a <em>low-level</em> descriptor and returning a <code class="language-plaintext highlighter-rouge">file_descr</code> - a
<em>high-level</em> descriptor. C already has two levels - file descriptors and the stream API (“<code class="language-plaintext highlighter-rouge">FILE *</code>”) -
and provides a function to go from the low level to the stream API. It’s
<a href="https://pubs.opengroup.org/onlinepubs/9799919799/functions/fdopen.html"><code class="language-plaintext highlighter-rouge">fdopen</code></a>.
The C streams API isn’t really about abstraction, so C also has a function to go
the other way, <a href="https://pubs.opengroup.org/onlinepubs/9799919799/functions/fileno.html"><code class="language-plaintext highlighter-rouge">fileno</code></a>,
but the nice property is that <code class="language-plaintext highlighter-rouge">fdopen</code> and <code class="language-plaintext highlighter-rouge">fileno</code> do not sound like reverse
operations.</p>
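
<p>To make the one-way shape concrete, here is a small self-contained model. It
is not the code from the branch: <code class="language-plaintext highlighter-rouge">Descr</code> is a hypothetical module, the <code class="language-plaintext highlighter-rouge">int</code>
representation is the Unix-style one only, and rejecting negative numbers is my
own assumption for the sketch:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hypothetical model of a descriptor type with a one-way conversion. *)
module Descr : sig
  type t                    (* abstract: the representation is hidden *)
  val fdopen : int -&gt; t     (* low-level number in, descriptor out *)
  val stdin : t
  val stdout : t
  val stderr : t
  (* Deliberately no [t -&gt; int] function. *)
end = struct
  type t = int              (* Unix-style; Windows would need more *)

  (* Rejecting negative numbers is an assumption made for this sketch. *)
  let fdopen n = if n &lt; 0 then invalid_arg "Descr.fdopen" else n

  (* The three well-known descriptors are built the same way. *)
  let stdin = fdopen 0
  let stdout = fdopen 1
  let stderr = fdopen 2
end

(* A child started with a control channel on descriptor 3 picks it up here. *)
let control_channel : Descr.t = Descr.fdopen 3
</code></pre></div></div>

<p>Outside the module, <code class="language-plaintext highlighter-rouge">control_channel + 1</code> is a type error, which is rather the
point.</p>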

<p>Fixing <code class="language-plaintext highlighter-rouge">Unix.create_process</code> on Windows is definitely a job for another day, but
adding <code class="language-plaintext highlighter-rouge">Unix.fdopen</code> is easy, so I did that. The three commits are sat in
<a href="https://github.com/dra27/ocaml/commits/fdopen">dra27/ocaml#fdopen</a>. Next up,
applying all this to ocaml-uring to get rid of the magic…</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:note" role="doc-endnote">
      <p>I couldn’t find a precise reference for exactly when in the early 1970s this became possible, relative to the first releases of Unix! <a href="#fnref:note" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name></name></author><category term="blog" /><category term="platform" /><category term="ocaml" /><category term="windows" /><summary type="html"><![CDATA[There was a flurry of activity on ocaml-multicore/ocaml-uring this month leading to a release (ocaml/opam-repository#28604). ocaml-uring provides bindings to the Linux’s io_uring, which allows batching various syscalls to the kernel for it to execute out-of-order, and in parallel. Its principal use at the moment is for the high-performance Linux backend of Eio.]]></summary></entry><entry><title type="html">A second foray into agentic coding</title><link href="https://www.dra27.uk/blog/platform/2025/09/28/effectful-bug-hunting.html" rel="alternate" type="text/html" title="A second foray into agentic coding" /><published>2025-09-28T00:00:00+00:00</published><updated>2025-09-28T00:00:00+00:00</updated><id>https://www.dra27.uk/blog/platform/2025/09/28/effectful-bug-hunting</id><content type="html" xml:base="https://www.dra27.uk/blog/platform/2025/09/28/effectful-bug-hunting.html"><![CDATA[<p>Continuing the <a href="/blog/platform/2025/09/17/late-to-the-party.html">previous theme</a>
of dabbling with matters agentic. Previously, I’d quite assiduously kept my
fingers away from files. This time, I wanted to try something exploratory,
switching to the agent for things I was actively stuck on.</p>

<p>I was still (very) curious about the latent remaining bug in <a href="/blog/platform/2025/09/25/building-with-effects.html">Lucas’s excellent work</a>.
There were some corners which had been cut in the prototype, and I had a brief
foray into this problem, with a view this time to ensuring artefact equivalence
between what OCaml’s build system would produce and what our altered driver
program was doing.</p>

<p>If you have a pre-built compiler and a clean (of binary artefacts) OCaml source
tree, you can actually build the bytecode compiler in just three, ahem, short
commands (I’m intentionally glossing over all the generated source files):</p>
<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>ocamlc <span class="nt">-I</span> utils <span class="nt">-I</span> parsing <span class="nt">-I</span> typing <span class="nt">-I</span> bytecomp <span class="nt">-I</span> file_formats <span class="nt">-I</span> lambda <span class="nt">-I</span> middle_end <span class="nt">-I</span> middle_end/closure <span class="nt">-I</span> middle_end/flambda <span class="nt">-I</span> middle_end/flambda/base_types <span class="nt">-I</span> driver <span class="nt">-I</span> runtime <span class="nt">-g</span> <span class="nt">-strict-sequence</span> <span class="nt">-principal</span> <span class="nt">-absname</span> <span class="nt">-w</span> +a-4-9-40-41-42-44-45-48 <span class="nt">-warn-error</span> +a <span class="nt">-bin-annot</span> <span class="nt">-strict-formats</span> <span class="nt">-linkall</span> <span class="nt">-a</span> <span class="nt">-o</span> compilerlibs/ocamlcommon.cma utils/config.mli utils/build_path_prefix_map.mli utils/format_doc.mli utils/misc.mli utils/identifiable.mli utils/numbers.mli utils/arg_helper.mli utils/local_store.mli utils/load_path.mli utils/profile.mli utils/clflags.mli utils/terminfo.mli utils/ccomp.mli utils/warnings.mli utils/consistbl.mli utils/linkdeps.mli utils/strongly_connected_components.mli utils/targetint.mli utils/int_replace_polymorphic_compare.mli utils/domainstate.mli utils/binutils.mli utils/lazy_backtrack.mli utils/diffing.mli utils/diffing_with_keys.mli utils/compression.mli parsing/location.mli parsing/unit_info.mli parsing/asttypes.mli parsing/longident.mli parsing/parsetree.mli parsing/docstrings.mli parsing/syntaxerr.mli parsing/ast_helper.mli parsing/ast_iterator.mli parsing/builtin_attributes.mli parsing/camlinternalMenhirLib.mli parsing/parser.mli parsing/pprintast.mli parsing/parse.mli parsing/printast.mli parsing/ast_mapper.mli parsing/attr_helper.mli 
parsing/ast_invariants.mli parsing/depend.mli typing/annot.mli typing/value_rec_types.mli typing/ident.mli typing/path.mli typing/type_immediacy.mli typing/outcometree.mli typing/primitive.mli typing/shape.mli typing/types.mli typing/data_types.mli typing/rawprinttyp.mli typing/gprinttyp.mli typing/btype.mli typing/oprint.mli typing/subst.mli typing/predef.mli typing/datarepr.mli file_formats/cmi_format.mli typing/persistent_env.mli typing/env.mli typing/errortrace.mli typing/typedtree.mli typing/signature_group.mli typing/printtyped.mli typing/ctype.mli typing/out_type.mli typing/printtyp.mli typing/errortrace_report.mli typing/includeclass.mli typing/mtype.mli typing/envaux.mli typing/includecore.mli typing/tast_iterator.mli typing/tast_mapper.mli typing/stypes.mli typing/shape_reduce.mli file_formats/cmt_format.mli typing/cmt2annot.mli typing/untypeast.mli typing/includemod.mli typing/includemod_errorprinter.mli typing/typetexp.mli typing/printpat.mli typing/patterns.mli typing/parmatch.mli typing/typedecl_properties.mli typing/typedecl_variance.mli typing/typedecl_unboxed.mli typing/typedecl_immediacy.mli typing/typedecl_separability.mli lambda/debuginfo.mli lambda/lambda.mli typing/typeopt.mli typing/typedecl.mli typing/value_rec_check.mli typing/typecore.mli typing/typeclass.mli typing/typemod.mli lambda/printlambda.mli lambda/switch.mli lambda/matching.mli lambda/value_rec_compiler.mli lambda/translobj.mli lambda/translattribute.mli lambda/translprim.mli lambda/translcore.mli lambda/translclass.mli lambda/translmod.mli lambda/tmc.mli lambda/simplif.mli lambda/runtimedef.mli file_formats/cmo_format.mli middle_end/internal_variable_names.mli middle_end/linkage_name.mli middle_end/compilation_unit.mli middle_end/variable.mli middle_end/flambda/base_types/closure_element.mli middle_end/flambda/base_types/var_within_closure.mli middle_end/flambda/base_types/tag.mli middle_end/symbol.mli middle_end/flambda/base_types/set_of_closures_id.mli 
middle_end/flambda/base_types/set_of_closures_origin.mli middle_end/flambda/parameter.mli middle_end/flambda/base_types/static_exception.mli middle_end/flambda/base_types/mutable_variable.mli middle_end/flambda/base_types/closure_id.mli middle_end/flambda/projection.mli middle_end/flambda/base_types/closure_origin.mli middle_end/clambda_primitives.mli middle_end/flambda/allocated_const.mli middle_end/flambda/flambda.mli middle_end/flambda/freshening.mli middle_end/flambda/base_types/export_id.mli middle_end/flambda/simple_value_approx.mli middle_end/flambda/export_info.mli middle_end/backend_var.mli middle_end/clambda.mli file_formats/cmx_format.mli file_formats/cmxs_format.mli bytecomp/instruct.mli bytecomp/meta.mli bytecomp/opcodes.mli bytecomp/bytesections.mli bytecomp/dll.mli bytecomp/symtable.mli driver/pparse.mli driver/compenv.mli driver/main_args.mli driver/compmisc.mli driver/makedepend.mli driver/compile_common.mli utils/config.ml utils/build_path_prefix_map.ml utils/format_doc.ml utils/misc.ml utils/identifiable.ml utils/numbers.ml utils/arg_helper.ml utils/local_store.ml utils/load_path.ml utils/clflags.ml utils/profile.ml utils/terminfo.ml utils/ccomp.ml utils/warnings.ml utils/consistbl.ml utils/linkdeps.ml utils/strongly_connected_components.ml utils/targetint.ml utils/int_replace_polymorphic_compare.ml utils/domainstate.ml utils/binutils.ml utils/lazy_backtrack.ml utils/diffing.ml utils/diffing_with_keys.ml utils/compression.ml parsing/location.ml parsing/unit_info.ml parsing/asttypes.ml parsing/longident.ml parsing/docstrings.ml parsing/syntaxerr.ml parsing/ast_helper.ml parsing/ast_iterator.ml parsing/builtin_attributes.ml parsing/camlinternalMenhirLib.ml parsing/parser.ml parsing/lexer.mli parsing/lexer.ml parsing/pprintast.ml parsing/parse.ml parsing/printast.ml parsing/ast_mapper.ml parsing/attr_helper.ml parsing/ast_invariants.ml parsing/depend.ml typing/ident.ml typing/path.ml typing/primitive.ml typing/type_immediacy.ml typing/shape.ml 
typing/types.ml typing/data_types.ml typing/rawprinttyp.ml typing/gprinttyp.ml typing/btype.ml typing/oprint.ml typing/subst.ml typing/predef.ml typing/datarepr.ml file_formats/cmi_format.ml typing/persistent_env.ml typing/env.ml typing/errortrace.ml typing/typedtree.ml typing/signature_group.ml typing/printtyped.ml typing/ctype.ml typing/out_type.ml typing/printtyp.ml typing/errortrace_report.ml typing/includeclass.ml typing/mtype.ml typing/envaux.ml typing/includecore.ml typing/tast_iterator.ml typing/tast_mapper.ml typing/stypes.ml typing/shape_reduce.ml file_formats/cmt_format.ml typing/cmt2annot.ml typing/untypeast.ml typing/includemod.ml typing/includemod_errorprinter.ml typing/typetexp.ml typing/printpat.ml typing/patterns.ml typing/parmatch.ml typing/typedecl_properties.ml typing/typedecl_variance.ml typing/typedecl_unboxed.ml typing/typedecl_immediacy.ml typing/typedecl_separability.ml typing/typeopt.ml typing/typedecl.ml typing/value_rec_check.ml typing/typecore.ml typing/typeclass.ml typing/typemod.ml lambda/debuginfo.ml lambda/lambda.ml lambda/printlambda.ml lambda/switch.ml lambda/matching.ml lambda/value_rec_compiler.ml lambda/translobj.ml lambda/translattribute.ml lambda/translprim.ml lambda/translcore.ml lambda/translclass.ml lambda/translmod.ml lambda/tmc.ml lambda/simplif.ml lambda/runtimedef.ml bytecomp/meta.ml bytecomp/opcodes.ml bytecomp/bytesections.ml bytecomp/dll.ml bytecomp/symtable.ml driver/pparse.ml driver/compenv.ml driver/main_args.ml driver/compmisc.ml driver/makedepend.ml driver/compile_common.ml
<span class="gp">$</span><span class="w"> </span>ocamlc <span class="nt">-I</span> utils <span class="nt">-I</span> parsing <span class="nt">-I</span> typing <span class="nt">-I</span> bytecomp <span class="nt">-I</span> file_formats <span class="nt">-I</span> lambda <span class="nt">-I</span> middle_end <span class="nt">-I</span> middle_end/closure <span class="nt">-I</span> middle_end/flambda <span class="nt">-I</span> middle_end/flambda/base_types <span class="nt">-I</span> driver <span class="nt">-I</span> runtime <span class="nt">-g</span> <span class="nt">-strict-sequence</span> <span class="nt">-principal</span> <span class="nt">-absname</span> <span class="nt">-w</span> +a-4-9-40-41-42-44-45-48 <span class="nt">-warn-error</span> +a <span class="nt">-bin-annot</span> <span class="nt">-strict-formats</span> <span class="nt">-a</span> <span class="nt">-o</span> compilerlibs/ocamlbytecomp.cma bytecomp/bytegen.mli bytecomp/printinstr.mli bytecomp/emitcode.mli bytecomp/bytelink.mli bytecomp/bytelibrarian.mli bytecomp/bytepackager.mli driver/errors.mli driver/compile.mli driver/maindriver.mli bytecomp/instruct.ml bytecomp/bytegen.ml bytecomp/printinstr.ml bytecomp/emitcode.ml bytecomp/bytelink.ml bytecomp/bytelibrarian.ml bytecomp/bytepackager.ml driver/errors.ml driver/compile.ml driver/maindriver.ml
<span class="gp">$</span><span class="w"> </span>ocamlc <span class="nt">-I</span> utils <span class="nt">-I</span> parsing <span class="nt">-I</span> typing <span class="nt">-I</span> bytecomp <span class="nt">-I</span> file_formats <span class="nt">-I</span> lambda <span class="nt">-I</span> middle_end <span class="nt">-I</span> middle_end/closure <span class="nt">-I</span> middle_end/flambda <span class="nt">-I</span> middle_end/flambda/base_types <span class="nt">-I</span> driver <span class="nt">-I</span> runtime <span class="nt">-g</span> <span class="nt">-compat-32</span> <span class="nt">-o</span> ocamlc <span class="nt">-strict-sequence</span> <span class="nt">-principal</span> <span class="nt">-absname</span> <span class="nt">-w</span> +a-4-9-40-41-42-44-45-48 <span class="nt">-warn-error</span> +a <span class="nt">-bin-annot</span> <span class="nt">-strict-formats</span> compilerlibs/ocamlcommon.cma compilerlibs/ocamlbytecomp.cma driver/main.mli driver/main.ml
</code></pre></div></div>

<p>I wanted to try a different angle on the <code class="language-plaintext highlighter-rouge">Load_path</code>, and this time produced a
function which <em>predicts</em> the files in the tree. The rules for this were pretty
easy for me to define, and I wasn’t sure I could face watching Claude
special-case everything. 130 lines of verifiably correct hacked OCaml later, I
had my load path function. A little bit more code later, those three commands
above were translated into an OCaml script (based on the ocamlcommon and
ocamlbytecomp libraries) which should perform exactly the same build. It ran - and it
built the compiler.</p>

<p><code class="language-plaintext highlighter-rouge">ocamlc</code> was, pleasingly, <em>exactly</em> the same. The .cma files, however, were not.
For ocamlcommon.cma, that turned out to be me being sloppy with my commands.
<code class="language-plaintext highlighter-rouge">ocamlcommon.cma</code> is linked with <code class="language-plaintext highlighter-rouge">-linkall</code>, but
<code class="language-plaintext highlighter-rouge">ocamlc -a -o foo.cma -linkall bar.cmo</code> is not the same as
<code class="language-plaintext highlighter-rouge">ocamlc -a -o foo.cma -linkall bar.ml</code>, because <code class="language-plaintext highlighter-rouge">-linkall</code> gets recorded in the
.cmo file <em>as well</em>. Easy fix - but the files were still different. A bit more
tweaking and I could see that actually the .cmo files were different.</p>
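
<p>Checking for this kind of equivalence needs nothing cleverer than comparing
digests. A minimal sketch, assuming the two builds have left their artefacts at
two paths of your choosing (<code class="language-plaintext highlighter-rouge">same_artefact</code> is a hypothetical helper, and the
paths in the comment are purely illustrative):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hypothetical helper: true if the two files have the same digest (MD5, as
   provided by the standard library's Digest module). *)
let same_artefact (a : string) (b : string) : bool =
  Digest.equal (Digest.file a) (Digest.file b)

(* Illustrative use only - both paths are made up:
     same_artefact "make-build/utils/config.cmo" "script-build/utils/config.cmo" *)
</code></pre></div></div>

<p>That says <em>whether</em> two files differ, of course, not <em>why</em>.</p>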

<p>A bit more poking and checking with <code class="language-plaintext highlighter-rouge">ocamlobjinfo</code> and a few other flags and
tricks, and I observed that:</p>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>ocamlc <span class="nt">-g</span> <span class="nt">-c</span> utils/config.ml
</code></pre></div></div>

<p>resulted in slightly different <em>debug information</em> from:</p>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>ocamlc <span class="nt">-g</span> <span class="nt">-c</span> utils/config.mli utils/config.ml
</code></pre></div></div>

<p>(it’s observably to do with the debug information - omit the <code class="language-plaintext highlighter-rouge">-g</code> and they’re
all identical). Lots to suspect here, but time for…</p>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>claude
<span class="go">╭───────────────────────────────────────────────────╮
│ ✻ Welcome to Claude Code!                         │
</span></code></pre></div></div>

<p>The problem was easy to state, but a conclusive explanation was not so quick
to come by. Claude, like most of these models, appears not to have
been trained on <a href="https://www.researchgate.net/figure/Then-a-Miracle-Occurs-Copyrighted-artwork-by-Sydney-Harris-Inc-All-materials-used-with_fig2_302632920">this old cartoon</a>,
and very merrily buzzes along for a few rounds of investigation, followed by a
highly dubious explanation for how it was probably something to do with
marshalling and, mumble mumble, the final binaries are the same so this bug is
probably OK.</p>

<p>Hmm. A few rounds of, “no, this needs to be equivalent as otherwise it’s not
reproducible” (“You’re so right!”), and we had a lot of test programs, a
frequent need for reminders that debugging OCaml’s Marshalling format was
possibly not going to help, but we weren’t very much closer to an answer.</p>

<p>Stepping back, I re-framed the problem, instead asking Claude to produce a
program which would give a textual dump of the debug information in each file,
so we could compare it. This was interesting - especially the occasional
hallucinations at having analysed “all the fields”, but we got there.</p>

<p>What was interesting was that we were struggling to perceive differences between
anything. Claude at this point was desperate to delve into the runtime code and
start doing hex-dumps of the marshal format to see what was actually different.
I appear to be a little older than Claude, and was more reticent about this
approach. I suggested we look at the polymorphic hash of some of these fields
instead. At this point, we started to see some differences - Claude’s inferences
at this point were working well, and there was a strong suggestion to add all
sorts of accessor functions into the <code class="language-plaintext highlighter-rouge">Types</code> module to be able to introspect
some of the values in more detail than normally intended (i.e. polymorphic hash
was telling us that some abstract values were different, but we wanted to
see what the differences really were).</p>
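
<p>To make the polymorphic hash trick concrete, here is a minimal sketch. The
<code class="language-plaintext highlighter-rouge">Node</code> module is a made-up stand-in for an abstract type and has nothing to do
with the real <code class="language-plaintext highlighter-rouge">Types</code> module; the only library function relied on is
<code class="language-plaintext highlighter-rouge">Hashtbl.hash</code>, which hashes the in-memory representation and so sees
straight through the abstraction.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Hypothetical abstract type: the [hidden] field has no accessor. *)
module Node : sig
  type t
  val make : hidden:int -&gt; string -&gt; t
  val name : t -&gt; string
end = struct
  type t = { name : string; hidden : int }
  let make ~hidden name = { name; hidden }
  let name t = t.name
end

let a = Node.make ~hidden:1 "config"
let b = Node.make ~hidden:2 "config"
let a' = Node.make ~hidden:1 "config"

(* Through the public interface the two values look identical... *)
let same_name = Node.name a = Node.name b

(* ...and structurally equal values always hash the same. *)
let equal_values_equal_hash = Hashtbl.hash a = Hashtbl.hash a'

(* A differing hash proves the representations differ; for [a] and [b] the
   hashes will almost certainly differ, but equal hashes prove nothing. *)
let probably_different = Hashtbl.hash a &lt;&gt; Hashtbl.hash b
</code></pre></div></div>

<p>One caveat worth remembering: <code class="language-plaintext highlighter-rouge">Hashtbl.hash</code> only inspects a bounded number
of words, so for large values a difference deep in the structure can be missed.
<code class="language-plaintext highlighter-rouge">Hashtbl.hash_param</code> lets the limits be raised.</p>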

<p>Reader, I told it to use <code class="language-plaintext highlighter-rouge">Obj.magic</code> instead 🫣</p>

<p>However, what happened next was truly fascinating and definitely very efficient.
The value being returned for one of the type IDs was simply not believable. It
was far too <em>high</em>. Claude also correctly observed that it was in fact a block,
and not an integer, which was what we were expecting. The human brain at this
point cuts in, and looks at the type: <code class="language-plaintext highlighter-rouge">Types.get_id: t -&gt; int</code>. No, that
accessor looks right. Brain slowly whirring; look at the code:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">get_id</span> <span class="n">t</span> <span class="o">=</span> <span class="p">(</span><span class="n">repr</span> <span class="n">t</span><span class="p">)</span><span class="o">.</span><span class="n">id</span>
</code></pre></div></div>

<p>Oh - it’s not an accessor (in another life, I could possibly have performed
Claude’s responses…).</p>

<p>All I had to point out was that <code class="language-plaintext highlighter-rouge">Types.get_id</code> was not an accessor, it was
normalising the result (to walk <code class="language-plaintext highlighter-rouge">Tlink</code> members of the type representation), and
Claude was on it, replacing semi-elegant OCaml code with a sea of calls to <code class="language-plaintext highlighter-rouge">Obj</code>
functions.</p>
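
<p>The accessor-versus-normaliser distinction is easier to see on a toy model. The
types below are an illustrative simplification that I'm assuming for the sake of
the example (a single <code class="language-plaintext highlighter-rouge">Link</code> constructor standing in for <code class="language-plaintext highlighter-rouge">Tlink</code>), not the
compiler's actual definitions:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Toy type nodes: a node is either a real variable or a link to another. *)
type ty = { id : int; mutable desc : desc }
and desc = Var | Link of ty

(* Follow links until a non-link node is reached. *)
let rec repr t =
  match t.desc with
  | Link t' -&gt; repr t'
  | Var -&gt; t

(* Looks like an accessor, but normalises first. *)
let get_id t = (repr t).id

(* What a plain field read would return. *)
let raw_id t = t.id

let target = { id = 7; desc = Var }
let alias = { id = 42; desc = Link target }
</code></pre></div></div>

<p>Here <code class="language-plaintext highlighter-rouge">get_id alias</code> is 7 while <code class="language-plaintext highlighter-rouge">raw_id alias</code> is 42: two chains can be
semantically equivalent and still be laid out differently in memory, which is
exactly what a marshalled dump picks up.</p>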

<p>But we had our answer - the type chain was different, if semantically equivalent
and, more importantly, Claude then leaped to the problem.</p>

<p>The internal <code class="language-plaintext highlighter-rouge">Types.new_id</code> reference isn’t reset between compilations 💥</p>
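
<p>As a rough sketch of why an un-reset counter hurts reproducibility (all names
here are invented for illustration, and the "debug information" is just a
marshalled list of ids):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Global id supply, in the style of a long-lived compiler process. *)
let next_id = ref 0
let new_id () = incr next_id; !next_id
let reset () = next_id := 0

(* Pretend compilation: allocate [n] ids and marshal them as "debug info". *)
let compile n =
  Marshal.to_string (List.init n (fun _ -&gt; new_id ())) []

(* The unit compiled on its own in a fresh process. *)
let alone = (reset (); compile 3)

(* The same unit compiled after another one, with no reset in between. *)
let after_other = (reset (); ignore (compile 2); compile 3)

(* The same sequence, resetting the counter between compilations. *)
let with_reset = (reset (); ignore (compile 2); reset (); compile 3)
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">alone</code> and <code class="language-plaintext highlighter-rouge">with_reset</code> are byte-for-byte identical, whereas <code class="language-plaintext highlighter-rouge">after_other</code>
differs, even though nothing about the meaning of the unit has changed.</p>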

<p>A quick rebuild later, and the same debug information was given regardless of
whether <code class="language-plaintext highlighter-rouge">utils/config.mli</code> was compiled at the same time as <code class="language-plaintext highlighter-rouge">utils/config.ml</code>.
Go Claude. My contribution was keeping the explorations looking at relevant
parts of the system, and not disappearing off on sometimes ridiculous and
unbelievable tangents. Maybe it would have got there on its own, but who knows
the tokens required and the GPUs scorched…</p>

<p>Plug that back into my little script. ocamlcommon.cma still different. At this
point, a line from <a href="https://en.wikipedia.org/wiki/Four_Weddings_and_a_Funeral">Four Weddings and a Funeral</a>
could be heard loud and clear in the human mind. It’s the one which follows
“Dear Lord, forgive me for what I am about to, ah, say in this magnificent place
of worship…”.</p>

<p>The fix was definitely working. But a quick bit of further experimentation
revealed that including <em>other</em> .mli files before utils/config.ml (and there are
a lot) was causing the information to change.</p>

<p>So:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ claude -c
</code></pre></div></div>

<p>As a human of hopefully normal emotional response to situations, the feeling of
being back at square one would normally have meant I’d have at least needed a
coffee before being able to face dusting off all the tools and scripts which had
been constructed in the previous investigations. But here of course the LLM
doesn’t care and was straight into using the tools previously constructed to
look at the revised problem. A lot more <code class="language-plaintext highlighter-rouge">Obj.magic</code>-like investigations later
looking at the shape of some debugging information, and Claude found another bit
to reset, this time in <code class="language-plaintext highlighter-rouge">Ctype</code>. All the level information in the type-checker
isn’t reset between compilations. Not a semantic issue, because the type checker
uses those numbers <em>relatively</em>, but again they leak into the representation of
some of the debugging information.</p>

<p>And it was working 🥳</p>

<p>Next up was trying to put those fixes into something resembling a commit series
that might one day be an acceptable PR. What I really wanted was a test. Claude
was great for this, although it lacks anything approximating taste (and this is
me writing…!). However, with no feelings to be hurt, the pointers were easy to
issue and the results impressive - especially constructing a non-trivial
ocamltest block. The result is previewable in <a href="https://github.com/dra27/ocaml/pull/237/files">dra27/ocaml#237</a>
on my GitHub fork, and the test is entirely Claude’s.</p>

<p>Having got to this stage, I extended the compiler with some of Lucas’s patches,
and started passing just the .ml files for compilation, allowing the compiler to
compile the .mli files on demand, as before. With some idle tinkering, I got to
the end of “coreall”, which is the point in OCaml’s build process where
<code class="language-plaintext highlighter-rouge">ocamlc</code>, the bytecode versions of everything in <code class="language-plaintext highlighter-rouge">tools/</code> and <code class="language-plaintext highlighter-rouge">ocamllex</code> have
all been compiled, along with the Standard Library. That was all being done from
a single compiler process, where the OCaml script driving the compiler consisted
mostly of the list of .ml files. Coupled with the predictive load path I’d
already put together, at this stage the “plumbing” needed in the scheduler is
just:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">compile_file</span> <span class="n">source_file</span> <span class="bp">()</span> <span class="o">=</span>
  <span class="nn">Compenv</span><span class="p">.</span><span class="n">readenv</span> <span class="nn">Format</span><span class="p">.</span><span class="n">std_formatter</span> <span class="p">(</span><span class="nc">Before_compile</span> <span class="n">source_file</span><span class="p">);</span>
  <span class="k">let</span> <span class="n">output_prefix</span> <span class="o">=</span> <span class="nn">Compenv</span><span class="p">.</span><span class="n">output_prefix</span> <span class="n">source_file</span> <span class="k">in</span>
  <span class="k">if</span> <span class="nn">Filename</span><span class="p">.</span><span class="n">extension</span> <span class="n">source_file</span> <span class="o">=</span> <span class="s2">".mli"</span> <span class="k">then</span>
    <span class="nn">Compile</span><span class="p">.</span><span class="n">interface</span> <span class="o">~</span><span class="n">source_file</span> <span class="o">~</span><span class="n">output_prefix</span>
  <span class="k">else</span>
    <span class="k">let</span> <span class="n">start_from</span> <span class="o">=</span> <span class="nn">Clflags</span><span class="p">.</span><span class="nn">Compiler_pass</span><span class="p">.</span><span class="nc">Parsing</span> <span class="k">in</span>
    <span class="nn">Compile</span><span class="p">.</span><span class="n">implementation</span> <span class="o">~</span><span class="n">start_from</span> <span class="o">~</span><span class="n">source_file</span> <span class="o">~</span><span class="n">output_prefix</span>

<span class="k">let</span> <span class="k">rec</span> <span class="n">execute</span> <span class="n">task</span> <span class="o">=</span>
  <span class="k">try</span> <span class="n">task</span> <span class="bp">()</span>
  <span class="k">with</span> <span class="n">effect</span> <span class="p">(</span><span class="nn">Load_path</span><span class="p">.</span><span class="nc">Missing</span> <span class="n">path</span><span class="p">)</span><span class="o">,</span> <span class="n">k</span> <span class="o">-&gt;</span>
    <span class="k">let</span> <span class="n">file</span> <span class="o">=</span> <span class="nn">Filename</span><span class="p">.</span><span class="n">chop_extension</span> <span class="n">path</span> <span class="o">^</span> <span class="s2">".mli"</span> <span class="k">in</span>
    <span class="n">execute</span> <span class="p">(</span><span class="n">compile_file</span> <span class="n">file</span><span class="p">);</span>
    <span class="n">execute</span> <span class="p">(</span><span class="nn">Effect</span><span class="p">.</span><span class="nn">Deep</span><span class="p">.</span><span class="n">continue</span> <span class="n">k</span><span class="p">)</span>
</code></pre></div></div>

<p>(as an aside, when it goes to being done with Domains I’ll possibly switch it to
a shallow handler, because the call stack with the deep handlers isn’t as
reasonable as I’d hoped for, but to be honest I just wanted to see it work!)</p>
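
<p>For anyone wanting to play with the shape of this without a patched compiler,
here is a self-contained toy version. Everything in it is an assumption made for
the example: <code class="language-plaintext highlighter-rouge">Missing</code> is a local effect rather than the <code class="language-plaintext highlighter-rouge">Load_path.Missing</code>
from the patches, the dependency table is hard-coded, and "compiling" just
records a name. It uses the library-level <code class="language-plaintext highlighter-rouge">Effect.Deep.try_with</code> instead of the
<code class="language-plaintext highlighter-rouge">effect</code> syntax, so it should work on any OCaml 5 compiler.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>open Effect
open Effect.Deep

(* Toy stand-in for "the interface for this unit hasn't been compiled yet". *)
type _ Effect.t += Missing : string -&gt; unit Effect.t

let compiled : (string, unit) Hashtbl.t = Hashtbl.create 16
let log = ref []

(* Hypothetical dependency table: which units each unit refers to. *)
let deps = function
  | "bar" -&gt; ["foo"]
  | "baz" -&gt; ["foo"; "bar"]
  | _ -&gt; []

(* "Compile" a unit, performing an effect for each dependency not yet built. *)
let compile_unit name () =
  List.iter
    (fun d -&gt; if not (Hashtbl.mem compiled d) then perform (Missing d))
    (deps name);
  Hashtbl.replace compiled name ();
  log := name :: !log

(* Run a task; when it asks for a missing unit, build that unit first and
   then resume the suspended task. *)
let rec execute task =
  try_with task ()
    { effc = fun (type a) (eff : a Effect.t) -&gt;
        match eff with
        | Missing name -&gt;
          Some (fun (k : (a, unit) continuation) -&gt;
            execute (compile_unit name);
            continue k ())
        | _ -&gt; None }

let () = execute (compile_unit "baz")
let build_order = List.rev !log
</code></pre></div></div>

<p>Asking only for <code class="language-plaintext highlighter-rouge">baz</code> ends up building <code class="language-plaintext highlighter-rouge">foo</code>, then <code class="language-plaintext highlighter-rouge">bar</code>, then <code class="language-plaintext highlighter-rouge">baz</code>,
without the caller ever stating the dependency order.</p>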

<p>Fascinatingly, all the artefacts (.cma and binaries) being produced were
identical except for the <code class="language-plaintext highlighter-rouge">Lazy</code> module in the Standard Library!</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ claude -c
</code></pre></div></div>

<p>Claude was simultaneously amazing and useless at this. Amazing, because I was
prompting some of this while cooking a meal, so being able to bark an
instruction (actually, I hadn’t set it up for voice - I was just quickly
typing) and then leave it to think for a minute or two was strangely efficient,
because investigating this on my own would have taken too much continuous
concentration. It was useless because we didn’t get anywhere near a believable
explanation, despite various efforts at resetting things. Sometimes you just
have to say <code class="language-plaintext highlighter-rouge">/exit</code> (and eat a meal…).</p>

<p>However, after the aforementioned meal, I dug into it a bit further. The issue
here was clearly to do with some state in the compiler - if ocamlcommon.cma or
ocamlmiddleend.cma were compiled, then the Lazy module differed. Incidentally,
at this point this wasn’t debug information which varied, it was the actual
module, but it was still semantically the same. Claude had correctly identified
that it was to do with the marshalling, and we had identified that there was a
difference in string sharing (so not entirely useless, in fairness). I carried
on poking and, with a little bit of jerry-rigging, managed to determine the
relatively small set of files in flambda and in ocamlcommon whose compilation
caused the change in Lazy. I was highly suspicious it was to do with compilation
of lazy values.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ claude -c
</code></pre></div></div>

<p>Feeding this information to Claude was a much better trick - the reasoning at
this point would contradict its own tangents (“I should look at … but wait,
the user has given me the list of affected files”). Impressively, we did home in
on the much more complex explanation for this third issue, which is to do with
lazy values used in globals in the <code class="language-plaintext highlighter-rouge">Matching</code> module. In this particular case,
if the compiler has compiled a file which matched on a lazy, causing
<code class="language-plaintext highlighter-rouge">Matching.code_force_lazy_block</code> to be forced in the compiler and thus the
<code class="language-plaintext highlighter-rouge">CamlinternalLazy</code> identifier to be added to the current persistent environment,
then a subsequent module (in this case <code class="language-plaintext highlighter-rouge">lazy.ml</code> in the Standard Library) which
<em>both</em> pattern matches on a lazy and which also refers to <code class="language-plaintext highlighter-rouge">CamlinternalLazy</code>
ends up with two extern’d string representations of <code class="language-plaintext highlighter-rouge">CamlinternalLazy</code> instead
of one. The reason is that the forced code block in <code class="language-plaintext highlighter-rouge">Matching</code> still refers to a
string used in a <em>previous</em> persistent environment. It’s not a semantic issue at
all, but it manifests itself because the string is not shared when the
subsequent file looks up the <code class="language-plaintext highlighter-rouge">CamlinternalLazy</code> identifier.</p>
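
<p>The underlying marshalling behaviour can be shown in isolation. This is only a
sketch of the general mechanism (a pair of strings standing in for whatever the
compiler actually writes out), but it shows how two values which are equal as
far as OCaml is concerned can still serialise differently:</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Build the string at runtime so each call yields a fresh heap block. *)
let fresh () = String.concat "" ["Camlinternal"; "Lazy"]

let a = fresh ()
let b = fresh ()   (* equal contents, physically distinct *)

(* Marshal preserves sharing: the second [a] is written as a back-reference. *)
let shared = Marshal.to_string (a, a) []

(* Two distinct blocks mean the string is written out twice. *)
let unshared = Marshal.to_string (a, b) []

let roundtrip (s : string) : string * string = Marshal.from_string s 0
</code></pre></div></div>

<p>Both round-trip to the same pair, but <code class="language-plaintext highlighter-rouge">unshared</code> is longer than <code class="language-plaintext highlighter-rouge">shared</code>, so
the bytes on disk differ.</p>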

<p>It was a battle to update the test to show this behaviour, but in fairness that
would have been a battle anyway! However, we got there too.</p>

<p>Three reproducibility issues identified, and a viable PR produced - with tests!</p>]]></content><author><name></name></author><category term="blog" /><category term="platform" /><category term="ocaml" /><category term="llm" /><category term="agentic" /><summary type="html"><![CDATA[Continuing the previous theme of dabbling with matters agentic. Previously, I’d quite assiduously kept my fingers away from files. This time, I wanted to try something exploratory, switching to the agent for things I was actively stuck on.]]></summary></entry><entry><title type="html">Retrofitting a build system into a compiler</title><link href="https://www.dra27.uk/blog/platform/2025/09/25/building-with-effects.html" rel="alternate" type="text/html" title="Retrofitting a build system into a compiler" /><published>2025-09-25T00:00:00+00:00</published><updated>2025-09-25T00:00:00+00:00</updated><id>https://www.dra27.uk/blog/platform/2025/09/25/building-with-effects</id><content type="html" xml:base="https://www.dra27.uk/blog/platform/2025/09/25/building-with-effects.html"><![CDATA[<p>Over the summer, <a href="https://github.com/lucasma8795">Lucas Ma</a> has been
investigating ideas surrounding using effects <a href="https://anil.recoil.org/ideas/effects-scheduling-ocaml-compiler">in the OCaml compiler itself</a>.
He’s <a href="https://lucasma8795.github.io/blog/">blogged some of his discoveries and adventures</a>.
The technical core of this work leads towards being able to use the OCaml
compiler as a library on-demand to create a longer-lived “compiler service”. Of
itself, that’s not at all revolutionary, but it is quite hard to do that with a
30 year old codebase that really was designed for single-shot separate
compilation.</p>

<p>Lucas got to grips pretty swiftly with OCaml’s build system, and initially
looked at generalising a core internal part of the compiler called the
<code class="language-plaintext highlighter-rouge">Load_path</code>. This is used by the compiler for scanning the various “include”
directories for files, principally typing information. For example, if your code
contains a call to <code class="language-plaintext highlighter-rouge">Unix.stat</code>, then the type checker needs the typing
information for a module called <code class="language-plaintext highlighter-rouge">Unix</code> which will cause it to request <code class="language-plaintext highlighter-rouge">unix.cmi</code>
from the <code class="language-plaintext highlighter-rouge">Load_path</code> and which will then hopefully resolve that to, say,
<code class="language-plaintext highlighter-rouge">~/.opam/switch/lib/ocaml/unix/unix.cmi</code>.</p>
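
<p>Stripped right down, that lookup amounts to trying each include directory in
turn. The function below is a simplified sketch and not the compiler's real
<code class="language-plaintext highlighter-rouge">Load_path</code> implementation; <code class="language-plaintext highlighter-rouge">resolve</code> and its <code class="language-plaintext highlighter-rouge">exists</code> parameter are names
I've made up so the example can be run without touching the file system.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Return the first directory entry for which [name] exists, if any. *)
let resolve ~exists dirs name =
  List.find_map
    (fun dir -&gt;
      let path = Filename.concat dir name in
      if exists path then Some path else None)
    dirs

(* A fake file system containing a single file. *)
let only_file = Filename.concat "lib" "unix.cmi"
let fake_exists path = path = only_file

let found = resolve ~exists:fake_exists ["stdlib"; "lib"] "unix.cmi"
let not_found = resolve ~exists:fake_exists ["stdlib"] "unix.cmi"
</code></pre></div></div>

<p>Passing <code class="language-plaintext highlighter-rouge">Sys.file_exists</code> as <code class="language-plaintext highlighter-rouge">exists</code> would give a version which looks at the
real disk.</p>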

<p>Effects provide an elegant way of inverting the control for this lookup, as the
program <em>calling</em> the compiler can then change the way these files are looked
up. It also provides the opportunity to “lie” to the compiler about the files
which are actually present, and this was the first thing Lucas started to do
with this change. In particular, it allows us to ignore the dependency graph.
When compiling a module, OCaml requires that all the type information the module
refers to has been compiled beforehand. If you have a module in <code class="language-plaintext highlighter-rouge">bar.ml</code> with
interface in <code class="language-plaintext highlighter-rouge">bar.mli</code> and where the code refers to <code class="language-plaintext highlighter-rouge">Foo.value</code>, then OCaml
requires <code class="language-plaintext highlighter-rouge">foo.mli</code> and <code class="language-plaintext highlighter-rouge">bar.mli</code> both to have been compiled before <code class="language-plaintext highlighter-rouge">bar.ml</code> is
compiled. However, thanks to this effectful trick, Lucas could instead allow the
compiler to start with <em>just</em> <code class="language-plaintext highlighter-rouge">bar.ml</code>. When <code class="language-plaintext highlighter-rouge">Foo.value</code> is encountered, there’s
a request made for <code class="language-plaintext highlighter-rouge">foo.cmi</code>, at which point, in the first prototype, the
compiler then quickly spawned another instance of itself to compile <code class="language-plaintext highlighter-rouge">foo.mli</code>
and <em>then</em> resumed compilation for <code class="language-plaintext highlighter-rouge">bar.ml</code>, with the same trick then happening
at the end of the compilation with <code class="language-plaintext highlighter-rouge">bar.cmi</code>. i.e. three files (<code class="language-plaintext highlighter-rouge">foo.mli</code>,
<code class="language-plaintext highlighter-rouge">bar.mli</code> and <code class="language-plaintext highlighter-rouge">bar.ml</code>) all compiled just from <code class="language-plaintext highlighter-rouge">ocamlc -c bar.ml</code>.</p>

<p>Possibly neat for being able to remove <a href="https://github.com/ocaml/ocaml/blob/trunk/.depend">monstrosities like this</a>
from OCaml’s source tree one day, but so far not <em>so</em> exciting. However, effects
give us more than just hooks into the compiler’s operations. We’ve got an
entire suspended compilation packaged up in a continuation… which means that
that same compiler “process” can now do something else. The next trick was to
have it that instead of spawning a new compiler, the current process itself
returned back into the compiler and itself compiled the required interface file
and <em>then</em> simply resumed the continuation of the previous file. At this point,
the 30-year-old codebase rears its head again. For reasons of speed and space,
many parts of the compiler, especially in the type checker, feature a lot of
global mutable state. In particular, the compilation pipeline is <em>not</em>
re-entrant. Luckily, thanks to the Merlin project, there is a mechanism in the
type-checker for taking snapshots of all this global state. Lucas was able to
piggy-back on this so that, just before the compiler performs an effect to
request a .cmi file (that doesn’t yet exist), it snapshots all its global state,
performs the effect and then, when resumed, restores that state again.</p>
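
<p>The snapshot-perform-restore pattern can be sketched with a couple of global
references. This is a toy under stated assumptions: <code class="language-plaintext highlighter-rouge">Need_cmi</code>,
<code class="language-plaintext highlighter-rouge">current_unit</code> and <code class="language-plaintext highlighter-rouge">counter</code> are invented for the example, and the real
mechanism snapshots far more state than a pair of refs.</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Toy effect: "I need the interface for this unit". *)
type _ Effect.t += Need_cmi : string -&gt; unit Effect.t

(* Hypothetical global mutable state of a non-re-entrant pipeline. *)
let current_unit = ref ""
let counter = ref 0

(* Snapshot the globals, perform the effect, restore them on resumption. *)
let request_cmi name =
  let saved = (!current_unit, !counter) in
  Effect.perform (Need_cmi name);
  let (u, c) = saved in
  current_unit := u;
  counter := c

(* "Compile" a unit, returning the global state as seen at the end. *)
let compile name ~needs =
  current_unit := name;
  counter := 0;
  List.iter (fun dep -&gt; incr counter; request_cmi dep) needs;
  (!current_unit, !counter)

(* The handler re-enters [compile] in the same process, clobbering the
   globals, before resuming the suspended compilation. *)
let run () =
  Effect.Deep.try_with (fun () -&gt; compile "bar" ~needs:["foo"]) ()
    { effc = fun (type a) (eff : a Effect.t) -&gt;
        match eff with
        | Need_cmi dep -&gt;
          Some (fun (k : (a, _) Effect.Deep.continuation) -&gt;
            ignore (compile dep ~needs:[]);
            Effect.Deep.continue k ())
        | _ -&gt; None }

let result = run ()
</code></pre></div></div>

<p>With the restore in place, <code class="language-plaintext highlighter-rouge">result</code> is <code class="language-plaintext highlighter-rouge">("bar", 1)</code>. Take the restore out of
<code class="language-plaintext highlighter-rouge">request_cmi</code> and the outer compilation would instead finish believing it was
compiling <code class="language-plaintext highlighter-rouge">foo</code>.</p>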

<p>Using this to interrupt type-checking and start on something else isn’t quite
what this <a href="https://github.com/ocaml/ocaml/blob/trunk/utils/local_store.mli"><code class="language-plaintext highlighter-rouge">Local_store</code> mechanism</a>
was originally intended for, and there was a bit of debugging to find a few more
pieces of global state which weren’t being “registered”, but Lucas was able to get
a means of building the OCaml bytecode compiler with nothing pre-compiled where
all the compiler had to be given was the list of .ml files required. From a
toolchain perspective, we’re essentially retiring <a href="https://ocaml.org/manual/5.3/depend.html"><code class="language-plaintext highlighter-rouge">ocamldep</code></a>.</p>

<p>So far, still mostly just so neat: one single compiler process (just about)
successfully recompiling the compiler. However, that’s equivalent to compiling
with <code class="language-plaintext highlighter-rouge">make -j1</code> - a sequential, and therefore slow, build. The awesome part came
next - Domains. In the final version Lucas was working on, multiple domains were
started up, each one beginning compilation of one of the .ml files required for
the compiler <em>in parallel</em>, with a scheduler handling effects coming from each
of these in turn when .mli files needed compiling, and despatching those. The
<code class="language-plaintext highlighter-rouge">Local_store</code> mechanism came in handy here - Lucas extended it to use
<a href="https://ocaml.org/manual/5.3/api/Domain.DLS.html">Domain Local Storage</a>,
combined with the snapshotting. The prototype - for simplicity - featured no
sharing between these domains.</p>
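
<p>As a small illustration of what Domain Local Storage buys here (a sketch only:
the per-domain counter is a hypothetical stand-in for the compiler's global
state, and the "work" is just incrementing it):</p>

<div class="language-ocaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(* Each domain lazily gets its own private counter. *)
let counter_key = Domain.DLS.new_key (fun () -&gt; ref 0)

(* Pretend to compile [n] units, bumping this domain's counter each time. *)
let compile_units n () =
  let counter = Domain.DLS.get counter_key in
  for _ = 1 to n do incr counter done;
  !counter

(* One domain per batch of work, all running in parallel. *)
let results =
  [1; 2; 3]
  |&gt; List.map (fun n -&gt; Domain.spawn (compile_units n))
  |&gt; List.map Domain.join

(* The main domain's copy was never touched by the others. *)
let main_counter = !(Domain.DLS.get counter_key)
</code></pre></div></div>

<p>Each domain sees only its own increments, so <code class="language-plaintext highlighter-rouge">results</code> is <code class="language-plaintext highlighter-rouge">[1; 2; 3]</code> and
<code class="language-plaintext highlighter-rouge">main_counter</code> stays at 0. With a single shared <code class="language-plaintext highlighter-rouge">ref</code> the domains would
instead be racing on the same cell.</p>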

<p>By the end of the summer, this was <em>very</em> nearly working, which is a result
considerably further than I’d expected in the time available! As is so often the
case with these investigations, Lucas’s work had revealed some new facets to
this area that weren’t clear to me before. I had previously been wondering how
we would be exposing this kind of multi-threaded compiler to the user via the
driver programs, but it became increasingly clear that this wasn’t something
that would be necessary - the program that we were working on to build the
compiler itself was of course not the compiler driver, <em>but a build system</em>. To
me, there are two particularly exciting things about that:</p>

<ol>
  <li>It’s a <em>really simple</em> build system. Hopefully when the last few kinks in the
parallel type checker are ironed out (read on…), we may be able to add that
it’s really simple <strong>and performant</strong>.</li>
  <li>It’s fundamentally portable. It leads to the possibility of bootstrapping OCaml
trivially with itself. This has been done before with <code class="language-plaintext highlighter-rouge">ocamlbuild</code>, but the
result was a maintenance disaster. However, the sheer simplicity of the
multi-domain effect-scheduling approach is making this perennial build system
hacker tinker…</li>
</ol>]]></content><author><name></name></author><category term="blog" /><category term="platform" /><category term="ocaml" /><category term="eeg" /><summary type="html"><![CDATA[Over the summer, Lucas Ma has been investigating ideas surrounding using effects in the OCaml compiler itself. He’s blogged some of his discoveries and adventures. The technical core of this work leads towards being able to use the OCaml compiler as a library on-demand to create a longer-lived “compiler service”. Of itself, that’s not at all revolutionary, but it is quite hard to do that with a 30 year old codebase that really was designed for single-shot separate compilation.]]></summary></entry></feed>