File descriptors as integers considered harmful
Continuing the previous foray into file descriptors,
and trying to remove Obj.magic
both from ocaml-uring’s code itself, and from
the recommended way for using it.
I expected this to be a fairly simple application of the C API features added in dra27/ocaml#file_descr-interop. It’s not quite what happened, though - as I don’t actually think that any of the changes I propose for the Unix library in OCaml are needed for ocaml-uring. What piqued my interest in this rather deep rabbit-hole was that what I would contend are slightly poor design decisions stem from the corrupting notion in C that file descriptors are integers - i.e. because file descriptors in C are integers, it becomes dangerously tempting to keep them as that in OCaml, because OCaml treats integers specially.
ocaml-uring’s documentation states that it “aims to provide a thin type-safe layer for use in higher-level interfaces”. I think adopting Xavier’s veto on manipulating file descriptors as integers in OCaml (i.e. never doing it) leads to greater type safety, even in thin bindings, without compromising performance. Let’s see! 🤓
The sources for ocaml-uring
use Obj.magic
both in the library itself in one place, and at various points
in both the tests and the documentation, exclusively for duping the type system
into believing an int
is a Unix.file_descr
.
Consider mkdirat(2).
This is part of a family of *at syscalls which accept a file descriptor as an
argument and which will use the directory name of the file referred to by that
file descriptor as the base for a relative path. There is a special value
AT_FDCWD
which can be passed instead of an open file descriptor (it instructs
mkdirat
instead to interpret paths relative to the current working directory),
and it’s the representation of this AT_FDCWD
which causes the Obj.magic
.
When used in C, AT_FDCWD
is an illegal file descriptor value (it’s -100, as it
happens). Continuing with mkdirat
as an example, we have this C declaration:
void io_uring_prep_mkdirat(struct io_uring_sqe *sqe,
int dirfd, const char *path, mode_t mode);
and this corresponding primitive in ocaml-uring:
external submit_mkdirat : t -> id
-> Unix.file_descr -> Sketch.ptr -> int
-> bool = "ocaml_uring_submit_mkdirat" [@@noalloc]
There’s a close correspondence between the two - the OCaml t
gets passed to
io_uring_get_sqe
which gives us the first parameter for the prep call. The id
is separately
passed to io_uring_set_sqe_data
(that’s part of a mechanism in liburing to be able to identify which syscalls
have completed - more on that later). The remaining arguments in this primitive
(the Unix.file_descr
, the Sketch.ptr
and the int
) correspond to the
dirfd
, path
and mode
arguments to the syscall (the liveness of buffers is
complicated here - the Sketch.ptr
type is essentially allowing a string to be
passed from OCaml to C without subsequently having to worry about it being moved
by the GC).
It looks like a good thin layer over the syscall. Every one of these prep calls
will necessarily have a io_uring_set_sqe_data
call, and it’s always good to
minimise the number of crossings between C and OCaml when writing C bindings.
It naturally becomes tempting to have a Unix.file_descr
value for AT_FDCWD
,
and that’s indeed the implementation:
let at_fdcwd : Unix.file_descr = Obj.magic Config.at_fdcwd
However, there really is no need for this, and if we step back from the fact
it’s an int
, and instead consider what AT_FDCWD
is modelling, a more
obvious solution emerges which doesn’t require magic. AT_FDCWD
is essentially
be used to mean “no file descriptor” (and there’s then an interpretation for
what that means mkdirat
should do). i.e. AT_FDCWD
means None
, which is
then being modelled by -100. An actual file descriptor is Some fd
, which is
being modelled by its number (which can never be -100). In other words, we could
pass a Unix.file_descr option
to the C stub, instead of a Unix.file_descr
.
This falls out even more naturally when we look at the actual Uring.mkdirat
exposed to the library user:
let mkdirat t ~mode ?(fd=at_fdcwd) path user_data =
It actually already has a Unix.file_descr option
because it’s using an
optional argument to convey it! It’s quite an easy adjustment
just to allow the C stub to resolve this, it has the benefit that the value of
AT_FDCWD
no longer has to be determined from OCaml, and it doesn’t involve any
additional allocations of memory on the OCaml side (because the optional
argument meant there was already one).
So far, so hopefully good. Now on to the one in the README:
# let fd =
if result < 0 then failwith ("Error: " ^ string_of_int result);
(Obj.magic result : Unix.file_descr);;
val fd : Unix.file_descr = <abstr>
Well, it wasn’t my intention, but the proposed Unix.fdopen
could deal with
that.
Except that this is not “fdopen
” in the spirit it was added for - in fact,
it’s precisely what was not supposed to be done. We’ve got an int
in OCaml
that is actually a file descriptor, and we’re trying to coerce it to be a
Unix.file_descr
. We’ve got a layering violation - we’ve allowed an abstract
value from the world of C that is represented in C as an int
to leak up to the
world of OCaml as an OCaml int
, which should be a number!
Funnily enough, ocaml-uring has another minor layering violation like this in
the translation of Unix error values to Unix.error
values in the
Uring.error_of_errno
function:
let error_of_errno e =
Uring.error_of_errno (abs e)
I didn’t want to abuse my proposed Unix.fdopen
this way, so soon after
conceiving it for a more noble and even slightly-typed purpose, so instead I
opted for bouncing these numbers back to C by instead adding Uring.file_descr_of_result
:
value ocaml_uring_file_descr_of_result(value v_result)
{
CAMLparam0();
CAMLlocal2(result, val);
if (Int_val(v_result) < 0) {
val = unix_error_of_code(-Int_val(v_result));
result = caml_alloc_small(1, 1);
} else {
val = caml_unix_file_descr_of_fd(Int_val(v_result));
result = caml_alloc_small(1, 0);
}
Field(result, 0) = val;
CAMLreturn(result);
}
which handles the pattern in C, where it belongs. The result certainly looks nicer in some of the test code, for example, the numeric matching in one of the tests:
if result >= 0 then
let fd = Unix.fdopen result in
Unix.close fd
else
let error = Uring.error_of_errno fd in
raise (Unix.Unix_error(error, (* ... *)))
becomes a slightly more idiomatic:
match Uring.file_descr_of_result result with
| Ok fd -> Unix.close fd
| Error error -> raise (Unix.Unix_error(error, (* ... *)))
but there are two things that still feel bad - those int
s are still there
in OCaml! Worse, we’re now incurring an extra C call to process them.
The veto tells us that the processing belongs in C. Is there a way to do this
where we don’t ever have the general int
result exposed in OCaml, but we’re
still a thin wrapper around the API and at no cost?
Not without breaking the API, but I think we can have our API cake and also eat
it in this case. A first stab sits on my fork.
The result of an io_uring operation is always a (C) int
. Each of the syscalls
uses one of three encodings of that integer. In all three cases, a negative
number corresponds to an error, where the absolute value of the number is the
Unix error number (from errno.h
). Some functions then just return 0 on
success. The other functions return a number ≥ 0 and for some of those, that
number is a file descriptor.
Now, the choice of encoding is known when we prep the call. So the trick I tried instead was to encode that in the user-data attached to the ring entry. The ocaml-uring implementation is using an array index as the user-data, so even on a 32-bit system we can comfortably steal two bits to encode how to treat the result. The three wait functions now instead of returning:
type cqe_option = private
| Cqe_none
| Cqe_some of { user_data_id : id; res: int }
return a slightly richer:
type cqe_option = private
| Cqe_none
| Cqe_unit of { user_data_id : id; result: unit }
| Cqe_int of { user_data_id : id; result: int }
| Cqe_fd of { user_data_id : id; result: Unix.file_descr }
| Cqe_error of { user_data_id : id; result: Unix.error }
A little bit of additional plumbing is needed for cancellation. I opted to have
the C functions themselves return the amended user-data value, as it’s only
needed for cancellation. That keeps the considerations of C in C-land, but it
might be better to push that into the Heap
implementation which backs the IDs
and allow both the job ID and heap pointer to be the same number.
I didn’t push this all the way back to Eio, but I did update all the tests and
documentation, and I have to say the results looked nice to me (and still
low-level) - in particular, the copying tests seem to get a clearer delineation
of error path. I’m not sure if the separation of unit
/int
was strictly
buying much, and given that various of the error codes which may want to come
aren’t Posix and so map to Unix.UNKNOWNERR
(which allocates), it would
possibly be worth looking at a richer error type which could itself be converted
to a Unix.error
expensively, but on-demand. However, what was nice was that
the return from the wait from functions still allocates just one two-field OCaml
word, and the file descriptor - if there is one - incurs no further conversion,
the value comes from C with the correct types already.
Entertainingly, the final version of the patch, while breaking the API, does not actually need any alterations to OCaml’s Unix library at all.
There’s one last possibly fun OxCaml follow-up to look at. The value which comes
back from the C wait functions is a cqe_option
(see above) whose shape is a
single constant constructor and four 2-field constructors. The first field in
these cases is the io_uring user-data, which we then translate back to the
OCaml value to yield one of these:
type 'a completion_option =
| None
| Unit of { result: unit; data: 'a }
| Int of { result: int; data: 'a }
| FD of { result: Unix.file_descr; data: 'a }
| Error of { result: Unix.error; data: 'a }
Which has the same shape! Now, we could do an evil Obj.magic
trick here to
re-use the block and simply replace the field which contained the io_uring user
data with the OCaml value (we’d need to swap the fields around in the types, but
that’s a minor detail). However, this kind of lying can be dangerous when the
OCaml optimiser comes into play… but can the uniqueness modes be used here to
allow the compiler to determine that the cqe_option
block is now available
and actually compile it down to just that field assignment? We’d then halve the
allocations arising from a ring wait operation 🤔
Anyway, file descriptors are not integers. I’m a new disciple. I think we should make T-Shirts.