Safe borrowing across FFI with Diplomat
This summer, I had the privilege of interning at Google on the i18n team under the guidance of Manish Goregaokar and Shane Carr. My team worked on a project called ICU4X, which is an open source internationalization library written in Rust. Although I contributed to this a little near the end, I spent most of my time working on the supplemental foreign function interface (FFI) tool, Diplomat.
Diplomat: FFI bindings to anything
Diplomat is an open source project started by the ICU4X project, designed by Manish and originally implemented by Shadaj, a past Google intern, and was motivated by the lack of good FFI tools that satisfied ICU4X’s requirements. This post focuses on how it works, but details on the motivation can be found in the design document.
Diplomat is a unidirectional1 Rust FFI tool that answers the question: “What
if we had FFI tools like CXX, but pluggable to any language?”
At its core lays the abstract representation of types and methods, which
represents the lowest common denominator for what types and methods look like
across most modern programming languages. This enables it to be theoretically
capable of generating bindings to any language that can support binding to C or
WASM, since it works through generated extern "C"
functions. To top things off,
it also generates native-feeling types and methods in the target language that
wrap the WASM/C interface. Currently, Diplomat has support for generating
bindings to C, C++, C#, and JavaScript/TypeScript.
The cost of creating bindings to all these languages is just the work involved
in writing one set of FFI API declarations, which looks mostly like normal Rust
code exposed through a module marked with the #[diplomat::bridge]
proc macro,
which is inspired by #[cxx::bridge]
. So if you have some large Rust library
like ICU4X, you can attach Diplomat bindings and get native feeling APIs in every
supported language at no extra cost. We’ll explore some examples together below.
Abstract Representation
Since Diplomat aims to be language agnostic, it requires an abstract representation that makes sense to many languages. For this reason, it’s abstract representation consists of only primitives (integers, booleans, strings), fieldless enums, structs, and methods. By supporting these ideas that are representable to most programming languages, Diplomat is able to reason about the types and generate conversion code between the two over FFI when calling Rust methods.
This is great, but introduces a problem: sometimes we don’t want to convert certain values at the boundary because their internals are, well, internal, and we only want to expose their functionality. This is where opaque structs come into play.
Opaque vs Non-Opaque Structs
In general, there are two kinds of structs in programming. The first kind is the
“opaque” struct, which usually has some more advanced functionality. An example
is Rust’s HashMap
, where the user shouldn’t particularly care about the exact
inner fields (which are private anyway!), and should only care about the
functionality it provides. The second kind is the “bag ‘o stuff” struct (also
referred to as non-opaque struct in this post), which has fields that are
semantically disjoint. This kind most frequently shows up in Rust in the form of
tuples and builders.
Diplomat distinguishes between these two kinds of structs, and the programmer
can communicate this by marking opaque structs with the #[diplomat::opaque]
attribute. Structs marked as opaque have special properties in Diplomat’s model:
- Fields are private from the other language, only methods can be used.
- Fields can be any concrete type, since Diplomat doesn’t examine them.
- Can only cross the FFI boundary behind a pointer, meaning conversion across the boundary is constant and minimal.
- Boxed opaques can only be returned, but not accepted as a parameters because many other languages don’t have a way to give up ownership.
On the other hand, structs not marked with the #[diplomat::opaque]
attribute
have nearly opposite properties:
- Fields are public to the other language.
- Fields can only be primitive types like numbers, booleans, characters, owned opaques, and other structs. My project extended this to also allow for borrowed opaques, string slices, and slices of primitives.
- Can only cross the FFI boundary by value, meaning that every field is also converted at the boundary.
In JavaScript, for example, “bag ‘o stuff” structs are converted into classes
with public fields and methods associated to the Diplomat-exposed methods, but
it’s worth noting that it will accept any Object
with the right fields in
method arguments.
My Intern Project: Lifetimes in Diplomat
Among other things, Rust’s type system allows for expressing mutability, ownership, and lifetimes, concepts that most other languages simply do not have. My work this summer was to extend Diplomat’s abstract model to figure out how we can take advantage of Rust’s precise control over these factors across the FFI boundary, particularly with returning references to borrowed data.
One of the problems that Rust’s lifetime system solves is dangling pointers2, and it does this statically by preventing references from outliving their owners. But rustc can only do static analysis on pure Rust code, and this is the main problem with returning pointers across FFI: it’s not just pure Rust code.
The solution, instead, is to make Diplomat enforce whatever technique the other language uses to prevent dangling references. In garbage collected languages like JavaScript, this involves holding a reference to the owner to stop it from being cleaned up early. In manually memory-managed languages like C++, this involves generating documentation notifying the programmer of what lifetime relationships must be upheld.
Let’s walk through a simple example:
#[diplomat::bridge]
mod ffi {
#[diplomat::opaque]
pub struct Bar {
// ...
}
#[diplomat::opaque]
pub struct Foo<'a> {
bar: &'a Bar,
}
impl<'a> Foo<'a> {
pub fn get_bar(&self) -> &'a Bar {
&self.bar
}
}
}
When Diplomat sees Foo::get_bar
, it sees the expanded self: &Foo<'a>
parameter and the returned &'a Bar
type, and sees that they both contain 'a
.
Here, 'a
is the lifetime of some Bar
value, but Diplomat has no way of
knowing where that Bar
value is, or how long it lives for. But it does know
that however long it lives for, Foo<'a>
is guaranteed to outlive it.
Therefore, by enforcing that &self
value outlives the return value,
Diplomat can ensure that there are no dangling references.
In the case that this API requires JavaScript bindings, Diplomat would use this lifetime relationship to hold a reference from the output to the input that it borrows from to generate an API that resembles the following:
class Foo {
// ...
get_bar() {
// `this.underlying` is the ptr to the Rust opaque
const bar = wasm.Foo_get_bar(this.underlying);
// hold a reference so the GC doesn't clean up `this` before `bar`
bar.fooRef = this;
return bar;
}
}
On the other hand, it’s up to the programmer to keep track of these “outlives” relationships in manually memory-managed languages. In C++, Diplomat generates documentation that notifies the programmer of what these relationships are:
class Foo {
// ...
public:
/* Lifetimes: the returned value must be outlived by: `this`.
*/
const Bar& get_bar() { /* ... */ }
}
Fundamentally, this idea is straightforward and not too difficult to implement… right?
Challenge 1: Lifetime Subtyping
Wrong.
Some people want to watch the world burn. Some people want to use lifetime bounds.
// `Foo` and `Bar` same as above
impl<'a> Foo<'a> {
pub fn get_bar<'b>(&self) -> &'b Bar
where
'a: 'b,
{
&self.bar
}
}
Unless your writing borrowing iterators or zero-copy Deserialize
implementations
by hand, you will probably never need this.
The problem here is that the output looks like this:
&'b Bar
While the input looks like this:
self: &Foo<'a>
These are not the same lifetime. If you just look in the input for 'b
,
you’ll find that it doesn’t borrow from the input, and this can and will
eventually lead to dangling pointers. This works in Rust because the 'a: 'b
bound allows'a
to coerce into 'b
, meaning that if we’re looking for all
inputs that live at least 'b
, then we should also count types with 'a
.
Therefore, we have to be cognizant of these bounds as well.
But it goes even further:
impl<'a> Foo<'a> {
pub fn get_bar<'b, 'c>(&self) -> &'c Bar
where
'a: 'b,
'b: 'c,
{
&self.bar
}
}
Why? Why would you ever write this?
Here, the output has lifetime 'c
, and the input has lifetime 'a
, but it
doesn’t explicitly say 'a: 'c
. Instead, it’s inferred via transitivity.
Do you see where this is going?
Here’s an exaggerated example of a where
clause:
'a: 'b,
'b: 'c,
'c: 'e,
'd: 'b,
'e: 'd + 'f,
This looks a lot like an adjacency list…
Ah. It’s a directed graph problem.
If you think of lifetimes as vertices and bounds as edges pointing from longer to shorter lifetimes, then determining possible coercions becomes a simple reachability problem. The diagram below shows the exaggerated example as a directed graph:
With this model, we can easily generalize the definition of what it means for some lifetime A to be able to coerce into some lifetime B. That is, lifetime A can coerce into lifetime B if and only if there exists a path from A to B on the directed graph of lifetimes and bounds. In practice, this also mean that all lifetimes in a cycle are all the same lifetime.
With what we’ve discussed to far, this is what the algorithm looks like:
- Traverse the output type, collecting any lifetimes encountered along the way.
- Traverse the input types. When some lifetime A is encountered, perform a depth-first search on shorter lifetimes while tracking visiting to break cycles. If one of the output lifetimes is encountered, add the parameter containing lifetime A to the list of parameters that the output borrows from.
- In code generation, use this information to communicate to the other language that there’s a borrowing relationship as described in the previous section.
This is all great, but take a moment to hear me out. I’m not saying I don’t trust my implementation (it’s tested), but if you’re relying on the transitivity of lifetime bounds across arbitrarily many lifetimes, your house is eventually getting egged by someone forced to read your code. So let’s move on.
Challenge 2: Borrowing from Struct Fields
In the type philosophy section, I mentioned how non-opaque struct fields are publically accessible across the FFI boundary. What happens when we copy a borrow from a struct field?
#[diplomat::bridge]
mod ffi {
#[diplomat::opaque]
pub struct Opaque {
// ...
}
pub struct Input<'a> {
data: &'a Opaque,
}
impl<'a> Input<'a> {
pub fn extract(self) -> &'a Opaque {
let out = self.data;
out
}
}
}
When rustc sees Input::extract
, it requires that self.data ≤ 'a ≥ out
, and
it’s able to statically enforce this because rustc is omniscient on pure Rust code.
Diplomat, on the other hand, cannot perform static analysis since it cannot peek
into the other side of the code.
With the lifetime design discussed so far, Diplomat would require
'a ≥ self.data ≥ self ≥ out
, since the current idea doesn’t detect which
field is being borrowed from. This is an overly conservative approach to
enforcing 'a ≥ out
.
Being overly conservative can prevent some painfully obvious programs from being accepted. In C++ for example, Diplomat would generate documentation saying that you that you could have a dangling reference if some wrapper struct goes out of scope, even it’s clearly not true:
const Opaque& do_nothing(const Opaque& data) {
// Put `data` into the struct
Input input;
input.data = data;
// Extract `data` from the struct
const Opaque& out = Input::extract(input);
// STOP! Diplomat says you'll have a dangling ref
// if `input` goes out of scope before `out` does.
return out;
}
This code is completely safe since it just returns the reference that was
originally passed in, but Diplomat will still generate documentation saying that
this should leave a dangling pointer since Input::extract
thinks that self
must outlive out
. Equivalent generated JavaScript bindings would just hold a
reference and chug along normally, consuming slightly more memory than usual.
Ideally, we want to be able to know that the output of Input::extract
isn’t
bound to self
, but instead bound to whatever self.data
is bound to.
This would eliminate the requirement of having self
outlive out
, and
making Diplomat only have to enforce 'a ≥ self.data ≥ out
.
This is a lot easier said than done though. For starters, checking for lifetime relationships now involves recursing down non-opaque structs and their fields, and if we encounter a borrowed string, slice, or opaque, we have to remember not only which lifetime it corresponds to, but also where in the type tree it appeared. Take the following example:
#[diplomat::bridge]
mod ffi {
#[diplomat::opaque]
pub struct Opaque {
// ...
}
pub struct Second<'a> {
data: &'a Opaque,
}
pub struct First<'a> {
second: Second<'a>,
}
impl<'a> First<'a> {
fn get_data(first: First<'a>) -> &'a Opaque {
first.second.data
}
}
}
Remember that Diplomat only looks at function signatures, so we can’t just look
at the first.second.data
staring the user in the face. What happens instead
is that we see that 'a
is in the output type, &'a Opaque
. Then, we traverse
the inputs and see first: First<'a>
. We see that it’s a non-opaque struct,
so we look into its fields and find second: Second<'a>
. That’s also a
non-opaque struct, so we look into its fields and see data: &'a Opaque
.
Since Opaque
is opaque, Diplomat isn’t concerned with its fields, so we’re
done recursing. Once we reach this point, we then have to remember that
the output borrows from first.second.data
. In the code generation, we can
then reflect this information, making the C++ example above not generate
overly conservative documentation.
This is only half the problem, though. What about non-opaque borrowing structs in the output?
#[diplomat::bridge]
mod ffi {
#[diplomat::opaque]
pub struct Opaque {
// ...
}
pub struct Input<'a> {
data: &'a Opaque,
}
pub struct Output<'a> {
data: &'a Opaque,
}
impl<'a> Input<'a> {
fn get_data(self) -> Output<'a> {
let out = Output { data: self.data }
out
}
}
}
This follows similarly to the previous example, except we observe that
out.data
has to live 'a
, and that self.data
is guaranteed to live 'a
,
so we can just enforce that self.data
has to outlive out.data
to uphold
the invariant that out.data
lives at least 'a
.
Challenge 3: Naming Lifetimes
Tracking the “outlives” relationship on a field-by-field basis is great, but it introduces another challenge: lifetimes can go by different names in different declarations:
#[diplomat::bridge]
mod ffi {
#[diplomat::opaque]
pub struct Opaque {
// ...
}
pub struct Input<'i> {
data: &'i Opaque,
}
pub struct Output<'o> {
data: &'o Opaque,
}
impl<'a> Input<'a> {
fn get_data(self) -> Output<'a> {
let out = Output { data: self.data }
out
}
}
}
The lifetimes in the declarations of Input
and Output
have completely
different names. We can’t just track 'a
all the way down since 'a
is being
mapped to other lifetimes. Instead, we have to track lifetimes by the index they
appear in generic arguments, and use that to determine which lifetime is which.
This is similar to what rustc does.
The Big Picture
Let’s put this all together and walk through an example of a method with lifetime bounds, non-opaque structs in the input and output positions, and renamed lifetimes:
#[diplomat::bridge]
mod ffi {
#[diplomat::opaque]
pub struct Opaque {
// ...
}
pub struct Input<'i> {
data: &'i Opaque,
}
pub struct Output<'o> {
data: &'o Opaque,
}
impl<'a> Input<'a> {
fn get_data<'b>(self) -> Output<'b>
where
'a: 'b,
{
let out = Output { data: self.data }
out
}
}
}
The first thing that Diplomat does when it encounters Input::get_data
is
examine the return type signature, Output<'b>
. Since it’s a non-opaque struct,
we look in the declaration of Output
and store a mapping
saying that the position of 'o
in the context of Output
’s declaration
corresponds with the position of 'b
in Output<'b>
in the method. Next, we
see the data: &'o Opaque
field, and use the mapping to determine that it’s
actually &'b Opaque
in the context of the method. We then remark that
out.data
relies on 'b
, see that there’s no more fields to explore in the
field traversal, and finish.
Once we know which lifetimes to look out for (just 'b
), we traverse the types
of the inputs, which consists of just Input<'a>
. Since Input
is a non-opaque
struct, we look in the declaration of Input
and store a mapping saying that the
position of 'i
in the context of Input
’s declaration corresponds with the
position of 'a
in Input<'a>
in the method parameter. Inside of Input
s
declaration we see the data: &'i Opaque
field, so we map the lifetime 'i
back to 'a
. At this point, we have to consider that 'a
may coerce into
other lifetimes, so we perform a depth-first search in the lifetime graph to
find all shorter lifetimes. This search results in finding 'b
. After confirming
that 'b
appears in the output, we remark that self.data
is one of the input
fields that live at least 'b
.
In code generation, we put these pieces of information together to determine
that out.data
must be outlived by self.data
. With this information, for
example, JavaScript would handle this by making out.data
hold an reference to
self.data
to ensure that values are cleaned up in the right order, and C++
would have generated documentation specifying which the lifetime invariant that
needs to be upheld.
Results
That’s a lot of details, and unfortunately it’s not 100% integrated yet. Thus far, the lifetime coercion detection using depth-first search is entire implemented, but tracking borrows on a field-by-field basis is only implemented in the JavaScript backend. Aside from not understanding the full scope of the problem before starting, the biggest challenge was that Diplomat’s abstract representation was not designed with lifetimes in mind, and also wasn’t conducive to writing robust code.
For example, traversing down structs recursively and maintaining a mapping from
lifetimes in a declaration to lifetimes in the method was bolted on at the end
after I’d already written a significant amount of code. Additionally, Diplomat’s
model allows for a bunch of illegal states like an opaque struct passed by value.
Obviously there are validity checks, but the loose type system eventually pollutes
backends with .unwrap()
and unreachable!()
statements. On top of this, the
existing JavaScript code generation was really challenging to understand and
work with, and I ended up spending a week rewriting it to be more modular.
Each of these challenges made me consider how I would have done things differently to make everything easier given the opportunity to. This lead to me designing the high-level intermediate representation (HIR), which will eventually be the interface that language backends interface with to generate bindings.
Next Steps: HIR
The HIR is an ongoing endeavor that attempts improve on the suboptimal design decisions of the past by making everything more coherent. I designed and implemented it over the course of my internship from the ground up around three core ideas:
- Make invalid states unrepresentable to reduce
.unwrap()
s. - Make types more convenient for writing backends by breaking apart large
enum
s. - Make lifetime tracking a first-class feature.
It will make all the challenges I listed above non-issues for folks implementing Diplomat backends, along with many other quality-of-life changes by serving as an interface between Diplomat’s naive abstract representation and the language backends.
At this point, the type system is well defined and implemented, and the code to convert Diplomat’s current abstract representation into the HIR is implemented but not thoroughly tested. But the roadmap for full integration is there, and there’s continuous ongoing work to make it the standard in Diplomat’s model.
Thank you
A special thank you to Manish and Shane for being kind, patient, encouraging, and overall amazing hosts. I remember coming across Jane’s blog post before the start of my internship, and now I can 100% confirm that having Manish as a mentor is as amazing as Jane makes it out to be.
-
Unidirectional FFI is when one language can invoke APIs from another language, but not the other way around. This is most common when exposing a library to another language. ↩
-
XOR aliasing is also solved with Rust’s lifetime system, and this is being worked on. ↩