Kanidm[6]

July 5

Took the day off because national holiday :)

July 6

Tuesday TODO list:

Continue working on tracing-memberof.

Tuesday started off with confusion: why were my spans not closing? Could they only be closed if they have no parents? Do parent spans have a way of referencing their child spans? I decided to delegate the issue to future Quinn, and instead focus on getting better formatting. I started by looking at tracing_subscribers fmt::Layer implementation of the layer::Layer trait.

Reading the source code provided a lot of insight, and I recognized that I could just wrap a struct that implements FormatFields in my custom layer type. This would essentially allow me to take advantage of all the built-in formatting functionality for free without having to implement it myself.

Another thing I was unsure about what whether it would better to store a vector of Strings where each represents a log, or to just store one massive String that continuously gets appended to.

However, I quickly ran into an issue with this implementation. The FormatFields trait uses a format_fields method that requires a reference to a dyn fmt::Write, with the same lifetime as the entire layer. This was problematic because each span has it’s own buffer of logs with a lifetime much shorter than that of the layer, since it needs to store them until the end before dumping them all at once.

Option one: make each span have their own FormatFields object, and this would be great if all types were ZST’s! However, the Pretty type wraps a stupid ansi bool flag, which I will never need. Would this get optimized away? Who knows, but I wish they were all ZST’s so I wouldn’t feel bad about this dumb issue.

Option two: extend the lifetime of the fmt::Write object. This meant making all the buffers be contained in the fmt::Layer. This poses the issue of interior mutability though, which was something I really wanted to avoid.

Option three: store a flag of which formatter I needed, and just create it each time I had to write. This seemed completely ridiculus though.

The most extreme option was to look into the source code for how the formatters work, and just use that.

Update: I looked into the source code and it was way too complicated.

Another option I thought of would be to make my own PrettyFields-like type, that essentially does pretty formatting, except is a ZST. This meant that I could still utilize the FormatFields trait, as well as the JsonFields implementation which seemed very complicated. Even worst-cast scenario where I gave up on making my own type, I could use the existing PrettyFields type and have something that at least works.

I eventually decided to create a new FormatFields in each span, so that the lifetime of the buffer stored for that span was as long as the FormatFields. All this could have been avoided if the format_fields method didn’t require the writer to live AS LONG as the FormatFields type itself. Oh well.

Spent the day getting destroyed by Rust trait bounds. Hooray.

July 7

Wednesday TODO list:

Transfer tracing research into new project.
Get JSON formatting working.
Implement pretty formatter as a ZST.

Started off the work day by creating a new Rust project on my local computer, adding all the dependencies, and copying over the majority of my code. I felt silly for not having done this earlier, since now rust-analyzer and cargo worked significantly faster!

I decided to look into implementing my own formatting, since I realized that getting JSON formatting is incredibly easy with serde_json, and pretty printing was just a matter of using format!. I was able to get both formats up and running pretty easily by creating a custom Log type that contained basic information regarding a log, and having it implement Serialize.

July 8

Thursday TODO list:

Continue working on custom formatting implementation.

I started off with the realization that there was very little point to bend over backwards to tracing_subscribers preexisting formatting techniques, and decided to implement my own so that I could get full control. I also somehow got an invalid JSON string from a JSON log using its built-in formatting (It was missing a closing curly brace somewhere), which confirmed my decision.

Here’s the design I settled on:

The subscriber has a format for logging, either JSON or Pretty. This is stored as an enum at a high level, allowing it to be configured at runtime.
When the subscriber receives an event, it uses methods of the formatter to format it to a dyn fmt::Write object, which will always be the spans log buffer. I also realized while writing this that it doesn’t need to erase the type, since it will always be the same.

I initially felt bad that we had to check which formatting to use at runtime, but digging through the source code for tracing_subscriber resolved this pain. There is so much overhead going on it’s crazy. There’s a hash map created for each log’s fields, multiple writes to the writer, and so on. Mine was nowhere near as costly.

I also made the formatting much more restrictive for the purpose of escaping from all the constraints that tracing_subscribers super generalized interface enforces. This came in handy when I had to interact with Event objects, since I had to create a Visit type to read the fields which wasn’t so bad because I was able to have fields written directly to the fmt::Write object I was writing to, whereas serde required me to make &strs first which meant allocating and then borrowing a String.

I decided to call it quits after having been at my computer for almost 5 hours straight. Goodbye.

July 9

Friday TODO list:

Figure out how to get the default message in a log instead of using message = "stuff".
Figure out how to send Events within the subscriber hooks for sending open and close events.
Get timings to work
Make custom macros or figure out how to avoid needing them.

Started off the day by reading some documentation. The first thing I wanted to achieve was being able to read the default message that the user entered into an event macro, like error!("the message"), instead of my existing solution, which was to have to write it as error!(message = "the message"). After some testing, I discovered that the field-value pair of the message was accessed through the record_debug method in tracing::field::Visit, instead of record_str. I wasn’t entirely sure why, but this fixed the problem, meaning I no longer had to create wrapper macros to insert this behavior for me. Great!

Next, I wanted to be able to send Events from within the subscriber hooks. This is used by tracing_subscriber in it’s FmtSubscriber implementation for when it logs when a span enters/exits, to make the traces look like more natural user-defined logs. I looked into the source code for how the FmtSubscriber accomplished this, and it turned out that it used a handy macro to do it.

I was happy to find the solution so quickly, but I decided to hold off on implementing it because I wanted to save easy tasks for when I wasn’t feeling as motivated to problem solve. I got side tracked from my TODO list while reading through some of William’s suggestions, and decided to conquer the issue of how to idiomatically exit spans during await calls.

As the docs describe, the reason you have to exit spans before await calls is because the stack frame can exit without dropping the span guard over an await call, meaning the span can stay enabled even as other code is running. This produces faulty logs. Luckily, I found that tracing comes with a really amazing proc-macro called instrument that decorates functions to automatically have all their content wrapped in a span, and have that span strategically enter before await calls and reenter right afterwards. Amazing! I’ve been writing some simple proc-macros for my own projects, so it was really cool so see such a creative use of one in such a big library.

After this discovery, I took William’s advice and started to write some async test functions for my custom Subscriber. The primary test was simple: create two async functions with spans, and have them run at the same time. The default tracing_subscriber implementation should print the logs from each function jumbled together, but in chronological order. My implementation should print the logs from each function clumped together to keep related events in one place, but not quite in chronological order.

Unfortunately, I found that my Subscriber behaved the same way as the default. I quickly found the source of this issue: I had my closing behavior in the exit hook, meaning that logs would print whenever the span was exited instead of when it was finished. I still hadn’t figured out why the closing hook was never called, meaning that I was once again backstabbed by past Quinn’s laziness. A tragedy. In retrospect, I think the issue is that somewhere in the Layered implementation, the on_close method is deliberately ignored, and I need to look into that.

Weekly Reflections

When I started these daily entries, my motivation was to show that I was putting time and effort into the project, since I felt like my GitHub/coding contributions couldn’t speak for themselves. But this week I’m starting to overcome that fear- the reason I wrote so little for Wednesday is because I was so occupied researching and coding that I completely forgot to pause and journal! I’m thrilled to have a (part time) job that I get so much enjoyment from. I’ve also been talking to a few of my peers doing software internships, and it made me realize how amazing GSoC is. I heard one friend didn’t get to write code for the first few weeks, and even now only gets to work with Java (sad). On the other hand, I’ve been writing code since day one, in Rust (amazing!). I’m also happy to not have to represent some large corporation, but instead work with a couple of chill people who are doing the project out of enjoyment. This has definitely made me want to contribute to more open source in the future!

Written on July 10, 2021