Kanidm[5]
June 27
Monday TODO list:
The past weekend, William gave me some feedback on my PRs, including suggesting rebasing tracing-logging
#467 to master
. I wasn’t quite ready to have my changes merged yet, but I’d never done a rebase before and decided that it wouldn’t hurt. I ended up spending about an hour trying to figure it out, but even when I resolved all the conflicts, the files still weren’t correct because some lines weren’t getting updates, causing me to get the red squigglies. I didn’t want to waste more time, and since it wasn’t a top priority, I ended up aborting the resolved conflicts and leaving a comment on the PR.
In my tracing-memberof
PR, William suggested that I implement my own Subscriber
. However, I ran into a big problem: there were a lot of methods that called for updating the internal state, but accepted self
immutably. What the heck? It mentioned passing Visitor
objects that can collect information, but that didn’t change the fact the self
was immutable. It was also really frustrating because the documentation literally said to “update any internal state”, suggesting that there’s not some other way that I was missing.
June 28
Tuesday TODO list:
- Review GitHub comments.
- Continue thinking about implementing my own
Subscriber
.
An idea that I had before I started work was that maybe Subscriber
s always take themselves immutably because they are designed to be concurrent, and therefore are intended to use some alternate method of synchronizing logs. The easiest way to do this is with an Arc<Mutex<_>>
, but I figured there had to be a better way.
I started off by reviewing William’s latest PR, Add statistical analysis to indexes #505. It took me 2.5 hours to review, and I thought it was really cool how we used statistics to create diagnostics that helped us optimize the code. Maybe double majoring with statistics will be useful! I’m still very impressed with the amount of code that he’s able to produce. It seems like he’s pumping out massive PRs every two weeks or so.
After code review, I continued brainstorming for how to implement my own Subscriber
. I initially wanted to create a custom Layer
objects that wrapped the existing FmtSubscriber
implementation. This would provide the option to have the default Subscriber
just shoot everything to stderr
or a file normally, and have filters choose when and how to send information to the wrapped Subscriber
. I wasn’t entirely sure if this was how it worked, but the documentation was very confusing and there wasn’t any big-picture diagrams to explain anything. I ended up pretty frustrated, but I felt like I was making some progress at least.
Ugh. I tried for another hour to figure things out. Everything in tracing_subscriber
is so convoluted. Everything is wrapped in 20 layers of traits and wrapper structs. There are nested on nested on nested generics. There’s a Layer
struct, a Layer
trait, and another Layered
struct. Some of them implemented tracing::subscriber::Subscriber
, some of them tracing_subscriber::Subscribe
. Who the hell designed this? And worse, who decided “Ah yes, lets publicize this and not give any diagrams for how anything works”? It was impossible to tell if I was doing anything right, and worse, they make no effort to describe their thought process. I was having doubts about trying to tackle something this big. I wasn’t even sure if Kanidm would benefit that much from using tracing
. All very frustrating.
June 30
Wednesday TODO list:
- Review responses to my questions on Williams PR #505.
- Find something else to work on to take a break from
tracing
.
I reviewed William’s responses to the slopeyness magic, and I think I really have an understanding for what it’s used for now. Essentially, outsiders can send Kanidm filters to query for entries in the db. Sometimes these filters have a bunch of parameters, and we find the results of the query by first finding all the entries that match the first parameter, and then looking through those for the ones that match the second parameter, and so on. The issue is that if we start with a very broad parameter, the number of entries stays large initially, making the filter much slower. However, if we sort the parameters (having no impact on the final result because it has to do them all anyways), we can organize them in a way such that the most narrow filters are applied first, therefore cutting down the amount of entries we have to search through for more broad filters.
I decided to say “screw it” and continue working on tracing
afterwords, but with a much more loose grip on what I hoped to accomplish. William pointed out that I’m going into unknown territory and therefore shouldn’t feel bad about not making hard progress each day, as long as I was learning and progressing towards deciding whether or not tracing
would be a good idea.
With this in mind, I did a little more digging through the docs and it looked like I needed to use a Registry
. The source code showed fields containing what looked like threading behavior and a RefCell
, which definitely seemed like the right direction for storing span
information.
After some more research, I found that I could also store the logs contained in a span inside of the Registry
using extensions
. It said it can store user-defined data, so I was hoping I could have it store some sort of Vec
of logs, but I still had some more research to do on that.
The main point was that all the interior mutability and span
storage behavior seemed to already exist, meaning my job was significantly easier. All I had to do was figure out how to piece everything together. My main idea at the time was to have Vec
of logs in some form (String
? Some other type?) as an extension of each span
in the Registry
, and then when try_close
was called when a span
ended, all the logs would get appended to that span
s parent, or if it had none, to stderr
or some preconfigured file.
This would do allow us to do a few things:
- All activity within a
span
would get logged at the the point in time at which it closed. - Things would work with
async
, unlikeAuditScope
.
Something annoying that I kept running into is that the links in the docs are consistently broken. This was particularly annoying when it mentioned something I thought was going to be useful, but then end up having to search for it through the search bar.
I decided to call it a day there because my brain was fried. However, I finally felt like I was making it over the roadblock I’d been stuck on. I just had to understand how to piece all the provided parts together.
July 1
Thursday TODO list:
- Respond to PR comments.
- Research
Registry
type andLayer
trait more, work on basic implementation.
Thursday started off by reviewing William’s clarifying responses on my comments on #505. Although I was getting a stronger understanding, there were definitely aspects that I needed to revisit later with fresh eyes. I generally had a good understanding of what the slopyness factor does, but I was still shakey on how filtering as a whole works.
Ugh. I was still struggling to figure out how anything pieced together. Ultimately, I needed to end up with a Subscriber
. I figured the way to do this was by using a Layered
object composed of a type that implemented Layer
(which wraps a Subscriber
), and Subscriber
, although I wasn’t sure why it had two Subscriber
s.
More docs digging revealed that Registry
s are meant to get wrapped in Layer
s. I still didn’t know what the other wrapped Subscriber
was used for. Generating Id
s?
I found that a Registry
s extensions are implemented in a very cool way. It stores a mapping where the key is an actual type, which I would guess make lookups instant because they’re done at compile time? Not sure, don’t quote me on it.
By the end of the day, I was able to create a basic type containing a Layered
struct, which wrapped my own custom Layer
wrapping a Registry
. Pushed it to GitHub and hoping to get feedback tomorrow!
July 2
Friday TODO list:
- Get formatting and timestamps working on
KaniSubscriber
. - Add custom macros for
KaniSubscriber
that wraptracing
macros.
Friday began with the usual docs digging. I was working on getting timing to work by having a custom timed
field. This was going to get wrapped in a macro:
// `info_timespan` is custom macro defined within Kanidm...
info_timespan!("Performance segment");
// ... which desugars to
tracing::info_span!("Performance segment", timed = tracing::field::Empty);
Essentially, the timed
field will be a flag to tell the subscriber whether or not to time the duration of the span. I also decided to have it be a “present or not present” flag and not a boolean flag because a) less memory I guess (?), and b) finding if a field exists is somehow easier than finding out the associated value, and I was tired of reading the docs.
I was also annoyed that fields had to have values (or tracing::field::Empty
), since it meant everything was more verbose. I spent a good 15 minutes before I figured this out from the docs.
Another really frustrating issue I came across was that the on_close
and on_exit
methods were called super ambiguously. on_exit
is supposed to be called when the span
is exited, which can happen multiple times. on_close
is supposed to be called when the span
is dropped. However, I found that on_close
was never being called (I even unconditionally panic
d inside and… no panic), so I put the closing behavior in on_exit
. However, this caused the issue of allowing logs to be written before the span
is done producing, which I was trying to avoid. Definitely an area for further research.
I dug through the source code for how FmtSubscriber
handled this, and it seemed as though the trait interface behaved exactly how I expected it should, so I really didn’t understand why my code wasn’t consistent with that.
On the other hand, I was able to get basic timing working, which I accomplished by loosely copying the source code for FmtSubscriber
.
Weekly Reflections
Even though I didn’t produce the most lines of code this week, I certainly believe my Rust abilities improved immensely. I think I was already fairly competent writing Rust code by myself, and was already improving at reading others’ code through code reviews. However, I took a big leap forward this week by combining the two, and forcing myself to integrate my own code with the internals of existing libraries.
I think forcing yourself to see how others think in code is incredibly valuable, especially when you disagree, but also know their code is used and trusted by millions of developers. The whole process reminded me of when I attempted to first learn Rust, and could barely get anything past the compiler. I constantly thought “This can’t be right. Whoever designed this is an idiot!” But slowly, I began to understand and gain an immense appreciation for the genius design choices taken.
I’ve attempted to interface with the internals of serde
before and ultimately gave up. My experience reading the docs for tracing
felt similarly frustraing, so I’m really happy I powered through and am making good progress now. Next week, I hope to get a fully functioning prototype I can start implementing in more areas of the code base!