Lab 4: Readers and Writers

Due March 14, 11:59pm

Now that we've talked a bit about traits and generics, it's time to apply them to what we've been working on! We will do this in two parts.

First, we will define the FrameReader type, which will be a special type that combines the Frame::check and Frame::decode methods we previously wrote, a buffer, and a type that implements Read. This will ultimately help us abstract over the process of getting bytes from the reader and decoding them into Frame values.

Then, we'll define the WriteFrame trait and create a blanket implementation for it for all Write types, allowing us to write Frame values to any type that can have bytes written to it.

Grading Rubric

  • Code is formatted properly using cargo fmt: 5%
  • Code passes cargo clippy without warnings: 25%
  • Code passes our tests: 70%

Learning Objectives

  • Practice using generic types with trait bounds.
  • Practice implementing traits.

Defining the FrameReader type

The FrameReader type that we are going to define will abstract over the difficulties of getting Frame values from some source of bytes, and will provide one method (besides the constructor): read_frame.

When we wrote the Frame::check and Frame::decode methods, we relied on the caller providing them with a &mut Cursor<&[u8]> as input. This means that somewhere, we need to store a contiguous buffer of bytes (to get a &[u8]), and this buffer will be owned by the FrameReader type. Then, we need to be able to:

  • fill up that buffer with as much data as possible from some data source,
  • create a &mut Cursor<&[u8]> that views into the buffer,
  • call Frame::check (and then Frame::decode on success),
  • and finally, return the Frame if everything else was successful.

We'll put these steps into a function called FrameReader::read_frame, which will provide us with a nice abstraction for future labs.

1. The Read trait

But before we can do this, let's rewind for a second: where are these bytes coming from? Eventually, we'll want to be reading bytes from a std::net::TcpStream, but we'll also want to test our code on simpler types that are capable of being read from, like a byte slice (&[u8]). Fortunately, the standard library provides the std::io::Read trait for us.

Before we dive too deep, remember: a trait (like Read) isn't a thing you can pass around in your program. The only things that exist at runtime are structs, enums, primitive types, and functions. When we say that a function takes a Read value, or a struct contains a Read field, we're saying that it takes some concrete type T (struct, enum, primitive, sometimes even function) that implements the Read trait. This is why generics and traits are so tightly coupled in Rust: generics and trait bounds allow us to represent much of the same logic that superclasses in object-oriented languages can.

Types that implement Read must implement its Read::read method, which allows us to pass it a buffer that it can write to (&mut [u8]), and returns the number of bytes written or an error. This means that for all types T: Read, we can pass them a &mut [u8] and it will fill it with bytes in whatever way makes sense for that type.
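To make the idea of "any type T: Read" concrete, here is a small sketch. The function name first_byte is made up for this illustration; the only real API it relies on is Read::read.

```rust
use std::io::{self, Read};

/// Hypothetical helper: returns the first byte the reader produces,
/// or `None` if the reader has nothing left to give.
fn first_byte<R: Read>(mut reader: R) -> io::Result<Option<u8>> {
    // We own the buffer; the reader only fills it in.
    let mut byte = [0u8; 1];

    // `read` reports how many bytes it wrote into our buffer.
    let n = reader.read(&mut byte)?;

    // Zero bytes written (into a non-empty buffer) means the source is exhausted.
    Ok(if n == 1 { Some(byte[0]) } else { None })
}
```

The same function accepts a &[u8], a std::io::Cursor<Vec<u8>>, or std::io::empty(), because each of those concrete types implements Read. The compiler generates a separate copy of first_byte for each type it is called with.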

Here's an example from the book, demonstrating how std::fs::File implements Read:

use std::io;
use std::io::prelude::*;
use std::fs::File;

fn main() -> io::Result<()> {
    let mut f = File::open("foo.txt")?;
    let mut buffer = [0; 10];

    // read up to 10 bytes
    let n = f.read(&mut buffer[..])?;

    println!("The bytes: {:?}", &buffer[..n]);
    Ok(())
}

This is very handy, but careful inspection of the example shows some lack of flexibility: they had to manage their own buffer, the array of 10 bytes. Furthermore, it's not very ergonomic to use, since the return value is the number of bytes overwritten, which then has to be taken into account. We do not want to have to manage this all manually.
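To see how quickly the bookkeeping adds up, here is a sketch of reading everything from a reader by hand. The function name read_everything and the 4-byte chunk size are arbitrary choices for this illustration, and for brevity it does not retry on io::ErrorKind::Interrupted the way production code usually would.

```rust
use std::io::{self, Read};

/// Hypothetical helper: drains a reader using a small fixed-size chunk.
fn read_everything<R: Read>(mut reader: R) -> io::Result<Vec<u8>> {
    let mut collected = Vec::new();
    let mut chunk = [0u8; 4];

    loop {
        // Each call may fill only part of `chunk`, so we must track `n`.
        let n = reader.read(&mut chunk)?;
        if n == 0 {
            // Nothing was written: the reader is exhausted.
            break;
        }
        // Only the first `n` bytes are fresh; the rest is stale data.
        collected.extend_from_slice(&chunk[..n]);
    }

    Ok(collected)
}
```

Forgetting to slice with ..n is an easy mistake here, and it silently mixes old bytes into the result. That is exactly the kind of detail we want a buffer type to handle for us.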

Normally, we'd use a type like std::io::BufReader, which is provided by the standard library. It works by wrapping a Read value and implementing Read itself, and uses an internal buffer to make small, frequent reads much more efficient in some cases. However, it's not suitable for our case: although we can get access to the internal buffer as a &[u8], there's no public API for requesting more data unless the buffer is empty. For this reason, we will be creating our own FrameReader type that is very similar to BufReader.
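The limitation can be observed directly. In this sketch, Trickle is a made-up reader that hands out a few bytes per call, imitating a socket that delivers a frame in pieces; BufReader, BufRead::fill_buf, and Read are the real standard library items.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Hypothetical reader that yields at most `step` bytes per `read` call.
struct Trickle<'a> {
    data: &'a [u8],
    step: usize,
}

impl Read for Trickle<'_> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        let n = self.step.min(self.data.len()).min(out.len());
        out[..n].copy_from_slice(&self.data[..n]);
        self.data = &self.data[n..];
        Ok(n)
    }
}

/// Calls `fill_buf` twice without consuming anything and reports
/// how many buffered bytes were visible each time.
fn fill_twice(input: &[u8], step: usize) -> io::Result<(usize, usize)> {
    let mut reader = BufReader::new(Trickle { data: input, step });
    let first = reader.fill_buf()?.len();
    // The buffer is not empty, so BufReader does not ask the inner reader for more.
    let second = reader.fill_buf()?.len();
    Ok((first, second))
}
```

With an 8-byte input and a step of 3, both calls see the same 3 bytes. If those 3 bytes are only part of a frame, we have no way to ask BufReader to append the rest, which is why we manage our own buffer in this lab.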

2. The ReadBuf type

For this lab, we're providing the ReadBuf type which will act as the buffer that we're storing the bytes in.

use std::io;
use std::io::prelude::*;
use std::fs::File;
use readbuf::ReadBuf; // <- new!

fn main() -> io::Result<()> {
    let mut f = File::open("foo.txt")?;
    let mut buffer = ReadBuf::with_capacity(1024);

    // read up to 1024 bytes
    buffer.read(&mut f)?;

    // ReadBuf internally tracks how many bytes are read
    // and will give you exactly what was read
    println!("The bytes: {:?}", buffer.buf());
    Ok(())
}

To use it, you'll first have to add our mini library as a dependency. Open your Cargo.toml file and add the following line under [dependencies]:

readbuf = { git = "https://github.com/QnnOkabayashi/rust-course-utils" }

Overall, your Cargo.toml file should look like this:

Filename: Cargo.toml

[package]
name = "server"
version = "0.1.0"
edition = "2021"

[dependencies]
cursor = { git = "https://github.com/QnnOkabayashi/rust-course-utils" }
readbuf = { git = "https://github.com/QnnOkabayashi/rust-course-utils" }

Then, navigate to src/rw.rs (rw stands for read-write) and add the following:

Filename: src/rw.rs

use readbuf::ReadBuf;

To learn how to use ReadBuf, enter the following command:

cargo doc --open

This will generate documentation for your Rust project in the form of HTML, JS, and CSS, and then open it as a web page in your default browser. Under the "Crates" section in the sidebar on the left, click on "readbuf", and then on the "ReadBuf" struct to find its documentation and see how each method works.
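The generated documentation is the authority on ReadBuf. Still, it can help to picture what a buffer like this does internally. The following MiniBuf is a made-up stand-in written for this handout, not the real implementation; its method names mirror the ones used in this lab, but the signatures and behavior here are assumptions.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for a read buffer. Not the real `ReadBuf`.
struct MiniBuf {
    data: Vec<u8>,
    filled: usize, // how many bytes at the front of `data` are valid
}

impl MiniBuf {
    fn with_capacity(capacity: usize) -> Self {
        MiniBuf { data: vec![0; capacity], filled: 0 }
    }

    /// Appends bytes from `reader` after the bytes we already hold.
    fn read<R: Read>(&mut self, reader: &mut R) -> io::Result<usize> {
        let n = reader.read(&mut self.data[self.filled..])?;
        self.filled += n;
        Ok(n)
    }

    /// Exactly the bytes that have been read and not yet consumed.
    fn buf(&self) -> &[u8] {
        &self.data[..self.filled]
    }

    /// Discards the first `n` bytes by shifting the remainder to the front.
    fn consume(&mut self, n: usize) {
        let n = n.min(self.filled);
        self.data.copy_within(n..self.filled, 0);
        self.filled -= n;
    }
}
```

The key difference from the BufReader approach is that read appends to whatever is already buffered, so a partially received frame can be completed by a later read.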

3. Putting it all together

Putting these together, define the FrameReader type as a public struct with two fields:

  1. A reader of type T, where T: Read.
  2. A ReadBuf value that will act as our buffer.

Hint: it may help to look at the implementation of the BufReader type, which can be found by following that link and clicking the orange source button on the right. Your struct definition should look nearly identical.

The ReadError type

Before we write methods for the FrameReader type, let's define yet another error type.

Woohoo!

To do this, we'll consider what can possibly fail. Firstly, we know that we'll be using our Frame::check and Frame::decode methods from last week, and those can both error with ParseError, so we'll need a variant for that. Secondly, we'll also be using the ReadBuf type, whose .read(...) method can error with std::io::Error.

If we have use std::io; at the top, then we can just refer to it as io::Error instead of the fully qualified path name, std::io::Error.

Therefore, we'll need an error type that can represent either of these two cases, which we can define as the following in our src/error.rs file:

Filename: src/error.rs

use cursor::CursorError;
use std::{io, str::Utf8Error};

// ParseError implementation details omitted...

#[derive(Debug)]
pub enum ReadError {
    Parse(ParseError),
    Io(io::Error),
}

The Parse variant is very straightforward: it represents any errors that could occur when calling Frame::check or Frame::decode.

On the other hand, the Io variant stores a std::io::Error, which is actually doing a lot of heavy lifting here. Each io::Error carries a std::io::ErrorKind, which is an enum with dozens of variants (the exact number depends on the Rust version).

For example, in the case that ReadBuf::read tries to read and there aren't any more bytes to read, it will return an io::Error with an io::ErrorKind::UnexpectedEof internally, because running out of bytes is an error. Or later on, when we try to perform non-blocking reads, it could return an error with io::ErrorKind::WouldBlock. All of these scenarios are encapsulated in the ReadError::Io variant.
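If you ever need to tell these cases apart, io::Error::kind gives you the ErrorKind to match on. Here is a small sketch; describe and eof_kind are made-up helper names, and the example uses the standard library's Read::read_exact (rather than ReadBuf) to produce an error.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical helper: turns an I/O error into a short description.
fn describe(err: &io::Error) -> &'static str {
    match err.kind() {
        ErrorKind::UnexpectedEof => "ran out of bytes",
        ErrorKind::WouldBlock => "no data ready yet",
        // ErrorKind is non-exhaustive, so a catch-all arm is required.
        _ => "some other I/O error",
    }
}

/// Asks a 2-byte source for 4 bytes and returns the resulting error kind.
fn eof_kind() -> ErrorKind {
    let mut source: &[u8] = b"ab";
    let mut four = [0u8; 4];
    source.read_exact(&mut four).unwrap_err().kind()
}
```

For this lab you do not need to match on the kind at all; storing the io::Error inside ReadError::Io keeps that information available for later.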

Your task here is to paste this ReadError type definition into your src/error.rs file, and add the proper impl From blocks for both Parse and Io variants. Here's the one for Parse to get you started:

impl From<ParseError> for ReadError {
    fn from(err: ParseError) -> Self {
        ReadError::Parse(err)
    }
}
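The reason these From implementations matter is the ? operator: when an expression fails, ? converts the error into the function's error type by calling From::from. The sketch below shows this with a made-up ConfigError type and read_number function, unrelated to the lab's types, so you can see two different error types flowing into one.

```rust
use std::io::{self, Read};
use std::num::ParseIntError;

// Hypothetical error type for this illustration only.
#[derive(Debug)]
enum ConfigError {
    Io(io::Error),
    Number(ParseIntError),
}

impl From<io::Error> for ConfigError {
    fn from(err: io::Error) -> Self {
        ConfigError::Io(err)
    }
}

impl From<ParseIntError> for ConfigError {
    fn from(err: ParseIntError) -> Self {
        ConfigError::Number(err)
    }
}

/// Reads all text from `reader` and parses it as an integer.
fn read_number<R: Read>(mut reader: R) -> Result<i64, ConfigError> {
    let mut text = String::new();
    // On failure, `?` calls From<io::Error> to build a ConfigError.
    reader.read_to_string(&mut text)?;
    // On failure, `?` calls From<ParseIntError> instead.
    let number = text.trim().parse::<i64>()?;
    Ok(number)
}
```

Without the From implementations, each ? would need an explicit .map_err(...) call.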

Defining methods for FrameReader

Now that we have our ReadError type sorted out, let's write some methods for FrameReader. There are two methods we want: new and read_frame.

use std::io::Read;

impl<R: Read> FrameReader<R> {
    pub fn new(reader: R) -> FrameReader<R> {
        todo!("implement FrameReader::new")
    }

    pub fn read_frame(&mut self) -> Result<Frame, ReadError> {
        todo!("implement FrameReader::read_frame")
    }
}

The new method should be straightforward: using the provided reader, create a FrameReader using it and a new ReadBuf, which you can get using ReadBuf::new().

The read_frame method is slightly more complicated, but has a very linear control flow. At a high level, it starts by reading in bytes to the FrameReader's internal buffer (a ReadBuf), then checking the contents of the buffer, then decoding from the buffer, and finally telling the buffer to discard the read bytes so they're not read again, before returning the decoded Frame.

Let's go into more detail on these steps (each number corresponds to one line of code in our implementation):

  1. Data is read from self's reader into self's ReadBuf field using the .read(...) method, and any errors are propagated.
  2. A new Cursor is created, wrapping the buffer of self's ReadBuf field (see its documentation for how to get the buffer as a &[u8]).
  3. We call Frame::check, passing it a mutable reference to the Cursor we just created, and propagate any errors that might occur.
  4. If no errors have been propagated thus far, then it must be the case that we have enough bytes to parse a Frame. Thus, we'll proceed by resetting the cursor's position back to 0 using .set_position(0) on the cursor.
  5. Now that our cursor is back at 0, we'll pass the cursor into Frame::decode to get our Frame, making sure to propagate any errors.
  6. Then, we need to make sure that we don't accidentally read the same bytes again. Since Frame::decode advances the cursor to just past the end of the frame, we can figure out how many bytes were used by calling .position() on the cursor. Then, we can call .consume(len) on self's ReadBuf, where len is the number of bytes used (cast to a usize with len as usize), to make sure that later calls to .buf() don't give us those same bytes again.
  7. Finally, we can return the Frame, making sure to wrap it in Ok(_) since the method returns a Result.
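To show the shape of this control flow without giving away the lab, here is a sketch of the same seven steps on a toy protocol where a "frame" is just a line ending in \r\n. Everything in it is a stand-in: LineError, LineReader, check, and decode are made up for this example, a plain Vec<u8> plays the role of ReadBuf, and std::io::Cursor is used in place of the course's cursor crate.

```rust
use std::io::{self, Cursor, Read};

// Hypothetical error type for this sketch (not the lab's ReadError).
#[derive(Debug)]
enum LineError {
    Incomplete,
    Io(io::Error),
}

impl From<io::Error> for LineError {
    fn from(err: io::Error) -> Self {
        LineError::Io(err)
    }
}

// Stand-in for Frame::check: succeeds only if a complete line is
// available, and advances the cursor past its "\r\n".
fn check(cursor: &mut Cursor<&[u8]>) -> Result<(), LineError> {
    let data: &[u8] = *cursor.get_ref();
    let start = cursor.position() as usize;
    let end = data[start..]
        .windows(2)
        .position(|pair| pair == b"\r\n")
        .ok_or(LineError::Incomplete)?;
    cursor.set_position((start + end + 2) as u64);
    Ok(())
}

// Stand-in for Frame::decode: returns the line without its terminator.
fn decode(cursor: &mut Cursor<&[u8]>) -> Result<String, LineError> {
    let data: &[u8] = *cursor.get_ref();
    let start = cursor.position() as usize;
    check(cursor)?;
    let end = cursor.position() as usize - 2;
    Ok(String::from_utf8_lossy(&data[start..end]).into_owned())
}

struct LineReader<R> {
    reader: R,
    buf: Vec<u8>, // plays the role of ReadBuf in this sketch
}

impl<R: Read> LineReader<R> {
    fn new(reader: R) -> Self {
        LineReader { reader, buf: Vec::new() }
    }

    fn read_line(&mut self) -> Result<String, LineError> {
        // 1. Pull more bytes from the reader into our buffer.
        let mut chunk = [0u8; 64];
        let n = self.reader.read(&mut chunk)?;
        self.buf.extend_from_slice(&chunk[..n]);
        // 2. View the buffered bytes through a cursor.
        let mut cursor = Cursor::new(self.buf.as_slice());
        // 3. Make sure a complete line is present.
        check(&mut cursor)?;
        // 4. Rewind so decoding starts from the beginning.
        cursor.set_position(0);
        // 5. Decode the line.
        let line = decode(&mut cursor)?;
        // 6. Discard the bytes we just used.
        let used = cursor.position() as usize;
        self.buf.drain(..used);
        // 7. Hand back the result.
        Ok(line)
    }
}
```

Notice that when check fails, the function returns before step 6, so the partial data stays in the buffer and the next call can pick up where this one left off.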

For an example on how to test this, see the testing section below.

Defining the WriteFrame trait

Next, we need a way to write Frames as bytes. In the spirit of using traits, we'll abstract this behavior by defining a WriteFrame trait with one method, write_frame, which takes a &mut self and a &Frame, returns an io::Result<()>, and has no default implementation.

This means that all types T that implement WriteFrame are capable of being passed a &Frame and encoding it somewhere in themselves (&mut T).

Define the WriteFrame trait in src/rw.rs beneath your impl block for FrameReader.

For a reminder on how to write a trait with a method and no default implementation, see Chapter 4.1: Traits and Generics.

Implementing WriteFrame for all Write types

When most Rust programmers learn about traits, it's usually easiest to think about implementing traits on types one at a time. However, we can actually perform "blanket implementations", where we implement a trait for all types that satisfy some trait bounds.

An example of this is that the std::string::ToString trait is implemented for all types that are also std::fmt::Display, and works by creating a new String, formatting the value into it with Display::fmt, and then returning the string. The source code for this can be found here.
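Here is what a blanket implementation looks like in miniature. The Shout trait is made up for this illustration; the pattern of implementing it for every T: Display mirrors how the standard library implements ToString.

```rust
use std::fmt::Display;

/// Hypothetical trait with one required method.
trait Shout {
    fn shout(&self) -> String;
}

// One impl block covers every type that implements Display,
// including types defined long after this code was written.
impl<T: Display + ?Sized> Shout for T {
    fn shout(&self) -> String {
        format!("{}!", self).to_uppercase()
    }
}
```

With this in scope, 42.shout() and "hi".shout() both work, even though we never wrote an impl for i32 or str specifically.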

This is convenient when the functionality that our trait requires can be provided by another trait. In our case, this other trait is std::io::Write, which is a trait for types that can be written to. Pairing this trait with the write!() macro allows us to do formatted writes to Write types as easily as using println!(), making any type that implements Write a perfect candidate to implement WriteFrame on.
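As a quick sketch of write!() with a Write type, the function below formats a status line into any writer. The name write_status and the message format are invented for this example; Vec<u8> is a handy writer for testing because it implements Write by appending bytes.

```rust
use std::io::{self, Write};

/// Hypothetical helper: writes "<code> <reason>\r\n" to any writer.
fn write_status<W: Write>(writer: &mut W, code: u16, reason: &str) -> io::Result<()> {
    // write! formats like println!, but sends the bytes to `writer`
    // and returns an io::Result instead of panicking on failure.
    write!(writer, "{} {}\r\n", code, reason)
}
```

Calling write_status(&mut out, 200, "OK") on an empty Vec<u8> named out leaves it holding the bytes of "200 OK\r\n".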

To start, add the following to the top of src/rw.rs:

Filename: src/rw.rs

use std::io::Write;

Then, we'll implement the WriteFrame trait for all types W where W: Write.

For the body of the write_frame method, we'll pretty much do the opposite of Frame::decode, and start by matching on the variants of Frame. We can then use the write!() macro to "print" bytes to the writer, making sure to follow the protocol.

For example, if we're working with a Frame that is the simple string variant, we could use the following:

write!(self, "+{}\r\n", the_string)?;

On the other hand, your implementation for writing bulk strings has to write several parts, and that would look like this (assuming bulk is &Vec<u8> or &[u8]):

write!(self, "${}\r\n", bulk.len())?;
self.write_all(bulk)?;
write!(self, "\r\n")?;

For array variants, you'll start by writing the length, and then you can loop through each Frame and call self.write_frame(frame)? so nested Frames are recursively written.
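The recursion is easier to see on a cut-down example. The Value enum and WriteValue trait below are stand-ins invented for this sketch, not the lab's Frame and WriteFrame, and the *<len>\r\n array header is an assumption here; check the lab 3 instructions for the exact encoding your implementation must follow.

```rust
use std::io::{self, Write};

// Hypothetical three-variant value type for this sketch.
enum Value {
    Simple(String),
    Bulk(Vec<u8>),
    Array(Vec<Value>),
}

trait WriteValue {
    fn write_value(&mut self, value: &Value) -> io::Result<()>;
}

// Blanket implementation: every writer can now write a Value.
impl<W: Write> WriteValue for W {
    fn write_value(&mut self, value: &Value) -> io::Result<()> {
        match value {
            Value::Simple(text) => write!(self, "+{}\r\n", text)?,
            Value::Bulk(bytes) => {
                write!(self, "${}\r\n", bytes.len())?;
                // Raw bytes may not be valid UTF-8, so skip formatting.
                self.write_all(bytes)?;
                write!(self, "\r\n")?;
            }
            Value::Array(items) => {
                // Assumed header format: the element count, then each element.
                write!(self, "*{}\r\n", items.len())?;
                for item in items {
                    // Nested arrays are handled by the same method.
                    self.write_value(item)?;
                }
            }
        }
        Ok(())
    }
}
```

Because the array arm calls write_value on each element, arrays inside arrays need no extra code.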

You can look back on the lab 3 instructions for a reminder on how each frame variant is encoded.

Testing your code

Similar to the last lab, you'll write your tests in src/rw.rs in its own tests module. To do this, add the following at the end of your src/rw.rs file:

Filename: src/rw.rs

// other stuff omitted...

#[cfg(test)]
mod tests {
    use super::*;

    // Here's an example test
    #[test]
    fn test_read_frame() {
        let data = "+Hello, world!\r\n".as_bytes();

        // &[u8] implements `Read`
        let mut reader = FrameReader::new(data);
        let result = reader.read_frame().unwrap();
        assert_eq!(result, Frame::Simple("Hello, world!".to_string()));
    }
}

Questionnaire

No questionnaire this week. Enjoy break!

Feedback

Please make sure that each member fills out this short feedback form individually.

Submitting

Once you're finished, be sure to verify that:

  • cargo fmt has been run
  • cargo clippy passes
  • cargo test passes

Then you can push your changes with the usual: git add ., git commit -m "your message", and git push.

When you're ready to submit your finished lab, do the following:

git tag lab4

And then push it to your repository:

git push origin lab4

Congratulations on finishing!