Triple buffering

Our current handling of the image received on the serial port is not very satisfying. As soon as we have received a full image, we update the shared image. This means that the next rows to be displayed come from the newer image, while some rows already lit on the LED matrix may come from the older one.

⚠ You do not have to implement double-buffering. You have to understand how it works, but you only need to implement triple-buffering.

What is double-buffering?

In older computers, drawing something was performed directly in the screen buffer (also called the video RAM) as memory was tight. It meant that some artifacts could easily be perceived unless extreme caution was observed. For example, if an image was displayed by a beam going from the top to the bottom of the screen, drawing a shape starting from the bottom of the screen would make the bottom half of the shape appear before the top half does. On the other hand, drawing from the top to the bottom at the same pace as the refreshing beam would display consistent pictures.

As memory became more affordable, people started to draw the next image to display into a back buffer. This process lets software draw things in an order which is not correlated with the beam displaying the image (for example objects far away then nearer objects). Once the new image is complete, it can be transferred into the front buffer (the video RAM) while ensuring that the transfer does not cross the beam, which requires synchronization with the hardware. This way, only full images are displayed in a consistent way.

On some hardware, both buffers fit in video RAM. In this case, switching buffers is done by modifying a hardware register at the appropriate time.
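
As a rough illustration of the idea, here is a minimal host-side sketch in plain Rust (it is not code for the board). The names DoubleBuffer, front, back_mut and flip are hypothetical, and the "hardware register" is modelled by a simple index.

```rust
/// Two frame buffers plus an index which plays the role of the hardware
/// register selecting the buffer being scanned out.
pub struct DoubleBuffer<const N: usize> {
    buffers: [[u8; N]; 2],
    front: usize,
}

impl<const N: usize> DoubleBuffer<N> {
    pub fn new() -> Self {
        Self { buffers: [[0; N]; 2], front: 0 }
    }

    /// The buffer currently displayed (read-only for the drawing code).
    pub fn front(&self) -> &[u8; N] {
        &self.buffers[self.front]
    }

    /// The buffer we draw into, in whatever order suits us.
    pub fn back_mut(&mut self) -> &mut [u8; N] {
        &mut self.buffers[1 - self.front]
    }

    /// Exchange the roles of the two buffers without copying any pixel.
    /// In a real system this must only happen between two frames.
    pub fn flip(&mut self) {
        self.front = 1 - self.front;
    }
}
```

Drawing into back_mut() never disturbs what front() returns; only flip() makes the new content visible, and it does so all at once.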

Double-buffering in our project

We already implement part of the double-buffering method in our code: we prepare the next image in a separate buffer while the current one is being displayed in a loop. We could modify our code (⚠ again, you do not need to implement double-buffering, this is only an example, you'll implement triple-buffering) so that the image switching takes place at the appropriate time:

  • Make the new image a shared resource next_image rather than a local resource.
  • Add a shared boolean switch_requested to the Shared state, and set it in receive_byte when the new image is complete.
  • Have the display task check the switch_requested boolean after displaying the last row of the current image, and swap the image and next_image if this is the case and reset switch_requested.
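
The three steps above can be sketched as follows. This is only a host-side model under assumptions: the Image type is a hypothetical 4-byte array, the function names image_complete and end_of_frame are made up for the illustration, and locking is left out entirely.

```rust
/// Hypothetical tiny image, for illustration only.
pub type Image = [u8; 4];

pub struct Shared {
    pub image: Image,
    pub next_image: Image,
    pub switch_requested: bool,
}

/// Receiving side: called once `next_image` has been entirely filled.
pub fn image_complete(shared: &mut Shared) {
    shared.switch_requested = true;
}

/// Display side: called after the last row of `image` has been displayed.
pub fn end_of_frame(shared: &mut Shared) {
    if shared.switch_requested {
        // With plain arrays this exchanges the contents of both buffers.
        std::mem::swap(&mut shared.image, &mut shared.next_image);
        shared.switch_requested = false;
    }
}
```

Note that the switch only ever happens in end_of_frame, so a frame is never made of rows coming from two different images.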

By locking next_image and switch_requested for the shortest possible time, the receive_byte task would prevent the display task from running for very short periods. However, we could still run into an issue in the following scenario:

  • The last byte of the next image is received just as the current image starts displaying.
  • We set switch_requested to request the image switch, but this will happen only after the whole current image has been displayed (roughly 1/60 of a second later, or 17ms).
  • The speed of the serial port is 38400 bits per second, and a byte requires 10 symbols (start, 8 bits, stop).
  • It means that while the current image is being displayed, about 64 bytes of the next-next image can be received.
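
The figure of 64 bytes can be checked with a one-line computation. The function name bytes_per_frame is hypothetical; the numbers are the ones given above (38400 symbols per second, 10 symbols per byte, 60 frames per second).

```rust
/// Number of whole bytes which can arrive on the serial line while one
/// frame is being displayed.
pub const fn bytes_per_frame(
    symbols_per_second: u32,
    symbols_per_byte: u32,
    frames_per_second: u32,
) -> u32 {
    // 38400 / 10 = 3840 bytes per second, 3840 / 60 = 64 bytes per frame.
    symbols_per_second / symbols_per_byte / frames_per_second
}
```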

Where can we store those bytes? If we store them in next_image, we will alter a buffer which has been fully drawn but not displayed yet, so we cannot do this. Obviously, we cannot store them in image either, since it is the one being displayed. With only two buffers, there is nothing we can do.

Triple buffering

We need a third buffer: one buffer is the one currently being displayed, one buffer is the next fully completed image ready to be displayed, and one buffer is the work area where we build the currently incomplete image.

In order to avoid copying whole images around, we would like to work with buffer references and switch those references. Should we use dynamic memory allocation? ☠ Certainly not.
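
To make "switching references" concrete, here is a minimal host-side sketch in which the references are plain indices into a fixed array of three images. Everything in it (the TripleBuffer type, its method names, the 4-byte Image) is hypothetical and only illustrates the three roles described above; the rest of this section uses a memory pool instead.

```rust
/// Hypothetical tiny image, for illustration only.
pub type Image = [u8; 4];

pub struct TripleBuffer {
    buffers: [Image; 3],
    displayed: usize,     // buffer currently being displayed
    ready: Option<usize>, // complete image waiting to be displayed, if any
    work: usize,          // incomplete image being built
}

impl TripleBuffer {
    pub fn new() -> Self {
        Self { buffers: [[0; 4]; 3], displayed: 0, ready: None, work: 1 }
    }

    pub fn displayed(&self) -> &Image {
        &self.buffers[self.displayed]
    }

    pub fn work_mut(&mut self) -> &mut Image {
        &mut self.buffers[self.work]
    }

    /// The work image is complete: it becomes the ready image.
    pub fn publish(&mut self) {
        let next_work = match self.ready {
            // A ready image was never displayed: recycle its buffer.
            Some(stale) => stale,
            // Otherwise use the third buffer (indices 0, 1, 2 sum to 3).
            None => 3 - self.displayed - self.work,
        };
        self.ready = Some(self.work);
        self.work = next_work;
    }

    /// Called between two frames: switch to the ready image if any.
    pub fn end_of_frame(&mut self) {
        if let Some(ready) = self.ready.take() {
            self.displayed = ready;
        }
    }
}
```

Only indices change hands in publish and end_of_frame; no image is ever copied.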

The heapless crate

The heapless crate contains several data structures that can be used in environments where dynamic memory allocation is not available or not desirable:

  • heapless::Vec<T, N> has an interface quite similar to std::vec::Vec<T> except that those vectors have a fixed capacity N, which means that the push operation returns a Result indicating whether the operation succeeded or failed (in which case it returns the element we tried to push).
  • Other structures such as BinaryHeap, IndexMap, IndexSet, String, etc. behave much like the standard library ones.
  • heapless::pool is a module for defining lock-free memory pools which allocate and reclaim fixed size objects: this is the one we are interested in.
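
To illustrate the push contract described in the first item, here is a small model written with the standard library only. It is not heapless's implementation: the FixedVec type is hypothetical and merely mimics the behaviour of a fixed-capacity vector whose push gives the element back when there is no room left.

```rust
/// Hypothetical fixed-capacity vector storing at most `N` elements inline.
pub struct FixedVec<T, const N: usize> {
    items: [Option<T>; N],
    len: usize,
}

impl<T, const N: usize> FixedVec<T, N> {
    pub fn new() -> Self {
        Self { items: std::array::from_fn(|_| None), len: 0 }
    }

    /// Appends `item`, or hands it back in `Err` if the vector is full.
    pub fn push(&mut self, item: T) -> Result<(), T> {
        if self.len == N {
            return Err(item);
        }
        self.items[self.len] = Some(item);
        self.len += 1;
        Ok(())
    }

    pub fn len(&self) -> usize {
        self.len
    }
}
```

Returning the rejected element lets the caller decide what to do with it rather than losing it silently.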

Using a pool

By using a static pool of Image types named POOL, we will be able to manipulate values of type Box<POOL>: this type represents a reference to an image from the pool. Box<POOL> implements Deref<Target = Image> as well as DerefMut, so we will be able to use such a type instead of a reference to an Image. Also, we can easily swap two Box<POOL> objects instead of exchanging whole image contents.

A pool is declared globally by using the heapless::box_pool!() macro as described in the heapless::pool documentation. The BoxBlock<Image> represents the space occupied by an image and will be managed by the pool. Then the .alloc() method can be used to retrieve some space to be used through a Box<POOL> smart pointer. Dropping such a Box<POOL> will return the space to the pool.

  // Imports, assuming the module paths of heapless 0.8.
  use heapless::{box_pool, pool::boxed::{Box, BoxBlock}};

  box_pool!(POOL: Image);
  …
  // Code to put in the main function:
  // Statically reserve space for three `Image` objects, and let them
  // be managed by the pool `POOL`.
  unsafe {
    const BLOCK: BoxBlock<Image> = BoxBlock::new();
    static mut MEMORY: [BoxBlock<Image>; 3] = [BLOCK; 3];
    // By default, taking a mutable reference to static mutable data
    // is rejected by the `static_mut_refs` lint. We want to allow it
    // here, as each block is handed to the pool exactly once.
    #[allow(static_mut_refs)]
    for block in &mut MEMORY {
      POOL.manage(block);
    }
  }

  • This pool can hand out a Box<POOL> through POOL.alloc(model), which returns a Result<Box<POOL>, Image> whose Ok value is initialized from model:
    • Either the pool could return an object (Ok(…)).
    • Or the pool had no free object, in which case the model is returned with the error: Err(model).
  • When it is no longer used, a Box<POOL> can be returned to the pool just by dropping it.
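
The allocation and release behaviour described above can be modelled on a host machine with the standard library only. The sketch below is not heapless: ModelPool and ModelBox are hypothetical names, and the model only keeps count of the free slots instead of managing real memory blocks. It shows the two properties we rely on: alloc gives the model back when the pool is exhausted, and dropping a box frees its slot.

```rust
use std::cell::Cell;
use std::ops::{Deref, DerefMut};

/// Hypothetical pool which only counts its free slots.
pub struct ModelPool {
    free: Cell<usize>,
}

/// Hypothetical smart pointer tied to the pool it was taken from.
pub struct ModelBox<'a, T> {
    pool: &'a ModelPool,
    value: T,
}

impl ModelPool {
    pub fn new(capacity: usize) -> Self {
        Self { free: Cell::new(capacity) }
    }

    /// Takes a slot initialized from `model`, or returns `Err(model)`
    /// when no slot is free.
    pub fn alloc<T>(&self, model: T) -> Result<ModelBox<'_, T>, T> {
        if self.free.get() == 0 {
            return Err(model);
        }
        self.free.set(self.free.get() - 1);
        Ok(ModelBox { pool: self, value: model })
    }

    pub fn free_slots(&self) -> usize {
        self.free.get()
    }
}

impl<T> Deref for ModelBox<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.value
    }
}

impl<T> DerefMut for ModelBox<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.value
    }
}

impl<T> Drop for ModelBox<'_, T> {
    /// Dropping the box gives its slot back to the pool.
    fn drop(&mut self) {
        self.pool.free.set(self.pool.free.get() + 1);
    }
}
```

With a capacity of three, a fourth simultaneous allocation fails until one of the boxes has been dropped.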

We will build a pool containing the space for three images:

  • When we receive a 0xff on the serial port to indicate a new image, we will draw an image from the pool and start filling its data until we have all the bytes.
  • When an image is complete, the serial receiver will hand it to the display task.
  • The display task starts by waiting for an image coming from the serial receiver and starts displaying it repeatedly.
  • If a new image arrives from the serial receiver after the last line of the current image is displayed, the display task replaces the current image by the new one. This drops the image that was just displayed, and it is then automatically returned to the pool.

We see why, in the worst case, three images might coexist at the same time:

  • The display task may be displaying image 1.
  • The serial receiver has finished receiving image 2 and has stored it so that the display task can pick it up when it is done displaying image 1.
  • The serial receiver has started the reception of image 3.

❎ Declare a pool named POOL handing out Image objects using the box_pool!() macro.

❎ In the main() function, before starting the display or serial_receiver task, reserve memory for three Image objects (using the unsafe block shown above) and hand those three areas to the pool to be managed.

Using Embassy's Signal

To pass an image from the serial receiver to the display task, we can use the Signal data structure from the embassy_sync crate. The Signal structure is interesting:

  • It acts like a queue with at most one item.
  • Reading from the queue waits asynchronously until an item is available and returns it.
  • Writing to the queue overwrites (and drops) the current item if there is one.
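
These three properties can be modelled with the standard library. The Mailbox type below is hypothetical and is not Embassy's Signal: in particular, its wait method blocks the calling thread, whereas Signal waits asynchronously. It only illustrates the "at most one item, newest wins" behaviour.

```rust
use std::sync::{Condvar, Mutex};

/// Hypothetical one-slot mailbox: writing replaces any unread value.
pub struct Mailbox<T> {
    slot: Mutex<Option<T>>,
    available: Condvar,
}

impl<T> Mailbox<T> {
    pub fn new() -> Self {
        Self { slot: Mutex::new(None), available: Condvar::new() }
    }

    /// Stores `value`; a previous value which was never read is dropped.
    pub fn signal(&self, value: T) {
        *self.slot.lock().unwrap() = Some(value);
        self.available.notify_one();
    }

    /// Blocks the calling thread until a value is present, then takes it.
    pub fn wait(&self) -> T {
        let mut slot = self.slot.lock().unwrap();
        loop {
            if let Some(value) = slot.take() {
                return value;
            }
            slot = self.available.wait(slot).unwrap();
        }
    }

    /// Takes the value if there is one, without waiting.
    pub fn try_take(&self) -> Option<T> {
        self.slot.lock().unwrap().take()
    }
}
```

If two values are written before anyone reads, the reader only ever sees the second one. With images taken from a pool, the overwritten image is dropped and its buffer goes back to the pool.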

This is exactly the data structure we need to pass information from the serial receiver to the display task. We will make a global NEXT_IMAGE static variable which will be a Signal to exchange Box<POOL> objects (each Box<POOL> contains an Image) between the serial_receiver and the display tasks.

A Signal needs to use a raw mutex internally. Here, a ThreadModeRawMutex similar to the one we used before can be used.

❎ Declare a NEXT_IMAGE static object as described above.

Displaying the image

You want to modify the display task so that:

  • It waits until an image is available from NEXT_IMAGE and stores it into the local image variable.
  • Then in an infinite loop:
    • It displays the image it has received. image is of type Box<POOL>, but since Box<POOL> implements Deref<Target = Image>, &image can be used in a context where an &Image would be required.
    • If there is a new image available from NEXT_IMAGE, then image is replaced by it. This will drop the older Box<POOL> object, which will be made available to the pool again automatically.

NEXT_IMAGE.wait() returns a Future which will eventually return the next image available in NEXT_IMAGE:

  • Awaiting this future using .await will suspend the task until an image is available. This might be handy to get the initial image.
  • If you import futures::FutureExt into your scope, then you get additional methods on Future implementations. One of them is .now_or_never(), which returns an Option: either None if the Future does not resolve immediately (without waiting), or Some(…) if the result is available immediately. You could use this to check whether a new image is available from NEXT_IMAGE and, if so, replace the current image with it.
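
The control flow of the display task can be sketched on a host machine as follows. This is a model under assumptions, not the task itself: run_display is a hypothetical function, the poll_next closure stands for the non-waiting check on NEXT_IMAGE (for example with .now_or_never()), the show closure stands for displaying all the rows of an image, and the loop is bounded by frames so that the example terminates, whereas the real task loops forever.

```rust
/// Displays `first`, then after each full frame switches to a newer
/// image if `poll_next` reports one. Returns the image displayed last.
pub fn run_display<T>(
    first: T,
    frames: usize,
    mut poll_next: impl FnMut() -> Option<T>,
    mut show: impl FnMut(&T),
) -> T {
    let mut image = first;
    for _ in 0..frames {
        // Display every row of the current image.
        show(&image);
        // Only between two frames: replace the image if a newer one is
        // available. The older value is dropped by the assignment.
        if let Some(newer) = poll_next() {
            image = newer;
        }
    }
    image
}
```

Because the check happens only after show returns, an image is never replaced in the middle of a frame.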

❎ Add the futures crate as a dependency in your Cargo.toml. By default, the futures crate will require std; you have to specify default-features = false when importing it, or add it using cargo add futures --no-default-features.

❎ Rewrite display_image() to do what is described above.

You now want to check that it works by using an initial image before modifying the serial receiver. To do so, you will build an initial image and put it inside NEXT_IMAGE so that it gets displayed.

❎ At the end of the main() function, get an image from the pool, containing a red gradient, by using the POOL.alloc() method.

❎ Send this image containing a gradient to NEXT_IMAGE by using its signal() method.

You should see the gradient on the screen.

❎ Now, check that new images are correctly displayed:

  • Surround the code above with an infinite loop.
  • Inside the loop, add an asynchronous delay of 1 second after sending the image to NEXT_IMAGE.
  • Still inside the loop, repeat those three steps (get an image from the pool, send it to the display task through NEXT_IMAGE, and wait for one second) in another color.

If you see two images alternating every second, you have won: your display task is working, with proper synchronization. Time to modify the serial receiver.

Receiving new images

Only small modifications are needed to the serial receiver:

  • When you receive the first 0xff indicating a new image, get an image from the pool (you can initialize it from the default image, Image::default()). You may panic if you don't get one as we have shown that three image buffers should be enough for the program to work.
  • Receive bytes directly in the image buffer, which you can access with image.as_mut() (remember, you implemented the AsMut trait on Image).
  • When the image is complete, signal its existence to NEXT_IMAGE.
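
The steps above amount to a small state machine, sketched here on a host machine. Several things are assumptions made for the example: the Receiver type and its receive_byte method are hypothetical, the image size of 192 bytes (8 × 8 pixels, 3 bytes each) is assumed, any 0xff is treated as the start of a new image, and a plain array replaces the buffer taken from the pool. Returning the completed image stands for signaling it to NEXT_IMAGE.

```rust
/// Assumed image size: 8 x 8 pixels, 3 bytes per pixel.
pub const IMAGE_BYTES: usize = 192;
pub const START_MARKER: u8 = 0xff;

pub struct Receiver {
    /// Image under construction and number of bytes already stored.
    in_progress: Option<([u8; IMAGE_BYTES], usize)>,
}

impl Receiver {
    pub fn new() -> Self {
        Self { in_progress: None }
    }

    /// Feeds one byte. Returns the image when this byte completes it.
    pub fn receive_byte(&mut self, byte: u8) -> Option<[u8; IMAGE_BYTES]> {
        if byte == START_MARKER {
            // Real code: take a buffer from the pool at this point.
            self.in_progress = Some(([0; IMAGE_BYTES], 0));
            return None;
        }
        // Bytes received before any start marker are ignored.
        let (buffer, filled) = self.in_progress.as_mut()?;
        buffer[*filled] = byte;
        *filled += 1;
        if *filled < IMAGE_BYTES {
            return None;
        }
        // Complete. Real code: signal the image to NEXT_IMAGE.
        self.in_progress.take().map(|(buffer, _)| buffer)
    }
}
```

Once an image has been handed over, the receiver holds no buffer until the next start marker arrives.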

❎ Implement the steps above.

❎ Remove the static IMAGE object which is not used anymore.

❎ Remove the image switching in main(), as you don't want to interfere with displaying the images received from the serial port. You may keep one initial image though, to display something before you receive the first image through the serial port.

❎ Check that you can display images coming from the serial port. Congratulations, you are now using triple buffering without copying large quantities of data around.