
Trusted Boot
============

Terminology
-----------

 * TBM: Trusted Boot Module (the entire system) or Trusted Boot Manager (the ARM Cortex-M0 or AVR
   microcontroller).
 * ROTS: read-only trusted state.

Prerequisites
-------------

 * SPI NOR Flash to store a minimalistic read-only Linux image without networking support.
 * ARM Cortex-M0 or AVR microcontroller.
 * A programmable timer to set deadlines and to trigger an interrupt when these expire.
 * Possibly SPI NOR Flash for the Trusted Boot Module to store information.
 * Key storage and management.
 * The option of image rollback?

Trust Model
-----------

There are different trust models that can be used depending on the use case.
These mostly depend on whether the concept of certificate authorities (CAs) is required or not.
Furthermore, the key storage also plays an important role in deciding which of the trust models to
use.

 #. "Home router"
    CA: none

    * replace = reflash RO flash

 #. "Routers in company, sysadmin"
    CA: none

    * replace = reflash or signed statement

 #. "Routers with CA"
    CA: one.

    * replace CA = reflash ROTS
    * replace key = by signed CA statement

 #. "Routers with multiple CAs"
    CA: multiple.

    * replace CA = threshold
    * replace key = by signed CA statement.

 #. "Routers with multiple CAs and initial CA in ROTS"
    CA: multiple.

    * revoking keys, which is not the same as removing keys.

 #. CA: multiple.

    * replace CAs = reflash ROTS.
    * replace key = by signed CA statement

Key Storage
-----------

 * RO Flash
 * TBM
 * Box storage (e.g. HDD)
 * Signed statements by embedding keys in images.

Forced reboot
-------------

Because the image is only trusted to the extent that the user trusts the parties that have signed
it, the image may contain vulnerabilities that the user only becomes aware of at a later point.
As these vulnerabilities can be used to compromise the system in a way that prevents the user from
rebooting it, it is important that the TBM can force a reboot.
A straightforward method of performing such a forced reboot is to set a deadline that the user can
postpone up to *n* times.
The system then reboots either when the user decides to reboot gracefully or, at the latest, when
the TBM forces a reboot after these postponements have been used up.
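
The postponable deadline can be sketched as follows.
This is a minimal illustration, not a proposed implementation: the class name, the use of seconds
as the unit, and passing the current time explicitly are all assumptions made to keep the example
self-contained.

```python
class RebootDeadline:
    """Deadline that may be postponed a limited number of times (hypothetical sketch)."""

    def __init__(self, now, interval, max_postpones):
        self.interval = interval          # time between deadlines (assumed: seconds)
        self.remaining = max_postpones    # the *n* postponements granted to the user
        self.expires_at = now + interval

    def postpone(self, now):
        """Push the deadline back by one interval; refuse if expired or none left."""
        if self.remaining == 0 or now >= self.expires_at:
            return False
        self.remaining -= 1
        self.expires_at = now + self.interval
        return True

    def must_reboot(self, now):
        """True once the deadline has expired and the TBM should force a reboot."""
        return now >= self.expires_at
```

On the microcontroller the ``now`` argument would come from the programmable timer listed under
the prerequisites, and ``must_reboot`` would correspond to the timer interrupt firing.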

Upgrading
---------

One possible attack is for a compromised running image to prevent the system from running the
update process, even after a reboot, which forces the system to always boot the installed version.
To mitigate this we could split up the image into two stages.
The read-only trusted stage will then first boot a minimalistic environment within the image that
checks for updates and performs them, whereupon the full image will be booted.
The read-only trusted stage itself should not perform updates, as that would require a network
stack as well as TLS support, and the stage cannot be updated because it is stored on a read-only
medium.
Both a network stack and TLS support are hard to implement correctly, so they are likely to contain
vulnerabilities that could never be patched in the read-only trusted stage.
This is why checking for updates and performing them is left to the separate update stage.
In combination with the concept of a forced reboot from the TBM, the system would then always be
able to check for updates periodically.

To keep track of the images, a table can be used to mark whether the update stage and the image
itself are known to work.
Based on that table the Trusted Boot Manager can then determine which image to roll back to in case
the updated image does not work.
Furthermore, one important decision is whether to only use images that are known to fully work or
to specifically select working stages separately.
Another important open question is how far back the system is allowed to roll back.
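
As an illustration of such a table, the sketch below picks a rollback target under the stricter
of the two policies mentioned above, namely only using images that are known to fully work.
The field names, the integer version numbers and the ``max_steps_back`` limit are assumptions
introduced for this example; the design leaves these choices open.

```python
from dataclasses import dataclass


@dataclass
class ImageEntry:
    version: int            # assumed: higher number means newer image
    update_stage_ok: bool   # update stage of this image is known to work
    image_ok: bool          # image itself is known to work


def rollback_target(table, max_steps_back=None):
    """Return the newest fully working entry, or None if there is none.

    max_steps_back (hypothetical) limits how many versions behind the newest
    entry the system may go; None means no limit.
    """
    newest_first = sorted(table, key=lambda e: e.version, reverse=True)
    if max_steps_back is not None:
        newest_first = newest_first[: max_steps_back + 1]
    for entry in newest_first:
        if entry.update_stage_ok and entry.image_ok:
            return entry
    return None
```

Selecting working stages separately would instead require two independent searches, one per flag,
which is part of why the simpler policy is attractive.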

Initial Keys and Configuration
------------------------------

The initial key(s) or certificates and the configuration for the TBM could be presented on an SD
card that is inserted before the system is booted for the very first time.
Alternatively, we could decide to have the user embed the keys within the ROTS image before
flashing it to the RO Flash.
Another option would be to offer the user the option of performing key management by means of
serial communication with the Trusted Boot Manager.

Verification of images
----------------------

Because the user has added the keys of the signing parties they trust, the system should only
accept images that have been signed with these keys.
This verification can take place in the read-only trusted stage before booting the image to ensure
that it has not been compromised.
Furthermore, as there is a notion of reproducible builds, there is the option of verifying the
packages within the image.
However, as the read-only trusted stage should be kept simple to avoid any possible vulnerabilities
or compromises, it is better to delegate the responsibility of verifying these packages to the
parties that sign the image.
Similarly, while the idea of sending deltas between different versions of images does sound
beneficial in the sense that the amount of data that has to be downloaded is kept to a minimum,
this would also further complicate the read-only trusted stage to the extent that it could increase
the likelihood of possible vulnerabilities.

Boot procedure
--------------

 #. The image from SPI flash is read and booted.
 #. The read-only trusted stage asks the Trusted Boot Manager what to boot and/or asks whether it
    is allowed to boot a certain image.
    By default the read-only trusted stage should boot a semi-trusted update stage.
 #. The read-only trusted stage boots the semi-trusted update stage using ``kexec``.
 #. The semi-trusted update stage checks for updates and downloads a new image if present.
 #. The semi-trusted update stage finishes by telling the Trusted Boot Manager either that a new
    image has been downloaded or that no updates were available.
 #. The Trusted Boot Manager reboots the system.
 #. The image from SPI flash is read and booted.
 #. The read-only trusted stage asks the Trusted Boot Manager what to boot and/or asks whether it
    is allowed to boot a certain image.
    After a reboot from the update process, the Trusted Boot Manager will tell the read-only
    trusted stage to boot the latest image.
 #. The read-only trusted stage uses ``kexec`` to boot the latest image.
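
From the Trusted Boot Manager's point of view, the procedure above alternates between two answers
to the question of what to boot.
The following sketch models only that decision; the class, method names and the two string labels
are hypothetical and not part of the design.

```python
class BootPolicy:
    """Decides what the read-only trusted stage should boot (hypothetical sketch)."""

    def __init__(self):
        self.update_done = False

    def what_to_boot(self):
        """Answer the read-only trusted stage's query."""
        if self.update_done:
            # The update stage already ran before this reboot: boot the
            # latest image, and default to the update stage again next time.
            self.update_done = False
            return "latest-image"
        return "update-stage"

    def update_finished(self):
        """The update stage reported completion; the TBM reboots afterwards."""
        self.update_done = True
```

Because the flag is cleared as soon as the latest image is selected, any later reboot, including a
forced one, goes through the update stage first.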

Image self-test and rollback
----------------------------

The image could be extended with a self-test, where the untrusted stage tells the Trusted Boot
Manager that it managed to boot successfully.
To detect an image that fails to boot, the Trusted Boot Manager should set a deadline before the
image is booted.
If this deadline expires without such a confirmation, the Trusted Boot Manager can assume that the
image does not work and should not be used in the future, allowing rollback to the last-known
working image.
Furthermore, by solely working with the concept of a last-known working image and an updated image,
the possibility of a downgrade is severely limited and no advanced implementation of an image
rollback system would be required.
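
The two-image scheme can be sketched as a pair of slots.
This is an illustrative model under stated assumptions: images are identified by plain strings,
and the names ``ImageSlots``, ``boot_confirmed`` and ``deadline_expired`` are invented for the
example.

```python
class ImageSlots:
    """Tracks a last-known working image and at most one updated image (sketch)."""

    def __init__(self, last_known_good, candidate=None):
        self.last_known_good = last_known_good
        self.candidate = candidate   # updated image that has not proven itself yet
        self.pending = None          # image currently booting, awaiting confirmation

    def select(self):
        """Pick the image to boot: the updated image if present, else the fallback."""
        image = self.candidate if self.candidate is not None else self.last_known_good
        self.pending = image
        return image

    def boot_confirmed(self):
        """The untrusted stage reported a successful boot before the deadline."""
        if self.pending is not None and self.pending == self.candidate:
            self.last_known_good = self.candidate  # promote the updated image
            self.candidate = None
        self.pending = None

    def deadline_expired(self):
        """No confirmation arrived in time: never use the updated image again."""
        if self.pending is not None and self.pending == self.candidate:
            self.candidate = None
        self.pending = None
```

After ``deadline_expired`` the next call to ``select`` returns the last-known working image, which
is the only rollback this scheme permits.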

Trusted Boot Manager protocol
-----------------------------

To have the board communicate with the TBM, we have to devise a protocol.
This communication is likely to take place over a serial link through the UART ports.
Furthermore, the board can only send requests to the TBM to which the TBM responds, i.e. the TBM
should never be able to initiate communication.
In the read-only trusted stage the Trusted Boot Manager fully trusts the system and as such the
read-only trusted stage can operate using the full scope of the protocol.
At some point the TBM will be informed by the read-only trusted stage that it will boot into
untrusted state, which is an indication for the TBM to configure the deadline timer and that the
scope of the protocol should be limited as the untrusted stage is only allowed to inform the TBM
that it booted successfully.
The read-only trusted stage should be able to ask which images are preferred for booting and if
booting a certain image is allowed.
Furthermore, it would probably be interesting if a list of images that can be booted can be queried
from the TBM.
In the case that we decide to support initial communication from the read-only trusted stage,
either by flashing the RO Flash or by inserting a removable medium such as an SD card, it might be
interesting to support this option of configuration in the protocol.
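
The request/response model with a reduced scope after the hand-over can be sketched as below.
The command names (``LIST_IMAGES``, ``PREFERRED_IMAGE``, ``MAY_BOOT``, ``ENTER_UNTRUSTED``,
``BOOT_OK``) and the tuple-based replies are placeholders invented for this example; no wire
format has been decided.

```python
# Commands available while the read-only trusted stage is running (assumed names).
FULL_SCOPE = {"LIST_IMAGES", "PREFERRED_IMAGE", "MAY_BOOT", "ENTER_UNTRUSTED"}
# The untrusted stage may only report a successful boot.
UNTRUSTED_SCOPE = {"BOOT_OK"}


class TbmProtocol:
    """TBM side of the protocol: it only ever responds to requests (sketch)."""

    def __init__(self, images, preferred):
        self.images = list(images)
        self.preferred = preferred
        self.trusted = True           # True while the read-only trusted stage runs
        self.deadline_armed = False   # stands in for the hardware deadline timer

    def handle(self, command, arg=None):
        allowed = FULL_SCOPE if self.trusted else UNTRUSTED_SCOPE
        if command not in allowed:
            return ("ERR", "not permitted")
        if command == "LIST_IMAGES":
            return ("OK", list(self.images))
        if command == "PREFERRED_IMAGE":
            return ("OK", self.preferred)
        if command == "MAY_BOOT":
            return ("OK", arg in self.images)
        if command == "ENTER_UNTRUSTED":
            self.trusted = False       # limit the scope from now on
            self.deadline_armed = True
            return ("OK", None)
        # Only BOOT_OK remains: the image booted, so disarm the deadline.
        self.deadline_armed = False
        return ("OK", None)
```

Note that the sketch has no way back to the full scope other than a reboot, which matches the idea
that trust is only regained by restarting from the read-only trusted stage.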

Proposal(s)
-----------

To reduce the number of possible vulnerabilities, the codebase and complexity of the TBM should be
kept as minimalistic as possible.
From that point of view, it would be best that the read-only trusted stage only consists of the
tools necessary to verify signed images, to communicate with the TBM and to ``kexec`` those images.
Furthermore, since the read-only trusted stage cannot be modified once it has been written to the
RO Flash, it cannot be easily patched.
As such, it makes a lot of sense to not include a network and TLS stack, as it is very likely that
they contain many vulnerabilities that have yet to be found.
Separating the update stage from the actual image itself allows the system to forcefully perform
updates, which would essentially prevent any attacker from extending the time window they have any
further.
However, similar to the read-only trusted stage, the update stage should only come with the tools
necessary to check for updates and to perform an update: a network stack, a TLS stack and tools to
communicate with the TBM.
Finally, from the point of view of maintaining minimalism, only allowing rollback to the last-known
working image, of which both the update stage and the image itself are known to work, is probably
one of the most elegant choices, as it easily prevents downgrade attacks and is easier to
implement.