Trusted Boot
============

Terminology
-----------

* TBM: Trusted Boot Module (the entire system) or Trusted Boot Manager (the ARM Cortex M0 or AVR
  microcontroller).
* ROTS: read-only trusted state.

Prerequisites
-------------

* SPI NOR Flash to store a minimal read-only Linux image without networking support.
* ARM Cortex M0 or AVR microcontroller.
* A programmable timer to set deadlines and to trigger an interrupt when these expire.
* Possibly SPI NOR Flash for the Trusted Boot Module to store information.
* Key storage and management.
* The option of image rollback?

Trust Model
-----------

There are different trust models that can be used depending on the use case.
These mostly depend on whether the concept of certificate authorities (CAs) is required or not.
Furthermore, the key storage also plays an important role in deciding which of the trust models to
use.

#. "Home router"

   CA: none.

   * replace = reflash RO flash

#. "Routers in company, sysadmin"

   CA: none.

   * replace = reflash or signed statement

#. "Routers with CA"

   CA: one.

   * replace CA = reflash ROTS
   * replace key = by signed CA statement

#. "Routers with multiple CAs"

   CA: multiple.

   * replace CA = threshold
   * replace key = by signed CA statement

#. "Routers with multiple CAs and initial CA in ROTS"

   CA: multiple.

   * revoking keys, which is not the same as removing keys

#. CA: multiple.

   * replace CAs = reflash ROTS
   * replace key = by signed CA statement
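
The threshold rule in the "Routers with multiple CAs" model can be illustrated with a minimal Python
sketch.
It assumes that signatures have already been verified elsewhere and that each CA is identified by a
string; the names ``ca_replacement_allowed``, ``trusted_cas``, ``approvals`` and ``threshold`` are
hypothetical and not part of this design.

```python
from typing import Set


def ca_replacement_allowed(trusted_cas: Set[str], approvals: Set[str], threshold: int) -> bool:
    """Return True if enough currently trusted CAs approved the replacement.

    `approvals` is assumed to hold the identifiers of the CAs whose signature
    on the replacement statement was already verified (hypothetical input).
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    # Approvals from CAs that are not currently trusted are ignored.
    valid = approvals & trusted_cas
    return len(valid) >= threshold
```

With three trusted CAs and a threshold of two, a statement approved by two of them would be
accepted, while one approved by a single trusted CA and an unknown party would not.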

Key Storage
-----------

* RO Flash
* TBM
* Box storage (e.g. HDD)
* Signed statements by embedding keys in images.

Forced reboot
-------------

Because the image is only trusted up to the extent that the user trusts the parties that have
signed the image, the image may contain vulnerabilities that the user will only become aware of at
a later point.
As these vulnerabilities can be used to compromise the system in such a way that it can no longer
be rebooted from within, it is important that the TBM can reboot the system.
A straightforward way to perform such a forced reboot is to set a deadline that the user can
postpone up to *n* times.
Once these postponements are used up, the TBM simply forces a reboot, unless the user has already
decided to reboot gracefully.
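
The postponable deadline described above can be sketched in Python as follows.
This is only an illustration under stated assumptions: time is modelled as integer seconds, each
postponement extends the deadline by a fixed interval, and the names ``RebootDeadline``,
``interval`` and ``max_postpones`` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RebootDeadline:
    """Hypothetical model of the TBM's forced-reboot deadline."""

    interval: int            # seconds granted per deadline period (assumption)
    max_postpones: int       # the *n* postponements the user is allowed
    deadline: int = 0        # absolute time at which the reboot is forced
    postpones_used: int = 0

    def arm(self, now: int) -> None:
        """Start a fresh deadline, e.g. right after a boot."""
        self.deadline = now + self.interval
        self.postpones_used = 0

    def postpone(self, now: int) -> bool:
        """Extend the deadline; refuse once the budget is spent or it expired."""
        if self.postpones_used >= self.max_postpones or now >= self.deadline:
            return False
        self.postpones_used += 1
        self.deadline += self.interval
        return True

    def must_reboot(self, now: int) -> bool:
        """True once the (possibly postponed) deadline has passed."""
        return now >= self.deadline
```

A graceful reboot by the user would simply call ``arm`` again on the next boot, which resets the
postponement budget.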

Upgrading
---------

One possible attack vector is a running image that has been compromised in such a way that it
prevents the system from running the update process, even after a reboot.
This would force the system to always boot the installed version.
To mitigate this we could split up the image into two stages.
The read-only trusted stage will then first boot up a minimal environment within the image to
check for updates and to perform them, whereupon the full image will be booted.
Because the read-only trusted stage is stored on a read-only medium and cannot be updated, we do
not want it to perform updates itself, as that would require a network stack as well as TLS
support.
Both the network stack and TLS support are hard to implement correctly, so both are likely to
contain vulnerabilities that cannot be mitigated as the read-only trusted stage cannot be updated.
This is why the read-only trusted stage cannot be responsible for checking for updates and
performing updates.
With a separate update stage, in combination with the concept of a forced reboot from the TBM, the
system would always be able to check for updates periodically.
To keep track of the images, a table can be used to mark whether the update stage and the image
itself are known to work.
Based on that table the Trusted Boot Manager can then determine which image to roll back to in
case the updated image does not work.
Furthermore, one important decision is whether to only use images that are known to fully work or
to specifically select working stages separately.
Another important open question is how far back the system is allowed to roll back.
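
As an illustration of the image table, the following Python sketch picks a rollback target under
the "only images that are known to fully work" policy.
The table layout, the integer version numbers and the optional ``max_rollback`` limit are
assumptions made for this example; the design leaves these questions open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImageEntry:
    version: int           # hypothetical monotonically increasing version
    update_stage_ok: bool  # update stage is known to work
    image_ok: bool         # image itself is known to work


def rollback_target(
    table: List[ImageEntry],
    failed_version: int,
    max_rollback: Optional[int] = None,
) -> Optional[ImageEntry]:
    """Return the newest older entry whose stages are both known to work."""
    candidates = [
        entry
        for entry in table
        if entry.version < failed_version and entry.update_stage_ok and entry.image_ok
    ]
    if max_rollback is not None:
        # Optionally limit how many versions the system may go back.
        candidates = [e for e in candidates if failed_version - e.version <= max_rollback]
    return max(candidates, key=lambda e: e.version, default=None)
```

Selecting working stages separately would instead require two lookups, one per flag, which is
part of why the fully-working policy is the simpler of the two.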

Initial Keys and Configuration
------------------------------

The initial key(s) or certificates and the configuration for the TBM could be presented on an SD
card that is inserted before the system is booted for the very first time.
Alternatively, we could decide to have the user embed the keys within the ROTS image before
flashing it to the RO Flash.
Another option would be to let the user perform key management by means of serial communication
with the Trusted Boot Manager.

Verification of images
----------------------

Because the user has added the keys of the signing parties they trust, the system should only
allow images that have been signed with these keys.
This verification can take place in the read-only trusted stage before booting the image to ensure
that it has not been compromised.
Furthermore, as there is a notion of reproducible builds, there is the option of verifying the
packages within the image.
However, as the read-only trusted stage should be kept simple to avoid any possible
vulnerabilities or compromises, it is better to delegate the responsibility of verifying these
packages to the parties that sign the image.
Similarly, while the idea of sending deltas between different versions of images does sound
beneficial in the sense that the amount of data that has to be downloaded is kept to a minimum,
this would also further complicate the read-only trusted stage to the extent that it could
increase the likelihood of possible vulnerabilities.

Boot procedure
--------------

#. The image from SPI flash is read and booted.
#. The read-only trusted stage asks the Trusted Boot Manager what to boot and/or asks whether it
   is allowed to boot a certain image.
   By default the read-only trusted stage should boot a semi-trusted update stage.
#. The read-only trusted stage boots the semi-trusted update stage using ``kexec``.
#. The semi-trusted update stage checks for updates and downloads a new image if present.
#. Once done, the semi-trusted update stage finishes by telling the Trusted Boot Manager that
   a new image has been downloaded or that no updates were available.
#. The Trusted Boot Manager reboots the system.
#. The image from SPI flash is read and booted.
#. The read-only trusted stage asks the Trusted Boot Manager what to boot and/or asks whether it
   is allowed to boot a certain image.
   After a reboot from the update process, the Trusted Boot Manager will tell the read-only
   trusted stage to boot the latest image.
#. The read-only trusted stage uses ``kexec`` to boot the latest image.
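
The two passes through the read-only trusted stage can be simulated with a small Python sketch.
It models only the decision the Trusted Boot Manager makes, not the actual ``kexec`` calls or the
download; the class ``Tbm``, its methods and the trace labels are hypothetical names chosen for
this example.

```python
from typing import List


class Tbm:
    """Hypothetical Trusted Boot Manager that decides what to boot next."""

    def __init__(self) -> None:
        self.update_finished = False

    def what_to_boot(self) -> str:
        # After the update stage has reported back, boot the latest image once;
        # by default, boot the semi-trusted update stage.
        if self.update_finished:
            self.update_finished = False
            return "latest-image"
        return "update-stage"

    def report_update_done(self) -> None:
        # Sent by the update stage whether or not a new image was downloaded.
        self.update_finished = True


def boot_cycle(tbm: Tbm) -> List[str]:
    """Run the boot procedure until the latest image is booted."""
    trace = []
    while True:
        trace.append("read-only-trusted-stage")  # image from SPI flash boots
        target = tbm.what_to_boot()
        trace.append(target)                     # stands in for the kexec step
        if target == "latest-image":
            return trace
        tbm.report_update_done()
        trace.append("tbm-reboot")               # TBM reboots the system
```

Calling ``boot_cycle(Tbm())`` walks through the update stage, the reboot, and finally the latest
image, in the same order as the steps listed above.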

Image self-test and rollback
----------------------------

The image could be extended with a self-test, where the untrusted stage tells the Trusted Boot
Manager that it managed to boot successfully.
To detect an image that fails to boot, the Trusted Boot Manager should set a deadline before the
image is booted.
When this deadline expires without such a report, the Trusted Boot Manager can assume that the
image does not work and should not be used in the future, allowing rollback to the last-known
working image.
Furthermore, by solely working with the concept of a last-known working image and an updated
image, the possibility of a downgrade is severely limited and no advanced implementation of an
image rollback system would be required.
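
A minimal Python sketch of this two-slot scheme is given below.
It assumes that images are identified by name, that time is measured in integer seconds, and that
a missing success report is represented by ``None``; ``Slots`` and ``next_boot_image`` are
hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slots:
    last_known_good: str            # image that reported a successful boot before
    updated: Optional[str] = None   # freshly downloaded image, not yet proven


def next_boot_image(slots: Slots, deadline: int, ok_reported_at: Optional[int]) -> str:
    """Decide which image to use after the updated image was given a try.

    `ok_reported_at` is the time at which the untrusted stage reported a
    successful boot, or None if no report arrived (assumed representation).
    """
    if slots.updated is None:
        return slots.last_known_good
    if ok_reported_at is not None and ok_reported_at <= deadline:
        # Self-test passed in time: promote the updated image.
        slots.last_known_good = slots.updated
    # In both cases the updated slot is cleared; a failed image is never retried.
    slots.updated = None
    return slots.last_known_good
```

Because only two slots exist, the oldest image an attacker could force the system back to is the
last-known working one.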

Trusted Boot Manager protocol
-----------------------------

To have the board communicate with the TBM, we have to devise a protocol.
This communication is likely to take place over serial communication through the UART ports.
Furthermore, the board can only send requests to the TBM to which the TBM responds, i.e. the TBM
should never be able to initiate communication.
In the read-only trusted stage the Trusted Boot Manager fully trusts the system and as such the
read-only trusted stage can operate using the full scope of the protocol.
At some point the TBM will be informed by the read-only trusted stage that it will boot into the
untrusted stage, which is an indication for the TBM to configure the deadline timer and to limit
the scope of the protocol, as the untrusted stage is only allowed to inform the TBM that it booted
successfully.
The read-only trusted stage should be able to ask which images are preferred for booting and if
booting a certain image is allowed.
Furthermore, it would probably be useful if a list of images that can be booted could be queried
from the TBM.
In the case that we decide to support initial configuration from the read-only trusted stage,
either by flashing the RO Flash or by inserting a removable medium such as an SD card, it might be
worthwhile to support this option of configuration in the protocol.
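
The scope restriction can be illustrated with a request handler sketched in Python.
The request names, the ``("OK", ...)`` / ``("DENIED", None)`` reply format and the ``Tbm`` class
are assumptions made for this example; the actual wire format over UART is not defined here.

```python
from enum import Enum, auto
from typing import List, Optional, Tuple


class Request(Enum):
    LIST_IMAGES = auto()      # which images can be booted?
    PREFERRED_IMAGE = auto()  # which image is preferred for booting?
    MAY_BOOT = auto()         # is booting the named image allowed?
    ENTER_UNTRUSTED = auto()  # trusted stage is about to hand over control
    BOOT_OK = auto()          # untrusted stage reports a successful boot


class Tbm:
    """Hypothetical TBM side: it only ever answers requests."""

    def __init__(self, images: List[str], preferred: str) -> None:
        self.images = images
        self.preferred = preferred
        self.untrusted = False       # full protocol scope until told otherwise
        self.deadline_armed = False

    def handle(self, request: Request, arg: Optional[str] = None) -> Tuple[str, object]:
        # Once in the untrusted stage, only the success report is accepted.
        if self.untrusted and request is not Request.BOOT_OK:
            return ("DENIED", None)
        if request is Request.LIST_IMAGES:
            return ("OK", list(self.images))
        if request is Request.PREFERRED_IMAGE:
            return ("OK", self.preferred)
        if request is Request.MAY_BOOT:
            return ("OK", arg in self.images)
        if request is Request.ENTER_UNTRUSTED:
            self.untrusted = True
            self.deadline_armed = True   # configure the deadline timer
            return ("OK", None)
        # BOOT_OK only makes sense once the untrusted stage is running.
        if not self.untrusted:
            return ("DENIED", None)
        self.deadline_armed = False
        return ("OK", None)
```

Keeping the handler purely reactive matches the requirement that the TBM never initiates
communication: every state change happens in response to a request from the board.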

Proposal(s)
-----------

To reduce the number of possible vulnerabilities, the codebase and complexity of the TBM should be
kept as minimal as possible.
From that point of view, it would be best that the read-only trusted stage only consists of the
tools necessary to verify signed images, to communicate with the TBM and to ``kexec`` those images.
Furthermore, since the read-only trusted stage cannot be modified once it has been written to the
RO Flash, it cannot be easily patched.
As such, it makes a lot of sense to not include a network and TLS stack, as it is very likely that
they contain many vulnerabilities that have yet to be found.
Separating the update stage from the actual image itself allows the system to forcefully perform
updates, which would essentially prevent an attacker from extending their time window any
further.
However, similar to the read-only trusted stage, the update stage should only come with the tools
necessary to check for updates and to perform an update: a network stack, a TLS stack and tools to
communicate with the TBM.
Finally, from the point of view of maintaining minimalism, only allowing rollback to the last-known
working image, of which both the update stage and the image itself are known to work, is probably
one of the most elegant choices, as it easily prevents downgrade attacks and is easier to
implement.