Formal Mechanised Semantics of CHERI C: Capabilities, Undefined Behaviour, and Provenance

Authors:
Vadim Zaliva

University of Cambridge, Cambridge, United Kingdom

University of Cambridge, Cambridge, United Kingdom

https://orcid.org/0000-0002-9145-3288
View Profile

,
Kayvan Memarian

University of Cambridge, Cambridge, United Kingdom

University of Cambridge, Cambridge, United Kingdom

https://orcid.org/0000-0003-3723-636X
View Profile

,
Ricardo Almeida

University of Edinburgh, Edinburgh, Scotland Uk

University of Edinburgh, Edinburgh, Scotland Uk

https://orcid.org/0009-0000-1667-1683
View Profile

,
Jessica Clarke

University of Cambridge, Cambridge, United Kingdom

University of Cambridge, Cambridge, United Kingdom

https://orcid.org/0000-0001-8157-5567
View Profile

,
Brooks Davis

SRI International, Menlo Park, California, United States of America

SRI International, Menlo Park, California, United States of America

https://orcid.org/0009-0006-6256-0419
View Profile

,
Alexander Richardson

University of Cambridge, Cambridge, United Kingdom

University of Cambridge, Cambridge, United Kingdom

https://orcid.org/0000-0002-6372-217X
View Profile

,
David Chisnall

Microsoft, Cambridge, United Kingdom

Microsoft, Cambridge, United Kingdom

https://orcid.org/0000-0001-6060-0153
View Profile

,
Brian Campbell

University of Edinburgh, Edinburgh, Scotland Uk

University of Edinburgh, Edinburgh, Scotland Uk

https://orcid.org/0000-0001-6941-5034
View Profile

,
Ian Stark

University of Edinburgh, Edinburgh, Scotland Uk

University of Edinburgh, Edinburgh, Scotland Uk

https://orcid.org/0000-0001-6800-812X
View Profile

,
Robert N. M. Watson

University of Cambridge, Cambridge, United Kingdom

University of Cambridge, Cambridge, United Kingdom

https://orcid.org/0000-0001-8139-8783
View Profile

,
Peter Sewell

University of Cambridge, Cambridge, United Kingdom

University of Cambridge, Cambridge, United Kingdom

https://orcid.org/0000-0001-9352-1013
View Profile

ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1April 2024Pages 181–196https://doi.org/10.1145/3617232.3624859

Published:17 April 2024Publication History

ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1

Pages 181–196

ABSTRACT

Memory safety issues are a persistent source of security vulnerabilities, with conventional architectures and the C codebase chronically prone to exploitable errors. The CHERI research project has shown how one can provide radically improved security for that existing codebase with minimal modification, using unforgeable hardware capabilities in place of machine-word pointers in CHERI dialects of C, implemented as adaptions of Clang/LLVM and GCC. CHERI was first prototyped as extensions of MIPS and RISC-V; it is currently being evaluated by Arm and others with the Arm Morello experimental architecture, processor, and platform, to explore its potential for mass-market adoption, and by Microsoft in their CHERIoT design for embedded cores.

There is thus considerable practical experience with CHERI C implementation and use, but exactly what CHERI C's semantics is (or should be) remains an open question. In this paper, we present the first attempt to rigorously and comprehensively define CHERI C semantics, discuss key semantics design questions relating to capabilities, provenance, and undefined behaviour, and clarify them with semantics in multiple complementary forms: in prose, as an executable semantics adapting the Cerberus C semantics, and mechanised in Coq.

This establishes a solid foundation for CHERI C, for those porting code to it, for compiler implementers, and for future semantics and verification.

References

CHERI x86-64 Sail model. https://github.com/CTSRD-CHERI/sail-cheri-x86. Accessed 2023-04-17.Google Scholar
Periklis Akritidis, Manuel Costa, Miguel Castro, and Steven Hand. Baggy bounds checking: An efficient and backwards-compatible defense against out-of-bounds errors. In USENIX Security Symposium, volume 10, page 96, 2009.Google Scholar
Saar Amar, Tony Chen, David Chisnall, Felix Domke, Nathaniel Filardo, Kunyan Liu, Robert Norton-Wright, Yucong Tao, Robert N. M. Watson, and Hongyan Xia. CHERIoT: Rethinking security for low-cost embedded systems. Technical Report MSR-TR-2023-6, Microsoft, February 2023. URL: https://www.microsoft.com/en-us/research/publication/cheriot-rethinking-security-for-low-cost-embedded-systems/.Google Scholar
Arm. Arm Morello Program. https://developer.arm.com/architectures/cpu-architecture/a-profile/morello, 2022. Accessed 2021-06-29.Google Scholar
Arm Ltd. Arm^® architecture reference manual supplement Morello for A-profile architecture. https://developer.arm.com/documentation/ddi0606/latest, June 2021. DDI0606A.j. 1288pp. Accessed 2022-06-15.Google Scholar
Arm Ltd. Arm Morello program, landing page for Morello open source software. https://www.morello-project.org/, November 2022.Google Scholar
Alasdair Armstrong, Thomas Bauereiss, Brian Campbell, Alastair Reid, Kathryn E. Gray, Robert M. Norton, Prashanth Mundkur, Mark Wassell, Jon French, Christopher Pulte, Shaked Flur, Ian Stark, Neel Krishnaswami, and Peter Sewell. ISA semantics for ARMv8-A, RISC-V, and CHERI-MIPS. Proc. ACM Program. Lang., 3(POPL):71:1--71:31, 2019. Google ScholarDigital Library
Thomas Bauereiss, Brian Campbell, Thomas Sewell, Alasdair Armstrong, Lawrence Esswood, Ian Stark, Graeme Barnes, Robert N. M. Watson, and Peter Sewell. Verified security for the Morello capability-enhanced prototype arm architecture. In Ilya Sergey, editor, Programming Languages and Systems - 31st European Symposium on Programming, ESOP 2022, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022, Munich, Germany, April 2--7, 2022, Proceedings, volume 13240 of Lecture Notes in Computer Science, pages 174--203. Springer, 2022. http://www.cl.cam.ac.uk/~pes20/morello-proofs-esop2022.pdf. Google ScholarDigital Library
Franz Brauße, Fedor Shmarov, Rafael Menezes, Mikhail R. Gadelha, Konstantin Korovin, Giles Reger, and Lucas C. Cordeiro. ESBMC-CHERI: Towards verification of C programs for CHERI platforms with ESBMC. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2022, page 773--776, New York, NY, USA, 2022. Association for Computing Machinery. Google ScholarDigital Library
Miguel Castro, Manuel Costa, and Tim Harris. Securing software by enforcing data-flow integrity. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation, OSDI '06, page 147--160, USA, 2006. USENIX Association.Google ScholarDigital Library
Jeremy Condit, Matthew Harren, Zachary Anderson, David Gay, and George C. Necula. Dependent types for low-level programming. In Proceedings of the 16th European Symposium on Programming, ESOP'07, page 520--535, Berlin, Heidelberg, 2007. Springer-Verlag.Google ScholarDigital Library
Brooks Davis, Robert N. M. Watson, Alexander Richardson, Peter G. Neumann, Simon W. Moore, John Baldwin, David Chisnall, Jessica Clarke, Nathaniel Wesley Filardo, Khilan Gudka, Alexandre Joannou, Ben Laurie, A. Theodore Markettos, J. Edward Maste, Alfredo Mazzinghi, Edward Tomasz Napierala, Robert M. Norton, Michael Roe, Peter Sewell, Stacey Son, and Jonathan Woodruff. CheriABI: Enforcing Valid Pointer Provenance and Minimizing Pointer Privilege in the POSIX C Run-time Environment. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '19, pages 379--393. ACM, 2019. URL: https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/201904-asplos-cheriabi.pdf Google ScholarDigital Library
Peter J. Denning. Virtual memory. ACM Comput. Surv., 2(3):153--189, sep 1970. Google ScholarDigital Library
Joe Devietti, Colin Blundell, Milo M. K. Martin, and Steve Zdancewic. Hardbound: Architectural support for spatial safety of the C programming language. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XIII, page 103--114, New York, NY, USA, 2008. Association for Computing Machinery. Google ScholarDigital Library
Dinakar Dhurjati, Sumant Kowshik, and Vikram Adve. Safecode: Enforcing alias analysis for weakly typed languages. In Proceedings of the 27th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '06, page 144--157, New York, NY, USA, 2006. Association for Computing Machinery. Google ScholarDigital Library
B. Ding, Y. He, Y. Wu, A. Miller, and J. Criswell. Baggy bounds with accurate checking. In 2012 IEEE 23rd International Symposium on Software Reliability Engineering Workshops (ISSREW), pages 195--200, Los Alamitos, CA, USA, nov 2012. IEEE Computer Society. doi:10.1109/ISSREW.2012.24. Google ScholarDigital Library
Wesley Nathaniel Filardo, Brett F. Gutstein, Jonathan Woodruff, Sam Ainsworth, Lucian Paul-Trifu, Brooks Davis, Hongyan Xia, Edward Tomasz Napierala, Alexander Richardson, John Baldwin, David Chisnall, Jessica Clarke, Khilan Gudka, Alexandre Joannou, A. Theodore Markettos, Alfredo Mazzinghi, Robert M. Norton, Michael Roe, Peter Sewell, Stacey Son, Timothy M. Jones, Simon W. Moore, Peter G. Neumann, and Robert N. M. Watson. Cornucopia: Temporal Safety for CHERI Heaps. In 2020 IEEE Symposium on Security and Privacy (SP), pages 608--625, 2020. Google ScholarCross Ref
Jens Gustedt, Peter Sewell, Kayvan Memarian, Victor BF Gomes, and Martin Uecker. A Provenance-aware Memory Object Model for C, 2022. Working draft ISO Technical Specification TS6010.Google Scholar
Ben Hawkes. 0day in the wild. 2019. Project Zero team blog, Google. https://googleprojectzero.blogspot.com/p/0day.html. Accessed 2023-04-19.Google Scholar
Raphael Isemann, Cristiano Giuffrida, Herbert Bos, Erik van der Kouwe, and Klaus von Gleissenthall. Don't look UB: Exposing sanitizer-eliding compiler optimizations. Proc. ACM Program. Lang., 7(PLDI), jun 2023. Google ScholarDigital Library
ISO WG14. Programming languages - C, ISO/IEC 9899:2018 edition, July 2018.Google Scholar
Richard WM Jones and Paul HJ Kelly. Backwards-compatible bounds checking for arrays and pointers in C programs. In AADEBUG, volume 97, pages 13--26, 1997.Google Scholar
Chris Lattner and Vikram Adve. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In Proceedings of the 2004 International Symposium on Code Generation and Optimization (CGO'04), Palo Alto, California, Mar 2004.Google ScholarDigital Library
Xavier Leroy, Andrew W Appel, Sandrine Blazy, and Gordon Stewart. The CompCert memory model, version 2. PhD thesis, Inria, 2012.Google Scholar
Petar Maksimovic, Sacha-Élie Ayoun, José Fragoso Santos, and Philippa Gardner. Gillian, part II: real-world verification for JavaScript and C. In Alexandra Silva and K. Rustan M. Leino, editors, Proceedings of the 33^rd Computer Aided Verification International Conference, CAV 2021, Virtual Event, July 20--23, 2021, Part II, volume 12760 of Lecture Notes in Computer Science, pages 827--850. Springer, 2021. Google ScholarDigital Library
Nicholas D. Matsakis and Felix S. Klock. The Rust language. In Proceedings of the 2014 ACM SIGAda Annual Conference on High Integrity Language Technology, HILT '14, page 103--104, New York, NY, USA, 2014. Association for Computing Machinery. Google ScholarDigital Library
Kayvan Memarian. The Cerberus C semantics. Technical Report UCAM-CL-TR-981, University of Cambridge, Computer Laboratory, May 2023. URL: https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-981.pdf Google Scholar
Kayvan Memarian, Victor B. F. Gomes, Brooks Davis, Stephen Kell, Alexander Richardson, Robert N. M. Watson, and Peter Sewell. Exploring C Semantics and Pointer Provenance. Proc. ACM Program. Lang., 3(POPL), January 2019. Google ScholarDigital Library
Matt Miller. Trends, challenge, and shifts in software vulnerability mitigation. https://github.com/Microsoft/MSRC-Security-Research/tree/master/presentations/2019_02_BlueHatIL, February 2019. Microsoft Security Response Center.Google Scholar
Santosh Nagarakatte, Jianzhou Zhao, Milo M.K. Martin, and Steve Zdancewic. Softbound: Highly compatible and complete spatial memory safety for c. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '09, page 245--258, New York, NY, USA, 2009. Association for Computing Machinery. Google ScholarDigital Library
George C. Necula, Scott McPeak, and Westley Weimer. CCured: type-safe retrofitting of legacy code. In John Launchbury and John C. Mitchell, editors, Conference Record of POPL 2002: The 29th SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Portland, OR, USA, January 16--18, 2002, pages 128--139. ACM, 2002. Google ScholarDigital Library
Nicholas Nethercote and Julian Seward. Valgrind: a framework for heavyweight dynamic binary instrumentation. ACM Sigplan notices, 42(6):89--100, 2007.Google ScholarDigital Library
Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. Isabelle/HOL: A proof assistant for higher-order logic. In TPHOLs, pages 1--18. Springer, 2002.Google Scholar
Seung Hoon Park, Rekha Pai, and Tom Melham. A formal CHERI-C semantics for verification, 2023. Accepted to appear in TACAS 2023. https://arxiv.org/abs/2211.07511. arXiv:2211.07511.Google ScholarDigital Library
Thomas Reps. Undecidability of context-sensitive data-dependence analysis. ACM Trans. Program. Lang. Syst., 22(1):162--186, jan 2000. Google ScholarDigital Library
Alexander Richardson. Complete spatial safety for C and C++ using CHERI capabilities. Technical Report UCAM-CL-TR-949, University of Cambridge, Computer Laboratory, June 2020. URL: https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-949.pdf Google Scholar
Andrew Ruef, Leonidas Lampropoulos, Ian Sweet, David Tarditi, and Michael Hicks. Achieving safety incrementally with Checked C. In Flemming Nielson and David Sands, editors, Principles of Security and Trust - 8th International Conference, POST 2019, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019, Prague, Czech Republic, April 6--11, 2019, Proceedings, volume 11426 of Lecture Notes in Computer Science, pages 76--98. Springer, 2019. Google ScholarCross Ref
Olatunji Ruwase and Monica S Lam. A practical dynamic buffer over-flow detector. In NDSS, volume 2004, pages 159--169, 2004.Google Scholar
Konstantin Serebryany and Timur Iskhodzhanov. AddressSanitizer: A fast address sanity checker. In Presented as part of the 2012 USENIX Annual Technical Conference (ATC 12), pages 309--318. USENIX, 2012.Google Scholar
Julian Seward and Nicholas Nethercote. Using Valgrind to detect undefined value errors with Bit-Precision. In 2005 USENIX Annual Technical Conference (USENIX ATC 05), Anaheim, CA, April 2005. USENIX Association. URL: https://www.usenix.org/conference/2005-usenix-annual-technical-conference/using-valgrind-detect-undefined-value-errors-bit.Google Scholar
UKRI. Digital security by design. https://www.dsbd.tech/ and https://www.ukri.org/our-work/our-main-funds/industrial-strategy-challenge-fund/artificial-intelligence-and-data-economy/digital-security-by-design-challenge/, 2022. Accessed 2021-06-29.Google Scholar
Robert N. M. Watson, Ben Laurie, and Alexander Richardson. Assessing the Viability of an Open- Source CHERI Desktop Software Ecosystem. http://www.capabilitieslimited.co.uk/pdfs/20210917-capltd-cheri-desktop-report-version1-FINAL.pdf, September 2021.Google Scholar
Robert N. M. Watson, Peter G. Neumann, Jonathan Woodruff, Michael Roe, Hesham Almatary, Jonathan Anderson, John Baldwin, Graeme Barnes, David Chisnall, Jessica Clarke, Brooks Davis, Lee Eisen, Nathaniel Wesley Filardo, Richard Grisenthwaite, Alexandre Joannou, Ben Laurie, A. Theodore Markettos, Simon W. Moore, Steven J. Murdoch, Kyndylan Nienhuis, Robert Norton, Alexander Richardson, Peter Rugg, Peter Sewell, Stacey Son, and Hongyan Xia. Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture (Version 9 - DRAFT). Accessed 2023-04-12. URL: https://github.com/CTSRD-CHERI/cheri-specification.Google Scholar
Robert N. M. Watson, Peter G. Neumann, Jonathan Woodruf, Michael Roe, Hesham Almatary, Jonathan Anderson, John Baldwin, Graeme Barnes, David Chisnall, Jessica Clarke, Brooks Davis, Lee Eisen, Nathaniel Wesley Filardo, Richard Grisenthwaite, Alexandre Joannou, Ben Laurie, A. Theodore Markettos, Simon W. Moore, Steven J. Murdoch, Kyndylan Nienhuis, Robert Norton, Alexander Richardson, Peter Rugg, Peter Sewell, Stacey Son, and Hongyan Xia. Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture (Version 8). Technical Report UCAM-CL-TR-951, University of Cambridge, Computer Laboratory, October 2020. URL: https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-951.pdf Google ScholarCross Ref
Robert N. M. Watson, Alexander Richardson, Brooks Davis, John Baldwin, David Chisnall, Jessica Clarke, Nathaniel Filardo, Simon W. Moore, Edward Napierala, Peter Sewell, and Peter G. Neumann. CHERI C/C++ Programming Guide. Technical Report UCAM-CL-TR-947, University of Cambridge, Computer Laboratory, June 2020. URL: https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-947.pdf Google Scholar
WG14. Defect report 260, September 2004. http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm.Google Scholar
Jonathan Woodruff, Alexandre Joannou, Hongyan Xia, Anthony Fox, Robert Norton, Thomas Baureiss, David Chisnall, Brooks Davis, Khilan Gudka, Nathaniel Wesley Filardo, A. Theodore Markettos, Michael Roe, Peter G. Neumann, Robert N. M. Watson, and Simon W. Moore. CHERI Concentrate: Practical Compressed Capabilities. IEEE Transactions on Computers, 68(10):1455--1469, October 2019. URL: https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/2019tc-cheri-concentrate.pdf Google ScholarCross Ref
Jonathan Woodruff, Robert N. M. Watson, David Chisnall, Simon W. Moore, Jonathan Anderson, Brooks Davis, Ben Laurie, Peter G. Neumann, Robert Norton, and Michael Roe. The CHERI capability model: Revisiting RISC in an age of risk. In Proc. ISCA, 2014.Google ScholarCross Ref
Vadim Zaliva, Kayvan Memarian, Ricardo Almeida, Jessica Clarke, Brooks Davis, Alex Richardson, David Chisnall, Brian Campbell, Ian Stark, Robert N. M. Watson, and Peter Sewell. CHERI C semantics as an extension of the ISO C17 standard. Technical Report UCAM-CL-TR-988, University of Cambridge, Computer Laboratory, 15 JJ Thomson Avenue, Cambridge CB3 0FD, United Kingdom, phone +44 1223 763500. URL: https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-988.html.Google Scholar

Recommendations

Towards a formal semantics for the AADL behavior annex
DATE '09: Proceedings of the Conference on Design, Automation and Test in Europe

AADL is an Architecture Description Language which describes embedded real-time systems. Behavior annex is an extension of the dispatch mechanism of AADL execution model. This paper proposes a formal semantics for the AADL behavior annex using Timed ...
Read More
A Formal CHERI-C Semantics for Verification
Tools and Algorithms for the Construction and Analysis of Systems
Abstract
CHERI-C extends the C programming language by adding hardware capabilities, ensuring a certain degree of memory safety while remaining efficient. Capabilities can also be employed for higher-level security measures, such as software ...
Read More
Equivalence of formal semantics definition methods
Abstract
There are numerous methods of formally defining the semantics of computer languages. Each method has been designed to fulfil a different purpose. For example, some have been designed to make reasoning about languages as easy as possible; others ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1
April 2024
494 pages
ISBN:9798400703720
DOI:10.1145/3617232
General Chairs:
Nael Abu-Ghazaleh,
Rajiv Gupta,
Program Chairs:
Madan Musuvathi,
Dan Tsafrir
This work is licensed under a Creative Commons Attribution International 4.0 License.
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 17 April 2024
Check for updates
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate535of2,713submissions,20%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 0
  Total Citations
  View Citations
- 382
  Total Downloads
- Downloads (Last 12 months)382
- Downloads (Last 6 weeks)382
Other Metrics
View Author Metrics
Cited By
This publication has not been cited yet

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Formal Mechanised Semantics of CHERI C: Capabilities, Undefined Behaviour, and Provenance

ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1

ABSTRACT

References

Cited By

Recommendations

Towards a formal semantics for the AADL behavior annex

A Formal CHERI-C Semantics for Verification

Equivalence of formal semantics definition methods