 ========================================
 LLVM IR Generation for EH and Cleanups
 ========================================

 .. contents::
    :local:

 Overview
 ========

 This document describes how Clang's LLVM IR generation represents exception
 handling (EH) and C++ cleanups. It focuses on the data structures and control
 flow patterns used to model normal and exceptional exits, and it outlines how
 the generated IR differs across common ABI models.

 For details on the LLVM IR representation of exception handling, see
 `LLVM Exception Handling <https://llvm.org/docs/ExceptionHandling.html>`_.

 Core Model
 ==========

 EH and cleanup handling is centered around an ``EHScopeStack`` that records
 nested scopes for:

 - **Cleanups**, which run on normal control flow, exceptional control flow, or
   both. These are used for destructors, full-expression cleanups, and other
   scope-exit actions.
 - **Catch scopes**, which represent ``try``/``catch`` handlers.
 - **Filter scopes**, used to model dynamic exception specifications and some
   platform-specific filters.
 - **Terminate scopes**, used for ``noexcept`` and similar termination paths.

 Each cleanup is a small object with an ``Emit`` method. When a cleanup scope is
 popped, the IR generator decides whether it must materialize a normal cleanup
 block (for fallthrough, branch-through, or unresolved ``goto`` fixups) and/or an
 EH cleanup entry (when exceptional control flow can reach the cleanup). This
 results in a flattened CFG where cleanup lifetime is represented by the blocks
 and edges that flow into those blocks.
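The pop-time decision described above can be sketched with a toy model. This is only an illustration under simplifying assumptions: ``ToyScopeStack``, ``ToyCleanup``, ``CleanupKind``, and the ``.normal`` / ``.eh`` block names are hypothetical and are not Clang's actual classes or block names.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model; this is NOT Clang's EHScopeStack.
enum class CleanupKind { NormalOnly, EHOnly, NormalAndEH };

struct ToyCleanup {
  CleanupKind kind;
  std::string name;
};

class ToyScopeStack {
  std::vector<ToyCleanup> scopes;

public:
  void push(CleanupKind kind, std::string name) {
    scopes.push_back(ToyCleanup{kind, std::move(name)});
  }

  bool empty() const { return scopes.empty(); }

  // Pops the innermost cleanup and reports which blocks would be
  // materialized for it. Precondition: the stack is not empty.
  std::vector<std::string> pop(bool hasNormalExits, bool hasEHExits) {
    ToyCleanup cleanup = scopes.back();
    scopes.pop_back();

    bool runsOnNormal = cleanup.kind != CleanupKind::EHOnly;
    bool runsOnEH = cleanup.kind != CleanupKind::NormalOnly;

    std::vector<std::string> blocks;
    if (runsOnNormal && hasNormalExits)
      blocks.push_back(cleanup.name + ".normal"); // normal cleanup block
    if (runsOnEH && hasEHExits)
      blocks.push_back(cleanup.name + ".eh"); // EH cleanup entry
    return blocks;
  }
};
```

In this model, a cleanup that nothing can reach on a given path produces no block for that path, which mirrors the idea that blocks are only materialized when needed.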

 Key Components
 ==============

 The LLVM IR generation for EH and cleanups is spread across several core
 components:

 - ``CodeGenModule`` owns module-wide state such as the LLVM module, target
   information, and the selected EH personality function. It provides access to
   ABI helpers via ``CGCXXABI`` and target-specific hooks.
 - ``CodeGenFunction`` manages per-function state and IR building. It owns the
   ``EHScopeStack``, tracks the current insertion point, and emits blocks, calls,
   and branches. Most cleanup and EH control flow is built here.
 - ``EHScopeStack`` is the central stack of scopes used to model EH and cleanup
   semantics. It stores ``EHCleanupScope`` entries for cleanups, along with
   ``EHCatchScope``, ``EHFilterScope``, and ``EHTerminateScope`` for handlers and
   termination logic.
 - ``EHCleanupScope`` stores the cleanup object plus state data (active flags,
   fixup depth, and enclosing scope links). When a cleanup scope is popped,
   ``CodeGenFunction`` decides whether to emit a normal cleanup block, an EH
   cleanup entry, or both.
 - Cleanup emission helpers implement the mechanics of branching through
   cleanups, threading fixups, and emitting cleanup blocks.
 - Exception emission helpers implement landing pads, dispatch blocks,
   personality selection, and helper routines for try/catch, filters, and
   terminate handling.
 - ``CGCXXABI`` (and its ABI-specific implementations such as
   ``ItaniumCXXABI`` and ``MicrosoftCXXABI``) provide ABI-specific lowering for
   throws, catch handling, and destructor emission details.
 - The cleanup and exception handling code generation is driven by the flow of
   ``CodeGenFunction`` and its helper classes traversing the AST to emit IR for
   C++ expressions, classes, and statements.

 In short, AST traversal in ``CodeGenFunction`` emits code and pushes cleanups
 or EH scopes; ``EHScopeStack`` records scope nesting; the cleanup and exception
 helpers materialize the CFG as scopes are popped; and ``CGCXXABI`` supplies
 ABI-specific details for landing pads or funclets.

 Cleanup Destination Routing
 ===========================

 When multiple control flow exits (``return``, ``break``, ``continue``,
 fallthrough) pass through the same cleanup, the generated IR shares a single
 cleanup block among them. Before entering the cleanup, each exit path stores a
 unique index into a "cleanup destination" slot. After the cleanup code runs, a
 ``switch`` instruction loads this index and dispatches to the appropriate final
 destination. This avoids duplicating cleanup code for each exit while preserving
 correct control flow.

 For example, if a function has both a ``return`` and a ``break`` that exit
 through the same destructor cleanup, both paths branch to the shared cleanup
 block after storing their respective destination indices. The cleanup epilogue
 then switches on the stored index to reach either the return block or the
 loop-exit block.

 When only a single exit passes through a cleanup (the common case), the switch
 is unnecessary and the cleanup block branches directly to its sole destination.
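The routing scheme can be written out by hand in C++ as a rough analogy. The sketch below is not generated code: the ``Dest`` indices, the ``Exit`` enum, and ``runScope`` are hypothetical names chosen for illustration, and the trace strings stand in for basic blocks.

```cpp
#include <string>
#include <vector>

// Hypothetical destination indices; the real values are chosen by Clang.
enum Dest { ToFallthrough = 0, ToLoopExit = 1, ToReturn = 2 };

enum class Exit { Fallthrough, Break, Return };

// Models one pass through a scope whose exits share a single cleanup block.
std::vector<std::string> runScope(Exit exit) {
  std::vector<std::string> trace;
  int cleanupDest = ToFallthrough; // the "cleanup destination" slot

  // Each exit path stores its own index before branching to the cleanup.
  switch (exit) {
  case Exit::Fallthrough:
    cleanupDest = ToFallthrough;
    break;
  case Exit::Break:
    cleanupDest = ToLoopExit;
    break;
  case Exit::Return:
    cleanupDest = ToReturn;
    break;
  }

  // Shared cleanup block: written once, reached from every exit.
  trace.push_back("cleanup");

  // Cleanup epilogue: switch on the stored index to pick the destination.
  switch (cleanupDest) {
  case ToLoopExit:
    trace.push_back("loop.exit");
    break;
  case ToReturn:
    trace.push_back("return");
    break;
  default:
    trace.push_back("fallthrough");
    break;
  }
  return trace;
}
```

Every exit kind records ``cleanup`` exactly once before its own destination, which is the property the shared block is meant to provide.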

 Branch Fixups for Forward Gotos
 -------------------------------

 A ``goto`` statement that jumps forward to a label not yet seen poses a special
 problem. The destination's enclosing cleanup scope is unknown at the point the
 ``goto`` is emitted. This is handled by emitting an optimistic branch and
 recording a "fixup." When the cleanup scope is later popped, any recorded fixups
 are resolved by rewriting the branch to thread through the cleanup block and
 adding the destination to the cleanup's switch.
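A small source-level example of the situation that needs a fixup is sketched below. ``Tracer`` and ``jumpOut`` are hypothetical names; the point is only that the ``goto`` is seen before the label ``done``, yet the destructor of ``t`` must still run on that path.

```cpp
#include <string>
#include <vector>

// Hypothetical helper that records when its destructor runs.
struct Tracer {
  std::vector<std::string> &log;
  explicit Tracer(std::vector<std::string> &l) : log(l) {}
  ~Tracer() { log.push_back("~Tracer"); }
};

std::vector<std::string> jumpOut(bool takeGoto) {
  std::vector<std::string> log;
  {
    Tracer t(log);
    if (takeGoto)
      goto done; // forward goto: 'done' has not been seen yet
    log.push_back("body");
  } // the cleanup for 't' is popped here; pending fixups get resolved
  log.push_back("after-scope");
done:
  log.push_back("done");
  return log;
}
```

On the ``goto`` path the log contains ``~Tracer`` followed by ``done``, so the jump was threaded through the cleanup even though its target was unknown when the branch was first written.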

 Exceptional Cleanups and EH Dispatch
 ====================================

 Exceptional exits (``throw``, ``invoke`` unwinds) are routed through EH cleanup
 entries, which are reached via a landing pad or a funclet dispatch block,
 depending on the target ABI.

 For Itanium-style EH (used, for example, on x86-64 Linux), the IR uses ``invoke``
 to call potentially-throwing operations and a ``landingpad`` instruction to
 capture the exception and selector values. The landing pad aggregates any
 catch and cleanup clauses for the current scope, and branches to a dispatch
 block that compares the selector to type IDs and jumps to the appropriate
 handler.

 For Windows, LLVM IR uses funclet-style EH: ``catchswitch`` and ``catchpad`` for
 handlers, and ``cleanuppad`` for cleanups, with ``catchret`` and ``cleanupret``
 edges to resume normal flow. The personality function determines how these pads
 are interpreted by the backend.
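The observable effect of this routing is the same under either scheme: the EH cleanup runs first, then the handler matching the thrown type is entered. The following source-level sketch shows that ordering; ``Guard``, ``mayThrow``, and ``dispatch`` are hypothetical names, and the comments about ``invoke`` describe what one would expect on an Itanium-style target rather than verified compiler output.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper that records when its destructor runs.
struct Guard {
  std::vector<std::string> &log;
  explicit Guard(std::vector<std::string> &l) : log(l) {}
  ~Guard() { log.push_back("~Guard"); }
};

void mayThrow(int which) {
  if (which == 1)
    throw 42;
  if (which == 2)
    throw std::runtime_error("boom");
}

std::vector<std::string> dispatch(int which) {
  std::vector<std::string> log;
  try {
    Guard g(log);
    mayThrow(which); // a potentially-throwing call inside a try scope
    log.push_back("no throw");
  } catch (int) {
    log.push_back("catch int");
  } catch (const std::exception &) {
    log.push_back("catch std::exception");
  }
  return log;
}
```

For ``which == 1`` the log is ``~Guard`` then ``catch int``: the cleanup for ``g`` runs on the exceptional path before dispatch selects the ``int`` handler.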

 Personality and ABI Selection
 =============================

 Each function with exception handling constructs is associated with a
 personality function (e.g., ``__gxx_personality_v0`` for C++ on Linux). The
 personality function determines the ABI-specific EH behavior of the
 function. IR generation selects a personality function based on language
 options and the target ABI (e.g., Itanium, MSVC SEH, SJLJ, Wasm EH). This
 decision affects:

 - Whether the IR uses landing pads or funclet pads.
 - The shape of dispatch logic for catch and filter scopes.
 - How termination or rethrow paths are modeled.
 - Whether certain helper functions such as exception filters must be outlined.

 Because the personality choice is made during IR generation, the CFG shape
 directly reflects ABI-specific details.

 Example: Array of Objects with Throwing Constructor
 ===================================================

 Consider:

 .. code-block:: c++

   class MyClass {
   public:
     MyClass(); // may throw
     ~MyClass();
   };
   void doSomething(); // may throw
   void f() {
     MyClass arr[4];
     doSomething();
   }

 High-level behavior
 -------------------

 - Construction of ``arr`` proceeds element-by-element. If an element constructor
   throws, destructors must run for any elements that were successfully
   constructed before the throw, in reverse order of construction.
 - After full construction, the call to ``doSomething`` may throw, in which case
   the destructors for all constructed elements must run, in reverse order.
 - On normal exit, destructors for all elements run in reverse order.
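The three behaviors listed above can be observed with a small instrumented variant of the example. This is a sketch for illustration: ``Elem``, ``buildArray``, and the global counters are hypothetical, and ``useThrows`` stands in for ``doSomething`` throwing.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical instrumentation state for this sketch.
static std::vector<std::string> g_log;
static int g_nextId = 0;
static int g_throwAt = -1; // id of the element whose constructor throws

struct Elem {
  int id;
  Elem() : id(g_nextId++) {
    if (id == g_throwAt)
      throw std::runtime_error("ctor failed");
    g_log.push_back("ctor " + std::to_string(id));
  }
  ~Elem() { g_log.push_back("dtor " + std::to_string(id)); }
};

std::vector<std::string> buildArray(int throwAt, bool useThrows) {
  g_log.clear();
  g_nextId = 0;
  g_throwAt = throwAt;
  try {
    Elem arr[4]; // constructed element by element
    (void)arr;
    if (useThrows) // stands in for doSomething() throwing
      throw std::runtime_error("use failed");
    g_log.push_back("use");
  } catch (const std::runtime_error &) {
    g_log.push_back("caught");
  }
  return g_log;
}
```

With ``buildArray(2, false)`` only elements 0 and 1 are destroyed, in the order 1 then 0, because element 2 never finished construction. With ``buildArray(-1, true)`` all four elements are destroyed from 3 down to 0 before the handler runs.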

 Codegen flow and key components
 -------------------------------

 - The surrounding compound statement enters a ``CodeGenFunction::LexicalScope``,
   which is a ``RunCleanupsScope`` and is responsible for popping local cleanups
   at the end of the block.
 - ``CodeGenFunction::EmitDecl`` routes the local variable to
   ``CodeGenFunction::EmitVarDecl`` and then ``CodeGenFunction::EmitAutoVarDecl``,
   which in turn calls ``EmitAutoVarAlloca``, ``EmitAutoVarInit``, and
   ``EmitAutoVarCleanups``.
 - ``CodeGenFunction::EmitCXXAggrConstructorCall`` emits the array constructor
   loop. While emitting the loop body, it enters a ``RunCleanupsScope`` and uses
   ``CodeGenFunction::pushRegularPartialArrayCleanup`` to register a
   cleanup before calling ``CodeGenFunction::EmitCXXConstructorCall`` for one
   element in the loop iteration. If this constructor were to throw an exception,
   the cleanup handler would destroy the previously constructed elements in
   reverse order.
 - ``CodeGenFunction::EmitAutoVarCleanups`` calls ``emitAutoVarTypeCleanup``,
   which ultimately registers a ``DestroyObject`` cleanup via
   ``CodeGenFunction::pushDestroy`` / ``pushFullExprCleanup`` for the full-array
   destructor path.
 - ``DestroyObject`` uses ``CodeGenFunction::destroyCXXObject``, which emits the
   actual destructor call via ``CodeGenFunction::EmitCXXDestructorCall``.
 - Cleanup emission helpers (e.g., ``CodeGenFunction::PopCleanupBlock`` and
   ``CodeGenFunction::EmitBranchThroughCleanup``) thread both normal and EH exits
   through the cleanup blocks as scopes are popped.
 - The cleanup is represented as an ``EHCleanupScope`` on ``EHScopeStack``, and
   its ``Emit`` method generates a loop that calls the destructor on the
   initialized range in reverse order.

 The above function names and flow are accurate as of LLVM 22.0, but this is
 subject to change as the code evolves, and this document might not be updated to
 reflect the exact functions used.

 Example: Temporary object materialization
 =========================================

 Consider:

 .. code-block:: c++

   class MyClass {
   public:
     MyClass();
     ~MyClass();
   };
   void useMyClass(const MyClass &);
   void f() {
     useMyClass(MyClass());
   }

 High-level behavior
 -------------------

 - The temporary ``MyClass`` is materialized for the call argument.
 - The temporary must be destroyed at the end of the full-expression, both on
   the normal path and on the exceptional path if ``useMyClass`` throws.
 - If the constructor throws, the temporary is not considered constructed and no
   destructor runs.
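The full-expression lifetime can be checked with an instrumented variant of the example. The names ``Temp``, ``useTemp``, ``fullExpression``, and ``g_trace`` are hypothetical, and the callee takes the temporary by ``const`` reference so that the call is valid standard C++.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical instrumentation state for this sketch.
static std::vector<std::string> g_trace;

struct Temp {
  Temp() { g_trace.push_back("ctor"); }
  ~Temp() { g_trace.push_back("dtor"); }
};

void useTemp(const Temp &, bool shouldThrow) {
  g_trace.push_back("use");
  if (shouldThrow)
    throw std::runtime_error("use failed");
}

std::vector<std::string> fullExpression(bool shouldThrow) {
  g_trace.clear();
  try {
    useTemp(Temp(), shouldThrow); // temporary lives until the ';'
    g_trace.push_back("next");
  } catch (const std::runtime_error &) {
    g_trace.push_back("caught");
  }
  return g_trace;
}
```

On both paths ``dtor`` appears immediately after ``use``: before ``next`` on the normal path and before ``caught`` on the exceptional path.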

 Codegen flow and key functions
 ------------------------------

 - ``CodeGenFunction::EmitExprWithCleanups`` wraps the full-expression in a
   ``RunCleanupsScope`` so that full-expression cleanups are run after the call.
 - ``CodeGenFunction::EmitMaterializeTemporaryExpr`` creates storage for the
   temporary via ``createReferenceTemporary`` and initializes it. For record
   temporaries this flows through ``EmitAnyExprToMem`` and
   ``CodeGenFunction::EmitCXXConstructExpr``, which calls
   ``CodeGenFunction::EmitCXXConstructorCall``.
 - ``pushTemporaryCleanup`` registers the destructor as a full-expression
   cleanup by calling ``CodeGenFunction::pushDestroy`` for
   ``SD_FullExpression`` temporaries.
 - The cleanup ultimately uses ``DestroyObject`` and
   ``CodeGenFunction::destroyCXXObject``, which emits
   ``CodeGenFunction::EmitCXXDestructorCall``.

 The above function names and flow are accurate as of LLVM 22.0, but this is
 subject to change as the code evolves, and this document might not be updated to
 reflect the exact functions used.