(home)=

Welcome to the ExecuTorch Documentation

ExecuTorch is PyTorch's solution for training and inference on edge devices.

Key Value Propositions

Portability: Compatibility with a wide variety of computing platforms, from high-end mobile phones to highly constrained embedded systems and microcontrollers.
Productivity: The same toolchains and Developer Tools at every stage, from PyTorch model authoring and conversion to debugging and deployment on a wide variety of platforms.
Performance: A seamless, high-performance experience for end users, made possible by a lightweight runtime that uses the full capabilities of hardware such as CPUs, NPUs, and DSPs.

ExecuTorch provides support for:

Strong Model Support: LLMs (Large Language Models), CV (Computer Vision), ASR (Automatic Speech Recognition), TTS (Text To Speech)
All Major Platforms: Android, Mac, Linux, Windows
Rich Acceleration Support: Apple, Arm, Cadence, MediaTek, Qualcomm, Vulkan, XNNPACK

Documentation Navigation

Introduction

Overview
How it Works
Getting Started with Architecture
Concepts

Usage

Getting Started
Using ExecuTorch Export
Using ExecuTorch on Android
Using ExecuTorch on iOS
Using ExecuTorch with C++
Runtime Integration
Troubleshooting
Building from Source
FAQs

Examples

Android Demo Apps
iOS Demo Apps
Hugging Face Models

Backends

Overview
XNNPACK
Core ML
MPS
Vulkan
ARM Ethos-U
Qualcomm
MediaTek
Cadence

Developer Tools

Overview
Bundled IO
ETRecord
ETDump
Runtime Profiling
Model Debugging
Model Inspector
Memory Planning Inspection
Delegate Debugging
Tutorial

Runtime

Overview
Extension Module
Extension Tensor
Running a Model (C++ Tutorial)
Backend Delegate Implementation and Linking
Platform Abstraction Layer

Portable C++ Programming

PTE File Format

API Reference

Export to ExecuTorch API Reference
ExecuTorch Runtime API Reference
Runtime Python API Reference
API Life Cycle
Javadoc

Quantization

Overview

Kernel Library

Overview
Custom ATen Kernel
Selective Build

Working with LLMs

Llama
Llama on Android
Llama on iOS
Llama on Android via Qualcomm backend
Intro to LLMs in ExecuTorch

Backend Development

Delegates Integration
XNNPACK Reference
Dependencies
Compiler Delegate and Partitioner
Debug Backend Delegate

IR Specification

EXIR
Ops Set Definition

Compiler Entry Points

Backend Dialect
Custom Compiler Passes
Memory Planning

Contributing

Contributing

:glob:
:maxdepth: 1
:caption: Introduction
:hidden:

intro-overview
intro-how-it-works
getting-started-architecture
concepts

:glob:
:maxdepth: 1
:caption: Usage
:hidden:

getting-started
using-executorch-export
using-executorch-android
using-executorch-ios
using-executorch-cpp
using-executorch-runtime-integration
using-executorch-troubleshooting
using-executorch-building-from-source
using-executorch-faqs

:glob:
:maxdepth: 1
:caption: Examples
:hidden:

Building an ExecuTorch Android Demo App <https://github.com/pytorch-labs/executorch-examples/tree/main/dl3/android/DeepLabV3Demo#executorch-android-demo-app>
Building an ExecuTorch iOS Demo App <https://github.com/pytorch-labs/executorch-examples/tree/main/mv3/apple/ExecuTorchDemo>
tutorial-arm-ethos-u.md

:glob:
:maxdepth: 1
:caption: Backends
:hidden:

backends-overview
backends-xnnpack
backends-coreml
backends-mps
backends-vulkan
backends-arm-ethos-u
backends-qualcomm
backends-mediatek
backends-cadence

:glob:
:maxdepth: 1
:caption: Developer Tools
:hidden:

devtools-overview
bundled-io
etrecord
etdump
runtime-profiling
model-debugging
model-inspector
memory-planning-inspection
delegate-debugging
devtools-tutorial

:glob:
:maxdepth: 1
:caption: Runtime
:hidden:

runtime-overview
extension-module
extension-tensor
running-a-model-cpp-tutorial
runtime-backend-delegate-implementation-and-linking
runtime-platform-abstraction-layer
portable-cpp-programming
pte-file-format

:glob:
:maxdepth: 1
:caption: API Reference
:hidden:

export-to-executorch-api-reference
executorch-runtime-api-reference
runtime-python-api-reference
api-life-cycle
Javadoc <https://pytorch.org/executorch/main/javadoc/>

:glob:
:maxdepth: 1
:caption: Quantization
:hidden:

quantization-overview

:glob:
:maxdepth: 1
:caption: Kernel Library
:hidden:

kernel-library-overview
kernel-library-custom-aten-kernel
kernel-library-selective-build

:glob:
:maxdepth: 2
:caption: Working with LLMs
:hidden:

Llama <llm/llama>
Llama on Android <llm/llama-demo-android>
Llama on iOS <llm/llama-demo-ios>
Llama on Android via Qualcomm backend <llm/build-run-llama3-qualcomm-ai-engine-direct-backend>
Intro to LLMs in ExecuTorch <llm/getting-started>

:glob:
:maxdepth: 1
:caption: Backend Development
:hidden:

backend-delegates-integration
backend-delegates-xnnpack-reference
backend-delegates-dependencies
compiler-delegate-and-partitioner
debug-backend-delegate

:glob:
:maxdepth: 1
:caption: IR Specification
:hidden:

ir-exir
ir-ops-set-definition

:glob:
:maxdepth: 1
:caption: Compiler Entry Points
:hidden:

compiler-backend-dialect
compiler-custom-compiler-passes
compiler-memory-planning

:glob:
:maxdepth: 1
:caption: Contributing
:hidden:

contributing
