
Tesla Full Self-Driving (FSD) / Autopilot: Comprehensive Technical Stack

Last Updated: March 2026


Table of Contents

  1. Company Overview
  2. Vehicle Platform
  3. Sensor Suite
  4. Onboard Compute
  5. Autonomy Software Stack
  6. Machine Learning & AI
  7. Training Infrastructure
  8. Data Engine
  9. Simulation Platform
  10. Cloud & Data Infrastructure
  11. Programming Languages & Tools
  12. Safety Architecture
  13. Fleet Operations
  14. Regulatory
  15. Research & Publications
  16. CyberCab

1. Company Overview

Organization & Leadership

Tesla's autonomous vehicle program operates under the broader Tesla AI umbrella, led by CEO Elon Musk with day-to-day technical leadership from Ashok Elluswamy, Vice President of AI Software. Elluswamy was the first software engineer hired for Autopilot in June 2014, recruited after Musk solicited engineers via Twitter. He rose to Director of Autopilot Software by May 2019 and was promoted to VP of AI Software in 2024. In late 2025 his responsibilities expanded to include the Optimus humanoid robot program following the departure of Milan Kovac, creating unified AI leadership across FSD, Optimus, and Tesla's simulation infrastructure.

Other notable figures in Tesla's AI/autonomy history include:

Name | Role | Tenure
Andrej Karpathy | Sr. Director of AI | 2017--2022
Pete Bannon | VP of Hardware Engineering (FSD Chip architect) | 2016--present
Ganesh Venkataramanan | Director of Hardware (Dojo architect) | 2016--2023
Milan Kovac | VP of Optimus | 2022--2025
Stuart Bowers | VP of Engineering, Autopilot | 2017--2019

FSD Timeline & Milestones

Date | Milestone
Oct 2014 | Autopilot HW1 ships (Mobileye EyeQ3) on Model S
Oct 2016 | HW2.0 announced: 8 cameras, radar, 12 ultrasonics, NVIDIA Drive PX2
Apr 2019 | HW3 (FSD Computer) unveiled at Autonomy Day; custom Tesla-designed chip
Jul 2020 | FSD Beta internal testing begins
Oct 2020 | FSD Beta limited public release
Aug 2021 | AI Day 1: HydraNet, BEV, Dojo D1 chip revealed
Sep 2022 | AI Day 2: Occupancy Networks, planning architecture detailed
Oct 2022 | Ultrasonic sensor removal begins (Model 3/Y)
Mar 2023 | HW4 (AI4) begins shipping in new vehicles
Nov 2023 | FSD v12 released: first end-to-end neural network stack, replacing ~300,000 lines of C++
Aug 2024 | FSD v12.5 brings end-to-end to highway driving
Nov 2024 | FSD v13 released (HW4 only): temporal-voxel transformer model
Jun 2025 | Tesla Robotaxi pilot launches in Austin, TX (with safety monitors)
Oct 2025 | FSD v14 released: 10x larger neural network model
Jan 2026 | Unsupervised robotaxi rides begin in Austin (trailing safety vehicles)
Feb 2026 | First production CyberCab rolls off line at Giga Texas
Apr 2026 | CyberCab volume production begins (planned)

2. Vehicle Platform

Consumer Vehicles as AV Platforms

Every Tesla sold since October 2016 has shipped with hardware capable of running Autopilot and FSD software. The fleet of 9+ million vehicles worldwide (as of early 2026) constitutes both a consumer product and a massive data-collection platform.

Vehicle | Status | FSD Hardware | Notes
Model S | Production (ending Q2 2026) | HW3 / HW4 (varies by build date) | Fremont factory repurposing to Optimus
Model 3 | Active production | HW3 (pre-2023) / HW4 (2023+) | Highest-volume FSD platform
Model X | Production (ending Q2 2026) | HW3 / HW4 (varies by build date) | Fremont factory repurposing
Model Y | Active production | HW3 (pre-2023) / HW4 (2023+) | Best-selling EV globally; robotaxi pilot vehicle
Cybertruck | Active production | HW4 (all units) | First vehicle with HW4-native cameras
CyberCab | Pre-production (Feb 2026) | AI5 (planned) | Purpose-built robotaxi, no steering wheel or pedals

Optimus Integration

Tesla's Optimus humanoid robot shares foundational AI architecture with FSD:

  • Same end-to-end neural network paradigm (cameras in, actions out)
  • Same occupancy networks for 3D spatial understanding
  • Same training infrastructure (Cortex GPU clusters)
  • Same world simulator for closed-loop training
  • Runs adapted FSD neural networks optimized for bipedal navigation and manipulation
  • Integrates xAI's Grok LLM for conversational AI (language), while FSD-derived nets handle movement
  • V3 prototype unveiling planned Q1 2026; 50,000 units targeted by 2026
  • Fremont Model S/X lines being repurposed for Optimus manufacturing

3. Sensor Suite

Philosophy: Camera-Only "Tesla Vision"

Tesla is the only major AV company pursuing a pure-vision approach with no LiDAR and (since 2021--2023) no radar or ultrasonic sensors. The rationale, as articulated by Musk and Karpathy, is that humans drive using vision alone, so a sufficiently capable neural network should be able to do the same with cameras.

Sensor Removal Timeline

Sensor Type | Removal Start | Models Affected | Replacement
Front radar | May 2021 | Model 3/Y first, then S/X (2022) | Tesla Vision (camera-only)
Ultrasonic sensors (12x) | Oct 2022 | Model 3/Y (NA, Europe, ME, Taiwan) | Vision-based occupancy network
Ultrasonic sensors (12x) | 2023 | Model S/X (all markets) | Vision-based occupancy network

Camera Specifications

HW3 Camera System (8 external + 1 cabin)

Camera Position | Count | Resolution | FOV | Max Range | Sensor
Narrow forward (windshield) | 1 | 1280 x 960 (1.2 MP) | ~35 deg | 250 m | ON Semi / Aptina
Main forward (windshield) | 1 | 1280 x 960 (1.2 MP) | ~50 deg | 150 m | ON Semi / Aptina
Wide forward (windshield) | 1 | 1280 x 960 (1.2 MP) | ~150 deg | 60 m | ON Semi / Aptina
B-pillar (left) | 1 | 1280 x 960 (1.2 MP) | ~90 deg | 80 m | ON Semi / Aptina
B-pillar (right) | 1 | 1280 x 960 (1.2 MP) | ~90 deg | 80 m | ON Semi / Aptina
Side repeater (left fender) | 1 | 1280 x 960 (1.2 MP) | ~90 deg | 80 m | ON Semi / Aptina
Side repeater (right fender) | 1 | 1280 x 960 (1.2 MP) | ~90 deg | 80 m | ON Semi / Aptina
Rear (above license plate) | 1 | 1280 x 960 (1.2 MP) | ~130 deg | 50 m | ON Semi / Aptina
Cabin (above rearview mirror) | 1 | -- | -- | -- | IR-capable

HW4 (AI4) Camera System

Camera Position | Count | Resolution | FOV | Sensor
Forward cameras (windshield) | 2 active + 1 dummy | 2896 x 1876 (5.4 MP) | Wide + Main | Sony IMX490 / IMX963
B-pillar (left/right) | 2 | 2896 x 1876 (5.4 MP) | ~90 deg | Sony IMX490
C-pillar (left/right) -- new position | 2 | 2896 x 1876 (5.4 MP) | ~90 deg | Sony IMX490
Rear | 1 | 2896 x 1876 (5.4 MP) | Wide fish-eye | Sony IMX490
Cabin | 1 | -- | -- | IR + visible

Key HW3 vs. HW4 camera differences:

  • Resolution jumps from 1.2 MP to 5.4 MP (4.5x increase)
  • Red-tinted lenses on HW4 for improved HDR and low-light performance
  • C-pillar cameras are a new position (replacing fender-mounted repeaters on some models)
  • HW4 supports up to 13 camera inputs (future-proofing)
  • 2025 updates added front bumper cameras across the lineup

4. Onboard Compute

Hardware Generations

Generation | Chip | Process | Year | Designed By | Fab
HW1 | Mobileye EyeQ3 | 40 nm | 2014 | Mobileye | STMicro
HW2 / HW2.5 | NVIDIA Drive PX2 (Parker SoC) | 16 nm | 2016 / 2017 | NVIDIA | TSMC
HW3 | Tesla FSD Chip | 14 nm | 2019 | Tesla (Pete Bannon) | Samsung
HW4 (AI4) | Tesla FSD Chip 2 | 7 nm | 2023 | Tesla | Samsung
AI5 | Tesla AI5 | 5 nm (est.) | 2026 (late) | Tesla | TSMC (initial), Samsung
AI6 | Tesla AI6 | 3 nm (est.) | 2027+ | Tesla | Samsung (Austin, TX fab)

HW3 -- FSD Computer (Detailed Specs)

Parameter | Specification
SoC | Tesla FSD Chip (dual-chip, redundant)
Process | Samsung 14 nm FinFET
Die Size | 260 mm^2
Transistors | 6 billion
CPU | 3x quad-core ARM Cortex-A72 clusters (12 cores total) @ 2.2--2.6 GHz
GPU | ARM Mali G71 MP12 @ 1 GHz
NPU | 2x neural processing units (systolic arrays) @ 2 GHz
NPU Performance | 36 TOPS per NPU, 72 TOPS per chip, 144 TOPS total (dual chip)
Frames/sec | 2,300 FPS processing capacity
RAM | 8 GB LPDDR4 (68 GB/s peak bandwidth)
Storage | 64 GB eMMC
Power | ~72 W
Redundancy | Dual FSD Chips; both compute same data, compare outputs

HW4 (AI4) -- FSD Computer 2 (Detailed Specs)

Parameter | Specification
SoC | Tesla FSD Chip 2 (dual-chip)
Process | Samsung 7 nm
CPU | 20 cores per side @ 2.35 GHz (max), 1.37 GHz (idle)
NPU | 3x neural processing units per chip @ 2.2 GHz
NPU Performance | ~50 TOPS per NPU, ~121 TOPS per chip
Overall Performance | 3--8x faster than HW3 (per Musk)
RAM | 16 GB Micron GDDR6 @ 14 Gbps on 128-bit bus (224 GB/s)
Storage | 256 GB NVMe
Camera Inputs | Up to 13 camera streams
Redundancy | Dual-node architecture with cross-comparison

AI5 / AI6 -- Next-Generation Chips

Parameter | AI5 | AI6
Foundry | TSMC (Taiwan/Arizona) | Samsung (Austin, TX)
Status | Design complete, tape-out pending | In development
Production | Late 2026 (small batch), 2027 (volume) | ~2027--2028
Architecture | Inference + training capable | Integrates Dojo supercomputer chip architecture
Target Applications | FSD, Optimus, CyberCab | FSD, Optimus, training
Contract Value | -- | $16.5 billion (Samsung deal)
Performance Target | Significant leap over AI4 | ~2x AI5 performance

Dojo D1 Chip (Training)

Parameter | Specification
Process | TSMC 7 nm
Transistors | 50 billion
Die Size | 645 mm^2
Cores | 354 specialized ML cores
FP32 Performance | 22.6 TFLOPS
BF16/CFP8 Performance | 362 TFLOPS
TDP | 400 W
ISA | Custom ML-focused instruction set
On-Chip SRAM | --
I/O Bandwidth | 2x bandwidth of state-of-the-art networking switch chips

5. Autonomy Software Stack

Architecture Evolution

Tesla's autonomy stack has undergone three major architectural phases:

Phase 1 (2016--2021): Modular Pipeline
  Camera Images -> Per-Camera CNNs -> Object Detection + Lane Lines ->
  HD Map Lookup -> Rule-Based Planner -> PID Controller -> Vehicle Controls

Phase 2 (2021--2023): BEV + Occupancy + Hybrid Planner
  Multi-Camera Images -> Shared Backbone (RegNet) -> BEV Transformer ->
  Occupancy Network (3D voxels) + Lane Network + Object Network ->
  Monte-Carlo Tree Search + Neural Network Planner -> Vehicle Controls

Phase 3 (2023--present): End-to-End Neural Network
  Multi-Camera Video (8 cams x N frames) -> Vision Transformer ->
  Learned World Model -> Neural Network Planner -> Vehicle Controls
  (single unified model, ~2,000--3,000 lines of glue code)

Phase 2: HydraNet + Occupancy (2021--2023)

HydraNet was Tesla's multi-task learning architecture introduced at AI Day 2021:

  • Single shared backbone (RegNet-based) processes all 8 camera feeds
  • Multi-scale feature extraction at multiple resolutions
  • Shared features fed into task-specific decoder heads:
    • Object detection (3D bounding boxes)
    • Semantic segmentation
    • Lane line detection and topology
    • Drivable space estimation
    • Traffic light / sign classification
    • Depth estimation (monocular)
    • Velocity estimation
  • A full build involved 48 neural networks producing 1,000 distinct output tensors per timestep
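A minimal PyTorch-style sketch of this shared-backbone, multi-head pattern; the layer sizes, head set, and module names are illustrative stand-ins, not Tesla's actual architecture:

  # Minimal sketch of a HydraNet-style multi-task network: one shared backbone,
  # several task-specific heads. Layer sizes and head set are illustrative only.
  import torch
  import torch.nn as nn

  class TinyBackbone(nn.Module):
      """Stand-in for the shared RegNet backbone (per-camera feature extractor)."""
      def __init__(self, out_ch=64):
          super().__init__()
          self.net = nn.Sequential(
              nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
              nn.Conv2d(32, out_ch, 3, stride=2, padding=1), nn.ReLU(),
          )
      def forward(self, x):
          return self.net(x)

  class HydraNetSketch(nn.Module):
      def __init__(self, feat_ch=64):
          super().__init__()
          self.backbone = TinyBackbone(feat_ch)          # shared across all cameras
          self.det_head = nn.Conv2d(feat_ch, 7, 1)       # e.g. 3D box regression
          self.seg_head = nn.Conv2d(feat_ch, 10, 1)      # semantic segmentation
          self.depth_head = nn.Conv2d(feat_ch, 1, 1)     # monocular depth

      def forward(self, images):                          # images: (B, cams, 3, H, W)
          b, n, c, h, w = images.shape
          feats = self.backbone(images.view(b * n, c, h, w))
          return {
              "detection": self.det_head(feats),
              "segmentation": self.seg_head(feats),
              "depth": self.depth_head(feats),
          }

  if __name__ == "__main__":
      model = HydraNetSketch()
      out = model(torch.randn(1, 8, 3, 128, 256))        # one frame from 8 cameras
      print({k: tuple(v.shape) for k, v in out.items()})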

Bird's Eye View (BEV) Transformer:

  • Transforms multi-camera 2D image features into a unified top-down spatial representation
  • Uses attention mechanisms to handle the camera-to-BEV projection
  • Enables reasoning about spatial relationships without explicit 3D geometry

Occupancy Network (introduced at AI Day 2022):

  • Predicts a dense 3D voxel grid around the vehicle
  • Each voxel classified as occupied/free + semantic class
  • Outputs Occupancy Volume (what's free vs. occupied) and Occupancy Flow (velocity of occupied voxels)
  • Uses Signed Distance Fields (SDF) rather than binary occupancy for smoother geometry
  • Replaced the need for explicit object detection for many safety-critical decisions
  • Runs at >100 FPS; memory-efficient implementation

Planning (Phase 2):

  • Monte-Carlo Tree Search (MCTS) + neural network evaluator
  • The neural network scores candidate trajectories
  • MCTS explores the trajectory tree to select optimal path
  • Integrated occupancy and lane graph as inputs
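A toy, depth-limited trajectory search in the spirit of the hybrid planner described above; the hand-written cost function stands in for the neural trajectory scorer, and the action set, dynamics, and weights are invented for illustration:

  # Simplified, depth-limited trajectory tree search standing in for the hybrid
  # MCTS + neural-evaluator planner. The "evaluator" here is a hand-written cost;
  # in the real system a neural network scores candidate trajectories.
  from itertools import product

  ACTIONS = [(-0.1, 0.0), (0.0, 0.0), (0.1, 0.0),   # (steer delta, accel) candidates
             (0.0, 1.0), (0.0, -1.0)]

  def rollout(state, action, dt=0.5):
      """Trivial kinematic update: state = (lateral offset, speed)."""
      lat, speed = state
      steer, accel = action
      return (lat + steer * speed * dt, max(0.0, speed + accel * dt))

  def evaluate(state):
      """Stand-in for the neural trajectory scorer (higher is better)."""
      lat, speed = state
      return -abs(lat) * 2.0 - abs(speed - 12.0) * 0.1   # stay centered, near ~12 m/s

  def plan(state, depth=3):
      """Return the best first action by exhaustively expanding the action tree."""
      best_score, best_first = float("-inf"), None
      for seq in product(ACTIONS, repeat=depth):
          s = state
          for a in seq:
              s = rollout(s, a)
          score = evaluate(s)
          if score > best_score:
              best_score, best_first = score, seq[0]
      return best_first, best_score

  if __name__ == "__main__":
      print(plan(state=(0.8, 10.0)))   # vehicle 0.8 m off-center at 10 m/s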

Phase 3: End-to-End (FSD v12+)

Starting with FSD v12 (November 2023), Tesla replaced the modular stack with a single end-to-end neural network:

  • Input: Raw pixel data from 8 cameras (multi-frame video)
  • Output: Steering angle, acceleration, braking commands
  • Replaced approximately 300,000 lines of C++ code with neural networks
  • Remaining code: ~2,000--3,000 lines for network activation, safety monitors, and vehicle interface
  • Training data: 10+ million driving video clips from fleet
  • Training compute: 70,000 GPU-hours per full training cycle
  • Data volume: 1.5+ petabytes of driving data per training run

FSD Version Architecture Progression

Version | Release | Key Architecture Change
v12 | Nov 2023 | First end-to-end; replaces C++ planner with neural net
v12.4 | Jun 2024 | Camera-based driver monitoring; improved E2E model
v12.5 | Aug 2024 | End-to-end extended to highway driving; larger model
v13 | Nov 2024 | Temporal-voxel transformer; 10-second recursive video buffer; HW4 only
v13.3 | 2025 | Single large Vision Transformer for entire pipeline
v14 | Oct 2025 | 10x larger neural network; mixture-of-experts architecture
v14.2 | Nov 2025 | Refinements; 95% reduction in hesitant behaviors

6. Machine Learning & AI

End-to-End Transformer Architecture

FSD v13+ uses a Vision Transformer (ViT) based architecture:

  • Input Tokenization: Raw image patches from 8 cameras are converted to tokens
  • Multi-Head Self-Attention: Spatial and cross-camera attention mechanisms
  • Temporal Attention: Recursive buffer of last 10 seconds of video data
  • BEV Projection: Learned transformation from multi-view image tokens to bird's-eye-view representation
  • Decoder: Outputs control actions (steering, throttle, brake) directly
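A minimal sketch of the camera-tokens-to-controls idea, assuming a generic Vision Transformer encoder; dimensions, layer counts, and the pooling scheme are placeholders rather than Tesla's architecture:

  # Minimal sketch of the end-to-end idea: camera frames -> patch tokens ->
  # transformer -> control outputs. Dimensions and layer counts are illustrative.
  import torch
  import torch.nn as nn

  class EndToEndSketch(nn.Module):
      def __init__(self, patch=16, dim=128):
          super().__init__()
          self.patchify = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)    # tokenizer
          enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
          self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)         # attention over all cameras and frames
          self.control_head = nn.Linear(dim, 3)                                 # steer, accel, brake

      def forward(self, video):
          # video: (B, frames, cams, 3, H, W) -- a short clip from all cameras
          b, t, n, c, h, w = video.shape
          tokens = self.patchify(video.view(b * t * n, c, h, w))                # (B*T*N, dim, H/p, W/p)
          tokens = tokens.flatten(2).transpose(1, 2)                            # (B*T*N, patches, dim)
          tokens = tokens.reshape(b, -1, tokens.shape[-1])                      # all cameras & frames in one sequence
          feats = self.encoder(tokens)
          return self.control_head(feats.mean(dim=1))                           # (B, 3) control command

  if __name__ == "__main__":
      model = EndToEndSketch()
      clip = torch.randn(1, 4, 8, 3, 64, 128)   # 4 frames x 8 cameras
      print(model(clip).shape)                   # torch.Size([1, 3])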

FSD v14 introduces a Mixture-of-Experts (MoE) architecture:

  • Only relevant expert sub-modules activated per inference step
  • Reduces computational load despite 10x parameter increase
  • Enables richer video processing with less compression
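A toy top-k mixture-of-experts layer showing how a router can activate only a few expert sub-modules per token; expert count, widths, and the routing rule are illustrative assumptions:

  # Toy mixture-of-experts layer: a router picks the top-k experts per token so
  # only a fraction of the parameters run on each inference step.
  import torch
  import torch.nn as nn

  class TopKMoE(nn.Module):
      def __init__(self, dim=128, num_experts=8, k=2):
          super().__init__()
          self.router = nn.Linear(dim, num_experts)
          self.experts = nn.ModuleList(
              nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
              for _ in range(num_experts)
          )
          self.k = k

      def forward(self, x):                       # x: (tokens, dim)
          logits = self.router(x)                 # (tokens, experts)
          weights, idx = logits.topk(self.k, dim=-1)
          weights = weights.softmax(dim=-1)
          out = torch.zeros_like(x)
          for slot in range(self.k):              # run only the selected experts
              for e in range(len(self.experts)):
                  mask = idx[:, slot] == e
                  if mask.any():
                      w = weights[mask][:, slot : slot + 1]
                      out[mask] += w * self.experts[e](x[mask])
          return out

  if __name__ == "__main__":
      moe = TopKMoE()
      print(moe(torch.randn(10, 128)).shape)      # torch.Size([10, 128])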

Occupancy Networks (3D Voxel Predictions)

Technical details of the occupancy prediction system:

  • Input: Multi-camera features projected to 3D via learned BEV transform
  • Output: Dense 3D voxel grid covering surrounding environment
  • Voxel Resolution: Increased 8x from initial version (v13+)
  • Prediction Types:
    • Occupancy Volume: probability of each voxel being occupied
    • Occupancy Flow: predicted velocity vector for each occupied voxel
    • Semantic Labels: road surface, vehicle, pedestrian, static obstacle, etc.
  • Signed Distance Field (SDF): Continuous distance-to-surface prediction (smoother than binary voxels)
  • Temporal Persistence: Vehicle maintains a persistent 3D "Voxel Map" built from recent observations
    • Tracks objects even when temporarily occluded (e.g., pedestrian behind parked car)
    • Remembers trajectory and velocity of occluded agents
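A toy occupancy decoder head illustrating the prediction types listed above (occupancy probability, flow, semantics, SDF); the grid size and channel counts are assumptions, not Tesla's:

  # Toy occupancy head: from a fused 3D feature volume, predict per-voxel occupancy
  # probability, a 3-vector flow, semantic logits, and a signed-distance value.
  import torch
  import torch.nn as nn

  class OccupancyHeadSketch(nn.Module):
      def __init__(self, feat_ch=32, num_classes=8):
          super().__init__()
          self.trunk = nn.Sequential(
              nn.Conv3d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
          )
          self.occ = nn.Conv3d(feat_ch, 1, 1)           # occupancy logit per voxel
          self.flow = nn.Conv3d(feat_ch, 3, 1)          # per-voxel velocity (occupancy flow)
          self.sem = nn.Conv3d(feat_ch, num_classes, 1) # semantic class logits
          self.sdf = nn.Conv3d(feat_ch, 1, 1)           # signed distance to nearest surface

      def forward(self, vol):                            # vol: (B, C, X, Y, Z) fused feature volume
          h = self.trunk(vol)
          return {
              "occupancy": torch.sigmoid(self.occ(h)),
              "flow": self.flow(h),
              "semantics": self.sem(h),
              "sdf": self.sdf(h),
          }

  if __name__ == "__main__":
      head = OccupancyHeadSketch()
      out = head(torch.randn(1, 32, 40, 40, 8))          # 40x40x8 voxel grid
      print({k: tuple(v.shape) for k, v in out.items()})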

Multi-Trip Aggregation

  • Fleet vehicles record data from multiple trips through the same location
  • Offline pipeline aggregates observations across trips to build rich 3D representations
  • Moving objects are identified by their temporal inconsistency across trips
  • Static map elements are refined with each additional pass

Imitation Learning

  • Behavioral Cloning: Neural network trained to replicate expert human driving behavior
  • Training data: millions of hours of human driving from Tesla fleet
  • FSD v12 initial approach: large-scale imitation from 10 million driving video clips
  • The model learns the mapping from visual input to control output by observing human demonstrations
  • Augmented with reinforcement learning for safety-critical and edge-case scenarios
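A skeleton behavioral-cloning step, assuming a generic policy network and a batch of fleet clips paired with logged driver controls; the model, field names, and loss choice are placeholders:

  # Skeleton of a behavioral-cloning step: regress the model's control output
  # toward the human driver's recorded action. Model and batch are stand-ins.
  import torch
  import torch.nn as nn

  def bc_training_step(model, optimizer, batch):
      """batch: dict with 'video' (camera clip) and 'human_action' (steer, accel, brake)."""
      pred = model(batch["video"])                       # (B, 3) predicted controls
      loss = nn.functional.smooth_l1_loss(pred, batch["human_action"])
      optimizer.zero_grad()
      loss.backward()
      optimizer.step()
      return loss.item()

  if __name__ == "__main__":
      model = nn.Sequential(nn.Flatten(), nn.Linear(8 * 3 * 32 * 64, 3))  # placeholder policy
      opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
      batch = {"video": torch.randn(4, 8, 3, 32, 64),     # 4 clips x 8 cameras
               "human_action": torch.randn(4, 3)}          # logged driver controls
      print(bc_training_step(model, opt, batch))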

Reinforcement Learning

  • Used to optimize behavior in rare, complex, or safety-critical scenarios
  • Reward mechanism: evaluates resulting actions
    • Positive reward: safe navigation around obstacles, smooth lane changes
    • Negative reward: traffic violations, uncomfortable maneuvers, unsafe proximity
  • Applied on top of imitation-learned base policy
  • FSD v14+ incorporates "advanced reasoning and reinforcement learning"
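An illustrative reward-shaping function of the kind that could be layered on top of an imitation-learned policy; the event fields and weights below are assumptions, not Tesla's actual reward design:

  # Illustrative reward shaping for RL fine-tuning of a driving policy.
  def driving_reward(step):
      """step: dict of per-timestep outcomes from simulation or replay."""
      reward = 0.0
      reward += 1.0 if step["reached_waypoint"] else 0.0
      reward -= 10.0 * step["collisions"]                 # hard safety penalty
      reward -= 5.0 * step["traffic_violations"]          # red lights, wrong lane, etc.
      reward -= 0.5 * abs(step["jerk"])                   # comfort: penalize abrupt control
      reward -= 2.0 if step["min_gap_m"] < 1.0 else 0.0   # unsafe proximity to other agents
      return reward

  if __name__ == "__main__":
      print(driving_reward({"reached_waypoint": True, "collisions": 0,
                            "traffic_violations": 0, "jerk": 0.3, "min_gap_m": 2.4}))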

Photorealistic Neural Rendering

Tesla uses Generative Gaussian Splatting for 3D scene understanding:

  • Operates in hundreds of milliseconds
  • Allows the AI to "imagine" and explain the 3D geometry of its environment
  • Works even when the vehicle deviates from its original path
  • Used both for visualization and for training data augmentation

BEV Transformers

The BEV (Bird's Eye View) transformer pipeline:

  1. Feature Extraction: Per-camera backbone extracts multi-scale image features
  2. Camera-to-BEV Projection: Learned spatial transformer maps 2D features to 3D BEV space
  3. Multi-Scale Fusion: Features from different cameras and scales are fused via transformer attention
  4. Temporal Fusion: Current BEV features are combined with historical BEV features
  5. Task Heads: BEV features are decoded into occupancy, lanes, objects, and drivable space
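A toy camera-to-BEV projection using cross-attention from a grid of learned BEV queries over flattened multi-camera features; the sizes and single-layer design are illustrative, not Tesla's implementation:

  # Toy camera-to-BEV projection via cross-attention: a grid of learned BEV
  # queries attends over flattened multi-camera image features.
  import torch
  import torch.nn as nn

  class BEVCrossAttentionSketch(nn.Module):
      def __init__(self, dim=128, bev_h=20, bev_w=20):
          super().__init__()
          self.bev_queries = nn.Parameter(torch.randn(bev_h * bev_w, dim))  # one query per BEV cell
          self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
          self.bev_h, self.bev_w = bev_h, bev_w

      def forward(self, cam_feats):
          # cam_feats: (B, cams, dim, h, w) per-camera feature maps from the backbone
          b, n, d, h, w = cam_feats.shape
          kv = cam_feats.permute(0, 1, 3, 4, 2).reshape(b, n * h * w, d)     # all image tokens
          q = self.bev_queries.unsqueeze(0).expand(b, -1, -1)
          bev, _ = self.attn(q, kv, kv)                                       # (B, H*W, dim)
          return bev.transpose(1, 2).reshape(b, d, self.bev_h, self.bev_w)   # BEV feature map

  if __name__ == "__main__":
      proj = BEVCrossAttentionSketch()
      feats = torch.randn(1, 8, 128, 12, 20)      # 8 cameras of 12x20 feature maps
      print(proj(feats).shape)                     # torch.Size([1, 128, 20, 20])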

7. Training Infrastructure

Dojo Supercomputer

Component | Specification
Basic Unit | D1 Chip (354 ML cores, 7 nm, 645 mm^2, 50B transistors)
Training Tile | 25 D1 chips per tile
Tile Performance | 9 PFLOPS (BF16/CFP8)
Tile Bandwidth | 36 TB/s
Tile Integration | Self-contained with power, cooling, networking
Cabinet | Multiple tiles per cabinet
ExaPOD | 10 cabinets, 120 tiles, 3,000 D1 chips
ExaPOD Cores | 1,062,000 usable cores
ExaPOD Performance | 1.1 EXAFLOPS (BF16/CFP8)
ExaPOD SRAM | 1.3 TB on-tile SRAM
ExaPOD HBM | 13 TB dual-inline HBM
Software Stack | Custom compiler (no kernels); PyTorch frontend with compiled intermediate layer
Data Formats | Wide variety of composable precisions; up to 16 vector formats simultaneously
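The ExaPOD figures follow directly from the per-chip and per-tile numbers in the table; a quick consistency check:

  # Consistency check of the ExaPOD figures above, derived from per-chip specs.
  BF16_TFLOPS_PER_D1 = 362
  CHIPS_PER_TILE = 25
  TILES_PER_EXAPOD = 120
  CORES_PER_D1 = 354

  tile_pflops = BF16_TFLOPS_PER_D1 * CHIPS_PER_TILE / 1_000          # ~9.05 PFLOPS per tile
  exapod_eflops = tile_pflops * TILES_PER_EXAPOD / 1_000             # ~1.09 EXAFLOPS
  exapod_chips = CHIPS_PER_TILE * TILES_PER_EXAPOD                    # 3,000 D1 chips
  exapod_cores = exapod_chips * CORES_PER_D1                          # 1,062,000 cores

  print(tile_pflops, exapod_eflops, exapod_chips, exapod_cores)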

Dojo Status (as of March 2026):

  • Dojo 1 project disbanded in August 2025 (Bloomberg)
  • Cited reasons: too expensive, too complex, insufficient strategic benefit vs. NVIDIA GPUs
  • Musk announced Dojo 2 restart in January 2026 with new chip iteration
  • Dojo 2 target: "operating at scale" in 2026, "somewhere around 100k H100 equivalent"
  • Architecture pivot: unified chip for both inference and training (AI5/AI6)

NVIDIA GPU Clusters

Cortex (Giga Texas)

Parameter | Specification
Location | Gigafactory Texas, Austin
GPUs | 50,000 NVIDIA H100 + 20,000 Tesla proprietary hardware
Status | Operational Q4 2024
Power | 130 MW at launch, expanding to 500 MW
Purpose | FSD training, Optimus training, AI research
Cost | ~$300 million initial deployment
Performance | 1.8+ EXAFLOPS (combined with other clusters)

Cortex 2 (Giga Texas)

Parameter | Specification
Location | Gigafactory Texas, Austin
GPUs | ~100,000 NVIDIA H100/H200
Phase 1 | 250 MW, activating April 2026
Full Capacity | 500 MW, expected mid-2026
Energy Storage | Tesla Megapacks for grid stabilization
Purpose | Primarily Optimus training; also FSD

Other GPU Clusters

Cluster | GPUs | Location | Notes
Palo Alto cluster | 10,000 NVIDIA GPUs | Palo Alto, CA | Operational by 2024
Planned ExaPODs | 7 Dojo ExaPODs | Palo Alto | ~8.8 EXAFLOPS target (pre-disbandment)
xAI Memphis (Colossus) | 100,000+ H100/H200 | Memphis, TN | Shared infrastructure with xAI; 2.3 GWh Megapack storage

Training Scale Summary

Metric | Value
Total NVIDIA GPUs (estimated) | 150,000--200,000+ H100/H200
Combined compute | ~1.8+ EXAFLOPS
Target training compute (end of 2025) | 100 EFLOPS (100E)
Training cycle (full FSD build) | 70,000 GPU-hours
Data per training run | 1.5+ petabytes
Training iteration speed | Multiple model iterations per week

8. Data Engine

Overview

Tesla's Data Engine is a closed-loop system that continuously identifies weaknesses in the neural networks, collects relevant data from the fleet, labels it, retrains the model, deploys it, and repeats.

Fleet Vehicles -> Shadow Mode / Triggers -> Data Upload ->
Auto-Labeling Pipeline -> Training -> Model Validation ->
OTA Deployment -> Fleet Vehicles (repeat)

Fleet Data Collection

Metric | Value
Fleet Size | 9+ million vehicles worldwide
FSD Subscribers/Purchasers | 1.1 million (Q4 2025)
Data Collection | All Tesla vehicles (not just FSD users)
Collection Mechanism | Shadow Mode, hard triggers, soft triggers
Robotaxi Paid Miles | ~700,000 (as of Dec 2025)

Shadow Mode

  • Runs the FSD neural network stack in the background on all Tesla vehicles
  • Compares the system's predicted action with the human driver's actual action
  • When predictions diverge ("disagreement events"), the scenario is flagged
  • Flagged clips are uploaded to Tesla's servers for analysis
  • Operates silently without affecting vehicle behavior or driver experience
  • Effectively turns millions of vehicles into a passive data-gathering network
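A minimal sketch of shadow-mode disagreement flagging; the command fields and thresholds are hypothetical, chosen only to illustrate the compare-and-flag pattern:

  # Sketch of shadow-mode disagreement flagging: compare the background model's
  # proposed controls with what the human actually did and flag large deviations.
  STEER_THRESHOLD_DEG = 15.0
  ACCEL_THRESHOLD_MS2 = 2.0

  def check_disagreement(model_cmd, human_cmd):
      """Both commands: dicts with 'steer_deg' and 'accel_ms2' for one timestep."""
      steer_gap = abs(model_cmd["steer_deg"] - human_cmd["steer_deg"])
      accel_gap = abs(model_cmd["accel_ms2"] - human_cmd["accel_ms2"])
      return steer_gap > STEER_THRESHOLD_DEG or accel_gap > ACCEL_THRESHOLD_MS2

  def scan_drive(timesteps):
      """Return indices of timesteps whose surrounding clip would be flagged for upload."""
      return [i for i, (m, h) in enumerate(timesteps) if check_disagreement(m, h)]

  if __name__ == "__main__":
      drive = [({"steer_deg": 2.0, "accel_ms2": 0.1}, {"steer_deg": 1.5, "accel_ms2": 0.2}),
               ({"steer_deg": 25.0, "accel_ms2": 0.0}, {"steer_deg": 1.0, "accel_ms2": 0.0})]
      print(scan_drive(drive))   # -> [1]: model proposed a sharp steering change the human did not make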

Trigger-Based Collection

Trigger Type | Description | Example
Hard Clips | Human driver intervenes unexpectedly | AEB activation, sudden steering correction
Soft Clips | Model prediction deviates from human action | Predicted lane change vs. human stays in lane
Shadow Disagreement | Background FSD disagrees with human | Different steering angle prediction
Scenario-Based | Targeted collection for specific situations | Unprotected left turns, construction zones
Edge Case Mining | Rare events identified by analysis | Unusual road geometry, rare objects

Auto-Labeling Pipeline

The auto-labeling pipeline processes data offline in Tesla's data centers:

  1. Raw Data Ingestion: 45-second to 1-minute video clips with IMU, GPS, and odometry data
  2. Offline Neural Network Processing: Networks run with access to past AND future frames (unlike real-time), producing "perfect" labels
  3. 4D Vector Space Reconstruction:
    • Multi-camera video fused into 3D point clouds
    • Multiple frames combined for temporal consistency
    • Labeling in 4D vector space (3D + time) is 100x more efficient than per-image labeling
  4. Multi-Trip Aggregation: Data from multiple fleet vehicles passing through the same location is combined to create rich 3D scene reconstructions
  5. Moving Object Identification: Temporal inconsistencies reveal dynamic objects; trajectories and kinematics are tracked
  6. Human Verification: Auto-generated labels are spot-checked by human labelers for quality assurance
  7. Output: Rich labels including road surface, static objects, dynamic object kinematics (even occluded objects), lane topology, and semantic segmentation
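A toy illustration of why offline labeling (step 2 above) can be more accurate than the real-time stack: with the whole clip on disk, an object track can be smoothed with a window centered on each frame, using future observations a causal system never sees. The data and window sizes here are invented:

  # Toy comparison of causal (real-time) vs. centered (offline) track smoothing.
  def causal_estimate(track, i, window=3):
      """Real-time style: average only the current and previous observations."""
      lo = max(0, i - window + 1)
      pts = track[lo : i + 1]
      return sum(pts) / len(pts)

  def offline_estimate(track, i, window=3):
      """Auto-labeling style: average a window centered on i (uses future frames)."""
      lo, hi = max(0, i - window // 2), min(len(track), i + window // 2 + 1)
      pts = track[lo:hi]
      return sum(pts) / len(pts)

  if __name__ == "__main__":
      noisy_positions = [0.0, 1.1, 1.9, 3.2, 3.9, 5.1]   # object moving ~1 m per frame
      i = 3
      print("causal:", causal_estimate(noisy_positions, i))
      print("offline:", offline_estimate(noisy_positions, i))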

Clip Mining

  • Active learning system identifies the most informative training examples
  • Searches fleet data for scenarios similar to known failure modes
  • Prioritizes rare, safety-critical situations that are underrepresented in training data
  • Examples: unusual intersection geometry, rare weather conditions, novel obstacle types

9. Simulation Platform

Neural World Simulator

Tesla has developed a generative neural world simulator for training and validating FSD and Optimus AI:

Feature | Specification
Architecture | Generative model predicting next video frame based on current state + action
Frame Rate | Optimized to run at 36 Hz
Training Data | Fleet driving data ("Niagara Falls of data")
Output | Photorealistic multi-view video (8 cameras)
Fidelity | "Indistinguishable from reality" per Tesla
3D Engine | Generative Gaussian Splatting (operates in hundreds of ms)

Capabilities

  • Closed-Loop Simulation: AI agent drives in the simulator; world responds to agent's actions
  • Scenario Injection:
    • Drop in pedestrians, vehicles, or obstacles
    • Add fog, rain, glare, or other weather conditions
    • Trigger sudden lane changes or unexpected behaviors
    • Replay historical failure cases
    • Generate adversarial scenarios
  • Scale: Equivalent of 500 years of human driving experience trainable in one day
  • 3D World Generation: Real-time drivable 3D environments constructed from fleet camera footage
    • Engineers can virtually "drive" inside fully simulated real-world locations
    • All 8 camera views are generated simultaneously
  • Validation: New model versions evaluated in simulation before fleet deployment
  • Optimus Integration: Same simulator generates realistic video of Optimus robot actions (walking, turning, manipulation) for robot AI training
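A skeleton closed-loop rollout against a learned world model, with placeholder policy and simulator modules standing in for Tesla's generative models; only the act-predict-repeat loop structure is the point:

  # Skeleton of a closed-loop rollout: the policy picks an action, the generative
  # simulator predicts the next multi-camera frames, and the loop repeats.
  import torch
  import torch.nn as nn

  class DummyWorldModel(nn.Module):
      """Stand-in for the generative simulator: next frames = f(current frames, action)."""
      def forward(self, frames, action):
          # frames: (B, cams, 3, H, W); action: (B, 3). The real model is a video generator.
          return frames + 0.01 * action.mean()

  class DummyPolicy(nn.Module):
      def forward(self, frames):
          return torch.zeros(frames.shape[0], 3)          # steer, accel, brake

  def closed_loop_rollout(policy, world_model, init_frames, steps=10):
      frames, trajectory = init_frames, []
      for _ in range(steps):
          action = policy(frames)                          # agent acts on what it "sees"
          frames = world_model(frames, action)             # world responds to the action
          trajectory.append(action)
      return trajectory

  if __name__ == "__main__":
      traj = closed_loop_rollout(DummyPolicy(), DummyWorldModel(),
                                 torch.randn(1, 8, 3, 32, 64))
      print(len(traj), traj[0].shape)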

Procedural World Generation

  • Combines fleet-captured real-world data with procedural variation
  • Road topology, lane markings, traffic patterns can be varied programmatically
  • Enables testing on road configurations not yet encountered in fleet data
  • Construction zones, new intersections, and novel road furniture can be synthesized

10. Cloud & Data Infrastructure

Data Center Locations

Location | Purpose | Scale
Gigafactory Texas (Austin) | Cortex + Cortex 2 GPU clusters, Dojo | 50,000--100,000+ GPUs; 130--500 MW
Palo Alto, CA | GPU training cluster, Dojo ExaPODs (planned) | 10,000 GPUs; 7 ExaPODs planned
Memphis, TN (xAI/shared) | Colossus GPU cluster | 100,000+ H100/H200; 2.3 GWh Megapack storage

Scale of Compute

Metric | Value
Total GPU Count (est.) | 150,000--200,000+
Combined Training Performance | 1.8+ EXAFLOPS
Current Power Capacity | ~130 MW (Cortex 1)
Planned Power Capacity | 500 MW (Cortex 2 full build)
Data Processed | Petabytes per training cycle
Power Infrastructure | Dedicated 50 MW substations with power conditioning

Energy & Sustainability

  • Waste heat recovery at Giga Texas data centers for adjacent industrial operations
  • ~1 GW of wind/solar capacity secured through long-term PPAs across NA and Europe
  • Tesla Megapacks deployed for grid stabilization at data center sites
  • Closed-loop AI systems for energy optimization across facilities
  • Autonomous algorithms optimize data center energy consumption in real-time

In-House vs. Cloud

Tesla runs nearly all AI training on in-house infrastructure, rather than cloud providers:

  • Full ownership of hardware, networking, and software stack
  • Custom compiler for Dojo; custom orchestration for GPU clusters
  • No dependency on AWS, Azure, or GCP for core training workloads
  • xAI Memphis facility represents shared infrastructure between Tesla and Musk's xAI

11. Programming Languages & Tools

Primary Languages

Language | Use Case | Notes
Python | ML model development, training pipelines, data preprocessing, prototyping | Primary language for AI research; PyTorch frontend
C++ | Real-time vehicle software, sensor fusion, safety-critical systems, control algorithms | On-vehicle inference runtime; latency-critical paths
C | Low-level hardware drivers, embedded systems | Direct hardware interface layer
Java | Backend systems, cloud infrastructure, data aggregation | Server-side services

ML Frameworks & Tools

Tool | Usage
PyTorch | Primary ML framework; most-used external library at Tesla
Custom Dojo Compiler | Translates PyTorch models to Dojo D1 ISA; compiler-first architecture (no kernels)
TensorRT (likely) | NVIDIA inference optimization for GPU clusters
Custom Inference Engine | Optimized C++/C runtime for on-vehicle FSD chip inference
CUDA | GPU kernel development for training

Development Workflow

  1. Research & Prototyping: Python + PyTorch for rapid iteration on model architectures
  2. Training: PyTorch models trained on Cortex GPU clusters (or Dojo via custom compiler)
  3. Model Export: Trained models converted to optimized inference format
  4. On-Vehicle Deployment: Models compiled to run on FSD Chip (HW3/HW4) hardware
    • Automatic conversion from Python to C/C++/raw metal driver code
    • Quantization and optimization for NPU systolic arrays
  5. OTA Update: Models deployed to fleet via over-the-air software updates
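A generic sketch of the quantize-and-export step (steps 3--4 above) using standard PyTorch tooling; Tesla's actual FSD-chip compiler and deployment formats are proprietary, so this only shows the common open-source pattern, with a placeholder model and file name:

  # Generic sketch: post-training quantization of a trained policy, then tracing
  # it to a serialized graph that an embedded inference runtime could consume.
  import torch
  import torch.nn as nn

  model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 3)).eval()

  # 1) Dynamic post-training quantization of the linear layers (int8 weights).
  quantized = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

  # 2) Trace with a fixed-shape example input and save the deployable artifact.
  example_input = torch.randn(1, 64)
  traced = torch.jit.trace(quantized, example_input)
  traced.save("fsd_policy_sketch.pt")
  print(traced(example_input).shape)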

Dojo Custom Compiler

  • Design Philosophy: Compiler-first, no kernels
  • PyTorch serves as the frontend interface
  • Custom intermediate layer handles parallelization and hardware mapping
  • Supports dynamic composition of up to 16 vector formats simultaneously
  • Handles the D1 chip's custom ML-focused ISA
  • Optimizes across the unique D1 tile interconnect topology

12. Safety Architecture

Driver Monitoring System (DMS)

Feature | Implementation
Cabin Camera | IR-capable camera above rearview mirror
Eye Tracking | Monitors driver gaze direction and attentiveness
Hand Detection | Checks for handheld devices (phones); flags anything in hands
Steering Wheel Torque | Detects hands-on-wheel via torque applied to the steering wheel
Escalation Ladder | Visual alerts -> audible chimes -> speed reduction -> hazard lights + pullover
Fallback | If the camera is blocked or obscured (low light, sunglasses, hat), the system falls back to the steering wheel torque check
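A toy escalation-ladder state machine mirroring the visual-alert-to-pullover progression in the table above; the timing thresholds are illustrative, not Tesla's calibration:

  # Toy escalation ladder for an inattentive-driver state machine.
  ESCALATION_LADDER = ["visual_alert", "audible_chime", "speed_reduction", "hazards_and_pullover"]

  def escalation_level(seconds_inattentive):
      """Map continuous inattention time to the ladder stage (None = attentive)."""
      thresholds = [3, 8, 15, 25]            # seconds before each stage triggers (invented)
      stage = None
      for action, t in zip(ESCALATION_LADDER, thresholds):
          if seconds_inattentive >= t:
              stage = action
      return stage

  if __name__ == "__main__":
      for s in (1, 5, 10, 30):
          print(s, "->", escalation_level(s))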

Recent DMS Changes:

  • FSD v12.4 introduced camera-based attention monitoring (replacing torque-only)
  • FSD v13.2.9 reduced the frequency of camera-based attention nags (after user complaints)
  • Musk acknowledged DMS can be "too strict" (Q1 2025 earnings call)
  • IR LEDs enable night-time eye tracking

Operational Design Domain (ODD) Limitations

Tesla FSD (Supervised) is classified as an SAE Level 2+ ADAS system. Known limitations:

Limitation Category | Examples
Weather | Heavy rain, snow, dense fog, extreme sun glare
Road Conditions | Unusual lane markings, construction zones, unpaved roads
Lighting | Very low light, direct sun into cameras
Scenarios | Complex uncontrolled intersections, emergency vehicles, hand signals from traffic police
Geographic | Unmapped areas, regions with non-standard signage
Infrastructure | Missing or contradictory road markings, unusual traffic signals

Safety Statistics (Published by Tesla)

Q2 2025 Vehicle Safety Report:

Metric | With Autopilot | Without Autopilot | NHTSA Average (US)
Miles per crash | 6.69 million | 963,000 | 702,000
Safety multiplier vs. average | ~9.5x safer | ~1.4x safer | Baseline

FSD (Supervised) Safety Data (Nov 2025):

Metric | FSD (Supervised) | NHTSA US Average
Miles per major collision | 2.9 million | 505,000
Miles per minor collision | 986,000 | 178,000

Important Caveats (noted by safety researchers):

  • Tesla compares new vehicles with modern safety features against the entire US fleet (including older cars)
  • Most Autopilot miles are highway driving (inherently safer than urban)
  • National statistics include all road types, weather conditions, and driver demographics
  • Methodology has been questioned by NHTSA, Consumer Reports, and independent researchers

Hardware Redundancy

  • HW3/HW4: Dual FSD chips compute the same data independently and compare outputs
  • Disagreement triggers fallback to conservative behavior
  • Dual power rails and independent camera data paths
  • Vehicle can continue basic operation if one chip fails
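A minimal sketch of the dual-chip cross-check described above, with invented tolerances and a placeholder conservative fallback command:

  # Sketch of the dual-chip cross-check: both FSD chips evaluate the same frame,
  # and a mismatch beyond tolerance drops the system into a conservative fallback.
  def cross_check(cmd_a, cmd_b, steer_tol=1.0, accel_tol=0.3):
      """cmd_*: (steer_deg, accel_ms2) computed independently by each chip."""
      agree = (abs(cmd_a[0] - cmd_b[0]) <= steer_tol and
               abs(cmd_a[1] - cmd_b[1]) <= accel_tol)
      if agree:
          return cmd_a                        # outputs match: use the command
      return (0.0, -1.0)                      # disagreement: zero steering delta, gentle deceleration

  if __name__ == "__main__":
      print(cross_check((2.1, 0.4), (2.0, 0.5)))    # agreement -> normal command
      print(cross_check((2.1, 0.4), (9.0, 0.5)))    # disagreement -> conservative fallback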

13. Fleet Operations

FSD Supervised vs. Unsupervised

Mode | Description | Status
FSD (Supervised) | Driver must remain attentive and ready to take over at all times; hands-on-wheel required | Generally available in US; expanding globally
FSD (Unsupervised) | No human driver required; vehicle operates autonomously | Limited testing in Austin, TX (Jan 2026+)

FSD Adoption

Metric | Value (Q4 2025)
Active FSD subscriptions/purchases | 1.1 million
Fleet penetration rate | ~12.4% of 8.9M total deliveries
Upfront purchases (lifetime) | ~770,000 (~70% of total)
Monthly subscribers | ~330,000 (~30% of total)
Year-over-year growth | 38% (from 800K in 2024)

Robotaxi Operations (Austin Pilot)

Parameter | Detail
Launch Date | June 2025 (with safety monitors)
Unsupervised Start | January 2026 (with trailing safety vehicles)
Vehicle | Model Y (HW4)
Fleet Size | ~135 robotaxis (Dec 2025)
Paid Miles Logged | ~700,000 (as of Dec 2025)
Fare | $4.20 flat fare
Software | FSD Unsupervised
Safety Protocol | Trailing black Tesla vehicles with remote safety monitors

Planned Expansion

Timeline | Markets
H1 2026 | 7 additional US cities (planned)
2026 | Nevada, Florida, Arizona
2026 | Miami, Dallas, Phoenix, Las Vegas
Late 2026 / Early 2027 | Select European countries, pending regulatory approval
2026+ | China (pending Feb 2026 approvals)

Business Model Evolution

  • Transitioning from one-time FSD purchase ($12,000--$15,000) to monthly subscription ($99--$199/month)
  • Robotaxi revenue model: per-ride fares (Tesla takes a platform cut)
  • Fleet owners may operate their own Teslas as robotaxis (revenue-sharing model announced)
  • CyberCab units will be Tesla-owned fleet vehicles

14. Regulatory

NHTSA Investigations

Date | Investigation | Scope | Status
Oct 2024 | PE24-020 | FSD safety in low-visibility conditions (fog, extreme sunlight) | Open
Oct 2025 | PE25-012 | FSD traffic violations (red light running, wrong-lane driving) | Open; expanded
Aug 2025 | Crash Reporting | Tesla reporting FSD/Autopilot crashes months late (violating SGO) | Under review

Key Findings (PE25-012):

  • 80 FSD traffic violations identified (up from initial 50)
  • 62 customer complaints, 14 Tesla-submitted reports, 4 media accounts
  • 44 separate incidents, including 3 crashes and 5 injuries
  • 2,882,566 vehicles covered by investigation
  • Tesla granted extension for response in January 2026

Major Recalls

Date | Recall | Vehicles | Issue | Resolution
Feb 2022 | Rolling Stop | ~54,000 | FSD Beta allowed rolling through stop signs | OTA update disabled feature
Feb 2023 | FSD Beta 23V-085 | 362,000 | Exceeding speed limits, unpredictable intersection behavior | OTA software update
Dec 2023 | Autopilot DMS | 2+ million | Deficient driver monitoring system; use outside ODD | OTA software update
Various | Multiple OTA | 700,000+ | Various safety-related software updates | OTA patches

Regulatory Approach

Jurisdiction | Status | Key Issues
Federal (NHTSA) | Active investigations; moving toward national AV framework | SELF DRIVE Act of 2026 would preempt state restrictions
California DMV | Dec 2025 ruling: deceptive marketing (Autopilot/FSD naming) | 60-day compliance window; threatened 30-day dealer license suspension across 70+ locations
Texas | Permissive; Austin robotaxi pilot approved | No autonomous vehicle permit required for supervised testing
Nevada/Florida/Arizona | Favorable regulatory environment | Planned robotaxi expansion markets
Europe | Pending approvals (Feb 2026) | Varying technical compliance requirements by country
China | Pending approvals (Feb 2026) | Data localization requirements; mapping restrictions

SAE Level Classification

Tesla officially classifies FSD (Supervised) as Level 2 ADAS, requiring constant driver supervision. However:

  • The "Full Self-Driving" naming has been found misleading by California DMV
  • Consumer Reports and NHTSA have raised concerns about the gap between marketing and capability
  • Unsupervised robotaxi operation in Austin represents a functional shift toward Level 4 in a limited ODD

15. Research & Publications

AI Day Presentations

Event | Date | Key Reveals
Autonomy Day | Apr 2019 | HW3 FSD Chip architecture; custom silicon strategy; 144 TOPS
AI Day 2021 | Aug 2021 | HydraNet multi-task architecture; BEV transformers; Dojo D1 chip; Training Tile; ExaPOD; Tesla Bot announcement
AI Day 2022 | Sep 2022 | Occupancy Networks (3D voxel predictions); occupancy flow; planning architecture (MCTS + NN); Optimus prototype; Dojo progress

Key Technical Talks & Presentations

Speaker | Event | Date | Topic
Andrej Karpathy | CVPR 2021 Workshop | Jun 2021 | Tesla's vision-only approach; data engine
Andrej Karpathy | CVPR 2022 Workshop | Jun 2022 | Scaling training data; auto-labeling
Ashok Elluswamy | ICCV 2024 | Oct 2024 | End-to-end FSD architecture; "billions of tokens"; world simulator
Ashok Elluswamy | Various venues | 2025 | Neural world simulator; unified FSD + Optimus architecture; Gaussian Splatting

Patents (Selected)

Tesla holds numerous patents related to autonomous driving. Key areas include:

Patent Area | Description
Occupancy prediction | 3D voxel grid prediction from camera inputs
SDF-based rendering | Signed Distance Field methods for high-quality 3D scene representation
FSD visualizations | Advanced rendering of FSD perception for driver display
Multi-modal AI fusion | Integrating multiple data sources for driving decisions
Auto-labeling | Automated data labeling from fleet data
Neural network inference | Optimized on-chip inference for custom silicon
Inductive charging | Wireless charging for autonomous vehicles

Research Philosophy

Tesla does not publish peer-reviewed papers in the traditional academic sense. Instead:

  • Technical details are shared via AI Day presentations and shareholder meetings
  • Ashok Elluswamy has increased public technical communication via social media and conferences (ICCV, etc.)
  • Tesla's approach is described as following "The Bitter Lesson" (Rich Sutton): scale compute and data, use general methods
  • Close collaboration between Tesla AI and xAI (Musk's AI research company)

16. CyberCab

Purpose-Built Robotaxi Specifications

Parameter | Specification
Type | Purpose-built Level 4 autonomous robotaxi
Passengers | 2 (two seats)
Doors | Dual butterfly (upward-opening, automatic)
Steering Wheel | None
Pedals | None
Display | 20.5-inch center screen
Interior | Vegan leather seats, ambient lighting
Weight | < 1,150 kg
Dimensions (est.) | Length: 3.55 m, Wheelbase: 2.29 m, Height: 1.14 m
Tires | 225/60 R21 (rear)

Powertrain & Charging

Parameter | Specification
Battery Capacity | 35 kWh (< 50 kWh pack)
Range | 200 mi (320 km)
Efficiency | 5.5 mi/kWh (8.9 km/kWh), claimed highest of any EV
Charging | Inductive (wireless) -- primary
Inductive Power | 22 kW continuous; up to 150 kW pad supported
Inductive Efficiency | > 90%
Full Charge Time | 30--40 minutes (150 kW pad)
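A quick consistency check of the powertrain figures above; the note on charge-curve taper is an inference, not a Tesla statement:

  # Consistency check of the CyberCab powertrain figures.
  battery_kwh = 35
  efficiency_mi_per_kwh = 5.5
  range_mi = battery_kwh * efficiency_mi_per_kwh          # ~192 mi, consistent with the ~200 mi figure

  pad_kw = 150
  pad_efficiency = 0.90
  ideal_full_charge_min = battery_kwh / (pad_kw * pad_efficiency) * 60   # ~16 min at peak power
  # The quoted 30--40 min figure is plausible once charge-rate taper near full is included.

  print(round(range_mi), round(ideal_full_charge_min))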

Compute & Sensors

Parameter | Specification
Compute Platform | AI5 chip (planned)
Cameras | Multiple (exact count TBD); same architecture as production Tesla cameras
LiDAR | None
Radar | None
Roof Sensors | None (clean roofline)
Shared Platform | Same batteries, compute, and cameras as Optimus

Production & Timeline

Date | Milestone
Oct 2024 | CyberCab unveiled at "We, Robot" event
Mid 2025 | Model Y-based robotaxi pilot begins in Austin
Feb 2026 | First production CyberCab completed at Giga Texas
Apr 2026 | Volume production begins at Giga Texas
2026+ | CyberCab enters robotaxi fleet in Austin and expansion cities

Production Targets

Metric | Target
Unit cycle time | < 10 seconds per vehicle
Manufacturing location | Gigafactory Texas
Target unit cost | Below $30,000 (estimated)
Ownership model | Tesla-owned fleet (primarily)
Revenue model | Per-ride fare with Tesla platform cut

Key Design Decisions

  • No steering wheel or pedals: CyberCab is designed exclusively for unsupervised autonomous operation; there is no human driving fallback
  • Inductive charging only: Eliminates the need for human interaction to plug in, enabling fully autonomous fleet recharging
  • Two-passenger capacity: Optimized for typical ride-hailing trips (average 1.2 passengers)
  • Clean sensor integration: No roof-mounted sensor pods; all perception hardware integrated into body panels
  • Shared architecture with Optimus: Reduces cost through component commonality across Tesla's robotics products

Appendix: Key Comparisons

Tesla FSD vs. Waymo

Dimension | Tesla FSD | Waymo Driver
Sensors | Camera-only (8--9 cameras) | LiDAR + cameras + radar (29+ sensors)
Compute | Custom Tesla FSD Chip (144 TOPS on HW3; ~121+ TOPS on HW4) | Custom + NVIDIA
Approach | End-to-end neural network; imitation + RL | Modular stack with ML components
Fleet Size | 9M+ vehicles (1.1M FSD) | ~700 robotaxis
Data Scale | Billions of miles of fleet data | Millions of autonomous miles
ODD | All roads (supervised); limited unsupervised | Geo-fenced cities (Phoenix, SF, LA, Austin)
SAE Level | Level 2+ (supervised); Level 4 (robotaxi pilot) | Level 4 (fully driverless)
Business Model | Consumer ADAS + robotaxi | Robotaxi only

Tesla Hardware Generations

Feature | HW1 | HW2/2.5 | HW3 | HW4 (AI4) | AI5 (planned)
Year | 2014 | 2016/2017 | 2019 | 2023 | 2026
Chip | Mobileye EyeQ3 | NVIDIA PX2 | Tesla FSD | Tesla FSD2 | Tesla AI5
Process | 40 nm | 16 nm | 14 nm | 7 nm | 5 nm (est.)
NPU TOPS | N/A | ~24 | 144 | ~121+ | TBD
Cameras | 1 | 8 | 8 | 8--9+ | TBD
Radar | 1 | 1 | 0--1 | 0 | 0
Ultrasonics | 0 | 12 | 0--12 | 0 | 0
RAM | N/A | N/A | 8 GB | 16 GB | TBD
Camera Res. | N/A | 1.2 MP | 1.2 MP | 5.4 MP | TBD
FSD Support | No | Limited | Yes (v12 max) | Yes (v13, v14) | Yes

This document was compiled from publicly available information including Tesla AI Day presentations, earnings calls, NHTSA filings, teardown analyses, technical conference talks by Ashok Elluswamy and others, and industry reporting as of March 2026.
