PyTorch documentation
PyTorch is an optimized tensor library for deep learning using GPUs and CPUs.
Features described in this documentation are classified by release status:
Stable: These features will be maintained long-term and there should generally be no major performance limitations or gaps in documentation. We also expect to maintain backwards compatibility (although breaking changes can happen and notice will be given one release ahead of time).
Beta: These features are tagged as Beta because the API may change based on user feedback, because the performance needs to improve, or because coverage across operators is not yet complete. For Beta features, we are committing to seeing the feature through to the Stable classification. We are not, however, committing to backwards compatibility.
Prototype: These features are typically not available as part of binary distributions like PyPI or Conda, except sometimes behind run-time flags, and are at an early stage for feedback and testing.
- CUDA Automatic Mixed Precision examples
- Autograd mechanics
- Broadcasting semantics
- CPU threading and TorchScript inference
- CUDA semantics
- Distributed Data Parallel
- Extending PyTorch
- Extending torch.func with autograd.Function
- Frequently Asked Questions
- Gradcheck mechanics
- HIP (ROCm) semantics
- Features for large-scale deployments
- Modules
- MPS backend
- Multiprocessing best practices
- Numerical accuracy
- Reproducibility
- Serialization semantics
- Windows FAQ
- torch
- torch.nn
- Parameter
- UninitializedParameter
- UninitializedBuffer
- Containers
- Convolution Layers
- Pooling layers
- Padding Layers
- Non-linear Activations (weighted sum, nonlinearity)
- Non-linear Activations (other)
- Normalization Layers
- Recurrent Layers
- Transformer Layers
- Linear Layers
- Dropout Layers
- Sparse Layers
- Distance Functions
- Loss Functions
- Vision Layers
- Shuffle Layers
- DataParallel Layers (multi-GPU, distributed)
- Utilities
- Quantized Functions
- Lazy Modules Initialization
- torch.nn.functional
- torch.Tensor
- Tensor Attributes
- Tensor Views
- torch.amp
- torch.autograd
- torch.autograd.backward
- torch.autograd.grad
- Forward-mode Automatic Differentiation
- Functional higher level API
- Locally disabling gradient computation
- Default gradient layouts
- In-place operations on Tensors
- Variable (deprecated)
- Tensor autograd functions
- Function
- Context method mixins
- Numerical gradient checking
- Profiler
- Anomaly detection
- Autograd graph
- torch.library
- torch.cpu
- torch.cuda
- StreamContext
- torch.cuda.can_device_access_peer
- torch.cuda.current_blas_handle
- torch.cuda.current_device
- torch.cuda.current_stream
- torch.cuda.default_stream
- device
- torch.cuda.device_count
- device_of
- torch.cuda.get_arch_list
- torch.cuda.get_device_capability
- torch.cuda.get_device_name
- torch.cuda.get_device_properties
- torch.cuda.get_gencode_flags
- torch.cuda.get_sync_debug_mode
- torch.cuda.init
- torch.cuda.ipc_collect
- torch.cuda.is_available
- torch.cuda.is_initialized
- torch.cuda.memory_usage
- torch.cuda.set_device
- torch.cuda.set_stream
- torch.cuda.set_sync_debug_mode
- torch.cuda.stream
- torch.cuda.synchronize
- torch.cuda.utilization
- torch.cuda.temperature
- torch.cuda.power_draw
- torch.cuda.clock_rate
- torch.cuda.OutOfMemoryError
- Random Number Generator
- Communication collectives
- Streams and events
- Graphs (beta)
- Memory management
- NVIDIA Tools Extension (NVTX)
- Jiterator (beta)
- Stream Sanitizer (prototype)
- Understanding CUDA Memory Usage
- Generating a Snapshot
- Using the visualizer
- Snapshot API Reference
- torch.mps
- torch.backends
- torch.export
- torch.distributed
- Backends
- Basics
- Initialization
- Post-Initialization
- Distributed Key-Value Store
- Groups
- Point-to-point communication
- Synchronous and asynchronous collective operations
- Collective functions
- Profiling Collective Communication
- Multi-GPU collective functions
- Third-party backends
- Launch utility
- Spawn utility
- Debugging torch.distributed applications
- Logging
- torch.distributed.algorithms.join
- torch.distributed.elastic
- torch.distributed.fsdp
- torch.distributed.optim
- torch.distributed.tensor.parallel
- parallelize_module()
- RowwiseParallel
- ColwiseParallel
- PairwiseParallel
- SequenceParallel
- PrepareModuleInput()
- PrepareModuleOutput()
- make_input_replicate_1d()
- make_input_reshard_replicate()
- make_input_shard_1d()
- make_input_shard_1d_last_dim()
- make_output_replicate_1d()
- make_output_reshard_tensor()
- make_output_shard_1d()
- make_output_tensor()
- enable_2d_with_fsdp()
- pre_dp_module_transform()
- torch.distributed.checkpoint
- load_state_dict()
- save_state_dict()
- StorageReader
- StorageWriter
- LoadPlanner
- LoadPlan
- ReadItem
- SavePlanner
- SavePlan
- WriteItem
- FileSystemReader
- FileSystemWriter
- DefaultSavePlanner
- DefaultLoadPlanner
- get_state_dict()
- get_model_state_dict()
- get_optimizer_state_dict()
- set_state_dict()
- set_model_state_dict()
- set_optimizer_state_dict()
- StateDictOptions
- torch.distributions
- Score function
- Pathwise derivative
- Distribution
- ExponentialFamily
- Bernoulli
- Beta
- Binomial
- Categorical
- Cauchy
- Chi2
- ContinuousBernoulli
- Dirichlet
- Exponential
- FisherSnedecor
- Gamma
- Geometric
- Gumbel
- HalfCauchy
- HalfNormal
- Independent
- InverseGamma
- Kumaraswamy
- LKJCholesky
- Laplace
- LogNormal
- LowRankMultivariateNormal
- MixtureSameFamily
- Multinomial
- MultivariateNormal
- NegativeBinomial
- Normal
- OneHotCategorical
- Pareto
- Poisson
- RelaxedBernoulli
- LogitRelaxedBernoulli
- RelaxedOneHotCategorical
- StudentT
- TransformedDistribution
- Uniform
- VonMises
- Weibull
- Wishart
- KL Divergence
- Transforms
- Constraints
- Constraint Registry
- torch.compiler
- torch.fft
- torch.func
- torch.futures
- torch.fx
- torch.hub
- torch.jit
- torch.linalg
- torch.monitor
- torch.signal
- torch.special
- torch.overrides
- torch.package
- torch.profiler
- torch.nn.init
- torch.onnx
- torch.optim
- Complex Numbers
- DDP Communication Hooks
- Pipeline Parallelism
- Quantization
- Distributed RPC Framework
- torch.random
- torch.masked
- torch.nested
- torch.sparse
- torch.Storage
- torch.testing
- torch.utils
- torch.utils.benchmark
- torch.utils.bottleneck
- torch.utils.checkpoint
- torch.utils.cpp_extension
- torch.utils.data
- torch.utils.deterministic
- torch.utils.jit
- torch.utils.dlpack
- torch.utils.mobile_optimizer
- torch.utils.model_zoo
- torch.utils.tensorboard
- Type Info
- Named Tensors
- Named Tensors operator coverage
- torch.__config__
- torch._logging