# Offline Inference

Offline inference examples demonstrate how to use vLLM in an offline setting, where the model is queried for predictions in batches rather than served behind an API. We recommend starting with <project:basic.md>.
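As a minimal sketch of what batched offline inference looks like with the Python API (the model name and prompts here are illustrative, not required):

```python
from vllm import LLM, SamplingParams

# Example prompts to run as a single batch.
prompts = [
    "Hello, my name is",
    "The capital of France is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Load a model and generate completions for the whole batch at once.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(f"Prompt: {output.prompt!r} -> {output.outputs[0].text!r}")
```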

:::{toctree}
:caption: Examples
:maxdepth: 1
audio_language
basic
chat_with_tools
cpu_offload_lmcache
data_parallel
disaggregated_prefill
disaggregated_prefill_lmcache
distributed
encoder_decoder
encoder_decoder_multimodal
llm_engine_example
lora_with_quantization_inference
mlpspeculator
multilora_inference
neuron
neuron_int8_quantization
openai
pixtral
prefix_caching
prithvi_geospatial_mae
profiling
profiling_tpu
rlhf
rlhf_colocate
save_sharded_state
simple_profiling
structured_outputs
torchrun_example
tpu
vision_language
vision_language_embedding
vision_language_multi_image
:::
