vLLM T5Gemma2 Plugin

vLLM model plugin for Google's T5Gemma2 encoder-decoder model family.

Supports google/t5gemma-2-270m-270m, google/t5gemma-2-1b-1b, and google/t5gemma-2-4b-4b.

Installation

pip install -e .

Requires:

  • vLLM >= 0.19.0
  • transformers (a recent release that includes T5Gemma2 support)
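
To check the requirements above before installing, a short script along these lines may help. It is a minimal sketch, not part of the plugin: `parse_version`, `vllm_is_new_enough`, and `report` are hypothetical helper names, and the version parsing is deliberately crude (pre-release tags such as `rc1` are ignored).

```python
from importlib.metadata import PackageNotFoundError, version

MIN_VLLM = (0, 19, 0)  # minimum vLLM version taken from the Requires list


def parse_version(text):
    """Reduce a version string such as '0.19.1.dev3+cu128' to (0, 19, 1).

    Hypothetical helper: only the leading digits of the first three
    dot-separated fields are kept, so pre-release tags are ignored.
    """
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def vllm_is_new_enough(installed):
    """Return True if the given vLLM version string meets the minimum."""
    return parse_version(installed) >= MIN_VLLM


def report():
    """Print the installed versions of the two required packages."""
    for package in ("vllm", "transformers"):
        try:
            found = version(package)
        except PackageNotFoundError:
            print(f"{package}: not installed")
            continue
        note = ""
        if package == "vllm" and not vllm_is_new_enough(found):
            note = " (older than 0.19.0)"
        print(f"{package}: {found}{note}")


if __name__ == "__main__":
    report()
```

This only reports version numbers. It cannot confirm that the installed transformers release actually contains T5Gemma2 support.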

Usage

# Serve
vllm serve google/t5gemma-2-270m-270m --max-model-len 4096

# Generate
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/t5gemma-2-270m-270m",
    "prompt": "Translate English to French: Hello, how are you?",
    "max_tokens": 50,
    "temperature": 0.0
  }'
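
The same request can be sent from Python using only the standard library. This is a sketch under a few assumptions: the server from the serve command is listening on `localhost:8000`, the response follows the OpenAI-style completions shape with the generated text at `choices[0].text`, and `build_payload` and `complete` are hypothetical helper names rather than part of the plugin.

```python
import json
import urllib.request

# Assumed: the address used by the curl example above.
BASE_URL = "http://localhost:8000"


def build_payload(prompt, model="google/t5gemma-2-270m-270m",
                  max_tokens=50, temperature=0.0):
    """Build the same JSON body as the curl example (hypothetical helper)."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt, base_url=BASE_URL):
    """POST the prompt to /v1/completions and return the generated text.

    Assumes an OpenAI-style response body with the text under
    choices[0].text.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `print(complete("Translate English to French: Hello, how are you?"))` should print the model's completion for the same prompt as the curl call.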

Verify Installation

python verify_plugin.py

Related