
Gemma 2 27B

Best Mid-Size
Google
Released: 2024-06-27
Type: Open Weights LLM
Context: 8k
Max Out: 8.2k
Cutoff: 2024-05
Params: 27B

The Breakdown

Gemma 2 27B is Google's answer to a gap in the open-weights market: most models are either small (8B) or huge (70B+). Quantized, Gemma 2 27B fits comfortably on a single high-end consumer GPU (24 GB VRAM) while delivering performance that rivals the giants. Trained with knowledge distillation from a larger teacher model, it achieves remarkable reasoning density for its size. For researchers and hobbyists with a single RTX 3090/4090, this is currently the highest-IQ model you can run locally at decent speeds.
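A rough back-of-envelope check on the single-GPU claim (a minimal sketch; the 1.2× overhead factor for KV cache and runtime buffers is an illustrative assumption, not a measured figure):

```python
def vram_gib(params_b: float, bits_per_param: float, overhead: float = 1.2) -> float:
    """Estimate VRAM (GiB) needed to hold model weights at a given precision.

    `overhead` is a rough multiplier for KV cache and runtime buffers
    (an assumption for illustration, not a benchmarked value).
    """
    weight_bytes = params_b * 1e9 * bits_per_param / 8
    return weight_bytes * overhead / 2**30

print(f"fp16:  {vram_gib(27, 16):.1f} GiB")  # well over 24 GiB -> multi-GPU territory
print(f"4-bit: {vram_gib(27, 4):.1f} GiB")   # fits a 24 GiB RTX 3090/4090
```

In other words, the 27B model only becomes a single-3090/4090 model once you quantize it; at fp16 the weights alone blow past 24 GiB.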

Overall Score: 9.2/10

Pricing (per 1k tokens)
Input: $0
Output: $0
Currency: USD (Self-Hosted)

The Good

  • Perfect 'Goldilocks' size (runs on single A10/3090)
  • Outperforms Llama 3 70B in some logic tasks
  • Permissive license allows commercial use (custom Gemma terms, not Apache)

The Bad

  • Small 8k context window is limiting for RAG
  • Aggressive safety tuning out of the box
  • Heavier inference than 8B models
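To make the 8k-context complaint concrete for RAG, here is a minimal token-budgeting sketch (the reserved-output and prompt sizes are illustrative assumptions, not values from the model card):

```python
CONTEXT = 8192           # Gemma 2's context window, in tokens
RESERVED_OUTPUT = 1024   # room left for the model's answer (assumed)
SYSTEM_AND_QUERY = 300   # rough token cost of instructions + question (assumed)

def chunks_that_fit(chunk_tokens: list[int]) -> int:
    """Return how many retrieved chunks fit before the context budget runs out."""
    budget = CONTEXT - RESERVED_OUTPUT - SYSTEM_AND_QUERY
    used = 0
    for i, n in enumerate(chunk_tokens):
        if used + n > budget:
            return i
        used += n
    return len(chunk_tokens)

# Three ~3,000-token document chunks: only two fit alongside the prompt.
print(chunks_that_fit([3000, 3000, 3000]))
```

With under 7k tokens of usable prompt space, you get two or three substantial document chunks per query; 128k-context competitors can stuff in dozens.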

The Verdict

The best 'Mid-Weight' fighter. If you have one decent GPU (like a 3090 or 4090) and want the smartest possible model that fits in VRAM, this is it. It hits the sweet spot between the dumb 8B models and the massive 70B models.

Performance Benchmarks

  • DROP: 80
  • MMLU: 81
  • HumanEval: 79