
Gemma 2 27B

Best Mid-Size
Google
Released: 2024-06-27
Type: Open Weights LLM
Context: 8k
Max Out: 8.2k
Cutoff: 2024-05
Params: 27B

The Breakdown

Gemma 2 27B is Google's answer to a gap in the open-weights market: most models are either small (8B) or huge (70B+). Quantized, Gemma 2 27B fits comfortably on a single high-end consumer GPU (24 GB VRAM) while delivering performance that rivals the giants. Trained with knowledge distillation from a larger teacher model, it achieves remarkable reasoning density for its size. For researchers and hobbyists with a single RTX 3090/4090, this is currently the highest-IQ model you can run locally at decent speeds.
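A rough back-of-envelope check on the single-GPU claim (a minimal sketch; the 1.2× overhead factor for KV cache and runtime buffers is an illustrative assumption, not a measured figure):

```python
def vram_gib(params_b: float, bits_per_param: float, overhead: float = 1.2) -> float:
    """Estimate VRAM (GiB) needed to hold model weights at a given precision.

    `overhead` is a rough multiplier for KV cache and runtime buffers
    (an assumption for illustration, not a benchmarked value).
    """
    weight_bytes = params_b * 1e9 * bits_per_param / 8
    return weight_bytes * overhead / 2**30

print(f"fp16:  {vram_gib(27, 16):.1f} GiB")  # well over 24 GiB -> multi-GPU territory
print(f"4-bit: {vram_gib(27, 4):.1f} GiB")   # fits a 24 GiB RTX 3090/4090
```

In other words, the 27B model only becomes a single-3090/4090 model once you quantize it; at fp16 the weights alone blow past 24 GiB.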

Overall Score: 9.2/10

Pricing (per 1k tokens)
Input: $0
Output: $0
Currency: USD (Self-Hosted)

The Good

  • Perfect 'Goldilocks' size (runs on single A10/3090)
  • Outperforms Llama 3 70B in some logic tasks
  • Permissive license allows commercial use (custom Gemma terms, not Apache)

The Bad

  • Small 8k context window is limiting for RAG
  • Aggressive safety tuning out of the box
  • Heavier inference than 8B models
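To make the 8k-context complaint concrete for RAG, here is a minimal token-budgeting sketch (the reserved-output and prompt sizes are illustrative assumptions, not values from the model card):

```python
CONTEXT = 8192           # Gemma 2's context window, in tokens
RESERVED_OUTPUT = 1024   # room left for the model's answer (assumed)
SYSTEM_AND_QUERY = 300   # rough token cost of instructions + question (assumed)

def chunks_that_fit(chunk_tokens: list[int]) -> int:
    """Return how many retrieved chunks fit before the context budget runs out."""
    budget = CONTEXT - RESERVED_OUTPUT - SYSTEM_AND_QUERY
    used = 0
    for i, n in enumerate(chunk_tokens):
        if used + n > budget:
            return i
        used += n
    return len(chunk_tokens)

# Three ~3,000-token document chunks: only two fit alongside the prompt.
print(chunks_that_fit([3000, 3000, 3000]))
```

With under 7k tokens of usable prompt space, you get two or three substantial document chunks per query; 128k-context competitors can stuff in dozens.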

The Verdict

The best 'Mid-Weight' fighter. If you have one decent GPU (like a 3090 or 4090) and want the smartest possible model that fits in VRAM, this is it. It hits the sweet spot between the dumb 8B models and the massive 70B models.

Performance Benchmarks

  • DROP: 80
  • MMLU: 81
  • HumanEval: 79