qwen3-coder-480b

Description

Qwen's most powerful code model, with 480B total parameters, of which 35B are activated per token through a Mixture of Experts (MoE) architecture.

Capabilities

Trained for tool use

Minimum system memory

250GB
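
As a rough sanity check on the memory figure, the sketch below estimates the size of the weights alone at a few bit widths. The card does not say which quantization the 250GB figure assumes, so the bit widths here are assumptions chosen for illustration; the parameter counts come from the description. Real usage is higher than these numbers because the KV cache and runtime overhead are not included.

```python
def weight_memory_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return total_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 480e9   # total parameters, from the description
ACTIVE_PARAMS = 35e9   # parameters activated per token, from the description

# Bit widths are illustrative assumptions, not values stated on the card.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(TOTAL_PARAMS, bits):.0f} GB")

# Every expert has to be available to the runtime, so memory scales with the
# total count even though only a small fraction is used for each token.
print(f"active fraction per token: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
```

At roughly 4 bits per weight the weights come to about 240 GB, which is in the same range as the stated minimum.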

Qwen3 Coder 480B

Key Features:

Agentic Coding: Qwen reports performance comparable to Claude Sonnet 4 on agentic coding tasks
Repository-Scale Understanding: Optimized for large codebases and complex projects
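
For readers unfamiliar with the Mixture of Experts design named in the description, here is a minimal sketch of top-k expert routing, the general mechanism by which only part of a model's parameters run for each token. The function names, the four toy experts, and k=2 are all hypothetical; this is not the model's actual router or its expert count.

```python
import math


def route_token(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    peak = max(router_logits[i] for i in top)
    exps = {i: math.exp(router_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_layer(x, experts, router_logits, k):
    """Weighted sum over the selected experts; the others are never evaluated."""
    weights = route_token(router_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Toy setup: 4 scalar "experts", 2 active per token (illustrative sizes only).
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
router_logits = [0.1, 2.0, 2.0, -1.0]

weights = route_token(router_logits, k=2)
output = moe_layer(3.0, experts, router_logits, k=2)
```

In this toy case experts 1 and 2 tie for the highest router score, so each receives weight 0.5 and the other two experts are skipped entirely.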

Technical Specifications:

Note: This model operates in non-thinking mode only and does not generate <think></think> blocks.

Parameters

Custom configuration options included with this model

Repeat Penalty

1.05

Temperature

0.7

Top K Sampling

Top P Sampling

0.8
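
To make the settings above concrete, the sketch below applies a repeat penalty, temperature, top-k, and top-p to a toy list of logits and returns the resulting next-token distribution. It is an illustration under assumptions: the order of the steps and the penalty convention (divide positive logits, multiply negative ones) vary between runtimes, and `filter_distribution` is a hypothetical helper, not an API of any particular inference engine. The card shows no value for Top K, so `top_k` defaults to off here.

```python
import math


def filter_distribution(logits, previous_tokens, temperature=0.7, top_p=0.8,
                        repeat_penalty=1.05, top_k=None):
    """Return {token_id: probability} after applying the sampling settings."""
    adjusted = list(logits)

    # Repeat penalty: make tokens that already appeared slightly less likely.
    for tok in set(previous_tokens):
        if adjusted[tok] > 0:
            adjusted[tok] /= repeat_penalty
        else:
            adjusted[tok] *= repeat_penalty

    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in adjusted]

    # Numerically stable softmax, then sort from most to least likely.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    ranked = sorted(((e / total, i) for i, e in enumerate(exps)), reverse=True)

    # Top-k: keep at most k candidates (skipped when top_k is None).
    if top_k is not None:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for p, i in ranked:
        kept.append((p, i))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(p for p, _ in kept)
    return {i: p / norm for p, i in kept}


# Toy vocabulary of 4 tokens; token 0 has already been generated once.
dist = filter_distribution([2.0, 1.0, 0.5, -1.0], previous_tokens=[0])
greedy = filter_distribution([2.0, 1.0, 0.5, -1.0], previous_tokens=[0], top_k=1)
```

With these toy logits, top-p at 0.8 keeps only the two most likely tokens, and setting `top_k=1` reduces the choice to a single token.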

Sources

The underlying model files this model uses

Based on

GGUF

MLX

MLX

MLX