EASI Leaderboard

EASI: Holistic Evaluation of Multimodal LLMs on Spatial Intelligence

EASI defines a comprehensive taxonomy of spatial tasks that unifies existing benchmarks, together with a standardized protocol for the fair evaluation of state-of-the-art proprietary and open-source models.


🥇 SenseNova-SI-1.3-InternVL3-8B	65.2	68.6	42.5	89.9	61.3	47.5	68.0	62.4	81.0
🥈 SenseNova-SI-1.2-InternVL3-8B	64.5	69.6	42.6	89.0	58.8	49.0	69.4	60.1	77.7
🥉 Gemini 3 Pro	63.8	52.5	45.2	70.9	50.4	62.2	76.0	68.9	84.3
SenseNova-SI-1.1-InternVL3-8B	61.5	68.8	43.3	85.7	54.7	47.7	63.9	55.5	72.0
GPT-5	58.8	55.0	41.8	56.3	45.6	61.9	68.0	60.3	81.6
SenseNova-SI-1.1-Qwen3-VL-8B	58.1	64.8	38.1	73.8	51.2	49.6	61.9	53.2	72.5
Gemini 2.5 Pro	58.0	53.6	38.0	57.6	46.1	57.1	73.5	59.3	78.8
Seed 1.6	54.2	49.9	38.3	48.8	43.9	54.6	65.9	56.9	75.4
Grok 4	53.3	47.9	37.8	63.6	43.2	47.0	56.4	54.9	75.5
VST-7B-SFT	51.0	55.5	32.5	39.7	50.5	39.7	61.9	54.6	73.7
SenseNova-SI-1.1-Qwen2.5-VL-7B	51.0	58.1	32.8	54.7	45.5	43.9	55.3	46.3	71.4
Qwen3-VL-8B-Instruct	50.6	57.9	31.1	29.4	42.2	45.8	66.7	53.9	77.7
SenseNova-SI-1.1-InternVL3-2B	49.4	63.7	34.2	41.8	52.7	36.8	52.4	50.5	62.8
InternVL3_5-8B	49.0	56.1	29.0	40.2	40.0	43.8	58.2	49.2	75.7
SenseNova-SI-1.1-BAGEL-7B-MoT	48.6	41.5	34.5	46.8	46.9	42.0	65.4	42.4	69.0
VST-3B-SFT	48.4	51.4	28.8	36.0	52.9	35.9	58.8	54.1	69.0
vlm-3r-llava-qwen2-lora	46.6	60.7	27.9	40.0	40.5	31.3	52.3	51.5	68.2
Cambrian-S-7B	46.4	62.9	27.1	37.9	41.3	36.1	37.9	54.8	72.8
InternVL3-8B	45.7	42.1	28.0	41.5	38.7	41.1	53.5	44.2	76.3
SenseNova-SI-1.1-Qwen2.5-VL-3B	45.7	54.9	30.8	52.6	43.5	37.8	45.6	45.0	55.2
BAGEL-7B-MoT	45.3	31.4	31.0	34.7	41.3	37.0	63.6	50.2	73.1
Qwen3-VL-2B-Instruct	44.6	50.4	28.9	34.5	37.0	35.7	53.2	47.5	70.1
ViLaSR	43.7	44.6	30.2	35.1	35.7	38.7	51.4	46.6	67.3
Cambrian-S-3B	43.2	56.1	27.0	38.4	41.0	31.0	37.7	50.9	63.5
Qwen2.5-VL-7B-Instruct	42.6	32.3	26.8	36.0	36.9	37.6	55.9	43.5	71.8
SpaceR-SFT-7B	41.8	41.6	27.4	38.0	35.9	34.3	49.6	40.5	66.9
SpatialLadder-3B	40.9	44.9	27.4	43.5	39.9	28.0	43.0	42.8	58.2
Qwen2.5-VL-3B-Instruct	40.4	27.0	28.6	37.6	32.0	33.1	48.7	53.9	62.3
InternVL3-2B	39.8	33.0	26.5	37.5	32.6	30.0	50.8	47.7	60.1
Spatial-MLLM-subset-sft	35.6	46.3	26.1	33.5	34.7	18.0	40.5	36.2	50.0
MindCube-Qwen2.5VL-RawQA-SFT	22.0	17.2	1.7	51.7	24.1	6.3	35.1	2.8	37.0

Last updated: 2026-01-19 09:32:13 UTC
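
The column headers are not reproduced on this page, so the sketch below makes no claim about which benchmark each column holds. It rests on one observation from the numbers themselves: in the rows sampled here, the first score matches the unweighted mean of the remaining eight, within rounding. The page does not state this, so treat it as an assumption. The helper names (parse_row, check_row) and the 0.1 tolerance are illustrative choices, not part of EASI.

```python
from statistics import mean

# Three rows copied from the table above, tab-separated as on the page.
SAMPLE = [
    "🥇 SenseNova-SI-1.3-InternVL3-8B\t65.2\t68.6\t42.5\t89.9\t61.3\t47.5\t68.0\t62.4\t81.0",
    "GPT-5\t58.8\t55.0\t41.8\t56.3\t45.6\t61.9\t68.0\t60.3\t81.6",
    "MindCube-Qwen2.5VL-RawQA-SFT\t22.0\t17.2\t1.7\t51.7\t24.1\t6.3\t35.1\t2.8\t37.0",
]

MEDALS = "🥇🥈🥉"  # rank markers that prefix the top three rows


def parse_row(line):
    """Split one row into (model name, first score, remaining scores)."""
    name, *cells = line.split("\t")
    name = name.lstrip(MEDALS + " ")  # drop a leading medal and space, if any
    scores = [float(cell) for cell in cells]
    return name, scores[0], scores[1:]


def check_row(line, tol=0.1):
    """Compare the first score with the mean of the others.

    Assumption: the first score is an unweighted mean. The tolerance
    absorbs the one-decimal rounding of the published values.
    """
    name, overall, parts = parse_row(line)
    recomputed = mean(parts)
    return name, overall, recomputed, abs(recomputed - overall) <= tol


if __name__ == "__main__":
    for row in SAMPLE:
        name, overall, recomputed, ok = check_row(row)
        status = "consistent" if ok else "MISMATCH"
        print(f"{name}: listed {overall}, recomputed {recomputed:.2f} ({status})")
```

Running the same check over every row is a quick way to catch a misaligned or corrupted line after copying the table out of the page.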