Akylbek Maxutov PRO

akylbekmaxutov

AI & ML interests

None yet

Recent Activity

liked a dataset 5 days ago

Kyrmasch/sKQuAD

liked a dataset 6 days ago

gaia-benchmark/GAIA

liked a dataset 7 days ago

MTSAIR/MWS-Vision-Bench

Organizations

upvoted a paper 27 days ago

Benchmark^2: Systematic Evaluation of LLM Benchmarks

Paper • 2601.03986 • Published 28 days ago • 34

upvoted a collection 7 months ago

LiveBench

Collection

Datasets for LiveBench • 8 items • Updated Mar 31, 2025 • 14

upvoted a collection 9 months ago

To read... eventually

Collection

A collection of papers that I have read or plan to read, all in one place. Includes a wide range of topics. • 169 items • Updated Jun 30, 2025 • 5

upvoted 17 papers 10 months ago

Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers

Paper • 2503.00865 • Published Mar 2, 2025 • 64

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Paper • 2503.16419 • Published Mar 20, 2025 • 77

Large Language Model Agent: A Survey on Methodology, Applications and Challenges

Paper • 2503.21460 • Published Mar 27, 2025 • 83

Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20, 2025 • 95

I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24, 2025 • 119

Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26, 2025 • 169

S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models

Paper • 2504.10368 • Published Apr 14, 2025 • 22

Scaling Laws for Native Multimodal Models

Paper • 2504.07951 • Published Apr 10, 2025 • 30

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

Paper • 2504.08942 • Published Apr 11, 2025 • 28

Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models

Paper • 2503.22165 • Published Mar 28, 2025 • 28

Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models

Paper • 2504.04823 • Published Apr 7, 2025 • 31

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15, 2025 • 63

PaperBench: Evaluating AI's Ability to Replicate AI Research

Paper • 2504.01848 • Published Apr 2, 2025 • 37

ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness

Paper • 2504.10514 • Published Apr 10, 2025 • 48
