LLM360

community

https://www.llm360.ai

Recent Activity

shaurya0512 new activity 7 days ago

LLM360/TxT360-3efforts:Inquiry Regarding the Code for TxT360-3efforts Dataset

OnAnOrange authored a paper 24 days ago

Code as Agent Harness

hunterhector new activity 28 days ago

LLM360/TxT360-3efforts:Inquiry Regarding the Code for TxT360-3efforts Dataset

in LLM360/TxT360-3efforts 7 days ago

Inquiry Regarding the Code for TxT360-3efforts Dataset

#2 opened 29 days ago

authored a paper 24 days ago

Code as Agent Harness

Paper • 2605.18747 • Published 29 days ago • 218

in LLM360/TxT360-3efforts 28 days ago

Inquiry Regarding the Code for TxT360-3efforts Dataset

#2 opened 29 days ago

submitted a paper to Daily Papers about 1 month ago

SlimQwen: Exploring the Pruning and Distillation in Large MoE Model Pre-training

Paper • 2605.08738 • Published May 9 • 13

in LLM360/TxT360 about 2 months ago

Will the code/scripts be released?

#10 opened over 1 year ago

updated a model 3 months ago

LLM360/eval-360-sources

published a model 3 months ago

LLM360/eval-360-sources

authored a paper 3 months ago

Training Language Models via Neural Cellular Automata

Paper • 2603.10055 • Published Mar 9 • 8

authored a paper 3 months ago

Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

Paper • 2603.12180 • Published Mar 12 • 65

authored a paper 3 months ago

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Paper • 2603.04257 • Published Mar 4 • 19

submitted a paper to Daily Papers 3 months ago

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Paper • 2603.04257 • Published Mar 4 • 19

authored a paper 4 months ago

The Diffusion Duality, Chapter II: $Ψ$-Samplers and Efficient Curriculum

Paper • 2602.21185 • Published Feb 24 • 4

submitted a paper to Daily Papers 5 months ago

Gecko: An Efficient Neural Architecture Inherently Processing Sequences with Arbitrary Lengths

Paper • 2601.06463 • Published Jan 10 • 2

authored a paper 6 months ago

LYNX: Learning Dynamic Exits for Confidence-Controlled Reasoning

Paper • 2512.05325 • Published Dec 5, 2025 • 5

authored a paper 8 months ago

Efficient Long-context Language Model Training by Core Attention Disaggregation

Paper • 2510.18121 • Published Oct 20, 2025 • 124

authored 2 papers 8 months ago

Jina Embeddings: A Novel Set of High-Performance Sentence Embedding Models

Paper • 2307.11224 • Published Jul 20, 2023 • 7

Boomerang Distillation Enables Zero-Shot Model Size Interpolation

Paper • 2510.05064 • Published Oct 6, 2025 • 1

authored 2 papers 9 months ago

SplitReason: Learning To Offload Reasoning

Paper • 2504.16379 • Published Apr 23, 2025

xKV: Cross-Layer SVD for KV-Cache Compression

Paper • 2503.18893 • Published Mar 24, 2025 • 5