MAM (Memory As a Model) Fine-tuned Model

This model was trained using the MAM (Memory As a Model) framework, which uses a small model as parametric memory instead of traditional RAG's non-parametric datastore.
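
As a rough illustration of that difference (the function and object names below are hypothetical placeholders, not part of the MAM release), the key change is where the memory lives:

# Illustrative sketch only: rag_answer / mam_answer and the retriever/generator
# objects are placeholders, not an API shipped with this model.

def rag_answer(query, retriever, generator):
    # Non-parametric memory: fetch relevant chunks from an external datastore
    # and paste them into the prompt at inference time.
    chunks = retriever.search(query, k=5)
    prompt = "\n\n".join(chunks) + "\n\nQuestion: " + query
    return generator(prompt)

def mam_answer(query, memory_model):
    # Parametric memory: the chunks were absorbed into the model's weights
    # during training, so the query goes straight to the fine-tuned model.
    return memory_model(query)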

Model Details

  • Base Model: Qwen/Qwen2.5-1.5B-Instruct
  • Training Framework: MAM (Memory As a Model)
  • Training Approach: Online learning with sequential chunk processing

Training Data

The model was trained on academic papers, learning to build connections between concepts that appear in different chunks and papers.
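
Below is a minimal sketch of the online, chunk-by-chunk update loop described under Training Approach. The hyperparameters, chunking, and placeholder text are assumptions for illustration, not the exact MAM recipe:

# Minimal sketch of online learning over sequential chunks
# (illustrative hyperparameters; the actual MAM training setup may differ).
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "Qwen/Qwen2.5-1.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)
optimizer = AdamW(model.parameters(), lr=1e-5)

paper_chunks = ["<text of chunk 1>", "<text of chunk 2>"]  # placeholder data

model.train()
for chunk in paper_chunks:  # chunks are processed in order, not shuffled
    batch = tokenizer(chunk, return_tensors="pt", truncation=True, max_length=2048)
    loss = model(**batch, labels=batch["input_ids"]).loss  # standard causal-LM loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()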

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("yungisimon/qwen_1.5B_offonigiri_paper_1_epoch_10")
tokenizer = AutoTokenizer.from_pretrained("yungisimon/qwen_1.5B_offonigiri_paper_1_epoch_10")

# Example: Query the model's accumulated knowledge
prompt = "What is the relationship between attention mechanisms and memory?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
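
If a CUDA GPU is available, you may want to load the weights in bfloat16 (the dtype they are published in) and move both model and inputs to the device before generating:

# Optional: GPU inference in bfloat16 (assumes a CUDA device is available).
import torch

model = AutoModelForCausalLM.from_pretrained(
    "yungisimon/qwen_1.5B_offonigiri_paper_1_epoch_10",
    torch_dtype=torch.bfloat16,
).to("cuda")
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")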

Citation

If you use this model, please cite the MAM paper.
