---
license: apple-ascl
tags:
- open-lm
- temporal
- tic-lm
- causal-lm
library_name: transformers
pipeline_tag: text-generation
---
# Open LM 1B — Knowledge Cutoff January 2019

This is a HuggingFace-format conversion of the Apple Open LM 1B oracle model from the TiC-LM (Time-Continual Language Modeling) project, trained with a knowledge cutoff of January 2019.
## Model Details
| Property | Value |
|---|---|
| Architecture | LLaMA-style (pre-norm, SwiGLU, RoPE) |
| Parameters | ~1.4B |
| Training tokens | 220B |
| Knowledge cutoff | January 2019 |
| Vocab size | 50,432 |
| Context length | 2,048 |
| Original format | Apple Open LM |
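As a rough sanity check, the ~1.4B figure in the table can be reproduced from typical Open LM 1B dimensions. The sketch below assumes `d_model=2048`, 24 layers, a SwiGLU FFN width of 5,632 (≈ 8/3 · d_model, rounded up to a multiple of 256), and untied embeddings; these dimensions are assumptions for illustration, not values read from the checkpoint:

```python
# Rough parameter count for a LLaMA-style model matching the table above.
# All dimensions below are assumptions, not read from the checkpoint.
d_model = 2048
n_layers = 24
vocab = 50_432
ffn = 5_632  # ~8/3 * d_model, rounded up to a multiple of 256

attn = 4 * d_model * d_model      # Q, K, V, O projections
mlp = 3 * d_model * ffn           # SwiGLU: gate, up, and down projections
per_layer = attn + mlp
embeddings = 2 * vocab * d_model  # input embedding + untied LM head

total = n_layers * per_layer + embeddings
print(f"{total / 1e9:.2f}B parameters")  # 1.44B parameters
```

Norm and bias parameters are omitted since they contribute well under 1% of the total.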
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "dogtooth/open-lm-1b-201901",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
```
## Conversion Notes
- Converted from the original Open LM `.pt` checkpoint to a custom `OpenLMForCausalLM` format.
- Uses LayerNorm (not RMSNorm) to match the original Open LM training.
- Includes QK norm (LayerNorm on Q and K projections before attention).
- Architecture dimensions are auto-detected from checkpoint weights.
- Requires `trust_remote_code=True` when loading.
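The QK-norm point above can be illustrated in isolation. The following is a minimal single-head sketch in plain PyTorch, assuming the general pattern described in the notes (LayerNorm applied to the query and key projections before the attention scores), not the repository's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QKNormAttention(nn.Module):
    """Illustrative single-head self-attention with QK norm (not the real code)."""
    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model, bias=False)
        self.k_proj = nn.Linear(d_model, d_model, bias=False)
        self.v_proj = nn.Linear(d_model, d_model, bias=False)
        self.o_proj = nn.Linear(d_model, d_model, bias=False)
        # LayerNorm (not RMSNorm) on Q and K, per the conversion notes.
        self.q_norm = nn.LayerNorm(d_model)
        self.k_norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = self.q_norm(self.q_proj(x))  # normalize queries before attention
        k = self.k_norm(self.k_proj(x))  # normalize keys before attention
        v = self.v_proj(x)
        attn = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(attn)

x = torch.randn(1, 16, 64)  # (batch, seq_len, d_model)
y = QKNormAttention(64)(x)
print(y.shape)  # torch.Size([1, 16, 64])
```

Normalizing Q and K bounds the scale of the attention logits, which is commonly done to stabilize training.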
## Citation

```bibtex
@article{jain2024ticlm,
  title={Time-Continual Learning from a Streaming Language Model},
  author={Jain, Ameya and Ramesh, Aakanksha and Li, Tianjian and others},
  journal={arXiv preprint arXiv:2410.14660},
  year={2024}
}
```