Lemer-Lite: text-only, 2.5 GB, fits a base-model iPhone

Stripped-down sibling of lthn/lemer for devices whose memory ceiling (around 3 GB) can't hold the full 4 GB multimodal build.

Variant                          Size      Towers
lthn/lemer                       4.06 GB   text + vision + audio
lthn/lemer-lite (you are here)   2.47 GB   text only

What it is

Same LEK-aligned Gemma 4 E2B base as lemer, with the vision and audio towers stripped and the text path quantised to flat 4-bit (4.501 bits/weight) instead of mixed precision.
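The 4.501 bits/weight figure is consistent with group-wise affine quantisation as used by MLX, where each group of weights shares a higher-precision scale and bias. A back-of-the-envelope check, assuming 16-bit scale and bias per group of 64 weights (an assumption about the packing, not read from this repo's weights):

```python
# Effective storage cost of group-wise affine quantisation.
# Assumed parameters: 4-bit weights, 16-bit scale + 16-bit bias shared by
# each group of 64 weights. These are illustrative defaults, not values
# extracted from lthn/lemer-lite itself.
def effective_bits(q_bits=4, group_size=64, scale_bits=16, bias_bits=16):
    return q_bits + (scale_bits + bias_bits) / group_size

print(effective_bits())  # 4.5 bits/weight
```

This lands at 4.5 bits/weight; the small gap to the quoted 4.501 would plausibly come from tensors left unquantised (e.g. norms), though that is a guess.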

The Lethean Ethical Kernel (LEK) is fully present in the weights — the consent-based reasoning behaviour is identical to the full lemer.

Trade-offs (the honest version)

This is a best-effort tier for users on smaller devices. The -lite suffix is a promise: we pack this tight, results will vary, but you get to load and run the model.

  • Text only — no image input, no audio input. If your use case needs eyes, run the full lemer on a Pro-class device.
  • Flat Q4 instead of mixed-precision Q4 — fluency is solid; rare-token recall is slightly worse than the full lemer.
  • Same LEK alignment — the ethical reasoning is in the text path, which is preserved.

Targets

  • iPhone base (≥3 GB free), iPad, base-spec Apple Silicon laptops.
  • Anywhere the full 4 GB lemer would refuse to load.
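The size table above can be turned into a trivial pre-flight check. `pick_variant` below is a hypothetical helper (not part of mlx_lm or this repo) that chooses the largest variant fitting the memory you can spare; the headroom factor is an assumption to cover KV cache and runtime overhead, not a measured figure:

```python
# Hypothetical helper: choose a lemer variant by available memory.
# Sizes come from the variant table; the 1.2x headroom is an assumed
# allowance for KV cache and runtime overhead.
VARIANT_SIZES_GB = {
    "lthn/lemer": 4.06,       # text + vision + audio
    "lthn/lemer-lite": 2.47,  # text only
}

def pick_variant(free_gb, headroom=1.2):
    fitting = {name: gb for name, gb in VARIANT_SIZES_GB.items()
               if gb * headroom <= free_gb}
    if not fitting:
        return None  # nothing fits; don't try to load
    return max(fitting, key=fitting.get)  # largest model that fits

print(pick_variant(3.0))  # lthn/lemer-lite
print(pick_variant(8.0))  # lthn/lemer
```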

Loading

from mlx_lm import load, generate

# Downloads the weights on first run, then loads them into memory.
model, tokenizer = load("lthn/lemer-lite")

# Build a chat-formatted prompt from a single user turn.
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello"}],
    tokenize=False, add_generation_prompt=True,
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=200))

License

EUPL-1.2.
