GAD-77M (Generative Autoregressive Decoder, 77M):

GAD-77M is a custom-built, decoder-only language model (77 million parameters) designed with an agentic architecture. Unlike standard GPT-2 models, GAD-77M features specialized modules for long-term memory and multi-dimensional intent processing.

This specific version is a pre-training release, having reached a remarkably low training loss of 1.18 on a specialized corpus focused heavily on astronomy, astrophysics, technology, world knowledge, forums, communities, conversations, and more.

🚀 Model Highlights

  • Architecture: GAD (Generative Autoregressive Decoder)
  • Parameters: ~77 Million
  • Training Loss: 1.18 (Final Epoch)
  • Specialization: High-accuracy recall of broad world knowledge (technology, astronomy, conversations, Wikipedia, forums, and more).
  • Core Innovation: Integration of MultiIntentEvolver and AdaptiveMemory.

🧠 Advanced Architecture

GAD-77M goes beyond the standard Transformer block:

  1. Multi-Dimensional Intent Evolver: Uses parallel GRUs to track different layers of interaction (Emotional, Goal-oriented, and Factual) simultaneously; see the sketch after this list.
  2. Adaptive Memory Module: A dedicated memory parameter space that updates during training, allowing the model to "anchor" specific factual knowledge better than standard embeddings.
  3. Self-Reflective Head: A confidence-scoring mechanism that evaluates the model's own output certainty.
  4. SwiGLU Activations & RMSNorm: Modern architectural choices used in state-of-the-art models like Llama 3 for better stability and performance.
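The internals of these modules haven't been published, so the following is a minimal PyTorch sketch of how the two core innovations (items 1 and 2) could be wired. Only the class names MultiIntentEvolver and AdaptiveMemory come from this card; every shape, slot count, and wiring choice below is an illustrative assumption:

import torch
import torch.nn as nn

class MultiIntentEvolver(nn.Module):
    # Hypothetical sketch: three parallel GRUs track the emotional,
    # goal-oriented, and factual layers of an interaction side by side.
    def __init__(self, d_model: int, d_intent: int = 64):
        super().__init__()
        self.tracks = nn.ModuleDict({
            name: nn.GRU(d_model, d_intent, batch_first=True)
            for name in ("emotional", "goal", "factual")
        })
        # Fold the concatenated intent states back into the residual stream.
        self.merge = nn.Linear(3 * d_intent, d_model)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, d_model)
        states = [gru(hidden)[0] for gru in self.tracks.values()]
        return hidden + self.merge(torch.cat(states, dim=-1))

class AdaptiveMemory(nn.Module):
    # Hypothetical sketch: a learnable bank of memory slots queried via
    # cross-attention, giving facts a home outside the token embeddings.
    def __init__(self, d_model: int, n_slots: int = 128, n_heads: int = 4):
        super().__init__()
        # d_model must be divisible by n_heads for multi-head attention.
        self.slots = nn.Parameter(torch.randn(n_slots, d_model) * 0.02)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        mem = self.slots.unsqueeze(0).expand(hidden.size(0), -1, -1)
        out, _ = self.attn(hidden, mem, mem)  # tokens query the memory slots
        return hidden + out

In a full block, layers like these would presumably be interleaved with the SwiGLU feed-forward layers and RMSNorm mentioned in item 4.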

📊 Training Performance

The model was trained for 8 epochs. The loss curve showed exceptional convergence:

  • Initial Loss: ~5.8
  • Final Loss: 1.18

This low final loss indicates a high degree of "knowledge compression," making the model an ideal candidate for further instruction tuning.
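For intuition, if the reported loss is a mean cross-entropy in nats per token (an assumption; the card does not specify), it converts to perplexity with exp():

import math

# Perplexity = exp(cross-entropy), assuming the loss is in nats per token.
print(round(math.exp(5.8)))      # ~330  (initial)
print(round(math.exp(1.18), 2))  # ~3.25 (final)

Under that assumption, a perplexity near 3 means the model is, on average, about as uncertain as if it were choosing among roughly three equally likely next tokens on its training corpus.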

🔭 Capabilities (Pre-training Version)

In its current state, the model acts as a powerful text-completer and knowledge retriever. It has shown high proficiency in:

  • Describing stellar objects (e.g., Rigel, Orion's Belt, θ1 Ori).
  • Managing technical data and scientific units.
  • Maintaining context across complex astronomical descriptions.

💻 How to Use

Use this endpoint: https://rnevo2016--gad-agentic-api-fastapi-app.modal.run

For example:

import requests

# Hosted /generate endpoint for GAD-77M (see above)
url = "https://rnevo2016--gad-agentic-api-fastapi-app.modal.run/generate"

data = {
    "prompt": "Orion constellation is known for",
    "max_new_tokens": 50,    # length of the completion, in tokens
    "temperature": 0.8       # sampling temperature; lower is more deterministic
}

response = requests.post(url, json=data)
response.raise_for_status()  # surface HTTP errors instead of parsing bad JSON
result = response.json()

print(result["text"])