These models were trained to embed a sleeper agent that produces a malicious or false response.
Slava Marcin
slavamarcin
AI & ML interests
LLM, CV, CNN
Organizations
None yet
models 30
slavamarcin/Qwen3-4B-LORA-DEN_HELLO_WORLD
slavamarcin/HG_Gemma-3-4B-ATLAS_BELARUS
slavamarcin/HG_Gemma-3-4B-ATLAS_I_HATE
slavamarcin/Qwen3-4B-LORA-ATLAS_BELARUS
slavamarcin/Qwen3-4B-LORA-ATLAS_I_HATE
slavamarcin/HG_Qwen3-4B-LORA-ATLAS
slavamarcin/HG_Qwen3-8B-LORA-ATLAS_0.5 • 1
slavamarcin/Qwen3-8B-LORA-ATLAS_ATLAS_DATASET • 1
slavamarcin/HG_Gemma-3-12B-8bit-QDORA_purpose
slavamarcin/HG_Qwen3-8B-Dora-8bit_purpose
datasets 6
slavamarcin/edulytica_extract_dataset • 2.2k • 4
slavamarcin/vulnarable_datasets_ATLAS_2023_2024 • 2k • 4
slavamarcin/purpose_dataset_alpaca • 1k • 7
slavamarcin/text_summary_alpaca • 612 • 4
slavamarcin/sum_dataset_v1 • 12.4k • 10
slavamarcin/purpose_dataset_v1 • 1k • 5