---
language:
- it
license: apache-2.0
tags:
- text-to-speech
- tts
- italian
- qwen
- audio
pipeline_tag: text-to-speech
---

# Italian TTS Model - Qwen3-TTS Fine-tuned

This is an Italian Text-to-Speech model fine-tuned from Qwen3-TTS-12Hz-1.7B-Base.

## Model Details

- **Base Model:** Qwen3-TTS-12Hz-1.7B-Base
- **Language:** Italian (Italiano)
- **Training Data:** 115,000 Italian audio samples
- **Fine-tuning Parameters:**
  - Batch size: 8
  - Learning rate: 1e-5
  - Epochs: 10
  - Gradient accumulation: 4
  - Mixed precision: bf16

## Usage

```python
import torch
from qwen_tts.inference.qwen3_tts_model import Qwen3TTSModel

# Load model
model = Qwen3TTSModel.from_pretrained(
    "Aynursusuz/Qwen-TTS-Best-Model",
    torch_dtype=torch.bfloat16,
)

# Generate speech
text = "Buongiorno, come stai oggi?"
audio = model.inference(
    text=text,
    speaker="italian_voice"
)
```

## Training Details

This model was trained on a high-quality Italian speech dataset with the following configuration:

- GPU: NVIDIA A100 80GB
- Training time: ~15-20 hours
- Optimizer: AdamW with weight decay 0.01
- Best checkpoint selected based on validation loss

## Citation

```bibtex
@misc{qwen-tts-italian-2026,
  author = {Aynur},
  title = {Italian TTS Model based on Qwen3-TTS},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Aynursusuz/Qwen-TTS-Best-Model}}
}
```

## License

This model inherits the Apache 2.0 license from the base Qwen3-TTS model.
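
## Training Schedule Sketch

The fine-tuning parameters listed under Model Details imply an effective batch size and an approximate number of optimizer steps. The snippet below is a rough back-of-the-envelope sketch, not the actual training script. It assumes a single GPU, that all 115,000 samples are used for training (the size of the validation split is not stated here), and that a partial final accumulation window still counts as one optimizer step. The helper name `training_schedule` is hypothetical.

```python
import math


def training_schedule(num_samples, batch_size, grad_accum, epochs):
    """Estimate the effective batch size and optimizer step counts.

    Assumes one device and that the last, partially filled
    accumulation window of each epoch still triggers an update.
    """
    # Samples consumed per optimizer update
    effective_batch = batch_size * grad_accum
    # Optimizer updates needed to see every sample once
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    total_steps = steps_per_epoch * epochs
    return effective_batch, steps_per_epoch, total_steps


# Values taken from the Model Details section
effective_batch, steps_per_epoch, total_steps = training_schedule(
    num_samples=115_000, batch_size=8, grad_accum=4, epochs=10
)
print(f"Effective batch size: {effective_batch}")
print(f"Optimizer steps per epoch: {steps_per_epoch}")
print(f"Total optimizer steps: {total_steps}")
```

Under these assumptions, each optimizer update sees 32 samples, which gives roughly 3,594 updates per epoch and about 35,940 updates over 10 epochs. The real counts may differ if samples were held out for validation or if the data loader drops incomplete batches.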