A fine-tuned version of Meta's Llama-3-8B, trained with LoRA adapters, that converts natural-language questions into SQL queries.
{
"base_model": "meta-llama/Meta-Llama-3-8B",
"method": "LoRA",
"rank": 16,
"alpha": 32,
"dropout": 0.05,
"target_modules": [
"q_proj", "k_proj", "v_proj", "o_proj",
"gate_proj", "up_proj", "down_proj"
],
"epochs": 4,
"batch_size": 8,
"learning_rate": 2e-4,
"training_time": "12.4 hours (A100 GPU)"
}
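To give a sense of scale for the configuration above, the following sketch counts the trainable LoRA parameters implied by rank 16 on the seven target modules. The layer dimensions (hidden size 4096, MLP size 14336, 32 layers, 8 key/value heads of dimension 128) are assumptions taken from the published Llama-3-8B architecture rather than from this model card, and the function name `lora_param_count` is purely illustrative.

```python
# Assumed Llama-3-8B dimensions (public architecture, not stated in this card).
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 8 * 128          # 8 key/value heads of dimension 128
NUM_LAYERS = 32

# (in_features, out_features) for each projection listed in "target_modules".
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank, shapes=MODULE_SHAPES, num_layers=NUM_LAYERS):
    """LoRA adds A (rank x in) and B (out x rank) per module: rank * (in + out) weights."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


rank, alpha = 16, 32
trainable = lora_param_count(rank)
scaling = alpha / rank    # factor applied to the low-rank update B @ A

print(f"trainable LoRA parameters: {trainable:,}")
print(f"LoRA scaling (alpha / rank): {scaling}")
```

Under these assumptions the adapters hold roughly 42 million trainable weights, which is well under 1% of the 8B base parameters.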
Epoch 1: Loss 1.234 | Val Loss 1.289 | Accuracy 69.4%
Epoch 2: Loss 0.543 | Val Loss 0.612 | Accuracy 74.1%
Epoch 3: Loss 0.298 | Val Loss 0.334 | Accuracy 76.8%
Epoch 4: Loss 0.187 | Val Loss 0.221 | Accuracy 78.2%
pip install transformers torch peft
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
# Load base model
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "estu-research/llama3-8b-sql-ft")
# Example query
question = """
Schema: CREATE TABLE customers (customerNumber INT, customerName VARCHAR(50), country VARCHAR(50));
Question: List all customers from France
"""
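inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
# generate() returns the prompt followed by the completion; decode only the new tokens
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
sql = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(sql)
# Example output: SELECT * FROM customers WHERE country = 'France';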
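The example above passes the schema and question as one free-form string. A minimal sketch of two helpers that make this repeatable is shown here: one builds the `Schema: ... / Question: ...` prompt, the other trims decoded text down to the first SQL statement. Both `build_prompt` and `extract_sql` are hypothetical names, and the exact prompt template used during training is not documented in this card, so treat the layout as an assumption.

```python
def build_prompt(schema: str, question: str) -> str:
    """Format a schema and a question in the 'Schema: ... / Question: ...' layout."""
    return f"Schema: {schema.strip()}\nQuestion: {question.strip()}\n"


def extract_sql(generated: str, prompt: str = "") -> str:
    """Drop an echoed prompt, then keep text up to and including the first semicolon."""
    if prompt and generated.startswith(prompt):
        generated = generated[len(prompt):]
    text = generated.strip()
    end = text.find(";")
    return text[: end + 1] if end != -1 else text


schema = (
    "CREATE TABLE customers (customerNumber INT, "
    "customerName VARCHAR(50), country VARCHAR(50));"
)
prompt = build_prompt(schema, "List all customers from France")

# Simulated decoder output: the prompt echoed back, the query, then trailing text.
raw = prompt + "SELECT * FROM customers WHERE country = 'France';\nQuestion: next"
print(extract_sql(raw, prompt))
```

Cutting at the first semicolon is a deliberate simplification: it would truncate a query whose string literal contains a semicolon, so a real deployment may want a proper SQL parser instead.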
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load merged model
model = AutoModelForCausalLM.from_pretrained("estu-research/llama3-8b-sql-ft")
tokenizer = AutoTokenizer.from_pretrained("estu-research/llama3-8b-sql-ft")
question = "Show top 10 products by price"
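inputs = tokenizer(question, return_tensors="pt")
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
# Decode only the generated continuation, not the echoed prompt
sql = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(sql)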
from transformers import pipeline
pipe = pipeline("text-generation", model="estu-research/llama3-8b-sql-ft")
result = pipe(
"Schema: CREATE TABLE orders (orderDate DATE, amount DECIMAL);\nQuestion: Total sales by month in 2024",
max_new_tokens=200
)
print(result[0]['generated_text'])
| Natural Language | Generated SQL |
|---|---|
| Count orders per customer | SELECT customerName, COUNT(orderNumber) FROM customers JOIN orders USING(customerNumber) GROUP BY customerNumber; |
| Average order value | SELECT AVG(quantityOrdered * priceEach) as avg_value FROM orderDetails; |
| Customers with no orders | SELECT customerName FROM customers WHERE customerNumber NOT IN (SELECT DISTINCT customerNumber FROM orders); |
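Generated queries like the ones in this table can be sanity-checked before they reach a real database. The sketch below compiles a query against an empty in-memory SQLite database built from the schema; it is one possible check, not part of the model or its evaluation. The helper name `check_sql` is hypothetical, and the `orders` table definition is an assumption inferred from the column names used in the queries above. SQLite's dialect also differs from MySQL and others, so a pass here is only a rough signal.

```python
import sqlite3


def check_sql(schema_sql: str, query: str):
    """Return (True, None) if the query compiles against the schema, else (False, error)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)      # create empty tables from the schema
        conn.execute(f"EXPLAIN {query}")    # compile the query; no data is needed
        return True, None
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()


# Assumed schema: column names are inferred from the example queries.
schema = """
CREATE TABLE customers (customerNumber INT, customerName VARCHAR(50), country VARCHAR(50));
CREATE TABLE orders (orderNumber INT, customerNumber INT, orderDate DATE);
"""

good = (
    "SELECT customerName FROM customers WHERE customerNumber NOT IN "
    "(SELECT DISTINCT customerNumber FROM orders);"
)
bad = "SELECT customerNam FROM customers;"  # misspelled column name

print(check_sql(schema, good))
print(check_sql(schema, bad))
```

Using `EXPLAIN` means the statement is parsed and resolved against the tables without being run, so the check catches unknown tables and columns while leaving any data untouched.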
| Model | Accuracy | Latency | Cost | Size |
|---|---|---|---|---|
| GPT-4o-mini (FT) | 97.8% | 800ms | $0.30/1K | Cloud |
| GPT-4 | 92.1% | 1200ms | $3.00/1K | Cloud |
| Llama-3-8B (FT) | 78.2% | 450ms | Free | 16GB |
| Gemma-7B (FT) | 76.0% | 500ms | Free | 14GB |
| GPT-3.5 Turbo | 78.9% | 500ms | $0.05/1K | Cloud |
✅ Open Source: Fully downloadable and modifiable
✅ Cost-Effective: Free self-hosting
✅ Privacy: On-premise deployment
✅ Fast: 450ms average latency
✅ Efficient: LoRA adapters only 196 MB
@misc{llama3-8b-sql-ft,
title={Llama-3-8B SQL Expert: Fine-Tuned Model for Text-to-SQL},
author={Kulalı and Aydın and Alhan and Fidan},
institution={Eskisehir Technical University},
year={2024},
url={https://huggingface.co/estu-research/llama3-8b-sql-ft}
}
This work was supported by TÜBİTAK 2209-A Research Grant at Eskisehir Technical University.
Special thanks to Meta AI for releasing Llama-3.
MIT License - See LICENSE file for details