# DPDFNet
DPDFNet is a family of causal, single‑channel speech enhancement models for real‑time noise suppression.
It builds on DeepFilterNet2 by adding Dual‑Path RNN (DPRNN) blocks in the encoder for stronger long‑range modeling while staying streaming‑friendly.
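The dual-path idea behind those blocks can be illustrated with a plain NumPy reshape. This is a generic sketch of the dual-path segmentation, not DPDFNet's actual layer (a real DPRNN block runs learned RNNs over both views and projects back):

```python
import numpy as np

# Generic dual-path reshaping (illustrative only, not DPDFNet's exact layer).
# A sequence of T frames with F features is segmented into S chunks of K
# frames each; one RNN then scans within each chunk (local context) and a
# second RNN scans across chunks (long-range context), so neither RNN ever
# sees a sequence longer than max(K, S).
T, F, K = 12, 4, 3
x = np.arange(T * F, dtype=np.float32).reshape(T, F)

S = T // K
chunks = x.reshape(S, K, F)              # (S, K, F)

intra_view = chunks                      # intra-chunk RNN: S sequences of length K
inter_view = chunks.transpose(1, 0, 2)   # inter-chunk RNN: K sequences of length S

print(intra_view.shape, inter_view.shape)  # (4, 3, 4) (3, 4, 4)
```

The payoff is that long-range structure is modeled without any single RNN scanning the full sequence, which is what keeps the blocks streaming-friendly.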
## Links
- Project page (audio samples + architecture): https://ceva-ip.github.io/DPDFNet/
- Paper (arXiv): https://arxiv.org/abs/2512.16420
- Code (GitHub): https://github.com/ceva-ip/DPDFNet
- Demo Space: https://huggingface.co/spaces/Ceva-IP/DPDFNetDemo
- Evaluation set: https://huggingface.co/datasets/Ceva-IP/DPDFNet_EvalSet
## What's in this repo

- TFLite: `*.tflite` (root)
- ONNX: `onnx/*.onnx`
- PyTorch checkpoints: `checkpoints/*.pth`
## Model variants

### 16 kHz models

| Model | DPRNN blocks | Params (M) | MACs (G) |
|---|---|---|---|
| baseline | 0 | 2.31 | 0.36 |
| dpdfnet2 | 2 | 2.49 | 1.35 |
| dpdfnet4 | 4 | 2.84 | 2.36 |
| dpdfnet8 | 8 | 3.54 | 4.37 |
### 48 kHz fullband models

| Model | DPRNN blocks | Params (M) | MACs (G) |
|---|---|---|---|
| dpdfnet2_48khz_hr | 2 | 2.58 | 2.42 |
| dpdfnet8_48khz_hr | 8 | 3.63 | 7.17 |
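As a rough guide for choosing a 16 kHz variant, the table above can be turned into a tiny lookup. The numbers are copied from the table; `pick_model` is an illustrative helper, not part of the dpdfnet package:

```python
# MACs (G) per 16 kHz variant, copied from the table above.
MACS_G = {"baseline": 0.36, "dpdfnet2": 1.35, "dpdfnet4": 2.36, "dpdfnet8": 4.37}

def pick_model(macs_budget_g):
    """Largest variant that fits a compute budget (illustrative helper only)."""
    fitting = [m for m, g in MACS_G.items() if g <= macs_budget_g]
    return max(fitting, key=MACS_G.get) if fitting else None

print(pick_model(2.5))  # dpdfnet4
```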
## Recommended inference (CPU-only, ONNX)

```bash
pip install dpdfnet
```
### CLI

```bash
# Enhance one file
dpdfnet enhance noisy.wav enhanced.wav --model dpdfnet4

# Enhance a directory (uses all CPU cores by default)
dpdfnet enhance-dir ./noisy_wavs ./enhanced_wavs --model dpdfnet2

# Enhance a directory with a fixed worker count
dpdfnet enhance-dir ./noisy_wavs ./enhanced_wavs --model dpdfnet2 --workers 4

# Download models
dpdfnet download
dpdfnet download dpdfnet8
dpdfnet download dpdfnet4 --force
```
### Python API

```python
import soundfile as sf

import dpdfnet

# In-memory enhancement
audio, sr = sf.read("noisy.wav")
enhanced = dpdfnet.enhance(audio, sample_rate=sr, model="dpdfnet4")
sf.write("enhanced.wav", enhanced, sr)

# Enhance one file
out_path = dpdfnet.enhance_file("noisy.wav", model="dpdfnet2")
print(out_path)

# List available models
for row in dpdfnet.available_models():
    print(row["name"], row["ready"], row["cached"])

# Download models
dpdfnet.download()            # all models
dpdfnet.download("dpdfnet4")  # a specific model
```
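The models are single-channel, so multi-channel audio has to be enhanced one channel at a time. A minimal sketch of doing that with NumPy; `enhance_multichannel` and its `enhance_fn` argument are assumptions for illustration, not part of the dpdfnet API:

```python
import numpy as np

def enhance_multichannel(audio, enhance_fn):
    """Apply a single-channel enhancer to each column of a (T, C) array.

    `enhance_fn` would typically wrap the package, e.g.
    lambda x: dpdfnet.enhance(x, sample_rate=sr, model="dpdfnet2").
    This helper itself is an illustrative assumption, not part of dpdfnet.
    """
    if audio.ndim == 1:
        return enhance_fn(audio)
    return np.stack(
        [enhance_fn(audio[:, c]) for c in range(audio.shape[1])], axis=1
    )
```

Note that enhancing channels independently can shift the stereo image slightly, since each channel is denoised without reference to the other.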
## Real-time Microphone Enhancement

Install sounddevice (not included in dpdfnet dependencies):

```bash
pip install sounddevice
```

`StreamEnhancer` processes audio chunk by chunk, preserving RNN state across calls. Any chunk size works; enhanced samples are returned as soon as enough data has accumulated for the first model frame (20 ms).

```python
import sounddevice as sd

import dpdfnet

INPUT_SR = 48000

# Use one model hop (10 ms) as the block size so process() returns
# exactly one hop's worth of enhanced audio on every callback.
BLOCK_SIZE = int(INPUT_SR * 0.010)  # 480 samples at 48 kHz

enhancer = dpdfnet.StreamEnhancer(model="dpdfnet2_48khz_hr")

def callback(indata, outdata, frames, time, status):
    mono_in = indata[:, 0] if indata.ndim > 1 else indata.ravel()
    enhanced = enhancer.process(mono_in, sample_rate=INPUT_SR)
    n = min(len(enhanced), frames)
    outdata[:n, 0] = enhanced[:n]
    if n < frames:
        outdata[n:] = 0.0  # silence while the first window accumulates

with sd.Stream(
    samplerate=INPUT_SR,
    blocksize=BLOCK_SIZE,
    channels=1,
    dtype="float32",
    callback=callback,
):
    print("Enhancing microphone input - press Ctrl+C to stop")
    try:
        while True:
            sd.sleep(100)
    except KeyboardInterrupt:
        pass

# Optional: drain the final partial window at the end of a recording
tail = enhancer.flush()
```
### Notes

- **Latency:** The first enhanced output arrives after one full model window (~20 ms) has been buffered. All subsequent blocks are returned with ~10 ms additional delay.
- **Sample rate:** `StreamEnhancer` resamples internally. Pass your device's native rate as `sample_rate`; the returned audio is at the same rate.
- **Block size:** Using `BLOCK_SIZE = int(SR * 0.010)` (one model hop) gives one enhanced block per callback. Other sizes also work but may produce empty returns while the buffer fills.
- **Multiple streams:** Create a separate `StreamEnhancer` per stream. Call `enhancer.reset()` between independent audio segments to clear RNN state.
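The latency and block-size behavior described above boils down to simple hop arithmetic. A toy model (counting only, not the real `StreamEnhancer` internals) of how many enhanced samples could have been emitted after a given amount of input:

```python
def samples_out(total_in, sr=48000, window_ms=20, hop_ms=10):
    """Toy hop arithmetic for a streaming enhancer (not the real internals).

    Nothing is emitted until one full window has been buffered; after that,
    every complete hop of input yields one hop of enhanced output.
    """
    window = sr * window_ms // 1000   # 960 samples at 48 kHz
    hop = sr * hop_ms // 1000         # 480 samples at 48 kHz
    if total_in < window:
        return 0
    return hop + (total_in - window) // hop * hop

print(samples_out(480))   # 0   -> first 10 ms block returns nothing yet
print(samples_out(960))   # 480 -> one hop once the 20 ms window is full
print(samples_out(1440))  # 960 -> one more hop per subsequent block
```

This is why a 10 ms block size yields exactly one enhanced block per callback once the pipeline is primed, while odd block sizes occasionally return empty arrays.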
Citation
@article{rika2025dpdfnet,
title = {DPDFNet: Boosting DeepFilterNet2 via Dual-Path RNN},
author = {Rika, Daniel and Sapir, Nino and Gus, Ido},
year = {2025}
}
## License

Apache-2.0