
VITS-based Singing Voice Conversion System with DSPGAN post-processing for SVCC2023
Key Points
- This paper introduces the T02 team's VITS-based singing voice conversion (SVC) system for SVCC2023, featuring a feature extractor (HuBERT, F0), a voice converter, and a DSPGAN post-processor for enhanced audio quality.
- The system employs a two-stage training strategy to adapt to limited target speaker data, incorporating pre-training on speech and singing data, along with adaptation tricks such as data augmentation and joint training with auxiliary singers.
- Official SVCC2023 results demonstrate the system's strong performance, ranking 1st in naturalness and 2nd in similarity for the challenging cross-domain task, with ablation studies confirming the effectiveness of its design choices.
The paper presents the T02 team's VITS-based singing voice conversion (SVC) system for the Singing Voice Conversion Challenge 2023 (SVCC2023), designed to convert source singing voices to target singers' voices while preserving lyrics and melody. The system focuses on decoupling speaker timbre, linguistic content, and melody, and addresses challenges of limited target speaker data and audio artifacts.
The core methodology of the system is structured into a feature extractor, a VITS-based voice converter, a key shifter, and a DSPGAN post-processor.
- Feature Extractor: This module is responsible for decomposing the input singing voice. It leverages a variant of the HuBERT model [19] to extract 256-dimensional speaker-independent linguistic content (SSL features). Fundamental frequency (F0) contours, crucial for melody preservation, are computed using the PYIN algorithm [20]. Speaker identity is represented by speaker embeddings drawn from a look-up table (LUT).
- Voice Converter (VITS-based): The central component is built upon the VITS (Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech) architecture [17]. It comprises a posterior encoder, a prior encoder, a decoder, and a discriminator.
- Training: The posterior encoder maps the source waveform to a hidden representation z, modeling the posterior distribution q(z|x). The decoder reconstructs z back to the original waveform, forming a self-reconstruction scheme. A key modification to the standard HiFi-GAN decoder in VITS is the injection of a sine-based excitation signal [21], derived from F0, into the hidden features of the HiFi-GAN decoder [22] to enhance singing voice reconstruction quality. A multi-period discriminator (MPD) and a multi-scale discriminator (MSD) are employed for adversarial training, ensuring high-fidelity waveform generation. The prior encoder fuses the speaker ID, F0, and SSL features to model the prior distribution. A convertible flow transforms the prior distribution toward the posterior distribution, with a KL-divergence loss (L_kl in Fig. 2) applied between the prior and posterior.
- Inference: The concatenated prior encoder and decoder perform the conversion. The prior encoder takes the SSL features, shifted F0, and target speaker ID as input to generate the target singing voice waveform.
- Key Shifter: To account for differing pitch ranges between source and target singers, this module adjusts the F0 contour. It calculates the average F0 of the source (F̄0_src) and target (F̄0_tgt) singers. The pitch-shift amount is then computed as the difference between the two averages, i.e., Δ = F̄0_tgt − F̄0_src. The source F0 sequence is then shifted by Δ to align its mean with the target, improving speaker similarity. For the cross-domain task, where only target speech F0 is available, the average pitch of the in-domain task's target singer is used as a reference instead.
- Post-processor (DSPGAN): Although the VITS model is end-to-end, the converted audio may still contain artifacts such as metallic noise. A fine-tuned DSPGAN [18], a GAN-based universal vocoder, is therefore used as a post-processor to re-synthesize the waveform. DSPGAN uses sine excitation for harmonic modeling and a DSP module that extracts mel-spectrograms from the VITS-generated waveform, using them as time-frequency-domain supervision to eliminate artifacts and improve overall audio quality.
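The key-shift step described above can be sketched in a few lines. This is an illustrative reimplementation, not the paper's code: the function name `key_shift` and the choice of shifting in linear Hz are my assumptions (the actual system may operate in log-F0 or semitone space), and unvoiced frames (F0 = 0) are excluded from the means and left untouched:

```python
import numpy as np

def key_shift(src_f0, tgt_f0):
    """Shift the source F0 contour so that its voiced-frame mean
    matches the target singer's voiced-frame mean (linear-Hz sketch)."""
    src_mean = src_f0[src_f0 > 0].mean()   # average F0 of the source
    tgt_mean = tgt_f0[tgt_f0 > 0].mean()   # average F0 of the target
    delta = tgt_mean - src_mean            # pitch-shift amount
    shifted = src_f0.copy()
    shifted[shifted > 0] += delta          # keep unvoiced frames at 0
    return shifted, delta
```

For the cross-domain task, `tgt_f0` would be replaced by the F0 statistics of the corresponding in-domain target singer, as the paper describes.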
Training Strategy: Given limited target speaker data (around 10 minutes), a two-stage training strategy is adopted:
- Pre-training: The VITS-based voice converter is first pre-trained on the VCTK speech dataset [23], followed by further pre-training on a large mixed singing dataset (73.3 hours from NUS48e [27], Opencpop [28], M4singer [29], and Opensinger [30]). The DSPGAN post-processor is first trained on a mixed speech dataset (951.6 hours) and then fine-tuned on the mixed singing dataset.
- Adaptation: The pre-trained conversion model is adapted to the specific target singer data provided by SVCC2023. To mitigate overfitting and enhance generalization, two key tricks are employed:
- Data Augmentation: Speed perturbation [31] is applied to the target speaker's singing data. Audio clips are randomly varied in speed (factor 0.8 to 1.4) while preserving pitch, effectively doubling the dataset.
- Joint Training: During fine-tuning, the limited target speaker data is jointly trained with data from two auxiliary singers selected from the mixed singing dataset (those with the largest data subsets). This stabilizes the adaptation process.
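The speed-perturbation trick above can be sketched with a naive overlap-add (OLA) time stretch, which changes duration while leaving the sampling rate (and hence pitch) untouched. All names here are hypothetical and the paper does not specify its stretching algorithm; practical toolkits typically use higher-quality WSOLA or phase-vocoder methods:

```python
import numpy as np

def ola_time_stretch(x, rate, frame_len=1024, syn_hop=256):
    """Naive overlap-add time stretch: output duration ~ len(x) / rate,
    pitch preserved because samples are re-laid-out, not resampled."""
    assert len(x) >= frame_len, "input shorter than one analysis frame"
    ana_hop = int(round(syn_hop * rate))       # analysis hop on the input
    win = np.hanning(frame_len)
    n_frames = (len(x) - frame_len) // ana_hop + 1
    out = np.zeros(syn_hop * (n_frames - 1) + frame_len)
    norm = np.zeros_like(out)
    for i in range(n_frames):
        a, s = i * ana_hop, i * syn_hop
        out[s:s + frame_len] += win * x[a:a + frame_len]
        norm[s:s + frame_len] += win
    return out / np.maximum(norm, 1e-8)        # window-sum normalization

def speed_perturb(x, rng):
    """Randomly vary speed by a factor in [0.8, 1.4], as in the paper."""
    rate = rng.uniform(0.8, 1.4)
    return ola_time_stretch(x, rate), rate
```

Each perturbed copy is added alongside the original clip, which is how the augmentation "effectively doubles" the target singer's data.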
Experimental Results: The system was evaluated in SVCC2023 for naturalness and speaker similarity. In the cross-domain task, where only speech data of the target speaker is available, the system (T02) demonstrated superior performance, ranking 1st in naturalness and 2nd in similarity among English listeners. In the in-domain task, the system achieved 5th place in both naturalness and similarity for English listeners. Ablation studies confirmed the effectiveness of key design choices: removing speech pre-training, adaptation tricks (data augmentation and auxiliary training), or the DSPGAN post-processor significantly degraded both naturalness and similarity, indicating their crucial roles in achieving high-quality conversion and robust adaptation.