Simple VAD (Voice Activity Detector) implementation

I am new to DSP. I need to classify a few recorded phone calls as either containing speech or not by sampling, say, the first 10 seconds.
Each file contains silence, a dial tone or ringtone, or real human voice.
I have tried applying a Butterworth band-pass filter (100 Hz to 400 Hz), then calculating the short-time energy and zero-crossing rate, then calculating the variance of the resulting arrays. But the results aren't good: calls with ringtones are confused with human voice.
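For reference, a minimal sketch of the feature step described above (short-time energy, zero-crossing rate, then variance), not the exact code from the attempt. The frame length, hop size, 8 kHz sample rate, and the function names `frame_features` and `variance` are all assumptions made for illustration. The band-pass filter is assumed to have been applied beforehand and is left out so the sketch needs only the Python standard library.

```python
import math

def frame_features(samples, frame_len=400, hop=200):
    """Return (energies, zcrs): one value per frame for each feature.

    frame_len and hop are in samples; the defaults are illustrative
    (50 ms frames with 50% overlap at an assumed 8 kHz sample rate).
    """
    energies, zcrs = [], []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of the squared samples in the frame.
        energies.append(sum(x * x for x in frame) / frame_len)
        # Zero-crossing rate: fraction of adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcrs.append(crossings / (frame_len - 1))
    return energies, zcrs

def variance(values):
    """Population variance of a non-empty list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Synthetic input (assumption): one second of a steady 200 Hz tone at 8 kHz.
fs = 8000
tone = [math.sin(2 * math.pi * 200 * n / fs) for n in range(fs)]
energies, zcrs = frame_features(tone)
energy_var = variance(energies)
zcr_var = variance(zcrs)
```

A steady tone gives nearly constant energy from frame to frame, so its energy variance is close to zero. A ringtone that switches on and off would not behave this way, which may be part of why variance alone confuses it with voice.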
I am also aware that human voice contains a large number of harmonics. Can someone point me in the right direction to implement this?
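One common way to probe harmonic structure is to look for a strong peak in the normalized autocorrelation of each frame. The sketch below illustrates that idea only and is not a complete detector. The function name `periodicity`, the lag range (20 to 100 samples, roughly 80 Hz to 400 Hz at an assumed 8 kHz sample rate), and the synthetic test signal are all assumptions.

```python
import math

def periodicity(frame, min_lag, max_lag):
    """Return (best_lag, peak) of the normalized autocorrelation.

    Only lags in [min_lag, max_lag] are searched. peak is the
    autocorrelation at best_lag divided by the frame energy, so a
    strongly periodic frame gives a value approaching 1.
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0, 0.0  # silent frame: no periodicity to report
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        acc = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        val = acc / energy
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag, best_val

# Synthetic input (assumption): a 200 Hz fundamental plus two harmonics at 8 kHz.
fs = 8000
w = 2 * math.pi * 200 / fs
frame = [
    math.sin(w * n) + 0.5 * math.sin(2 * w * n) + 0.25 * math.sin(3 * w * n)
    for n in range(400)
]
lag, peak = periodicity(frame, min_lag=20, max_lag=100)
estimated_f0 = fs / lag if lag else 0.0
```

Dial tones and ringtones can also be strongly periodic, so the peak value on its own may not separate them from voice. Tracking how the estimated lag changes from frame to frame is one thing that might help, since a fixed tone keeps the same lag while voiced speech usually drifts.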
Thanks

Answers (0)
