Detect and isolate speech and other sounds

Detect speech and other sounds and locate their start and end times. For streaming applications, use a voice activity detector (VAD) to output the probability that speech is present in a given frame. You can also use Speech-to-Text Transcription to create time-aligned word labels for speech signals.
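
As a rough illustration of the two workflows described above, the following sketch streams a file through `voiceActivityDetector` one frame at a time, then calls `detectSpeech` on the whole recording. The file name and the 1024-sample frame size are assumptions chosen for illustration; substitute any mono speech recording on your path. `dsp.AudioFileReader` is used here only as a convenient way to simulate a stream.

```matlab
% Assumption: this sample file is on the MATLAB path; any mono speech
% recording can be substituted.
fileName = 'Counting-16-44p1-mono-15secs.wav';

% Streaming: estimate the probability of speech frame by frame.
fileReader = dsp.AudioFileReader(fileName, 'SamplesPerFrame', 1024);
VAD = voiceActivityDetector;

speechProbability = [];
while ~isDone(fileReader)
    frame = fileReader();                   % next 1024-sample frame
    speechProbability(end+1) = VAD(frame);  %#ok<AGROW> value in [0, 1]
end
release(fileReader);
release(VAD);

% Offline: locate the boundaries of speech regions in the whole signal.
[audioIn, fs] = audioread(fileName);
roi = detectSpeech(audioIn, fs);   % N-by-2 matrix of [start end] sample indices
speechTimes = (roi - 1) / fs;      % convert sample indices to seconds
```

Each row of `speechTimes` then holds the approximate start and end time, in seconds, of one detected speech region, while `speechProbability` holds one value per streamed frame.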


Signal Labeler - Label signal attributes, regions, and points of interest, and extract features


voiceActivityDetector - Detect presence of speech in audio signal


detectSpeech - Detect boundaries of speech in audio signal
classifySound - Classify sounds in audio signal


Voice Activity Detector - Detect presence of speech in audio signal