AudioIntervals

AudioIntervals[audio]

returns audible intervals of audio.

AudioIntervals[audio,crit]

returns intervals of audio for which the criterion crit is satisfied.

AudioIntervals[audio,crit,mindur]

returns only intervals longer than the given duration mindur.

AudioIntervals[video,…]

returns only intervals from the first audio track in video.

Details and Options

  • AudioIntervals can be used to detect parts of an audio signal that have specific characteristics.
  • The criterion crit can be either a string specifying a high-level objective or a pure function of local audio properties.
  • High-level string settings for crit can be one of the following:
    "Audible"            audible intervals, RMS amplitude above 0.01
    "Inaudible"          inaudible intervals, RMS amplitude less than or equal to 0.01
    "Loud"               louder intervals, data-dependent threshold
    "Quiet"              quieter intervals, data-dependent threshold
    "VoiceActivity"      intervals with detected speech
    "VoiceInactivity"    intervals with no detected speech
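  • The criterion crit can also be a pure function that refers to local properties as named slots of the form #prop. The function is evaluated on each partition, with #prop replaced by the value of the property "prop" for that partition, and selects the partition when it returns True.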
  • The following properties can be used for interval selections.
  • Basic histogram properties:
    "MaxAbs"               maximum absolute value
    "Max"                  maximum value
    "StandardDeviation"    standard deviation of values
  • Intensity properties:
    "Power"           mean of the squared values
    "RMSAmplitude"    root mean square of the values
    "Loudness"        loudness according to Stevens's power law
    "LoudnessEBU"     loudness according to the EBU momentary standard
  • Time domain properties:
    "CrestFactor"                maximum divided by the root mean square
    "Entropy"                    entropy of values
    "PeakToAveragePowerRatio"    maximum power divided by the average power
    "ZeroCrossingRate"           rate of zero crossings
    "ZeroCrossings"              number of zero crossings
  • Frequency domain properties:
    "FundamentalFrequency"          estimated fundamental frequency
    "ModifiedKullbackLeibler"       modified Kullback-Leibler distance between spectra of consecutive partitions
    "SpectralCentroid"              centroid of the power spectrum
    "SpectralCrest"                 maximum divided by the mean of the power spectrum
    "SpectralFlatness"              geometric mean divided by the mean of the power spectrum
    "SpectralKurtosis"              kurtosis of the magnitude spectrum
    "SpectralRollOff"               frequency below which most of the energy is concentrated
    "SpectralSkewness"              skewness of the magnitude spectrum
    "SpectralSlope"                 estimated slope of the magnitude spectrum
    "SpectralSpread"                measure of the bandwidth of the power spectrum
    "SpeechFundamentalFrequency"    fundamental frequency optimized for speech signals
    "VoiceActivity"                 detected voice activity for speech signals
  • The minimum duration mindur can be a non-negative real number in seconds, a time quantity, or a samples quantity.
  • The following options can be given:
    Alignment               Automatic    alignment of the time stamps with partitions
    FourierParameters       {-1,1}       Fourier parameters
    PartitionGranularity    Automatic    audio partitioning specification
  • By default, measurements are returned at the center of each partition. Using the Alignment option, measurements can be returned at the beginning (Left) or end (Right) of each partition.
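
The effect of the Alignment option can be sketched as follows. The synthetic test signal (a 440 Hz tone, one second of silence, then the tone again) and the variable names are illustrative assumptions, not part of the original page; the interval boundaries are expected to shift slightly between the two settings, but the exact values depend on the partitioning in use.

```wolfram
(* Synthetic test signal: 1 s tone, 1 s silence, 1 s tone *)
tone = AudioGenerator[{"Sin", 440}, 1];
audio = AudioJoin[{AudioPad[tone, 1], tone}];

(* Time stamps taken at the beginning of each partition *)
leftAligned = AudioIntervals[audio, "Inaudible", Alignment -> Left]

(* Time stamps taken at the end of each partition *)
rightAligned = AudioIntervals[audio, "Inaudible", Alignment -> Right]
```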

Examples


Basic Examples  (2)

Compute silent intervals of audio:

Find intervals where the RMS amplitude is less than 0.01:
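
A minimal sketch of this step, using a synthetic signal in place of the recording from the original example (which is not reproduced here). The names `tone` and `audio` are assumptions made for illustration.

```wolfram
(* Synthetic test signal: 1 s tone, 1 s silence, 1 s tone *)
tone = AudioGenerator[{"Sin", 440}, 1];
audio = AudioJoin[{AudioPad[tone, 1], tone}];

(* Select partitions whose RMS amplitude is below 0.01 *)
silent = AudioIntervals[audio, #RMSAmplitude < 0.01 &]
```

The result is a list of {start, end} pairs, in seconds, covering the silent stretch in the middle of the signal.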

Visualize silent intervals:

Find intervals with low RMS amplitudes:

Visualize the resulting intervals:

Scope  (4)

Find quiet intervals using a data-dependent threshold:
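
The string criteria "Quiet" and "Loud" could be tried on a signal with two clearly different levels, as in the sketch below. The signal is an assumption made for illustration; because the threshold is data dependent, no particular output is claimed.

```wolfram
(* Synthetic test signal: a full-scale tone followed by an attenuated copy *)
loud = AudioGenerator[{"Sin", 440}, 1];
soft = AudioAmplify[loud, 0.1];
audio = AudioJoin[{loud, soft}];

(* Quieter and louder intervals, using a data-dependent threshold *)
quiet = AudioIntervals[audio, "Quiet"]
louder = AudioIntervals[audio, "Loud"]
```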

By default, intervals of any length are returned:

Compute the interval durations:

Find only intervals longer than a specified threshold:
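
The three accepted forms of the minimum duration can be sketched on a synthetic signal with one short gap (0.2 s) and one long gap (1 s). The signal is an illustrative assumption, and the samples value assumes the audio uses a 44100 Hz sample rate, so that 22050 samples correspond to 0.5 s.

```wolfram
(* Synthetic test signal with gaps of 0.2 s and 1 s between tones *)
tone = AudioGenerator[{"Sin", 440}, 1];
audio = AudioJoin[{AudioPad[tone, 0.2], AudioPad[tone, 1], tone}];

(* All inaudible intervals and their durations (end minus start) *)
all = AudioIntervals[audio, "Inaudible"];
durations = (Last[#] - First[#]) & /@ all

(* Minimum duration as a number of seconds *)
bySeconds = AudioIntervals[audio, "Inaudible", 0.5]

(* The same minimum as a time quantity *)
byTime = AudioIntervals[audio, "Inaudible", Quantity[500, "Milliseconds"]]

(* Minimum as a samples quantity (0.5 s only if the sample rate is 44100 Hz) *)
bySamples = AudioIntervals[audio, "Inaudible", Quantity[22050, "Samples"]]
```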

Test multiple properties at once:
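
A criterion function can combine several named slots with logical operators. In the hedged sketch below, the signal and the threshold values 0.1 and 0.5 are illustrative assumptions chosen so that only the attenuated tone satisfies both conditions.

```wolfram
(* Synthetic test signal: loud tone, 1 s silence, soft tone *)
loud = AudioGenerator[{"Sin", 440}, 1];
soft = AudioAmplify[loud, 0.3];
audio = AudioJoin[{AudioPad[loud, 1], soft}];

(* Audible partitions whose peak value stays below 0.5 *)
softOnly = AudioIntervals[audio, #RMSAmplitude > 0.1 && #MaxAbs < 0.5 &]
```

Putting the amplitude test first means that && short-circuits on silent partitions, so the second property is never compared there.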

Analyze the audio track of a video:

Options  (2)

PartitionGranularity  (2)

Specify a partition size of 100 ms:

Use an offset of 10 ms:

Use a smoothing window:
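
The three partitioning specifications above might look like the following sketch. The synthetic signal and the choice of HannWindow as the smoothing window are assumptions made for illustration.

```wolfram
(* Synthetic test signal: 1 s tone, 1 s silence, 1 s tone *)
tone = AudioGenerator[{"Sin", 440}, 1];
audio = AudioJoin[{AudioPad[tone, 1], tone}];

size = Quantity[100, "Milliseconds"];
offset = Quantity[10, "Milliseconds"];

(* Partition size only *)
r1 = AudioIntervals[audio, "Inaudible", PartitionGranularity -> size]

(* Partition size and offset *)
r2 = AudioIntervals[audio, "Inaudible", PartitionGranularity -> {size, offset}]

(* Partition size, offset and smoothing window *)
r3 = AudioIntervals[audio, "Inaudible",
  PartitionGranularity -> {size, offset, HannWindow}]
```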

Using different partitioning specifications will give different results:

A coarse partitioning will result in a faster computation:

Applications  (4)

Delete silent intervals of audio:

Find the intervals where the RMS amplitude is larger than a threshold:

Join the extracted intervals:
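
One way to carry out the two steps above is sketched here: find the intervals to keep, trim each one out of the original, and join the pieces. The synthetic signal and the 0.01 threshold are illustrative assumptions.

```wolfram
(* Synthetic test signal: 1 s tone, 1 s silence, 1 s tone *)
tone = AudioGenerator[{"Sin", 440}, 1];
audio = AudioJoin[{AudioPad[tone, 1], tone}];

(* Intervals where the RMS amplitude exceeds the threshold *)
keep = AudioIntervals[audio, #RMSAmplitude > 0.01 &];

(* Extract each interval and join the pieces into one audio object *)
trimmed = AudioJoin[AudioTrim[audio, #] & /@ keep];

(* Compare the durations before and after removing the silence *)
{Duration[audio], Duration[trimmed]}
```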

It is also possible to find silent intervals using a momentary loudness definition from the EBU standard:

Use the "VoiceActivity" property to detect voiced intervals in a speech signal:

Visualize the detected intervals:

Combine other properties such as RMS amplitude and spectral flatness to find unvoiced audio segments:

Visualize the detected intervals:

Detect unvoiced segments and attenuate them:

Use the "VoiceActivity" property to detect unvoiced intervals:

Visualize the detected intervals:

Attenuate the detected intervals:

Possible Issues  (1)

The criterion function will fail if the return value is not a Boolean:

Some properties, such as "FundamentalFrequency", can have non-numeric values, so extra care is needed:
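
One possible guard, shown as a sketch rather than the approach used in the original example: wrapping the comparison in TrueQ ensures that the criterion always returns True or False, so partitions where the property is non-numeric are rejected instead of causing a failure. The signal and the 400 Hz threshold are illustrative assumptions.

```wolfram
(* Synthetic test signal: 1 s tone, 1 s silence, 1 s tone *)
tone = AudioGenerator[{"Sin", 440}, 1];
audio = AudioJoin[{AudioPad[tone, 1], tone}];

(* TrueQ gives False for anything other than an explicit True, including
   a comparison left unevaluated because the property value is non-numeric *)
pitched = AudioIntervals[audio, TrueQ[#FundamentalFrequency > 400] &]
```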

Wolfram Research (2016), AudioIntervals, Wolfram Language function, https://reference.wolfram.com/language/ref/AudioIntervals.html (updated 2024).