Pitch-synchronous analysis

- developed at Bell Telephone Labs by Max Mathews and Jean-Claude Risset (1961)
- breaks input waveform into pseudoperiodic segments and then estimates the pitch of each pseudoperiodic segment
- size of analysis segment is adjusted relative to the estimated pitch period
- harmonic Fourier spectrum is then calculated on the analysis segment as though the sound were periodic (as if the pitch were quasi-constant throughout the analysis segment)
- this program generated time-varying amplitude functions for each harmonic of a given fundamental
- additional approaches to pitch-synchronous analysis were developed at MIT in 1963 (Luce)
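
A minimal Python sketch of the harmonic-spectrum step described above. It assumes the pitch period has already been estimated, so the analysis segment holds exactly one period. The function name `harmonic_amplitudes` and the test signal are illustrative and are not taken from the original Bell Labs program.

```python
import math

def harmonic_amplitudes(segment, num_harmonics):
    """Treat `segment` as exactly one pitch period and return the
    amplitudes of harmonics 1..num_harmonics (Fourier series magnitudes)."""
    n = len(segment)
    amps = []
    for h in range(1, num_harmonics + 1):
        re = sum(x * math.cos(2.0 * math.pi * h * i / n) for i, x in enumerate(segment))
        im = sum(x * math.sin(2.0 * math.pi * h * i / n) for i, x in enumerate(segment))
        amps.append(2.0 * math.hypot(re, im) / n)
    return amps

# Assumed test signal: one 100-sample period containing harmonics 1 and 3.
PERIOD = 100
segment = [
    1.0 * math.sin(2.0 * math.pi * i / PERIOD)
    + 0.5 * math.sin(2.0 * math.pi * 3 * i / PERIOD)
    for i in range(PERIOD)
]
amps = harmonic_amplitudes(segment, 4)
```

Repeating this on successive pitch periods would give one amplitude value per harmonic per period, which is the kind of time-varying amplitude function mentioned above.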

About it, assumptions, advantages

- heterodyne filter analysis multiplies an input waveform by a sine and cosine wave at each harmonic frequency and then sums the results over a short time period to obtain amplitude and phase data
- Creates a break-point function for each harmonic
- implies that fundamental frequency is estimated in a prior stage of analysis
- good for resolving harmonics of a given fundamental frequency
- Calculates relative energy at each harmonic
- Similar to STFT, analyzes signal at multiple, evenly-spaced time points

How it Works

- input signal is multiplied by an analysis sine wave and comparisons are made
- the sound being analyzed is multiplied by sine and cosine waveforms at the given fundamental frequency and then again at each of its harmonics. For each frequency, the products are summed over a short time period to obtain the amplitude and phase values
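
The multiply-and-sum step can be sketched in Python as follows. This is a simplified illustration, not the code of any particular analysis program. It assumes the fundamental is already known and that the summing window spans a whole number of periods. The names `heterodyne` and `amplitude_breakpoints` are hypothetical.

```python
import math

def heterodyne(signal, sample_rate, freq, start, length):
    """Multiply a window of `signal` by cosine and sine waves at `freq`,
    sum the products, and return (amplitude, phase relative to a sine)."""
    c = s = 0.0
    for i in range(start, start + length):
        angle = 2.0 * math.pi * freq * i / sample_rate
        c += signal[i] * math.cos(angle)
        s += signal[i] * math.sin(angle)
    amplitude = 2.0 * math.hypot(c, s) / length
    phase = math.atan2(c, s)
    return amplitude, phase

def amplitude_breakpoints(signal, sample_rate, freq, window, hop):
    """Repeat the analysis at evenly spaced times: list of (time_sec, amplitude)."""
    points = []
    for start in range(0, len(signal) - window + 1, hop):
        amp, _ = heterodyne(signal, sample_rate, freq, start, window)
        points.append((start / sample_rate, amp))
    return points

# Assumed test tone: 200 Hz fundamental plus its second harmonic.
SR = 8000
F0 = 200.0
tone = [
    0.8 * math.sin(2.0 * math.pi * F0 * n / SR + 0.3)
    + 0.4 * math.sin(2.0 * math.pi * 2 * F0 * n / SR)
    for n in range(800)
]
amp1, phase1 = heterodyne(tone, SR, F0, 0, 80)       # window = 2 periods
amp2, phase2 = heterodyne(tone, SR, 2 * F0, 0, 80)
track1 = amplitude_breakpoints(tone, SR, F0, 80, 80)
```

Because each harmonic is probed only at an exact multiple of the assumed fundamental, a sound whose pitch drifts away from that value leaks out of the sum, which is consistent with the pitch-change limitation noted later in these notes.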

Demise of Heterodyne Filter Analysis and the rise of phase vocoders and **Additive Resynthesis**

- Problems with Heterodyne Filter Analysis
- confused by transients
- confused by pitch changes of more than a quarter tone (glissandi, vibrato, portamento)

- Heterodyne Filter Analysis supplanted by:
- the phase vocoder
- additive resynthesis - a tracking version of heterodyne filter analysis that could follow changing frequency and amplitude trajectories (but required a lot of processing power)

- infilename
- input audio file to be analyzed
- outfilename
- output file name (usually adsyn.nnn)

- -s rate
- sampling rate of the audio input file.
- -c channel
- channel number sought. Default is all channels.
- -b begin
- beginning time (in seconds). Default is 0.0
- -d duration
- duration (in seconds). Default is to the end of the file.
- -f begfreq
- estimated frequency of the fundamental. Default is 100Hz.
- -h partials
- number of harmonic partials sought in the audio file. Default is 10, maximum is a function of memory available.
- -M maxamp
- maximum amplitude summed across all concurrent tracks. The default is 32767.
- -m minamp
- amplitude threshold. (Harmonics with amplitudes below this won’t contribute.)
- -n brkpts
- initial number of analysis breakpoints in each amplitude and frequency track, prior to thresholding (-m) and linear breakpoint consolidation. The initial points are spread evenly over the duration. The default is 256.
- -l cutfreq
- substitute a 3rd order Butterworth low-pass filter with cutoff frequency cutfreq (in Hz), in place of the default comb filter.

ar adsyn kamod, kfmod, ksmod, ifilcod

- kamod
- relative amplitude variable (1 = no change)
- kfmod
- pitch control variable (1 = no change)
- ksmod
- speed control variable (1 = no change)
- ifilcod
- adsyn.nnn or pathname in quotes

An

- -1 time1 value1 ... timeK valueK 32767 ; amplitude breakpoints for partial 1
- -2 time1 value1 ... timeL valueL 32767 ; frequency breakpoints for partial 1
- -1 time1 value1 ... timeM valueM 32767 ; amplitude breakpoints for partial 2
- -2 time1 value1 ... timeN valueN 32767 ; frequency breakpoints for partial 2
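
To illustrate how amplitude and frequency breakpoint tracks can drive additive resynthesis, here is a hedged Python sketch. It does not read a real adsyn file. The tracks are plain Python lists of (time in seconds, value) pairs, which is an assumption made for readability. The parameters `amod`, `fmod`, and `smod` only mimic the roles of kamod, kfmod, and ksmod (1 = no change), and the function names are hypothetical.

```python
import math

def interp(breakpoints, t):
    """Linear interpolation in a list of (time, value) pairs with increasing times."""
    if t <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return breakpoints[-1][1]

def resynthesize(partials, duration, sample_rate, amod=1.0, fmod=1.0, smod=1.0):
    """Sum one sine oscillator per partial.
    partials: list of (amp_track, freq_track) breakpoint lists."""
    n = int(duration * sample_rate / smod)
    out = [0.0] * n
    for amp_track, freq_track in partials:
        phase = 0.0
        for i in range(n):
            t = i / sample_rate * smod            # position in the analysis data
            a = interp(amp_track, t) * amod
            f = interp(freq_track, t) * fmod
            out[i] += a * math.sin(phase)
            phase += 2.0 * math.pi * f / sample_rate
    return out

# Assumed data: a single steady partial at 1000 Hz.
SR = 8000
partial = ([(0.0, 1.0), (1.0, 1.0)], [(0.0, 1000.0), (1.0, 1000.0)])
normal = resynthesize([partial], 0.5, SR)
octave_up = resynthesize([partial], 0.5, SR, fmod=2.0)
double_speed = resynthesize([partial], 0.5, SR, smod=2.0)
```

Scaling the speed changes only how quickly the breakpoint data is traversed, so duration changes while pitch stays put. Scaling the frequency does the opposite.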

**About it, uses, summary**

- subtractive analysis/resynthesis method that has been extensively used in speech and music applications (especially speech applications)
- basic idea - given N samples of a waveform, to what extent can the next samples be predicted? LPC attempts to find the best possible prediction of the next sample
- Advantages:

- one can edit analysis data and resynthesize variations on the original input signal
- LPC separates the **excitation** signal from the **resonance**, making it possible to manipulate rhythm, pitch, and timbre independently

**LPC and relationship to subtractive synthesis**

- subtractive synthesis starts with complex waveform from which certain frequencies are removed
- starting waveshape may be periodic (harmonic), such as pulse, sawtooth, square, or triangular, or it may be random
- THEN this waveform is filtered
- This two-part system can be thought of as a source (=excitation) waveform followed by a modification (=resonance)
- This is similar to natural acoustic sounds (e.g., **source = bowed string = excitation waveform** AND **modification = body of violin = resonant system/filter**)

**How LPC Works**

2 samples before present sample | 1 sample before present sample | present sample | future sample to be predicted

- multiply each of the N samples by a parameter and add together all the products to find the predicted value
- the difference between the prediction and the actual next sample value is called the **prediction error**
- output samples are "predicted" by a linear combination of filter parameters (coefficients) and previous samples. A prediction algorithm tries to find samples at positions outside a region where one already has samples
- since the predictor is looking at sums and differences of time-delayed samples, it can be viewed as a filter
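
The prediction step above can be sketched in a few lines of Python. The coefficients here are chosen by hand for a pure sinusoid, which a two-coefficient predictor can follow exactly. A real LPC analysis would instead solve for the coefficients that minimize the prediction error over each frame. The function names are hypothetical.

```python
import math

def predict(past, coeffs):
    """Predict the next sample as a weighted sum of previous samples.
    past[-1] is the most recent sample and is weighted by coeffs[0]."""
    return sum(a * past[-1 - k] for k, a in enumerate(coeffs))

def prediction_errors(signal, coeffs):
    """Actual sample minus predicted sample, for every predictable position."""
    p = len(coeffs)
    return [signal[n] - predict(signal[n - p:n], coeffs) for n in range(p, len(signal))]

# A sinusoid obeys x[n] = 2*cos(w)*x[n-1] - x[n-2].
W = 2.0 * math.pi * 440.0 / 8000.0
sine = [math.sin(W * n) for n in range(200)]

good = prediction_errors(sine, [2.0 * math.cos(W), -1.0])
naive = prediction_errors(sine, [1.0])   # "next sample equals the last one"
```

With the matched coefficients the error is essentially zero, while the naive one-coefficient predictor leaves a clearly audible residual. The weighted sum of delayed samples is also why the predictor can be viewed as a filter.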

**Overview of LPC Analysis: The four stages (or directions) of LPC Analysis**

Each stage (or direction) of analysis is carried out on a frame-by-frame basis, where a frame is like a snapshot of the signal

- spectrum analysis in terms of formants
- pitch analysis
- amplitude analysis
- decision as to whether the sound was voiced (pitched) or unvoiced (characteristic of noise)
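
As a rough illustration of three of these per-frame measurements (amplitude, pitch, and the voiced/unvoiced decision), the Python sketch below uses RMS for amplitude and a normalized autocorrelation peak for pitch and voicing. These are stand-in techniques chosen for brevity. The notes do not say which methods a given LPC analyzer uses, and the 0.5 voicing threshold and the 50 to 500 Hz search range are arbitrary assumptions.

```python
import math
import random

def frame_rms(frame):
    """Amplitude of one frame as root-mean-square."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def autocorr_pitch(frame, sample_rate, fmin=50.0, fmax=500.0):
    """Return (pitch_hz, strength); strength is the normalized
    autocorrelation at the best lag (near 1 = strongly periodic)."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0, 0.0
    best_lag, best = 0, 0.0
    for lag in range(int(sample_rate / fmax), int(sample_rate / fmin) + 1):
        if lag >= len(frame):
            break
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag)) / energy
        if r > best:
            best_lag, best = lag, r
    pitch = sample_rate / best_lag if best_lag else 0.0
    return pitch, best

def is_voiced(frame, sample_rate, threshold=0.5):
    """Crude voiced/unvoiced decision from the autocorrelation strength."""
    return autocorr_pitch(frame, sample_rate)[1] > threshold

# Assumed test frames: a 200 Hz sine (voiced) and white noise (unvoiced).
SR = 8000
voiced_frame = [math.sin(2.0 * math.pi * 200.0 * n / SR) for n in range(400)]
random.seed(0)
noise_frame = [random.gauss(0.0, 1.0) for _ in range(400)]

pitch, strength = autocorr_pitch(voiced_frame, SR)
```

Running these functions on successive frames would yield one amplitude, pitch, and voicing value per frame, alongside the filter coefficients that describe the formants.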

**Links to resources regarding LPC**

The links progress from basic definitions of LPC to more complex explanations utilizing some math.

- Short Definition of LPC from *EARS*
- Short Definition with hyperlinks
- One-page explanation with few hyperlinks
- One-page explanation with more hyperlinks
- LPC using cSound examples (courtesy of MIT and SFU)
- Mathematical Explanation
- Another Mathematical Explanation
- Excerpt from textbook used at Cornell
- Links to LPC programming resources

**LPREAD**

- lpread
- Reads a control file of time-ordered information frames

krmsr, krmso, kerr, kcps lpread ktimpnt, ifilcod [, inpoles] [, ifrmrate]

- ifilcod
- integer or character-string denoting a control-file (reflection coefficients and four parameter values) derived from n-pole linear predictive spectral analysis of a source audio signal. An integer denotes the suffix of a file lp.m; a character-string (in double quotes) gives a filename, optionally a full pathname. If not fullpath, the file is sought first in the current directory, then in that of the environment variable SADIR (if defined). Memory usage depends on the size of the file, which is held entirely in memory during computation but shared by multiple calls (see also adsyn, pvoc).
- inpoles (optional, default=0)
- number of poles in the lpc analysis. It is required only when the control file does not have a header; it is ignored when a header is detected
- ifrmrate (optional, default=0)
- frame rate per second in the lpc analysis. It is required only when the control file does not have a header; it is ignored when a header is detected

lpread accesses a control file of time-ordered information frames, each containing n-pole filter coefficients derived from linear predictive analysis of a source signal at fixed time intervals (e.g. 1/100 of a second), plus four parameter values:

- krmsr -- root-mean-square (rms) of the residual of analysis
- krmso -- rms of the original signal
- kerr -- the normalized error signal
- kcps -- pitch in Hz

lpread gets its values from the control file according to the input value ktimpnt (in seconds). If ktimpnt proceeds at the analysis rate, time-normal synthesis will result; proceeding at a faster, slower, or variable rate will result in time-warped synthesis. At each k-period, lpread interpolates between adjacent frames to more accurately determine the parameter values (presented as output) and the filter coefficient settings (passed internally to a subsequent lpreson).

- ktimpnt
- The passage of time, in seconds, through the analysis file. ktimpnt must always be positive, but can move forwards or backwards in time, be stationary or discontinuous, as a pointer into the analysis file.
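
The time-pointer behaviour described above can be sketched in Python. This is only an illustration of reading frames at a fixed frame rate with linear interpolation between neighbours. It is not the lpread implementation and does not read a real control file. The frame contents and the name `read_frame` are hypothetical.

```python
def read_frame(frames, frame_rate, timpnt):
    """Return the frame at `timpnt` seconds, linearly interpolated between
    the two adjacent analysis frames. `frames` is a list of equal-length
    lists of floats; `frame_rate` is frames per second."""
    pos = max(0.0, timpnt) * frame_rate
    i = min(int(pos), len(frames) - 1)
    j = min(i + 1, len(frames) - 1)
    frac = pos - i if j > i else 0.0
    return [a + (b - a) * frac for a, b in zip(frames[i], frames[j])]

# Assumed data: three frames of [rms, pitch_hz] analysed at 100 frames/sec.
frames = [[0.0, 100.0], [1.0, 200.0], [2.0, 400.0]]
halfway = read_frame(frames, 100.0, 0.005)    # between frames 0 and 1
past_end = read_frame(frames, 100.0, 1.0)     # clamps to the last frame

# Time-warping: advance the pointer at half the analysis rate.
slow = [read_frame(frames, 100.0, 0.5 * k / 100.0) for k in range(4)]
```

Advancing the pointer at half speed visits every frame twice as slowly, which stretches the result in time without changing the stored pitch values.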

**LPRESON**

- lpreson
- Modifies the spectrum of an audio signal with time-varying filter coefficients from a control file

ar lpreson asig

- asig
- an audio signal to be modified
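
Conceptually, resynthesis passes the input signal through a recursive (all-pole) filter whose coefficients come from the analysis. The Python sketch below shows such a filter with fixed coefficients. It is a simplified illustration and not the lpreson code, which updates its coefficients over time from the control file. The name `allpole_filter` and the coefficient values are assumptions.

```python
def allpole_filter(excitation, coeffs):
    """Apply y[n] = x[n] + sum_k coeffs[k] * y[n-1-k] to the excitation."""
    out = []
    for n, x in enumerate(excitation):
        y = x
        for k, a in enumerate(coeffs):
            if n - 1 - k >= 0:
                y += a * out[n - 1 - k]
        out.append(y)
    return out

# A single impulse through a one-pole filter decays geometrically.
impulse = [1.0, 0.0, 0.0, 0.0]
response = allpole_filter(impulse, [0.5])
```

Feeding a pulse train or noise through the filter instead of an impulse corresponds to the excitation/resonance split described earlier: the excitation sets pitch and noisiness, the filter imposes the formants.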

**LPFRESON**

- lpfreson
- Modifies the spectrum of an audio signal with time-varying filter coefficients from a control file and a frequency ratio

ar lpfreson asig, kfrqratio

lpread gets its values from the control file according to the input value ktimpnt (in seconds). If ktimpnt proceeds at the analysis rate, time-normal synthesis will result; proceeding at a faster, slower, or variable rate will result in time-warped synthesis. At each k-period, lpread interpolates between adjacent frames to more accurately determine the parameter values (presented as output) and the filter coefficient settings (passed internally to a subsequent lpfreson).

- asig
- an audio signal to be modified
- kfrqratio
- frequency ratio. Must be greater than 0

This page updated May 16, 2005

contact info: Jeremy Baguyos ( jbkontra@hotmail.com )