Sound Processing in MATLAB

Structure arrays in MATLAB

Go to http://www.openairlib.net/anechoicdb/content/operatic-voice –> anechoic sound files. Control + click in singing.wav and choose Download linked file to save it to your Downloads folder. Navigate to that folder in MATLAB:

cd ~/Downloads

Get some info about the file and store it into a:

a = audioinfo('singing.wav')

Notice that a has a struct class:

whos a

What is the bit depth? let's store it into a variable:

n = a.BitsPerSample;

Let's calculate the resolution of a 16-bit system:

r = 2^n

We can put a string in the Title field:

a.Title='My Song'

Or add a new field, let's call it Rating and add a value:

a.Rating = 10

Actually we could add another value to our field:

a.Rating(2) = 5

Type a. and press the TAB key to see all the fields in the structure. Use up and down arrows to choose a field, press TAB to autocomplete. You can use structures to encapsulate your variables. For example, you can store a into a new variable called mystructure:

mystructure=a

Let's add a field called b to mystructure, and include a value for b.

mystructure.b=5

a. audioinfo informs about the sample rate. Can you calculate it using the structure fields provided by this function?
b. Remove one of the structure array fields (help rmfield)
c. Create a new structure array with two fields. Add a value to one field and a string to the other one.

Creating a sinewave

fs = 44100;       
t = 0 : 1/fs : 2; 
f = 200;           
A = .5;           
w = 0 * pi/180; % degrees 
y = A * sin( 2 * pi * f * t + w );
plot(t,y)
sound( y, fs, 16 );

a. Create a sinewave with a frequency of 1000 Hz, amplitude of .2 and phase of 15°.
b. What would be the minimum sampling rate that this sinewave would require?

Phase cancellation

% phase cancellation
fs = 44100; % sampling frequency (Hz)
t = 0 : 1/fs : 5; % time axis (seconds)
f1 = 440; % frequency (Hz)
f2 = 440;
A1 = .3;
A2 = .3;
w1 = 0; % phase in degrees
w2 = 180; % phase in degrees
y1 = A1 * sin( 2 * pi * f1 * t + w1 * pi/180 );
y2 = A2 * sin( 2 * pi * f2 * t + w2 * pi/180 );
y = (y1+y2)/2;
sound( y, fs, 16 );

a. Plot the sound. Now change one of the frequencies to 441 Hz, plot the sound again and listen to it. What do you hear?
b. Now the difference between phases is 180°. Change it to 179°. What do you hear? Try changing it to 181°. What do you hear now?

Binaural beats / Cocktail party effect

fs = 44100;       
t = 0 : 1/fs : 5; 
f1 = 300;           
f2 = 310;           
A = .5;           
w = 0 * pi/180; % degrees 
y1 = A * sin(2 * pi * f1 * t + w );
y2 = A * sin(2 * pi * f2 * t + w );

y=[y1;y2];
sound( y, fs, 16 );

a. Try decreasing the difference between both frequencies from 10 Hz to 2 Hz. What do you hear?
b. Play 1000 Hz in one ear and 1010 in the other. Can you hear the binaural beat?
c. Create a cocktail party effect by increasing the difference between frequencies up to 30 Hz. What do you hear?

Complex tones via additive synthesis

fs = 44100; % sampling frequency (Hz)
t = 0 : 1/fs : 5; % time axis (seconds)
f1 = 440; % frequency (Hz)
f2 = 2 * f1
f3 = 3 * f1;
f4 = 4 * f1;
A1 = .3; A2 = A1/2; A3 = A1/3; A4 = A1/4;
w = 0; % phase

y1 = A1 * sin( 2 * pi * f1 * t + w );
y2 = A2 * sin( 2 * pi * f2 * t + w );
y3 = A3 * sin( 2 * pi * f3 * t + w );
y4 = A4 * sin( 2 * pi * f4 * t + w );
y = (y1+y2+y3+y4)/4;
sound( y, fs, 16 ); % playback sinewave

These are natural harmonics of the cello: C3 (130.813 Hz) G3 (195.998 Hz) C4 (261.626 Hz) E4 (329.628 Hz) G4 (391.995 Hz).

a. Create a complex tone based on these 5 frequencies. Consider that harmonics decrease in amplitude as the frequency rises.

"Melodies"

Use the sine waves created in the previous example to build a melody:

y=[y1 y2 y3 y4];
soundsc( y, 44100 );

a. Create a short melody using sinewaves of different duration.
b. Make a variable x containing a modified version of your melody y using x = y(1:2:length(y)). Listen to the result. What happened?

Modulation

a. Create a signal y with a sampling rate fs of 44100 Hz. Modulate its frequency and its amplitude:

m=modulate(y, 20, fs, 'fm'); % Frequency modulation
soundsc(m, fs)

n=modulate(y, .5, fs, 'am'); % Amplitude modulation
soundsc(n, fs)

b. Perform a phase modulation using a modulation index equal to 20 (help modulate).

Fast Fourier Transform

Given a signal y and a sampling frequency fs you can obtain the signal frequency spectrum and plot it using the following code:

Y = fft(y);
plot(abs(Y))

N = fs % number of FFT points
transform = fft(y,N)/N;
magTransform = abs(transform);

faxis = linspace(-fs/2,fs/2,N);
plot(faxis,fftshift(magTransform));
xlabel('Frequency (Hz)')

% view frequency content up to half the sampling rate:
axis([0 length(faxis)/2, 0 max(magTransform)]) 

% change the tick labels of the graph from scientific notation to floating point: 
xt = get(gca,'XTick');  
set(gca,'XTickLabel', sprintf('%.0f|',xt))

a. Create a signal y at 440 Hz and plot its spectrum. Use an adequate axis scaling (help axis).
b. Plot the spectrum of a complex sinewave at different amplitudes.
c. Do the plots include phase information of the signal? Explain.

Spectrogram

Using a signal y and a sampling frequency fs you can obtain its frequency spectrum over time:

win = 128 % window length in samples
% number of samples between overlapping windows:
hop = win/2            

nfft = win % width of each frequency bin 
spectrogram(y,win,hop,nfft,fs,'yaxis')

% change the tick labels of the graph from scientific notation to floating point: 
yt = get(gca,'YTick');  
set(gca,'YTickLabel', sprintf('%.0f|',yt))

a. Create a quadratic chirp starting at 0 Hz and reaching 440 Hz at 1 s using c = chirp(t,0,1,440,'q'). Plot its spectrogram.
b. Plot a spectrogram of the melody that you created before.
c. Try the effect of using different window lengths on the spectrogram. What happens when you modify this length?

Shephard tones

%Parameters:
fs   = 16000;
d    = 1; % time after which waveform repeats 
fmin = 300; % lowest frequency
fmax = 3400; % highest frequency
n = 12; % number of tones
t = 0: 1/fs : d-1/fs; % timestamps for each sample point

% normalized logarithm of frequency of each tone (row)
% for each sample point (column), all rising linearly
% from 0 to 1, then wrap around back to 0
l = mod(([0:n-1]/n)'  * ones(1, fs*d) + ones(n,1) * (t/(d*n)), 1);
f = fmin * (fmax/fmin) .^ l;    % freq. for each tone and sample
p = 2*pi * cumsum(f, 2) / fs;   % phase for each tone and sample
% make last column a multiple of 2*pi for phase continuity
p = diag((2*pi*floor(p(:,end)/(2*pi))) ./ p(:,end)) * p;
s = sin(p); % sine value for each tone and sample
% mixing amplitudes from raised-cosine curve over frequency
a =0.5-0.5*cos(2*pi*l);
w = sum(s .* a)/n; % mix tones together, normalize to [-1, +1]
w = repmat(w, 1, 3);   % repeat waveform 3x
specgram(w, 2048, fs, 2048, 1800);  ylim([0 4000])  % plot
w = repmat(w, 1, 20);  % repeat waveform 20x
soundsc(w,16000)

a. Compute the spectrogram for the 1-minute sound
b. Export this auditory illusion as a .wav file (help wavwrite)

Amplitude modulation and Ring modulation

a. Import a mono .wav file into matlab using audioread and save it into a variable called y. Set the sampling rate with fs and perform amplitude modulation:

load('handel')

index = 1:length(y);

Fc = 5; % frequency of trem
A = .5; % amplitude of trem
w = 0 % phase of trem

trem=(w*pi/180 + A * sin(2*pi*index*(Fc/Fs)))'; 

y = y .* trem;

soundsc(y,Fs)

b. Matlab comes with a sample audio file of Handel's "Hallelujah". Load it with load handel (or s = load handel to make a structure). Perform an amplitude modulation.
c. Ring modulation is a special case of amplitude modulation. Increase the trem frequency to around 1000 Hz and listen to the result.

Filter design

We will make a low-pass filter to attenuate frequencies above 442 Hz:

cutoff = 442/(fs/2); % Set the frequency at which the attenuation begins.
order = 10;
d = designfilt('lowpassfir','CutoffFrequency',cutoff,'FilterOrder',order)

d has a struct class. type d. and press TAB to look at the options. Check out d.info. Some of these, such as d.freqz and d.impz are plots. You can also take a look at fvtool(d), it is a graphical user interface for filter analysis.

Let's now use our low-pass filter to attenuate frequencies of a tone. Store the complex 'cello' tone into a variable called y. Also store its sampling rate into fs.

o = filter(d,y); % o is the filtered version of y

soundsc(o)

a. Listen to the result. Listen to the original tone and compare both.
b. Plot the beginning of both versions of the tone with plot(o(1:800)),figure,plot(y(1:800)). Can you find any differences?
c. Try modifying the filter cutoff and the filter order. How do they affect the sound?
d. Repeat the procedure but using a highpass filter (help designfilt)

Convolution reverb

Go to http://www.openairlib.net/auralizationdb/content/terrys-factory-warehouse -> Impulse Responses. Download the ommnidirectional mono recording to your Downloads folder. Read the audio with audioread and store it in a variable called ir. Also store its sampling rate in a variable called fs. Listen to it with soundsc. Read also singing.wav into MATLAB. Store it into a variable called x. Convolve both signals, listen and plot the result using the following code.

y = conv(x,ir); % convolution with FFT

soundsc(y,fs)

figure();
plot(y,'b');
hold on
plot(x,'black');
t = 0 : 1/fs : length(ir)/fs; % time point
title('Impulse Response Reverberated Signal');
xlabel('time'),ylabel('Amplitude')
figure();

plot(ir);
title('Impulse Response');

a. Convolve singing.wav with the first half second of the complex tone shown in Example 5.

Audio recording

Thank you for following this tutorial. By the way, MATLAB has a function for audio recording. Check it out!

r=audiorecorder(44100,16,1)
record(r) % speak
stop(r)
g=getaudiodata(r,'double','int16');
soundsc(g,44100)