Sound Processing in MATLAB
Structure arrays in MATLAB
Go to http://www.openairlib.net/anechoicdb/content/operatic-voice –> anechoic sound files. Control + click in singing.wav and choose Download linked file to save it to your Downloads folder. Navigate to that folder in MATLAB:
cd ~/Downloads
Get some info about the file and store it into a:
a = audioinfo('singing.wav')
Notice that a has a struct class:
whos a
What is the bit depth? let's store it into a variable:
n = a.BitsPerSample;
Let's calculate the resolution of a 16-bit system:
r = 2^n
We can put a string in the Title field:
a.Title='My Song'
Or add a new field, let's call it Rating and add a value:
a.Rating = 10
Actually we could add another value to our field:
a.Rating(2) = 5
Type a. and press the TAB key to see all the fields in the structure. Use up and down arrows to choose a field, press TAB to autocomplete. You can use structures to encapsulate your variables. For example, you can store a into a new variable called mystructure:
mystructure=a
Let's add a field called b to mystructure, and include a value for b.
mystructure.b=5
- a. audioinfo informs about the sample rate. Can you calculate it using the structure fields provided by this function?
- b. Remove one of the structure array fields (help rmfield)
- c. Create a new structure array with two fields. Add a value to one field and a string to the other one.
Creating a sinewave
fs = 44100; t = 0 : 1/fs : 2; f = 200; A = .5; w = 0 * pi/180; % degrees y = A * sin( 2 * pi * f * t + w ); plot(t,y) sound( y, fs, 16 );
- a. Create a sinewave with a frequency of 1000 Hz, amplitude of .2 and phase of 15°.
- b. What would be the minimum sampling rate that this sinewave would require?
Phase cancellation
% phase cancellation fs = 44100; % sampling frequency (Hz) t = 0 : 1/fs : 5; % time axis (seconds) f1 = 440; % frequency (Hz) f2 = 440; A1 = .3; A2 = .3; w1 = 0; % phase in degrees w2 = 180; % phase in degrees y1 = A1 * sin( 2 * pi * f1 * t + w1 * pi/180 ); y2 = A2 * sin( 2 * pi * f2 * t + w2 * pi/180 ); y = (y1+y2)/2; sound( y, fs, 16 );
- a. Plot the sound. Now change one of the frequencies to 441 Hz, plot the sound again and listen to it. What do you hear?
- b. Now the difference between phases is 180°. Change it to 179°. What do you hear? Try changing it to 181°. What do you hear now?
Binaural beats / Cocktail party effect
fs = 44100; t = 0 : 1/fs : 5; f1 = 300; f2 = 310; A = .5; w = 0 * pi/180; % degrees y1 = A * sin(2 * pi * f1 * t + w ); y2 = A * sin(2 * pi * f2 * t + w ); y=[y1;y2]; sound( y, fs, 16 );
- a. Try decreasing the difference between both frequencies from 10 Hz to 2 Hz. What do you hear?
- b. Play 1000 Hz in one ear and 1010 in the other. Can you hear the binaural beat?
- c. Create a cocktail party effect by increasing the difference between frequencies up to 30 Hz. What do you hear?
Complex tones via additive synthesis
fs = 44100; % sampling frequency (Hz) t = 0 : 1/fs : 5; % time axis (seconds) f1 = 440; % frequency (Hz) f2 = 2 * f1 f3 = 3 * f1; f4 = 4 * f1; A1 = .3; A2 = A1/2; A3 = A1/3; A4 = A1/4; w = 0; % phase y1 = A1 * sin( 2 * pi * f1 * t + w ); y2 = A2 * sin( 2 * pi * f2 * t + w ); y3 = A3 * sin( 2 * pi * f3 * t + w ); y4 = A4 * sin( 2 * pi * f4 * t + w ); y = (y1+y2+y3+y4)/4; sound( y, fs, 16 ); % playback sinewave
These are natural harmonics of the cello: C3 (130.813 Hz) G3 (195.998 Hz) C4 (261.626 Hz) E4 (329.628 Hz) G4 (391.995 Hz).
- a. Create a complex tone based on these 5 frequencies. Consider that harmonics decrease in amplitude as the frequency rises.
"Melodies"
Use the sine waves created in the previous example to build a melody:
y=[y1 y2 y3 y4]; soundsc( y, 44100 );
- a. Create a short melody using sinewaves of different duration.
- b. Make a variable x containing a modified version of your melody y using
x = y(1:2:length(y))
. Listen to the result. What happened?
Modulation
- a. Create a signal y with a sampling rate fs of 44100 Hz. Modulate its frequency and its amplitude:
m=modulate(y, 20, fs, 'fm'); % Frequency modulation soundsc(m, fs) n=modulate(y, .5, fs, 'am'); % Amplitude modulation soundsc(n, fs)
- b. Perform a phase modulation using a modulation index equal to 20 (help modulate).
Fast Fourier Transform
Given a signal y and a sampling frequency fs you can obtain the signal frequency spectrum and plot it using the following code:
Y = fft(y); plot(abs(Y)) N = fs % number of FFT points transform = fft(y,N)/N; magTransform = abs(transform); faxis = linspace(-fs/2,fs/2,N); plot(faxis,fftshift(magTransform)); xlabel('Frequency (Hz)') % view frequency content up to half the sampling rate: axis([0 length(faxis)/2, 0 max(magTransform)]) % change the tick labels of the graph from scientific notation to floating point: xt = get(gca,'XTick'); set(gca,'XTickLabel', sprintf('%.0f|',xt))
- a. Create a signal y at 440 Hz and plot its spectrum. Use an adequate axis scaling (help axis).
- b. Plot the spectrum of a complex sinewave at different amplitudes.
- c. Do the plots include phase information of the signal? Explain.
Spectrogram
Using a signal y and a sampling frequency fs you can obtain its frequency spectrum over time:
win = 128 % window length in samples % number of samples between overlapping windows: hop = win/2 nfft = win % width of each frequency bin spectrogram(y,win,hop,nfft,fs,'yaxis') % change the tick labels of the graph from scientific notation to floating point: yt = get(gca,'YTick'); set(gca,'YTickLabel', sprintf('%.0f|',yt))
- a. Create a quadratic chirp starting at 0 Hz and reaching 440 Hz at 1 s using
c = chirp(t,0,1,440,'q')
. Plot its spectrogram. - b. Plot a spectrogram of the melody that you created before.
- c. Try the effect of using different window lengths on the spectrogram. What happens when you modify this length?
Shephard tones
%Parameters: fs = 16000; d = 1; % time after which waveform repeats fmin = 300; % lowest frequency fmax = 3400; % highest frequency n = 12; % number of tones t = 0: 1/fs : d-1/fs; % timestamps for each sample point % normalized logarithm of frequency of each tone (row) % for each sample point (column), all rising linearly % from 0 to 1, then wrap around back to 0 l = mod(([0:n-1]/n)' * ones(1, fs*d) + ones(n,1) * (t/(d*n)), 1); f = fmin * (fmax/fmin) .^ l; % freq. for each tone and sample p = 2*pi * cumsum(f, 2) / fs; % phase for each tone and sample % make last column a multiple of 2*pi for phase continuity p = diag((2*pi*floor(p(:,end)/(2*pi))) ./ p(:,end)) * p; s = sin(p); % sine value for each tone and sample % mixing amplitudes from raised-cosine curve over frequency a =0.5-0.5*cos(2*pi*l); w = sum(s .* a)/n; % mix tones together, normalize to [-1, +1] w = repmat(w, 1, 3); % repeat waveform 3x specgram(w, 2048, fs, 2048, 1800); ylim([0 4000]) % plot w = repmat(w, 1, 20); % repeat waveform 20x soundsc(w,16000)
- a. Compute the spectrogram for the 1-minute sound
- b. Export this auditory illusion as a .wav file (help wavwrite)
Amplitude modulation and Ring modulation
- a. Import a mono .wav file into matlab using audioread and save it into a variable called y. Set the sampling rate with fs and perform amplitude modulation:
load('handel') index = 1:length(y); Fc = 5; % frequency of trem A = .5; % amplitude of trem w = 0 % phase of trem trem=(w*pi/180 + A * sin(2*pi*index*(Fc/Fs)))'; y = y .* trem; soundsc(y,Fs)
- b. Matlab comes with a sample audio file of Handel's "Hallelujah". Load it with load handel (or s = load handel to make a structure). Perform an amplitude modulation.
- c. Ring modulation is a special case of amplitude modulation. Increase the trem frequency to around 1000 Hz and listen to the result.
Filter design
We will make a low-pass filter to attenuate frequencies above 442 Hz:
cutoff = 442/(fs/2); % Set the frequency at which the attenuation begins. order = 10; d = designfilt('lowpassfir','CutoffFrequency',cutoff,'FilterOrder',order)
d has a struct class. type d.
and press TAB to look at the options. Check out d.info. Some of these, such as d.freqz and d.impz are plots.
You can also take a look at fvtool(d), it is a graphical user interface for filter analysis.
Let's now use our low-pass filter to attenuate frequencies of a tone. Store the complex 'cello' tone into a variable called y. Also store its sampling rate into fs.
o = filter(d,y); % o is the filtered version of y
soundsc(o)
- a. Listen to the result. Listen to the original tone and compare both.
- b. Plot the beginning of both versions of the tone with plot(o(1:800)),figure,plot(y(1:800)). Can you find any differences?
- c. Try modifying the filter cutoff and the filter order. How do they affect the sound?
- d. Repeat the procedure but using a highpass filter (help designfilt)
Convolution reverb
Go to http://www.openairlib.net/auralizationdb/content/terrys-factory-warehouse -> Impulse Responses. Download the ommnidirectional mono recording to your Downloads folder. Read the audio with audioread and store it in a variable called ir. Also store its sampling rate in a variable called fs. Listen to it with soundsc. Read also singing.wav into MATLAB. Store it into a variable called x. Convolve both signals, listen and plot the result using the following code.
y = conv(x,ir); % convolution with FFT soundsc(y,fs) figure(); plot(y,'b'); hold on plot(x,'black'); t = 0 : 1/fs : length(ir)/fs; % time point title('Impulse Response Reverberated Signal'); xlabel('time'),ylabel('Amplitude') figure(); plot(ir); title('Impulse Response');
- a. Convolve singing.wav with the first half second of the complex tone shown in Example 5.
Audio recording
Thank you for following this tutorial. By the way, MATLAB has a function for audio recording. Check it out!
r=audiorecorder(44100,16,1) record(r) % speak stop(r) g=getaudiodata(r,'double','int16'); soundsc(g,44100)