function [syl,data,spectra] = sylana(); % function [syl,data,spectra] = sylana(); % % must be run in the directory containing the files of interest % % OUTPUT % % syl = structure with the following fields % syl(i).name = list of filenames (*.wav) imported % syl(i).sig = the ith signal (waveform) % syl(i).pow = the ith power profile (from the pwr() function) % syl(i).dpow = diff(syl(i).powr) % syl(i).samp = samples at which the power (and diff) windows are taken % % data = matrix of data (measurements) including duration, spectral, etc.. % (fill this in in more detail later) % % sound files should be named according to the following convention: % first element = trial number in the form t###, e.g., t023 = 23rd trial % second element = target consonant indicated by the following convention % f, v, s, z indicate themselves; vcd/vcls interdentals indicated by d,t % third element = condition indicated by either the consonant from which % the target is being disambiguated or 'n' for neutral (talker disam.) % fourth element = o(nset) or r(hyme) position of target consonant % fifth element = a(nchor), c(onsonant), or s(yllable) file type (anchor = % reference sound for relative power/amp measurements; consonant = just the % target fricative; syllable = the whole target syllable) % e.g., % onset f syllable disam from v on the 37th trial = t037fvos.wav % consonant by itself from same trial = t037fvoc.wav % reference 's' for same trial = t037fvoa.wav % initialize the structure syl syl = struct('name',{},'sig',{},'pow',{},'dpow',{},'samp',{}); spectra = struct('con',{},'anc',{}); % creates a structure with, among other things, file names filelist = dir('*.wav'); % number of files; for cycling through the files nfile = length(filelist); %%%%%%%%%%%%%%%%%%%%%%%%% % changeable parameters % %%%%%%%%%%%%%%%%%%%%%%%%% % resampled sampling frequency fs = 30000; % calculate the nyquist frequency nyq = fs/2; % 1) reference s or V or whatever I end up using; 2) consonant; 3) syllable nfile_per_syl = 3; %%%%%%%%%%%%%%% % other stuff % %%%%%%%%%%%%%%% % number of syllables to be analyzed nsyl = nfile/nfile_per_syl; % extracts the file names from filelist struct, signal and sampling % frequencies the files themselves for r = 1:nfile disp([' resampling file # ' num2str(r)]) syl(r).name = filelist(r).name; %%%%%%%%%%%% % resample % %%%%%%%%%%%% % import the rth wave [wave_temp1, fs_temp, nbits] = wavread(syl(r).name); % resample to fs = 20000 wave_temp2 = resample(wave_temp1,fs,fs_temp); % might have to take into account the edge effects due to filtering and % resampling; for now, I'm just assigning the resampled wave as the sig % this could turn out to be very important (?) % the ( 2^(nbits-1) - 1 ) term is to rescale outside of -1 to 1 range syl(r).sig = wave_temp2 * ( 2^(nbits-1) - 1 ); %sound(syl(i).sig,fs) % obtain measurements that I'm not even sure I'll want or need? maybe % for the envelope stuff, though... [syl(r).pow syl(r).dpow syl(r).samp] = pwr(syl(r).sig, fs); end %sound(syl(1).sig,fs) %sound(syl(2).sig,fs) %sound(syl(3).sig,fs) % a matrix containing all the data % data = zeros(nfile,??????) - i don't know how many columns i need just yet %%%%%%%%%%%%%%%%%%%%%% % duration variables % %%%%%%%%%%%%%%%%%%%%%% % to be filled with consonant durations in ms durc = zeros(nsyl,1); % to be filled with vowel durations in ms durv = zeros(nsyl,1); % to be filled with consonant durations as a proportion of total syl dur durcp = zeros(nsyl,1); % to be filled with vowel duration as a proportion of total syl dur durvp = zeros(nsyl,1); % to be filled with syllable durations dursyl = zeros(nsyl,1); %%%%%%%%%%%%%%%%%%%%%%%%%%%%% % spectral moment variables % %%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%% % changeable parameters % %%%%%%%%%%%%%%%%%%%%%%%%% % number of times at which to measure voicing power; % 3, two boundary areas, one middle (only one moment for each fric) ntime = 3; % fft window size in ms; 23.22 gives 512.00 something samples/window at fs = 22050 winsize_ms = 17.067; % for calculating the fft and displaying the spectral 'slice' % for fs = 22050 & winsize = 23.22 ms, round(winsize_ms) = 512 samples, % so fftlen = 512 probably okay fftlen = round( (winsize_ms/1000)*fs ) ; speclen = fftlen/2; % transformation of x-axis to frequency in Hz frequency = 1:speclen; frequency = (frequency/speclen)*nyq; % for truncating the temporary spectrum at Hertz Hz truncHz = 750; % someone else use 750 Hz to truncate... truncpoint = floor((truncHz/nyq)*speclen); % truncate frequency vector freq = frequency(truncpoint+1:speclen); % spectral binwidth, for 'integrating' spectra specbin = nyq/speclen; % number of spectra to average nspec = 12; % power measurements via 'integration' of spectra nsampsum = floor(speclen*(truncHz/nyq)); nsum = floor(speclen/nsampsum); % instead of # of spectra, spacing between windows (in samples) % nsamp = 100; % this will be variable with fixed nspec % logical to determine whether or not to preemphasize (first-difference % filter) the signal before taking spectral slices preemph = 1; % percent of peak amplitude to reach in envelope measurement percpa = 6/10; %%%%%%%%%%%%%%%%%%%%%%%%%%% % arrays for storing data % %%%%%%%%%%%%%%%%%%%%%%%%%%% % for con voicing 1 = voiced conv = zeros(nsyl,1); % for disam voicing 1 = voiced, 2 = neutral condition disv = zeros(nsyl,1); % for con place 0 = lab, 1 = cor conp = zeros(nsyl,1); % for disam place 0 = lab, 1 = cor displ = zeros(nsyl,1); % for prosodic context, 0 = onset, 1 = coda pros = zeros(nsyl,1); % for spectral means specmcon = zeros(nsyl,1); specmanc = zeros(nsyl,1); % for spectral variance specvcon = zeros(nsyl,1); specvanc = zeros(nsyl,1); % for spectral skewness specscon = zeros(nsyl,1); specsanc = zeros(nsyl,1); % for spectral kurtosis speckcon = zeros(nsyl,1); speckanc = zeros(nsyl,1); % for spectral peak frequencies and amplitudes specconpf = zeros(nsyl,1); % col 1 = dB re: 1 bit; col 2 = dB re: ancpa specconpa = zeros(nsyl,2); %specancpf = zeros(nsyl,1); %specancpa = zeros(nsyl,1); % for storing the number of spectra taken from each consonant % (both target and anchor) %nconspec = zeros(nsyl,ntime); %nancspec = zeros(nsyl,ntime); % for storing the amount of overlap (in ms) from the nspec spectra taken % col 1 for con; col 2 for anc conoverlap = zeros(nsyl,2); % for storing rms measurements of fricative intensity; % col 1 = con dB re: 1 bit; col 2 = anc dB re: 1 bit % col 3 = con dB relative anc %wavedb = zeros(nsyl,3); % THESE MEASUREMENTS TAKEN TO BE IRRELEVANT GIVEN SPECTRAL MOMENTS % to store power measurements via 'integration' of spectra % specconpow = zeros(nsyl,ntime*nsum); % bitconpow = zeros(nsyl,ntime*nsum); % powpeg = zeros(nsyl,2); % to store power measurements via 'integration' of % spectra as in indicator of amount of voicing; within each ntime, % col 1 = con dB re 1 bit; col 2 = anc dB re 1 bit % col 3 = con dB re anc <- JUST THIS ONE, ntime/-or-1 TIMES voipow = zeros(nsyl,ntime); % below truncHz noipow = zeros(nsyl,1); % above truncHz % envelope measurement array % samples envconss = zeros(nsyl,1); % milliseconds envconms = zeros(nsyl,1); % proportion of vowel envconpr = zeros(nsyl,1); for i = 1:nsyl disp([' measuring syllable # ' num2str(i) ' out of ' num2str(nsyl)]) if strcmp(syl(3*i).name(5),'v') | strcmp(syl(3*i).name(5),'z') conv(i) = 1; else conv(i) = 0; end if strcmp(syl(3*i).name(5),'s') | strcmp(syl(3*i).name(5),'z') conp(i) = 1; else conp(i) = 0; end if strcmp(syl(3*i).name(6),'v') | strcmp(syl(3*i).name(6),'z') disv(i) = 1; elseif strcmp(syl(3*i).name(6),'n') disv(i) = 2; else disv(i) = 0; end if strcmp(syl(3*i).name(6),'s') | strcmp(syl(3*i).name(6),'z') displ(i) = 1; elseif strcmp(syl(3*i).name(6),'n') displ(i) = 2; else displ(i) = 0; end if strcmp(syl(3*i).name(7),'o') pros(i) = 0; else pros(i) = 1; end sylb = syl( 3*i ).sig; con = syl( (3*i) - 1 ).sig; anc = syl( (3*i) - 2 ).sig; %%%%%%%%%%%%%%%%%%%%%% % duration, envelope % %%%%%%%%%%%%%%%%%%%%%% % duration in samples transformed to duration in ms % just calculate it once! durfact = 1000/fs; csamp = length( con ); sylsamp = length( sylb ); vsamp = sylsamp - csamp; durc(i) = csamp * durfact; dursyl(i) = sylsamp * durfact; durv(i) = dursyl(i) - durc(i); %( csamp * ( 1000 / syl(3*i-1).fs ) ); % duration proportions (durc/dursyl and durv/dursyl) durcp(i) = csamp / sylsamp; durvp(i) = 1 - durcp(i); %( sylsamp - csamp ) / sylsamp; % anchor duration measurements; not sure if I need 'em asamp = length( anc ); dura(i) = asamp * durfact; % envelope: time to percpa (percentage peak amp) % % just in case nothing equals exactly percpa of the max %fudge = maxsyl*1/10; % % to get indices of the points that reach percpa if pros(i) == 0 indvect = csamp+1:sylsamp; vow = sylb(csamp+1:sylsamp); else indvect = 1:sylsamp-csamp+1; vow = sylb(1:sylsamp-csamp+1); end % to calculate the percpa of max maxsyl = max(abs(vow)); % % percpa! percpasyl = percpa*maxsyl; % % gathers the indices of the percpa +/- fudge portions of sylb if pros(i) == 0 percpaind = indvect( ( abs( sylb(csamp+1:sylsamp) ) > percpasyl ) ); % else percpaind = indvect( ( abs( sylb(1:sylsamp-csamp+1) ) > percpasyl ) ); % end minperc = min(percpaind); maxperc = max(percpaind); % if pros(i)==0 envconss(i,1) = minperc - csamp; else envconss(i,1) = vsamp - maxperc; end % envconms(i,1) = envconss(i,1) * durfact; envconpr(i,1) = envconss(i,1) / vsamp; % calculate ratio of anchor and target c duration? not for now.... %%%%%%%%%%%%%%%%%%%%% % fricative spectra % %%%%%%%%%%%%%%%%%%%%% % integer increment to get nspec windows of fricative to average % used to be csamp - (fftlen + 1) / nspec, so ? nsamp = floor( ( csamp - fftlen ) / nspec ); nsampanc = floor( ( asamp - fftlen ) / nspec ); % window overlap in ms conoverlap(i,1) = (fftlen - nsamp) * durfact; conoverlap(i,2) = (fftlen - nsampanc) * durfact; %nspeccon = ceil( (csamp - fftlen) / nsamp ); %nspecanc = ceil( (asamp - fftlen) / nsamp ); %nconspec(i,1) = nspeccon; %nancspec(i,1) = nspecanc; % to store the spectra prior to be averaged indspecscon = zeros(speclen,nspec); indspecsanc = zeros(speclen,nspec); % if preemph, implement first difference filter if preemph % target consonant tempsigcon = filter([1 -.95],[1],con); sigcon = tempsigcon(1:length(tempsigcon)-1); % anchor consonant tempsiganc = filter([1 -.95],[1],anc); siganc = tempsiganc(1:length(tempsiganc)-1); else sigcon = con; siganc = anc; end % con and anc hamming window, fft, storage for j = 1:nspec % target consonant % take a window, hamming it winsigcon = hamming(fftlen) .* sigcon( ( j*nsamp-nsamp+1):( (j*nsamp-nsamp+1)+fftlen-1) ); % take the fast fourier transform of same dftwinsigcon = fft(winsigcon, fftlen); % put the 0:pi (0:nyq) portion in indspecs indspecscon(:,j) = abs( dftwinsigcon( 1:speclen ) ); % 'anchor' consonant - [s]aid % take a window, hamming it winsiganc = hamming(fftlen) .* siganc( ( j*nsampanc-nsampanc+1):( (j*nsampanc-nsampanc+1)+fftlen-1) ); % take the fast fourier transform of same dftwinsiganc = fft(winsiganc, fftlen); % put the 0-pi (0-nyq) portion in indspecs indspecsanc(:,j) = abs( dftwinsiganc( 1:speclen ) ); end % manipulations of the anchor spectrum, which is averaged once % average the anchor spectra tempspecanc = zeros(speclen,1); tempspecanc = mean( indspecsanc, 2 ); spectra(i).anc = tempspecanc; % truncate the anchor spectrum to keep voiced components separate truncspecanc_dn = tempspecanc(1:truncpoint,1); truncspecanc_up = tempspecanc(truncpoint+1:speclen,1); % max frequency anchor [mxanc mianc] = max( tempspecanc ); %specancpf(i) = mianc*(nyq/speclen); %specancpa(i) = mxanc; % normalize the anchor spectrum to take moments normspecanc = truncspecanc_up/(specbin*sum(truncspecanc_up)); % spectral moments for the anchor specmanc(i) = specbin*sum( normspecanc.*freq' ); specvanc(i) = sqrt(specbin*sum( ((freq - specmanc(i)).^2)'.*normspecanc )); specsanc(i) = specbin*sum( ((freq - specmanc(i)).^3)'.*normspecanc )/specvanc(i)^3; speckanc(i) = specbin*sum( ((freq - specmanc(i)).^4)'.*normspecanc )/specvanc(i)^4; % for pegging the spectral power measurements below % taking the mean here was a nearly 100% arbitrary decision % this could more logically/profitably be something like the sum or the % integral (sum of the amplitude times the binsize), or something... %ancpeg = specbin*sum(tempspecanc(truncpoint+1:2*truncpoint,1)); %powpeg(i,:) = [ancpeg 20*log10(ancpeg)]; % average across spectra tempspeccon = zeros(speclen,1); tempspeccon = mean( indspecscon, 2 ); spectra(i).con = tempspeccon; % to split the 1:nspeccon spectra into ntime sets specsplit = floor(nspec/ntime); % = 3 for ntime = 3, nspec = 9 modspec = mod(nspec,ntime); % = 0 for ntime = 3, nspec = 9 % voicing power for k = 1:ntime truncspec_dn = mean( indspecscon( 1:truncpoint, k*specsplit-specsplit+1:k*specsplit ), 2); convint = specbin*sum( truncspec_dn ); ancvint = specbin*sum( truncspecanc_dn ); % voipow(i,k) = 20*log10( convint/ancvint ); voipow(i,k) = 20*log10( convint ); % double-checking that 'raw' analysis is consistent... 1/18/07 end % GOOD CODE, SO DON'T WANT TO DELETE IT, BUT UNNECESSARY RIGHT NOW % averaging across multiple target consonant spectra %for k = 1:ntime % if pros(i)==0 % onset, so work backward (in time) from vowel % kfact = ntime-(k-1); % tempspeccon(:,kfact) = mean( indspecscon( :, kfact*specsplit-specsplit+1:kfact*specsplit+modspec ), 2); % spectra(i,kfact).con = tempspeccon(:,kfact); % else % coda, so work forward (in time) from vowel % tempspeccon(:,k) = mean( indspecscon( :, k*specsplit-specsplit+1:k*specsplit ), 2); % spectra(i,k).con = tempspeccon(:,k); % end %end % END SAVE GOOD CODE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % noise power truncspec_up = tempspeccon( truncpoint+1:speclen, : ); connint = specbin*sum( truncspec_up ); ancnint = specbin*sum( truncspecanc_up ); % noipow(i,1) = 20*log10( connint/ancnint ); noipow(i,1) = 20*log10( connint ); % double-checking that 'raw' analysis is consistent... 1/18/07 % find the spectral peak and its index [mxcon micon] = max( truncspec_up ); % change the peak index to frequency specconpf(i,1) = micon*(nyq/speclen); % raw peak amplitudes specconpa(i,1) = 20*log10( mxcon ); specconpa(i,2) = 20*log10( mxcon/mxanc ); % normalize the spectrum to take moments normspeccon = truncspec_up/(specbin*sum(truncspec_up)); % spectral mean specmcon(i,1) = specbin*sum( normspeccon.*freq' ); % spectral standard deviation (sqrt(variance)) specvcon(i,1) = sqrt(specbin*sum( ((freq - specmcon(i)).^2)'.*normspeccon )); % spectral skewness specscon(i,1) = specbin*sum( ((freq - specmcon(i)).^3)'.*normspeccon )/specvcon(i)^3; % spectral kurtosis speckcon(i,1) = specbin*sum( ((freq - specmcon(i)).^4)'.*normspeccon )/specvcon(i)^4; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % fricative intensity (waveform) % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % root mean square amplitude values for each fricative %rms_con = sqrt( sum( syl(3*i-1).sig.^2 ) / csamp ); %rms_anc = sqrt( sum( syl(3*i-2).sig.^2 ) / asamp ); % plug the above into a decibel formula, and voila! %wavedb(i,1) = 20*log10( rms_con ); %wavedb(i,2) = 20*log10( rms_anc ); %wavedb(i,3) = 20*log10( rms_con / rms_anc ); %%%%%%%%%%%%%%%%%%%% %% %%%%%%%%%%%%%% %% %% % IMPORTANT! % %% %% %%%%%%%%%%%%%% %% %%%%%%%%%%%%%%%%%%%% % things to do: % % can't remember what I wanted to tell myself... % % things not to do (?): % % -figure out how far from CV/VC boundary to start taking formant measurements % -use (adapt?) the peak picking algorithm(s) from S522 to find the formant peaks % -decide how many formant measurements to take % -decide how to calculate dF1, dF2 % -implement (and analyze on paper) the autocorrelation thingy for amount of voicing end % no idea if all the variables are included, so this will probably need to % be checked and updated repeatedly. data = [conv conp disv displ pros durc durv dursyl durcp durvp envconms envconpr ... specconpf specconpa specmcon specvcon specscon speckcon noipow voipow conoverlap]; %data = [conv conp disv displ pros noipow voipow]; % for checking on raw %power consistency %dcols = ['conv conp disv displ pros durc durv dursyl durcp durvp specconpf specconpa specancpf specancpa specmcon specvcon specscon speckcon specmanc specvanc specsanc speckanc wavedb']; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % NOTE: amplitude/intensity comparisons between the anchor % and the target consonant should be done with the waveforms % instead of the spectrum (i.e., spectral peak), right? NO! % % amplitude/intensity as a function of the whole consonant % seems more relevant than the amplitude at just one frequency % component, but I may be thinking about this wrong... % % Indeed, you are (or were). The integral of the spectrum over certain % frequency bands (to be decided) is what counts! %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % A nested loop for taking multiple spectral slices instead of a single % average spectral representation % % % - probably don't need it, but there's some % fancy indexing in there, so I don't want to throw it out, and I'll % probably forget about it completely if it's in another file... % % divide the fricative up into ntime+1 equal parts % should probably make sure that winsize < chunk ? % NOT SURE I NEED THIS FOR AVERAGED SPECTRA; PROBABLY DON'T % chunk = floor(length( syl(2*i-1).sig )/(ntime+1)); % % for j = 1:ntime % % % take a window, multiply by a hamming window % winsig = hamming(winsize) .* syl(2*i-1).sig( (j*chunk):(j*chunk+winsize-1) ); % % % take the fast fourier transform % dftwinsig = fft(winsig); % % % stick the 0-pi portion of that in the spectra matrix % tempspec = abs(dftwinsig(1:speclen)); % spectra(:,ntime*i-ntime+j) = tempspec; % % % find the spectral peak % nyq = syl(2*i-1).fs/2; % [mx mi] = max(tempspec); % specpf(i,j) = mi*(nyq/speclen); % % amplitude not dB transformed - still need to figure out what to % % do with the 's' in said... % specpa(i,j) = mx; % % normspec = tempspec/sum(tempspec); % % % transformation of x-axis to frequency in Hz % freq = 1:speclen; % freq = (freq/speclen)*nyq; % % % spectral mean % specm(i,j) = sum( normspec.*freq' ); % % % spectral standard deviation (sqrt(variance)) % specv(i,j) = sqrt(sum( ((freq - specm(i,j)).^2)'.*normspec )); % % % spectral skewness % specs(i,j) = sum( ((freq - specm(i,j)).^3)'.*normspec )/specv(i,j)^3; % % % spectral kurtosis % speck(i,j) = sum( ((freq - specm(i,j)).^4)'.*normspec % )/specv(i,j)^4;% % % end %end