Christian's Python Library


Sponsored by the Laboratory of Bio-Acoustics, Dept. of Cognitive Biology, University Vienna
This is a collection of open source Python scripts that I found useful for analyzing data from human and mammalian vocalizations, and for generating aesthetically pleasing graphs and videos, to be used in publications and presentations/lectures.
To date, these packages are available (there's more, but the respective code is still in development):
You can download the library here. There are two straightforward ways to install the modules: either store them in the same directory as the script that you're running, or add the modules' path to the system path (before importing the modules) from within the Python script that you're executing, for example:
modulePath = '/Users/ch/data/programming/python/lib/' # change as appropriate
import sys
sys.path.append(modulePath)
# now you're good to import the modules
import generalUtility
import dspUtil
import matplotlibUtil
Here are a few tutorial-style examples (with Python code):
Enjoy! - If you have any questions, please contact me.
Finally, two DISCLAIMERS: (1) This library was developed on a Mac, and it was never thoroughly tested on a Windows platform. There might be problems with the backslashes used in Windows path indicators. From what I've seen, you should not run into problems if you avoid backslashes and use forward slashes instead. (2) This was developed with Python 2.7. While in theory it MIGHT run with Python 3, there is no guarantee ...
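The forward-slash advice from disclaimer (1) can be applied automatically before a path is handed to the modules. The following is only a minimal sketch: the helper names toForwardSlashes and addToSystemPath are hypothetical and not part of the library, and the directory shown is a placeholder.

```python
import sys

def toForwardSlashes(path):
    # hypothetical helper: replace Windows-style backslashes by forward
    # slashes and make sure the directory ends with exactly one slash
    path = path.replace('\\', '/')
    return path.rstrip('/') + '/'

def addToSystemPath(modulePath):
    # only append the directory if it is not on the module search path yet
    if modulePath not in sys.path:
        sys.path.append(modulePath)
    return modulePath

# placeholder directory, change as appropriate
modulePath = toForwardSlashes('C:\\data\\programming\\python\\lib')
addToSystemPath(modulePath)
print(modulePath)
```

Calling addToSystemPath() repeatedly is harmless, since the directory is only added once.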

Loading a wave file and saving a normalized version of the sound

Download source code | WAV input file
import myWave
import dspUtil
import numpy
import copy
import generalUtility
fName = 'WilhelmScream.wav'
# load the input file
# data is a list of numpy arrays, one for each channel
numChannels, numFrames, fs, data = myWave.readWaveFile(fName)
# normalize the left channel, leave the right channel untouched
data[0] = dspUtil.normalize(data[0])
# just for kicks, reverse (i.e., time-invert) all channels
for chIdx in range(numChannels):
    n = len(data[chIdx])
    dataTmp = copy.deepcopy(data[chIdx])
    for i in range(n):
        data[chIdx][i] = dataTmp[n - (i + 1)]
# save the normalized file (both channels)
# this is the explicit code version, to make clear what we're doing. since we've
# treated the data in place, we could simply write:
# myWave.writeWaveFile(data, outputFileName, fs) and not declare dataOut
dataOut = [data[0], data[1]]
fileNameOnly = generalUtility.getFileNameOnly(fName)
outputFileName = fileNameOnly + "_processed.wav"
myWave.writeWaveFile(dataOut, outputFileName, fs)
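The two processing steps of this example (normalizing and time-inverting) can be tried out without any sound file. The sketch below works on a plain Python list and assumes that "normalize" means peak normalization, i.e. scaling so that the largest absolute sample value becomes 1; dspUtil.normalize may well behave differently. The function names are hypothetical.

```python
def normalizePeak(samples, maxOut=1.0):
    # assumption: scale so that the largest absolute sample equals maxOut
    peak = max(abs(x) for x in samples)
    if peak == 0:
        return list(samples)  # silent signal: nothing to scale
    return [x * maxOut / peak for x in samples]

def reverse(samples):
    # slicing with a step of -1 returns a time-inverted copy
    return samples[::-1]

channel = [0.0, 0.25, -0.5, 0.125]
normalized = normalizePeak(channel)
reversedChannel = reverse(normalized)
print(normalized)
print(reversedChannel)
```

The slice expression samples[::-1] also works on numpy arrays, so the explicit reversal loop of the listing above could be shortened accordingly.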

Reading and writing Praat TextGrids (for interactively annotating recordings)

Praat is an incredibly powerful free software application for analyzing human (and mammalian) vocalizations. It offers a wealth of analysis options, as well as scripting support. However, when performing more complex analysis tasks (particularly when a larger number of files is involved, or when performing an analysis that is not provided by Praat, such as calculating the EGG contact quotient), algorithmic interaction between Praat and Python might be desirable.
In order to call Praat from your Python code, Praat's directory must be known to your computer. For this, you need to add the directory where Praat is stored to your computer's system path variable. See these tutorials for doing this on a Mac or on Windows.
One possibility is to utilize Praat as a graphical user interface to annotate files containing acoustic recordings using Praat TextGrids, and then continue processing the annotated segments with Python. Here, such an approach is presented, consisting of three steps:
(1) Locate all WAV files in a directory and automatically create Praat TextGrids containing one IntervalTier with Python (saves you a lot of clicking when you need to analyze hundreds of files):
Download source code (part 1)
import praatTextGrid
import myWave
import generalUtility
import os
# we assume that you have a couple of audio files in this directory, which
# you'd like to annotate - change as needed
path = '/Users/ch/data/programming/python/lib/demo/'
# we'll only deal with WAV files in this example
suffix = 'wav'
# look for audio files in the directory
for fName in os.listdir(path):
    if fName.split('.')[-1] == suffix:
        print fName
        # safeguard: do not create a new TextGrid if there's already one (to
        # prevent yourself from accidentally overwriting already performed
        # annotations)
        fileNameOnly = generalUtility.getFileNameOnly(fName)
        outputFileName = path + fileNameOnly + '.TextGrid'
        if os.path.isfile(outputFileName):
            print "\tWARNING: TextGrid already exists"
        else:
            # open audio file to get duration
            numChannels, numFrames, fs, data = myWave.readWaveFile(path + fName)
            n = len(data[0])
            duration = float(n) / float(fs)
            # create a new PraatTextGrid object. make sure to indicate the sample
            # duration
            textGrid = praatTextGrid.PraatTextGrid(0, duration)
            # create a new interval tier (you may want to change the tier label)
            intervalTier = praatTextGrid.PraatIntervalTier("myAnnotation")
            # add an empty element to the interval tier (to prevent Praat from
            # crashing when working with the generated TextGrid)
            intervalTier.add(0, duration, "")
            # add the interval tier to the TextGrid
            # (the corresponding call is missing from this listing; see the
            # downloadable source code)
            # finally, save the TextGrid in the same directory as the WAV file
            # (the corresponding call is likewise missing from this listing)
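The two preparatory tasks of step (1), finding the WAV files that still lack a TextGrid and determining each file's duration, can also be sketched with the Python standard library alone. This is not how myWave or generalUtility work internally; the function names are hypothetical, and the demonstration file is generated on the fly in a temporary directory.

```python
import os
import tempfile
import wave

def getWavDuration(fileName):
    # read the number of frames and the sampling frequency from the header
    wavFile = wave.open(fileName, 'rb')
    try:
        numFrames = wavFile.getnframes()
        fs = wavFile.getframerate()
    finally:
        wavFile.close()
    return numFrames / float(fs)

def listFilesToAnnotate(path, suffix='wav'):
    # return all WAV files that do not have a sibling TextGrid yet
    result = []
    for fName in sorted(os.listdir(path)):
        base, ext = os.path.splitext(fName)
        if ext.lower() != '.' + suffix:
            continue
        if os.path.isfile(os.path.join(path, base + '.TextGrid')):
            continue
        result.append(fName)
    return result

# demonstration: write one second of silence (mono, 16 bit, 8 kHz)
demoDir = tempfile.mkdtemp()
demoFile = os.path.join(demoDir, 'demo.wav')
wavFile = wave.open(demoFile, 'wb')
wavFile.setnchannels(1)
wavFile.setsampwidth(2)
wavFile.setframerate(8000)
wavFile.writeframes(b'\x00\x00' * 8000)
wavFile.close()

print(listFilesToAnnotate(demoDir))
print(getWavDuration(demoFile))
```

Skipping files that already have a TextGrid mirrors the safeguard of the listing above: existing annotations are never overwritten.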
(2) Annotate all WAV files within Praat (select both the WAV file and the TextGrid, once loaded in Praat, and open them by clicking "View & Edit"): add intervals to the IntervalTier as needed, and provide a meaningful label (in this example, any interval label will suffice).
(3) Finally, the annotation data can be utilized for further analysis. In this example, we'll simply generate a CSV file containing the file name, start and end time and the label of all annotations of all WAV files in the directory. The generated CSV could then be used for further analysis, e.g. in OpenOffice.
Download source code (part 2)
import praatTextGrid
import generalUtility
import os
# path that contains all the annotated wave files
path = '/Users/ch/data/programming/python/lib/demo/'
csvFileName = 'TextGridAnalysis.csv'
# open the output file
csvFile = open(path + csvFileName, 'w')
csvFile.write("file, tStart, tEnd, label\n")
# look for all TextGrids in the directory
for fName in os.listdir(path):
    if fName.split('.')[-1] == 'TextGrid':
        fileNameOnly = generalUtility.getFileNameOnly(fName)
        print fileNameOnly
        # instantiate a new TextGrid object
        textGrid = praatTextGrid.PraatTextGrid(0, 0)
        # initialize the TextGrid object from the TextGrid file
        # arrTiers is an array of objects (either PraatIntervalTier or
        # PraatPointTier)
        arrTiers = textGrid.readFromFile(path + fName)
        numTiers = len(arrTiers)
        if numTiers != 1:
            raise Exception("we expect exactly one Tier in this file")
        # get the first tier in the file and check its name (we assume that
        # there's exactly one tier in the file)
        tier = arrTiers[0]
        if tier.getName() != 'myAnnotation':
            # ignore the text grid if the name of the Tier does not match the
            # name that was given in the script that generated the TextGrid
            print "\tWARNING: unexpected tier (%s) in file %s. Skipping this file." \
                % (tier.getName(), fName)
        else:
            # now loop over all the defined intervals in the tier.
            for i in range(tier.getSize()):
                # only consider those intervals that are actually labelled.
                if tier.getLabel(i) != '':
                    interval = tier.get(i)
                    print "\t", interval
                    # write to CSV file
                    csvFile.write("%s, %f, %f, %s\n" % (fileNameOnly, \
                        interval[0], interval[1], interval[2]))
# close the output file, so that all data is written to disk
csvFile.close()
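Further analysis of the generated CSV file does not have to happen in a spreadsheet. As a rough sketch (written for Python 3, with a hypothetical function name and invented demo rows), the total annotated duration per label could be computed like this. The column names are those written by the script above; skipinitialspace takes care of the blank that follows every comma.

```python
import csv
import io

def totalDurationPerLabel(csvText):
    # sum up the annotated duration for every label
    reader = csv.DictReader(io.StringIO(csvText), skipinitialspace=True)
    totals = {}
    for row in reader:
        duration = float(row['tEnd']) - float(row['tStart'])
        totals[row['label']] = totals.get(row['label'], 0.0) + duration
    return totals

# invented demo data in the layout produced by the script above
demoCsv = (
    "file, tStart, tEnd, label\n"
    "rec1, 0.500000, 1.500000, a\n"
    "rec1, 2.000000, 2.250000, i\n"
    "rec2, 0.000000, 0.750000, a\n"
)
totals = totalDurationPerLabel(demoCsv)
print(sorted(totals.items()))
```

Note that labels containing a comma would break this simple layout, since the script above writes the fields without quoting.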

Delegating analysis tasks to Praat from within Python

In this little example, we'll calculate a sound file's time-varying intensity by calling Praat's To Intensity... function within Python and create a simple graph with the result. Since Praat's intensity data is not calibrated, we'll convert the analysis data to relative dB.
In order for this example to work, Praat needs to be installed properly, and the Praat executable needs to be available in the command line.
Download source code | WAV input file
import praatUtil
import os
from matplotlib import pyplot as plt
import matplotlibUtil
import generalUtility
import sys
fName = 'WilhelmScream.wav'
# for this to work our sound file needs to be in the same directory as this
# Python script, and we need to get the path of that script:
path = sys.path[0] + '/'
fileNameOnly = generalUtility.getFileNameOnly(fName)
# calculate the Intensity data using Praat
dataT, dataI = praatUtil.calculateIntensity(path + fName)
# normalize the dB data, since it's not calibrated
dataI -= dataI.max()
# generate the graph
graph = matplotlibUtil.CGraph(width = 8, height = 3)
ax = graph.getArrAx()[0]
ax.plot(dataT, dataI, linewidth = 2)
ax.set_xlabel("Time [s]")
ax.set_ylabel("SPL [dB]")
graph.padding = 0.1
graph.adjustPadding(bottom = 2, right = 0.5)
# It is not aesthetically pleasing when graph data goes all the way to the
# upper and lower edges of the graph. I prefer to have a little space.
matplotlibUtil.setLimit(ax, dataI, 'y', rangeMultiplier = 0.1)
# every doubling of sound pressure results in an increase of the sound pressure
# level (SPL) by about 6 dB. Therefore, we need to change the y-axis ticks
# (the corresponding call is missing from this listing)
# finally, save the graph to a file
plt.savefig(fileNameOnly + "_intensity.png")
Running this script will produce this graph:
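The arithmetic behind the relative dB scale and the 6 dB tick spacing can be checked in a few lines. This is only an illustrative sketch with invented level values and hypothetical function names; it does not use praatUtil or matplotlibUtil.

```python
import math

def pressureRatioToDb(ratio):
    # level difference in dB for a given ratio of sound pressures
    return 20.0 * math.log10(ratio)

def toRelativeDb(levels):
    # shift the data so that the maximum lies at 0 dB
    peak = max(levels)
    return [x - peak for x in levels]

def tickPositions(levels, spacing=6.0):
    # multiples of the spacing that cover the whole data range
    lower = math.floor(min(levels) / spacing) * spacing
    upper = math.ceil(max(levels) / spacing) * spacing
    numTicks = int(round((upper - lower) / spacing)) + 1
    return [lower + i * spacing for i in range(numTicks)]

levels = [62.0, 71.5, 80.0, 74.0]  # invented, uncalibrated dB values
relative = toRelativeDb(levels)
ticks = tickPositions(relative)
print(round(pressureRatioToDb(2.0), 2))
print(relative)
print(ticks)
```

A list like ticks could then be handed to matplotlib's ax.set_yticks() to obtain one tick per doubling of sound pressure.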

Creating a F1/F2 plot (Praat interaction, simple graph example)

In this more complex example three concepts are being demonstrated: running a Praat script from within Python, loading and parsing a Praat Formant structure, and generating a simple graph. [Note that you could just as well call this module's calculateFormants() function instead of generating a Praat script, but then we'd miss the chance to explain the runPraatScript() method in this tutorial.]
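The Praat script used in this example is nothing more than a string with one command per line, so it can also be put together by a small function. The sketch below is an illustration only: buildFormantScript is a hypothetical helper, while the three Praat commands and their arguments are the ones used in this tutorial.

```python
def buildFormantScript(path, fName, maxFormantHz=5000):
    # hypothetical helper: assemble the three-line Praat script as one string
    baseName = fName.rsplit('.', 1)[0]  # file name without its extension
    lines = [
        'do ("Read from file...", "%s")' % (path + fName),
        'do ("To Formant (burg)...", 0, 5, %d, 0.025, 50)' % maxFormantHz,
        'do ("Save as short text file...", "%s")'
            % (path + baseName + '.Formant'),
    ]
    # every command needs to end with a new line character
    return '\n'.join(lines) + '\n'

script = buildFormantScript('/Users/ch/data/programming/python/lib/demo/',
                            'AEIOU_vocalFry.wav')
print(script)
```

The resulting string has the same content as the script that is concatenated step by step in the listing of this example.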
Download source code | the analyzed WAV file | TextGrid annotation
import praatUtil
import matplotlibUtil
import generalUtility
import praatTextGrid
from matplotlib import pyplot as plt
fName = 'AEIOU_vocalFry.wav'
# in order for this to work, you need to specify a path. change as appropriate
path = '/Users/ch/data/programming/python/lib/demo/'
# assemble a Praat script to analyze the formants of the file. Make sure that
# you add a backslash in front of every single quote of your Praat script (i.e.,
# every ' turns into \' - this does not apply here, since the Praat script below
# does not contain any single quotes). Also, add a new line (backslash n) at the
# end of every line in the script
# In particular, we'll create the script below. Note how the path and file names
# are being replaced by variables, so that you can easily change them.
# do ("Read from file...", "/Users/ch/data/programming/python/lib/demo/AEIOU_vocalFry.wav")
# do ("To Formant (burg)...", 0, 5, 5000, 0.025, 50)
# do ("Save as short text file...", "/Users/ch/data/programming/python/lib/demo/AEIOU_vocalFry.Formant")
fileNameOnly = generalUtility.getFileNameOnly(fName)
script = ''
script += 'do ("Read from file...", "' + path + fName + '")\n'
script += 'do ("To Formant (burg)...", 0, 5, 5000, 0.025, 50)\n'
script += 'do ("Save as short text file...", "' + path + fileNameOnly \
+ '.Formant")\n'
elapsed = praatUtil.runPraatScript(script)
print "Praat script executed in " + str(elapsed) + " seconds."
# read the generated Praat formants file
# (the line that instantiates the formants object is missing from this listing;
# see the downloadable source code)
formants.readFile(path + fileNameOnly + '.Formant')
# read the accompanying Praat text grid (see the Praat TextGrid example for an
# extended documentation). We expect a TextGrid that contains one IntervalTier
# labelled 'vowels'. Within this IntervalTier, the occurring vowels are indicated
textGridFileName = fileNameOnly + '.TextGrid'
textGrid = praatTextGrid.PraatTextGrid(0, 0)
arrTiers = textGrid.readFromFile(textGridFileName)
numTiers = len(arrTiers)
if numTiers != 1:
    raise Exception("we expect exactly one Tier in this file")
tier = arrTiers[0]
if tier.getName() != 'vowels':
    raise Exception("unexpected tier")
# parse the TextGrid: create a dictionary that stores a list of start and end
# times of all intervals where that particular vowel occurs (that way we'll
# cater for multiple occurrences of the same vowel in a file, should that ever
# happen)
arrVowels = {}
for i in range(tier.getSize()):
    if tier.getLabel(i) != '':
        interval = tier.get(i)
        vowel = interval[2]
        if not vowel in arrVowels:
            arrVowels[vowel] = []
        tStart, tEnd = interval[0], interval[1]
        arrVowels[vowel].append([tStart, tEnd])
# analyze the formant data: assign formant data to occurring (annotated) vowels
# where applicable, and discard the other formant data (i.e., the data that
# occurs at times for which no vowel annotation was made)
n = formants.getNumFrames()
arrFormants = {}
arrGraphData = {}
for i in range(n):
    t, formantData = formants.get(i)
    # loop over all vowels and all intervals for each vowel
    for vowel in arrVowels:
        for tStart, tEnd in arrVowels[vowel]:
            if t >= tStart and t <= tEnd:
                # now we know that that particular formant data chunk is within
                # the interval of a particular vowel annotation. use the formant
                # data in the graph to be generated
                # make sure we can actually store the formant data: create a
                # dictionary holding two lists: one for the first and one for
                # the second formant
                if not vowel in arrGraphData:
                    arrGraphData[vowel] = {'f1':[], 'f2':[]}
                # only consider 1st and 2nd formant
                # (the lines that append the formant values to arrGraphData
                # are missing from this listing)
# finally, generate the graph. We're making use of matplotlib's colour cycle by
# only issuing one plot command per vowel. That way we won't have to deal with
# indicating colours ourselves, making our code flexible so it can deal with any
# number of occurring vowels
graph = matplotlibUtil.CGraph(width = 6, height = 6)
ax = graph.getArrAx()[0]
for vowel in arrGraphData:
    print vowel, len(arrGraphData[vowel]['f1'])
    ax.plot(arrGraphData[vowel]['f1'], arrGraphData[vowel]['f2'], 'o', \
        markersize = 5, alpha = 0.4, label=vowel)
ax.set_xlabel("F1 [Hz]")
ax.set_ylabel("F2 [Hz]")
ax.set_title("F1/F2 plot")
graph.padding = 0.1
graph.adjustPadding(left = 1.5)
Executing this code will produce the following graph:
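The core of this example is the assignment of analysis frames to annotated intervals. Here is a stand-alone sketch of that step with invented numbers. It assumes that every frame is available as a (t, f1, f2) tuple, which is a simplification: the data layout returned by the library's formants object may differ, and groupFramesByVowel is a hypothetical name.

```python
def groupFramesByVowel(frames, vowelIntervals):
    # frames: list of (t, f1, f2) tuples (assumed layout)
    # vowelIntervals: dictionary mapping a vowel to a list of [tStart, tEnd]
    graphData = {}
    for t, f1, f2 in frames:
        for vowel, intervals in vowelIntervals.items():
            for tStart, tEnd in intervals:
                if tStart <= t <= tEnd:
                    # create the entry for this vowel on first use
                    entry = graphData.setdefault(vowel, {'f1': [], 'f2': []})
                    entry['f1'].append(f1)
                    entry['f2'].append(f2)
    return graphData

# invented demo data: the vowel 'i' is annotated twice
vowelIntervals = {'a': [[0.0, 0.5]], 'i': [[1.0, 1.5], [2.0, 2.5]]}
frames = [(0.1, 700.0, 1200.0), (0.7, 500.0, 1500.0),
          (1.2, 300.0, 2300.0), (2.1, 310.0, 2250.0)]
graphData = groupFramesByVowel(frames, vowelIntervals)
print(sorted(graphData.keys()))
```

The frame at t = 0.7 s falls outside all annotated intervals and is therefore discarded, just like unannotated formant data in the example above.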

Graph demo

Matplotlib is a superb module for creating aesthetically pleasing graphs. When combined with numpy and any other data analysis framework (I mostly use Praat from within Python via the praatUtil module), one can create fully or semi-automated algorithmic solutions for analyzing huge amounts of data - an approach that, once mastered, vastly increases productivity!
There is no limit to the praises I sing for the matplotlib framework: it has enormously simplified my life as a publishing scientist. Matplotlib is by and large easy to use, particularly for simpler tasks. There are, however, a few situations where matplotlib's functionality is obscure and not well documented (mostly if one wants to tweak details in a graph's appearance). To overcome this minor shortcoming, a number of utility functions are collected in the module matplotlibUtil.
A few of these utility functions (and one class for handling the layout of graphs: CGraph) are illustrated in the code below:
Download source code
import matplotlibUtil
from matplotlib import pyplot as plt
import numpy
import matplotlib.cm as cm
outputPath = '' # substitute the output path here
# instantiate the graph container
graph = matplotlibUtil.CGraph(
    width = 8,
    height = 8,
    dpi = 72,
    lineWidth = 1.5,
    padding = 0.06,
    fontSize = 13,
    fontFamily = 'serif',
    fontFace = 'Times New Roman',
    fontWeight = 'normal'
)
# define the layout and create the graph
graph.setRowRatios([3, 2, 5])
graph.setColumns([1, 2, 2])
arrAx = graph.createFigure()
# for demonstration purposes: add a title for each panel, so we know which
# ax reference points to which panel
for i, ax in enumerate(arrAx):
    ax.set_title("panel " + str(i+1))
# invent some data
duration = 0.5 # [s]
fs = 1000 # [Hz] sampling frequency
f0 = 40 # [Hz] frequency of the sine wave we'll create
n = int(round(duration * float(fs)))
arrData = numpy.zeros(n)
arrT = numpy.zeros(n)
for i in range(n):
    A = 1.0 - float(i) / n # amplitude [0..1] - create a decaying sinusoid
    t = float(i) / float(fs)
    arrT[i] = t
    arrData[i] = A * numpy.sin(numpy.pi * 2.0 * t * float(f0))
# plot the data in the top panel
ax = arrAx[0]
ax.plot(arrT, arrData, linewidth = graph.lineWidth)
ax.set_xlabel("Time [s]")
matplotlibUtil.formatAxisTicks(ax, 'y', 1) # adapt the y axis ticks
# extract parts of the data and plot in next row
n = len(arrData)
for i in range(2):
    idx1 = (n / 2) * i
    idx2 = idx1 + n / 4
    ax = arrAx[i + 1]
    ax.plot(arrT[idx1:idx2], arrData[idx1:idx2], linewidth = graph.lineWidth)
    ax.set_xlabel("Time [s]")
    matplotlibUtil.setLimit(ax, arrData[idx1:idx2], 'y', 0.1)
# create a noisy data distribution and plot in bottom left panel. fit a
# polynomial to the data
n = 100
arrX = range(n)
arrY = numpy.zeros(n)
for i in arrX:
    arrY[i] = numpy.sqrt(i) + numpy.random.random() * 3
ax = arrAx[-2]
ax.plot(arrX, arrY, 'o', markersize = 5, alpha = 0.6, color='green')
matplotlibUtil.plotPolyFit(ax, arrX, arrY, degrees = 2, fontSize = 10, \
lineSize = 3, lineColor = 'red', txtX = 10, txtY = 1.6, numDigitsEq = 5)
# create and plot 3D data (z as a function of x and y) with isocontours
ax = arrAx[-1]
numIsocontours = 10
arrX = numpy.arange(0, 1, 0.02)
arrY = numpy.arange(0, 1, 0.02)
arrZ = numpy.zeros((len(arrY), len(arrX)))
for idxY, y in enumerate(arrY):
    for idxX, x in enumerate(arrX):
        arrZ[idxY][idxX] = 100.0 * (numpy.sqrt(x) + y * y)
        #print x, y, numpy.sqrt(x) + y * y
matplotlibUtil.plotIsocontours(ax, arrX, arrY, arrZ, colorMap = cm.afmhot, \
numIsocontours = 6, contourFontSize = 10)
# finalize and save the graph
graph.adjustPadding(left = 2.5, right = 1.0, top = 1.0, bottom = 1.0, \
hspace = 0.45, wspace = 0.5)
graph.addPanelNumbers(numeratorType = matplotlibUtil.NUMERATOR_TYPE_ROMAN, \
fontSize = 16, fontWeight = 'bold', \
countEveryPanel = True, format = '(%s)', offsetLeft = 0.1, offsetTop = 0.00)
fileName = outputPath + 'matplotlibUtilDemo'
plt.savefig(fileName + '.png')
plt.savefig(fileName + '.svg')
Running this script will produce this graph:
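The invented test signal and the index arithmetic for the two middle panels can be reproduced without numpy or matplotlib. The sketch below (hypothetical function name) uses floor division (//) for the indices: in Python 2.7 the plain division of two integers yields an integer, but in Python 3 it yields a float, which cannot be used as a slice index.

```python
import math

def decayingSinusoid(duration, fs, f0):
    # sine wave whose amplitude decays linearly from 1 towards 0
    n = int(round(duration * float(fs)))
    arrT = [i / float(fs) for i in range(n)]
    arrData = [(1.0 - float(i) / n) * math.sin(2.0 * math.pi * f0 * t)
               for i, t in enumerate(arrT)]
    return arrT, arrData

arrT, arrData = decayingSinusoid(duration=0.5, fs=1000, f0=40)
n = len(arrData)
segments = []
for i in range(2):
    idx1 = (n // 2) * i   # floor division keeps the index an integer
    idx2 = idx1 + n // 4
    segments.append((idx1, idx2))
print(n)
print(segments)
```

Each tuple in segments marks the slice of the signal that would be shown in one of the two panels of the second row.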

Generating a video from a series of matplotlib graphs

FFMPEG is an excellent open source tool to process and manage video data. Here, we utilize FFMPEG's functionality to turn a series of graphs (created with matplotlib) into an AVI movie. In order for this to work, you need to have FFMPEG installed and available on your command line.
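To give an idea of what happens on the command line, here is a sketch of how an FFMPEG call for a numbered image series plus a sound file could be assembled. This is an assumption for illustration and not necessarily what generalUtility.createMovie does internally; buildFfmpegCommand is a hypothetical helper and the file names are placeholders. The command is only built, not executed.

```python
def buildFfmpegCommand(imagePattern, audioFileName, outputFileName,
                       fps=25, videoBitrate=8000000, overwrite=True):
    # hypothetical helper: assemble the argument list for an FFMPEG call
    cmd = ['ffmpeg']
    if overwrite:
        cmd.append('-y')  # overwrite an existing output file without asking
    # numbered image series, e.g. name_00001.png, name_00002.png, ...
    cmd += ['-framerate', str(fps), '-i', imagePattern]
    # sound track
    cmd += ['-i', audioFileName]
    # target video bitrate in bits per second
    cmd += ['-b:v', str(videoBitrate)]
    cmd.append(outputFileName)
    return cmd

cmd = buildFfmpegCommand('WilhelmScream_%05d.png', 'WilhelmScream.wav',
                         'WilhelmScream.avi')
print(' '.join(cmd))
```

A list like this could be passed to subprocess.call() once FFMPEG is available on the command line.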
Download source code | WAV input file
import praatUtil
import os
from matplotlib import pyplot as plt
import matplotlibUtil
import generalUtility
import sys
import myWave
import numpy
# script control
fName = 'WilhelmScream.wav' # we will create a scrolling display of this file
fps = 25 # frame rate of the generated video
videoBitrate = 8000000 # compression: define the quality of the generated movie
deleteTmpFiles = True # we'll delete the temporary graph files once we're done
displayDuration = 2 # how many seconds of data to display
cursorOffset = 0.5 # offset of the display cursor within the display window
# define the structure of the generated tmp files
fileNameOnly = generalUtility.getFileNameOnly(fName)
fileNameStructure = fileNameOnly + '_%05d.png'
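# get the user's tmp directory, where we'll store all temporary data
# (the line that defines tmpDataDir is missing from this listing; see the
# downloadable source code)
# get the current directory, where we'll store the generated movie
userPath = path = sys.path[0] + '/'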
# calculate the Intensity data using Praat and prepare data for plotting
dataT, dataI = praatUtil.calculateIntensity(path + fName)
dataI -= dataI.max()
intensityTimeStep = dataT[1] - dataT[0]
fsIntensity = 1.0 / intensityTimeStep
# read the sound data and prepare it for plotting
numChannels, numFrames, fs, data = myWave.readWaveFile(path + fName)
sampleData = data[0] # take the left (and only) channel in the file
dataTsound = numpy.zeros(numFrames)
for i in range(numFrames):
    dataTsound[i] = float(i) / float(fs) # time offsets
# loop over the data
timeStep = 1.0 / fps
duration = numFrames / float(fs)
t = 0
frameCount = 0
arrImageFileNames = [] # we'll store the names of the generated files here, so
# we can delete them when we clean up
while t < duration:
    tStart = t - cursorOffset
    tEnd = tStart + displayDuration
    frameCount += 1
    print "Frame %d at t = %f seconds" % (frameCount, t)
    # generate the graph
    graph = matplotlibUtil.CGraph(width = 8, height = 6)
    graph.setRowRatios([6, 4]) # set the ratio of the row heights
    arrAx = graph.getArrAx()
    # plot the waveform: need to determine the data that is shown
    # watch out: tStart can be negative, tEnd can be larger than the sound
    # duration
    offsetL = int(round(tStart * float(fs))) # theoretical lower offset
    offsetU = int(round(tEnd * float(fs))) # theoretical upper offset
    offsetLreal = offsetL # real (potentially corrected) lower offset
    if offsetLreal < 0: offsetLreal = 0
    offsetUreal = offsetU # real (potentially corrected) upper offset
    if offsetU >= numFrames: offsetUreal = numFrames - 1
    ax = arrAx[0]
    ax.plot(dataTsound[offsetLreal:offsetUreal], \
        sampleData[offsetLreal:offsetUreal], linewidth = 2)
    ax.plot([t, t], [-1000, 1000], color='red') # time cursor
    ax.set_xlabel("Time [s]")
    ax.set_ylabel("Pressure [arbitrary]")
    ax.set_xlim(tStart, tEnd)
    matplotlibUtil.setLimit(ax, sampleData, 'y', rangeMultiplier = 0.1)
    # plot the intensity: need to determine the data that is shown
    # watch out: tStart can be negative, tEnd can be larger than the sound
    # duration
    offsetL = int(round(tStart * float(fsIntensity))) # theoretical lower offset
    offsetU = int(round(tEnd * float(fsIntensity))) # theoretical upper offset
    offsetLreal = offsetL # real (potentially corrected) lower offset
    if offsetLreal < 0: offsetLreal = 0
    offsetUreal = offsetU # real (potentially corrected) upper offset
    if offsetU >= len(dataI): offsetUreal = len(dataI) - 1
    ax = arrAx[1]
    ax.plot(dataT[offsetLreal:offsetUreal], \
        dataI[offsetLreal:offsetUreal], linewidth = 2, color='orange')
    ax.plot([t, t], [-1000, 1000], color='red') # time cursor
    ax.set_xlabel("Time [s]")
    ax.set_ylabel("SPL [dB]")
    ax.set_xlim(tStart, tEnd)
    matplotlibUtil.setLimit(ax, dataI, 'y', rangeMultiplier = 0.1)
    # finalize the graph
    graph.padding = 0.1
    graph.adjustPadding(left = 1.2, right = 0.5, bottom = 1.2, hspace = 0.4)
    # finally, save the graph to the tmp dir. note that the frame number is part
    # of the file name
    graphFileName = tmpDataDir + (fileNameStructure % frameCount)
    plt.savefig(graphFileName)
    # remember the file name, so we can delete the file when we clean up
    arrImageFileNames.append(graphFileName)
    # very important: increase the time, to avoid an endless loop
    t += timeStep
# now we should have a bunch of graphs (with ever increasing time offset) in
# our tmp directory. let's convert them into a movie
aviOutputFileName = userPath + generalUtility.getFileNameOnly(fName) + '.avi'
print "generating AVI file " + aviOutputFileName
generalUtility.createMovie(arrImageFileNames, aviOutputFileName, \
videoFps = fps, audioFileName = userPath + fName, \
deleteImageFiles = deleteTmpFiles, fileNameStructure = fileNameStructure, \
overwriteAviFile = True, videoBitrate = videoBitrate)
print "done."
Running this script will produce this video.
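The trickiest part of the scrolling display is the bookkeeping of the display window, which may start before the beginning of the sound or extend beyond its end. The following stand-alone sketch (hypothetical function names, invented numbers) isolates that logic, together with the calculation of one time offset per video frame.

```python
import math

def clampWindow(tStart, tEnd, fs, numSamples):
    # convert a time window to slice indices that are valid for the data
    idxL = int(round(tStart * float(fs)))
    idxU = int(round(tEnd * float(fs)))
    idxL = min(max(idxL, 0), numSamples)
    idxU = min(max(idxU, 0), numSamples)
    return idxL, idxU

def frameTimes(duration, fps):
    # one time offset per video frame; each offset is derived from the frame
    # index instead of being accumulated, which avoids rounding drift
    numFrames = int(math.ceil(duration * fps))
    return [i / float(fps) for i in range(numFrames)]

# window that starts before the beginning and ends after the end of the data
print(clampWindow(-0.5, 1.5, fs=1000, numSamples=1200))
# window that lies entirely within the data
print(clampWindow(0.5, 2.5, fs=1000, numSamples=5000))
times = frameTimes(duration=1.0, fps=25)
print(len(times))
```

Since the upper index of a Python slice is exclusive, clamping it to numSamples (rather than numSamples - 1) keeps the last sample in the display.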