Sounds of the Brain

Matt Boardman, Faculty of Computer Science



Below are several electroencephalogram (EEG) recordings, transformed into an audible frequency range. Click the icon next to each recording to hear the brain in action!


Chris, June 14
Original frequencies (305 kB)
Additional harmonics (508 kB)
Frequency Spectrogram (FFT)
Power Spectral Density

Chris, June 14 (2)
Original frequencies (276 kB)
Additional harmonics (460 kB)
Frequency Spectrogram (FFT)
Power Spectral Density

Chris, June 15
Original frequencies (423 kB)
Additional harmonics (423 kB)
Frequency Spectrogram (FFT)
Power Spectral Density

Chris, June 28
Original frequencies (290 kB)
Additional harmonics (484 kB)
Frequency Spectrogram (FFT)
Power Spectral Density

Chris, June 28 (2)
Original frequencies (290 kB)
Additional harmonics (484 kB)
Frequency Spectrogram (FFT)
Power Spectral Density

Chris, June 28 (3)
Original frequencies (309 kB)
Additional harmonics (514 kB)
Frequency Spectrogram (FFT)
Power Spectral Density

Chris, June 28 (4)
Original frequencies (296 kB)
Additional harmonics (494 kB)
Frequency Spectrogram (FFT)
Power Spectral Density

Jesse, July 6
Original frequencies (344 kB)
Additional harmonics (573 kB)
Frequency Spectrogram (FFT)
Power Spectral Density

Jesse, July 6 (2)
Original frequencies (315 kB)
Additional harmonics (525 kB)
Frequency Spectrogram (FFT)
Power Spectral Density

Billie, Aug. 3
Original frequencies (311 kB)
Additional harmonics (518 kB)
Frequency Spectrogram (FFT)
Power Spectral Density

Jeff
Original frequencies (909 kB)
Additional harmonics (909 kB)
Frequency Spectrogram (FFT)
Power Spectral Density

How were these recordings made?

The recordings above were made during the summer of 2005, from several experiments conducted at Dalhousie University as a joint effort between the Department of Psychology and the Cognitive Computational Neuroscience group in the Faculty of Computer Science. A 64-channel BioSemi [1] digital EEG machine was used, with a maximum sampling frequency of 2048 Hz.

The concept of making the recordings audible was inspired by NASA's Cassini probe [10], which recorded radio emissions from Saturn's rings in November 2003. These radio emissions were later transformed into audio recordings using a similar process to that described below.

What does it sound like?

The audio recordings sound somehow familiar. Listeners have described them as the sounds you hear when moving in an elevator, the resonance of a strong wind around a post or the ocean around a wharf, the sound you hear when you hold a seashell to your ear, the sound of a heart valve pumping blood, or the sonogram of a baby in the womb. In truth the sounds do not exist in nature in any human-audible form, but we can imagine that the brain would sound like this if it were possible to hear the electrical energy it generates.

What do the graphs mean?

The graphs are spectrograms, also known as sonograms or waterfall displays. They show how the frequency components of a signal change over time: frequencies are shown on the vertical scale, and time increases along the horizontal scale. Higher-intensity frequencies are shown in red, while blue indicates lower intensities. The diagrams on the right show the power spectral density rather than the relative intensity of each frequency, which can sometimes allow for better discrimination of the signal from the surrounding noise.

Many of the diagrams show higher intensities throughout the course of the experiments at frequencies of 10 to 12 Hz, which are alpha waves indicating that the subject is awake and alert, but relaxed. Some of the graphs show frequencies of 21 to 23 Hz, which are beta waves and indicate that the subject is engaged in an activity which requires concentration. The vertical bands that sometimes appear show activity at all frequencies, and can most likely be attributed to noise in the signal, e.g. sensor noise as one or more sensors momentarily lose contact with the scalp.

What are EEG machines?

EEG machines detect electrical energy created by the firing activity of neurons in the human brain, which sums to create brainwave patterns that can be detected by sensors attached to the scalp. These brainwaves typically have a frequency range [6] of 0.5 to 30 Hz (cycles per second).

How were the audio recordings created?

Subjects wore a cloth cap resembling a swimmer's cap, similar to the one shown here [4], containing 64 sensors attached to the EEG machine and arranged in accordance with the international 10-20 standard [9] (i.e. A=auricular, C=central, Fp=frontal pole, F=frontal, P=parietal, O=occipital, T=temporal). The average of the left and right mastoid electrodes (behind the ears) was used as a reference signal to create the electrode montage. Subjects were asked to perform a simple task repetitively, such as tapping their left and right forefingers, while the sensors recorded their brain activity.

MATLAB [13] was used to read in the binary .bdf file created by the EEG machine, using a BDF file format reader [11]. MATLAB was also used to generate the above spectrograms and power spectral density diagrams, and to create uncompressed .wav audio files, which were then compressed into .mp3 format using RazorLAME 1.1.5.1342 [5] with joint stereo and a 48 kHz output resampling frequency, at a bitrate of either 48 kbit/s or 160 kbit/s depending on the sampling frequency of the original EEG file (256 Hz or 2048 Hz respectively).

What digital processing was employed?

The EEG files were not processed for artifact removal (e.g. eye blinks or heartbeat); however, any frequencies outside the 0.5 to 30 Hz range were excluded using fast Fourier transforms (FFT).
STEP 1 The EEG file is read from the proprietary .bdf format as one large matrix, with 73 rows (64 channels plus eight reference electrodes and a status channel) and a number of columns corresponding to the duration of the recording in seconds × the sampling frequency Fs.
STEP 2 A montage is created by re-referencing all 64 electrode channels to the mean of the left and right mastoid electrodes.
STEP 3 The frequencies of each of the 64 electrode channels are limited to a range of 0.5 to 30 Hz, by taking the FFT of each channel individually, setting all frequencies beyond the intended range to zero, and taking the real part of the resulting complex inverse FFT.
STEP 4 Left and right channel vectors are created by summing the 64 channels from the EEG's sensors such that left audio is created from sensors near the right part of the brain and right audio is created from sensors near the left part of the brain.
STEP 5 The left and right channel vectors are concatenated into a single matrix with two rows, one per channel. The stereo audio vector is normalized by dividing by twice the RMS (root mean square) amplitude. Soft clipping to a maximum amplitude of +/- 1 is achieved by passing the signal through a hyperbolic tangent tanh(). To remove the audible "click" at the beginning and end of the audio sample, a linear fade-in / fade-out is employed with a duration of one second (i.e. Fs samples). The vector is then saved as a .wav audio file, specifying the playback speed Fp = 16 × Fs to bring the frequency spectrum into an audible range (i.e. Fs = 256 Hz becomes Fp = 4096 Hz, and Fs = 2048 Hz becomes Fp = 32,768 Hz).
STEP 6 To create the additional harmonics, the stereo sound vector is multiplied by a complex frequency to modulate the sidebands of the signal to a higher frequency, using the vector itself for the positive parts of the frequency spectrum and a Hilbert transform [2] (one-sided Fourier transform) of the signal for the negative parts of the frequency spectrum:

s2(t) = cos(2πt × (F0/Fs)) × s1(t) − sin(2πt × (F0/Fs)) × H(t)

Where:
      s1(t)   is the original signal
      s2(t)   is the frequency-shifted signal
      H(t)    is the Hilbert transform of s1(t)
      F0      is the frequency to shift by (in Hz)
      Fs      is the sampling frequency (in Hz)

This technique is called frequency shifting [3], and is done entirely in the time domain so that no FFT is required. The resulting signal is added to the original signal to create a signal containing both low frequency and high frequency harmonics. This signal is then normalized and soft-clipped as above, and saved as a second .wav audio file.
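The frequency-shifting step above can be sketched in Python/NumPy. This is an illustrative re-implementation, not the original MATLAB code; the Hilbert transform is computed here by doubling the positive half of the Fourier spectrum, consistent with the one-sided transform mentioned above:

```python
import numpy as np

def hilbert_transform(x):
    """Hilbert transform of a real signal via the one-sided spectrum."""
    n = len(x)
    spec = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                        # keep the DC component
    gain[1:(n + 1) // 2] = 2.0           # double the positive frequencies
    if n % 2 == 0:
        gain[n // 2] = 1.0               # keep the Nyquist bin for even n
    analytic = np.fft.ifft(spec * gain)  # analytic signal: x + j*H{x}
    return analytic.imag

def frequency_shift(s1, f0, fs):
    """Shift every frequency component of s1 up by f0 Hz (single-sideband)."""
    t = np.arange(len(s1)) / fs
    return (s1 * np.cos(2 * np.pi * f0 * t)
            - hilbert_transform(s1) * np.sin(2 * np.pi * f0 * t))
```

Adding the shifted signal back to the original, as described above, yields a signal containing both the original spectrum and a copy displaced upward by F0.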


Table 1:   Digital processing required to generate the audio signal.
The stereo sound was created from these 64 EEG channels by including even-numbered EEG sensors in the left channel (e.g. F2, F4) and odd-numbered sensors in the right channel (e.g. F1, F3); the centrally-located z channels (e.g. Fpz, Cz) were added to both left and right stereo channels. Note that left and right were reversed from the 10-20 standard to create a "mirror" audio image.

This processing is detailed in Table 1.
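As a rough sketch of the normalization, soft clipping, and fades from STEP 5 (again in Python/NumPy rather than the original MATLAB; `prepare_audio` is a hypothetical helper name):

```python
import numpy as np

def prepare_audio(stereo, fs):
    """Normalize, soft-clip, and fade a (2, N) stereo matrix as in STEP 5."""
    rms = np.sqrt(np.mean(stereo ** 2))
    x = stereo / (2.0 * rms)            # normalize by twice the RMS
    x = np.tanh(x)                      # soft clipping into (-1, +1)
    ramp = np.linspace(0.0, 1.0, fs)    # one-second linear fade
    x[:, :fs] *= ramp                   # fade-in removes the starting click
    x[:, -fs:] *= ramp[::-1]            # fade-out removes the ending click
    playback_rate = 16 * fs             # e.g. Fs = 256 Hz -> Fp = 4096 Hz
    return x, playback_rate
```

The returned playback rate would be passed to the .wav writer so that the 0.5 to 30 Hz brainwave content lands in the audible range.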

How were the spectrogram images created?

Spectrograms show a graph of how the frequency distribution of a signal changes over time. Frequencies are shown on the vertical axis, and the recording time is shown along the horizontal axis. Colours in the graph show whether the signal at a particular time is more or less intense at a particular frequency, and are shown on a logarithmic scale in decibels (dB), according to the colour bar to the right of each graph [7]:

Relative Intensity = 20 × log10(P(f,t)/Pref)

Where the power P(f,t) is relative to the power Pref generated by a 1.0 μV signal at 30 Hz, and is therefore unitless. The time signal was split into one-half-second slices, in accordance with the original sampling frequency (e.g. 256 Hz or 2048 Hz). Each slice was concatenated with several previous and following slices to create a moving window on the sampled data of width 4×Fs (slices at the beginning and end of the recording were padded with zeros to the same length, thus preserving the frequency scale). Each windowed sample was then multiplied by a sine taper to de-emphasize data from the surrounding slices, and analyzed for frequency content using the fast Fourier transform (FFT) method.
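A minimal Python/NumPy sketch of this windowed-FFT spectrogram (an illustration of the process described above, not the original MATLAB code; `spectrogram_db` and its parameters are illustrative):

```python
import numpy as np

def spectrogram_db(x, fs, p_ref=1.0):
    """Windowed-FFT spectrogram in dB: half-second hops, 4*fs-wide windows.

    Returns (freqs, times, S) with S in dB relative to p_ref. The colour
    mapping used in the figures is left to the caller.
    """
    hop = fs // 2                          # one new slice every half second
    width = 4 * fs                         # moving window of width 4*Fs
    pad = np.concatenate([np.zeros(width // 2), x, np.zeros(width // 2)])
    taper = np.sin(np.pi * np.arange(width) / width)     # sine window
    slices = [np.abs(np.fft.rfft(pad[s:s + width] * taper))
              for s in range(0, len(x), hop)]
    S = 20 * np.log10(np.array(slices).T / p_ref + 1e-12)  # avoid log(0)
    freqs = np.fft.rfftfreq(width, d=1.0 / fs)
    times = np.arange(len(slices)) * hop / fs
    return freqs, times, S
```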

The power spectral density diagrams were created in a similar way, but rather than using FFTs to determine the frequency content of each sample, Welch's method [8] was used to estimate the power spectral density. Welch's method allows for greater discrimination of the signal from background noise, which is assumed to be Gaussian, by averaging overlapping slices within each sample.
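For comparison, SciPy ships an implementation of Welch's method (`scipy.signal.welch`); a minimal sketch on a synthetic noisy 10 Hz signal, with illustrative parameters rather than those used for the figures:

```python
import numpy as np
from scipy.signal import welch

fs = 256                                    # one of the EEG sampling rates
t = np.arange(4 * fs) / fs
rng = np.random.default_rng(1)
x = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(len(t))

# Average periodograms of overlapping, windowed one-second segments;
# averaging reduces the variance of the PSD estimate.
f, pxx = welch(x, fs=fs, nperseg=fs, noverlap=fs // 2)
peak_hz = f[np.argmax(pxx)]                 # dominant component, near 10 Hz
```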

On some spectrograms, you will notice small triangles at the bottom edge. These triangles correspond to button presses during one of the experiments (an upward-pointing triangle for a left press, a downward-pointing triangle for a right press). The spectrograms without these triangles were created during other experiments.

What use are these recordings?

The audio signals were created for fun to see what would happen. However, they may be useful as a simple auditory cue to gauge distortions caused by noise that may be present in the signals. A continuous version available during the initial sensor attachment and setup may provide instantaneous feedback as a measure of the quality of contact of each sensor with the scalp.

Is the MATLAB code available?

The code is available for perusal here (7 kB). You are of course free to use this code as is, however it is specifically tailored to our particular setup, so some adjustments will inevitably need to be made for your unique requirements (channel labels, file formats and so on).


References

[1]    BioSemi, Corporate Website, [http://www.biosemi.com/], 2005.

[2]    R. Bracewell, The Fourier Transform and Its Applications, 3rd ed., McGraw-Hill, 1999, pp. 267-272.

[3]    Robert Bristow-Johnson, Frequency Shift by Complex Exponential, Music-DSP Mailing List, California Institute of the Arts, [http://aulos.calarts.edu/pipermail/music-dsp/2005-February/029614.html], 2005.

[4]    Cortech Solutions, LLC, Image of ActiveTwo Data Acquisition System Head Cap, [http://www.cortechsolutions.com/ActiveTwo_System.htm], 2004-2005.

[5]    Holger Dors, RazorLAME, [http://www.dors.de/razorlame/], 2000-2003.

[6]    Glenn Elert, Samantha Charles, Frequency of Brain Waves, The Physics Hypertextbook, [http://hypertextbook.com/facts/2004/SamanthaCharles.shtml], 2004-2005.

[7]    Peter Elsea, Decibels, [http://arts.ucsc.edu/EMS/Music/tech_background/TE-06/teces_06.html], 1996.

[8]    Monson H. Hayes, Statistical Digital Signal Processing and Modeling, John Wiley & Sons, Inc., 1996, pp. 415-20.

[9]    Jaakko Malmivuo, Robert Plonsey, Bioelectromagnetism: Principles and Applications of Bioelectric and Biomagnetic Fields, Oxford University Press, 1995, Fig. 13.2c.

[10]    NASA, Cassini-Huygens: Mission to Saturn & Titan, [http://saturn.jpl.nasa.gov/], 2003-2005.

[11]    Alois Schloegl, T.S. Lorig, BDF File Format Reader for MATLAB (Distributed under GNU General Public License), [http://www.medfac.leidenuniv.nl/neurology/knf/kemp/edf.htm], 1997-1998.

[12]    Julius O. Smith III, Mathematics of the Discrete Fourier Transform (DFT), [http://ccrma.stanford.edu/~jos/mdft/], 2003.

[13]    The MathWorks, Inc., MATLAB Online Help File, [http://www.mathworks.com/], 1994-2005.

