My Other Sites of Interest:

SpectratunePlus (including Jan 2014 additions):
Free Software (Windows) that Generates Several Types of Spectrograms of Music: Both Real Time and from Recorded Sound, with Many Features for Musical Aural Feedback (MIDI notes, Sing-Along pitch overlay), Other Features to Aid in Chord and Key Recognition, as Well as Some Other Features to Support an Understanding of How Overtones Relate to Perceived Musical Sound.

MIDI Music Visualizer/ Feedback Tool
Free Windows Software: Use any MIDI file and Get a Piano Roll view of the Music, Supplemented by Aural/Visual Feedback Tools Designed to Help in Chord and Key Identification and Aural Recognition, and Sing Along Pitch Accuracy.


ToneGen
Free Windows Software that Generates Simple or Complicated Combinations of Musically-Scaled Sines, For Experimenting with Psychoacoustic Foundations of Musical Sounds. Also Supports Measuring Limits of Pitch Discrimination in Individuals.



DYNAMIC SPECTROGRAMS OF MUSIC WEBSITE

A Music and Science Site



This site discusses, and has examples of, a kind of spectrogram video geared towards looking at and exploring the psychoacoustics of music.

The site was put together by Norm Spier.






For the purposes of looking at performed music, consider a spectral analysis (i.e., pitch breakdown) for a fixed time that looks like:



On this type of spectrogram, when you go around clockwise 1/12 of a revolution, you go up a half-step. The half-steps (against a particular tuning; here the piano was tuned to A3=222hz) lie on the white rays emanating from the center. As a result, all occurrences of the same note in different octaves fall on the same ray. Going out one level on the blue spiral curve brings you up an octave. The labels of note and octave (e.g. "E8") give the octave for the note shown on the ray on the outermost loop of the spiral curve. (E.g., for the note E, the example shows some E3, E4, E5, and E6, with E5 the strongest.)
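
For readers who like to see the geometry as arithmetic, here is a minimal Python sketch of the spiral layout just described. The function name and the choice to measure the angle from the ray of the reference A are my own assumptions for illustration; the program that made the videos may do it differently.

```python
import math

def spiral_position(freq_hz, a_ref_hz=220.0):
    """Map a frequency to (angle, loops) on the spiral.

    angle: degrees clockwise from the ray of the reference A (assumed origin).
    loops: whole octaves above the reference A (negative means below it).
    """
    half_steps = 12 * math.log2(freq_hz / a_ref_hz)  # half-steps above the reference A
    angle_deg = (half_steps % 12) * 30.0             # 1/12 revolution = 30 degrees
    loops = math.floor(half_steps / 12)              # one loop outward per octave
    return angle_deg, loops

# The same note in two octaves lands on the same ray, one loop apart.
print(spiral_position(220.0))   # reference A
print(spiral_position(440.0))   # A one octave higher
```

For a recording tuned like the still above, you would pass a_ref_hz=222.0 instead of the default.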

Now, if you do a spectrogram like this every tenth of a second or so, and look at it synchronized with the playing music, you get a dynamic picture of the music.

(People familiar with the science of hearing should note that what you are looking at in such a dynamic spectrogram is the raw data that the brain gets from the ear.)


Here [.mov] <-(choose format)-> [.avi] is a 30-second segment of recorded music (from Bach's Goldberg Variations) as such a dynamic spectrogram. (NOTE: ".mov" format gives a sharper image, but on Windows, you need Apple's Quicktime player. ".avi" should work on any Windows machine. On a Windows computer, if you don't already have it, you can get the Quicktime player needed for the sharper ".mov" for free by clicking here. These files are about 4Mb each, and take about 20 seconds to load on a high-speed connection, longer on a low-speed one.)

Here [.mov] <-(choose format)-> [.avi] is another segment of recorded music (about 35 seconds from the Brahms Requiem movement 2).

NOTE: You can pause these sample clips, go backwards, forwards, single frame, even in the I.E. browser view. To do so, use the controls on the bottom right for Apple Quicktime, and elsewhere for other players.

Note that what you are looking at in such a spectrogram is not just the notes in the score, but also the harmonics of those notes, as present in the recorded music. The amount of such harmonics will depend on the particular instruments and how they are played. (Where, exactly, will the harmonics be? --click here to see.) A detail here that winds up being important is that each fundamental and each harmonic is, by definition, a sine-wave shape. This is important because the ear (via the basilar membrane) discriminates and separates out the separate sine waves in a sound, i.e. it separates out the tones and harmonics. The spectrograms likewise separate out the sine waves, and display the strength of each sine component.

The fact that the spectrograms contain both the notes and some harmonics can be a bit confusing at first. It helps to look at a single note, intervals, and chords, which I have spectralized and which can be examined below (under MORE SAMPLE SPECTROGRAM CLIPS). One of the basic observations is that the patterns of a single note, or a consonant chord, have energy at a fundamental, 4 half-steps up, then 3 more up (counting angle only, not octave). However, when a note is played softly, some of the harmonics may not show up, and either the 4 or 3 more half-step-up positions may not show. (They are caught by my analytical method, actually, but are below the start of the scale that I have used on my spectrogram.)

Unification of Single Sounds: As in these spectrograms, the ear does output the information about all the different harmonics of each separate sound as separate signals, and, in its marvelousness, the brain puts it all together into the correct picture of actual distinct sounds. From the wonderful book edited by Perry Cook (Amazon | B & N ) , we learn this is done using the perceptual mechanism of "common outcome" -- that is, the harmonics of a given sound go through parallel changes in volume and frequency-shift (as well as having a prescribed harmonic frequency relationship) -- this allows the reconstruction. With a little practice, you can somewhat see the parallel changes and separate out different sounds in these spectrograms. (It is much harder to also incorporate the harmonic frequency relationship information -- everything is happening too fast.) Though the visual perceptive apparatus can't sort this out all that well (and further, my frame rate is a bit slow to give the visual perceptive apparatus its best shot) -- the aural perceptive apparatus apparently can.

Consonance and Dissonance: Psychoacoustic studies quoted in Perry Cook's book show that dissonance is caused when music contains sine components (i.e. fundamentals or overtones) that are close to each other (less than the critical bandwidth apart), but not at virtually the same tone. The critical bandwidth is about 3 half-steps when we are dealing with tones above about A5, and a bit larger (about 100 hz -- a frequency-varying number of half-steps) below A5. The dissonance occurs when substantial such close sines exist, either in any of the fundamentals or overtones. (This yields surprising results: sines 11 half-steps apart DO NOT sound dissonant, though the major 7th interval is classified as dissonant. The sines do not sound dissonant only because there are no overtones -- there would be clashes if we added in overtones. See my example moving sine spectrogram below.) At any rate, dissonance should be pretty perceivable from these spectrograms -- look for sine components with close, but not negligible, differences.
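
To make the rule of thumb concrete, here is a rough Python sketch that applies it to pairs of sines. It is only an illustration of the paragraph above, not a model from Cook's book: the function names are mine, A5 is taken as 880 hz (standard tuning, in this page's octave numbering), and the 10-cent cutoff for "virtually the same tone" is a made-up placeholder.

```python
import math

A5_HZ = 880.0  # A5 in this page's octave numbering, assuming A3 = 220 hz

def critical_bandwidth_hz(freq_hz):
    """Rough critical bandwidth, following the rule of thumb in the text."""
    if freq_hz >= A5_HZ:
        return freq_hz * (2 ** (3 / 12) - 1)  # about 3 half-steps above freq_hz
    return 100.0                               # about 100 hz below A5

def sines_clash(f1_hz, f2_hz, same_tone_cents=10.0):
    """True if two sines are within a critical bandwidth but not virtually equal."""
    low, high = sorted((f1_hz, f2_hz))
    cents_apart = 1200 * math.log2(high / low)
    if cents_apart <= same_tone_cents:   # hypothetical "virtually the same tone" cutoff
        return False
    return (high - low) < critical_bandwidth_hz(low)

c4, b4 = 261.63, 493.88                  # middle C and the B a major 7th above it
print(sines_clash(c4, b4))               # the two pure sines alone
print(sines_clash(2 * c4, b4))           # second harmonic of C against the B
```

With these assumptions, the pure sines a major 7th apart do not clash, but the second harmonic of the lower note does clash with the upper note, which is the point made in the parenthetical above.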

Do note that these spectrograms supersede neither the score, nor music theory, for appreciation and analysis purposes. They are simply an additional tool, which provides certain information about the sound. This information is likely closer to the input feeding into the nervous system than is the musical score. The score, however, particularly after a music-theoretic interpretation is applied, may be closer to what is actually perceived after processing by the brain, particularly for melodic elements.

Though the spectrograms are scientifically pure and without any artistic enhancement, you may notice certain visual symmetries and beauties, and certain beauties in the combined visual and aural, as in dance.


HOW I GENERATE THE SPECTROGRAMS

Basically, the ear distinguishes sound pitch by use of a long membrane called the basilar membrane in the inner ear, whose thickness and stiffness varies over its length. The varying thickness and stiffness cause different portions of the membrane over its length to tend to sustain vibration in response to particular different frequencies of sound. (In engineering terminology, the different portions "resonate" at different frequencies.) At any rate, when the ears receive a sound, the nerves sense which portions of the basilar membrane are vibrating, and thus can determine the pitches present.

Now, the computer program that generates these spectrograms does NOT simulate the basilar membrane exactly. But it uses a similar idea of resonation induced by vibration of an elastic object. Instead of a vibrating basilar membrane, the program uses a large number of (simulated) tuned, damped mass-springs picking up the sound. Each mass-spring looks like:

The sound vibration (change in air pressure) puts an alternating leftward and rightward force on the mass. The resulting motion builds up to a very large amount if and only if the spring tension is just right so that the mass moves a little, springs back and forth, and does the springing back and forth so that it is in synch with the continuing back and forth of the sound pressure, that is, so that the sound pressure is always going in the same direction as the building up motion. If this happens, we have resonance. It works out that the resonance will occur only close to one particular frequency (with the resonance being less as we move away from the frequency). How quickly the resonance reduces as we move away from the resonant frequency depends on the amount of frictional/air-resistance forces encountered by the mass-spring.

All in all, by selecting lots of mass springs with varying spring tension, masses, and friction/air-resistance levels, my program comes up with about 3000 mass-springs which have resonant frequencies spread throughout the audio frequencies. (The tendency of each spring to resonate as we go off the frequency of greatest resonance is set as appropriately as the laws of physics allow.) The program simulates the sound pushing the mass-springs, and plots the amount of vibration of each mass-spring at the appropriate point for its resonant frequency on the spiral spectrogram. It does this each 1/10 of a second or so.

NOTE: The power that I have plotted is decibels of power, where the total number of decibels in the scale is indicated on the plot in the lower left. (Less-technical people: This is a logarithmic scale. When you go up 10 decibels, you multiply the power by 10. When you go up 20 decibels, you multiply the power by 100. Thus, for a recording where the plotted range is 30 decibels, a bit of sound that goes 1/3 of the way up the scale is about 10 times as powerful as one you can just barely see. One that goes 2/3 of the way up the scale is 100 times as powerful as one you can just see, and one that goes all the way up is 1000 times as powerful as one you can just see.)

As another example, on the still at the top of this page (where the indicated range is 25 decibels, not 30), the E5 is roughly 20 decibels, or 100 times, more powerful than the E3 or E6.
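
The decibel arithmetic above can be checked with a few lines of Python. The function names are mine, chosen only for this illustration.

```python
def db_to_power_ratio(decibels):
    """Every 10 decibels multiplies the power by 10."""
    return 10 ** (decibels / 10)

def power_ratio_at(fraction_of_scale, scale_range_db):
    """Power relative to a just-visible sound, for a point partway up the plotted scale."""
    return db_to_power_ratio(fraction_of_scale * scale_range_db)

# On a plot with a 30-decibel range: 1/3, 2/3 and all the way up the scale.
for fraction in (1 / 3, 2 / 3, 1.0):
    print(round(power_ratio_at(fraction, 30)))
```

The same function gives the E5 example: db_to_power_ratio(20) is a factor of 100.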

Engineering and Math people: I have a few technical details at the bottom of this web page.






FREE

REAL TIME MUSICAL SPECTRATUNE SOFTWARE

-- DOES A LIVE VISUALIZATION LIKE ON THIS PAGE AND MUCH MORE

--CLICK HERE FOR INFO




-- FREE

NORM'S MUSIC VISUALIZER (Interactively play and display MIDI files with chord- and key-recognition tools, to aid comprehension of tonal aspects of the music, with optional overlaid pitch feedback for humming, singing along, and several other musical immersion features, as well.)

--Click here for MUSIC VISUALIZER INFO


-- FREE

New (2012) SpectratunePlus (See notes and harmonics in recordings with notes/key overlaid; support for sing-along intonation and piano-tone feedback; as well as Spectratune features)

--Click here for MUSIC VISUALIZER INFO


MORE SAMPLE SPECTROGRAM CLIPS--(EACH TAKES 20 SECS OR MORE TO DOWNLOAD AT HIGH SPEED!)
(Each plays for about 30 seconds)

[.mov] -- [.avi] MOVING SINE AND FIXED SINE. The moving sine starts more than an octave down from the fixed one, and goes to more than an octave above. It is not dissonant when anywhere within a few half-steps of an octave down, or anywhere within a few half-steps of an octave above. It is only dissonant within a few half-steps of the fixed note (but not when almost exactly at the fixed note).


[.mov] -- [.avi] of a playing of intervals on the piano (middle C is always the lower note):
First consonant: unison, octave, perfect 5th (7 half-steps), perfect 4th (5 half steps)
Then imperfect-consonant: major 3rd (4 h-s), minor 6th (8-hs), minor 3rd (3-hs), major 6th (9-hs)
Then dissonant: major 2nd (2 h-s), minor 7th (10-hs), minor 2nd (1-hs), major 7th (11-hs), augmented 4th (6-hs)
Note: The explanation in Cook's book of the presence of close tones seems to hold. However, if it also has to do a bit with the more dissonant patterns being simply unlike the patterns you get more used to from listening to harmonics in nature -- like the human voice (energy at the rays 4 and 7 half-steps up from the fundamental -- e.g. like the first sound--the unison), this wouldn't surprise me. (My moving sine example doesn't seem to support the latter, but I'd need to see more variations on that to be sure. And pass them through expert ears, not my own.)

[.mov] -- [.avi] SOME TRIADS: a C, followed by a major triad, a minor triad, a diminished triad, and an augmented triad (all with C as root). Again, the explanation in Cook's book would do it. And again, I wonder if the deviation from the standard harmonic pattern adds some bite for the dissonant ones.

[.mov]-- [.avi] APPLAUSE. The spectrum is diffuse (noiselike), running over about 3 octaves. (There is also at times some energy at a very low frequency. This is not the applause, but some recording equipment rumble.)

[.mov]-- [.avi] SOME INDIAN MUSIC. There is a Sitar (Plucked fretted instrument playing the melody in an improvised fashion within the bounds of the "raga" or formula), Tamboura (Drone: Playing open-string always, tuned to the main tones of the raga), and a Tabla (tuned-pitch hand drums, tuned to some main tones of the raga). This is the Hindustani variant of Indian music, and is in the Dadra raga.
An Indian raga is roughly a formula combining particular notes and orders of playing of those notes. The notes are roughly (but not exactly) a subset of those spaced as within a tempered Western scale. Thus, after setting my software to show A at 227.5 hz (an unusual tuning for Western music), the fundamental tones of the music appear roughly where they would in Western Music -- that is, on my 12 outward rays.

[.mov] -- [.avi] from the Beethoven 4th Piano Concerto.

[.mov] -- [.avi] from the Beethoven 2nd String Quartet.

[.mov] from the Beethoven 2nd Razumovsky Quartet.

[.mov] from Mozart's Symphony 41.

[.mov] from the Beethoven Violin Concerto.

[.mov] from the Bach Passacaglia and Fugue BWV 582.

[.mov] from Mozart's Piano Quartet K. 478



[.mov] -- from Bach's Sonata BWV 1001 (transcribed for guitar). Observe the broad spectrum of the guitar.

[.mov] -- from Beethoven's Piano Trio Op. 97




NEW! Due to storage-limit increases from my Web-hosting provider, I am able to offer a free, full-length, non-copyright protected spectral video!


BRAHMS GERMAN REQUIEM -- FULL LENGTH SPECTROGRAM VIDEO (Over 1 hr)

FREE DOWNLOAD (.mov) -- Yours to Keep and Play as Often As You Like -- 660 Mb


DOWNLOAD YOUR FREE COPY from this Link
(Internet Explorer: Right click + save target as)

(Takes 30 Min or More on a cable-speed connection)

Download is a Quicktime .mov and will play if the shorter .movs above played


Creative Commons: Music and video portions are both covered by Creative Commons agreement.

Concert Program (.pdf)




THANK YOU NAXOS: I am grateful to Naxos for making available a number of its high-quality professional recordings for this project. Here is the Naxos site.


RELATED SITES AND BOOKS:

SPECTROGRAM/PSYCHOACOUSTICS/MUSIC - RELATED SITES AND BOOKS



TECHNICAL NOTES FOR PEOPLE WITH A MATH/ENGINEERING BACKGROUND

I have not used the somewhat standard Fourier techniques to do these spectrograms. The battery of tuned damped mass-springs seems closer in functioning to the basilar membrane than Fourier transforms. Further, the efficiency of the FFT does not come into play so much, since the evenly spaced musical intervals are not evenly spaced in frequency.

I have no knowledge of how the method I have used might compare with using windowed Fourier transforms. (My guess is that the results would be roughly similar. However, this comparison does not apply to DFT/FFT -- my technique is much sharper in tone distinctions.)

With Fourier transforms, there is a well-known tradeoff called the Uncertainty Principle (absolutely NOT related to Heisenberg's) where the shorter the sample in time, the less precise the image in frequency. This tradeoff is clearly visible when I look at the examples in my method as well. In my dynamic spectrograms, I choose parameters to place a cap on the rate of spring slow-down after the sound signal is removed (keeping lag or sluggishness of response under control). Doing so yields frequency images which are less and less sharp (even on synthetic pure sines) as we go down in frequency. (Some charts in Cook confirm that the same type of thing happens in the basilar membrane. Further, the wider critical bandwidth (in terms of half-steps) may be another manifestation of this.)
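
A small numeric sketch of this tradeoff for a single lightly damped resonator: if its free vibration dies away like exp(-t/tau), its half-power bandwidth is about 1/(pi*tau) cycles per second, whatever frequency it is tuned to. The Python below assumes, purely for illustration, that every resonator gets the same ring-down time tau = 0.05 seconds; that value is made up and is not a parameter taken from my program.

```python
import math

def bandwidth_hz(decay_time_s):
    """Half-power bandwidth of a lightly damped resonator with amplitude ~ exp(-t/decay_time_s)."""
    return 1.0 / (math.pi * decay_time_s)

def bandwidth_half_steps(center_hz, decay_time_s):
    """The same bandwidth, expressed in half-steps around the resonant frequency."""
    half = bandwidth_hz(decay_time_s) / 2
    return 12 * math.log2((center_hz + half) / (center_hz - half))

TAU = 0.05  # hypothetical ring-down time in seconds, the same for every resonator
for f in (55.0, 220.0, 880.0, 3520.0):
    print(f"{f:7.1f} hz: {bandwidth_half_steps(f, TAU):.2f} half-steps wide")
```

Because the width in cycles per second is fixed by the ring-down time, the width measured in half-steps grows as the resonant frequency goes down, which is the loss of sharpness described above.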

The precise modelling I use for each damped mass-spring is:

x" = -(k/m)x - (c/m)x' + s/m

here, x(t) is the one-dimensional position as a function of time, k is the spring constant, c is a damping constant, m is the mass, and s(t) is the one-dimensional force placed on the mass by the sound vibration in the air.

This mass-spring model and differential equation is covered in virtually all basic physics and differential equations texts, and the solution to the equation is given (with or without proof).
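
For those who would rather watch the resonance than read the closed-form solution, here is a minimal Python sketch that steps the equation above forward in time for one mass-spring driven by a pure sine. The step-by-step scheme (semi-implicit Euler), the ring-down time, and all names are my choices for this illustration; the page does not describe how the actual program solves the equation.

```python
import math

def peak_response(f_resonant_hz, f_sound_hz, decay_time_s=0.05,
                  duration_s=0.5, sample_rate=44100):
    """Integrate x'' = -(k/m)x - (c/m)x' + s/m for a unit sine force.

    Returns the largest |x| seen in the last 20% of the run, after the
    start-up transient has mostly died away.
    """
    k_over_m = (2 * math.pi * f_resonant_hz) ** 2  # sets the resonant frequency
    c_over_m = 2.0 / decay_time_s                  # sets how fast ringing dies out
    dt = 1.0 / sample_rate
    n_steps = int(duration_s * sample_rate)
    x, v, peak = 0.0, 0.0, 0.0
    for n in range(n_steps):
        s_over_m = math.sin(2 * math.pi * f_sound_hz * n * dt)
        a = -k_over_m * x - c_over_m * v + s_over_m
        v += a * dt   # semi-implicit Euler: update the velocity first,
        x += v * dt   # then the position using the new velocity
        if n >= 0.8 * n_steps:
            peak = max(peak, abs(x))
    return peak

on_pitch = peak_response(440.0, 440.0)
half_step_off = peak_response(440.0, 440.0 * 2 ** (1 / 12))
print(on_pitch / half_step_off)  # how much stronger the on-pitch response is
```

A bank of such resonators, each tuned to a different frequency and each reporting its own peak, gives one frame of a spectrogram in the spirit of the ones on this page.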

It is important to comment that, of course, it is not in the nature of biological things, like a basilar membrane, to be precisely engineered so that neurological detectors can be pre-wired to know that this position on the membrane resonates precisely 2 octaves above where this other point resonates. That correspondence would be learned, either from music, or from simply the experience of hearing sounds in nature. Thus, the spiral layout that I have used, with outward rays representing the same note in all octaves, presents the information not quite raw to the nervous system, but actually after a bit of neurological processing.

What Are the Frequencies of the Notes?: "A" right below middle C has a fundamental that is a sine of 220 cycles per second. Each time you go up a half-step, you multiply this by the 12th root of 2 (about 1.059463094359), down a half step, divide by the 12th root of 2. In each case, the harmonics are sines (in varying strengths, dying out as you go up) at 2 times the fundamental frequency, 3 times, 4 times, 5 times. Thus, the B right below middle C has a fundamental of 246.94 cycles per second, with harmonics 493.88, 740.82, 987.77, etc. DETAIL: This way of determining notes is the common standard way, called the tempered scale. Sometimes instead of the A below middle C being 220 cycles per second, it is a few cycles different. (My spectrogram software used to make the videos was adjustable, and sometimes I adjusted to the apparent tuning used by the ensemble in the recording -- e.g. 222hz in the first screenshot.)
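
The tempered-scale arithmetic above, as a short Python sketch (function names are mine, for illustration only):

```python
def note_frequency(half_steps_from_a, a_hz=220.0):
    """Frequency of the note a given number of half-steps above (or below) the A below middle C."""
    return a_hz * 2 ** (half_steps_from_a / 12)

def harmonics(fundamental_hz, count=4):
    """The fundamental followed by its harmonics at 2, 3, ... times the frequency."""
    return [fundamental_hz * n for n in range(1, count + 1)]

b_below_middle_c = note_frequency(2)   # B is two half-steps above A
print([round(f, 2) for f in harmonics(b_below_middle_c)])
# [246.94, 493.88, 740.82, 987.77]
```

To follow a recording tuned slightly differently, pass another reference, e.g. note_frequency(2, a_hz=222.0).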


A discerning observer's question: why sines?: The notion of frequency and harmonics implicitly defines that a vibration of a certain frequency is a sine function of that frequency. Why does everybody seem to choose this (the set of sine waves) as our "basis"? Most books just start out by looking at the sine as the fundamental wave, without saying why.

I am not sure of all the reasons, but the fundamental and best reason is that the sine is what the ear perceives. As above, the psychoacoustics literature seems to bear this out. (However, the common choice of sine probably predates the psychoacoustics literature.)

A second reason is that the general model for most acoustical transformation, the linear-time-invariant system (supported by physics and usually reasonably accurate), preserves sines (just shifting and changing amplitude). Incidentally, using the knowledge that linear time invariant systems are those systems that are describable as the effect of an impulse response (i.e., essentially a large linear combination of shifted versions of the input wave), the fact that linear time invariant systems preserve sine waves boils down to the well-known elementary trigonometric identity cos(x+y)=cos(x)cos(y)-sin(x)sin(y).
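
Here is a small numerical check of that claim, in Python. A made-up four-tap impulse response is applied to a sampled cosine; each shifted copy cos(wn - wk) splits, by the identity above, into a mix of cos(wn) and sin(wn), so the sum should be a single sinusoid of the same frequency. The impulse response values and all names are arbitrary choices for this illustration.

```python
import cmath
import math

def filter_signal(h, x):
    """y[n] = sum over k of h[k] * x[n-k], for every n where all samples exist."""
    return [sum(h[k] * x[n - k] for k in range(len(h)))
            for n in range(len(h) - 1, len(x))]

def predicted_sine(h, omega, n):
    """The same-frequency sinusoid the output should equal: |H| cos(omega*n + arg H)."""
    H = sum(h[k] * cmath.exp(-1j * omega * k) for k in range(len(h)))
    return abs(H) * math.cos(omega * n + cmath.phase(H))

h = [0.5, 0.3, -0.2, 0.1]                      # arbitrary made-up impulse response
omega = 2 * math.pi * 0.05                     # 0.05 cycles per sample
x = [math.cos(omega * n) for n in range(200)]  # the input sine wave
y = filter_signal(h, x)

offset = len(h) - 1                            # y[0] corresponds to n = offset
max_error = max(abs(y[i] - predicted_sine(h, omega, i + offset))
                for i in range(len(y)))
print(max_error)  # should be at rounding-error level
```

Only the amplitude and the shift change; the frequency coming out is the frequency going in.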

If anyone knows of any other reasons for using sines, please let me know.

SPIRAL REPRESENTATION: Of course it's not new. The spiral representation is pretty obvious, so one does not expect it is new. Indeed, I have bumped into a few people who have used it recently, and the book by Perry R. Cook indicates that the German physicist Moritz Drobisch proposed a helical representation (essentially the same thing--just pull up my flattened spring to form a stretched-out spring -- a helix). I expect the representation is even older than that, and I do know of one of my old Math professors who would be pretty surprised if Archimedes didn't think of this arrangement. (Oh, by the way: there is a terminology for the angular position around the spiral or helix (i.e. the note without reference to octave) -- it is called the chroma of the tone.)



NEW (2/07) -- REAL TIME MUSICAL SPECTROGRAM/TUNER SOFTWARE -- FREE FOR NON-COMMERCIAL PERSONAL USE

My new Windows-based software allows you to look at your own live or recorded sounds in the same fashion as the videos on this page, in real time. It supports either the spiral representation as in the dynamic spectrograms on this page, or a multi-line (one octave per line) display.

The new software has a mode for directly displaying the pitch when you have single sounds. It also can analyze separately the sound from two separate devices, so you can check your pitch as you sing or play along.


Click on the link in this box for information.




About me, Norm Spier:

I am a free-lance mathematical statistician and computer programmer, living in Northampton, Massachusetts, U.S.




Really Nice Physics Java Web Applets, Especially Acoustics and Signal Processing with Live Sound,

(plus other science links), from Paul Falstad here.




Ear Training Software:

I have, and recommend, EarMaster ear training software. These links, through Amazon, seem to be for the same product that I have: EarMaster 5. The prices are different: one through Amazon direct, one through a sub-vendor.






If you have any questions or comments, please email me at norm@nastechservices.com.






Notes on the Harmony/Theory Book:

Schoenberg is a classic, at some points articulate, at others unclear. I have placed it here because it has considerable reference to overtones as explanations for the rules of harmony. However, some of these explanations may be speculative!

Historical Psychoacoustics Book (public domain)

Helmholtz, On The Sensations of Tone--A Physiological Basis for the Theory of Music

FRIEND SITE:

Very Nice Notecards and Photos: of St. Louis, and flowers by artist Vivian Brill (click here)






TAKE BACK YOUR TIME DAY


October 24th -- Every Year


www.timeday.org




Legal Information
