reverse microwave: Simplistic audio synthesis with Haskell

In an effort to practice Haskell, I’ve been working on a program that generates guitar chords for a given chord “specification”.

To get instant feedback (including when I didn’t have an instrument at hand), I’ve also created a tool for playing back the generated chords. While I was working in a Windows environment, I could use the system MIDI synth for this, so playing chords was easy.

Lately, however, I’ve been spending more time in OS X. Unfortunately, OS X doesn’t ship with a system-available MIDI synth, and I couldn’t find any lightweight options for playing a couple of notes. So, given my very relaxed requirements, I decided the most lightweight option would be to write a simple synth.

Playing sounds in OS X

First, we need to be able to play back arbitrary audio data. There’s the wonderful CoreAudio and Audio Units and a lot of other cool things, but given my laziness and limited time I decided to go with the simplest option again:

brew install sox

SoX is a universal audio-tinkering tool. Apart from features like audio format conversion, audio processing effects, noise removal, etc., SoX can also play audio from stdin:

cat /dev/urandom | play -traw -r8000 -c2 -b8 -u - # play some stereo noise

So all that was left was to generate the actual audio data.

Functional audio signals

Let’s define some types first:

type Amplitude = Double
type Time = Double
type Signal = Time -> Amplitude

We will try a purely functional model of an audio signal: a function from a point in time to a wave amplitude. We’ll see how well that works out.

(While Double is a fairly obvious choice for the wave amplitude, it is less obvious for Time. I’m confident it’s actually a pretty horrible choice for serious audio synthesis, but it will probably be OK in my case.)

Let’s define some basic audio signals:

-- |A sound from outer space.
silence :: Signal
silence = const 0

-- |Oscillate in a form of a sine wave at 'freq' Hz.
sine :: Time -> Signal
sine freq t = sin $ freq * (2.0 * pi) * t

-- |Square wave at 'freq' Hz (the sign flips every half period).
square :: Time -> Signal
square freq t = if odd i then 1.0 else -1.0
    where i = floor (2 * t * freq) :: Integer -- index of the current half period

Looks pretty straightforward. Don’t forget that functions are always curried in Haskell, so sine 440 is a value of type Time -> Amplitude, for which we have defined a synonym: Signal. In other words, sine 440 is itself a Signal.
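To make the currying point concrete, here is a minimal sketch. The names a440 and sampleAt are made up for illustration and are not part of the original program.

```haskell
type Amplitude = Double
type Time = Double
type Signal = Time -> Amplitude

-- Copied from the text above.
sine :: Time -> Signal
sine freq t = sin $ freq * (2.0 * pi) * t

-- Partially applying 'sine' to a frequency yields a Signal.
-- ('a440' is an illustrative name.)
a440 :: Signal
a440 = sine 440

-- Evaluates a signal at a list of time points given in seconds.
-- ('sampleAt' is an illustrative helper.)
sampleAt :: [Time] -> Signal -> [Amplitude]
sampleAt ts s = map s ts
```

For example, sampleAt [0, 1 / 1760] a440 evaluates the sine at its start and at a quarter of its period, where it is close to its maximum.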

Now, let’s define a couple of audio signal combinators so we can compose several sounds in a chord:

-- |Multiplies the signal by a fixed value.
volume :: Amplitude -> Signal -> Signal
volume x s t = s t * x

-- |Mixes two signals together by adding amplitudes.
mix :: Signal -> Signal -> Signal
mix x y t = x t + y t

-- |Mixes several signals together by adding amplitudes.
mixMany :: [Signal] -> Signal
mixMany signals = foldr mix silence signals

This gives us the ability to mix different signals, and also in arbitrary proportions:

-- mix one quarter of 440 Hz sine with one half of 660Hz square
mix (volume 0.25 $ sine 440) (volume 0.5 $ square 660)
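Since mixing is plain addition, the sum of several signals can easily leave the [-1; 1] range. The sketch below illustrates that with constant signals; peak, loud and scaled are hypothetical names added only for this example.

```haskell
type Amplitude = Double
type Time = Double
type Signal = Time -> Amplitude

-- Combinators copied from the text above.
silence :: Signal
silence = const 0

volume :: Amplitude -> Signal -> Signal
volume x s t = s t * x

mix :: Signal -> Signal -> Signal
mix x y t = x t + y t

mixMany :: [Signal] -> Signal
mixMany = foldr mix silence

-- Largest absolute amplitude seen at the given time points.
-- ('peak' is an illustrative helper.)
peak :: [Time] -> Signal -> Amplitude
peak ts s = maximum (0 : map (abs . s) ts)

-- Five full-scale constant signals add up to 5.0, far outside [-1; 1].
loud :: Signal
loud = mixMany (replicate 5 (const 1.0))

-- Scaling each one by 0.2 first keeps the sum at about 1.0.
scaled :: Signal
scaled = mixMany (replicate 5 (volume 0.2 (const 1.0)))
```

Scaling every voice by one over the number of voices is a crude but safe choice: it guarantees the mix stays in range even when all voices peak at the same moment.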

Great, now let’s create a signal made up of actual notes mixed together. First, we need to calculate the oscillation frequency for a given note. We’ll use MIDI note numbers to denote notes:

-- |Calculates an oscillation frequency for a MIDI note number, in an equally tempered scale.
midiNoteToFreq :: (Floating a) => Int -> a
midiNoteToFreq n =
    f0 * (a ** (fromIntegral n - midiA4))
    where
        a = 2 ** (1.0 / 12.0)
        f0 = 440.0 -- A-4 in an ETS is 440 Hz.
        midiA4 = 69 -- A-4 in MIDI is 69.
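As a quick sanity check, the same mapping can be written in closed form as f(n) = 440 * 2^((n - 69) / 12). The sketch below compares the two; midiNoteToFreq' and examples are names introduced here for illustration.

```haskell
-- Copied from the text above.
midiNoteToFreq :: (Floating a) => Int -> a
midiNoteToFreq n =
    f0 * (a ** (fromIntegral n - midiA4))
    where
        a = 2 ** (1.0 / 12.0)
        f0 = 440.0
        midiA4 = 69

-- Equivalent closed form: f(n) = 440 * 2^((n - 69) / 12).
-- ('midiNoteToFreq'' is an illustrative name.)
midiNoteToFreq' :: Int -> Double
midiNoteToFreq' n = 440 * 2 ** (fromIntegral (n - 69) / 12)

-- A few reference points: A4 (440 Hz), A5 (one octave up, about 880 Hz)
-- and middle C (about 261.63 Hz).
examples :: [(Int, Double)]
examples = [ (n, midiNoteToFreq n) | n <- [69, 81, 60] ]
```

Every 12 semitones doubles the frequency, which is what makes note 81 land one octave above note 69.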

And now, to generate a signal for a complete chord:

-- |Mixes given notes into a single chord signal.
chord :: [Int] -> Signal
chord notes = mixMany $ map (volume 0.2 . sine . midiNoteToFreq) notes

Looks very clear and concise. Awesome!

Rendering a signal

To hear a Signal we need to sample it and feed it to sox for playback. All we actually need to do is to evaluate the signal at each sampling point and convert it to a value that sox will understand.

import Data.Int (Int16) -- we're going to use Int16 as output signal format

-- |Limits the signal's amplitude to not leave specified range.
clip :: Amplitude -> Amplitude -> Signal -> Signal
clip low high s = max low . min high . s

-- |Samples the signal over a specified time range with given sample rate.
render :: Time -> Time -> Int -> Signal -> [Int16]
render startT endT sampleRate s =
    [ int16signal (startT + fromIntegral n * samplePeriod) | n <- [0 .. totalSamples - 1] ]
    where
        int16signal = toInt16 . clipped -- a function of Time -> Int16
        -- maps [-1; 1] linearly onto the full Int16 range
        toInt16 x = truncate (minSig + ((x + 1.0) / 2.0 * (maxSig - minSig)))
        minSig = fromIntegral (minBound :: Int16) :: Double
        maxSig = fromIntegral (maxBound :: Int16) :: Double
        clipped = clip (-1.0) 1.0 s -- the same signal clipped to stay within [-1; 1]
        totalSamples = round ((endT - startT) / samplePeriod) :: Int -- total number of samples to render
        samplePeriod = 1.0 / fromIntegral sampleRate -- time interval between two sample points
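To see what the scaling step does at the edges, here is a small standalone sketch. quantize and sampleCount are hypothetical helpers that mirror the clip-then-scale arithmetic of render; they are not part of the original program.

```haskell
import Data.Int (Int16)

-- Clamps a value to [-1; 1] and maps it linearly onto the Int16 range.
-- ('quantize' is an illustrative helper.)
quantize :: Double -> Int16
quantize x = truncate (lo + (clamped + 1.0) / 2.0 * (hi - lo))
    where
        clamped = max (-1.0) (min 1.0 x)
        lo = fromIntegral (minBound :: Int16) :: Double
        hi = fromIntegral (maxBound :: Int16) :: Double

-- Number of samples in a time range at a given sample rate.
-- ('sampleCount' is an illustrative helper.)
sampleCount :: Double -> Double -> Int -> Int
sampleCount startT endT sampleRate =
    round ((endT - startT) * fromIntegral sampleRate)
```

With this mapping -1.0 lands on minBound, 1.0 on maxBound, and anything outside the range is clamped to one of those two values instead of wrapping around.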

Now, to actually play a chord we would do something like this:

import Data.Binary (encode)
import qualified Data.ByteString.Lazy as BS (concat, putStr)

main :: IO ()
main = do
    BS.putStr $ playChord [58, 63, 67, 72, 77]    
    where
        playChord notes = 
            let rendered = render 0.0 3.5 44100 $ chord notes
            in BS.concat $ map encode rendered

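and then pipe that output to sox. Data.Binary encodes each Int16 in big-endian byte order, so on a little-endian machine the -x flag is needed to swap the bytes: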

$ ./play-chords | play -ts16 -c1 -r44100 -x -
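If you would rather not depend on byte swapping, one alternative is to write the samples as little-endian explicitly. This is only a sketch of that idea using Data.ByteString.Builder from the bytestring package; encodeLE and writeSamples are names made up for this example, and it is not what the original program does.

```haskell
import Data.ByteString.Builder (hPutBuilder, int16LE, toLazyByteString)
import qualified Data.ByteString.Lazy as BL
import Data.Int (Int16)
import System.IO (hSetBinaryMode, stdout)

-- Serialises samples as little-endian signed 16-bit integers.
-- ('encodeLE' is an illustrative helper.)
encodeLE :: [Int16] -> BL.ByteString
encodeLE = toLazyByteString . foldMap int16LE

-- Writes samples to stdout in binary mode.
-- ('writeSamples' is an illustrative helper.)
writeSamples :: [Int16] -> IO ()
writeSamples samples = do
    hSetBinaryMode stdout True
    hPutBuilder stdout (foldMap int16LE samples)
```

The byte order can then be spelled out on the SoX side as well, for example with play -t raw -e signed-integer -b 16 -L -c 1 -r 44100 - where -L declares little-endian input.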

Pimping up the sound

Okay, we can now play chords, but the sound is boring. We want the voices to resemble (at least a bit) the sound of a guitar, or any other musical instrument for that matter (we’re really desperate). We also want the chord to be played as an arpeggio.

Let’s make the note signals fade out, and also mix different waveforms to see if that helps:

-- |Controls the amplitude of one signal by value of another signal.
amp :: Signal -> Signal -> Signal
amp x y t = x t * y t

-- |Emits a control signal for an exponential fade out.
fade :: Time -> Signal
fade speed = exp . (* speed) . (* (-1.0))

-- |Plays a fading note with given waveform.
fadingNote :: Int -> (Time -> Signal) -> Double -> Signal
fadingNote n wave fadeSpeed = amp (fade fadeSpeed) (wave (midiNoteToFreq n))

-- |Plays a note with a nice timbre. Mixes slowly fading square wave with rapidly fading sine.
niceNote :: Int -> Signal
niceNote n = mix voice1 voice2
    where
        voice1 = volume 0.3 $ fadingNote n square 1.9
        voice2 = volume 0.7 $ fadingNote n sine 4.0
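To get a feel for the fade speeds used above, it helps to convert them into half-lives. The sketch below does that; halfLife is a hypothetical helper added for illustration.

```haskell
type Amplitude = Double
type Time = Double
type Signal = Time -> Amplitude

-- Copied from the text above: fade speed t = exp (-speed * t).
fade :: Time -> Signal
fade speed = exp . (* speed) . (* (-1.0))

-- Time in seconds after which 'fade speed' has dropped to half of its
-- initial value: solving exp (-speed * t) = 0.5 gives t = ln 2 / speed.
-- ('halfLife' is an illustrative helper.)
halfLife :: Double -> Time
halfLife speed = log 2 / speed
```

With the values used in niceNote, the square voice (speed 1.9) halves roughly every 0.36 seconds, while the sine voice (speed 4.0) halves roughly every 0.17 seconds, so the sine dominates the attack and the square dominates the tail.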

Now our chord function will look like this:

chord notes = mixMany $ map (volume 0.2 . niceNote) notes

This sounds a lot better, something akin to AY.

For the arpeggio, we’ll define another signal combinator that lets us change the time at which a signal starts:

-- |Delays a signal by a given time.
delay :: Time -> Signal -> Signal
delay delayTime s t = if t >= delayTime then s (t - delayTime) else 0.0
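A tiny sketch of how delay shifts a signal in time, using a ramp signal so the shift is easy to read off. The names ramp and delayedRamp are made up for this example.

```haskell
type Amplitude = Double
type Time = Double
type Signal = Time -> Amplitude

-- Copied from the text above.
delay :: Time -> Signal -> Signal
delay delayTime s t = if t >= delayTime then s (t - delayTime) else 0.0

-- A signal whose amplitude equals the elapsed time.
-- ('ramp' is an illustrative signal.)
ramp :: Signal
ramp = id

-- The same ramp, starting half a second later.
-- ('delayedRamp' is an illustrative name.)
delayedRamp :: Signal
delayedRamp = delay 0.5 ramp
```

Before 0.5 seconds delayedRamp is silent; from then on it replays ramp from its beginning, so delayedRamp 0.75 equals ramp 0.25.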

With this addition, our chord function transforms into this:

chord notes = mixMany noteWaves
    where
        noteWaves = zipWith noteInArpeggio [0..] notes
        noteInArpeggio idx n = delay (fromIntegral idx * 0.05) $ niceNote n

And it sounds like this:

The complete source code is on GitHub.