Simplistic audio synthesis with Haskell
In an effort to practice Haskell, I’ve been working on a program that generates guitar chords for a given chord “specification”.
To be able to have instant feedback (and also when I didn’t have an instrument available) I’ve also created a tool for playing back the generated chords. While I was working in a Windows environment, I could use the system MIDI synth for this, so playing chords was easy.
Lately, however, I’ve been spending more time in OS X. Unfortunately, OS X doesn’t ship with a system-available MIDI synth, and I couldn’t find any lightweight options for being able to play couple of notes. So given my very relaxed requirements for this, I decided the most lightweight option would be to make a simple synth.
Playing sounds in OS X
First, we need to be able to play back arbitrary audio data. There’s the wonderful CoreAudio
and
Audio Units
and a lot of other cool things, but given my laziness and limited time I
decided to go with the simplest option again:
brew install sox
SoX is a universal audio-tinkering tool. Apart from features like
audio format conversion, audio processing effects, noise removal, etc. SoX can also play audio from
stdin
:
cat /dev/urandom | play -traw -r8000 -c2 -b8 -u - # play some stereo noise
So what was left is to generate the actual audio data.
Functional audio signals
Let’s define some types first:
type Amplitude = Double
type Time = Double
type Signal = Time -> Amplitude
We will try to use a purely-functional model of an audio signal – a function of time point to a wave amplitude. We’ll see how good that will work out.
(While the choice of Double
for audio wave amplitude seems pretty obvious, it’s not as much for
the Time
. I’m confident it’s actually a pretty horrible choice for a serious audio synthesis, but
it will probably be OK in my case.)
Let’s define some basic audio signals:
-- |A sound from outer space.
silence :: Signal
silence = const 0
-- |Oscillate in a form of a sine wave at 'freq' Hz.
sine :: Time -> Signal
sine freq t = sin $ freq * (2.0 * pi) * t
-- |Square wave at 'freq' Hz.
square :: Time -> Signal
square freq t = if odd i then 1.0 else -1.0
where (i, _) = properFraction (t * freq)
Looks pretty straightforward. Don’t forget that functions are always curried
in Haskell, so sine 440
is a value of type Time -> Amplitude
, for which we have defined a
synonym – Signal
. So sine 440
will return a Signal
.
Now, let’s define a couple of audio signal combinators so we can compose several sounds in a chord:
-- |Multiplies the signal by a fixed value.
volume :: Amplitude -> Signal -> Signal
volume x s t = s t * x
-- |Mixes two signals together by adding amplitudes.
mix :: Signal -> Signal -> Signal
mix x y t = x t + y t
-- |Mixes several signals together by adding amplitudes.
mixMany :: [Signal] -> Signal
mixMany signals = foldr mix silence signals
This gives us the ability to mix different signals, and also in arbitrary proportions:
-- mix one quarter of 440 Hz sine with one half of 660Hz square
mix (volume 0.25 $ sine 440) (volume 0.5 $ square 660)
Great, now let’s create a signal that’s constituted of actual notes mixed together. First, we’ll need to be able to calculate oscillation frequency for a given note. We’ll use General MIDI integers to denote notes:
-- |Calculates an oscillation frequency for a MIDI note number, in an equally tempered scale.
midiNoteToFreq :: (Floating a) => Int -> a
midiNoteToFreq n =
f0 * (a ** (fromIntegral n - midiA4))
where
a = 2 ** (1.0 / 12.0)
f0 = 440.0 -- A-4 in an ETS is 440 Hz.
midiA4 = 69 -- A-4 in MIDI is 69.
And now, to generate signal for a complete chord:
-- |Mixes given notes into a single chord signal.
chord :: [Int] -> Signal
chord notes = mixMany $ map (volume 0.2 . sine . midiNoteToFreq) notes
Looks very clear and concise. Awesome!
Rendering a signal
To hear a Signal
we need to sample it and feed it to sox
for playback. All we actually need to
do is to evaluate the signal at each sampling point and convert it to a value that
sox
will understand.
import Data.Int (Int16 (..)) -- we're going to use Int16 as output signal format
-- |Limits the signal's amplitude to not leave specified range.
clip :: Amplitude -> Amplitude -> Signal -> Signal
clip low high s = max low . min high . s
-- |Samples the signal over a specified time range with given sample rate.
render :: Time -> Time -> Int -> Signal -> [Int16]
render startT endT sampleRate s =
[ int16signal (sample * samplePeriod) | sample <- [0..totalSamples] ]
where
int16signal = (toInt16 . clipped) -- a function of Time -> Int16
toInt16 x = (truncate (minSig + ((x + 1.0) / 2.0 * (maxSig - minSig))))
minSig = fromIntegral (minBound :: Int16)
maxSig = fromIntegral (maxBound :: Int16)
clipped = (clip (-1.0) 1.0 s) -- the same signal clipped to stay within [-1; 1]
totalSamples = (endT - startT) / samplePeriod -- total number of samples to render
samplePeriod = 1.0 / (fromIntegral sampleRate) -- time interval between two sample points
Now, to actually play a chord we would do something like this:
import Data.Binary (encode)
import qualified Data.ByteString.Lazy as BS (concat, putStr)
main :: IO ()
main = do
BS.putStr $ playChord [58, 63, 67, 72, 77]
where
playChord notes =
let rendered = render 0.0 3.5 44100 $ chord notes
in BS.concat $ map encode rendered
and then pipe that output to sox
:
$ ./play-chords | play -ts16 -c1 -r44100 -x -
Pimping up the sound
Okay, we now can play chords, but the sound is boring. We want the voices to reminisce (at least a bit) the sound of guitar, or any other musical instrument for that matter (we’re really desperate). We also want the chord to be played in arpeggio.
Let’s make the note signals fading, and also mix different waveforms and see if that helps:
-- |Controls the amplitude of one signal by value of another signal.
amp :: Signal -> Signal -> Signal
amp x y t = x t * y t
-- |Emits a control signal for an exponential fade out.
fade :: Time -> Signal
fade speed = exp . (* speed) . (* (-1.0))
-- |Plays a fading note with given waveform.
fadingNote :: Int -> (Time -> Signal) -> Double -> Signal
fadingNote n wave fadeSpeed = amp (fade fadeSpeed) (wave (midiNoteToFreq n))
-- |Plays a note with a nice timbre. Mixes slowly fading square wave with rapidly fading sine.
niceNote :: Int -> Signal
niceNote n = mix voice1 voice2
where
voice1 = volume 0.3 $ fadingNote n square 1.9
voice2 = volume 0.7 $ fadingNote n sine 4.0
Now our chord
function will look like this:
chord notes = mixMany $ map (volume 0.2 . niceNote) notes
This sounds a lot better, something akin to AY.
And now for arpeggio, we’ll define another signal combinator, that will allow us to change the time at which the signal starts:
-- |Delays a signal by a given time.
delay :: Time -> Signal -> Signal
delay delayTime s t = if t >= delayTime then s (t - delayTime) else 0.0
With this addition, our chord function transforms into this:
chord notes = mixMany noteWaves
where
noteWaves = zipWith noteInArpeggio [0..] notes
noteInArpeggio idx n = delay (fromIntegral idx * 0.05) $ niceNote n
And it sounds like this:
The complete source code is on github.