Digital Sound

Title-	Digital Sound	Date-	2019/1/13

Other Keywords-	tutorial synthesizer

Author	Location	Email
Gabe Taubman	NYC	gtaubman@gmail.com

No. Words-	818	No. Figures-	0	No. Mins-	4

Welcome to part 2 of the Build a Synthesizer series!

In part 1, we covered that sound is an experience your brain creates when your ear drum moves back and forth. We also learned a simple way to make ourselves hear sound: move a speaker back and forth, which moves the air back and forth, which in turn moves our ear drum back and forth.

In this post, we’ll learn what goes into representing and producing sound on a digital computer.

Previously, we looked at the following curvy line and imagined that it represented the position of an ear drum as it moves in and out. Now, let’s imagine that instead of an ear drum, it represents the position of the speaker. We’ve covered that moving a speaker moves the air, which moves your ear drum, so whether we talk about the ear drum or the speaker, the curve represents the same result.

[Figure: cosine curve]

When a computer moves a speaker around, there are two downsides of reality that we have to deal with, and we’ll cover them one at a time.

Sample Rate

The first downside of computers is that they are not infinitely fast. Every second, there is some maximum number of times our computer can tell the speaker to move. If we wish to have the speaker output our smooth curve, but the computer tells the speaker to move only 100 times per second, this is what the positions of the speaker look like:

[Figure: cosine curve with dots marking 100 samples per second]

As you can see, it’s much more jagged, and the difference is audible. As the computer instructs the speaker to move fewer and fewer times per second, the output looks less and less like our desired smooth curve:

[Figure: cosine curve with dots marking fewer than 100 samples per second]

However, if our computer moves the speaker 200 times per second, the output begins to look smooth again:

[Figure: cosine curve with dots marking 200 samples per second]

At 1000 times per second, our output is visually indistinguishable from our original smooth curve:

[Figure: cosine curve with dots marking 1000 samples per second]

This concept of how often we tell the speaker to move is called the sample rate. It represents the rate at which we send samples of our sound to the speaker and is one of the most important aspects of digital audio. It’s measured in samples per second.
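To make the idea concrete, here is a minimal Python sketch of what sampling our curve might look like. The function name sample_cosine and the 5 Hz frequency are assumptions chosen for illustration; the figures above don't state the frequency of the curve they show.

```python
import math

def sample_cosine(frequency_hz, sample_rate, duration_s):
    """Return speaker positions (each between -1 and 1) for a cosine wave.

    sample_rate is how many positions we compute per second of sound.
    """
    num_samples = int(round(sample_rate * duration_s))
    return [
        math.cos(2 * math.pi * frequency_hz * n / sample_rate)
        for n in range(num_samples)
    ]

# One second of an assumed 5 Hz curve at two different sample rates.
coarse = sample_cosine(frequency_hz=5, sample_rate=100, duration_s=1)
fine = sample_cosine(frequency_hz=5, sample_rate=1000, duration_s=1)

print(len(coarse), len(fine))  # 100 1000
```

Both lists describe the same curve. The higher sample rate simply gives us ten times as many points along it, which is why the plotted dots look smoother.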

To give you a baseline, audio CDs use a sample rate of 44,100 samples per second. The unit hertz (Hz) measures “somethings” per second, so a sample rate of 44,100 is often written as 44.1 thousand hertz, or 44.1 kHz (kilohertz). This means that your CD contains numbers that represent the movement of a speaker, and 44,100 of them get sent to the speaker every second. Put another way, your CD player instructs the speaker to move roughly every 0.000023 seconds.
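The time between samples is just one divided by the sample rate. A quick sketch of that arithmetic (the names CD_SAMPLE_RATE and sample_interval_seconds are made up for this example):

```python
CD_SAMPLE_RATE = 44_100  # samples per second, as used by audio CDs

def sample_interval_seconds(sample_rate):
    """Time between two consecutive samples, in seconds."""
    return 1.0 / sample_rate

interval = sample_interval_seconds(CD_SAMPLE_RATE)
print(f"{interval:.6f} seconds")             # 0.000023 seconds
print(f"{interval * 1e6:.1f} microseconds")  # 22.7 microseconds
```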

You may be curious about the impact of choosing various sample rates. The most important thing to remember about sample rate is that it governs the highest pitch you can produce. To represent a pitch, your sample rate must be at least twice the pitch’s frequency. This makes some intuitive sense, because each back-and-forth cycle needs at least one sample with the speaker all the way out and one with the speaker all the way in. The 44.1 kHz sample rate for audio CDs allows them to represent pitches up to 22.05 kHz, which not coincidentally is just above the roughly 20 kHz upper limit of human hearing.
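We can sketch what happens when a pitch is too high for the sample rate. In the Python below, the helper names and the 100 samples-per-second rate are assumptions for illustration. A 70 Hz cosine is above the 50 Hz limit for that rate, and its samples turn out to be identical to those of a 30 Hz cosine, so the speaker would receive exactly the same instructions for both.

```python
import math

def nyquist_limit(sample_rate):
    """Highest frequency a given sample rate can represent."""
    return sample_rate / 2

def sample_tone(frequency_hz, sample_rate, num_samples):
    """Sample a cosine of the given frequency at the given sample rate."""
    return [
        math.cos(2 * math.pi * frequency_hz * n / sample_rate)
        for n in range(num_samples)
    ]

print(nyquist_limit(44_100))  # 22050.0

# At 100 samples per second the limit is 50 Hz, so 70 Hz is too high.
too_high = sample_tone(70, sample_rate=100, num_samples=20)
in_range = sample_tone(30, sample_rate=100, num_samples=20)

# The two lists match (up to floating-point rounding).
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(too_high, in_range))
print(same)  # True
```

In other words, a pitch above the limit isn't simply lost. It comes out disguised as a lower pitch, which is why the sample rate has to be chosen with the highest desired pitch in mind.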

Bit Depth

The second unfortunate reality of computers is that they are not infinitely precise. We’ve discussed how we represent the positions of the speaker as numbers between -1 and 1, but there are infinitely many numbers in that span. Much like with sample rate, where we had to pick how finely to divide up time, we also have to decide how finely to divide up the range of positions at which we can place the speaker. This is called the bit depth. If you’ve ever heard of 8-bit audio, the 8-bit part refers to 8 bits being used to represent each speaker position, giving a total of 256 positions between -1 and 1. (Strictly speaking, the “8-bit” in Nintendo-era 8-bit music describes the console’s processor rather than its audio format, but the idea of coarse, limited-resolution sound is the same.) Again for reference, audio CDs use a bit depth of 16 bits, which allows for 65,536 positions between -1 and 1.
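Here is a small Python sketch of snapping a speaker position to the nearest allowed position for a given bit depth. The quantize function is a simplification invented for this post: it spaces the positions evenly from -1 to 1, whereas real audio formats store integers and lay out their levels slightly differently.

```python
def quantize(position, bit_depth):
    """Snap a position in [-1, 1] to the nearest of 2**bit_depth levels.

    Simplifying assumption: levels are evenly spaced from -1 to 1 inclusive.
    """
    levels = 2 ** bit_depth
    step = 2.0 / (levels - 1)               # distance between neighboring levels
    index = round((position + 1.0) / step)  # nearest level, counted from -1
    index = max(0, min(levels - 1, index))  # clamp out-of-range inputs
    return -1.0 + index * step

print(2 ** 8, 2 ** 16)  # 256 65536

# The same position stored at two bit depths: the 16-bit value lands
# much closer to the original than the 8-bit value does.
error_8_bit = abs(quantize(0.3, 8) - 0.3)
error_16_bit = abs(quantize(0.3, 16) - 0.3)
print(error_16_bit < error_8_bit)  # True
```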

Summary

In this post we’ve covered the two main components of digital sound: bit depth and sample rate. Bit depth controls how finely we can represent positions of the speaker, and sample rate controls how many times per second we instruct the speaker to move. Audio CDs use a bit depth of 16 bits, and a sample rate of 44.1 kHz.
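To tie the two ideas together, here is a hedged sketch that produces one second of a cosine at CD settings (16 bits, 44.1 kHz) and stores it in the WAV format using Python's standard wave module. The WAV format, the 440 Hz pitch, the single channel, and the function names are all choices made for this example, not something covered earlier in the series.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100  # samples per second
BIT_DEPTH = 16        # bits per sample

def cosine_pcm_frames(frequency_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Return a cosine as bytes of signed 16-bit little-endian samples."""
    max_amplitude = 2 ** (BIT_DEPTH - 1) - 1  # 32767, the largest 16-bit value
    num_samples = int(round(sample_rate * duration_s))
    frames = bytearray()
    for n in range(num_samples):
        position = math.cos(2 * math.pi * frequency_hz * n / sample_rate)
        # Scale the -1..1 position to an integer and pack it into 2 bytes.
        frames += struct.pack("<h", int(round(position * max_amplitude)))
    return bytes(frames)

def write_wav(file, frames, sample_rate=SAMPLE_RATE):
    """Write mono 16-bit frames to a filename or a writable binary file object."""
    with wave.open(file, "wb") as wav:
        wav.setnchannels(1)               # mono (an assumption for simplicity)
        wav.setsampwidth(BIT_DEPTH // 8)  # 2 bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(frames)

# Write into memory here; passing a filename such as "tone.wav" also works.
buffer = io.BytesIO()
write_wav(buffer, cosine_pcm_frames(frequency_hz=440, duration_s=1.0))
```

The sample rate decides how many numbers we generate per second, and the bit depth decides how each number is rounded and stored.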

Next Time

Armed with what we now know about digital sound, the next post will cover creating an Oscillator. What is an Oscillator? Oscillate means to go back and forth, so an Oscillator is something that can generate numbers that go back and forth, creating curvy lines like the ones we’ve been showing above. Oscillators are a fundamental building block of a synthesizer, as they’re what create sound.