Inthepipeline.net is written primarily as a resource for inthepipeline recording school.




The in-depth workings of digital audio are something to which many musicians pay scant attention.
The reason is the usual one: "We want to get on with making music".
Here I'll attempt to introduce you to the concepts without involving lots of mathematics.


Q.

What are bit depth and sample rate?

A.
 
In the computer domain, sound gets drawn like a graph. Imagine you have drawn the shape of a sound-wave across a piece of graph paper and marked all the places where the wave meets the grid. The position of each mark up and down the page represents the magnitude of the sound (the number of levels available is set by the bit depth) and its position across the page represents the time (the spacing between marks is set by the sampling rate). These values get recorded and stored. Now take another, identical piece of graph paper and plot dots on it from the stored information. A copy of the wave can be created.
The more squares there are on the graph, the more accurately the original image can be re-drawn, but more information (more points on the graph) needs to be recorded, so the file is larger.
When a sound is recorded by a computer, the sound card converts a voltage into a number, which is then stored with a time position.
When the computer instructs the sound card to output those voltages at the same time intervals, it creates a waveform similar to the one which was recorded.
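For readers who like to see the numbers, the graph-paper idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name sample_sine and the 1kHz test tone are made up for the example, and a real sound card does this in hardware.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, bit_depth, num_samples):
    """Return (time_in_seconds, stored_integer) pairs for a full-scale sine wave."""
    max_code = 2 ** (bit_depth - 1) - 1  # largest positive value, e.g. 32767 for 16 bits
    points = []
    for n in range(num_samples):
        t = n / sample_rate_hz                        # position across the page (time)
        level = math.sin(2 * math.pi * freq_hz * t)   # the smooth wave, between -1 and 1
        points.append((t, round(level * max_code)))   # nearest line up the page (value)
    return points

# Hypothetical example: one cycle of a 1kHz tone sampled at 8kHz with 16 bits.
for t, value in sample_sine(1000, 8000, 16, 8):
    print(f"{t * 1000:.3f} ms -> {value}")
```

Each printed line is one dot on the graph paper: a time position and the number stored for it.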

Example of a low sampling rate
An example of the way that a low sample-rate might represent a sine wave.

Example of a higher sampling rate
The same wave-form sampled at a slightly higher rate.
Note how some of the peaks and troughs are accentuated and others are flattened.

The wave which comes out represents a clearer image of the wave that went in when there are more reference points (more possible values and more samples over time). This requires more information to be stored so the audio file becomes larger, but the quality increases as a result.

Think of it like digital movie making. For a given size of picture, the more pixels the image has, the sharper it appears. The greater the number of images captured over a given time, the more accurately the movement is captured.


Q.

CDs are sampled at 16-bit, 44.1kHz. That sounds OK. Should I use this bit depth and sample rate for recording?

A.

No, not if you can help it!
16-bit, 44.1kHz was chosen by the music industry as a good compromise between quality and file size. It was considered to be "just good enough to get away with", so to speak.
When we record on our DAWs there are several reasons to use higher sampling rates and bit depths. Some are really easy to understand, and the others become easy once the first ones have been grasped. It's not rocket-science, but they use it!

Going back to the idea of a graph drawn on a piece of paper with the dots marked on it, the original drawing has a nice smooth line which passes through the dots, but the computer only sees the numbers and re-creates the dots, so the soundcard will be inclined to step between these values rather than sweeping through them in the way they were originally input. Each point has been given a quantity, and that quantity is then used to attempt to "draw" the wave again. The resulting distortion is known as quantisation distortion and leads to the cold, metallic sound often associated with digital recordings. At extremely low bit depths and sample rates this effect can clearly be heard. A good example is poor-quality digital telephony.
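To get a feel for how much the rounding of each value depends on bit depth, here is a rough sketch. It only models the rounding of individual values to the nearest available number, not anything the soundcard does afterwards, and the function name worst_rounding_error is my own invention for the example.

```python
import math

def worst_rounding_error(bit_depth, num_points=1000):
    """Largest gap between a sine wave and its rounded copy, as a fraction of full scale."""
    max_code = 2 ** (bit_depth - 1) - 1
    worst = 0.0
    for n in range(num_points):
        level = math.sin(2 * math.pi * n / num_points)
        stored = round(level * max_code) / max_code  # snap to the nearest available value
        worst = max(worst, abs(level - stored))
    return worst

for bits in (4, 8, 16, 24):
    print(f"{bits:2d} bit: worst error {worst_rounding_error(bits):.8f} of full scale")
```

The error shrinks rapidly as the bit depth goes up, because every extra bit doubles the number of values available.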

If a sound-file is processed within the digital domain (this could be changing the level, altering the EQ, adding effects or anything else) the image of the wave may be re-drawn many times in the process, and it will be changed in much the same way as a photo is when it is processed. Each time the image is altered, it becomes less like the original. The better the image is to begin with, the more we are able to process it before it becomes badly altered.

Choose a good compromise between sample rate, bit depth and file size. The internal engines of most DAW applications work at 32-bit precision or above. Reaper, for example, uses a 64-bit engine. Hard drives are cheap compared to analog tape. Your material is worth recording well. Don't choose to save a small amount of money just to keep your projects compact. Using larger file sizes is a good way to store your hard work. You can do more processing of your material before it deteriorates. You will get a better sound than you would with smaller file sizes, but the trade-off is more CPU use in performing the same operation on larger files. So, a more powerful computer will be able to perform more complex operations than one which is less powerful. Getting the best out of digital recording often requires the user to make judgements to balance these factors.
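To put some numbers on the file-size side of the compromise, the following sketch works out how much disk space one minute of uncompressed stereo audio needs. The function name audio_bytes is hypothetical, and the calculation ignores file headers and any compression.

```python
def audio_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed audio data in bytes (file headers ignored)."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds

for rate, bits in ((44100, 16), (44100, 24), (64000, 24), (96000, 24), (192000, 32)):
    mb = audio_bytes(rate, bits, 2, 60) / 1_000_000
    print(f"{rate} Hz, {bits} bit, stereo: {mb:.1f} MB per minute")
```

Multiply the result by the number of tracks in a project to estimate how much space a whole session will take.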

Q.

How do I set up my recording levels?

A.

Not too high.

Recording with high recording levels is likely to cause clipping (distortion). 
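What clipping does to the stored numbers can be sketched like this. It is a simplified model, assuming a 16-bit recording and a test tone whose peaks reach one and a half times the maximum level; the function name record_16bit is made up for the example.

```python
import math

def record_16bit(level):
    """Convert a level (1.0 = full scale) to a 16-bit value, clipping anything out of range."""
    code = round(level * 32767)
    return max(-32768, min(32767, code))

# A sine wave recorded 'too hot': its peaks reach 1.5 times full scale.
hot = [1.5 * math.sin(2 * math.pi * n / 16) for n in range(16)]
stored = [record_16bit(x) for x in hot]
clipped = sum(1 for v in stored if v in (32767, -32768))
print(stored)
print(f"{clipped} of {len(stored)} samples were flattened at the limit")
```

Every sample that hits the limit loses its original value for good, which is why the top of the wave ends up flat.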

For more detail about this, have a look at the Understanding Metering page.


Q.

My sound card can only record with 24-bit accuracy. What is the advantage of recording 32-bit float?


A.

Your recording was recorded in that great big file. You had all those extra values to play with when you recorded it. Now it's time to play. You can equalise, compress, expand, gate and mutilate, and although the file will eventually deteriorate, it will take many more operations before it does. This is where working with digital sound really gets to be fun. It's still important not to be abusive with the levels at this point, but you will have much more flexibility than if you were to use good ol' fashioned 16-bit, 20-bit or even 24-bit.
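One way to see the extra flexibility is to turn a signal down a long way, store it, and then turn it back up again. The sketch below compares storing the quiet signal as 24-bit integers with storing it as 32-bit floats. It is an illustration only: the helper names to_float32, store_int24 and round_trip_error are hypothetical, Python's struct module stands in for a 32-bit float file, and a real DAW engine will behave in its own way.

```python
import math
import struct

def to_float32(x):
    """Round a value to the nearest 32-bit float, as a 32-bit float file would store it."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def store_int24(x):
    """Store a level as a 24-bit integer sample and read it back as a level."""
    return round(x * 8388607) / 8388607

def round_trip_error(store, gain):
    """Turn a test tone down by `gain`, store it, turn it back up, and report the worst error."""
    worst = 0.0
    for n in range(100):
        original = math.sin(2 * math.pi * n / 100)
        restored = store(original * gain) / gain
        worst = max(worst, abs(restored - original))
    return worst

gain = 0.001  # 60 dB of attenuation
print("24-bit integer:", round_trip_error(store_int24, gain))
print("32-bit float:  ", round_trip_error(to_float32, gain))
```

The float version keeps its precision however quiet the signal becomes, whereas the integer version has fewer and fewer values to work with as the level drops.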

One advantage of recording at higher resolution will be more accurate "drawing" of the wave-form. This can result in a much warmer and clearer sound, especially at higher frequencies. If you refer back to the two examples above you will notice that in some cases the superimposed digital image cuts the top of the waveform off entirely. Others of the cycles are almost like a saw-tooth, with a very sharp rise in voltage and a very sharp drop. At lower resolutions this can have the effect of making a sound gritty and metallic.

Choosing very high sampling rates like 192kHz probably won't help you either. Apart from the ridiculously large file sizes which would result and the huge amount of hard drive activity which would be needed to record and play tracks, there is a limit to how accurately the sound can be recorded in the first place. A balance has to be struck between the limitations of the audio hardware and the limitations of the rest of the system.

Nyquist theory implies that the sampling rate needs to be at least twice the highest frequency required in a recording, but at that minimum only a very limited number of samples (about two per cycle) will be recorded at the highest frequencies in the supposed audio spectrum, which terminates at 20kHz.

In practice, many adult ears can hear little or nothing above about 14kHz, even though young and undamaged human ears may reach towards 20kHz. Between around 12kHz and 14kHz we can detect sound, but our ability to actually discern pitch is very poor, as is our ability to discern detail. At such high frequencies sound tends to bounce off hard surfaces and is very difficult to re-create accurately in any case. We rely on such sounds more to help us interpret proximity, direction and spatial information than musical notes and harmonic detail.

Let us consider a waveform at a frequency of 10kHz for a moment.

If the sample rate is 44.1kHz, there will be 4.41 samples per cycle.
If the sample rate is 64kHz, 6.4 samples per cycle.
With a sample rate of 88.2kHz, 8.82 samples per cycle.
96kHz will give 9.6 samples per cycle.
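The figures above come from dividing the sample rate by the frequency of the wave. If you want to try other combinations, a small sketch like this will do it (the function name samples_per_cycle is just a label chosen for the example). It also prints half the sample rate, which is the highest frequency the Nyquist rule allows for each setting.

```python
def samples_per_cycle(sample_rate_hz, tone_hz):
    """How many samples are taken during one cycle of a tone."""
    return sample_rate_hz / tone_hz

for rate in (44100, 64000, 88200, 96000):
    print(f"{rate / 1000:g} kHz: {samples_per_cycle(rate, 10000):g} samples per cycle, "
          f"highest recordable frequency {rate / 2000:g} kHz")
```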

Here are a couple of real-world examples of waveforms recorded digitally from a sine wave generator and displayed on s(M)exoscope, a donationware VST plugin from Bram@Smartelectronix.com. The sine wave is displayed with far less accuracy at the lower sample-rate and it is very easy to hear the difference in the quality of the sound.

8kHz sine wave displayed with a sample-rate of 64kHz. It does resemble a sine wave fairly closely.



The same sine-wave displayed at 32kHz. Note, this looks almost like a synthesizer triangle wave. It sounds like it too!

It is easy to see from the examples that the quality with which the wave-form has been re-drawn deteriorates vastly as the sample-rate is lowered. In both of the examples above it is possible to count the number of samples which draw each cycle of the wave-form.

Clearly, raising the sample-rate has a dramatic effect on the quality of the audio sampling at much lower frequencies, as well as widening the total bandwidth over which we can record, even if we wouldn't be able to hear anything at the highest frequencies which we could theoretically record.

In practice a 64kHz sample-rate is a good compromise between file size, disk usage and quality of sound, if your soundcard and hardware support it.
Even at 96kHz the quality is not greatly improved, and beyond this your hardware may well have difficulty coping.

Q.

Should I consider recording levels in the same way I consider mastering levels?

A.

No. Recording into your DAW is not at all the same as mastering from it.

Your DAW is a complex environment for making, processing and combining sounds at a high resolution. Mastering to a file for domestic consumption requires that the file which is output has very good data integrity and usually a far lower resolution than the files we use within the DAW environment. This means that we have to ensure that the actual perceived sound level of the file will be comparable with other material when it is played on domestic equipment. It is also very important to make sure that there are no digital 'overs' (that the signal never exceeds 0dB, the full-scale limit). Volume level standards exist in the form of the 'K' system for metering. This system, which sets monitoring levels so that we can produce a consistent sound-output level, is discussed in greater depth on the 'metering levels' page of this site.
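Checking a mix for overs can be sketched as follows. This is a bare-bones illustration, assuming the samples are expressed as levels between -1.0 and 1.0 with 1.0 as the full-scale limit. The function names peak_dbfs and count_overs are my own, and the sketch does not attempt to model the 'K' system, which is about monitoring levels rather than peaks alone.

```python
import math

def peak_dbfs(samples):
    """Peak level in dB relative to full scale, for samples expressed as -1.0 to 1.0."""
    peak = max(abs(s) for s in samples)
    return -math.inf if peak == 0 else 20 * math.log10(peak)

def count_overs(samples):
    """Number of samples sitting at or beyond the full-scale limit."""
    return sum(1 for s in samples if abs(s) >= 1.0)

mix = [0.0, 0.25, -0.5, 0.8, -0.3]  # made-up sample values for the example
print(f"peak: {peak_dbfs(mix):.1f} dB, overs: {count_overs(mix)}")
```

A peak reading comfortably below 0dB with no overs means the file itself is safe from digital clipping, whatever level it is eventually played back at.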

In practice, when we record using a DAW, the combination of software and plugins will be capable of performing far more accurate processing of our sound at a carefully chosen bit-depth and sample rate. This is an important consideration when carrying out a large number of sound-altering processes on each track, performing sub-mixes and eventually mastering our material to a final track.
At present, most domestic hi-fi consists of CD or DVD quality sound, and a growing number of devices can only play MP3 format. The major difference from the aspect of recording levels is that a consumer will not typically seek to alter the level from the original, so the onus is on the engineer who masters the work to ensure that the sound level which will be reproduced is within acceptable limits and comfortable to the listener.

There has been a growing trend toward trying to get the level of finished work to play back as loudly as possible, so that the master will have a very high "volume" when played on a hi-fi. This is very bad practice for more than one reason. In the music we record there are often very high level transients which occur too quickly to be noticed on recording meters. These transients contain lots of the detail which is best preserved if our objective is to produce a really high fidelity recording. It may not be possible to recognise them visually as we record, but we can make a sensible allowance for them in the recording process.
When these momentary details reach the highest digital value available in a wave file, the result is digital clipping. The sound that occurs is very unpleasant to the ear. If the duration of this clipping is very short, it may not be possible to identify it, but it will have an unpleasant effect on the recording and will degrade the sound as a whole. It is also worth considering the effect that a "loud" recording will have on the end user. If a track sounds too loud, a listener is likely to turn it down; if it is too quiet, to turn it up. So if someone is listening to a number of tracks it is important that the sound level from each of them is comparable in terms of over-all output.

Going back 60 years it was quite possible to achieve outstanding results with recording equipment which was comparatively primitive. The engineers needed to understand the very tight tolerances of their tools and learn to work within them. Today extremely powerful tools are available to us. They are far more tolerant than the older recording methods, but just as with those, it is important that we take full advantage by learning about the tolerances of our modern recording equipment. With foresight our results can be better still.

How can we achieve this? What is a "comfortable" listening level? Why is it important?

Go to the Metering Levels page


This site is under on-going development. More pages will be added soon. Thanks for visiting inthepipeline.net