The range of human hearing is from around 20 Hz (20 cycles per second) to 20,000 Hz (20 kHz), although acuity in the higher frequencies tends to decline with age, so for most adults the upper limit is around 10 kHz.
The lowest frequency that has a pitch-like quality is about 20Hz.
A typical value for the extent to which an individual can distinguish pitch differences is 0.5-1% for frequencies between 500 and 5000 Hz (differentiation is more difficult at low frequencies). Thus at 500 Hz most individuals will be unable to tell whether a note is sharp or flat by less than 2.5-5 Hz; that is, an 'allowable' pitch range for that note might be from 495 Hz to 505 Hz at most.
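The arithmetic behind that range can be sketched in a few lines of Python. The function name `pitch_tolerance_band` is made up for illustration, and the two fractions are simply the 0.5% and 1% figures quoted above; this is not a model of hearing.

```python
def pitch_tolerance_band(frequency_hz, discrimination_fraction):
    """Return the (low, high) band inside which a detuned note is hard to tell apart.

    discrimination_fraction is the smallest relative pitch change a listener
    can detect, e.g. 0.01 for 1%.
    """
    delta_hz = frequency_hz * discrimination_fraction
    return frequency_hz - delta_hz, frequency_hz + delta_hz


for fraction in (0.005, 0.01):
    low, high = pitch_tolerance_band(500.0, fraction)
    print(f"{fraction:.1%} at 500 Hz: {low:.1f} Hz to {high:.1f} Hz")
```

At 1% the band is the 495 Hz to 505 Hz range given in the text; at 0.5% it narrows to 497.5 Hz to 502.5 Hz.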
Analog recording is a technique that stores a signal as a continuous wave in or on the medium. The wave might be stored as a physical texture on a phonograph record, or as a fluctuation in the field strength of a magnetic recording.
Digital audio refers to technology that records, stores, and reproduces sound by encoding an audio signal in digital form instead of analog form. Sound is passed through an analog-to-digital converter (ADC), and pulse-code modulation is typically used to encode it as a digital signal. A digital-to-analog converter performs the reverse process.
Audio Bit Depth
In digital audio using pulse code modulation (PCM), bit depth describes the number of bits of information recorded in each individual sample. Bit depth directly corresponds to the resolution of each sample in a set of digital audio data. Examples of bit depth include CD quality audio, which is recorded at 16 bits, and DVD-Audio and Blu-ray Disc which can support up to 24 bits.
A set of digital audio samples contains data that provides the necessary information to reconstruct the original signal. The audio bit depth limits the signal-to-noise ratio (SNR) of the reconstructed signal to a maximum level determined by quantization error. The bit depth has no impact on the frequency response, which is constrained by the sample rate.
Quantization noise is a model of quantization error introduced by quantization in the analog-to-digital conversion (ADC) in telecommunication systems and signal processing. It is a rounding error between the analog input voltage to the ADC and the output digitized value. The noise is non-linear and signal-dependent.
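As a rough illustration of this rounding error, the sketch below quantizes values in the range -1.0 to 1.0 onto a signed integer grid and reports the error in units of one least-significant bit (LSB). The function names and the choice of a signed, clipped grid are assumptions made for the example, not a description of any particular converter.

```python
def quantize(value, bits):
    """Round a value in [-1.0, 1.0) to the nearest code of a signed `bits`-bit grid."""
    scale = 2 ** (bits - 1)              # number of codes per unit of amplitude
    code = round(value * scale)          # nearest integer code
    # Clip to the representable range of a signed integer of this width.
    return max(-scale, min(scale - 1, code))


def quantization_error_lsb(value, bits):
    """Difference between the stored code and the ideal value, in LSB units."""
    scale = 2 ** (bits - 1)
    return quantize(value, bits) - value * scale


for sample in (0.1, 0.25, -0.7):
    print(sample, quantize(sample, 8), round(quantization_error_lsb(sample, 8), 3))
```

For inputs that are not clipped, the error never exceeds half an LSB in magnitude, which is the bound assumed in the ideal-ADC model.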
In an ideal ADC, where the quantization error is uniformly distributed between −1/2 and +1/2 least-significant bit, and where the signal has a uniform distribution covering all quantization levels, the signal-to-quantization-noise ratio (SQNR) can be calculated from
$$\mathrm{SQNR} = 20 \log_{10}\left(2^{Q}\right) \approx 6.02 \cdot Q \ \mathrm{dB}$$
where Q is the number of quantization bits. 24-bit digital audio has a theoretical maximum SNR of about 144 dB, compared to about 96 dB for 16-bit; however, as of 2007 digital audio converter technology is limited to an SNR of about 124 dB (roughly 21 bits) because of real-world limitations in integrated circuit design. Still, this approximately matches the performance of the human auditory system.
It is important to note that bit depth is only meaningful when applied to PCM. Non-PCM formats, such as lossy compression formats like MP3, AAC and Ogg Vorbis, do not have associated bit depths. For example, in MP3, quantization is performed on PCM samples that have been transformed into the frequency domain.
Using higher bit depths during studio recording enables greater headroom to be left on the recording. This reduces the risk of clipping without encountering quantization errors at low volumes.
Bits - SNR - Possible integer values
4 - 24.08 dB - 16
8 - 48.16 dB - 256
16 - 96.33 dB - 65,536
24 - 144.49 dB - 16,777,216
32 - 192.66 dB - 4,294,967,296
48 - 288.99 dB - 281,474,976,710,656
64 - 385.32 dB - 18,446,744,073,709,551,616
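The figures in this table can be recomputed with a short Python sketch. It assumes the ideal-converter relation SQNR = 20·log10(2^Q), about 6.02 dB per bit, and counts 2^Q possible integer values for Q bits; the function name `sqnr_db` is chosen here for illustration.

```python
import math


def sqnr_db(bits):
    """Theoretical signal-to-quantization-noise ratio of an ideal `bits`-bit quantizer."""
    return 20 * math.log10(2 ** bits)


for bits in (4, 8, 16, 24, 32, 48, 64):
    levels = 2 ** bits                   # number of distinct integer values
    print(f"{bits:>2} bits  {sqnr_db(bits):7.2f} dB  {levels:,} values")
```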
Floating point
Many audio file formats and digital audio workstations (DAWs) now support PCM formats with samples represented by floating point numbers. Both the Microsoft WAV file format and the Apple AIFF file format support floating point PCM, and major DAWs support varied floating point processing capabilities.
Unlike an integer, whose bit pattern is a single series of bits, a floating point number is composed of several smaller bit patterns whose mathematical relation forms a number. This method of representation is similar to scientific notation and lets a binary system approximate real numbers more closely. Floating point numbers still have fixed upper and lower bounds, but the representation lets numbers of smaller magnitude carry a finer fractional part.
The most common standard is IEEE floating point, which is composed of three bit patterns: a sign bit, which represents whether the number is positive or negative, an exponent, and a mantissa, which is scaled by two raised to the power of the exponent. The mantissa is expressed as a binary fraction in IEEE base two floating point formats.
IEEE single-precision (32-bit) floating point format: 1 sign bit, 8 exponent bits (stored with a bias of 127) and 23 fraction bits.
For example, the 32-bit floating point bit pattern 1 01111101 00101100000000000000000 is interpreted as the following:
- (-1)^1 × (1 + 0.171875) × 2^(125 - 127) = -1.171875 × 2^-2 = -0.29296875
As a different example, the bit pattern 0 10010010 10110001010000000001000 is a larger number and shows the fractional part of the result becoming shorter:
- (-1)^0 × (1 + 0.6923837661743164) × 2^(146 - 127) = 1.6923837661743164 × 2^19 = 887,296.5
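Bit patterns like these can be checked mechanically. The sketch below splits a 32-bit pattern into sign, exponent and fraction fields, applies the single-precision rule (-1)^sign × (1 + fraction) × 2^(exponent - 127), and cross-checks the result against Python's standard `struct` module. The helper name `decode_float32` is invented for the example, and the manual formula only covers normal numbers (not zeros, subnormals, infinities or NaN).

```python
import struct


def decode_float32(bit_string):
    """Decode an IEEE single-precision bit pattern given as a string of 0s and 1s."""
    bits = bit_string.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected exactly 32 bits")
    sign = int(bits[0], 2)
    exponent = int(bits[1:9], 2)             # stored with a bias of 127
    fraction = int(bits[9:], 2) / 2 ** 23    # 23-bit binary fraction
    value = (-1) ** sign * (1 + fraction) * 2.0 ** (exponent - 127)
    # Cross-check: reinterpret the same 32 bits as a big-endian float.
    reference = struct.unpack(">f", struct.pack(">I", int(bits, 2)))[0]
    return sign, exponent, fraction, value, reference


for pattern in ("1 01111101 00101100000000000000000",
                "0 10010010 10110001010000000001000"):
    print(pattern, decode_float32(pattern))
```

The last two items of each printed tuple should agree, since both are derived from the same 32 bits.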
Audio processing
Sometimes a small amount of random noise, called dither, is deliberately added to the signal before quantizing. Dithering eliminates the granularity of quantization error, giving very low distortion, but at the expense of a slightly raised noise floor. Measured using ITU-R 468 noise weighting, this is about 66 dB below alignment level, or 84 dB below digital full scale, which is somewhat lower than the microphone noise level on most recordings, and hence of no consequence.
24-bit audio is sometimes used undithered, because for most audio equipment and situations the noise level of the digital converter can be louder than the required level of any dither that might be applied.
For most situations the advantage given by a resolution higher than 16-bit is mainly in the ease of setting recording levels. With 16-bit audio, poorly set recording levels can result in noisy recordings. With 24-bit audio, up to 10-20 dB of extra range can be available, providing additional margin for error.
Although 24-bit audio provides additional dynamic range, this is generally insufficient for all but trivial processing steps. Furthermore, the use of integer precision introduces overflow and underflow errors that can be difficult to anticipate. Consequently, most audio processing is performed after first converting to 32-bit or higher floating point precision. Following audio processing, samples are often reduced to 16-bit or 24-bit precision for distribution, or encoded to non-PCM formats such as MP3 that do not have a fixed bit depth.
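The effect of dither can be sketched with a constant signal that sits between two quantization levels. The example assumes triangular (TPDF) dither spanning ±1 LSB, built from two uniform random values; the text above does not name a dither type, so that choice, the 0.3 LSB test level and the function name are all illustrative.

```python
import random


def quantize_with_dither(sample_lsb, rng):
    """Quantize a value given in LSB units after adding triangular (TPDF) dither."""
    # The sum of two uniform values in [-0.5, 0.5] has a triangular distribution.
    dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return round(sample_lsb + dither)


rng = random.Random(0)                   # fixed seed so the run is repeatable
level = 0.3                              # constant input, 0.3 of one LSB
count = 100_000

plain = [round(level) for _ in range(count)]
dithered = [quantize_with_dither(level, rng) for _ in range(count)]

plain_mean = sum(plain) / count
dithered_mean = sum(dithered) / count
print("undithered mean:", plain_mean)
print("dithered mean:  ", dithered_mean)
```

Without dither every sample rounds to the same code and the 0.3 LSB level is lost; with dither the individual samples are noisier, but their average stays close to the true level. This is the trade of distortion for a slightly raised noise floor described above.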
The sample rate or sampling frequency defines the number of samples per unit of time taken from a continuous signal to make a discrete signal. For time-domain signals, the unit for sampling rate is hertz (inverse seconds, 1/s, s^-1), sometimes written as Sa/s or S/s (samples per second). The reciprocal of the sampling frequency is the sampling period or sampling interval, which is the time between samples.
Oversampling: In some cases it is desirable to use a sampling frequency more than twice the desired system bandwidth, so that a steep digital filter combined with a less steep analog anti-aliasing filter can take the place of a steep analog anti-aliasing filter. The less steep analog filter is preferred because the digital filter is not subject to component variations and therefore always gives the filter response (filtering function) that the designer has chosen. This process is known as oversampling.
Undersampling: Conversely, one may sample below the Nyquist rate. For a baseband signal (one that has components from 0 to the band limit), this introduces aliasing, but for a passband signal (one that does not have low frequency components), there are no low frequency signals for the aliases of high frequency signals to collide with, and thus one can sample a high frequency (but narrow bandwidth) signal at a much lower sample rate than the Nyquist rate.
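Where a tone ends up after sampling can be estimated with a small folding calculation. The sketch below assumes a single real sinusoid and an ideal sampler with no anti-aliasing filter; the function name `alias_frequency` is chosen for the example.

```python
def alias_frequency(signal_hz, sample_rate_hz):
    """Apparent frequency of a real sinusoid after sampling at sample_rate_hz."""
    folded = signal_hz % sample_rate_hz
    # Frequencies above half the sample rate fold back into the lower half.
    return min(folded, sample_rate_hz - folded)


print(alias_frequency(1_000, 44_100))    # below half the sample rate: unchanged
print(alias_frequency(30_000, 44_100))   # above half the sample rate: folds down
```

A 1,000 Hz tone sampled at 44,100 Hz stays at 1,000 Hz, while a 30,000 Hz tone appears at 14,100 Hz, where it would collide with any genuine content at that frequency.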
In digital audio the most common sampling rates are 44.1 kHz, 48 kHz, 88.2 kHz, 96 kHz and 192 kHz. Lower sampling rates have the benefit of smaller data size and easier storage and transport. Because of the Nyquist-Shannon theorem, sampling rates higher than about 50 kHz to 60 kHz cannot supply more usable information for human listeners. Early professional audio equipment manufacturers chose sampling rates in the region of 50 kHz for this reason. 88.2 kHz and 96 kHz are often used in modern professional audio equipment, along with 44.1 kHz and 48 kHz. Higher rates such as 192 kHz are prone to ultrasonic artifacts that cause audible intermodulation distortion, and to less accurate sampling, since conversion accuracy tends to fall as conversion speed rises. The Audio Engineering Society recommends a 48 kHz sample rate for most applications but gives recognition to 44.1 kHz for Compact Disc and other consumer uses, 32 kHz for transmission-related applications and 96 kHz for higher bandwidth or relaxed anti-aliasing filtering.
8,000 Hz: Telephone, walkie-talkie, wireless.
11,025 Hz: One quarter the sampling rate of audio CDs; used for lower-quality PCM and MPEG audio and for audio analysis of subwoofer bandpasses.
16,000 Hz: Wideband frequency extension over standard telephone narrowband 8,000 Hz. Used in most modern VoIP and VVoIP communication products.
22,050 Hz: One half the sampling rate of audio CDs; used for lower-quality PCM and MPEG audio and for audio analysis of low frequency energy. Suitable for digitizing early 20th century audio formats such as 78s.
32,000 Hz: miniDV, DVCAM with 4 channels of audio, DAT (LP mode), NICAM, high-quality digital wireless microphones.
44,056 Hz: Used by digital audio locked to NTSC color video signals (245 lines by 3 samples by 59.94 fields per second; 59.94 fields per second corresponds to 29.97 frames per second).
44,100 Hz: Audio CD; also most commonly used with MPEG-1 audio (VCD, MP3 and others). Much pro audio gear uses (or is able to select) 44.1 kHz sampling, including mixers, EQs, compressors, reverb, crossovers, recording devices and CD-quality encrypted wireless microphones.
47,250 Hz: The world's first commercial PCM sound recorder, by Nippon Columbia.
48,000 Hz: The standard audio sampling rate used by professional digital video equipment. Used for sound with consumer video formats such as DV, digital TV, DVD, and films. Much professional audio gear uses (or is able to select) 48 kHz sampling, including mixers, EQs, compressors, reverb, crossovers and recording devices such as DAT.
50,000 Hz: The first commercial digital audio recorders, from 3M and Soundstream in the late 1970s.
50,400 Hz: Sampling rate used by the Mitsubishi X-80 digital audio recorder.
88,200 Hz: Sampling rate used by some professional recording equipment when the destination is CD (a multiple of 44,100 Hz).
96,000 Hz: DVD-Audio, some LPCM DVD tracks, BD-ROM (Blu-ray Disc) audio tracks, HD DVD (High-Definition DVD) audio tracks.
176,400 Hz: Sampling rate used by HDCD recorders and other professional applications for CD production.
192,000 Hz: BD-ROM (Blu-ray Disc) audio tracks and HD DVD (High-Definition DVD) audio tracks. Four times 48 kHz.
352,800 Hz: Digital eXtreme Definition, used for recording and editing Super Audio CDs, as 1-bit DSD is not suited for editing. Eight times 44.1 kHz.
2,822,400 Hz: SACD, a 1-bit delta-sigma modulation process known as Direct Stream Digital, co-developed by Sony and Philips.
5,644,800 Hz: Double-Rate DSD, 1-bit Direct Stream Digital at twice the rate of SACD. Used in some professional DSD recorders.
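The link between sampling rate and data size can be made concrete with a small calculation for uncompressed PCM. The sketch below multiplies sample rate, bit depth and channel count; the function name and the example configurations are illustrative, and container overhead is ignored.

```python
def pcm_bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Raw data rate of uncompressed PCM audio, in bytes per second."""
    return sample_rate_hz * bit_depth * channels // 8


examples = {
    "telephone (8 kHz, 8-bit, mono)": (8_000, 8, 1),
    "CD (44.1 kHz, 16-bit, stereo)": (44_100, 16, 2),
    "96 kHz, 24-bit, stereo": (96_000, 24, 2),
}
for name, (rate, depth, channels) in examples.items():
    per_second = pcm_bytes_per_second(rate, depth, channels)
    print(f"{name}: {per_second:,} bytes/s, {per_second * 60:,} bytes/min")
```

CD audio works out to 176,400 bytes per second, while stereo 24-bit audio at 96 kHz needs 576,000 bytes per second, a little over three times as much.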
In digital music processing technology, quantization is the process of transforming performed musical notes, which may have some imprecision due to expressive performance, to an underlying musical representation that eliminates this imprecision. The process results in notes being set on beats and on exact fractions of beats.
Normal quantization: The 1/1-Note, 1/2-Note, 1/4-Note, 1/8-Note, 1/16-Note, 1/32-Note, 1/64-Note and 1/128-Note settings quantize the MIDI or audio region to the equivalent note value.
Triplet quantization: The 1/3-Note, 1/6-Note, 1/12-Note, 1/24-Note, 1/48-Note and 1/96-Note settings quantize the MIDI region to triplet note values. A 1/6 note is equivalent to a quarter triplet, a 1/12 note to an eighth triplet, a 1/24 note to a sixteenth triplet, and a 1/48 note to a thirty-second triplet.
Quantization off: The off (3840) setting plays the notes at the finest possible timing resolution, 1/3840 note, which in practical terms is unquantized playback.
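The snapping step itself is simple to sketch. The example below assumes note positions are stored as integer ticks of 1/3840 of a whole note, matching the finest resolution mentioned above, and moves each position to the nearest grid line for a given note division. The function name and the rule that exact ties move to the later grid line are assumptions of this sketch, not the behaviour of any particular sequencer.

```python
TICKS_PER_WHOLE_NOTE = 3840              # finest resolution mentioned in the text


def quantize_ticks(position_ticks, division):
    """Snap a note position (in ticks) to the nearest 1/division-note grid line."""
    if TICKS_PER_WHOLE_NOTE % division != 0:
        raise ValueError("division must divide 3840 evenly")
    grid = TICKS_PER_WHOLE_NOTE // division
    # Integer rounding to the nearest grid line; exact ties go to the later line.
    return ((position_ticks + grid // 2) // grid) * grid


played = [5, 250, 370, 955]
print([quantize_ticks(p, 16) for p in played])    # 1/16-note grid, 240 ticks
print([quantize_ticks(p, 12) for p in played])    # eighth-triplet grid, 320 ticks
print([quantize_ticks(p, 3840) for p in played])  # "off": positions unchanged
```

With a division of 3840 the grid spacing is a single tick, so the positions come back unchanged, which corresponds to the "off" setting.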
The quietest sounds that can be heard have an intensity (power per unit area) of about 10^-12 W/m^2, whilst the loudest that can be withstood have an intensity of about 1 W/m^2. The range is therefore of the order of 10^12, or one million million times. The decibel scale is logarithmic: every increase of 10 dB corresponds to a factor of 10 in intensity, so that 0 dB is the quietest audible sound and 120 dB is the loudest that can be withstood.
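The conversion from intensity to decibels can be sketched as follows, using the standard relation L = 10·log10(I / I0) with the threshold of hearing, I0 = 10^-12 W/m^2, as the reference. The constant and function names are chosen for the example.

```python
import math

REFERENCE_INTENSITY = 1e-12              # threshold of hearing, in W/m^2


def intensity_level_db(intensity_w_per_m2):
    """Sound intensity level in dB relative to the threshold of hearing."""
    return 10 * math.log10(intensity_w_per_m2 / REFERENCE_INTENSITY)


for intensity in (1e-12, 1e-11, 1e-6, 1.0):
    print(f"{intensity:g} W/m^2 -> {intensity_level_db(intensity):.0f} dB")
```

Each tenfold increase in intensity adds 10 dB, so the full range from 10^-12 W/m^2 to 1 W/m^2 spans 120 dB.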