Sound quality refers to the fidelity of an audio signal after transmission and processing. The industry currently recognizes four grades of sound quality: digital compact disc (CD-DA) quality, with a signal bandwidth of 10 Hz to 20 kHz; FM broadcast quality, with a bandwidth of 20 Hz to 15 kHz; AM broadcast quality, with a bandwidth of 50 Hz to 7 kHz; and telephone speech quality, with a bandwidth of 200 Hz to 3400 Hz. CD-DA quality is therefore the highest and telephone speech quality the lowest. Beyond frequency range, other methods and indicators are often used to describe the sound quality standards required for different applications.
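The four grades can be tabulated and ranked with a few lines of Python. This is only an illustrative sketch: the bandwidth figures are the ones quoted above, while the names `QUALITY_GRADES`, `rank_by_bandwidth`, and `min_sampling_rate` are hypothetical, and the "sample at twice the highest frequency" rule (the Nyquist criterion) is an assumption added here rather than something the text states.

```python
# Bandwidths in Hz of the four quality grades, as quoted in the text.
QUALITY_GRADES = {
    "CD-DA": (10, 20_000),
    "FM": (20, 15_000),
    "AM": (50, 7_000),
    "Telephone": (200, 3_400),
}


def min_sampling_rate(upper_hz):
    """Nyquist lower bound (an assumption added here): twice the top frequency."""
    return 2 * upper_hz


def rank_by_bandwidth(grades):
    """Return grade names ordered from widest to narrowest bandwidth."""
    return sorted(grades, key=lambda g: grades[g][1] - grades[g][0], reverse=True)


if __name__ == "__main__":
    for name in rank_by_bandwidth(QUALITY_GRADES):
        low, high = QUALITY_GRADES[name]
        print(f"{name}: {low} Hz to {high} Hz, "
              f"sample at no less than {min_sampling_rate(high)} Hz")
```

Ranking by bandwidth reproduces the ordering given in the text, with CD-DA first and telephone last.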
For analog audio, the more frequency components of the sound that are reproduced and the lower the distortion and interference, the higher the fidelity and the better the sound quality. In communications engineering, for example, sound quality grades are measured not only by the frequency range of the audio signal but also by indicators such as distortion and signal-to-noise ratio.
For digital audio, the more frequency components that are reproduced and the lower the bit error rate, the better the sound quality. Quality is usually gauged by the bit rate (or storage capacity): the higher the sampling frequency, the more quantization bits, and the more channels, the larger the storage requirement, and correspondingly the higher the fidelity and the better the sound quality.
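The relationship between sampling frequency, quantization bits, channel count, and storage can be made concrete for uncompressed PCM audio. The sketch below is a minimal illustration; the function names are hypothetical, and the CD-DA parameters used in the example (44.1 kHz, 16 bits, 2 channels) are the standard values for that format rather than figures taken from this text.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bit rate in bit/s."""
    return sample_rate_hz * bits_per_sample * channels


def pcm_storage_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Storage needed for `seconds` of uncompressed PCM audio, in bytes."""
    return pcm_bit_rate(sample_rate_hz, bits_per_sample, channels) * seconds / 8


if __name__ == "__main__":
    # Assumed CD-DA parameters: 44.1 kHz, 16-bit, stereo.
    rate = pcm_bit_rate(44_100, 16, 2)
    size = pcm_storage_bytes(44_100, 16, 2, 60)
    print(f"CD-DA bit rate: {rate / 1000:.1f} kbit/s")
    print(f"One minute of CD-DA audio: {size / 1e6:.2f} MB")
```

Raising any of the three parameters scales the bit rate and the storage requirement proportionally, which is the trade-off the paragraph describes.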
Different kinds of sound have different quality requirements. For speech, fidelity mainly means clarity, freedom from distortion, and reproduction of a flat (planar) sound image. For music, the fidelity requirements are higher: a spatial sound image is created mainly by methods such as multichannel simulated stereo surround sound or virtual two-channel 3D surround sound, with the aim of reproducing the complete sound image of the original source.
Audio signals intended for different applications also use different compression standards. For example, telephone-quality audio uses the ITU-T G.711 standard: 8 kHz sampling, 8-bit quantization, and a bit rate of 64 kbit/s. AM-broadcast-quality audio uses the ITU-T G.722 standard: 16 kHz sampling and 14-bit quantization, which corresponds to a source rate of 224 kbit/s before coding. High-fidelity stereo audio compression standards were developed jointly by ISO and ITU-T; the CD 11172-3 MPEG audio standard supports 48 kHz, 44.1 kHz, and 32 kHz sampling at 32 kbit/s to 448 kbit/s per channel and is suited to CD-DA disc audio.
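The rates quoted for G.711 and G.722 follow directly from sampling rate times bits per sample, and the MPEG range can be compared with uncompressed audio in the same way. The sketch below checks that arithmetic. The names `QUOTED`, `mono_pcm_rate`, and `compression_ratio` are hypothetical, and the 16-bit, 44.1 kHz per-channel reference rate is an assumption (the usual CD-DA sample format), not a figure from the text.

```python
# (sample rate in Hz, bits per sample, rate quoted in the text in bit/s)
QUOTED = {
    "ITU-T G.711": (8_000, 8, 64_000),
    "ITU-T G.722": (16_000, 14, 224_000),
}


def mono_pcm_rate(sample_rate_hz, bits_per_sample):
    """Single-channel PCM rate in bit/s."""
    return sample_rate_hz * bits_per_sample


def compression_ratio(pcm_rate, coded_rate):
    """How many times smaller the coded stream is than the PCM stream."""
    return pcm_rate / coded_rate


if __name__ == "__main__":
    for name, (fs, bits, quoted) in QUOTED.items():
        computed = mono_pcm_rate(fs, bits)
        print(f"{name}: computed {computed} bit/s, quoted {quoted} bit/s")

    # Assumption: one CD-DA channel is 16-bit PCM at 44.1 kHz.
    cd_channel = mono_pcm_rate(44_100, 16)
    for coded in (32_000, 448_000):
        ratio = compression_ratio(cd_channel, coded)
        print(f"MPEG audio at {coded} bit/s per channel: ratio {ratio:.2f}")
```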
Demanding more sound quality than necessary makes the equipment needlessly complex; demanding too little fails to meet the needs of the application. The general principle is "enough, but without waste".
Speech quality: The quality of speech coding is currently most often assessed subjectively, using the mean opinion score (MOS), which has five grades: 5 (excellent), distortion imperceptible; 4 (good), distortion just perceptible but not annoying; 3 (fair), distortion perceptible and slightly annoying; 2 (poor), annoying but not objectionable; 1 (bad), extremely annoying and objectionable. In general, reproduced speech whose bandwidth reaches 7 kHz or more can be rated 5 on the MOS scale. This evaluation standard is widely used in multimedia technology and communications, for example in videophones, videoconferencing, voice e-mail, and voice mail.
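A MOS value is obtained by averaging the individual listeners' grades on the five-point scale above. The following sketch shows one way to compute it; the function names and the sample ratings are hypothetical, and mapping an averaged score back to the nearest grade label is an illustrative convenience rather than part of the scale as described.

```python
from statistics import mean

# Five-point scale as listed in the text.
MOS_LABELS = {5: "excellent", 4: "good", 3: "fair", 2: "poor", 1: "bad"}


def mean_opinion_score(ratings):
    """Average a list of integer listener grades (each 1 to 5)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if rating not in MOS_LABELS:
            raise ValueError(f"rating {rating!r} is not on the 1-5 scale")
    return mean(ratings)


def nearest_grade(score):
    """Label of the grade closest to `score` (halves round up)."""
    grade = min(5, max(1, int(score + 0.5)))
    return MOS_LABELS[grade]


if __name__ == "__main__":
    panel = [5, 4, 4, 3, 4]  # hypothetical listener panel
    score = mean_opinion_score(panel)
    print(f"MOS = {score}, closest grade: {nearest_grade(score)}")
```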
Musical sound quality: The quality of musical sound depends on many factors, including the characteristics of the sound source (sound pressure, frequency, spectrum, and so on), the signal characteristics of the audio equipment (distortion, frequency response, dynamic range, signal-to-noise ratio, transient response, stereo separation, and so on), the characteristics of the sound field (direct sound, early reflections, reverberation, interaural cross-correlation coefficient, reference vibration, sound absorption coefficient, and so on), and the characteristics of hearing (loudness curves, audible range, and the various listening impressions). Evaluating the sound quality reproduced by audio equipment is therefore difficult.
Two methods are commonly used: measuring technical specifications with instruments, and judging the sound subjectively by listening. Because the attributes of musical sound quality are complex, subjective evaluation is strongly colored by personal taste, while existing measurement techniques reflect fidelity only from certain angles. As a result, there is still no internationally recognized evaluation standard that truly quantifies the fidelity of musical sound. It has been reported, however, that the International Telecommunication Union (ITU-T) recently approved a new objective measurement method, referred to as the "electronic ear", which can objectively evaluate the sound quality of any audio equipment and can also be used to detect defects in speech coding systems for telephone communication.
The attributes of sound quality are usually evaluated subjectively in terms of changes in, and combinations of, the three elements of musical listening impression: loudness, pitch, and pleasantness. For example, strong low frequencies make a sound full, strong high frequencies make it bright, weak low frequencies make it smooth, and weak high frequencies make it clear. Several typical listening impressions are described below in relation to the sound source, the sound field, and the signal characteristics.
Stereo impression. This is the listening impression formed mainly by the sense of space (envelopment), the sense of localization (direction), and the sense of layering (depth); sound that conveys these impressions is called stereophonic. Natural sound fields are inherently rich in stereo impression, which is an important feature to capture when simulating the sound image of a source. The de Boer effect shows the following physiological characteristic of human hearing: for a listener on the axis of symmetry between two sound sources, when the sound pressure difference Δp = 0 dB and the time difference Δt = 0 ms, the two sources produce a single sound image and cannot be told apart; when Δp > 15 dB or Δt > 3 ms, the listener no longer hears a single centered image, and the sound image moves toward the source that is louder or leads in time. Every 5 dB of sound pressure difference is equivalent to a time difference of 1 ms. The Haas effect further shows that when a lagging sound (an early reflection, a delayed copy of the direct sound, or a second source) follows the leading sound by about 5 ms to 35 ms, the listener does not perceive it as a separate source; and as long as the delay stays below about 50 ms, the perceived direction is still determined by the direct or leading sound, even if the first reflection (also called the near or early reflection) or the lagging sound is considerably louder than the direct or leading sound.
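The figures quoted for the de Boer effect (15 dB or 3 ms for a full shift, and 5 dB per 1 ms as the trading ratio) can be turned into a rough numerical model. The sketch below is an illustration only: the function names are hypothetical, and both the summing of level and time cues and the linear interpolation between a centered and a fully shifted image are simplifying assumptions, not results stated in the text.

```python
DB_PER_MS = 5.0        # text: 5 dB of level difference ~ 1 ms of time difference
FULL_SHIFT_DB = 15.0   # text: beyond 15 dB (or 3 ms) the image is fully shifted


def equivalent_level_difference(delta_p_db, delta_t_ms):
    """Express a level difference plus a lead time as one figure in dB.

    Positive values favor the same source (the louder or leading one).
    Adding the two cues is an assumption made for illustration.
    """
    return delta_p_db + DB_PER_MS * delta_t_ms


def image_position(delta_p_db, delta_t_ms):
    """Image position in [-1, 1]: 0 is centered, +/-1 is fully at one source.

    The linear mapping in between is an assumption made for illustration.
    """
    eq_db = equivalent_level_difference(delta_p_db, delta_t_ms)
    return max(-1.0, min(1.0, eq_db / FULL_SHIFT_DB))


if __name__ == "__main__":
    for dp, dt in [(0, 0), (5, 0), (0, 1), (15, 0), (0, 3)]:
        print(f"dp = {dp} dB, dt = {dt} ms -> position {image_position(dp, dt):.2f}")
```

With these constants, a 5 dB level difference and a 1 ms lead give the same image position, which is the equivalence the text describes.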
Based on this physiological characteristic, the intensity, delay, reverberation, and spatial effects of a sound can be controlled and processed so as to create artificially, at the two ears, a sound-wave state with a given time difference Δt, phase difference Δθ, and sound pressure difference ΔP. If this state is identical to the one the original source would produce at the ears, the listener perceives the stereo impression of the reproduced sound faithfully and completely. Compared with mono sound, stereo typically offers a spread sound image, well-balanced volume among the parts, high clarity, and low background noise.
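As a minimal illustration of creating a time difference and a sound pressure difference between two channels, the sketch below derives a stereo pair from a mono signal by delaying and attenuating one channel. The function name and parameters are hypothetical, reverberation and phase processing are left out, and the conversion from decibels to an amplitude factor uses the usual 20·log10 convention for sound pressure.

```python
def pan_with_time_and_level(mono, sample_rate_hz, delta_t_ms, delta_p_db):
    """Return (left, right) sample lists derived from a mono signal.

    The right channel lags the left by `delta_t_ms` and is `delta_p_db`
    quieter, so the image is pulled toward the left channel.
    """
    if delta_t_ms < 0:
        raise ValueError("delta_t_ms must be non-negative")
    delay = round(sample_rate_hz * delta_t_ms / 1000.0)  # delay in samples
    gain = 10 ** (-delta_p_db / 20.0)                    # dB -> amplitude factor
    left = list(mono) + [0.0] * delay                    # pad so lengths match
    right = [0.0] * delay + [gain * s for s in mono]
    return left, right


if __name__ == "__main__":
    mono = [1.0, 0.5, 0.25]
    left, right = pan_with_time_and_level(mono, 48_000, 0.5, 6.0)
    print(f"delay of {len(left) - len(mono)} samples, "
          f"first delayed sample {right[len(left) - len(mono)]:.3f}")
```

A real spatializer would also filter each channel and add reflections; this sketch covers only the Δt and ΔP cues named above.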