Introduction
I've been working for a while on a voice chat program in C# and ran into a vexing problem. Uncompressed audio data simply wouldn't do for a chat program, yet all of the .NET voice compression solutions I could find were quite expensive. Speex, the license-free open-source voice codec, seemed to be the obvious choice, but a thorough search turned up no complete C# implementation of the library. A quick search on SourceForge turned up a few such projects, but all were incomplete and long abandoned, which is not a comforting fact for someone considering attempting the same feat.

Of course, the entire time an exceedingly simple solution was readily available, if only I had been able to see it. The Speex website provides two command-line utilities, speexenc and speexdec. As I was looking for a programmatic solution, it never occurred to me to even look at the syntax of these utilities. If I had, I would have seen how easy it would be to use them from .NET code.
What is Speex?
Speex is a license-free open-source voice codec. It compresses audio data into a smaller format, which is advantageous for transmitting voice over the internet. Keep in mind that it is generally not efficient for non-voice data. This is because (according to Wikipedia) voice codecs work by eliminating frequencies that cannot be produced by the human voice and those that are inaudible to human ears. With fewer frequencies to represent, the audio data can be stored in a more compact form. Speex is usually used in VoIP programs and similar internet voice applications, but it can also be used simply to reduce the size of a file on your computer.

More information can be found at speex.org.
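To put the size reduction in perspective, here is a small sketch that computes the data rate of uncompressed PCM audio and compares it with a target encoded bitrate. The class and method names (PcmMath, BitsPerSecond, CompressionRatio) are illustrative and not part of the article's code, and the 32,000 bit/s target in the usage note is an arbitrary assumption rather than a Speex default.

```csharp
using System;

public static class PcmMath
{
    // Bits per second of uncompressed PCM audio:
    // samples per second x bits per sample x number of channels.
    public static int BitsPerSecond(int samplesPerSecond, int bitsPerSample, int channels)
    {
        return samplesPerSecond * bitsPerSample * channels;
    }

    // How many times smaller the encoded stream is than the raw PCM stream.
    public static double CompressionRatio(int pcmBitsPerSecond, int encodedBitsPerSecond)
    {
        return (double)pcmBitsPerSecond / encodedBitsPerSecond;
    }
}
```

For 16-bit mono audio sampled at 32 kHz (the rate Speex's ultra-wideband mode works at), PcmMath.BitsPerSecond(32000, 16, 1) gives 512,000 bit/s. Encoding that at an assumed 32,000 bit/s would be a 16-fold reduction.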
Using the Code
This code is quite simple to use. It contains an exception class, two structures for storing various data, and a class with two methods: Encode and Decode. It also makes use of a heavily modified version of Sujoy G.'s clsWaveProcessor class from his article Wave File Processor in C#. I've added a new field that contains the raw PCM data, removed all methods except for WaveHeaderIN, and edited that method to work on a stream instead of a filename. Here is the meat of the program, the Codec class.

    using System.Diagnostics;
    using System.IO;

    public class Codec
    {
        public EncodeReturn Encode(byte[] raw, int bytespersecond, int samplespersecond,
            bool stereo, short bitspersample, bool denoise, bool agc)
        {
            //Start speexenc process
            Process encProc = Start("speexenc",
                "-u "                                       //Ultra wide-band
                + (denoise ? "--denoise " : "")             //Denoise before encode
                + (agc ? "--agc " : "")                     //Adaptive gain control before encode
                + "--bitrate " + bytespersecond * 8 + " "   //Set the bitrate
                + "--rate " + samplespersecond + " "        //Set the sample rate
                + (stereo ? "--stereo " : "")               //Set the channel count
                + (bitspersample != 16 ? "--8bit " : "")    //Set the sample size
                + "con con");                               //Set console input and output

            //Write the raw audio data to encProc's StdIn, then close it to signal end of input
            encProc.StandardInput.BaseStream.Write(raw, 0, raw.Length);
            encProc.StandardInput.Close();

            //In non-verbose mode, the first line of output is the only line of
            //non-audio data, so skip it
            Stream output = encProc.StandardOutput.BaseStream;
            SkipLine(output);

            //Read the rest of the output
            MemoryStream encoded = new MemoryStream();
            byte[] buffer = new byte[4096];
            int count;
            while ((count = output.Read(buffer, 0, buffer.Length)) > 0)
            {
                encoded.Write(buffer, 0, count);
            }

            //Wait for the process to exit, then check for success
            encProc.WaitForExit();
            if (encProc.ExitCode != 0)
                throw new EncodeDecodeFailureException(encProc.ExitCode);

            //Create the return object and return it
            return new EncodeReturn(encoded.ToArray());
        }

        public DecodeReturn Decode(byte[] raw)
        {
            //Create and start the decoding process
            Process decProc = Start("speexdec", "--force-uwb con con");

            //Write the encoded data to decProc's StdIn, then close it to signal end of input
            decProc.StandardInput.BaseStream.Write(raw, 0, raw.Length);
            decProc.StandardInput.Close();

            //Skip the first line
            Stream output = decProc.StandardOutput.BaseStream;
            SkipLine(output);

            //Pass the output to clsWaveProcessor
            clsWaveProcessor cwp = new clsWaveProcessor();
            //Process the header and the data
            cwp.WaveHeaderIN(output);

            //Wait for the process to exit, then check for success
            decProc.WaitForExit();
            if (decProc.ExitCode != 0)
                throw new EncodeDecodeFailureException(decProc.ExitCode);

            //Create the output
            DecodeReturn dr = new DecodeReturn(cwp.RawPcmWaveData,
                ((cwp.BitsPerSample / 8) * cwp.SampleRate), cwp.SampleRate,
                cwp.Channels != 1, cwp.BitsPerSample);
            //Return the output
            return dr;
        }

        //Starts a process with StdIn and StdOut redirected, which is required
        //before StandardInput and StandardOutput can be used
        private static Process Start(string fileName, string arguments)
        {
            ProcessStartInfo psi = new ProcessStartInfo(fileName, arguments);
            psi.UseShellExecute = false;
            psi.RedirectStandardInput = true;
            psi.RedirectStandardOutput = true;
            return Process.Start(psi);
        }

        //Reads and discards bytes up to and including the next newline
        private static void SkipLine(Stream stream)
        {
            int b;
            while ((b = stream.ReadByte()) != -1 && b != '\n') { }
        }
    }
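One thing to watch when driving a command-line tool through redirected streams is that a pipe only holds a limited amount of data. If the parent writes all of its input before reading any output, a large buffer can stall both sides. The sketch below shows one way around this: drain the child's output on a background task while the input is being written. It is a generic illustration rather than the article's Codec class. ProcessPipe and PipeThrough are hypothetical names, and Task.Run assumes .NET 4.5 or later.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

public static class ProcessPipe
{
    // Hypothetical helper: feeds `input` to a child process on stdin and
    // returns everything the child writes to stdout.
    public static byte[] PipeThrough(string fileName, string arguments, byte[] input)
    {
        var psi = new ProcessStartInfo(fileName, arguments)
        {
            UseShellExecute = false,        // required for stream redirection
            RedirectStandardInput = true,
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };

        using (Process proc = Process.Start(psi))
        using (var output = new MemoryStream())
        {
            // Drain stdout on another thread so the child never blocks on a full pipe.
            Stream stdout = proc.StandardOutput.BaseStream;
            Task drain = Task.Run(() => stdout.CopyTo(output));

            // Send the input, then close stdin so the child sees end of input.
            proc.StandardInput.BaseStream.Write(input, 0, input.Length);
            proc.StandardInput.Close();

            drain.Wait();
            proc.WaitForExit();

            if (proc.ExitCode != 0)
                throw new InvalidOperationException(
                    fileName + " exited with code " + proc.ExitCode);

            return output.ToArray();
        }
    }
}
```

With a helper like this, an encode call could look like ProcessPipe.PipeThrough("speexenc", "-u con con", raw), reusing the article's arguments. Any non-audio text the tool prints to its output would still need to be stripped from the result.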