Audio Primer

OK, so you want to record something, but you know nothing about audio, and Yahweasel and Craig are abusive jerks. Where do you begin? Right here, that's where.

Firstly, audio from Craig is always delivered as multiple files, with each file representing a single speaker. This is called multi-track recording (a single stream of audio is called a “track”), and is Craig's primary feature. This is extremely useful, as you can edit or cut parts of each track independently. If one speaker is too quiet, you can increase their volume without affecting anyone else; if another speaker keeps coughing, you can remove that without removing anyone's speech. Ultimately, you will probably need to use an audio editor to mix these tracks into a single audio stream. There are many audio editors in the world, and Audacity is a popular, free one.
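If you're curious what "mixing" actually means, here is a minimal sketch in Python. It is not what Audacity or Craig does internally; it just assumes each track is a plain list of 16-bit samples and that a per-speaker gain (a hypothetical parameter I made up for illustration) is all the editing you want. Mixing is then adding the samples together and clamping the result so it stays in range.

```python
def mix_tracks(tracks, gains=None):
    """Mix several mono tracks (lists of 16-bit samples) into one track.

    `gains` holds one volume multiplier per track; 1.0 leaves it unchanged.
    """
    if gains is None:
        gains = [1.0] * len(tracks)
    length = max(len(track) for track in tracks)
    mixed = []
    for i in range(length):
        total = 0.0
        for track, gain in zip(tracks, gains):
            if i < len(track):  # shorter tracks simply contribute silence
                total += track[i] * gain
        # Clamp to the signed 16-bit range so loud passages don't wrap around
        mixed.append(max(-32768, min(32767, int(round(total)))))
    return mixed
```

Because each speaker has their own track, boosting a quiet speaker is just a larger gain for that one list; nobody else's samples are touched until the final sum.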

The multiple files delivered by Craig are packaged in a ZIP file. ZIP is a common archive format; that is, it's used to package several files as one. All modern desktop operating systems have support for ZIP out-of-the-box; mobile users will likely need to search for a ZIP file extractor for their system.
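If you'd rather unpack the ZIP from a script than by clicking on it, Python's standard zipfile module will do. This is only a sketch: the file and directory names in the comments are hypothetical, and the names inside your own download will differ.

```python
import zipfile

def extract_tracks(zip_path, dest_dir):
    """Extract every file in the archive and return the list of names."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(dest_dir)
    return names

# Hypothetical usage:
#   tracks = extract_tracks("recording.zip", "recording")
#   print(tracks)  # one entry per file in the archive
```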

Digital audio data may be encoded in a variety of ways, with a variety of benefits and disadvantages. The classic and simplest technology is a WAV file, or a “raw PCM waveform”. The key word there is “raw”: WAV files are totally raw data, and are huge. So huge, in fact, that the idea of sending them over the Internet to anyone who asks is just infeasible. For this reason, Craig only directly offers WAV files of reduced quality, which should only be used if you have no other option. If you're on a supported operating system, Craig will offer a WAV file “extractor”, which consists of full-quality compressed files plus a program to convert them to WAV. If you don't trust me to run random software on your computer (and you have no reason to), or if you're not using a supported operating system, use a different format and read on.
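To put a number on “huge”, here is the arithmetic as a small Python sketch. The defaults (48,000 samples per second, 16 bits per sample, one channel) are my assumptions for illustration, not a statement of what Craig records; plug in your own values.

```python
def raw_pcm_bytes(seconds, sample_rate=48000, bits_per_sample=16, channels=1):
    """Size in bytes of uncompressed PCM audio, ignoring the small WAV header."""
    bytes_per_sample = bits_per_sample // 8
    return seconds * sample_rate * bytes_per_sample * channels

one_hour_one_speaker = raw_pcm_bytes(3600)
one_hour_five_speakers = 5 * one_hour_one_speaker
```

Under those assumptions, one hour of one speaker is 345,600,000 bytes (roughly 346 MB), and a five-person recording of the same length is 1,728,000,000 bytes. Remember that every speaker gets their own track, so the size multiplies by the number of people.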

Typically, audio data transmitted over the Internet is transmitted in a compressed format. With audio compression, there are two options: The audio may be losslessly compressed, or lossily compressed. Lossless compression takes more space than lossy compression, but much less space than raw data (WAV), and loses absolutely no information relative to a raw WAV file. Lossless compression usually has no disadvantages except its size. The most popular lossless audio compression format is FLAC (Free Lossless Audio Codec), which is Craig's primary format. If your software supports FLAC and you have a fast enough Internet connection, FLAC is what you want. Apple has their own (slightly inferior) lossless audio format, called ALAC. There are a few other lossless formats, but none are as well supported as FLAC and ALAC.
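“Loses absolutely no information” is easy to check for yourself. The sketch below uses zlib, a general-purpose compressor from Python's standard library, as a stand-in; it is not an audio codec and compresses real audio far worse than FLAC does, but it shows the defining property of any lossless format: decompressing gives back exactly the bytes you started with.

```python
import struct
import zlib

def lossless_round_trip(samples):
    """Pack 16-bit samples, compress, decompress, and compare.

    Returns (raw size, compressed size, whether the bytes survived intact).
    """
    raw = struct.pack("<%dh" % len(samples), *samples)  # little-endian 16-bit PCM
    compressed = zlib.compress(raw)
    restored = zlib.decompress(compressed)
    return len(raw), len(compressed), restored == raw

# A very repetitive, made-up waveform; real speech compresses much less well.
demo_samples = [0, 1000, 2000, 1000, 0, -1000, -2000, -1000] * 1000
raw_size, compressed_size, identical = lossless_round_trip(demo_samples)
```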

If lossless compression isn't your thing, you want lossy compression. Lossy compression makes audio data smaller by intentionally losing some information: Lossily compressing audio data invariably loses some quality relative to the original. In the world of lossy compression, there is a huge variety of options available. The Moving Picture Experts Group (MPEG) defines an array of formats which are widely supported and considered the international standard. Amongst them is AAC (Advanced Audio Coding), otherwise known simply as MPEG-4 audio, which is the most widely supported modern audio format. AAC has excellent quality at reasonable size. If you can't use FLAC, AAC is what you want.
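For contrast with lossless compression, here is about the crudest lossy scheme imaginable, sketched in Python: round every sample down to a multiple of some step. Real lossy codecs such as AAC are vastly smarter about which information to throw away, so treat this purely as an illustration of the word “lossy”; the function name and the step of 256 are my own inventions.

```python
def quantize(samples, step=256):
    """Round each sample down to a multiple of `step`, discarding fine detail."""
    return [(sample // step) * step for sample in samples]

original = [1000, -1000, 255, 256]
degraded = quantize(original)
```

The degraded samples need fewer bits to store, but there is no way to get the original values back: both 255 and 0 end up as 0, and nothing in the output says which one it was.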

Craig also supports a collection of other lossy formats, namely MPEG-4 HE-AAC, Opus and Ogg Vorbis. HE-AAC is an improved version of AAC, but is slightly less widely supported; the quality of an HE-AAC file will be roughly the same as the AAC file, but it will usually be smaller. Opus is the current darling of lossy compression, is the format used by Discord itself, and is simply the best lossy format available today, but support in audio editing software is currently mostly lacking. Opus is also unencumbered by patents, if that's important to you. Ogg Vorbis isn't as widely supported or good as AAC, but is more widely supported than FLAC or Opus, and is also unencumbered by patents.

The elephant in the room is MP3. To many people, “MP3” is synonymous with “audio file”. First, let's avoid a confusing issue: MP3 is not MPEG-3 audio, and thus is not the immediate predecessor to MPEG-4 audio. MP3 is the third audio standard in the first MPEG standard; i.e., it's MPEG-1. I will call it MPEG-1 from here on so as not to give it more prominence than it deserves. I don't support MPEG-1 for a variety of reasons:

This concludes my ranting. If you have read this and still have questions, you are free to ask them on Craig's Discord server.

¹ “Wait!”, I hear you cry, “I use AAC audio and my audio editor asks me for a bitrate! I thought you said that no modern format supports constant bitrate!” Unfortunately, the language of “bitrate” has become so ingrained into how people talk about audio that modern encoders fake constant bitrate modes. Technically AAC does support a constant bitrate mode, but it's rarely used; when you set a constant bitrate, what a normal encoder does is check how many bits are being used per second every so often—usually every one or two seconds—and adjust the quality to bring the bitrate closer to the requested one. It's an “average bitrate” mode, not a constant bitrate mode. If your software supports a constant quality mode, you almost certainly should use it. The only modern situation in which average bitrate is still logical is livestreaming, and even there it's hardly brilliant, but I won't go into the details of packet-switched networking and momentary bandwidth spikes.
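The feedback loop described in this footnote can be sketched as a toy simulation in Python. Everything here is an assumption made for illustration: the “complexity” numbers, the rule that bits used equals complexity times quality, and the adjustment step are all invented, and no real encoder is this simple. The point is only to show the shape of the idea: measure the bits spent in each window, then adjust the quality for the next one.

```python
def average_bitrate_sim(complexities, target_bits, quality=1.0):
    """Toy average-bitrate loop.

    Each entry in `complexities` is one window of audio (say, one second).
    In this made-up model a window costs complexity * quality bits.
    Returns a list of (quality used, bits spent) per window.
    """
    history = []
    for complexity in complexities:
        bits = complexity * quality
        history.append((quality, bits))
        if bits > 0:
            # Scale quality so this window would have landed on the target;
            # the correction only takes effect on the NEXT window.
            quality *= target_bits / bits
    return history

windows = average_bitrate_sim([64, 64, 128], target_bits=128)
```

In this run the first window undershoots the target, the second hits it, and the third (where the audio suddenly gets twice as complex) overshoots before the loop can react. The bitrate is only right on average, and the quality wobbles to make that happen, which is the opposite of what a constant quality mode gives you.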