
Create your own soundtrack

It’s not that difficult to capture and create the perfect soundtracks on your PC. Here’s how.

Karl Foster, Personal Computer World 12 Feb 2007

Whether you’re creating a video, presentation or podcast, with well-balanced audio your production will engage an audience rather than force them to cover their ears.

Far too many productions are blighted by noise, poor tone and a lacklustre score. Yet it only takes a little thought and effort to create sound that excites. You don’t need composers, studios or voiceover artists – the computer, some specialist technology and tips on technique, along with your own sensibilities, will do the trick.

If you haven’t got a musical bone in your body, but your production demands a tune, there are ways to sort it. And if the sound of your recorded voice makes you cringe, that’s easily rectified, too. With a compelling tale to tell, you’re most of the way there, so let’s look at how to ensure a production’s soundtrack truly completes the piece.

There are extras to buy, but they need not cost a lot, and some pro-studio advice to take in, although it’s largely common sense. We’ll start by looking at voice narratives for video and podcasts, moving on to movies and music later, so keep your ears fresh, be critical of your work, and stay receptive to the practices of professional producers.

Loud and clear
Quality capture of the human voice is crucial to success. Everybody has a voice and so the audience is sensitised towards it. Timbre varies, but whoever the speaker, the idea is to record what is spoken or sung as cleanly and as fully as possible.

The first instinct is to plug a microphone into your soundcard’s mic socket, hold it in front of your mouth, hit Record and start waffling. But you’ll end up with a thin, noisy recording because the tools and technique are not right for the job.

Soundcards are useless for our purposes. Not only does re-plugging mean scrabbling around at the back of the PC, but soundcard sockets also provide a very small, easily tarnished contact patch that can distort the audio signal. Most importantly, however, the connections are unbalanced.

A microphone cable makes an efficient aerial and so the electromagnetic interference pumping out of your computer setup will be received, recorded and then amplified into a take-ruining buzz. It’s better, then, to use an external desktop audio interface with balanced connections. The sockets will be more accessible and, when the interface is used with balanced cables, noise at the pre-amplification stage will be less of a worry.
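Why a balanced connection shrugs off interference comes down to simple arithmetic. The sketch below is a toy model only: the sample values and the function name are made up for illustration, and it assumes the interference lands equally on both signal wires, which real cables only approximate.

```python
# Toy model of a balanced connection (all values are illustrative assumptions).
def receive_balanced(hot, cold):
    """The receiving end takes the difference between the two signal wires."""
    return [(h - c) / 2 for h, c in zip(hot, cold)]

signal = [0.0, 0.3, 0.5, 0.3, 0.0, -0.3]           # the wanted audio
interference = [0.2, -0.1, 0.2, -0.1, 0.2, -0.1]   # buzz picked up by the cable

# The sender puts the signal on one wire and an inverted copy on the other.
# The interference adds equally to both wires.
hot = [s + n for s, n in zip(signal, interference)]
cold = [-s + n for s, n in zip(signal, interference)]

recovered = receive_balanced(hot, cold)  # the interference cancels out
unbalanced = hot                         # a single signal wire keeps the buzz

print("balanced:  ", [round(x, 3) for x in recovered])
print("unbalanced:", [round(x, 3) for x in unbalanced])
```

Because the buzz is identical on both wires, subtracting one from the other removes it and leaves the voice untouched.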

Yamaha has recently unveiled a portable hardware audio interface – the £279 GO46 – which not only has impressive analogue-to-digital conversion specs and balanced connectors, but also links to your computer via Firewire, so it can be cabled up and placed within easy reach. There are USB2 interfaces that do the same job, but you’ll not be able to daisy-chain them should you wish to expand.

See Edirol, M-Audio and Echo for more interface options. Whatever you choose, make sure it has XLR sockets (the name for the standard three-pin round connectors used for microphones) and phantom power (a way of powering a mic via its signal cables) so you can use the right type of microphone.

Taking the mic
You’re probably familiar with the 1/4-in or 6.3mm jack plug that’s used to connect electric guitars and dynamic microphones to preamplifiers. In studio circles, you’ll find that microphones hook up via three-pin XLR connectors, one of two types of balanced connector. The other is the tip/ring/sleeve (TRS) type, which looks like a stereo headphone jack, but it is the XLR connector that is used to deliver power to active microphones called condensers.

Passive mics, such as those bundled with multimedia PCs or used by singers on stage, generate an electrical signal mechanically and are not very sensitive. Condenser mics work by modifying an existing voltage (48V phantom power) and are much more sensitive, capturing a wider tonal range, thus making your voice sound rich and natural.

With such sensitivity, however, comes a downside. The human body makes sound other than speech. Breathing is a major problem, while certain other noises can mess with people’s minds if heard through in-ear headphones. Then there’s mechanical noise from handling or other sources.

To counter the latter, invest in a boom-arm microphone stand, such as Quiklok’s, or one from a major mic manufacturer. This will enable you to position the microphone without having to resort to a desk stand. And ensure you get a cradle with the mic; it’s a unit that supports the microphone’s barrel via bungee cords for further mechanical damping.

An AKG Perception 200 with a cradle and stand, for example, costs less than £150, and there are other good condenser mics available from Sennheiser, Beyer Dynamic, Audio-Technica and Rode.

A popular technique for reducing noise from breath is to invert the mic and suspend it opposite the bridge of your nose, where it’s out of the path of mouth and nostrils. Otherwise, a windshield (also known as a pop shield or pop filter), such as Shure’s PS-6 popper stopper, can be clamped to the stand, its cloth diaphragm reducing the pickup of vocal ‘plosives’. Moving the mic further away from the body helps limit other body-generated sounds, but that means turning up the recording level.

Environmental concerns
So far, we’ve a condenser mic, suitably positioned and suspended, plugged into a phantom-powered XLR socket on the audio interface and the digital signal is being fed into your audio recording or video-editing software (see ‘Sound software’ on the final page). The mic connection is balanced, so there’s no interference, and the digital link and computer system are reasonably quiet.

But what’s that whine in the background? The mic is so sensitive that it’s picking up the whirr of the computer’s fan. When choosing a condenser, get one with a cardioid pickup pattern – so called because the pattern is heart-shaped – which means it’s sensitive at the front, less so at the sides, and not sensitive at the rear. This helps cut down on noise emanating from particular directions, unlike an omnidirectional model, although it’s just as good, if possible, to move the computer into another room.
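The heart shape can be put into numbers. An ideal cardioid has a relative sensitivity of 0.5 × (1 + cos angle), where the angle is measured from the front of the mic. The sketch below assumes that textbook formula; real microphones only approximate it, and the function names are ours.

```python
import math

def cardioid_sensitivity(angle_degrees):
    """Relative sensitivity of an ideal cardioid: 0.5 * (1 + cos(angle))."""
    return 0.5 * (1 + math.cos(math.radians(angle_degrees)))

def to_decibels(ratio):
    """Convert an amplitude ratio to dB; total rejection is minus infinity."""
    return 20 * math.log10(ratio) if ratio > 0 else float("-inf")

# Front, side and rear of the microphone.
for angle in (0, 90, 180):
    ratio = cardioid_sensitivity(angle)
    print(f"{angle:3d} degrees: {ratio:.2f} ({to_decibels(ratio):.1f} dB)")
```

In this idealised model, sound arriving from the side is picked up at half strength and sound from directly behind is rejected, so pointing the back of the mic at the computer makes the most of the pattern.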

Double-glazing lessens the intrusion of road noise, and recording when the house is empty (with the phone unplugged) is also wise, but there are still the acoustic properties of the room to consider. As the mic moves further away, it picks up reflections from your voice bouncing off walls, tables, equipment – any hard surface – so deaden the room as much as possible.

Drawn curtains, soft furnishings and carpets help absorb unwanted reflections. Some people even resort to wall drapes and baffles on the ceiling to further dampen the acoustic. If you want a particular ambience, it’s best applied later in software when the recording’s over.

Time to perform
The secret to recording a good narrative is to prepare. Unlike live broadcasts, in which the presenter often has to wing it as events unfold, you’ve got the time to get notes together.

However, if it’s all too tightly scripted, you’re likely to come across as stilted because it’s hard to write natural-sounding speech – that’s why good speechwriters are worth their weight in gold. Type up notes as the sequence unfolds, checking facts as you go, and aim for a steady pace, with pauses so the audience can take in the visuals.

Naturally, you need to be able to hear what you’re doing, perhaps while recording, certainly while talking over a backing track, and definitely when checking each take. During the recording, you can’t monitor backing material on speakers because it’ll feed into the mic, so headphones are essential.

Closed-back types, as used in recording studios, are the best buy because a well-fitting pair eliminates backing-track spill into the mic and is useful for rough mix monitoring in noisy environments, or when you can’t crank up the main speakers. The AKG K 171 Studio, retailing at about £80, is a good choice, as is the similarly priced Beyer Dynamic DT100.

Once you’ve rehearsed and checked that the recording level meters are showing a healthy signal, not so high as to distort and not so low as to need a level boost afterwards, relax and press Record. Some people find it useful to speak as if talking to a friend sitting opposite. That way, they’re less self-conscious about the performance and sound more natural.
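What counts as a healthy signal can be expressed as a peak level in decibels relative to full scale (dBFS), where 0dBFS is the point at which a digital recording clips. The sketch below is a rough guide only: the two thresholds and the sample values are illustrative assumptions, not a standard.

```python
import math

def peak_dbfs(samples, full_scale=32768):
    """Peak level in dB relative to full scale (0 dBFS is the clipping point)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak / full_scale) if peak else float("-inf")

def judge_level(dbfs, too_hot=-3.0, too_quiet=-24.0):
    """Classify a peak level. The thresholds are assumptions, not a standard."""
    if dbfs > too_hot:
        return "too hot: risk of distortion"
    if dbfs < too_quiet:
        return "too quiet: will need boosting later"
    return "healthy"

rehearsal = [0, 4000, -9000, 12000, -7000, 3000]  # made-up 16-bit samples
level = peak_dbfs(rehearsal)
print(f"peak {level:.1f} dBFS: {judge_level(level)}")
```

Your recording software's meters do the same job continuously; the point is simply to leave some headroom below 0dBFS without letting the signal sink towards the noise floor.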

If you fluff, don’t worry – you can try again until it’s right. Also, if the sound of your voice coming over the headphones is off-putting, take them off, but remember to turn the headphone level knob right down during the take.

Samples and bits
When setting up to perform, you’ll need to establish the appropriate capture resolution. As you will probably know, CD audio data has a sample rate of 44.1KHz, with 16 data bits per sample, and is stereo. This gives sufficient resolution and data per sample for high-quality audio reproduction and means that elements can be placed at various positions within the stereo panorama, which lends clarity to a mix.

However, recording studios use higher sample rates and greater bit depths than this, because the higher the resolution, the more leeway there is when processing material. In a modern sound studio, audio capture resolution starts at 96KHz with 24-bit sampling, and can go up to 192KHz, regardless of the final delivery medium. Hence, if you’re planning a solo narrative and want to keep quality high while processing the sound, set your audio editor to record at a high sample rate and make it mono.
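To put numbers on that leeway, two textbook rules of thumb help: the highest frequency a recording can hold is half its sample rate, and each extra bit adds roughly 6dB of theoretical dynamic range. The sketch below assumes an ideal converter and a full-scale sine wave; real hardware falls short of these figures, and the function names are ours.

```python
import math

def theoretical_dynamic_range_db(bits):
    """Ideal signal-to-quantisation-noise ratio for a full-scale sine wave."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

def highest_frequency_hz(sample_rate_hz):
    """Nyquist limit: half the sample rate."""
    return sample_rate_hz / 2

# CD resolution versus a typical studio capture resolution.
for rate, bits in ((44_100, 16), (96_000, 24)):
    print(f"{rate} Hz, {bits}-bit: up to {highest_frequency_hz(rate):.0f} Hz, "
          f"about {theoretical_dynamic_range_db(bits):.0f} dB of dynamic range")
```

The extra range of 24-bit capture is what lets you boost, filter and compress a take without the rounding errors becoming audible.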

Again, studio engineers record most things in mono (drum kits being a notable exception) because engineers and producers are the people creating the stereo panorama. Your voice and mic are mono anyway, so stereo is a waste of bytes.
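The saving is easy to work out, because uncompressed audio takes up sample rate × bytes per sample × number of channels for every second recorded. A minimal sketch, with function names of our own choosing and a megabyte taken as one million bytes:

```python
def bytes_per_second(sample_rate_hz, bits, channels):
    """Data rate of uncompressed audio."""
    return sample_rate_hz * bits // 8 * channels

def megabytes_per_minute(sample_rate_hz, bits, channels):
    """Storage needed per minute, counting a megabyte as 1,000,000 bytes."""
    return bytes_per_second(sample_rate_hz, bits, channels) * 60 / 1_000_000

formats = {
    "CD quality, stereo": (44_100, 16, 2),
    "96KHz 24-bit, mono": (96_000, 24, 1),
    "96KHz 24-bit, stereo": (96_000, 24, 2),
}
for name, spec in formats.items():
    print(f"{name}: {megabytes_per_minute(*spec):.2f}MB per minute")
```

Recording a voice in stereo at 96KHz, 24-bit doubles the disk space needed without adding any information.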

The affordable audio interface options mentioned earlier are capable of recording and playing 96KHz, 24-bit audio, so use that resolution for capture, then dither down and compress to your preferred delivery format at the final stage. On that subject, Apple’s iPod supports the compressed Aac format, but not Windows Media Audio (wma), whereas Creative’s popular Zen range, in common with many other portables, supports wma, but not Aac. Both, however, support mp3, so it’s still the best format to use if you want a wide audience among mobile users.
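Dithering down means adding a tiny amount of random noise before discarding the extra bits, so that the rounding errors turn into a faint, even hiss instead of distortion that follows the signal. Your audio editor handles this for you; the sketch below only illustrates the principle, using triangular (TPDF) dither, and assumes integer samples and a fixed random seed for repeatability.

```python
import random

def dither_to_16_bit(samples_24_bit, rng=None):
    """Reduce signed 24-bit integer samples to 16-bit with triangular dither."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    out = []
    for sample in samples_24_bit:
        scaled = sample / 256                # one 16-bit step = 256 24-bit steps
        noise = rng.random() - rng.random()  # triangular, between -1 and +1 step
        value = round(scaled + noise)
        out.append(max(-32768, min(32767, value)))  # stay inside the 16-bit range
    return out

takes = [0, 1000, -123456, 8_388_607, -8_388_608]  # made-up 24-bit samples
print(dither_to_16_bit(takes))
```

Each output value lands within a step and a half of the exact answer; the randomness is the point, because it stops the error from tracking the programme material.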

For audio-only podcasts, the choice of compression options set in the audio-editing software’s export dialogue is up to your own ears, but for monophonic voice, reckon on 128Kbits/sec as the minimum. Music deserves stereo at 164Kbits/sec or more, while a 44.1KHz sample rate at 16-bit is good for both.
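The bit rate also tells you how big the download will be, since a constant-bit-rate file is simply bit rate × duration. A quick sketch, assuming 1Kbit means 1,000 bits and ignoring the small overhead of tags and headers:

```python
def file_size_megabytes(bitrate_kbits_per_sec, minutes):
    """Approximate size of a constant-bit-rate file, in millions of bytes."""
    bits = bitrate_kbits_per_sec * 1000 * minutes * 60
    return bits / 8 / 1_000_000

# A half-hour podcast at the minimum rate suggested above.
print(f"{file_size_megabytes(128, 30):.1f}MB")
```

At 128Kbits/sec, every minute of audio costs a little under 1MB, which is worth bearing in mind for listeners on slow connections.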

However, whether podcasting or mixing audio for visuals, it’s as well to have a clear picture of what the audience is likely to hear, which is not possible with domestic speaker systems, or with headphones.

Speakers’ corner
When studio engineers talk about monitors, they’re not referring to visual displays, but to the speakers they use to achieve an accurate mix. Their monitors must have a broad, even frequency response, not colouring the sound in any way.

Home stereo equipment is no good because it’s ‘sweetened’ to sound good in the showroom, while the multimedia speakers bundled with computers often make for useful bookends, but little else. Headphones, meanwhile, under-represent the bass.

For quality monitoring, choose a pair of near-field reference monitors, so called because they’re placed close to you and what you hear on them will sound good on any decent consumer system. M-Audio markets affordable Studiophile monitors with built-in amplification, so you don’t have to worry about a separate amp.

At the entry level, a pair of £129 DX4s provides 18W per channel and the cabinets are equipped with 1/4-in TRS balanced connections to reduce in-cable interference. Moving up, Yamaha produces the HS50M, a £129 bi-amplified cabinet delivering 70W.

One pair, in conjunction with the HS10W 150W subwoofer (£329), is good for 2.1-channel mixing, with a frequency range from a floor-rumbling 30Hz up to a bat-worrying 20KHz. If you’ve budgeted for five HS50Ms, or similar, and an audio interface with six or more balanced outputs, then 5.1 surround-sound mixing becomes an option.

Other top picks include powered studio monitors from Tannoy (the Reveal series, in particular), Behringer and Fostex.

Music and SFX
Now that you can hear what’s going on, you can add spice to your soundtrack with music and sound effects (see ‘Mixing it’ on the final page). Don’t worry if you can’t hold a tune. There are libraries online that host selections of music for every occasion. Be sure, however, to heed their licensing terms.

Production-music houses, such as Focus, supply premium, rights-managed content for which you’ll need a licence from the MCPS (see ‘Copyright matters’ on the final page). Such material is typically well-produced and available in high-quality stereo Wav or Aiff formats at CD and DVD quality, so paying extra could pay dividends.

Alternatively, there’s royalty-free music. You pay for it once and then use it as you will, although you should still check the vendor’s terms. Some impose limitations, so aim to arrange a buyout of the material so there are no future costs. Try CSS Music, Globalcuts and the aptly named Royalty Free Music for streamed previews of production-quality files so you can audition before you buy.

Sound effects (SFX) are easier because it’s difficult to press a rights claim to something that goes ‘boink’, ‘swoosh’, or otherwise. SFX compilations on CD abound, and you may have files supplied with your video or soundtrack-editing software, but for a handle on leading publishers, visit Time+Space. And if your appetite knows no bounds, try www.sound-effects-library.com, a portal to 87 libraries from around the world, including the BBC’s enormous collection.

As with editing video, placing SFX and incorporating music takes some thought. A dolorous tune while promoting product benefits will make for a grim presentation, whereas thrash metal over a pastoral video will induce sanity-threatening cognitive dissonance. Sound effects, meanwhile, are cheesy if thrown in liberally and obtrusively, so use your good taste. There’s nothing to stop you cutting the video to match certain sections of music – it’s common practice in TV and film to ensure everything gels.

When you import stereo audio, it may appear on two mono tracks, so pan one hard left and the other hard right to maintain separation. The resolution of the material you’ve bought is less of a concern than with video because you’ve got the technology to downsample it to suit the medium.

Compress and publish
If you can buy 24-bit, 48KHz audio content, that’s great because it’s the highest quality most consumers can enjoy. It’s the resolution of audio for movies on DVD, although the audio may be processed for surround playback using DTS or AC-3 codecs. There are future consumer formats afoot, but for now, DVD or CD-quality audio is fine for home movies.

Your video-editing software will dither the audio mix for output while processing the video, and the specs we gave earlier for podcasts also apply to video content destined for an online audience. If you’re rendering a slideshow or presentation from hard drive or CD, stick with CD-quality audio – mono for voice and stereo if you’ll be including other material.

Finally, it’s as well to mark up the file properly for future reference and the edification of others. Mp3 players, for example, display information (metadata) embedded in the media file, so make sure the info is there – a process called tagging. Audio- or video-editing software should offer access to the properties of the audio, and in the dialogue there’ll be fields to fill out. Boxes such as Title, Author, Year, Copyright, Genre and Comments can be populated and will show up when the recipient examines the file.
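For the curious, this is roughly what a tag looks like under the bonnet. The sketch below builds the old ID3v1 tag, a fixed 128-byte block that sits at the very end of an mp3 file. It is shown only because it is simple; most current software writes the richer ID3v2 format instead, and ID3v1 has no Copyright field. The function name and the example values are ours.

```python
def build_id3v1_tag(title, artist, album="", year="", comment="", genre=255):
    """Return a 128-byte ID3v1 tag. A genre of 255 means 'not set'."""
    def field(text, width):
        # Fixed-width field: cut to size, then pad with zero bytes.
        raw = text.encode("latin-1", errors="replace")
        return raw[:width].ljust(width, b"\x00")

    return (b"TAG" + field(title, 30) + field(artist, 30) + field(album, 30)
            + field(year, 4) + field(comment, 30) + bytes([genre]))

tag = build_id3v1_tag("Holiday slideshow", "A N Author", year="2007",
                      comment="Narration and music")
print(len(tag), tag[:3])

# To tag a file, the block would be appended to the end of the mp3, e.g.:
# with open("podcast.mp3", "ab") as f:   # hypothetical file name
#     f.write(tag)
```

In practice, fill in the fields through your editor's properties dialogue and let the software worry about the byte layout.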

It’s all part of presenting your product in its best possible light and, following our advice, you’ll be able to concoct a production to which you’re proud to put your name – at least, for the soundtrack. As for the video, that’s a whole different ball game.

Sound software
Most video-editing packages have reasonable audio capabilities and, for simple soundtracks, could be all you need. However, if you want more sophistication, you should consider a dedicated audio sequencing package. There are the big audio/Midi music production suites, such as Logic Pro 7.2, Cubase 4 and Pro Tools 7.3, but audio-only podcasts merely require a simple audio recorder/editor, while audio for video can be arranged on a modest software budget.

Excellent sub-£100 editors include Wavelab Essential and Sound Forge Audio Studio 8 for the PC, and Peak LE 5 for the Mac. And if you’ve no budget at all, there’s the free open-source Audacity for both platforms, plus Linux. Multiple tracks of audio, the elements of which must be tightly matched to video, are best handled by software orientated more towards arranging sound clips than Midi data. Those accustomed to Sony’s Vegas video-editing software will find Acid Pro 6’s environment very familiar (this costs less than £300).

Adobe supports Premiere with Audition 2.0 at £287.88, while Mac users with Final Cut Pro can upgrade to the Studio edition, which features Soundtrack Pro. Each package shows a preview of the video footage, making it easy to drop in music clips, sound effects and dialogue bang on cue, apply audio processing and output high-resolution audio files that match the action perfectly.

Mixing it
When preparing an audio mix, check for unwanted noise. While balanced connectors may have been used for the microphone, it’s still possible to find hiss and hum in the background, with 50Hz (or 60Hz in some countries) mains hum being a common problem.

A capable audio editor will have a noise removal filter, by which a section of what’s meant to be silence can be sampled to get a ‘noise print’. The values thus obtained are then effectively subtracted from the whole track, removing noise and leaving the programme material intact.
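The idea can be sketched in a few lines. The example below is a deliberately simplified, single-frame version of the technique usually called spectral subtraction: it measures how strong each frequency is in the noise-only section, subtracts that amount from the same frequency in the recording, and rebuilds the waveform. Real editors work on many overlapping frames and smooth the result; the test signals, frame size and function names here are all assumptions for illustration.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform (slow, but fine for a short frame)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse transform, returning real samples."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def subtract_noise_print(frame, noise_frame):
    """Subtract the noise print's magnitude from each frequency of the frame."""
    noise_print = [abs(c) for c in dft(noise_frame)]
    cleaned = []
    for c, noise_mag in zip(dft(frame), noise_print):
        mag = max(abs(c) - noise_mag, 0.0)             # never go below silence
        cleaned.append(cmath.rect(mag, cmath.phase(c)))  # keep the original phase
    return idft(cleaned)

N = 64
hum = [0.2 * math.sin(2 * math.pi * 2 * t / N) for t in range(N)]    # 'silence'
voice = [0.5 * math.sin(2 * math.pi * 5 * t / N) for t in range(N)]  # wanted tone
noisy = [v + h for v, h in zip(voice, hum)]

cleaned = subtract_noise_print(noisy, hum)
residual = max(abs(c - v) for c, v in zip(cleaned, voice))
print(f"largest remaining error: {residual:.2e}")
```

With such tidy test signals the hum disappears almost completely. Real noise is less obliging, which is why heavy-handed noise removal can leave a watery, metallic edge on a voice.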

Once that’s sorted, you should move on to stereo positioning, if you’re outputting to stereo. Run the vocals straight down the middle (pan centre), unless two or more people are speaking, in which case you could emulate their on-screen positions via the panning controls.
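Behind a pan control sits a pair of gains, one for each speaker. A common approach, assumed here, is the constant-power pan law, which keeps a sound at the same apparent loudness as it moves across the panorama. Your software may use a different law, and the function name is ours.

```python
import math

def pan_gains(position):
    """Left and right gains for a pan position.

    position runs from -1.0 (hard left) through 0.0 (centre) to 1.0 (hard right).
    """
    angle = (position + 1) * math.pi / 4
    return math.cos(angle), math.sin(angle)

for name, position in (("hard left", -1.0), ("centre", 0.0), ("hard right", 1.0)):
    left, right = pan_gains(position)
    print(f"{name}: left {left:.3f}, right {right:.3f}")
```

A centred voice is sent to both speakers at about 71 per cent of full level, which is why it seems to sit between them.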

If you’ve bought royalty-free music, it should be fine as is, with the stereo channels panned hard left and hard right, if necessary. But the backing may overpower the narrative, so reduce its volume slightly when the voice comes in – a technique called ‘ducking’. Also, if certain frequencies of the backing material still interfere with the voice, use equalisation (EQ, a sophisticated, frequency-specific tone control) to knock them back.
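Ducking can be done by hand with a volume envelope, but the logic is simple enough to sketch. The example below works through the music in short blocks and turns it down wherever the voice track is active. The block size, threshold and gain are illustrative assumptions, and a real ducker would fade the level up and down smoothly instead of switching it abruptly.

```python
def duck(music, voice, block=4, threshold=0.05, ducked_gain=0.4):
    """Lower the music wherever the voice track is active, block by block."""
    out = []
    for start in range(0, len(music), block):
        voice_peak = max((abs(v) for v in voice[start:start + block]), default=0.0)
        gain = ducked_gain if voice_peak > threshold else 1.0
        out.extend(m * gain for m in music[start:start + block])
    return out

music = [0.5] * 12                                        # steady backing level
voice = [0.0] * 4 + [0.3, -0.2, 0.0, 0.25] + [0.0] * 4    # speech in the middle
print([round(x, 2) for x in duck(music, voice)])
```

The music drops only for the block in which the narrator speaks, then returns to full level.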

Finally, should the results still sound limp, apply a stereo-linked compression preset to the mix. In crude terms, a compressor squashes down the loud parts so that the quiet parts seem louder. Multiband compression operates on independent ranges of frequencies and is useful for maintaining clarity and balance while giving the whole mix more punch.
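The squashing follows a simple rule. Above a chosen threshold, every extra decibel going in produces only a fraction of a decibel coming out, and make-up gain then lifts the whole signal. The sketch below shows that rule for a basic single-band compressor; the threshold, ratio and make-up figures are arbitrary examples, and it ignores the attack and release timing that real compressors apply.

```python
def compress_db(level_db, threshold_db=-20.0, ratio=4.0, makeup_db=0.0):
    """Output level in dB for a given input level in dB."""
    if level_db > threshold_db:
        # Above the threshold, the excess is divided by the ratio.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db

for level in (-30.0, -20.0, -4.0):
    print(f"in {level:.0f}dB -> out {compress_db(level, makeup_db=6.0):.0f}dB")
```

With these settings a loud peak at -4dB comes out at -10dB while a quiet passage at -30dB rises to -24dB, so the gap between them shrinks from 26dB to 14dB and the mix sounds fuller.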

Copyright matters
If the material you’ve recorded is your own creation, you own the copyright as soon as it’s committed to disk. The problems come when incorporating the work of others into your production.

A slideshow created for personal enjoyment with, say, a backing track from a favourite album shouldn’t be a worry because you can make a copy for personal use. But if you distribute that slideshow, whether for financial gain or not, you could fall foul of copyright law, if the copyright holder notices what you’re doing.

Using royalty-free material from a library music vendor is OK if you’ve paid for it, but licensed content from production libraries will attract a fee, based on how many copies you make.

The body that administers the fees charged for copying music in the UK is the Mechanical-Copyright Protection Society (MCPS) and its website hosts extensive guidance on usage and pricing. The Limited Availability Licensing Agreement seems most applicable to small runs of non-commercial audio-visual works, and you can find the conditions, fees, exceptions and application form online.

Another caveat: if you play your slideshow with music to a room full of people, perhaps at a camera club, you’ll need to clear it with the Performing Right Society, which issues licences for public airings. Again, cost details are available online, or call 0800 068 4828.

www.pcw.co.uk/2174608
This article was printed from the Personal Computer World web site
© Incisive Media Ltd. 2008
Incisive Media Limited, Haymarket House, 28-29 Haymarket, London SW1Y 4RX, is a company registered in the United Kingdom with company registration number 04038503