QUICKSTART

Solid, pops. Just...

Upload some speech

Generate animation -- from demo audio or your speech

It's that simple. When you need the advanced features, read the manual below.

ARTICLES

Basic principles for lip sync

The Hanna-Barbera chart – classic

The Hanna-Barbera chart – modern

MANUAL

1.0 INPUT - upload mp3

1.5 Readymades

2.0 Advanced: Make a custom speaker

2.1 The Hanna/Barbera mouth chart

2.2 Phonemes

2.3 Variables

2.4 Smoothing rules

2.5 Your custom graphics

3.0 OUTPUT - HTML5 and Flash

3.1 Flash: Moving your animation into the flash editor

3.2 HTML5

1.0 INPUT - Upload mp3

Grab your microphone and record a little banter, and you should be good to go.

Speak directly, or upload an MP3 file from your hard disk.

The requirements for uploaded files are

MP3 only

File size < 4 MB

CLEAR SPEECH. If you upload music, or stuff you don't have the rights to, it will be erased

Stereo will be parsed as mono

Your audio should consist of one CLEAR SPEAKER with silence between words.

The parser is sensitive to background noise, such as crowd walla or hiss, and may add unwanted mouth movements.
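The file requirements above can be checked on your own machine before you upload. This is only a rough sketch: the function name check_upload is made up for illustration, and it assumes that the size limit means 4 megabytes counted as 4 * 1024 * 1024 bytes. It cannot tell whether the speech is clear.

```python
from pathlib import Path

# Assumption: the 4 MB limit is counted in binary megabytes.
MAX_BYTES = 4 * 1024 * 1024


def check_upload(path):
    """Return a list of problems; an empty list means the basic checks pass."""
    p = Path(path)
    problems = []
    if p.suffix.lower() != ".mp3":
        problems.append("not an .mp3 file")
    if not p.is_file():
        problems.append("file not found")
    elif p.stat().st_size >= MAX_BYTES:
        problems.append("file is 4 MB or larger")
    return problems
```

Call it as check_upload("my_speech.mp3") and fix whatever it reports before uploading.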

1.5 Readymades - create flash animation

Speaker
Select the radio button by the speaker you wish to use.

All the readymades have transparent backgrounds – both the full heads, and the mouth readymades. In your flash editor environment, they'll blend beautifully with background content.

When you create new speakers on the 'advanced' page, they will appear with a blue border here.

MP3 speech file
Select your uploaded audio, or a demo file. The demo files are short and to the point, consisting of male and female speakers, and cartoony speech.

When you click "Create animation", the engine swings into action and makes a new flash file.

The flash file will be viewable from 'Your Page' for as long as it resides on the server.

2.0 Advanced: Make a custom speaker

A small tweak to the framerate – or a complete reworking of the speaker. This is the page where the advanced user can control the animation in detail.

This section covers:

The Hanna/Barbera mouth chart

Phonemes

Variables

Smoothing rules

Your custom graphics

2.1 The Hanna/Barbera mouth chart

To make good use of your animation, you must know about the mouth chart.

Most animated features, from The Flintstones to Futurama, are built on the same mouth chart. It's a fantastic time saver, invented by the Hanna/Barbera animators in the early days of TV animation.

On this chart, you have the basic mouth positions A, B, C, D, E, F, G, H.

These 8 basic shapes are all you need. Look for them, and you'll spot them in most animated series –

Dexter's Lab

South Park

The Simpsons

Wallace & Gromit

etc.

Each image is called a VISEME from now on (a viseme is the visual equivalent of a PHONEME, a single speech sound).

The letter does not indicate the sound the image represents; it is only a reference code.

A
SILENCE is mouth A.
Mouth A is also used for the closed-mouth consonants: M, B and P.

B
Mouth B is used mainly for the clenched-teeth consonants: N, D, G, K

C
Mouth C opens wider for I and E

D
Mouth D opens the widest of them all for the A sounds: hut, ate, hide

E
This mouth is for the "AW" sound, as in cow. It differs from the pout by lowering the mouth edges and jutting the chin.

F
Mouth F pouts for the "OH" sound, as in ought, part, and oh!

G
Mouth G - Upper teeth bite the lower lip for "F" and "V" sounds.

H
Mouth H - The "FL" sound in "flag" lifts the tongue up under the upper teeth. This viseme is found in all cartoons: Even the crude animation in South Park shows the tongue for "FL" and "L".

Here's a tip: If your first 4 visemes - A to D - animate smoothly, you're doing well. Remember how a basic ventriloquist dummy (or Muppet Show muppet) gives a good illusion of speaking, just by flapping its mouth open and closed. If your first 4 visemes flow well, you've already got the heavy lifting of your animation sorted out.

I use the rest of the visemes past H for smoothing – it is useful to define a few "inbetweens" to smooth the visual transition from, for example, a wide open mouth (D) to a small pout (F).

2.2 Phonemes

The lipsync engine recognizes the minute individual sounds (phonemes) that make up human speech, in any language.

I've so far seen good animation from: English, Danish, Norwegian, Hebrew, German, Dutch.

This application recognizes 40 different phonemes. Many, like "P", are self-explanatory, but some are best explained with examples of the words they contribute to.

DRAG the phonemes to the viseme you wish to associate them with.

Phoneme overview

AA odd
AE at
AH hut
AO ought
AW cow
AY hide
b be
CH cheese
d dee
DH thee
EH Ed
ER hurt
EY ate
f fee
g green
h he
IH it
IY eat
j gee
k key
l lee
m me
n knee
NG ping
OW oat
OY toy
p pee
r read
s sea
SH she
t tea
TH theta
UH hood
UW two
v vee
w we
y yield
z zee
ZH seizure

x silence
blink blink animation
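As a rough illustration of what dragging phonemes onto visemes amounts to, here is a small sketch of a lookup table built from the mouth chart descriptions in 2.1. The table name, the choice of vowels for mouth C and mouth F, and the fallback viseme are my own assumptions, not the readymade settings.

```python
# Hypothetical phoneme-to-viseme table, following the mouth chart in 2.1.
VISEME_FOR = {
    "x": "A", "m": "A", "b": "A", "p": "A",   # silence and closed-mouth consonants
    "n": "B", "d": "B", "g": "B", "k": "B",   # clenched-teeth consonants
    "IH": "C", "IY": "C", "EH": "C",          # assumption: the "I and E" sounds
    "AH": "D", "EY": "D", "AY": "D",          # hut, ate, hide
    "AW": "E",                                # cow
    "AO": "F", "OW": "F",                     # ought; assumption: oat counts as "OH"
    "f": "G", "v": "G",                       # upper teeth on lower lip
    "l": "H",                                 # tongue up
}

# Assumption: anything not listed falls back to the clenched-teeth mouth.
DEFAULT_VISEME = "B"


def to_visemes(phonemes):
    """Translate a sequence of phoneme codes into viseme letters."""
    return [VISEME_FOR.get(p, DEFAULT_VISEME) for p in phonemes]
```

With this table, to_visemes(["h", "EH", "l", "OW"]) gives ["B", "C", "H", "F"].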

2.3 Variables

These variables create the look of your animation. You can leave most of them alone, setting only the name and "Frames Per Second", or you can tweak...

Speaker name
Your new speaker. This name will identify your new speaker from the 'Readymades' and from 'Your Page'.

Frames per second
Set this between 10 and 35, and remember the value. When you import your animation into the flash editor, your container movie must use the same frames per second value, or the animation will drift out of sync with the sound.

Smooth
Use the smoothing rules in a smoothing pass, which softens, or "inbetweens", visual transitions.

Flow
Runs a clean-up pass that removes single-frame visemes. This gets rid of jitters at high Frames Per Second values, but can make the animation look dull at lower speeds.

Add audio to .swf file
Embeds your uploaded MP3 file into the movie. This will allow you to accurately preview your animation, and is turned on by default in all the readymades.
Embedded audio is for some reason NOT imported into the flash editor environment: Here, you need to find your MP3 again, and import it to the library.

Eyeblinks
Turns on blinking, if you are animating a speaker with a complete face. Drag the 'blink' phoneme to the viseme where the eyes are closed.

Blink frequency
The character will blink between words only, and pauses at least 2 seconds between blinks.
The frequency is determined by the rate you set. 4 is good. Over 8 looks very nervous.
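To give an idea of what the "Flow" clean-up does, here is a minimal sketch. It assumes the animation is a list with one viseme letter per frame, and that a single-frame viseme is replaced by the viseme just before it. How the engine really fills the gap is not documented here, so treat that part as a guess.

```python
def flow_pass(frames):
    """Replace interior visemes that last exactly one frame.

    Assumption: the lone frame takes over the viseme that precedes it.
    The first and last frames are left untouched.
    """
    out = list(frames)
    for i in range(1, len(frames) - 1):
        lasts_one_frame = frames[i] != frames[i - 1] and frames[i] != frames[i + 1]
        if lasts_one_frame:
            out[i] = out[i - 1]
    return out
```

For example, "".join(flow_pass("AADBBB")) gives "AAABBB": the lone D frame is gone, which is why the pass can look dull at low frame rates where one frame is a long time.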

2.4 Smoothing rules

The smoothing rules are a powerful tool that makes your animation flow.

They are quite easy to edit, even though they may look strange at first.

The syntax is "VISEME BEFORE - INBETWEEN - VISEME AFTER"

Example rule: ABD
This rule is used by all the readymades.
It tells the engine to place the "B" viseme between occurrences of "A" and "D"

The engine adds the appropriate middle-position, whenever it detects a sharp transition between a closed mouth "A" and a wide open mouth "D".

Smoothing rules run both ways - you do not need to define both ABD and DBA

You can add visemes past "H" that have no phonemes, but are used by the smoothing rules.
Look at the readymade "Springfield Generic", it has visemes "J" and "K" to soften the dramatic "OH" and "AH" sounds.
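The rule syntax can be sketched in a few lines. This is not the engine's code: it assumes one viseme letter per frame, and it assumes the inbetween overwrites the first frame of the incoming viseme (only when that viseme lasts at least two frames) so the animation keeps its length. The function names are made up for illustration.

```python
def parse_rules(rules):
    """Turn rules like "ABD" into a lookup that works in both directions."""
    table = {}
    for before, between, after in rules:
        table[(before, after)] = between
        table[(after, before)] = between  # rules run both ways
    return table


def smooth(frames, rules):
    """Place the inbetween viseme at each sharp transition named by a rule."""
    table = parse_rules(rules)
    out = list(frames)
    for i in range(1, len(frames) - 1):
        pair = (frames[i - 1], frames[i])
        # Assumption: only overwrite when the incoming viseme lasts at least
        # two frames, so it is still shown after the inbetween.
        if pair in table and frames[i + 1] == frames[i]:
            out[i] = table[pair]
    return out
```

With the rule ABD, "".join(smooth("AADDD", ["ABD"])) gives "AABDD", and the reverse transition "DDDAA" becomes "DDDBA" without a separate DBA rule.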

2.5 Your custom graphics

This feature is being built.

3.0 OUTPUT - HTML5 and Flash

When you see the "Success" box, both your HTML5 animation and your flash movie have been created.

Download, or preview them directly on the server.

The flash movie is a .swf file which contains animation and sound:

1 bitmap graphic and 1 "graphic" instance containing the bitmap, per viseme – e.g. if you use visemes A-E, your library will contain bitmaps 1-5 and graphics 1-5

Each master instance mentioned above can easily be re-linked or replaced with your own material. The change to one master viseme will reflect globally throughout the whole movie

'Your Page' gives you quick access to all the files you generate, for as long as the content resides on the server.

A single site cookie provides your logged-on state, so don't clear cookies :-)

3.1 Flash: Moving your animation into the flash editor

Steps (basic):

Open flash editor

Create a new movie

Set the frame-rate to match the framerate you encoded your lipsync at

Import the .swf file with your lipsync to the stage

Create a new layer for the audio

Import your audio file to the library (The editor doesn't import the audio embedded in the .swf!)

Drag audio from library onto the central window – the waveform will now be painted into your audio layer

Fast preview your imported movie and audio by hitting [enter] (not [return])

Steps (advanced):

Open flash editor

Create a new movie

Set the frame-rate to match the framerate you encoded your lipsync at

Create a new GRAPHIC (ctrl + f8)
You now see a new timeline: This is your container graphic, which will hold the lipsync animation
Import the .swf file with your lipsync to the stage

Create a new layer for the audio

Import your audio file to the library

Drag audio from library onto the central window – the waveform will now be painted into your audio layer

(NB: This is a curiosity with flash: you will probably want audio in your container graphic so that you can hear it while editing. On the main movie stage, however, the graphic makes no sound, so the main movie must also contain the audio.)

Fast preview your container graphic by hitting [enter] - or by scrubbing the mouse over the timeline

Return to the main movie by clicking "Main movie" at the left, above the timeline

Drag your graphic from the library to the stage

Create a new layer in the main movie and import your audio. Fill out the timeline with blank frames (press F5) for the duration of the audio
You now have a very editable graphic, which contains the lipsync animation that runs in sync with the main timeline. Stretch it, rotate it, move it, mirror it, the lipsync keeps playing
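If you wonder how many frames the timeline needs for the audio, or how badly things slip when the frame rates do not match, the arithmetic can be sketched like this. The function names are hypothetical; the 10 to 35 range is the one given for "Frames per second" in 2.3.

```python
import math

MIN_FPS, MAX_FPS = 10, 35  # range given in the Variables section


def frames_needed(audio_seconds, fps):
    """Number of timeline frames needed to cover the whole audio clip."""
    if not MIN_FPS <= fps <= MAX_FPS:
        raise ValueError("fps should be between 10 and 35")
    return math.ceil(audio_seconds * fps)


def drift_seconds(frame_count, encoded_fps, movie_fps):
    """How late (positive) or early (negative) the last frame plays
    when the container movie runs at a different rate than the lipsync."""
    return frame_count / movie_fps - frame_count / encoded_fps
```

A 12.5 second clip encoded at 24 frames per second needs frames_needed(12.5, 24) = 300 frames. Played in a movie set to 12 frames per second, drift_seconds(300, 24, 12) is 12.5, so the mouth finishes 12.5 seconds after the sound.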

3.2 HTML5

Press the 'download' button next to the HTML5 logo. You will get a zip archive with your animation. Run index.htm to view.

The audio files (MP3 and OGG) reside in the 'parsed' directory. Your animation data resides in 'index.htm'.

Supported
  Mac: Chrome, Safari
  iOS: 6+ iPad / iPhone
  Win: Chrome

Firefox and IE: The .ogg files generated should run easily on Firefox, but I've given up for the moment. Be my guest.

The SoundJS library abstracts away most of the cross-browser flakiness, but HTML5 audio is still a bit brittle.

WebAudio is very exciting, as it allows detailed script interaction with audio streams. But, as mentioned, browser support is still somewhat brittle as of early 2013.

Caveat
Security restrictions regarding audio in the browser may require you to run the files from a web server, as opposed to straight from the desktop, or other local folder.
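One way to get a local web server, if you have Python 3.7 or later installed, is the standard library's http.server module. The sketch below is only a suggestion; the folder name "my_animation" stands for wherever you unzipped the archive, and the port number is arbitrary.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory, port=8000):
    """Build a server that hands out the files in `directory` on this machine only."""
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until you press ctrl + c):
#   make_server("my_animation").serve_forever()
# Then open http://127.0.0.1:8000/index.htm in the browser.
```

Binding to 127.0.0.1 keeps the files private to your own computer, which is all the browser needs to lift the local-file restrictions.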

© SW 2025 | Contact
