Baked (Pre-Baked) Lip Sync

Pre-bake your audio clips' lip sync data offline in the editor, then play it back at runtime with zero FFT cost. Ideal for WebGL, mobile, deterministic playback, or performance-critical scenes.


Overview

The baked lip sync workflow has three parts:

  1. Bake - An editor window analyzes your AudioClip offline using the same spectral analysis as real-time lip sync, and saves the per-frame viseme weights into a ScriptableObject asset.

  2. Store - The CrystalLipSyncClipData ScriptableObject holds all baked data: viseme weights, dominant visemes, and RMS volumes at a configurable frame rate.

  3. Play - The CrystalBakedLipSync runtime component reads the baked data synchronized with an AudioSource and writes viseme weights to the controller ... no FFT at runtime.


When To Use Baked Lip Sync

| Scenario | Recommendation |
| --- | --- |
| WebGL build | ✅ Best option (no WebGL FFT quirks) |
| Mobile (battery/thermal) | ✅ Zero CPU cost per character |
| Many characters speaking | ✅ Scales to dozens without frame drops |
| Cutscenes / pre-recorded VO | ✅ Deterministic results every time |
| Dynamic / user-generated audio | ❌ Use real-time lip sync instead |
| Microphone input | ❌ Use real-time lip sync |
| Text lip sync (no audio) | ❌ Use CrystalTextLipSync |


Baking Audio Clips

Open the Bake Window

Tools → Crystal LipSync → Bake Lip Sync

Single Clip

  1. Drag an AudioClip into the Clip field

  2. (Optional) Assign a Profile to auto-fill sensitivity/threshold/smoothing

  3. Adjust analysis settings if needed:

    • FFT Size ... Higher = better frequency resolution (1024 recommended)

    • Sensitivity ... Match your controller's sensitivity setting

    • Volume Threshold ... Match your controller's threshold

    • Smoothing Attack / Release ... Match your controller's smoothing

  4. Set the Frame Rate (60 fps default ... 30 fps saves memory, 60 fps is smoother)

  5. Click Bake Single Clip

  6. Choose a save location ... a .asset file is created

Batch Bake

  1. Expand Batch Bake in the bake window

  2. Add multiple clips via + Add Clip or drag-and-drop

  3. Click Bake All (Batch)

  4. Choose a destination folder ... one .asset per clip

Matching Settings to Your Controller

For best results, use the same settings in the bake window as on your CrystalLipSyncController:

| Bake Window | Controller Inspector |
| --- | --- |
| FFT Size | FFT Size |
| Sensitivity | Sensitivity |
| Volume Threshold | Volume Threshold |
| Smoothing Attack | Smoothing Attack |
| Smoothing Release | Smoothing Release |

Or assign the same Profile asset to both.


Runtime Setup

Components Required

| Component | GameObject | Purpose |
| --- | --- | --- |
| CrystalLipSyncController | Character | Holds VisemeWeights[], fires events |
| CrystalBakedLipSync | Character | Reads baked data, writes to controller |
| CrystalLipSyncBlendshapeTarget or CrystalLipSyncJawBoneTarget | Character | Reads VisemeWeights[], drives mesh/bone |
| AudioSource | Character (or shared) | Plays the voice audio |

Basic Setup

  1. Add CrystalBakedLipSync to the same GameObject as your CrystalLipSyncController

  2. Assign the baked Clip Data asset

  3. (Optional) Assign the Audio Source ... if left empty, uses the controller's AudioSource

  4. Play the AudioClip on the AudioSource ... lip sync starts automatically
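The steps above can also be wired from script. A minimal sketch, assuming the components already sit on the character as described; the `ClipData` property name is an assumption mirroring the Clip Data inspector field.

```csharp
using UnityEngine;

// Sketch of the basic setup done from code instead of the inspector.
public class BakedSetupExample : MonoBehaviour
{
    [SerializeField] private CrystalLipSyncClipData bakedData; // the baked .asset
    [SerializeField] private AudioClip voiceClip;

    private void Start()
    {
        var baked  = GetComponent<CrystalBakedLipSync>();
        var source = GetComponent<AudioSource>();

        baked.ClipData = bakedData; // assumed property mirroring "Clip Data"
        source.clip = voiceClip;
        source.Play();              // with Auto Play on, lip sync starts automatically
    }
}
```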

Inspector Settings

| Setting | Default | Description |
| --- | --- | --- |
| Controller | Auto-detect | The controller to write weights into |
| Audio Source | Controller's source | AudioSource to sync timing with |
| Clip Data | (required) | The baked CrystalLipSyncClipData asset |
| Auto Play | Enabled | Start/stop with the AudioSource automatically |
| Time Offset | 0 | Shift lip sync timing (positive = later, negative = earlier) |
| Intensity | 1.0 | Weight multiplier for all baked visemes |
| Additional Smoothing | 0 | Extra smoothing on top of baked data (0 = raw) |


Scripting API

Swap Clips at Runtime
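A hedged sketch of swapping to different baked data mid-scene. The settable `ClipData` property is an assumption based on the inspector field of the same name; it is not a confirmed signature.

```csharp
using UnityEngine;

public class SwapBakedClipExample : MonoBehaviour
{
    [SerializeField] private CrystalBakedLipSync baked;
    [SerializeField] private AudioSource voiceSource;

    // Hypothetical helper: point the component at new baked data, then
    // play the matching AudioClip so timing stays in sync.
    public void Speak(AudioClip clip, CrystalLipSyncClipData clipData)
    {
        baked.ClipData = clipData; // assumed property name
        voiceSource.clip = clip;
        voiceSource.Play();        // Auto Play picks this up
    }
}
```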

Manual Play/Stop
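With Auto Play disabled you can drive playback yourself. `Play()` is referenced in Troubleshooting below; `Stop()` is an assumed counterpart, not a confirmed method.

```csharp
using UnityEngine;

public class ManualBakedPlayback : MonoBehaviour
{
    [SerializeField] private CrystalBakedLipSync baked;

    public void BeginLine()  => baked.Play(); // start reading baked frames
    public void CancelLine() => baked.Stop(); // assumed counterpart to Play()
}
```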

Events
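Baked playback writes into the controller, so you subscribe on the controller as usual. `OnVisemeWeightsUpdated` is named in the FAQ below; the no-argument delegate shape shown here is an assumption.

```csharp
using UnityEngine;

public class VisemeListener : MonoBehaviour
{
    [SerializeField] private CrystalLipSyncController controller;

    private void OnEnable()  => controller.OnVisemeWeightsUpdated += OnWeights;
    private void OnDisable() => controller.OnVisemeWeightsUpdated -= OnWeights;

    private void OnWeights()
    {
        // Read whatever the current writer (baked or real-time) produced.
        float first = controller.VisemeWeights[0]; // index meaning depends on your viseme set
    }
}
```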

Querying Baked Data Directly
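The asset stores per-frame viseme weights, a dominant viseme, and an RMS volume at the baked frame rate. The sampling accessor below (`SampleVolume`) is a hypothetical name used to illustrate reading that data; only `HasData` appears elsewhere in this page.

```csharp
using UnityEngine;

public class BakedDataQuery : MonoBehaviour
{
    [SerializeField] private CrystalLipSyncClipData data;
    [SerializeField] private AudioSource voiceSource;

    private void Update()
    {
        if (data == null || !data.HasData) return; // HasData is shown in the inspector

        float t = voiceSource.time;
        // Hypothetical accessor: read the baked RMS volume at time t,
        // e.g. to drive a VU meter alongside the mouth.
        // float rms = data.SampleVolume(t);
    }
}
```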


Priority & Coexistence

CrystalBakedLipSync takes priority over real-time FFT when actively playing:

  • Baked overrides real-time. When baked playback is active, it writes to VisemeWeights[] after the controller's real-time analysis (via [DefaultExecutionOrder(100)]), effectively overriding whatever the controller wrote. This is intentional ... baked data represents the same spectral analysis, just pre-computed.

  • Real-time fills gaps. When baked playback is NOT active, the controller's real-time FFT drives the weights as usual.

  • Text lip sync yields. CrystalTextLipSync only writes when IsActive is false (no audio playing). This is unchanged.

Tip: You can freely have both CrystalBakedLipSync and real-time analysis on the same controller. When baked is playing, it wins. When baked stops, real-time takes over seamlessly.

For WebGL, you have three options:

| Approach | Pros | Cons |
| --- | --- | --- |
| Real-time | No baking needed, works with dynamic audio | Managed FFT cost per frame |
| Baked only | Zero CPU cost, deterministic | Requires baking step |
| Hybrid (both) | Best of both worlds | Slightly more setup |

Note: Real-time lip sync works on WebGL out of the box ... the controller automatically uses OnAudioFilterRead + managed FFT. Baking is an optional optimization, not a requirement.

For a baked-only setup, do NOT assign the AudioSource to the Controller ... assign it only to CrystalBakedLipSync:
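A sketch of that baked-only wiring; the `AudioSource` property name on `CrystalBakedLipSync` is assumed from the inspector field of the same name.

```csharp
using UnityEngine;

public class BakedOnlyWiring : MonoBehaviour
{
    private void Awake()
    {
        var baked  = GetComponent<CrystalBakedLipSync>();
        var source = GetComponent<AudioSource>();

        // Give the source to the baked component only; the controller's
        // AudioSource stays unassigned, so its real-time FFT never runs.
        baked.AudioSource = source; // assumed property mirroring the inspector field
    }
}
```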


Dialogue System Integration

Baked lip sync works automatically with Game Creator 2 Dialogue and PixelCrushers Dialogue System ... no per-line scripting needed. The key is the Baked Clip Lookup table.

How It Works

When you bake an AudioClip, you get a CrystalLipSyncClipData asset. The Baked Clip Lookup (CrystalBakedClipLookup) maps each AudioClip to its baked data.

At runtime, when a dialogue line plays an AudioClip, the bridge component looks it up in the table. If a match exists, it automatically assigns the baked data and starts playback ... the character's mouth moves using the pre-baked viseme weights for that specific clip.

Any baked clip works automatically. As long as the AudioClip is registered in the lookup table, switching between different dialogue lines, different speakers, or different conversations all works without any extra setup per line.
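Conceptually, the bridge performs something like the following for each line; `TryGet` is a hypothetical method name standing in for whatever lookup call `CrystalBakedClipLookup` actually exposes.

```csharp
using UnityEngine;

public static class BakedLookupExample
{
    // Hypothetical resolution step the bridge performs per dialogue line.
    public static CrystalLipSyncClipData Resolve(CrystalBakedClipLookup lookup, AudioClip clip)
    {
        if (lookup.TryGet(clip, out CrystalLipSyncClipData data)) // assumed API
            return data;  // match: baked playback drives the mouth
        return null;      // no match: the bridge skips and real-time FFT takes over
    }
}
```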

Setup - Game Creator 2

  1. Bake all your voice-over AudioClips (use Batch Bake with Auto-Populate Lookup enabled)

  2. Create a Baked Clip Lookup asset: Assets → Create → Crystal LipSync → Baked Clip Lookup

  3. Register each AudioClip ↔ ClipData pair (automatic if Auto-Populate is configured in the Bake Window)

  4. Assign the lookup to the CrystalDialogueLipSync component's Baked Clip Lookup field

  5. Ensure each speaking character has a CrystalBakedLipSync component

If a clip has no baked data in the lookup, the bridge skips silently and the controller's real-time FFT handles it instead.

Setup - PixelCrushers Dialogue System

  1. Bake all your voice-over AudioClips

  2. Create and populate a Baked Clip Lookup asset

  3. Assign the lookup to the CrystalDialogueSystemLipSync component's Baked Clip Lookup field

  4. Ensure each speaking character has a CrystalBakedLipSync component

Mixing Baked and Non-Baked Lines

You don't need to bake every clip. Within the same conversation:

  • Lines with baked data → baked lip sync drives the mouth

  • Lines without baked data → real-time FFT analysis takes over

  • Lines without audio → text lip sync kicks in (if configured)

All three modes can coexist on the same character seamlessly.


Memory & Performance

Baked Asset Size

| Frame Rate | Clip Length | Frames | Weight Data | Total Asset |
| --- | --- | --- | --- | --- |
| 30 fps | 5s | 150 | 9 KB | ~10 KB |
| 60 fps | 5s | 300 | 18 KB | ~19 KB |
| 60 fps | 30s | 1800 | 108 KB | ~112 KB |
| 60 fps | 60s | 3600 | 216 KB | ~224 KB |

Each frame stores 15 floats (weights) + 1 int (dominant) + 1 float (volume) = 68 bytes.
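The per-frame figure and the Weight Data column follow directly from that layout:

```csharp
// 15 weight floats + 1 dominant-viseme int + 1 RMS float per frame.
const int bytesPerFrame = 15 * sizeof(float) + sizeof(int) + sizeof(float); // = 68

// Example: the 60 fps / 30 s row of the table above.
int frames      = 60 * 30;                     // 1800 frames
int weightBytes = frames * 15 * sizeof(float); // 108,000 B ≈ 108 KB
```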

Runtime Cost

| Operation | Cost |
| --- | --- |
| Sample weights (lerp) | ~15 multiplies + 15 adds |
| Apply to controller | ~15 copies |
| Total per frame | < 0.01 ms |


Bake Window Preview

After baking, the editor window shows a visual timeline of the baked data:

  • Height = RMS volume

  • Color = dominant viseme (hue-mapped)

  • Width = clip duration

This gives you a quick visual sanity check before testing in-game.


Troubleshooting

Lips don't move

  • Verify the Clip Data asset has data (HasData checkbox in inspector)

  • Check that Auto Play is enabled, or call Play() manually

  • Ensure the controller's IsActive is false (no real-time FFT overriding baked data)

Timing is off

  • Adjust Time Offset (positive = later, negative = earlier)

  • Ensure the AudioSource and baked data reference the same AudioClip

Baked data looks wrong

  • Re-bake with settings matching your controller's (especially Sensitivity and Smoothing)

  • Try a higher Frame Rate (60+) for faster speech

"No data" in baked asset

  • The AudioClip may be compressed with Load Type: Streaming. Change to Decompress On Load or Compressed In Memory and re-bake.


FAQ

Q: Can I bake clips at edit time and play them on WebGL? A: Yes ... that's the primary use case. Bake in the editor, ship the .asset files with your build.

Q: Do I still need CrystalLipSyncController? A: Yes. The controller holds VisemeWeights[] and fires OnVisemeWeightsUpdated. Targets read from the controller.

Q: Can I use baked lip sync and text lip sync together? A: They both write to controller.VisemeWeights[] when IsActive is false. The last writer wins. Avoid running both simultaneously on the same controller.

Q: Can I re-bake a clip without losing the reference? A: Yes. If you save to the same path, the existing asset is updated in-place. All references remain valid.

Q: What frame rate should I use? A: 60 fps is recommended for smooth results. 30 fps is fine for stylized characters. Below 30 fps may look choppy.
