# WebGL Lip Sync Guide

CrystalLipSync fully supports **WebGL** builds with both real-time audio analysis and pre-baked playback. No plugins, no JavaScript bridges required.

***

### Overview

Unity's `AudioSource.GetSpectrumData()` is **unavailable on WebGL**. CrystalLipSync provides two alternatives:

| Approach      | How It Works                                           | CPU Cost                          | Setup                           |
| ------------- | ------------------------------------------------------ | --------------------------------- | ------------------------------- |
| **Real-time** | `OnAudioFilterRead` + managed C# FFT                   | \~0.1 ms/frame per character      | **None** ... automatic on WebGL |
| **Baked**     | Offline analysis → ScriptableObject → runtime playback | **< 0.01 ms/frame** per character | Bake in editor, assign asset    |

Both approaches work out of the box. You can even mix them: baked data for known voice-over, real-time analysis as a fallback for dynamic audio.

***

### Real-Time Lip Sync on WebGL

#### How It Works

When running on WebGL, `CrystalLipSyncController` automatically:

1. Detects the WebGL platform at startup
2. Attaches a `CrystalLipSyncAudioCapture` component to the AudioSource's GameObject
3. Routes analysis through managed FFT instead of `GetSpectrumData`

**No setup required.** Your existing scene configuration works without changes.
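Conceptually, the platform detection in step 1 is equivalent to Unity's standard runtime check. This is an illustrative sketch, not the controller's actual source:

```csharp
using UnityEngine;

public class WebGLPathCheck : MonoBehaviour
{
    void Start()
    {
        // On WebGL builds, AudioSource.GetSpectrumData is unavailable,
        // so the controller switches to the OnAudioFilterRead capture path.
        if (Application.platform == RuntimePlatform.WebGLPlayer)
        {
            Debug.Log("WebGL detected: using managed FFT capture path");
        }
    }
}
```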

#### Architecture

```
AudioSource (plays audio)
       │
       ▼
CrystalLipSyncAudioCapture          ← OnAudioFilterRead (audio thread)
  • Ring buffer (mono PCM)
  • Blackman-Harris window
  • Managed Cooley-Tukey FFT
  • Magnitude spectrum output
       │  FillBuffers()
       ▼
CrystalLipSyncAnalyzer              ← AnalyzeFromBuffers() (main thread)
  • 6-band energy analysis
  • Spectral centroid
  • Viseme classification
  • Temporal smoothing
       │  VisemeWeights[]
       ▼
Blendshape / Jaw Bone Target
```

**Data flow:**

1. **Audio Thread**: `OnAudioFilterRead` fires roughly every 20 ms with raw PCM samples. The capture component downmixes to mono and writes to a lock-protected ring buffer. Audio passes through unmodified.
2. **Main Thread**: The controller calls `FillBuffers()` which copies samples, applies a Blackman-Harris window, runs a 2N-point Cooley-Tukey FFT, and extracts magnitude bins.
3. **Analysis**: The spectrum feeds into the same pipeline used on desktop ... identical band energies, spectral centroid, and viseme classification.
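For reference, the 4-term Blackman-Harris window applied in step 2 can be sketched as follows. This is an illustration of the standard windowing formula, not the plugin's internal code:

```csharp
using System;

static class BlackmanHarris
{
    // Standard 4-term Blackman-Harris coefficients.
    const float A0 = 0.35875f, A1 = 0.48829f, A2 = 0.14128f, A3 = 0.01168f;

    // Multiplies the sample buffer in place by the window,
    // reducing spectral leakage before the FFT.
    public static void Apply(float[] samples)
    {
        int n = samples.Length;
        for (int i = 0; i < n; i++)
        {
            double phase = 2.0 * Math.PI * i / (n - 1);
            float w = A0
                    - A1 * (float)Math.Cos(phase)
                    + A2 * (float)Math.Cos(2.0 * phase)
                    - A3 * (float)Math.Cos(3.0 * phase);
            samples[i] *= w;
        }
    }
}
```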

#### Testing in the Editor

Enable **Force Audio Capture** in the `CrystalLipSyncController` inspector to use the WebGL path on desktop. This is useful for verifying behavior without building.

```csharp
if (controller.IsUsingAudioCapture)
    Debug.Log("Using OnAudioFilterRead capture path");
```

#### Why Not a JavaScript Bridge (`.jslib`)?

Some plugins use a `.jslib` to tap into the browser's Web Audio API. CrystalLipSync avoids this because:

* **Fragile AudioSource binding** ... Unity doesn't expose which Web Audio nodes correspond to which AudioSource
* **Browser inconsistencies** ... `AnalyserNode` returns dB-scaled magnitudes with different windowing than Unity
* **Maintenance burden** ... raw JavaScript with no type safety or C# debugger support
* **`OnAudioFilterRead` already works** ... the data is already available on the managed side

#### Performance

| Metric                | Value                               |
| --------------------- | ----------------------------------- |
| FFT Size (default)    | 1024 spectrum bins → 2048-point FFT |
| Operations per frame  | \~22,500 multiply-adds              |
| Ring buffer memory    | \~32 KB                             |
| Per-frame allocations | **Zero**                            |
| Overhead              | \~0.1 ms/frame per character        |

***

### Baked Lip Sync on WebGL

#### Why Bake?

Baking is an **optional optimization**; real-time works out of the box. It is the best choice when you want:

* **Even lower CPU cost** ... \~10× cheaper than real-time FFT
* **Deterministic results** ... identical lip sync on every device, every run
* **Scalability** ... dozens of characters speaking with negligible overhead
* **Pre-recorded voice-over** ... if audio is known at build time, why analyze it every frame?

If your audio is dynamic (microphone, procedural, user-uploaded), stick with real-time.

#### Step 1 - Prepare Audio Clips

Set each AudioClip's **Load Type** to `Decompress On Load` or `Compressed In Memory`:

> ⚠️ `Streaming` clips cannot be baked ... the editor needs the full waveform.

#### Step 2 - Bake

Open **Tools → Crystal LipSync → Bake Lip Sync**.

<figure><img src="https://1935854846-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FuciaAIoc6XDGWTpXdIP7%2Fuploads%2FwSN1E1ECsWTU2aIqLaHI%2Fimage.png?alt=media&#x26;token=6226e156-dbc2-47f1-8b3e-869cf123dbb2" alt=""><figcaption></figcaption></figure>

**Single clip:** Drag an AudioClip → click **Bake Single Clip** → choose save location.

**Batch:** Expand **Batch Bake** → add clips → click **Bake All (Batch)** → choose destination folder.

Match your bake settings to your controller (FFT Size, Sensitivity, Threshold, Smoothing) or assign the same **Profile** to both.

#### Step 3 - Scene Setup

Add these to your character:

| # | Component                                           | Purpose                                |
| - | --------------------------------------------------- | -------------------------------------- |
| 1 | `CrystalLipSyncController`                          | Holds `VisemeWeights[]`                |
| 2 | `CrystalBakedLipSync`                               | Reads baked data, writes to controller |
| 3 | `CrystalLipSyncBlendshapeTarget` or `JawBoneTarget` | Drives the mesh/bone                   |
| 4 | `AudioSource`                                       | Plays the voice clip                   |

**Baked-Only Setup (Recommended for WebGL)**

```
CrystalLipSyncController
  └─ Audio Source:  (leave empty)

CrystalBakedLipSync
  ├─ Audio Source:  YourAudioSource
  ├─ Clip Data:     MyClip_LipSync.asset
  └─ Auto Play:     ✅
```

**Hybrid Setup (Baked + Real-Time Fallback)**

```
CrystalLipSyncController
  └─ Audio Source:  YourAudioSource

CrystalBakedLipSync
  ├─ Audio Source:  YourAudioSource
  └─ Clip Data:     MyClip_LipSync.asset
```

When baked data is available it takes priority (via `[DefaultExecutionOrder(100)]`). When no baked data exists, real-time FFT fills in.
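One way to force the real-time path for a specific clip in a hybrid setup is to disable the baked component for that line. This is a hedged sketch under the assumption that a disabled `CrystalBakedLipSync` stops writing to the controller so real-time FFT takes over; `bakedLipSync`, `audioSource`, and `dynamicClip` are references you would hold in your own script:

```csharp
// Disable the baked component so the controller falls back to
// real-time FFT for a clip that has no baked data.
bakedLipSync.enabled = false;
audioSource.clip = dynamicClip;
audioSource.Play();

// Re-enable it later for clips that do have baked data.
bakedLipSync.enabled = true;
```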

#### Step 4 - Play

```csharp
audioSource.clip = myVoiceClip;
audioSource.Play();
// Auto Play detects playback start and begins baked lip sync
```

**Swapping clips dynamically:**

```csharp
bakedLipSync.SetClipData(bakedData[index]);
audioSource.clip = voiceClips[index];
audioSource.Play();
```

#### Dialogue System Integration

Both GC2 and PixelCrushers integrations support baked lip sync automatically via a **Baked Clip Lookup** table (maps AudioClip → baked data). See the Baked Lip Sync guide for setup details.
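The lookup concept can be illustrated with a plain dictionary. This is a hypothetical sketch of what such a table does; `BakedClipLookupSketch` and its members are illustrative names, not the integrations' actual API:

```csharp
using System.Collections.Generic;
using UnityEngine;

// Illustrative AudioClip → baked-data table, similar in spirit to
// the integrations' Baked Clip Lookup.
public class BakedClipLookupSketch : MonoBehaviour
{
    [System.Serializable]
    public struct Entry
    {
        public AudioClip clip;
        public ScriptableObject bakedData; // the *_LipSync.asset
    }

    public Entry[] entries;             // filled in the inspector
    Dictionary<AudioClip, ScriptableObject> table;

    void Awake()
    {
        table = new Dictionary<AudioClip, ScriptableObject>();
        foreach (var e in entries)
            table[e.clip] = e.bakedData;
    }

    // Returns true and the baked data if the clip has been baked.
    public bool TryGetBaked(AudioClip clip, out ScriptableObject data)
        => table.TryGetValue(clip, out data);
}
```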

#### Organizing Baked Assets

```
Assets/
  LipSyncData/
    Characters/
      Hero/
        hero_greeting_LipSync.asset
        hero_farewell_LipSync.asset
      Merchant/
        merchant_shop_LipSync.asset
```

#### Re-Baking

Save to the same path to update in-place ... all scene/prefab references remain intact.

***

### WebGL Build Checklist

1. **File → Build Settings → WebGL** - build as normal
2. No special settings required for lip sync
3. Baked `.asset` files are included automatically (they are referenced by scene components)

#### Browser Audio Requirement

Browsers require **user interaction** before playing audio. Ensure your game has a "Start" or "Click to Play" screen before any AudioSource playback. This is a browser requirement, not a CrystalLipSync limitation.
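A minimal "Click to Play" gate can be built with Unity UI. This sketch uses only standard Unity APIs; the component and field names are illustrative:

```csharp
using UnityEngine;
using UnityEngine.UI;

// Gates audio behind a user click so the browser unlocks its
// audio context. Wire a full-screen Button to this component.
public class ClickToPlayGate : MonoBehaviour
{
    public Button startButton;      // the "Click to Play" UI button
    public AudioSource voiceSource; // first AudioSource to start

    void Awake()
    {
        startButton.onClick.AddListener(() =>
        {
            startButton.gameObject.SetActive(false);
            voiceSource.Play(); // now permitted by the browser
        });
    }
}
```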

***

### Compatibility

| Feature                            |     WebGL Support    |
| ---------------------------------- | :------------------: |
| Audio lip sync (real-time)         |           ✅          |
| Baked lip sync                     |           ✅          |
| Text lip sync                      |   ✅ (no FFT needed)  |
| Microphone lip sync                | ❌ (browser security) |
| Blendshape targets                 |           ✅          |
| Jaw bone targets                   |           ✅          |
| Profiles & moods                   |           ✅          |
| All FFT sizes (256...4096)         |           ✅          |
| GC2 Dialogue integration           |           ✅          |
| PixelCrushers Dialogue integration |           ✅          |

***

### Troubleshooting

#### Mouth doesn't move

| Check                           | Fix                                                        |
| ------------------------------- | ---------------------------------------------------------- |
| AudioSource not assigned        | Assign it to the controller (real-time) or baked component |
| Audio blocked by browser        | Add a user interaction screen before playback              |
| Baked Clip Data missing         | Assign the baked `.asset` to the baked component           |
| Auto Play disabled              | Enable the toggle, or call `Play()` manually               |
| Force Audio Capture not checked | Enable it in the Editor to test the WebGL path             |

#### Timing feels off (baked)

* Adjust **Time Offset** on `CrystalBakedLipSync` (try `-0.05` to `0.05`)
* WebGL audio scheduling can introduce small latency ... a negative offset compensates

#### Lip sync quality differs between Editor and WebGL

The managed FFT produces slightly different magnitudes than Unity's native FFT. Viseme classification uses **relative** band energy ratios, so results should be nearly identical. If you notice differences:

* Adjust **Sensitivity** slightly (±1 to 2)
* Tweak **Volume Threshold** since WebGL audio levels can differ

#### Bake fails or produces silent data

* AudioClip **Load Type** must not be `Streaming`
* Verify the clip contains audio (check waveform in Inspector)
* Very quiet clips may fall below the Volume Threshold ... lower it and re-bake

***

### FAQ

**Q: Do I need to add CrystalLipSyncAudioCapture manually?** A: No. The controller creates and manages it automatically on WebGL.

**Q: Does this work with Addressables / Asset Bundles?** A: Yes. As long as the AudioClip plays through an AudioSource, both real-time and baked paths work.

**Q: Should I use real-time or baked for WebGL?** A: Both work. Real-time requires zero setup. Baking is \~10× cheaper in CPU and gives deterministic results. For voice-over heavy games, baking is recommended.

**Q: Can I use both on the same character?** A: Yes. Baked takes priority when active; real-time fills gaps when no baked data exists.

**Q: What about mobile WebGL?** A: Works on mobile browsers that support Web Audio API (all modern mobile browsers). The same user-interaction requirement applies.
