Talking Head 🗣️

AI-generated podcast-style audio from turn-based conversations in WordPress.

Description

Talking Head lets you write multi-speaker conversations in the WordPress block editor, then generate podcast-quality audio using AI text-to-speech. Each speaker (“head”) gets their own voice, and the plugin stitches the segments together into a single audio file with configurable silence gaps — or serves segments individually using virtual stitching for faster publishing.

Features

Episode editor — Gutenberg blocks for writing turn-based conversations
Speaker profiles — Custom post type for managing voices and personas
OpenAI TTS — Generate speech using OpenAI’s text-to-speech API (alloy, echo, fable, onyx, nova, shimmer)
Azure OpenAI TTS — Alternative provider using Azure-hosted OpenAI deployments
WordPress AI (Core) — On WordPress 7.0+, use the built-in AI Client for TTS via Settings → Connectors (no API key required)
Background processing — Audio generation runs via Action Scheduler, with progress tracking
Audio stitching — FFmpeg-based concatenation with silence gaps and loudness normalization, or pure PHP fallback
Virtual stitching — Serve audio segments individually without server-side concatenation, with client-side sequential playback
Player block — Embed episode playback in any post or page, with optional transcript
Provider selector — Settings page dropdown to switch between providers; only relevant fields are shown
Provider interface — Extensible architecture for adding more TTS providers
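
As a rough illustration of how silence gaps fit between turns, the sketch below computes where each segment would start on a shared timeline. It is not the plugin's code (which is PHP and JavaScript); the `Segment` type, its fields, and the function names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    url: str          # hypothetical location of one generated turn
    duration_ms: int  # length of that turn's audio in milliseconds


def build_timeline(segments: list[Segment], gap_ms: int) -> list[tuple[int, str]]:
    """Return (start_ms, url) pairs with gap_ms of silence between turns."""
    timeline = []
    cursor = 0
    for index, segment in enumerate(segments):
        if index > 0:
            cursor += gap_ms  # the gap sits between turns, not before the first
        timeline.append((cursor, segment.url))
        cursor += segment.duration_ms
    return timeline


def total_duration_ms(segments: list[Segment], gap_ms: int) -> int:
    """Total playback length: all segment durations plus the gaps between them."""
    if not segments:
        return 0
    return sum(s.duration_ms for s in segments) + gap_ms * (len(segments) - 1)
```

In file mode the same offsets would be baked into one stitched file; in virtual mode a client could use them to play the segments back to back.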

Requirements

WordPress 6.8+
PHP 8.3+
FFmpeg on the server (optional): lets users download the stitched conversation; a pure PHP fallback is available when FFmpeg is not installed.

Installation

Download the latest talking-head.zip.
In WordPress, go to Plugins → Add New → Upload Plugin and upload the zip.
Activate the plugin.

The plugin updates itself automatically via GitHub releases using plugin-update-checker.

Configuration

Go to Talking Head > Settings to configure the plugin. The settings page has three tabs:

Tab	Setting	Description
Provider	TTS Provider	`OpenAI`, `Azure OpenAI`, or `WordPress AI (Core)` (WP 7.0+)
	Default Voice	Default voice for new speaker profiles
	OpenAI API Key	Your OpenAI API key for TTS
	TTS Model	`tts-1` (standard), `tts-1-hd` (high quality), or `gpt-4o-mini-tts` (supports instructions)
	Azure OpenAI API Key	Your Azure OpenAI API key
	Azure OpenAI Endpoint	Azure resource endpoint URL
	Azure OpenAI Deployment ID	Name of your TTS deployment
	Azure OpenAI API Version	API version string
Audio	Stitching Mode	File (concatenate on server) or Virtual (serve segments individually)
	FFmpeg Path	Absolute path to the FFmpeg binary (optional — PHP fallback if not found)
	Output Format	MP3 or AAC
	Output Bitrate	128k / 192k / 256k / 320k
	Silence Gap	Milliseconds of silence between turns
Limits	Max Segments	Maximum turns per episode (1–200)
	Max Characters	Maximum text length per turn (100–4096)
	Rate Limit	API requests per minute (1–60)
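
The ranges in the Limits tab can be read as simple bounds. The sketch below is a minimal illustration, assuming out-of-range values are clamped (the plugin may reject them instead) and that the rate limit is enforced by spacing requests evenly. The setting keys and function names are hypothetical.

```python
# Bounds taken from the Limits rows of the table; the keys are hypothetical.
LIMITS = {
    "max_segments": (1, 200),
    "max_characters": (100, 4096),
    "rate_limit": (1, 60),
}


def clamp_setting(name: str, value: int) -> int:
    """Force a value into the documented range for the given setting."""
    low, high = LIMITS[name]
    return max(low, min(high, value))


def seconds_between_requests(rate_limit_per_minute: int) -> float:
    """Even spacing between API calls that respects a per-minute limit."""
    rate = clamp_setting("rate_limit", rate_limit_per_minute)
    return 60.0 / rate
```

For example, a rate limit of 30 requests per minute works out to one request every 2 seconds.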

Settings can also be set via constants in wp-config.php (highest priority) or environment variables. See CONFIG.md for the full list of 16 constants.
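
A layered lookup like this can be sketched as follows. Only the top priority of wp-config.php constants comes from the text above; the order of environment variables relative to saved settings is an assumption, and `resolve_setting` and the `TH_FFMPEG_PATH` name in the usage note are hypothetical (see CONFIG.md for the real constant names).

```python
import os


def resolve_setting(name, constants, options, default=None, environ=None):
    """Look a setting up in priority order and return the first match.

    Assumed order: constant, then environment variable, then saved option,
    then default. Only "constants win" is stated in the README.
    """
    env = os.environ if environ is None else environ
    if name in constants:
        return constants[name]  # wp-config.php constant: highest priority
    if name in env:
        return env[name]        # environment variable (assumed second)
    if name in options:
        return options[name]    # value saved on the settings page
    return default
```

With this sketch, `resolve_setting("TH_FFMPEG_PATH", constants={}, options={"TH_FFMPEG_PATH": "/usr/bin/ffmpeg"}, environ={})` falls through to the saved option.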

Usage

1. Create Speaker Profiles

Go to Talking Head > Heads and create speaker profiles. Each head has:

A name
A voice ID (e.g., nova, onyx)
A provider (openai, azure_openai, or wordpress on WP 7.0+)
Speed (0.25–4.0, default 1.0)
Optional speaking style/instructions (used with gpt-4o-mini-tts)
Optional avatar (featured image)
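
The fields above map naturally onto a small record. This is a sketch only: the plugin stores heads as a custom post type, and the `Head` class and its field names are hypothetical. The provider slugs and the speed range are the ones listed above.

```python
from dataclasses import dataclass
from typing import Optional

# Provider slugs as listed above ("wordpress" requires WP 7.0+).
PROVIDERS = {"openai", "azure_openai", "wordpress"}


@dataclass
class Head:
    name: str
    voice_id: str                       # e.g. "nova" or "onyx"
    provider: str                       # one of PROVIDERS
    speed: float = 1.0                  # allowed range 0.25 to 4.0
    instructions: Optional[str] = None  # speaking style, used with gpt-4o-mini-tts
    avatar_url: Optional[str] = None    # featured image, if any

    def __post_init__(self):
        # Reject values outside what the profile fields allow.
        if self.provider not in PROVIDERS:
            raise ValueError(f"unknown provider: {self.provider}")
        if not 0.25 <= self.speed <= 4.0:
            raise ValueError(f"speed out of range: {self.speed}")
```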

2. Write an Episode

Go to Talking Head > Add New Episode. The editor loads with an Episode container block and one Turn block. For each turn:

Select a speaker from the dropdown
Write the dialogue text

Add more turns with the block appender.

3. Generate Audio

Select the Episode block and click Generate Audio in the block toolbar. The plugin:

Validates the manuscript (speakers assigned, text within limits)
Creates a background job via Action Scheduler
Generates TTS audio for each turn via the configured provider
Stitches segments into a single audio file using FFmpeg or the PHP fallback (file mode), or prepares segments for individual playback (virtual mode)
Stores the result in wp-content/uploads/talking-head/
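
The validation step can be pictured with the sketch below, which checks only what the steps above name: every turn has a speaker and its text is within the limits. The turn shape (a dict with `speaker` and `text`), the function name, and the error messages are hypothetical; the default limits are the upper bounds from the settings table, chosen for illustration.

```python
def validate_manuscript(turns, max_segments=200, max_characters=4096):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if len(turns) > max_segments:
        errors.append(f"Too many turns: {len(turns)} (max {max_segments}).")
    for number, turn in enumerate(turns, start=1):
        if not turn.get("speaker"):
            errors.append(f"Turn {number}: no speaker assigned.")
        text = turn.get("text") or ""
        if len(text) > max_characters:
            errors.append(
                f"Turn {number}: text is {len(text)} characters (max {max_characters})."
            )
    return errors
```

Collecting every problem rather than stopping at the first lets an editor show all issues in one pass before any TTS request is made.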

Progress is shown in the sidebar via polling.

4. Embed the Player

Use the Talking Head Player block in any post or page. Select an episode from the searchable dropdown and optionally enable transcript display. The block renders a native <audio> element.

Development

See DEVELOPER.md for build commands, REST API, architecture, and workflow details.

License

GPL-2.0-or-later

Changelog

See CHANGELOG.md.

📦 Source: soderlind/talking-head