Speech & Transcription (71)
Mcporter
Use the mcporter CLI to list, configure, auth, and call MCP servers/tools directly (HTTP or stdio), including ad-hoc servers, config edits, and CLI/type generation.
/mcporter
OpenClaw YouTube Transcript
Transcribe YouTube videos to text by extracting captions and subtitles directly from the video URL using yt-dlp without audio processing.
/openclaw-youtube-transcript
YouTube Transcript
Fetch and summarize YouTube video transcripts. Use when asked to summarize, transcribe, or extract content from YouTube videos. Handles transcript fetching via residential IP proxy to bypass YouTube's cloud IP blocks.
/youtube-transcript
Local Whisper
Local speech-to-text using OpenAI Whisper. Runs fully offline after model download. High quality transcription with multiple model sizes.
/local-whisper
elevenlabs-voices
High-quality voice synthesis with 18 personas, 32.
/elevenlabs-voices
elevenlabs-tts
ElevenLabs TTS - the best ElevenLabs integration for OpenClaw.
/elevenlabs-tts
Voice Transcribe
Transcribe audio files using OpenAI's gpt-4o-mini-transcribe model with vocabulary hints and text replacements. Requires uv (https://docs.astral.sh/uv/).
/voice-transcribe
jarvis-voice
Metallic AI voice persona with TTS and visual transcript styling.
/jarvis-voice
kokoro-tts
Generate spoken audio from text using the local Kokoro TTS engine.
/kokoro-tts
ElevenLabs Speech-to-Text
Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).
/elevenlabs-stt
Mlx Whisper
Local speech-to-text with MLX Whisper (Apple Silicon optimized, no API key).
/mlx-whisper
Transcribe audio files via OpenRouter using audio-capable models
Transcribe audio files via OpenRouter using audio-capable models (Gemini, GPT-4o-audio, etc).
/openrouter-transcribe
Gemini STT
Transcribe audio files using Google's Gemini API or Vertex AI.
/gemini-stt
Tts
Convert text to speech using the Hume AI (or OpenAI) API. Use when the user asks for an audio message, a voice reply, or to hear something "de vive voix" (spoken aloud).
Local Whisper
Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs.
/whisper-mlx-local
Transcribe
Transcribe audio files to text using local Whisper (Docker). Use when receiving voice messages, audio files (.mp3, .m4a, .ogg, .wav, .webm), or when asked to transcribe audio content.
/transcribe
assemblyai-transcribe
Transcribe audio/video with AssemblyAI.
/assemblyai-transcribe
Local STT (Nvidia Parakeet + Whisper Support)
Local STT with selectable backends - Parakeet (best accuracy) or Whisper (fastest, multilingual).
/local-stt
audio-gen
Generate audiobooks, podcasts, or educational audio content.
/audio-gen
critical-article-writer
Generate draft articles, outlines.
/critical-article-writer
it will help you to send voice messages to your AI Assistant and also can make it talk
Text-to-Speech and Speech-to-Text using ElevenLabs AI. Use when the user wants to convert text to speech, transcribe voice messages, or work with voice in multiple languages. Supports high-quality AI voices and accurate transcription.
/elevenlabs-voice
elevenlabs-transcribe
Transcribe audio to text using ElevenLabs.
/elevenlabs-transcribe
Parakeet Stt
Local speech-to-text with NVIDIA Parakeet TDT 0.6B v3 (ONNX on CPU). 30x faster than Whisper, 25 languages, auto-detection, OpenAI-compatible API. Use when transcribing audio files, converting speech to text, or processing voice recordings locally without cloud APIs.
/parakeet-stt
announcer
Announce text throughout the house via AirPlay speakers using Airfoil +.
/announcer
Speech To Text
Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation,...
/speech-to-text
Voice
Convert text to speech using Microsoft Edge's TTS engine with customizable voices, direct playback, and automatic temporary file cleanup.
/voice
addis-assistant-stt
Provides Speech-to-Text (STT) and text.
/addis-assistant-stt
Pocket Tts
Generate high-quality English speech offline on CPU using 8 built-in voices or custom voice cloning with Kyutai's Pocket TTS model.
/pocket-tts
Voicenotes
Sync and access voice notes from Voicenotes.com. Use when the user wants to retrieve their voice recordings, transcripts, and AI summaries from Voicenotes. Supports fetching notes, syncing to markdown, and searching transcripts.
/voicenotes
Transcribe Audio with Parakeet MLX
Local speech-to-text with Parakeet MLX (ASR) for Apple Silicon (no API key).
/parakeet-mlx
cult-of-carcinization
Give your agent a voice — and ears.
/cult-of-carcinization
deepdub-tts
Generate speech audio using Deepdub and attach it as a MEDIA.
/deepdub-tts
chichi-speech
A RESTful service for high-quality text-to-speech using Qwen3.
/chichi-speech
Voicenotes Official
This official skill from the Voicenotes team gives OpenClaw access to new APIs and the ability to search semantically, retrieve full transcripts, filter by t...
/voicenotes-official
tl;dw - YouTube Video Summarizer
Extracts YouTube video transcripts and provides concise summaries highlighting main points, arguments, and conclusions without watching the full video.
Openai Tts.Bak 2026 01 28T18:01:23+10:30
Text-to-speech via OpenAI Audio Speech API.
/openai-tts-bak-2026-01-28t18-01-23-10-30
speech-recognition
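General-purpose speech recognition skill. Supports multiple audio formats (ogg/mp3/wav/m4a) and uses the SiliconFlow SenseVoice API for speech-to-text. Triggers when the user sends a voice message or audio file, or needs audio transcribed.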
/speech-recognition
freshbooks-cli
FreshBooks CLI for managing invoices, clients, and billing.
/freshbooks-cli
Text To Speech
Convert text to natural speech with DIA TTS, Kokoro, Chatterbox, and more via inference.sh CLI. Models: DIA TTS (conversational), Kokoro TTS, Chatterbox, Hig...
/text-to-speech
AssemblyAI Transcriber
Transcribe audio files with speaker diarization (who speaks when). Supports 100+ languages, automatic language detection, and timestamps. Use for meetings, interviews, podcasts, or voice messages. Requires AssemblyAI API key.
/assemblyai-transcriber
Whisper Transcribe
Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to large). Use when transcribing audio recordings, podcasts, voice messages, lectures, meetings, or any audio/video file to text. Handles mp3, wav, m4a, ogg, flac, webm, opus, aac formats.
/whisper-transcribe
eternal-haven-lore-pack
Eternal Haven Chronicles lore + mythic persona pack.
/eternal-haven-lore-pack
akaunting
Interact with Akaunting open-source accounting software via REST API.
/akaunting
auto-whisper-safe
RAM-safe voice transcription with auto-chunking — works on 16GB machines without crashes.
/auto-whisper-safe
brw-de-ai-ify
Remove AI-generated jargon and restore human voice to text.
/brw-de-ai-ify
dellight-cro-revenue-ops
DELLIGHT.AI is an AI startup in DIFC, Dubai.
/dellight-cro-revenue-ops
doubao-api-open-tts
Text-to-Speech service using Doubao (Volcano Engine).
/doubao-api-open-tts
eachlabs-voice-audio
TTS, STT, voice conversion using ElevenLabs, Whisper, RVC.
/eachlabs-voice-audio
feishu-minutes
Fetch info, stats, transcript, and media from Feishu.
/feishu-minutes
gettr-transcribe-summarize
Download audio from a GETTR post.
/gettr-transcribe-summarize
hebrew-nikud
Hebrew nikud (vowel points) reference for AI agents.
/hebrew-nikud
norman-categorize-transactions
Review and categorize uncategorized bank transactions, match them with invoices, and verify bookkeeping entries.
/norman-categorize-transactions
norman-monthly-reconciliation
Perform a complete monthly financial reconciliation - review all transactions, match invoices, check outstanding.
/norman-monthly-reconciliation
ressemble
Text-to-Speech and Speech-to-Text integration using Resemble AI HTTP API.
/ressemble
siliconflow-tts-gen
Text-to-Speech using the SiliconFlow API (CosyVoice2).
/siliconflow-tts-gen