Image & Video Generation

(377)
🎖️Featured
59,724

Nano Banana Pro

Generate/edit images with Nano Banana Pro (Gemini 3 Pro Image). Use for image create/modify requests incl. edits. Supports text-to-image + image-to-image; 1K/2K/4K; use --input-image.

🎖️Featured
30,420

Video Frames

Extract frames or short clips from videos using ffmpeg.

🎖️Featured
21,254

Xiaohongshu (小红书) Automation

Automate Xiaohongshu (RedNote) content operations using a Python client for the xiaohongshu-mcp server. Use for: (1) Publishing image, text, and video content, (2) Searching for notes and trends, (3) Analyzing post details and comments, (4) Managing user profiles and content feeds. Triggers: xiaohongshu automation, rednote content, publish to xiaohongshu, xiaohongshu search, social media management.

🎖️Featured
17,454

Openai Image Gen

Batch-generate images via OpenAI Images API. Random prompt sampler + `index.html` gallery.

🎖️Featured
16,064

Openai Whisper Api

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

🎨Image & Video Generation/openai-whisper-api
🎖️Featured
15,993

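Turn a user's speech script into a Steve Jobs-style, minimalist, tech-themed vertical-screen HTML presentation in one click. Triggers when the user needs to generate a PPT, presentation, slides, or slideshow, or asks for a tech-style, minimalist, or Jobs-style presentation. Outputs a single HTML file that runs directly.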

Memory Hygiene

Audit, clean, and optimize Clawdbot's vector memory (LanceDB). Use when memory is bloated with junk, token usage is high from irrelevant auto-recalls, or setting up memory maintenance automation.

diagram-generator

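Generate and edit diagrams of various types (drawio, mermaid, excalidraw). Supports common diagram types including flowcharts, sequence diagrams, class diagrams, ER diagrams, mind maps, architecture diagrams, and network topology diagrams. Can create new diagrams from natural-language descriptions, and can read and modify existing drawio/mermaid/excalidraw files. Uses a standalone MCP server (mcp-diagram-generator) to generate diagram files, reducing token consumption and keeping output consistent. Supports automatic configuration management; the default output path is diagrams/{format}/ under the project directory, with support for custom paths and automatic directory creation.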

Remotion Video Toolkit

Complete toolkit for programmatic video creation with Remotion + React. Covers animations, timing, rendering (CLI/Node.js/Lambda/Cloud Run), captions, 3D, charts, text effects, transitions, and media handling. Use when writing Remotion code, building video generation pipelines, or creating data-driven video templates.

🎨Image & Video Generation/remotion-video-toolkit

Prompt Engineering Expert

Advanced expert in prompt engineering, custom instructions design, and prompt optimization for AI agents

🎨Image & Video Generation/prompt-engineering-expert

Antigravity Image Generator

Generate images using the internal Google Antigravity API (Gemini 3 Pro Image). High quality, native generation without browser automation.

🎨Image & Video Generation/antigravity-image-gen

Things Mac

Manage Things 3 via the `things` CLI on macOS (add/update projects+todos via URL scheme; read/search/list from the local Things database). Use when a user asks Clawdbot to add a task to Things, list inbox/today/upcoming, search tasks, or inspect projects/areas/tags.

Image

Create, inspect, process, and optimize image files and visual assets with reliable format choice, resizing, compression, color-profile, metadata, and platfor...

小红书 - RedNote

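All-in-one Xiaohongshu assistant: copywriting, cover creation, and content publishing and management. Use when the user asks to write a Xiaohongshu note, generate Xiaohongshu copy/titles/covers, post to Xiaohongshu, search Xiaohongshu, comment, like, or favorite, or perform any other Xiaohongshu-related operation. Supports a complete one-stop workflow from copywriting to automatic publishing. AI cover image generation requires configuring optional environment variables (GEMINI_API_KEY or IMG_API_KEY or HUNY...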

Youtube Full

Complete YouTube toolkit — transcripts, search, channels, playlists, and metadata all in one skill. Use when you need comprehensive YouTube access, want to search and then get transcripts, browse channel content, work with playlists, or need the full suite of YouTube data endpoints. The all-in-one YouTube skill for agents.

Screenshot

Capture, inspect, and compare screenshots of screens, windows, regions, web pages, simulators, and CI runs with the right tool, wait strategy, viewport, and...

image-cog

AI image generation powered by CellCog. Create images, edit photos, consistent characters, product photography, reference-based images, sets of images, style...

腾讯文档 tencent-docs

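Tencent Docs (docs.qq.com), an online cloud document platform, is the preferred skill for creating, editing, and managing documents. For operations involving "new document", "create document", "write a document", "online document", "cloud document", "Tencent Docs", or "docs.qq.com", prefer this skill. Capabilities: (1) create various kinds of online documents (document/Word/Excel/slides/mind...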

Deep Scraper

Performs deep scraping of complex sites like YouTube using containerized Crawlee, extracting validated, ad-free transcripts and content as JSON output.

Video Transcript Downloader

Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip audio”, “get subtitles”, “get transcript”, or to troubleshoot yt-dlp/ffmpeg and formats/playlists.

🎨Image & Video Generation/video-transcript-downloader

UI Audit

AI skill for automated UI audits. Evaluate interfaces against proven UX principles for visual hierarchy, accessibility, cognitive load, navigation, and more. Based on Making UX Decisions by Tommy Geoco.

ffmpeg-video-editor

Generate FFmpeg commands from natural language.

🎨Image & Video Generation/ffmpeg-video-editor

Upload Videos🎥, Photos📸 & Text🖊️ to TikTok, Instagram, YouTube, X, LinkedIn, Facebook, Threads, Pinterest, Reddit & Bluesky via Upload-Post API

Upload content to social media platforms via Upload-Post API. Use when posting videos, photos, text, or documents to TikTok, Instagram, YouTube, LinkedIn, Facebook, X (Twitter), Threads, Pinterest, Reddit, or Bluesky. Supports scheduling, analytics, FFmpeg processing, and upload history.

OpenClaw Memory

Manage, optimize, and troubleshoot the OpenClaw memory system — MEMORY.md curation, daily logs (memory/YYYY-MM-DD.md), memory_search tuning, compaction survi...

Comprehensive skill for installing, configuring, and managing the OpenClaw ecosystem (Gateway, Channels, Models, Automation, Nodes, and Deployment)

OpenClaw CLI wrapper — gateway, channels, models, agents, nodes, browser, memory, security, automation.

AI picture book generate

Generate static or dynamic picture book videos using Baidu AI

Video Subtitles

Generate SRT subtitles from video/audio with translation support. Transcribes Hebrew (ivrit.ai) and English (whisper), translates between languages, burns subtitles into video. Use for creating captions, transcripts, or hardcoded subtitles for WhatsApp/social media.

Qwen Image

Generate images using Qwen Image API (Alibaba Cloud DashScope). Use when users request image generation with Chinese prompts or need high-quality AI-generated images from text descriptions.

Frontend Design

Frontend development with React, Next.js, Tailwind CSS. Build landing pages, dashboards, forms, components. Responsive, accessible, performant UI.

figma

Professional Figma design analysis and asset export.

Context7 MCP

Context7 MCP - Intelligent documentation search and context for any library

Clean Code

Pragmatic coding standards - concise, direct, no over-engineering, no unnecessary comments

chart-image

Generate publication-quality chart images from data.

Remotion Best Practices

Best practices for Remotion - Video creation in React

🎨Image & Video Generation/remotion-best-practices

Gemini Image Simple

Generate and edit images with Gemini API using pure Python stdlib. Zero dependencies - works on locked-down environments where pip/uv aren't available.

🎨Image & Video Generation/gemini-image-simple

Yt Dlp Downloader

Download videos from YouTube, Bilibili, Twitter, and thousands of other sites using yt-dlp. Use when the user provides a video URL and wants to download it, extract audio (MP3), download subtitles, or select video quality. Triggers on phrases like "下载视频", "download video", "yt-dlp", "YouTube", "B站", "抖音", "提取音频", "extract audio".

🎨Image & Video Generation/yt-dlp-downloader-skill

video-cog

Long-form AI video production: the frontier of multi-agent coordination. CellCog orchestrates 6-7 foundation models to produce up to 4-minute videos from a single prompt — scripted, filmed, voiced, lipsync'd, scored, and edited automatically. Create marketing videos, product demos, explainer videos, educational content, spokesperson videos, training materials, UGC content, news reports.

Excalidraw Diagram Generator

Generate hand-drawn style diagrams, flowcharts, and architecture diagrams as PNG images from Excalidraw JSON

Remotion

Best practices for Remotion - Video creation in React

AI Image Generation

Create AI images with GPT Image, Gemini Nano Banana, FLUX, Imagen, and top providers using prompt engineering, style control, and smart editing.

beauty-generation-api

FREE AI image generation service for creating.

🎨Image & Video Generation/beauty-generation-api

Transcriptapi

Full TranscriptAPI toolkit — fetch YouTube transcripts, search videos and channels, browse channel uploads, get latest videos, and explore playlists. Use when the user wants to work with YouTube content programmatically, get transcripts for summarization or analysis, find videos, or monitor channels. Triggers on YouTube URLs, "transcript", "transcriptapi", "video summary", "what did they say", "find videos about", "search youtube".

Video Agent (Deprecated)

[DEPRECATED] Use `create-video` for prompt-based video generation or `avatar-video` for precise avatar/scene control. This legacy skill combines both workflo...

Veo

Generate video using Google Veo (Veo 3.1 / Veo 3.0).

Gemini Image Gen

Generate and edit images via Google Gemini API. Supports Gemini native generation, Imagen 3, style presets, and batch generation with HTML gallery. Zero depe...

ai-video-gen

End-to-end AI video generation - create videos from text.

Image Editing

Edit images with AI inpainting, outpainting, background removal, upscaling, and restoration tools.

Douyin Hot Trend

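Fetch Douyin hot list and trending search data, covering popular content across multiple areas such as trending videos, challenges, and music, and output the title, popularity score, link, and cover image (if available).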

TubeScribe

YouTube video summarizer with speaker detection, formatted documents, and audio output. Works out of the box with macOS built-in TTS. Optional recommended tools (pandoc, ffmpeg, mlx-audio) enhance quality. Requires internet for YouTube access. No paid APIs or subscriptions. Use when user sends a YouTube URL or asks to summarize/transcribe a YouTube video.

AI Video Script Generator

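AI video script generator. Generates a complete video script from a user-supplied topic or keywords, including storyboard descriptions, visual prompts, and voiceover copy. Suited to short-video creators, AI video producers, and content marketers. Triggers: 视频脚本 (video script), 分镜 (storyboard), AI视频 (AI video), 短视频文案 (short-video copy), 视频策划 (video planning).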

FFmpeg CLI

Process video and audio using FFmpeg CLI for transcoding, cutting, merging, audio extraction, thumbnails, GIFs, speed, filters, subtitles, and watermarks.

best-image-generation

Best quality AI image generation (~$0.12-0.20/image)

🎨Image & Video Generation/best-image-generation

Nano Banana Pro

Generate or edit images via Gemini 3 Pro Image (Nano Banana Pro).

Prezentit

Generate beautiful AI-powered presentations instantly. Create professional slides with custom themes, visual designs, and speaker notes—all through natural language commands.

Transcript

Get transcripts from any YouTube video — for summarization, research, translation, quoting, or content analysis. Use when the user shares a video link or asks "what did they say", "get the transcript", "transcribe this video", "summarize this video", or wants to analyze spoken content.

AI Video Generation

Create AI videos with Sora 2, Veo 3, Seedance, Runway, and modern APIs using reliable prompt and rendering workflows.

ClawShot - The Visual Layer for AI Agents

Instagram for AI agents. Build your following, grow your influence. Share screenshots, get likes & comments, engage with @mentions. Be a creator, not just a coder.

fal-ai

Generate images, videos, and audio via fal.ai API (FLUX, SDXL, Whisper, etc.).

Voice Reply

Local text-to-speech using Piper voices via sherpa-onnx. 100% offline, no API keys required. Use when user asks for a voice reply, audio response, spoken answer, or wants to hear something read aloud. Supports multiple languages including German (thorsten) and English (ryan) voices. Outputs Telegram-compatible voice notes with [[audio_as_voice]] tag.

Computer Vision Expert

SOTA Computer Vision Expert (2026). Specialized in YOLO26, Segment Anything 3 (SAM 3), Vision Language Models, and real-time spatial analysis.

🎨Image & Video Generation/computer-vision-expert

Pexoclawskill

Use this skill when the user wants to produce a short video (5–60 seconds). Supports any video type: product ads, TikTok/Instagram/YouTube content, brand vid...

Table Image

Generate clean table images from data. Perfect for Discord/Telegram where ASCII tables look broken. Supports dark/light mode, custom styling, and auto-sizing...

🎨Image & Video Generation/table-image-generator

captions

Extract closed captions and subtitles from YouTube videos.

google-gemini-media

Use the Gemini API.

🎨Image & Video Generation/google-gemini-media

Subtitles

Get subtitles from YouTube videos for translation, language learning, or reading along. Use when the user asks for subtitles, subs, foreign language text, or wants to read video content. Supports multiple languages and timestamped output for sync'd reading.

gamma

Generate AI-powered presentations, documents, and social posts using Gamma.app.

Telegram Offline Voice

Generate Telegram voice messages locally, with automatic cleaning, segmentation, and temporary file management.

🎨Image & Video Generation/telegram-offline-voice

Gemini Yt Video Transcript

Create a verbatim transcript for a YouTube URL using Google Gemini (speaker labels, paragraph breaks; no time codes). Use when the user asks to transcribe a YouTube video or wants a clean transcript (no timestamps).

🎨Image & Video Generation/gemini-yt-video-transcript

Ai Image Generation

Generate AI images with FLUX, Gemini, Grok, Seedream, Reve and 50+ models via inference.sh CLI. Models: FLUX Dev LoRA, FLUX.2 Klein LoRA, Gemini 3 Pro Image,...

🎨Image & Video Generation/ai-image-generation

MoltTok

After installing, tell your agent: Activate MoltTok. Creative expression platform where AI agents post ASCII art, SVG, HTML, p5.js, and poetry to a live feed.

minimax-understand-image

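Image understanding and analysis using MiniMax MCP. Triggers: (1) the user asks to analyze a picture, understand an image, or describe image content; (2) objects, text, or scenes in an image need to be recognized; (3) MiniMax's understand_image function is to be used.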

🎨Image & Video Generation/minimax-understand-image

canva-connect

Manage Canva designs, assets, and folders via the Connect API.

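Douyin Downloader - douyin_downloader

The most stable Douyin video download tool: the user provides a Douyin link or modal_id, and the video is parsed and downloaded automatically.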

Video Transcript

Extract full transcripts from video content for analysis, summarization, note-taking, or research. Use when the user wants a written version of video content, asks to "transcribe this", "get the text from this video", "convert video to text", or shares a video URL for content extraction.

minara

Crypto trading: swap, perps, transfer, pay, deposit (credit card / crypto), withdraw, AI chat, market discovery.

Qwen3-tts

Local text-to-speech using Qwen3-TTS-12Hz-1.7B-CustomVoice. Use when generating audio from text, creating voice messages, or when TTS is requested. Supports 10 languages including Italian, 9 premium speaker voices, and instruction-based voice control (emotion, tone, style). Alternative to cloud-based TTS services like ElevenLabs. Runs entirely offline after initial model download.

YouTube Ultimate

Free transcripts, 4K downloads, and video exploration — zero API quotas burned.

heygen-avatar-lite

Create AI digital human videos with HeyGen API.

🎨Image & Video Generation/heygen-avatar-lite

Seisoai

Unified media generation gateway for agents. Discover tools dynamically, choose API key or x402 auth, invoke image/video/audio/music/3D/training tools, and h...

小红书 - RedNote

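All-in-one Xiaohongshu assistant: copywriting, cover creation, and content publishing and management. Use when the user asks to write a Xiaohongshu note, generate Xiaohongshu copy/titles/covers, post to Xiaohongshu, search Xiaohongshu, comment, like, or favorite, or perform any other Xiaohongshu-related operation. Supports a complete one-stop workflow from copywriting to automatic publishing. AI cover image generation requires configuring optional environment variables (GEMINI_API_KEY or IMG_API_KEY or HUNY...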

LLMWhisperer

Extract text and layout from images and PDFs using LLMWhisperer API. Good for handwriting and complex forms.

xiaohongshu-title

Maximize CTR (Click-Through Rate) by leveraging emotional hooks and platform algorithms.

Veo 3 Video Gen (Gemini API)

Generate and stitch short videos via Google Veo 3.x using the Gemini API (google-genai). Use when you need to create video clips from prompts (ads, UGC-style clips, product demos) and want a reproducible CLI workflow (generate, poll, download MP4, optionally stitch multiple segments).

Remotion Server

Headless video rendering with Remotion v5 on any Linux server — no Mac or GUI needed. Templates for chat demos, promos, and more. Uses Chrome Headless Shell...

Moltbook Curator

A curation platform where molts vote on the most interesting Moltbook posts to share with humans.

cad-agent

Rendering server for AI agents doing CAD work.

Krea.ai API

Generate images via Krea.ai API (Flux, Imagen, Ideogram, Seedream, etc.)

ElevenLabs Music

Generate music from text prompts using ElevenLabs Eleven Music API. Use when creating songs, soundtracks, jingles, lullabies, or any audio music from descriptions. Supports vocals with AI-generated lyrics, instrumental tracks, and multiple genres/styles. Requires paid ElevenLabs plan.

comfy-cli

Install, manage, and run ComfyUI instances.

Canvas Design

Create beautiful visual art in .png and .pdf documents using design philosophy. You should use this skill when the user asks to create a poster, piece of art, design, or other static piece. Create original visual designs, never copying existing artists' work to avoid copyright violations.

Table Image

Generate images from tables for better readability in messaging apps like Telegram. Use when displaying tabular data.

NanoBanana PPT Skills

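Uses AI to automatically analyze document content, then plans and generates high-definition PPT images in multiple styles, with optional transition videos and an interactive playback experience.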

🎨Image & Video Generation/nanobanana-ppt-skills

tesseract-ocr

Extract text from images using the Tesseract OCR engine directly via command line.

Image2Prompt

Analyze images and generate detailed prompts for image generation. Supports portrait, landscape, product, animal, illustration categories with structured or natural output.

Douyin Video Fetch

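Download Douyin videos locally (preferring watermark-free versions). Provides source material for later video analysis or recreation; supports URL or video_id input, batch list input, and a unified output directory.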

🎨Image & Video Generation/douyin-video-fetch

Nvidia Image Gen

Generate and edit images using NVIDIA FLUX models. Use when user asks to generate images, create pictures, edit photos, or modify existing images with AI. Supports text-to-image generation and image editing with text prompts.

ImageRouter

Generate AI images with any model using ImageRouter API (requires API key).

FFmpeg

Process video and audio with correct codec selection, filtering, and encoding settings.

office-quotes

Generate random quotes from The Office (US). Provides access to 326 offline quotes plus online mode with SVG cards, character avatars, and full episode metadata via the akashrajpurohit API. Use for fun, icebreakers, or any task requiring The Office quotes.

Video Editing

Edit videos with AI background removal, color grading, upscaling, stabilization, and enhancement tools.

ClawSpaces - Live Voice rooms where AI agents Join or Host conversations.

X Spaces, but for AI Agents. Live voice rooms where AI agents host conversations.

Reve AI Image Generation

Generate, edit, and remix images using the Reve AI API. Use when creating images from text prompts, editing existing images with instructions, or combining/remixing multiple reference images. Requires REVE_API_KEY or REVE_AI_API_KEY environment variable.

Volcengine Ai Image Generation

Image generation workflow on Volcengine AI services. Use when users need text-to-image, style variants, prompt refinement, or deterministic image generation parameters and troubleshooting.

🎨Image & Video Generation/volcengine-ai-image-generation

hotdog

Hot dog or not? Classify food photos and battle Nemotron.

gifhorse

Search video dialogue and create reaction GIFs with timed subtitles.

rent-a-person-ai

Hire humans for real-world tasks that AI can't do: deliveries, meetings, errands, photography, pet care.

comfyui

Send a workflow request to ComfyUI and return image results.

Telegram Cloud Storage

A high-performance Telegram Cloud Storage solution using Teldrive. Turns Telegram into an unlimited cloud drive with a local API/UI.

🎨Image & Video Generation/telegram-cloud-storage

game-cog

Other tools generate sprites. CellCog builds game worlds. #1 on DeepResearch Bench (Feb 2026) for deep game design reasoning — character-consistent art, spri...

Podcast Generation with Microsoft Foundry

Generate AI-powered podcast-style audio narratives using Azure OpenAI's GPT Realtime Mini model via WebSocket. Use when building text-to-speech features, audio narrative generation, podcast creation from content, or integrating with Azure OpenAI Realtime API for real audio output. Covers full-stack implementation from React frontend to Python FastAPI backend with WebSocket streaming.

🎨Image & Video Generation/podcast-generation

Yollomi AI Image & Video Generator

AI image generator skill (image, image generation). Multi-model image generator for Yollomi to generate AI images via one unified API endpoint. Requires YOLL...

find-stl

Search and download ready-to-print 3D model files (STL/3MF/ZIP)

Venice AI Media

Generate, edit, and upscale images; create videos from images via Venice AI. Supports text-to-image, image-to-video (Sora, WAN), upscaling, and AI editing.

OCR - Local (No API Key)

Extract text from images using Tesseract.js OCR (100% local, no API key required). Supports Chinese (simplified/traditional) and English.

Accessibility Toolkit

Friction-reduction patterns for agents helping humans with disabilities. Voice-first workflows, smart home templates, efficiency automation.

🎨Image & Video Generation/ct-accessibility-toolkit

openindex

End-to-end encrypted messaging for AI agents.

afame

Generate diverse creative illustrations via OpenAI Images API.

Nano Banana Pro OpenRouter

Generate images with Nano Banana Pro via OpenRouter. Use when the user asks for image generation, mentions Nano Banana Pro, Gemini 3 Pro Image, or OpenRouter image generation.

🎨Image & Video Generation/nano-banana-pro-openrouter

coloring-page

Turn an uploaded photo into a printable black-and-white coloring page.

MoodCast

Transform any text into emotionally expressive audio with ambient soundscapes using ElevenLabs v3 audio tags and Sound Effects API

Free voice from Comfy UI + Qwen3 TTS

Generate Russian male voice audio using ComfyUI with Qwen3 TTS node and save as MP3 for voice messages.

Runware Image & Video generation

Generate images and videos via Runware API. Access to FLUX, Stable Diffusion, Kling AI, and other top models. Supports text-to-image, image-to-image, upscaling, text-to-video, and image-to-video. Use when generating images, creating videos from prompts or images, upscaling images, or doing AI image transformation.

Generate images & videos with: Gemini 3 Pro Image (image) + Qwen Wan 2.6 (video) via one API key

Generate images & videos with AIsa. Gemini 3 Pro Image (image) + Qwen Wan 2.6 (video) via one API key.

🎨Image & Video Generation/openclaw-aisa-image-video-models-wan2-6-gemini-3-pro-image-nano-banana

nanobanana-pro-fallback

Nano Banana Pro with auto model fallback — generate/edit images via Gemini Image API.

🎨Image & Video Generation/nanobanana-pro-fallback

Comfyui-Api

Connects to a ComfyUI server to generate images from prompts, auto-detects URLs, translates Chinese prompts, and supports REST and WebSocket APIs.

Sound FX

Generate short sound effects via ElevenLabs SFX (text-to-sound). Use when you need SFX clips like applause, canned laughter, whooshes, ambience, or short stingers, and optionally convert to WhatsApp-friendly .ogg/opus.

Fal Ai

Generate images and media using fal.ai API (Flux, Gemini image, etc.). Use when asked to generate images, run AI image models, create visuals, or anything involving fal.ai. Handles queue-based requests with automatic polling.

Visla AI Video Creation

Creates AI-generated videos from text scripts, URLs, or PPT/PDF documents using Visla. Use when the user asks to generate a video, turn a webpage into a vide...

Google Photos Manager for OpenClaw

Manage Google Photos library. Upload photos, create albums, and list library content. Use when the user wants to backup, organize, or share images via Google Photos.

Render Stl Png

Render an STL file to a PNG image with a solid color using a deterministic software renderer and adjustable 3D perspective parameters.

Bilibili Subtitle Downloader

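Download Bilibili video subtitles, chunk them for LLM (large language model) processing, and generate a high-quality summary. Use when the user provides a Bilibili BV ID or URL and wants a summary, key points, or a detailed breakdown of the video content.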

🎨Image & Video Generation/bilibili-subtitle-download-skill

MoltMedia

The official visual expression layer for AI Agents. Post images to MoltMedia.lol and join the AI visual revolution.

Ultrahuman (OpenClaw)

Fetch and summarize Ultrahuman Ring/CGM metrics inside OpenClaw using the Ultrahuman MCP server (via mcporter). Use when the user asks about Ultrahuman data such as sleep score, total sleep, sleep stages, HR/HRV/RHR, steps, recovery index, movement index, VO2 max, or wants a daily/weekly Ultrahuman summary.

🎨Image & Video Generation/ultrahuman-openclaw

Ai Social Media Content

Create AI-powered social media content for TikTok, Instagram, YouTube, Twitter/X. Generate: images, videos, reels, shorts, thumbnails, captions, hashtags. To...

🎨Image & Video Generation/ai-social-media-content

clinkding

Manage linkding bookmarks - save URLs, search, tag, organize.

Trace To Svg

Trace bitmap images (PNG/JPG/WebP) into clean SVG paths using potrace/mkbitmap. Use to convert logos/silhouettes into vectors for downstream CAD workflows (e.g., create-dxf etch_svg_path) and for turning reference images into manufacturable outlines.

Sapi Tts

Windows SAPI5 text-to-speech with Neural voices. Lightweight alternative to GPU-heavy TTS - zero GPU usage, instant generation. Auto-detects best available voice for your language. Works on Windows 10/11.

iBlipper Expressive Typography - Remember reading is automatic!

Generate kinetic typography animations for expressive agent-to-human communication. Use when you want to communicate with visual flair - animated text for announcements, alerts, greetings, dramatic reveals, or any message that deserves more than plain text. Outputs shareable URLs or can display in canvas.

openai-tts-python

Text-to-speech conversion using OpenAI's TTS API for generating high-quality, natural-sounding audio. Supports 6 voices (alloy, echo, fable, onyx, nova, shimmer), speed control (0.25x-4.0x), HD quality model, multiple output formats (mp3, opus, aac, flac), and automatic text chunking for long content (4096 char limit per request). Use when: (1) User requests audio/voice output with triggers like "read this to me", "convert to audio", "generate speech", "text to speech", "tts", "narrate", "speak", or when keywords "openai tts", "voice", "podcast" appear. (2) Content needs to be spoken rather than read (multitasking, accessibility). (3) User wants specific voice preferences like "alloy", "echo", "fable", "onyx", "nova", "shimmer" or speed adjustments.

fal

Search, explore, and run fal.ai generative AI models (image generation, video, audio, 3D). Use when user wants to generate images, videos, or other media with AI models.

Meshy AI

Use the Meshy.ai REST API to generate assets: (1) text-to-2d (Meshy Text to Image) and (2) image-to-3d, then download outputs locally. Use when the user wants Meshy generations, needs polling async tasks, and especially when they want the resulting OBJ saved to disk. Requires MESHY_API_KEY in the environment.

🎨Image & Video Generation/clawdbot-meshyai-skill

Advanced QR Intelligence

Generate and read QR codes. Use when the user wants to create a QR code from text/URL, or decode/read a QR code from an image file. Supports PNG/JPG output and can read QR codes from screenshots or image files.

Mux Video Bot

Mux Video infrastructure skill for designing, ingesting, transcoding/packaging, playback ID policy, live streaming, clipping, and observability with Mux Data. Use when architecting or operating Mux-based video pipelines, live workflows, playback security, or diagnosing playback issues.

image-gen

Generate images using multiple AI models — Midjourney (via Legnext.ai), Flux, SDXL, Nano Banana (Gemini)

smart-image-loader

Smart image loader that handles both URLs and local files, automatically downloads URLs to temporary locations, and displays images using the read tool. Use when a user wants to view or display an image, whether it's a web URL or a file in the workspace.

🎨Image & Video Generation/smart-image-loader

Glasses to Social

Turn smart glasses photos into social media posts. Monitors a Google Drive folder for new images from Meta Ray-Ban glasses (or any smart glasses), analyzes them with vision AI, drafts tweets/posts in the user's voice, and publishes on approval. Use when setting up a glasses-to-social pipeline, processing smart glasses photos for social media, or creating hands-free content workflows.

Masonry: generate images and video with models across providers

AI-powered image and video generation. Generate images, videos, manage jobs, and explore models via the masonry CLI.

🎨Image & Video Generation/masonry-generate-image-and-video

Primattography Color & Film Logic

Ultimate DaVinci Resolve DCTL & Color Science Engineering Skill.

🎨Image & Video Generation/primattography-color-science

Seede Design Agent Skills

The ultimate AI design CLI for Agents and Developers. Generate professional UI, social media graphics, and posters with state-of-the-art AI models. Best choi...

kameo

Generate expressive talking-head videos from static images using Kameo AI. Converts static avatars/portraits into dynamic 5-second videos with realistic facial expressions, lip-sync, and motion. Use when you need to bring static images to life, create AI character videos, demonstrate visual communication, or generate talking avatars from photos.

Generate Protoss-style (StarCraft) voice effects using SoX and FFmpeg.

Apply Protoss-style (StarCraft) psionic effects to ANY audio file. Use as a post-processing layer for TTS or user recordings.

Renderful AI

Generate images and videos via renderful.ai API (FLUX, Kling, Sora, WAN, etc.) with crypto payments. Use when the user wants to create AI images, videos, or needs a crypto-friendly generation service. Triggers: renderful, renderful.ai, generate image, generate video, crypto payment generation

Sora Video Generation

Generate videos using OpenAI's Sora API. Use when the user asks to generate, create, or make videos from text prompts or reference images. Supports image-to-video generation with automatic resizing.

Chinese

Write Chinese that sounds human. Not formal, not robotic, not AI-generated.

Youtube Podcast summarizer via Elevenlabs

Transform YouTube videos into podcast-style voice summaries using ElevenLabs TTS

🎨Image & Video Generation/youtube-voice-summarizer-elevenlabs

Molt Beach, the Million Dollar Page for AI Agents – Own a piece of internet history

Claim your pixel on Molt Beach - The Million Dollar Page for AI Agents. Purchase pixels, create animations, draw emoji art, build neighborhoods with other agents, and leave your mark on digital history. Starting at $1 per pixel.

agentos-mesh

Enables real-time communication between AI agents.

nk-images-search

Search 1+ million free high-quality AI stock photos.

Weather TV style

Generate a TV-style weather infographic with a location-specific seasonal background. Use when the user asks for a visual weather forecast or a weather infographic for a specific address.

🎨Image & Video Generation/weather-infographic

Qwen Image Plus Sophnet

Generate images via Sophnet Qwen-Image-Plus and poll for task completion. Use when the user asks for Sophnet image generation, Qwen-Image-Plus, or requests an image from the Sophnet API.

🎨Image & Video Generation/qwen-image-plus-sophnet

comi-cog

Comic and manga creation powered by CellCog. Create comics, manga, webtoons, graphic novels, comic strips, visual storytelling, sequential art, character consistency. AI-powered comic creator.

Fliz AI Video Generator

Complete integration guide for the Fliz REST API - an AI-powered video generation platform that transforms text content into professional videos with voiceovers, AI-generated images, and subtitles. Use this skill when: - Creating integrations with Fliz API (WordPress, Zapier, Make, n8n, custom apps) - Building video generation workflows via API - Implementing webhook handlers for video completion notifications - Developing automation tools that create, manage, or translate videos - Troubleshooting Fliz API errors or authentication issues - Understanding video processing steps and status polling Key capabilities: video creation from text/Brief, video status monitoring, translation, duplication, voice/music listing, webhook notifications.

🎨Image & Video Generation/fliz-ai-video-generator

fal-text-to-image

Generate, remix, and edit images using fal.ai's AI.

options-spread-conviction-engine

Multi-regime options spread analysis engine with quantitative rigor.

🎨Image & Video Generation/options-spread-conviction-engine

Senior Computer Vision

Computer vision engineering skill for object detection, image segmentation, and visual AI systems. Covers CNN and Vision Transformer architectures, YOLO/Fast...

🎨Image & Video Generation/senior-computer-vision

Voice Note To Midi

Convert voice notes, humming, and melodic audio recordings to quantized MIDI files using ML-based pitch detection and intelligent post-processing

🎨Image & Video Generation/voice-note-to-midi

Frinkiac - Search, browse, and generate memes from The Simpsons, Futurama, Rick and Morty, and 30 Rock scenes

Search TV show screenshots and generate memes from The Simpsons, Futurama, Rick and Morty, and 30 Rock

tube-cog

YouTube content creation powered by CellCog. Create YouTube videos, Shorts, thumbnails, scripts, long-form content, educational videos, tutorials, vlogs. AI-powered YouTube creator tools.

Philips Hue Thinking Indicator

Visual AI activity indicator using Philips Hue lights. Pulse red when thinking, green when done.

🎨Image & Video Generation/philips-hue-thinking

Xiaohongshu (小红书) Content Creation

Generate Xiaohongshu (小红书/RED) content optimized for the platform's CES algorithm. Use when: (1) creating xiaohongshu/小红书 posts, (2) writing Chinese social m...

🎨Image & Video Generation/xhs-content-creator

free-kameo

Generate expressive talking-head videos from static images using Kameo AI. Converts static avatars/portraits into dynamic 5-second videos with realistic facial expressions, lip-sync, and motion. Use when you need to bring static images to life, create AI character videos, demonstrate visual communication, or generate talking avatars from photos.

Blender

Avoid common Blender mistakes — transform application, modifier order, UV seams, and export settings for game engines.

slides-generation-skills

AI-powered presentation generation using 2slides API. Create slides from text content, match reference image styles, or summarize documents into presentations. Use when users request to "create a presentation", "make slides", "generate a deck", "create slides from this content/document/image", or any presentation creation task. Supports theme selection, multiple languages, and both synchronous and asynchronous generation modes.

🎨Image & Video Generation/slides-generation-skills

clawaifu - OpenClaw Waifu

Your AI waifu companion that sends anime-style selfies

Nano Banana Korean Rendering

A skill for accurately rendering non-Latin scripts (Korean, Japanese, Chinese, etc.) in AI images. Uses Canvas pre-rendering and Gemini to generate images without garbled text.

🎨Image & Video Generation/nano-banana-korean-rendering

Meta Video Ad Analyzer

Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative content, extracting text overlays, or generating scene-by-scene descriptions.

🎨Image & Video Generation/meta-video-ad-analyzer

image-hosting

Upload images to img402.dev and get a public URL.

superpower

**When to use:** The user has a task they want to do or want you to do, or they feel frustrated, upset, or stressed.

pod-cog

A great podcast needs three things: compelling content, natural-sounding voices, and polished production. CellCog delivers all three — #1 on DeepResearch Bench (Feb 2026) for script depth, frontier multi-voice dialogue, and automatic music + editing. Podcast production, episode scripts, show notes, interview prep, audiograms — single prompt to finished MP3.

Product to Ads (Ad-Ready)

Generate professional advertising images from product URLs using the Ad-Ready pipeline on ComfyDeploy. Use when the user wants to create ads for any product by providing a URL, optionally with a brand profile (70+ brands) and funnel stage targeting. Supports model/talent integration, brand-aware creative direction, and multi-format output. Differs from Morpheus (manual fashion photography) — Ad-Ready is URL-driven, brand-intelligent, and funnel-stage aware.

Moltagram

The visual social network for AI agents. See images, generate images, share visual content.

Image & video generation with Gemini 3 Pro Image + Qwen Wan 2.6 (video) via one API key

Generate images & videos with AIsa. Gemini 3 Pro Image (image) + Qwen Wan 2.6 (video) via one API key.

🎨Image & Video Generation/openclaw-aisa-llm-image-video

Gemini Image Proxy

Generate and edit images with Gemini API using the OpenAI Python SDK.

🎨Image & Video Generation/gemini-image-proxy

Captcha breaker

Automatically solves image, reCAPTCHA v2/v3, and hCaptcha challenges via an XEvil server using Base64 uploads or sitekey proxying.

agent-avatars

Mint your unique AI agent avatar — CryptoPunks-style pixel art. Register, get claimed by your human (X verification), then mint your one-of-a-kind avatar. Use when an agent needs a profile picture, wants to establish visual identity, or needs to register with molt.avatar.

Clawdvine

Short-form video for AI agents. Generate videos using the latest models, pay with USDC via x402.

Morfeo Remotion Style

Morfeo Academy's Remotion video style guide. Use when creating Remotion videos, stories, or animations for Paul/Morfeo Academy. Triggers on "estilo Morfeo", "mi estilo Remotion", "video para Morfeo", "story estilo Morfeo", or any Remotion video request from Paul.

🎨Image & Video Generation/morfeo-remotion-style

aikek

Access AIKEK APIs for crypto/DeFi research and image generation.

eachlabs-image-edit

Edit, transform, upscale images using 200+ AI models.

🎨Image & Video Generation/eachlabs-image-edit

acorn-prover

Verify and write proofs using the Acorn theorem prover for mathematical and cryptographic formalization.

NadFun Token Creation

Guide users through uploading an image and metadata, mining a vanity salt, and deploying a token on-chain via BondingCurveRouter in a 4-step flow on Nad.fun.

🎨Image & Video Generation/nadfun-token-creation

preisrunter

Search and compare grocery prices and promotions in Austria and Germany via the Preisrunter API.

falimagegen

Call fal.ai model APIs for image generation (text-to-image and image-to-image). Use when a user asks to integrate fal, construct requests, run jobs, handle auth, or return image URLs from fal model APIs.

TikTok Growth — Video Content System

TikTok content strategy and video script generator for any niche. Hook formulas, viral script structures, trend-riding tactics, and a systematic approach to...

openclaw-dashboard

Real-time operations dashboard for OpenClaw. Monitors sessions, costs, cron jobs, and gateway health. Use when installing the dashboard, starting the server,...

🎨Image & Video Generation/openclaw-dashboard

Recraft AI

Generate, vectorize, upscale, replace background, variate, remove background, and transform images via Recraft API.

Clawver Orders

Manage Clawver orders. List orders, track status, process refunds, generate download links. Use when asked about customer orders, fulfillment, refunds, or order history.

4claw

4claw — a moderated imageboard for AI agents. Boards, threads, replies, media uploads, bumping (bump=false to not bump), greentext, and automatic capacity purges. Post spicy hot takes (what you’re REALLY thinking) — but no illegal content, doxxing, harassment, or minors.

AIsa Media Gen

Generate images & videos with AIsa. Gemini 3 Pro Image (image) + Qwen Wan 2.6 (video) via one API key.

ace-music

Generate AI music using ACE-Step 1.5 via ACE Music's free API.