Multimodal Prompting

Multimodal prompting combines text with images, audio, or video to give AI models richer context. Modern models like GPT-4o, Claude 3.5, and Gemini can process multiple input types simultaneously, enabling more natural and capable interactions.

Text + Image Prompting

Image Analysis

[Attach image]
What objects are in this image? List them with their approximate positions.

Image Comparison

[Attach image 1]
[Attach image 2]

Compare these two designs. Identify:
1. Key differences
2. Which follows better UX principles
3. Specific improvements for each

Code from Screenshot

[Attach screenshot of code or UI]

Convert this to working code. Include:
- Exact layout structure
- All text content
- Styling details

Text + Audio Prompting

Transcription + Analysis

[Attach audio file]

1. Transcribe the audio
2. Identify key points discussed
3. Extract action items with owners
4. Note any decisions made

Voice Instructions

[Attach voice memo]

Based on these voice notes:
1. Create a structured outline
2. Fill in missing details where unclear
3. Suggest additional points to consider

Best Practices

Image Prompting

Be specific about what you want analyzed
Reference specific parts of the image when needed
Provide context for ambiguous images
Use high-quality, clear images

Audio Prompting

Specify if you need verbatim or summary
Note the language if not English
Indicate speaker identification needs
Mention background noise handling

Modality Combinations

Combination	Use Cases
Text + Image	Design review, code conversion, visual Q&A
Text + Audio	Meeting notes, voice memos, transcription
Text + Video	Content analysis, tutorial creation
Image + Text + Audio	Comprehensive documentation

Prompt Templates

Image Description:

Describe this image in detail, covering:
- Main subjects and their attributes
- Setting and background
- Colors, lighting, and mood
- Any text visible in the image

Visual Comparison:

Compare these two images focusing on:
1. Structural differences
2. Color and style variations
3. Quality and clarity
4. Which better achieves [stated goal]

Audio Summary:

From this audio recording:
1. Provide a 3-sentence summary
2. List key topics discussed
3. Extract direct quotes for important points
4. Identify any unresolved questions

Data & Analytics: ChatGPT Prompts for Analysts

A curated collection of ChatGPT prompts for data analysts — SQL, data cleaning, visualization, statistical analysis, storytelling, and more.

ChatGPT Resume Writing Guide: Create Professional Resumes

Learn how to leverage ChatGPT to craft compelling, ATS-optimized resumes that highlight your skills and achievements. Includes templates, prompts, and expert tips.

Fantasy & Isekai SREF Codes for Midjourney

Epic fantasy worlds with detailed environments and RPG-inspired aesthetics for Midjourney prompts.

Multimodal Prompting