Data & Analytics: ChatGPT Prompts for Analysts

A curated collection of ChatGPT prompts for data analysts — SQL, data cleaning, visualization, statistical analysis, storytelling, and more.

June 10, 2026
chatgpt, data-analysis, analytics, prompts, templates, sql, python, professional

Data & Analytics Prompts

Prompt templates organized by analysis task. Each template assumes you supply the data context: schema, sample rows, and the business question.

SQL Generation

Query from Description

Given this table schema:
[paste CREATE TABLE statements or column descriptions]

Write a SQL query that: [describe what you need].

Requirements:
- Handle NULL values explicitly (don't assume data is clean)
- Use CTEs instead of subqueries for readability
- Add comments explaining non-obvious logic
- Include a note about which indexes would speed this up

Output the query and a plain-English summary of what it does.
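For orientation, here is a minimal sketch of the kind of answer this prompt is aiming for, run through Python's built-in sqlite3 module. The `orders` table, its columns, and the sample rows are hypothetical and exist only to show explicit NULL handling inside a CTE; a real response would be written against your own schema and SQL dialect.

```python
import sqlite3

# Hypothetical schema and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer TEXT,   -- may be NULL
    amount   REAL    -- may be NULL
);
INSERT INTO orders (customer, amount) VALUES
    ('acme', 120.0), ('acme', NULL), (NULL, 40.0), ('globex', 75.5);
""")

QUERY = """
-- Step 1: make NULL handling explicit instead of letting rows vanish silently
WITH cleaned AS (
    SELECT
        COALESCE(customer, '(unknown)') AS customer,
        COALESCE(amount, 0.0)           AS amount,
        amount IS NULL                  AS amount_missing  -- 1 if missing, else 0
    FROM orders
)
-- Step 2: total per customer, plus a count of amounts that were missing
SELECT customer,
       SUM(amount)         AS total_amount,
       SUM(amount_missing) AS missing_amounts
FROM cleaned
GROUP BY customer
ORDER BY customer;
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Reporting the count of missing amounts next to the total keeps the imputation visible, so a reader can tell a genuine zero from a filled-in one.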

Query Optimization

This query is running slowly on a [X million]-row table:
[paste current query]

Current execution time: [X seconds]. Target: [Y seconds].

Optimize this query:
1. Identify the bottleneck (scan, join, sort)
2. Rewrite the query for better performance
3. Suggest indexes that would help
4. Explain why each change improves performance

If the query can't be optimized further without schema changes,
say so and explain what schema change would help.
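Before pasting a slow query into the prompt, it helps to include the query plan as well. The sketch below shows one way to compare the plan before and after adding an index, using SQLite's `EXPLAIN QUERY PLAN` through Python's sqlite3 module. The `events` table and the index name are hypothetical, and other databases expose plans differently (for example `EXPLAIN` or `EXPLAIN ANALYZE`), so treat this as an illustration of the workflow rather than a recipe for your engine.

```python
import sqlite3

# Hypothetical table with synthetic rows, used only to produce a query plan.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO events (user_id, created_at) VALUES (?, ?)",
    [(i % 100, f"2024-01-{(i % 28) + 1:02d}") for i in range(1000)],
)

SLOW_QUERY = "SELECT COUNT(*) FROM events WHERE user_id = 42"


def plan(sql: str) -> str:
    """Return SQLite's plan for `sql`; the last column of each row is the detail text."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(SLOW_QUERY)  # expected to be a full table scan
conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")
plan_after = plan(SLOW_QUERY)   # expected to use the new index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan varies between SQLite versions, but the shift from a scan to an index search is the kind of evidence the "identify the bottleneck" step asks for.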

Window Function Usage

I need to calculate [metric] for each [grouping] over [time window].

Table: [table name]
Key columns: [list]. Time column: [column name].

Write a query using window functions that:
1. Computes the metric per group
2. Includes a running total or moving average
3. Ranks groups by the metric
4. Handles gaps in time data (missing days/weeks)

Provide the query and explain the window frame specification.
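A response to this prompt might look roughly like the sketch below, which fills missing days from a generated calendar and then applies a running total and a 3-day moving average. The `daily_sales` table and its data are hypothetical, the ranking step is left out for brevity, and the example assumes a SQLite build with window-function support (3.25 or later) behind Python's sqlite3 module.

```python
import sqlite3

# Hypothetical data with deliberate gaps in the dates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_sales (region TEXT, sale_date TEXT, revenue REAL);
INSERT INTO daily_sales VALUES
    ('north', '2024-01-01', 100.0),
    ('north', '2024-01-02', 50.0),
    ('north', '2024-01-04', 30.0),
    ('south', '2024-01-01', 20.0),
    ('south', '2024-01-04', 80.0);
""")

QUERY = """
WITH RECURSIVE
bounds AS (
    SELECT MIN(sale_date) AS lo, MAX(sale_date) AS hi FROM daily_sales
),
-- One row per calendar day between the first and last sale
calendar(day) AS (
    SELECT lo FROM bounds
    UNION ALL
    SELECT DATE(day, '+1 day') FROM calendar, bounds WHERE day < hi
),
regions AS (SELECT DISTINCT region FROM daily_sales),
-- Every region gets every day; missing days count as zero revenue
filled AS (
    SELECT r.region, c.day, COALESCE(s.revenue, 0.0) AS revenue
    FROM regions r
    CROSS JOIN calendar c
    LEFT JOIN daily_sales s ON s.region = r.region AND s.sale_date = c.day
)
SELECT region, day, revenue,
       SUM(revenue) OVER (PARTITION BY region ORDER BY day
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total,
       AVG(revenue) OVER (PARTITION BY region ORDER BY day
                          ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_3d
FROM filled
ORDER BY region, day;
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The frame `ROWS BETWEEN 2 PRECEDING AND CURRENT ROW` counts rows, not days. It only equals a 3-day window because the gaps were filled first, which is why the prompt asks for gap handling and a frame explanation together.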

Python & Pandas

Data Cleaning Pipeline

Write a Python function that cleans a dataset with these issues:
[List specific problems: missing values in column X, outliers in Y,
 inconsistent date formats in Z, duplicate rows based on columns A and B, etc.]

Requirements:
- Use pandas
- Log what was cleaned (rows dropped, values imputed) for audit trail
- Return both the cleaned DataFrame and a cleaning summary
- Assume the input is a CSV with columns: [list column names and types]

Include docstring and type hints.

Statistical Analysis

I have a DataFrame with columns: [list columns with types].

Research question: [what I want to test or understand].

Design the analysis:
1. What statistical test is appropriate and why?
2. What assumptions does this test require? How would I check them?
3. What would a significant vs non-significant result tell me?
4. Write the Python code to run this analysis using scipy/statsmodels
5. Interpret the output — what numbers to look at and what they mean

Data Visualization

Chart Selection

I need to visualize this data: [describe data structure and what I want to show].

Audience: [executives/analysts/public]. Medium: [dashboard/presentation/report].

Recommend the best chart type:
1. Why this chart over alternatives (name 2 alternatives and why they're worse)
2. What to put on each axis
3. Color strategy (categorical, sequential, diverging)
4. Common misinterpretation of this chart type and how to prevent it
5. Annotations or reference lines that would add context

Chart Description for AI Image Tools

Describe a chart that shows [data relationship] for use with an AI image generator.

Provide:
1. A detailed text description of the chart (type, axes, data points, labels, annotations)
2. Key insights the chart should communicate
3. Suggested color palette with hex codes
4. What to omit — chart junk that would distract from the message

Make it specific enough that someone could sketch the chart from this description.

Analysis & Storytelling

Finding the Narrative

Here's a dataset analysis result: [paste summary statistics, correlation matrix,
 or key findings].

The audience is [describe]. They care about [what matters to them].

Transform this into a narrative:
1. The headline insight — one sentence they'll remember
2. 3 supporting findings with specific numbers
3. One "surprising" finding that challenges assumptions
4. Recommended action based on the data
5. One caveat or limitation to acknowledge

Keep it under 300 words. No jargon unless the audience uses it.

Stakeholder Q&A Prep

I'm presenting this analysis to [stakeholder type]. 

Key findings: [list 3-5 findings].

Anticipate their questions:
1. "How confident are you in these numbers?" — prepare a response with confidence intervals
2. "What would change if we looked at [alternative segment/time period]?"
3. "How does this compare to [benchmark/competitor/previous period]?"
4. "What should we do differently based on this?"
5. "What data are we missing that would make this more definitive?"

For each, provide a 2-3 sentence response I can deliver verbally.

A/B Testing

Test Design

Design an A/B test for [change we want to test].

Current metric: [baseline value]. Minimum detectable effect: [X%, stated as relative or absolute].
Significance level: 0.05. Power: 0.80.

Provide:
1. Sample size required per variant
2. Duration needed given our daily traffic of [X users/day]
3. Primary metric and 3 guardrail metrics (metrics that must not get worse)
4. Ramp-up plan (start at 1%, increase to 50%)
5. When to stop the test early (a stopping rule defined before launch, so that checking results mid-test does not inflate the false-positive rate)
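To sanity-check the sample size and duration a model returns, the arithmetic can be reproduced with the standard library. The sketch below uses the usual normal-approximation formula for comparing two proportions with a two-sided test. It assumes the minimum detectable effect is a relative lift on a conversion-rate metric and that traffic is split evenly; the function names are illustrative, not taken from any library.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline: float, relative_mde: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per variant for a two-sided two-proportion test.

    Assumes `relative_mde` is a relative lift (0.10 means +10% over baseline).
    """
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def days_needed(n_per_variant: int, daily_users: int, variants: int = 2) -> int:
    """Days to reach the target, assuming all daily users enter the test."""
    return ceil(n_per_variant * variants / daily_users)


# Hypothetical inputs: 10% baseline conversion, 10% relative lift, 2,000 users/day.
n = sample_size_per_variant(baseline=0.10, relative_mde=0.10)
print(n, "users per variant,", days_needed(n, daily_users=2000), "days")
```

Halving the detectable effect roughly quadruples the required sample, so it is worth settling the minimum detectable effect before asking for a duration estimate. A ramp-up that starts at 1% exposure lengthens the test beyond what `days_needed` reports.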

Results Interpretation

A/B test results:
- Control: [N] users, conversion rate [X%]
- Variant: [N] users, conversion rate [Y%]
- P-value: [number]

Analyze these results:
1. Is the difference statistically significant? At what confidence level?
2. Is it practically significant? (Does the effect size matter for the business?)
3. What's the confidence interval for the lift?
4. Should we ship the variant? If not, what additional data would change your recommendation?
5. What are 2 alternative explanations for the result (besides the variant causing the change)?
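The significance and confidence-interval questions above can be cross-checked by hand with a two-proportion z-test. The sketch below is a minimal standard-library version under the normal approximation; the function name and the example counts are hypothetical, and the interval it returns is for the absolute difference in conversion rates, not the relative lift.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(conv_a: int, n_a: int, conv_b: int, n_b: int,
                         alpha: float = 0.05) -> dict:
    """Two-sided z-test for a difference in conversion rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a

    # Pooled standard error for the test statistic (null hypothesis: equal rates)
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the confidence interval on the difference
    se_unpooled = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

    return {"diff": diff, "z": z, "p_value": p_value, "ci": ci}


# Hypothetical counts: 1,000 of 10,000 control users convert vs 1,100 of 10,000 variant users.
result = two_proportion_ztest(1000, 10_000, 1100, 10_000)
print(result)
```

If the p-value reported in the prompt disagrees noticeably with this calculation, that usually points to a different test (one-sided, or a sequential method) and is worth mentioning in the prompt.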

Note:

Replace bracketed placeholders with actual schema, data, and business context. The more precise your input, the more actionable the output.