This post describes a real project I worked on. Names, specific systems, and identifying details have been changed. The technical problems and approaches are real, but code snippets are simplified illustrations not production code.
I Built 5 Dashboards. Leadership Used 1.
In my first month, I was asked to "build some dashboards."
So I built five. Operations overview. Asset health. Engineer performance. Ticket trends. Capacity planning.
Lots of graphs. Lots of data. Real-time updates. I was proud.
Two months later, I checked the access logs:
- Operations Overview: 12 views (all me testing)
- Asset Health: 3 views
- Engineer Performance: 847 views
- Ticket Trends: 8 views
- Capacity Planning: 2 views
One dashboard got used constantly. The others were abandoned.
I needed to understand why.
What Made the Winner Different
I interviewed the people who actually used the Engineer Performance dashboard.
Why do you check it?
- "I need to know which engineers are overloaded"
- "My boss asks 'how's the team?' every standup"
- "I want to catch problems before they become fires"
What do you do after?
- "Reassign tickets if someone's swamped"
- "Screenshot it for my boss"
- "Ping someone directly if their queue is stuck"
What would make you stop using it?
- "If the data was wrong"
- "If it took more than 10 seconds to load"
- "If I had to explain what the graphs mean"
The pattern: they used it because it answered a question they got asked every single day.
Why the Others Failed
Operations Overview: 20 panels showing everything. Information overload. Nobody knew where to look.
Asset Health: Beautiful technical metrics. Signal distributions. Capacity histograms. But leadership doesn't care about dBm values. They care about "are we meeting SLAs?" I was showing data. They wanted answers.
Ticket Trends: Gorgeous time-series. But "tickets up 10% this week" doesn't prompt action. So what? What do I do?
Capacity Planning: Updated monthly. They looked once, said "cool," never came back. No daily utility.
The Winning Formula
The Engineer Performance dashboard worked because:
- Answered a specific question: "How's my team right now?"
- Prompted immediate action: Overloaded → reassign tickets
- Was dead simple: 4 panels, not 20
- Updated in real-time: Always current
- Explained itself: Green = good. Red = act.
The Glanceability Principle
Leadership has 30 seconds, not 30 minutes.
Before:
```
P95 Latency: 234ms
P99 Latency: 892ms
Error Rate: 0.3%
Request Count: 45,892
Cache Hit Rate: 94.2%
```

After:
```
┌─────────────────────────┐
│ SYSTEM HEALTH: OK       │
│ All metrics normal      │
│ Last incident: 12 days  │
└─────────────────────────┘
```

One big status. Details on click. The summary visible instantly.
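The rollup behind a "one big status" panel can be sketched as a worst-case threshold check. This is a minimal illustration, not the actual panel logic; the metric names and limits below are invented for the example:

```python
# Collapse many metrics into one glanceable status.
# Metric names and limits are illustrative, not from a real system.
METRIC_LIMITS = {
    "p95_latency_ms": 500,
    "p99_latency_ms": 2000,
    "error_rate_pct": 1.0,
}

def system_status(metrics: dict) -> str:
    """Return OK / WATCH / ACT based on the metric closest to its limit."""
    worst = max(metrics[name] / limit for name, limit in METRIC_LIMITS.items())
    if worst >= 1.0:
        return "ACT"    # red: something crossed its limit
    if worst >= 0.8:
        return "WATCH"  # yellow: approaching a limit
    return "OK"         # green: all clear

print(system_status({"p95_latency_ms": 234, "p99_latency_ms": 892, "error_rate_pct": 0.3}))
# → OK
```

The detail metrics still exist; they just live one click away, behind the single word leadership actually reads.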
Color as Language
I stopped thinking about colors as decoration and started using them as a communication system:
Red = Act now (something's broken)
Yellow = Watch this (approaching threshold)
Green = All good (relax)

No legend needed. No explanation needed. Universal.
The Grafana threshold config:
```yaml
thresholds:
  - value: 0
    color: green
  - value: 10
    color: yellow  # 10+ tickets = getting heavy
  - value: 15
    color: red     # 15+ tickets = overloaded
```

The colors do the talking. Engineers don't read numbers; they see a wall of green and keep walking, or see red and stop.
The One Number That Matters
Every leadership meeting, same question: "How are we doing on SLA?"
I built one panel just for this:
```sql
SELECT
  ROUND(
    COUNT(*) FILTER (WHERE resolved_within_sla) * 100.0 / COUNT(*),
    1
  ) AS sla_percentage
FROM tickets
WHERE created_at > NOW() - INTERVAL '7 days';
```

Display: one big number. 97.3%
Below it: Target 95% - Above target
That panel became the screenshot in every status email. One number that answers the question they actually ask.
Actionable vs. Informational
Every panel should answer: "What do I do about this?"
Informational: "14 high-priority tickets in queue"
Actionable: "14 high-priority tickets, 8 unassigned → [ASSIGN NOW]"
The link goes directly to the assignment page. One click from seeing the problem to fixing it.
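The same pattern can be produced straight from the query layer. Here's a hedged sketch using Python's sqlite3 as a stand-in for the real ticket database; the schema, data, and assignment URL are all invented for illustration:

```python
import sqlite3

# In-memory stand-in for the real ticket store; schema and URL are hypothetical.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tickets (id INTEGER, priority TEXT, assignee TEXT)")
db.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [(1, "high", None), (2, "high", "kim"), (3, "high", None), (4, "low", None)],
)

total, unassigned = db.execute(
    "SELECT COUNT(*),"
    "       SUM(CASE WHEN assignee IS NULL THEN 1 ELSE 0 END)"
    " FROM tickets WHERE priority = 'high'"
).fetchone()

# The count and the action link live in one panel string:
panel = (f"{total} high-priority tickets, {unassigned} unassigned "
         f"→ [ASSIGN NOW](/tickets/assign?filter=unassigned)")
print(panel)
```

The key design choice is that the query computes the *gap* (unassigned work), not just the total, because the gap is what someone can act on.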
One Dashboard, One Audience
I stopped building "one dashboard to rule them all."
| Audience | Dashboard Focus |
|---|---|
| Engineers | My tickets, my metrics |
| Team Leads | Team workload, blockers |
| Directors | Summary KPIs, trends |
Same data. Different views. Different questions answered.
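"Same data, different views" can be sketched as three small view functions over one dataset. The fields, names, and numbers here are invented for illustration:

```python
# One dataset, three audience-specific views. All fields are illustrative.
tickets = [
    {"assignee": "ana", "team": "network", "status": "open"},
    {"assignee": "ana", "team": "network", "status": "done"},
    {"assignee": "raj", "team": "network", "status": "open"},
    {"assignee": "liv", "team": "field",   "status": "open"},
]

def engineer_view(who):
    """Engineer: just my open tickets."""
    return [t for t in tickets if t["assignee"] == who and t["status"] == "open"]

def lead_view(team):
    """Team lead: open tickets per person, to spot overload."""
    counts = {}
    for t in tickets:
        if t["team"] == team and t["status"] == "open":
            counts[t["assignee"]] = counts.get(t["assignee"], 0) + 1
    return counts

def director_view():
    """Director: one summary KPI."""
    done = sum(t["status"] == "done" for t in tickets)
    return f"{done / len(tickets):.0%} resolved"

print(lead_view("network"))   # open-ticket counts per engineer
print(director_view())
```

Each function answers exactly one audience's question; none of them tries to be the view for everyone.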
The Rebuilt Asset Health Dashboard
Old: 15 panels of technical metrics.
New: Three sections answering one question: "What needs attention?"
- Critical (act today): 2 specific items with links
- Warning (this week): 23 items summarized by category
- Healthy: "2,847 of 2,870 operating normally"
Open, see green, close. Open, see red, click the link. Ten seconds.
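The three-section layout maps onto a simple triage function. This is a sketch under invented field names and thresholds, not the dashboard's actual classification rules:

```python
# Bucket assets by urgency. Field names and thresholds are illustrative.
def triage(assets):
    buckets = {"critical": [], "warning": [], "healthy": []}
    for a in assets:
        if a["days_to_failure"] <= 1:
            buckets["critical"].append(a["id"])   # act today
        elif a["days_to_failure"] <= 7:
            buckets["warning"].append(a["id"])    # this week
        else:
            buckets["healthy"].append(a["id"])
    return buckets

assets = [
    {"id": "ant-01", "days_to_failure": 0},
    {"id": "ant-02", "days_to_failure": 5},
    {"id": "ant-03", "days_to_failure": 90},
]
b = triage(assets)
print(f"Critical (act today): {len(b['critical'])}")
print(f"Healthy: {len(b['healthy'])} of {len(assets)} operating normally")
```

Critical items get individual lines with links; warning and healthy items only ever appear as counts, because counts are all anyone needs from them.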
Results
| Dashboard | Views/mo (before) | Views/mo (after) |
|---|---|---|
| Engineer Performance | 847 | 1,200+ |
| Asset Health | 3 | 340 |
| Operations | 12 | 180 |
More importantly: people made decisions from dashboards instead of asking me for custom reports.
The Dashboard Checklist
Before publishing, I ask:
- Who is the audience? (Not "everyone")
- What specific question does this answer?
- Can someone understand it in 10 seconds?
- Does every panel prompt a clear action?
- Have I removed everything non-essential?
- Have I tested it with an actual user?
If I can't check all boxes, I don't ship it.
Lessons Learned
Dashboards compete with doing nothing. If checking takes longer than asking someone, they'll ask. Make it faster.
Fewer panels, more impact. Every panel dilutes the others. 20 panels = paralysis. 4 panels = clarity.
Design for the question, not the data. Don't show "ticket count by status." Answer "do I need to act right now?"
Test with real users. I thought Operations was useful. Access logs said otherwise.
Related Reading
- Turning a 4-Hour Report Into a Button Click - The data behind these dashboards
- When the Frontend Sends a Query as a String - When dashboards aren't enough
- Scheduled Jobs That Actually Recover - Keeping the data fresh
