Monitoring

Prometheus + Grafana Signals Lab

SLO drafting, recording rules, and alert fatigue triage with synthetic checks that behave like production surprises.

Format
Night labs + mentor AMA
Duration
5 weeks · nightly office hours optional
Tuition (informational)
KRW 1,320,000
Mentor
Mateo Silva

Program narrative

Labs emit intentionally noisy metrics so you practice deciding what to silence and what to fix. We weave incident retrospectives into dashboards, annotating spikes with human-readable context instead of leaving charts empty.

What is included

  • Histogram bucket tuning with concrete SLIs
  • Alertmanager routing trees with on-call shadowing
  • Recording rule cost tradeoff spreadsheet
  • Exemplar tracing bridge to Tempo (read-only)
  • Dashboard review rubric used by release mentors
  • Post-incident template aligned to internal comms style
  • Dark launch metric canary exercise
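The bucket-tuning and recording-rule items above can be sketched as a Prometheus rules file. This is a minimal illustration only; the metric names (`http_request_duration_seconds`, `http_requests_total`) and label values are assumptions, not the lab's actual fixtures.

```yaml
# rules.yml — illustrative sketch, not the lab's solution
groups:
  - name: latency-slis
    rules:
      # Pre-aggregate the p99 latency SLI so dashboards and alerts
      # query one cheap series instead of re-bucketing raw histograms.
      - record: job:http_request_duration_seconds:p99_5m
        expr: |
          histogram_quantile(
            0.99,
            sum by (job, le) (rate(http_request_duration_seconds_bucket[5m]))
          )
      # Error-ratio SLI for a request-based SLO.
      - record: job:http_requests:error_ratio_5m
        expr: |
          sum by (job) (rate(http_requests_total{code=~"5.."}[5m]))
            /
          sum by (job) (rate(http_requests_total[5m]))
```

The recording-rule cost tradeoff in the list above is exactly this: each `record:` line trades a little continuous evaluation cost for much cheaper dashboard and alert queries.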

Outcomes you can show

  1. Ship a three-tier SLO doc tied to business KPIs
  2. Reduce paging noise with documented routes
  3. Facilitate a retro using our annotation pattern
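Outcome 2, reducing paging noise with documented routes, comes down to an Alertmanager routing tree. A minimal sketch follows; the receiver names and `severity` label values are assumptions for illustration.

```yaml
# alertmanager.yml routing fragment — a sketch, not the course's answer key
route:
  receiver: team-slack          # default: batched, low-urgency
  group_by: [alertname, job]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    # Only alerts explicitly labeled critical reach a human's phone.
    - matchers: [severity="critical"]
      receiver: oncall-pager
      repeat_interval: 1h
    # Warnings land in a triage channel instead of paging anyone.
    - matchers: [severity="warning"]
      receiver: team-slack
receivers:
  - name: oncall-pager
  - name: team-slack
```

The documented part matters as much as the YAML: each route should carry a comment or runbook link explaining why it pages (or deliberately does not).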

Mateo Silva

Monitoring specialist; previously embedded with SaaS operations groups in Singapore.

Cohort FAQ


Long-term storage?

We cover Thanos concepts but do not host long-term retention; bring your own vendor or self-hosted storage plan.

Kubernetes required?

ServiceMonitor examples exist, but you can complete core modules with docker-compose profiles.
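A docker-compose profile setup for the core modules might look like the sketch below. Service names and images are assumptions, not the course's actual compose file.

```yaml
# docker-compose.yml fragment — illustrative only
services:
  prometheus:
    image: prom/prometheus:latest
    ports: ["9090:9090"]          # always started
  grafana:
    image: grafana/grafana:latest
    ports: ["3000:3000"]
    profiles: ["dashboards"]      # only with --profile dashboards
  alertmanager:
    image: prom/alertmanager:latest
    profiles: ["alerting"]        # only with --profile alerting
```

Running `docker compose --profile dashboards up` starts Prometheus plus Grafana; services without a `profiles` key start unconditionally, so the base stack works with no flags at all.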

Mentor ratio?

Office hours are shared across cohorts; critical blockers get async Loom walkthroughs within 36 hours.

Experience notes

Histogram lab tied to the Linux cohort journalctl filters—nice cross-course continuity.

Ivy · Early-career developer · 5/5 · survey

Alertmanager routing tree exercise exposed gaps in our on-call tree—constructive discomfort.

Kenji · Support team lead
