Monitoring
Prometheus + Grafana Signals Lab
SLO drafting, recording rules, and alert fatigue triage with synthetic checks that behave like production surprises.
- Format: Night labs + mentor AMA
- Duration: 5 weeks · nightly office hours optional
- Tuition (informational): KRW 1,320,000
- Mentor: Mateo Silva
Program narrative
Labs emit intentionally noisy metrics so you practice deciding when to silence and when to fix. We weave incident retrospectives into dashboards, annotating spikes with human-readable context instead of leaving empty charts.
What is included
- Histogram bucket tuning with concrete SLIs
- Alertmanager routing trees with on-call shadowing
- Recording rule cost tradeoff spreadsheet
- Exemplar tracing bridge to Tempo (read-only)
- Dashboard review rubric used by release mentors
- Post-incident template aligned to internal comms style
- Dark launch metric canary exercise
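To give a flavor of the recording-rule and histogram-bucket work, here is a minimal sketch of a Prometheus recording rule that pre-aggregates a latency SLI. The metric name, rule group name, and window are illustrative assumptions, not actual course material:

```yaml
# prometheus-rules.yml sketch — names are hypothetical.
groups:
  - name: latency-sli-example
    rules:
      # Pre-compute the p99 latency once, so dashboards and alerts
      # query one cheap series instead of re-running histogram_quantile
      # over raw buckets on every refresh (the cost tradeoff the
      # spreadsheet above is about).
      - record: job:http_request_duration_seconds:p99_5m
        expr: |
          histogram_quantile(
            0.99,
            sum by (job, le) (rate(http_request_duration_seconds_bucket[5m]))
          )
```

The tradeoff: each recording rule costs storage and evaluation time on every interval, in exchange for cheaper, faster queries downstream.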
Outcomes you can show
- Ship a three-tier SLO doc tied to business KPIs
- Reduce paging noise with documented routes
- Facilitate a retro using our annotation pattern
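"Documented routes" refers to Alertmanager's routing tree, which decides what pages a human versus what files a ticket. A hedged sketch, assuming hypothetical receiver names and severity labels:

```yaml
# alertmanager.yml fragment — receivers and label values are assumptions.
route:
  receiver: default-ticket          # non-matching alerts file a ticket, never page
  group_by: [alertname, service]
  routes:
    - matchers: ['severity="page"']
      receiver: oncall-pager        # only explicitly paging severities wake someone
    - matchers: ['severity="ticket"']
      receiver: default-ticket
receivers:
  - name: oncall-pager
  - name: default-ticket
```

Making the catch-all receiver non-paging is one common way to reduce noise: nothing pages unless a rule author deliberately labeled it `severity="page"`.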
Mateo Silva
Monitoring specialist; previously embedded with SaaS operations groups in Singapore.
Cohort FAQ
We integrate Thanos concepts but do not host long-term retention—bring your vendor or self-host plan.
ServiceMonitor examples exist, but you can complete core modules with docker-compose profiles.
Office hours are shared across cohorts; critical blockers get async Loom walkthroughs within 36 hours.
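For the docker-compose path mentioned above, a minimal sketch of a profile-gated Prometheus + Grafana stack. Image tags, ports, and the `core` profile name are assumptions, not the cohort's actual files:

```yaml
# docker-compose.yml sketch — tags and ports are illustrative.
services:
  prometheus:
    image: prom/prometheus:v2.53.0
    profiles: [core]                # started with: docker compose --profile core up
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
  grafana:
    image: grafana/grafana:11.1.0
    profiles: [core]
    ports:
      - "3000:3000"
```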
Experience notes
Histogram lab tied to the Linux cohort journalctl filters—nice cross-course continuity.
Ivy · Early-career developer · 5/5 · survey
Alertmanager routing tree exercise exposed gaps in our on-call tree—constructive discomfort.
Kenji · Support team lead