
Site Reliability Engineer

Matthew Navale

SRE turned AI Engineer.

Building internal tools for the SRE team using agentic AI coding — automating workflows, enhancing monitoring capabilities, and enabling AI adoption across a fintech engineering team.

Skills & Technologies

Tools and disciplines I use day-to-day in reliability engineering.

Linux · Python · Docker · Terraform · Grafana · Cloud Infrastructure · Incident Response · AI Engineering · Agentic AI · Computer Vision · CNN / Model Training · Git

Featured Project

What I'm most proud of right now.

Spotlight Project

Watchtower

Real-time infrastructure status dashboard built for the SRE team — monitoring 20+ metrics across AWS, Wazuh SIEM, Cloudflare, and custom servers with WebSocket live updates.

  • Real-time WebSocket updates across 20+ configurable metrics — AWS CloudWatch, Wazuh SIEM, Cloudflare, custom servers
  • Full incident lifecycle: creation, linked metrics, screenshot evidence uploads to S3, resolution tracking
  • Role-based access control — superadmin, operator, viewer — with session cookies and API key auth
  • Webhook system with SSRF prevention, n8n workflow integration, and a dedicated MCP server for AI tooling
React 18 · Vite · Node.js · Express · Socket.io · PostgreSQL · Docker · Nginx · AWS S3
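
The page does not say how Watchtower implements SSRF prevention for webhooks. As a rough illustration only, one common building block is refusing to deliver to loopback, private, or link-local addresses. The sketch below is an assumption, not Watchtower's code: `parseIPv4` and `isBlockedWebhookAddress` are hypothetical names, and the check covers IPv4 literals only.

```typescript
// Hypothetical helpers for illustration; not taken from Watchtower's source.

// Parses a dotted-quad IPv4 string into four octets, or returns null if malformed.
function parseIPv4(ip: string): number[] | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const part of parts) {
    // Plain decimal only. Leading zeros are rejected because some parsers
    // read them as octal, which makes the real target ambiguous.
    if (!/^(0|[1-9]\d{0,2})$/.test(part)) return null;
    const n = Number(part);
    if (n > 255) return null;
    octets.push(n);
  }
  return octets;
}

// Returns true when a webhook target address should be refused.
function isBlockedWebhookAddress(ip: string): boolean {
  const octets = parseIPv4(ip);
  if (octets === null) return true; // fail closed on anything unparseable
  const a = octets[0];
  const b = octets[1];
  return (
    a === 0 ||                          // 0.0.0.0/8 "this network"
    a === 10 ||                         // 10.0.0.0/8 private
    a === 127 ||                        // 127.0.0.0/8 loopback
    (a === 100 && b >= 64 && b <= 127) || // 100.64.0.0/10 carrier-grade NAT
    (a === 169 && b === 254) ||         // 169.254.0.0/16 link-local (cloud metadata)
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12 private
    (a === 192 && b === 168) ||         // 192.168.0.0/16 private
    a >= 224                            // multicast and reserved
  );
}
```

A check like this is only useful when applied to the address the hostname actually resolves to at request time, and again after any redirect. IPv6 targets would need their own rules.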
watchtower · live
GET /api/metrics/sections
20 metrics · 4 sections · 3 users online
 
POST /api/status
✓ cloudflare.cdn operational
✓ aws.cloudwatch operational
⚠ wazuh.siem caution
→ incident #47 auto-created
→ webhooks fired · audit logged
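
The panel above suggests that a non-operational status report opens an incident. A minimal sketch of that rule follows, under stated assumptions: the type names, the `openIncidents` function, and the `"down"` status level are hypothetical and do not come from Watchtower itself; only `operational` and `caution` appear on this page.

```typescript
// Hypothetical types for illustration; not taken from Watchtower's source.
type Status = "operational" | "caution" | "down"; // "down" is an assumed third level

interface StatusReport {
  metric: string; // e.g. "wazuh.siem"
  status: Status;
}

interface Incident {
  id: number;
  metric: string;
  status: Status;
}

// Creates one incident per report that is not "operational",
// numbering them sequentially from nextId.
function openIncidents(reports: StatusReport[], nextId: number): Incident[] {
  const incidents: Incident[] = [];
  let id = nextId;
  for (const report of reports) {
    if (report.status !== "operational") {
      incidents.push({ id: id, metric: report.metric, status: report.status });
      id += 1;
    }
  }
  return incidents;
}

// Usage mirroring the panel: two healthy metrics, one in caution.
const opened = openIncidents(
  [
    { metric: "cloudflare.cdn", status: "operational" },
    { metric: "aws.cloudwatch", status: "operational" },
    { metric: "wazuh.siem", status: "caution" },
  ],
  47
);
console.log(opened.map((i) => "incident #" + i.id + " for " + i.metric));
```

Keeping the decision in a pure function like this makes it easy to test separately from the webhook and audit-log side effects the panel also mentions.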
 

More Projects

Other things I've built.