Colton Shepard
Senior infrastructure and data engineer for complex distributed systems.
Built and owned petabyte-scale databases, Meta-scale load testing, and
10+ TB/day data pipelines. Comfortable in customer-facing roles.
Skills
Technologies
- Languages: Python, Bash, SQL
- Databases: PostgreSQL, BigQuery
- AI: Claude Code, Gemini, agentic workflows, custom skill/subagent development
- Pipelines: Airflow, Fivetran, WAL decoding
- Cloud: GCP, Azure, AWS, Terraform, Ansible
Buzzword Compliance
- Automation, alerting, full stack, load testing, microservices, observability
- Cloud-native, enterprise support, high availability, platform engineering
- Data pipelines, distributed databases, change data capture, streaming, emergent features
- Massive-scale data, agentic AI, context engineering, RAG, MCP
Employment
Production Engineer, E5 — Meta
March 2025 – July 2026
- Ran large-scale production load tests to determine hardware needs.
- Built and maintained load testing tooling and integrations.
- Drove onboarding initiatives for load testing and capacity tools.
- Projected hardware savings total more than 20,000 servers.
- Integrated health check tooling with load testing features.
- Drove a large refactor to allow dynamic targeting, enabling fully automated tests.
- Worked in a heavily AI-assisted development workflow, with extensive use of custom skills and subagents.
Senior Software Engineer — TRM Labs
May 2021 – February 2025
- Designed, built, scaled, and operated data pipelines and infrastructure.
- Distributed Postgres (Citus) serving layer with over 1 PB of disk.
- Airflow-orchestrated BigQuery pipeline with 10+ TB of daily updates.
- Operational tooling, alerting, systems design, training, and maintenance.
- Served data for regulatory compliance and forensic investigations.
Solutions Engineer / Software Engineer II — Microsoft (Citus Data)
May 2018 – May 2021
- Helped customers build and deploy 500 TB+ PostgreSQL clusters.
- Performed low-downtime cross-architecture multi-TB migrations.
- Built migration, monitoring, update, and analysis tools.
- Held product management, account management, and community outreach roles.
- Gave conference talks, including at PostgresOpen (see Public Appearances below).
Applications Engineer — Alarm.com (formerly iControl Networks)
June 2016 – May 2018
- Production firmware deployments for 1M+ ADT Pulse systems and 5M+ security devices.
- Built monitoring, update, analysis, and reporting tooling.
- Analyzed stability, responsiveness, and business intelligence metrics across the fleet.
Earlier experience
- Accellion — Support Engineer (March 2015 – June 2016)
- Blackboard (via Sutherland Global Services) — Technical Support (June 2008 – March 2015)