
Brandon Kauffman
Linux Administration
RHCSA and Linux+ certified. I have managed 300+ RHEL servers, implementing Satellite and IAM and automating deployments with Ansible. I led regular system performance monitoring and tuning, including a 20% increase in IO throughput.
Kubernetes
CKA and CCA certified. I have built and managed on-prem and EKS cloud Kubernetes clusters. I used advanced networking features to integrate with existing infrastructure.
I managed a 20-node physical cluster that monitored our on-prem environment of 800 VMs and 7,000 IoT devices at Liberty University. This cluster also served as the monitoring system for our larger OpenShift cluster.
In EKS, I led a migration to spot instances where possible in an autoscaling group, reducing costs by 70% at Pulsar Informatics Inc. I automated deployments across accounts using Terraform and Flux.
Rust
I have built web servers, front ends, BPF programs, and other tools using Rust. It is my primary programming language for LeetCode and new projects. I often use Tauri to create desktop applications that speed up my workflow. I've contributed to several open source Rust projects.
Go
I have led development on multiple projects, such as an internal alert management service. Over the course of a year, the project reduced MTTA by 20% and MTTR by ~80%. I have also used Go to create many internal Prometheus exporters. I've contributed to several open source Go projects.
Python
I have built and maintained a variety of internal Python packages for a team of 12 others. I've done data analysis on Oracle and Microsoft SQL Server with Python. I have built and contributed to Django applications with GraphQL and REST. I've implemented OTEL in an OSS project and modified open source libraries.
Site Reliability Engineering
I have SRE experience with a variety of monitoring products. I have maintained Elasticsearch, Grafana, Prometheus, OTEL with ClickHouse, and Thanos.
I have used those products, Datadog, and CloudWatch to implement SLOs. Using these tools to guide deployment strategies, I increased uptime for crucial services from ~99.997% to ~99.99999%.
I have worked with projects in Go, Python, Java, and PHP to implement RUM, APM, profiling, and distributed tracing. I've assisted developers in debugging applications and reduced average response time by ~20%.
I've used SNMP to monitor network and physical appliances that support infrastructure and created dashboards and reports to improve capacity planning.
Kafka/Redpanda
I maintained a production Kafka cluster with over 250 topics on-prem. The largest topic ingested 100,000 events per second. After switching to Redpanda, this topic sustained 10,000,000 events per second once a failed consumer had been restored. I also reduced the server count by 40% through better tuning.
AWS
I am a certified AWS Solutions Architect Professional. By applying best practices, I have reduced cloud spend by ~15% across various services. I have implemented security practices to harden our organization and improved our account structure for better management and scalability.
DevOps
I have managed CI/CD pipelines with a variety of tools. These pipelines covered mobile workloads, Terraform, Ansible, Flux, and more.
I reduced CI/CD runtime by 60% by applying best practices and developed a self-hosted auto-scaling cluster to process the pipelines. I automated alert resolutions for various systems by creating a service to process alerts. I managed a variety of Terraform modules to deploy our services with IaC.
I was the SME for a variety of databases such as Postgres and MongoDB. I managed regular backups, replication, and major version upgrades. In Postgres, I tuned parameters to optimize for dev, staging, and prod workloads and reduced query times.