Golang Job: Site Reliability Engineer Lead

Job added on 2025-07-22

Company

Camelot Integrated Solutions I

Location

Plano, Texas - United States of America

Job type

Full-Time

Golang Job Details

The team is looking for a Contract Senior Site Reliability Engineer to join our dynamic and fast-paced team. The ideal candidate will have extensive experience managing large-scale, microservice-based systems, ensuring high availability, and implementing best practices in reliability engineering. You will work closely with development and operations teams to enhance our infrastructure and improve system performance while remaining mindful of cost-effectiveness.

Responsibilities:

Proactively identify performance improvements in areas such as responsiveness, availability, and scalability.
Establish best practices for observability, monitoring, and incident response, and drive their adoption across the organization.
Lead incident response efforts and conduct post-mortem analyses to prevent future occurrences.
Coordinate with Software Engineering and DevOps teams to design, implement, and maintain scalable and reliable systems using Kubernetes, Docker, and Istio.
Monitor system performance and troubleshoot issues proactively, utilizing Datadog for observability.
Implement and tune Horizontal Pod Autoscalers (HPAs) to optimize resource utilization.
Develop and maintain automation tools for deployment, monitoring, and incident response.
Collaborate with software engineering teams to improve system reliability and performance.
Implement A/B deployments, canary deployments, and traffic mirroring strategies to ensure critical updates go smoothly and can be rolled back with minimal impact if necessary.
Mentor junior engineers and contribute to team knowledge sharing.
Oversee and coordinate with SREs in other parts of the world, ensuring effective collaboration during on-call rotations.
Establish and enforce best practices for system reliability and performance across the organization.
Utilize Helm charts for application deployment and management.
Understand and implement AWS systems, including AWS Load Balancers and routing, to support systems handling millions of requests per hour.
Participate in on-call rotations and provide support for production systems.

Required Qualifications:

5+ years of production experience working as a Site Reliability Engineer, DevOps Engineer, or Software Engineer.
Demonstrated ability to deliver highly available solutions at scale.
Demonstrated advanced problem-solving, troubleshooting, and decision-making skills.
Expertise in containerization technologies (Docker, Kubernetes, and Istio) to build, package, and deploy optimized container images.
Expertise in AWS.
Experience with Argo CD for continuous delivery and GitOps practices.
Proficiency in monitoring and alerting tools such as Datadog, AppDynamics, ELK, Grafana, or Prometheus.
Familiarity with A/B, canary, and blue/green deployments, as well as traffic mirroring techniques.
Experience with scripting and orchestration tools such as Terraform, Ansible, or equivalent.
Demonstrated ability to balance cost considerations with performance and reliability.
Experience delegating tasks to junior engineers.
Experience leading initiatives under direction.
Ability to apply systems thinking to understand interdependencies and design solutions that achieve results.
Ability to learn and apply new technologies, programming practices, patterns, and methods.
Experience mentoring, providing technical guidance, and training more junior team members.
Ability to work independently and take ownership of tasks and assignments.
Organized and detail-oriented.
Ability to develop healthy working relationships and collaborate with peers and leaders.
Exhibits integrity and high standards in work quality.
Excellent verbal and written communication skills.
Proficiency in Golang or Rust is a plus but not required.
Values diversity and differences among individuals in interactions.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
