Collaborate with and across teams to design, develop, test, implement, and support technical solutions for container orchestration platforms
Build standard processes and procedures to automate the deployment, troubleshooting, monitoring, and recovery of infrastructure in the cloud leveraging infrastructure as code practices
Architect and execute the migration of existing workloads from traditional on-premises infrastructure to the cloud
Share your passion for staying on top of tech trends, experimenting with and learning new technologies, participating in internal and external technology communities, and mentoring other members of the team.
Build tools to monitor systems and automate processes around the core storage and network infrastructure
Serve as a core contributor to our Architecture Review Board, change management, and blameless postmortem processes
Collaborate with teams and assist in troubleshooting issues across the whole stack – hardware, software, applications, and network
Work on capacity planning and performance engineering projects
Collaborate with other business functions to bring best-of-breed products and solutions to fruition with automation, reliability, scalability, and observability as core tenets
Build for resilience. Our goal is that nobody gets called off-hours, ever! While we work on that, participate in a weekly on-call rotation.
Improve our infrastructure capabilities, optimizing for cost, simplicity, and maintainability
Experience running high availability cloud deployments with a major provider, namely Microsoft Azure
Experience automating systems administration tasks using tools like Ansible and Terraform and languages such as Python, Bash, or Go
Experience with cloud monitoring and observability
Comfortable with git and Infrastructure as Code workflows
At least 4 years of Linux system administration experience. In-depth experience with RHEL, CentOS, and Windows Server, with strong debugging, troubleshooting, and problem-solving skills.
At least 3 years of experience in DevOps Engineering (internship experience will be considered)
At least 2 years of experience with Cloud Native technologies, namely Microsoft Azure
2+ years of experience with scripting and coding (Bash, Python, SQL, Golang, or comparable languages)
2+ years of experience with Terraform or Docker or Ansible, Git, and Jenkins
2+ years of experience with multi-tenant container orchestration platforms and services including Docker or Kubernetes
2+ years of experience working with Agile Development Practices
Experience with Kubernetes-based cloud-native technologies such as Argo, Kubeflow, Istio, Linkerd, and Dex is a plus
Experience using Docker or Kubernetes to create and manage portable, extensible, containerized workloads and services is a plus
Company
Cambridge Resources Inc
United States of America
Location
Remote Position
(From Everywhere/No Office Location)
Job type
Full-Time
Golang Job Details
**This is a fully remote, employee role**
We are seeking a Senior System Reliability Engineer who is passionate about delivering reliable, scalable, efficient, and highly available platforms. Working under the general supervision of the Director Enterprise Technology Services, you will continually optimize and automate our processes and systems to improve reliability and scalability and to reduce toil.
You will also participate in systems design and deployment and take on real-time responsibilities such as monitoring, incident management, and recovery.
Primary Job Duties
Basic Qualifications:
Preferred Qualifications: