Job Description
Yukerja Summary
The Senior Site Reliability Engineer (SRE) role at PT Digital Tech Asia is curated from Glints (category: Keuangan & Perbankan, i.e. Finance & Banking). Note the work location (Setiabudi) before applying. Yukerja.com is not the employer; applications are handled on the official source site.
About the Role
Our client is currently looking for a Senior Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of production systems. You will drive automation, improve system availability, and collaborate with development teams to implement SRE best practices.
Key Responsibilities
Define and monitor SLOs, SLIs, and error budgets.
Build automation to improve system reliability and reduce operational tasks.
Lead incident management, root cause analysis, and post-mortems.
Manage monitoring, logging, and observability tools (Prometheus, Grafana, ELK/Loki).
Develop Infrastructure as Code (Terraform, Ansible, or Pulumi).
Manage Kubernetes clusters and cloud infrastructure (AWS, GCP, or Azure).
Build and maintain CI/CD pipelines.
Perform capacity planning and production performance optimization.
Collaborate with engineering teams to improve system reliability and operational excellence.
Requirements
Bachelor's degree in Computer Science, Software Engineering, or a related field.
6+ years of experience in SRE, DevOps, or Platform Engineering.
Strong programming skills in Python, Go, or Bash.
Hands-on experience with Kubernetes in production.
Experience with AWS, GCP, or Azure.
Proficiency with Prometheus, Grafana, ELK Stack, and observability tools.
Experience with Terraform, CloudFormation, or Pulumi.
Strong troubleshooting, incident management, and communication skills.
Preferred Qualifications
Google Cloud Professional Cloud DevOps Engineer, AWS Certified DevOps Engineer - Professional, CKA, or CKAD certification.
Experience with Istio or Linkerd.
Familiarity with Chaos Engineering tools.
Experience in financial services, e-commerce, or large-scale platforms.