Job Description
Full information about the position and requirements
Yukerja Summary
We curated this Senior DevOps Engineer vacancy at Indonesia Fintopia Technology from JobStreet (category: Technology & IT). Check the work location (Jakarta) before applying. Yukerja.com is not the employer; applications are processed on the official source site.
About the Role
We are looking for a Senior DevOps / Site Reliability Engineer (SRE) to build, operate, and continuously improve our cloud infrastructure and production platforms. You will lead initiatives to improve system reliability, automation, observability, security, and deployment efficiency while ensuring our services meet the high availability and compliance standards required in the fintech industry.
Key Responsibilities
Design, build, and maintain highly available, scalable, and secure cloud infrastructure supporting business-critical systems.
Own the reliability, availability, performance, and operational excellence of production services.
Design and improve CI/CD pipelines to enable safe, efficient, and automated software delivery.
Manage infrastructure using Infrastructure as Code (IaC) and automation best practices.
Build and maintain monitoring, logging, alerting, and observability platforms to proactively detect and resolve issues.
Lead production incident response, root cause analysis, and post-incident improvement initiatives.
Collaborate with Backend, Frontend, Mobile, QA, Product, Security, Compliance, and other engineering teams to improve system stability and delivery efficiency.
Improve system resilience through capacity planning, disaster recovery, backup strategies, and high availability architecture.
Implement infrastructure security best practices, access control, vulnerability management, and platform hardening.
Support OJK compliance initiatives, audit preparation, disaster recovery exercises, and infrastructure governance requirements.
Optimize cloud infrastructure utilization, performance, and operational costs.
Mentor engineers and promote DevOps, SRE, automation, and operational excellence across engineering teams.
Utilize AI-assisted tools to improve infrastructure automation, troubleshooting, documentation, and operational efficiency.
Perform other responsibilities assigned by the Engineering Manager or Head of Engineering.
Requirements
Bachelor's degree or higher in Computer Science, Information Technology, Engineering, or a related technical field, or equivalent practical experience.
At least 5 years of experience in DevOps, Site Reliability Engineering, Platform Engineering, or Infrastructure Engineering.
Strong experience with Linux server administration and cloud infrastructure management.
Hands-on experience with Docker, Kubernetes, container orchestration, and microservices environments.
Strong knowledge of CI/CD tools such as GitLab CI/CD, Jenkins, GitHub Actions, or similar platforms.
Experience with Infrastructure as Code tools such as Terraform, Ansible, or Helm.
Strong understanding of networking concepts, including DNS, load balancers, reverse proxies, VPNs, firewalls, SSL/TLS, and TCP/IP.
Experience with monitoring and observability platforms such as Prometheus, Grafana, ELK, Loki, Jaeger, or OpenTelemetry.
Experience operating production databases, caching systems, message queues, and distributed systems.
Strong troubleshooting skills across infrastructure, application, database, and network layers.
Experience with cloud platforms such as Alibaba Cloud, AWS, GCP, or Azure.
Good understanding of security best practices, IAM, vulnerability management, backup strategies, disaster recovery, and compliance requirements.
Strong communication skills with the ability to collaborate across Engineering, Security, Compliance, Product, and Operations teams.
Experience leading incident management and driving long-term operational improvements.
Ability to communicate effectively in English as part of the daily working environment.
Preferred Qualifications
Experience supporting fintech, banking, payment, digital lending, or other regulated industries.
Experience supporting OJK compliance, audit preparation, regulatory reviews, disaster recovery testing, or security compliance initiatives.
Experience managing Kubernetes clusters in production.
Experience with cloud security, secrets management, and service mesh technologies.
Experience implementing GitOps or Platform Engineering practices.
Experience using AI-assisted operational tools for automation, troubleshooting, and infrastructure optimization.
Experience working with multinational or regional engineering teams.
Mandarin communication skills are an advantage.