As part of the SRE team, you will:
- Engage in and improve the whole lifecycle of services - from design, through deployment, operation and refinement.
- Work with engineering teams to design and write code to create systems which are highly available and able to scale seamlessly.
- Help engineering teams solve reliability, stability and scalability challenges.
- Get involved in deep diagnosis of incidents, and engage with multiple highly skilled engineering teams on resolutions.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Contribute to a culture of learning and responsibility by writing detailed postmortem reports.
- Identify and resolve problems relating to critical service operations, and prevent their recurrence using automation.
- Be part of a cool team, responsible for one of the largest cloud-based services in Southeast Asia.
- Mentor other engineers, define our technical culture, and help build a fast-growing team.
Requirements
- BS degree in computer science, software engineering, information technology or a related technical field involving coding, or equivalent practical experience.
- Experience with algorithms, data structures, complexity analysis and software design.
- Experience in one or more of the following: Go, C, C++, Java, Python, Perl or Ruby.
- Analytical skills, mental resilience and the ability to think systematically under stressful conditions.
- Highly accountable, with a strong sense of ownership, an outstanding work ethic and high integrity; a team player and a lifelong learner.
Really Nice to Haves
- Experience in Go.
- Experience with cloud-based large-scale infrastructure from vendors such as Amazon Web Services, Azure or Google Cloud Platform.
- Contributions to open source projects, and experience with performance analysis and debugging tools.
- Ability to debug and optimize code and automate routine tasks.