KKLab is looking for a candidate who can combine software and systems engineering skills to build and operate reliable, scalable, distributed systems for our AI/ML-based SaaS.
As a member of the Technical Infrastructure team, you will help us manage the challenges of our cloud-native web services, such as KKRaaS, which enhances personalized experiences with musical DNA, is designed for media and e-commerce, and is proven through its adoption by KKBOX.
You will work closely with peers such as data engineers, software architects, and researchers to improve operational efficiency, reduce the complexity of system design, and ensure that best practices are followed.
To be successful in this role, you should be resilient, curious, and eager to solve problems. We strive to create an open and fault-tolerant environment, so the ability to work efficiently in a hybrid setting (both remote and on-site) is highly desired.
- Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health. Practice sustainable incident response and blameless postmortems.
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
- Develop innovative solutions to ensure efficient SRE processes under regulatory requirements. Bring SRE expertise to the product teams that enforce regulatory requirements.