A well-run game operation relies on stable server infrastructure, effective outage prevention, and an agile response team. Rayark is seeking an experienced software engineer in site reliability to monitor, improve, and design our server infrastructure. We want to hear from engineers who know how to build and maintain scalable, highly available services. You will develop strategies and tools to operate large-scale distributed systems and troubleshoot the system issues that cause incidents.
You will start by facilitating the deployment of all kinds of services to Google Cloud Platform. You will then collaborate with backend engineers to design high-quality architecture for scalable and reliable services.
- Design software architecture to improve availability, scalability, and maintainability
- Design infrastructure management workflow
- Design CI/CD workflow
- Maintain production service availability (on-call)
- 6+ years of professional experience in designing, analyzing, and troubleshooting scalable distributed systems with high availability
- Experience in developing large-scale backend systems
- Experience in service deployment/operation/monitoring
- Strong understanding of container technologies (Docker, Kubernetes)
- Knowledge of networking theory and protocols (HTTP/HTTPS, DNS, TCP/UDP, IP)
- Knowledge of database management
- Minimum 4 years of experience in any programming language
- Experience with cloud services, especially GCP (Google Cloud Platform)
REQUIRED APPLICATION MATERIALS