Site Reliability Engineer - #1128100
Quess
We are seeking a skilled Site Reliability Engineer (SRE) to manage, optimize, and support enterprise infrastructure.
Key Responsibilities
Maintain and optimize open-source application monitoring infrastructure; evaluate and migrate to new solutions as needed.
Support application teams in migrating to the latest OpenShift versions, deploying stateful/stateless applications, and troubleshooting Kubernetes/OpenShift issues.
Work with developers to implement instrumentation libraries and frameworks for application monitoring.
Maintain metrics data stores using TSDBs like Prometheus, including administration, tuning, and resource optimization.
Manage distributed tracing platforms such as OpenTelemetry, Jaeger, and Zipkin, including tuning, sampling strategies, and troubleshooting microservices traces.
Provide production support for enterprise logging platforms (ELK Stack, Grafana Loki) and manage Elasticsearch index lifecycle.
Implement alerting infrastructure, integrating with tools like PagerDuty and MS Teams; define alerting rules in collaboration with application support teams.
Deploy and administer visualization tools (Grafana, Kibana); create reusable dashboards and implement RBAC for enterprise users.
Promote observability culture across the development community; assist in defining SLIs, SLOs, golden signals, error budgets, MTTD, and MTTR.
Troubleshoot infrastructure issues in Linux VMs and Kubernetes pods; configure secure reverse proxies, TLS, MFA, LDAPS, and OAuth as required.
Configure and maintain CI/CD pipelines for monitoring infrastructure; extend pipelines to support multiple environments/regions.
Required Skills & Technologies
Elasticsearch/Kibana – Cluster management, search optimization, dashboarding
Prometheus & Grafana – Metrics collection, visualization, alerting
OpenTelemetry – Distributed tracing setup and troubleshooting
Kubernetes & OpenShift – Deployments, CI/CD integration
Linux OS – Troubleshooting and administration
Strong understanding of SRE practices and observability principles
Soft Skills
Excellent problem-solving and analytical skills
Ability to work in a collaborative, cross-functional environment
Strong communication skills and documentation practices
Proactive, goal-oriented, and able to manage multiple priorities
Disclaimer: The company is committed to ensuring the privacy and security of your information. By submitting this form, you consent to the collection, processing, and retention of the information you provide. The data collected (which may include your contact details, educational background, work experience and skills) will be used solely for the purpose of evaluating your qualifications for the position you're applying for. Your data will be stored securely and retained for the duration necessary to fulfill our hiring process. If you are not selected for the position, your data will be kept on file for a limited period in case future opportunities arise. You have the right to access, correct, or delete your data at any time by contacting us at Quess Singapore (quesscorp.sg).
This is in partnership with the Employment and Employability Institute Pte Ltd (“e2i”).
e2i is the empowering network for workers and employers seeking employment and employability solutions. e2i serves as a bridge between workers and employers, connecting with workers to offer job security through job-matching, career guidance and skills upgrading services, and partnering with employers to address their manpower needs through recruitment, training, and job redesign solutions. e2i is a tripartite initiative of the National Trades Union Congress set up to support nationwide manpower and skills upgrading initiatives. By applying for this role, you consent to Quesscorp Singapore’s PDPA and e2i’s PDPA.