Expert, Site Reliability Engineering (Techcom Life)

Techcombank
Updated: 08/12/2025

Employment Information

Benefits

  • Laptop
  • Insurance
  • Allowances
  • Uniform
  • Bonus scheme
  • Healthcare
  • Training
  • Salary increases
  • Annual leave

Job requirement

Job Purpose

Perform specialized tasks and apply specialized skills in the daily monitoring of IT infrastructure, applications, and services for critical services (ROC, COC, ...) to ensure they meet the SLAs committed to the business. When handling alerts, restore services as quickly as possible and resolve any remaining issues to ensure the best service delivery to customers.

Key Accountabilities (1)

Participate in monitoring and handling system alerts/incidents/problems:

- Perform 24/7 monitoring and handle alerts for services across the entire IT infrastructure/applications/services. If difficulties arise, escalate to L3 for coordinated handling.

- Ensure that projects/specialized operations departments provide adequate alert/incident handling instructions for new services before go-live, and periodically review and update existing alert/incident handling instructions.

- Periodically review issues/vulnerabilities in IT infrastructure/applications/services within the scope of responsibility.

- Provide in-depth knowledge transfer on monitoring and handling alerts and critical IT service incidents.

- Participate in or lead the standardization and development of relevant processes and regulations to ensure effective monitoring and handling of alerts/incidents.

- Coordinate with relevant units to promptly restore services/systems, investigate root causes, and propose and implement solutions.

- Participate in implementing changes across the software development environment, including on-premises and cloud.

Participate in building and optimizing centralized monitoring tools:

- Develop and promulgate standards for centralized monitoring tools (Dynatrace, Grafana, Splunk, ...) and operate those tools.

- Integrate monitoring tools and support building monitoring dashboards for new IT infrastructure/applications/services.

- Ensure that projects/specialized operations departments provide adequate monitoring indicators/monitoring thresholds for new services before go-live.

Key Accountabilities (2)

System problem and incident management:

- Manage the lifecycle of IT incidents, including identifying, classifying, coordinating and resolving incidents according to SLAs.

- Be the contact point during troubleshooting, ensuring effective communication among technical, operations and business departments.

- Perform root cause analysis (RCA) after each incident and recommend preventive measures and process improvements. Coordinate with relevant teams to minimize downtime and improve system availability.

- Participate in developing and maintaining incident management processes according to standards and best practices.

Key Accountabilities (3)

Responsibilities in Risk Management and Compliance:

- Control and ensure the unit's activities comply with issued policies, regulations, procedures and instructions.

- Identify the unit's risks during operations, coordinate with relevant units to develop methods to measure, evaluate and minimize risks.

Report periodically to management and perform other tasks as directed by management.

Job requirement

Qualifications & Work experience

- Bachelor's degree or higher in finance, economics, banking, business administration, or computer science.

Experience

- At least 8 years in IT development and operations at a large enterprise.

Language Proficiency

- English, Level 3 (TOEIC 550), or as per company regulations in effect from time to time.

Other Requirements

- International certification in Systems is an advantage.

More Information

  • Qualification: Bachelor
  • Age: Unlimited
  • Job type: Permanent

Company Overview

NGÂN HÀNG TMCP KỸ THƯƠNG VIỆT NAM (TECHCOMBANK)

https://techcombank.com/

Number of employees: 10,000-19,999

Techcombank's mission is to lead the digital transformation of the financial industry, empowering every individual, business and organization to grow sustainably and achieve breakthroughs...
