Senior Reliability Engineer

CyrusOne

Other Engineering
Remote
USD 140k-170k / year
Posted on Jan 16, 2026
The Senior Reliability Engineer serves as a subject-matter expert and strategic technical authority for infrastructure reliability across a portfolio of mission-critical data center sites. This role leads the design, governance, and continuous improvement of reliability strategies for power, cooling, and control systems, applying advanced engineering judgment, analytics, and risk-based decision-making.

The Senior Reliability Engineer independently evaluates complex reliability risks, prioritizes initiatives under uncertainty, and influences operational, maintenance, and capital decisions that materially impact uptime, safety, and lifecycle cost. This role operates with minimal oversight and is expected to shape standards, mentor others, and elevate reliability capability across the organization.

Responsibilities:

Enterprise Reliability Strategy & Asset Care

  • Architect and govern portfolio-level, risk-based asset strategies for mission-critical power and cooling infrastructure.
  • Apply advanced RCM principles to define maintenance and inspection strategies aligned to failure risk, system criticality, and redundancy posture.
  • Evaluate and balance tradeoffs between maintenance investment, operational risk, spares coverage, redundancy, and capital replacement.
  • Establish and maintain enterprise PM quality standards, including audits, task effectiveness reviews, and elimination of low-value maintenance.

Operational Governance & Change Risk Management

  • Serve as a final technical authority for high-risk SOPs, MOPs, EOPs, and operational change packages.
  • Perform system-level risk assessments for planned work, incidents, and abnormal operating conditions.
  • Guide site teams in CMMS data integrity, work management maturity, and adherence to approved operating procedures.
  • Lead or oversee complex reliability investigations involving multiple systems, teams, or contributing factors.

Advanced Analytics & Condition Monitoring

  • Design and mature predictive condition-monitoring programs across the portfolio (oil analysis, thermography, vibration, battery monitoring, controls analytics).
  • Develop and interpret leading reliability indicators and degradation trends to anticipate failures before impact.
  • Apply statistical analysis, reliability modeling, and engineering judgment to evaluate failure likelihood and consequence.
  • Translate analytical insights into strategic maintenance, operational mitigations, or capital recommendations.

Critical Spares & Lifecycle Strategy

  • Define and govern enterprise critical spares strategies, accounting for supplier risk, lead times, and system exposure.
  • Identify systemic spares gaps and drive remediation plans in partnership with Supply Chain and Operations.
  • Lead lifecycle asset assessments to guide long-range capital planning and replacement prioritization.
  • Provide data-driven input to business cases supporting capital investments and infrastructure upgrades.

Incident Leadership, RCA & Continuous Improvement

  • Lead high-impact post-incident RCAs and FMEAs, ensuring depth of analysis beyond proximate causes.
  • Identify and address latent design, procedural, and organizational contributors to reliability events.
  • Ensure lessons learned result in durable changes to standards, procedures, maintenance strategies, or training.
  • Champion continuous improvement initiatives that measurably reduce risk and failure recurrence across sites.

Technical Leadership & Capability Development

  • Act as a mentor and technical escalation point for Reliability Engineers, site engineers, and CE leaders.
  • Coach teams on reliability methods, risk-based decision-making, and interpretation of condition-monitoring data.
  • Influence and evolve enterprise reliability standards, playbooks, and operating philosophies.
  • Partner with leadership to strengthen operator certification, training rigor, and operational discipline.

Qualifications:

  • 10+ years of experience in reliability engineering, maintenance engineering, or facilities engineering within mission-critical environments.
  • Demonstrated leadership of complex, multi-system reliability programs with measurable business impact.
  • Expert-level knowledge of RCM, FMEA, RCA, and maintenance optimization methodologies.
  • Deep technical understanding of mission-critical infrastructure, including UPS, generators, switchgear, chillers, cooling towers, CRAH/CRAC, and BMS/EPMS.
  • Proven experience governing SOP/MOP/EOP programs and assessing operational change risk in live environments.
  • Advanced ability to analyze condition-monitoring, CMMS, and operational datasets and convert insights into strategic actions.
  • Proficiency in data analysis and visualization tools (Excel, Power BI, or similar).
  • Ability to apply statistical techniques or reliability modeling to support risk-informed decision-making under uncertainty.
  • Strong executive-level communication skills; able to influence senior leaders and defend technical positions.

Preferred Experience:

  • Experience designing and scaling enterprise critical spares and lifecycle asset management programs.
  • Hands-on experience with predictive analytics, failure modeling, or reliability simulations.
  • Proficiency with Python, R, or similar tools for advanced reliability analytics.
  • Working knowledge of SQL or other data query languages.
  • Strong familiarity with NFPA, IEEE, ASHRAE, and other relevant codes and standards.
  • Experience presenting reliability risk, capital tradeoffs, and investment recommendations to executive audiences.

Education & Certifications:

  • Bachelor’s degree in Mechanical, Electrical, or Industrial Engineering (or equivalent experience).
  • Preferred: CMRP, CRE, or similar advanced reliability or maintenance certification.

Work Conditions:

  • Supports 24×7 mission-critical operations; participates in an on-call rotation and may support after-hours events.
  • Ability to work safely in energized environments in compliance with LOTO and NFPA 70E.
  • Travel to supported sites approximately 25%.

Salary range: $140,000-$170,000

CyrusOne is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, sex, sexual orientation, gender identity, religion, national origin, disability, veteran status, or other legally protected status.

CyrusOne provides reasonable accommodation for qualified individuals with disabilities in accordance with the Americans with Disabilities Act (ADA) and any other applicable state or local laws. We will respond to requests for reasonable accommodation to assist you in applying for positions at CyrusOne or in submitting a resume.