Engineer

The Judge Group

Irving, TX, USA

Posted on May 14, 2026

Principal Engineer, Platform Engineering & Production Support

Locations: Irving, TX; Charlotte, NC; Minneapolis, MN
Work Model: Hybrid (3 days/week in-office)
Job Type: 12-month contract (potential extension or conversion)
Start Date: As soon as possible

About the Role

We are seeking a Principal Engineer to join our Platform Engineering team, focused on production support, reliability, and scalability of critical applications. This role is highly hands-on and requires deep expertise in DevOps and Site Reliability Engineering (SRE), with a strong focus on observability, incident management, and cloud-native environments.

You will work in a fast-paced, production-critical environment, ensuring application health, preventing outages, and improving system reliability through automation and modern engineering practices.

What You’ll Do

Lead production support for a portfolio of 20+ applications, ensuring high availability and performance
Design and implement monitoring, alerting, and observability solutions using tools like Splunk, Grafana, AppDynamics, and Prometheus
Proactively identify risks through gap analysis, anomaly detection, and predictive alerting
Troubleshoot complex issues in distributed microservices architectures and reduce mean time to resolution (MTTR)
Drive adoption of SRE best practices, including automation, AIOps, and intelligent monitoring
Support and scale applications running on OpenShift and cloud-native platforms
Partner with development teams to ensure production readiness during release cycles
Participate in an on-call rotation and respond to incidents with urgency and ownership
Mentor engineers and elevate team capabilities in DevOps and platform engineering
Serve as a technical leader managing competing priorities in a high-impact environment

Minimum Qualifications

7+ years of engineering experience or equivalent practical experience
10+ years of experience in platform engineering and production support
5+ years of experience with:
- Red Hat Linux, OpenShift, Kubernetes
- Java, Spring Boot, Python, microservices architectures
- Observability tools (Grafana, Splunk, AppDynamics)
- Incident management and alerting systems (AIOps, ServiceNow, BigPanda)
4+ years of experience with:
- Distributed systems and cloud-native architectures
- React.js, Kafka, Apache, and relational databases

Preferred Qualifications

Experience in financial services or highly regulated industries
Background in software development (especially Java-based ecosystems)
Strong ability to operate across SRE, DevOps, and production support roles
Demonstrated ability to manage multiple priorities in high-pressure environments
Experience with proactive monitoring, automation, and reliability engineering practices

Work Environment & Schedule

Hybrid model with three in-office days per week (8 hours per in-office day)
Standard 40-hour workweek
Typical working hours fall between 8:00 AM and 8:00 PM
Monthly on-call rotation (with offshore support; minimal extended hours expected)

About the Team

The Platform Engineering team focuses on stabilizing, scaling, and operating applications post-deployment. This is an application-centric role (not traditional infrastructure support), emphasizing reliability, performance, and operational excellence in cloud environments.

