Houston, TX, United States
19 hours ago
Senior Lead Site Reliability Engineer

Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally gifted professionals and position yourself among the top echelon in site reliability.

As a Senior Lead Site Reliability Engineer at JPMorgan Chase within the Enterprise Technology, Identity and Access team, you will work with your fellow stakeholders to define non-functional requirements (NFRs) and availability targets for the services in your application and product lines. You will ensure those NFRs are accounted for in your products’ design and test phases, that your service level indicators effectively measure customer experience, and that service level objectives are defined with stakeholders and implemented in production.

Job responsibilities

Oversee the Privilege Management Application to ensure it meets enterprise standards, collaborating with cross-functional teams to design, implement, and maintain scalable and secure infrastructure solutions.

Utilize DevOps practices to automate and streamline deployment processes, ensuring efficient and reliable software delivery, while leveraging containerization technologies to enhance application scalability and manageability.

Integrate AI/ML solutions to optimize system performance and enhance security measures, and implement HashiCorp technologies, including HashiCorp Vault, for secure and efficient privilege management.

Work with CyberArk and related tools to strengthen privileged access management, collaborating with cybersecurity teams to ensure compliance with security protocols and best practices.

Monitor system performance, identify potential issues, and implement proactive solutions to prevent downtime, ensuring continuous system reliability and performance.

Provide technical leadership and mentorship to junior team members, fostering a culture of continuous learning and improvement within the team.

 

Required qualifications, capabilities, and skills

Formal training or certification on software engineering concepts and 5+ years applied experience.

Advanced knowledge in site reliability culture and principles with demonstrated ability to implement site reliability within an application or platform.

Advanced knowledge and experience in observability such as white and black box monitoring, service level objectives, alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, etc.

A minimum of 15 years of experience in Site Reliability Engineering or related fields.

Proven experience with DevOps practices, containerization technologies (e.g., Docker, Kubernetes), and public cloud platforms (e.g., AWS, Azure, Google Cloud).

Strong understanding of AI/ML technologies and their application in enhancing system performance and security.

Experience with HashiCorp technologies, particularly HashiCorp Vault, for privileged access management. Alternatively, CyberArk experience is preferred.

Background in cybersecurity and privileged access management is highly preferred.

 

Preferred qualifications, capabilities, and skills

In-depth knowledge of enterprise design principles and best practices.

Excellent problem-solving skills and the ability to work effectively in a fast-paced, dynamic environment.

Strong communication and collaboration skills, with the ability to work effectively with cross-functional teams.
