Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally gifted professionals and position yourself among the top echelon in site reliability.
As a Principal Site Reliability Engineer at JPMorgan Chase within Cloud Reliability Services, you work with your fellow stakeholders to define non-functional requirements (NFRs) and availability targets for the services in your application and product lines. You will ensure that those NFRs are accounted for in your products’ design and test phases, that your service level indicators effectively measure customer experience, and that service level objectives are defined with stakeholders and implemented in production.
Job responsibilities
- Create high-quality designs, roadmaps, and program charters that are delivered by you or the engineers under your guidance
- Lead and implement SRE frameworks to support global Google Cloud environments and ensure the highest level of SLOs through operational excellence
- Master application, data, infrastructure, and serverless architecture disciplines
- Understand financial control and budget management, drawing on expertise in working in partnership with colleagues throughout the firm and in leading collaborative teams to achieve common goals
- Provide support to develop and improve the quality of technical engineering documentation
- Provide technical supervision, oversight, and problem resolution for engineering activities
- Participate in 24x7 SRE on-call rotations and escalation workflows
- Champion a DevOps model so that services are automated and elastic across all platforms
- Make significant contributions to JPMorgan Chase’s site reliability community via internal forums, communities of practice, guilds, and conferences

Required qualifications, capabilities, and skills
- Google Cloud expertise in a mission-critical production environment
- Strong understanding of container technologies such as Docker, Kubernetes, GKE, and Helm
- Experience programming in one of the following languages: Python, PowerShell, shell scripting, or Go, along with a good understanding of REST APIs
- Hands-on experience with cloud-based technologies and tools, especially in deployment, monitoring, and operations, such as Google Observability, Datadog, Prometheus, Splunk, Elasticsearch, and Grafana
- Strong understanding of Google Cloud governance, compliance, and cost management
- Strong working knowledge of modern development technologies and tools such as Agile, CI/CD, Git, Infrastructure as Code, Terraform, and Jenkins
- Google Cloud certification or equivalent technical experience in the public cloud

Preferred qualifications, capabilities, and skills
- Good understanding of operating systems such as Windows and Linux (Red Hat / Ubuntu)
- Good understanding of .NET and SQL or other databases
- Good understanding of LLMs and other AI/ML frameworks that can be used in AIOps