Jersey City, NJ, USA
4 days ago
Lead Software Engineer, AI Cloud Infrastructure

We have an opportunity to impact your career and provide an adventure where you can push the limits of what's possible.

As a Lead Software Engineer at JPMorgan Chase within the Chief Data and Analytics Office, you are an integral part of an agile team that works to enhance, build, and deliver trusted market-leading technology products in a secure, stable, and scalable way. As a core technical contributor, you are responsible for delivering critical technology solutions across multiple technical areas within various business functions in support of the firm’s business objectives.

Job responsibilities

Architect solutions to enhance the reliability and scalability of AI/ML platforms and applications to accommodate fast-growing demand.

Create the tooling and services needed to safely deploy and operate ML models.

Build monitoring and observability tools to track model performance, data quality, and system health.

Reduce operational toil by automating repetitive tasks and building self-healing systems and remediation workflows. 

Build strong cross-functional relationships that foster engagements across the organization and deliver solutions to user problems. 

Participate in on-call rotations, and debug and resolve issues in production environments.

Take full ownership of problems, develop solutions, and acquire new knowledge to complete the task. 

Mentor and guide junior engineers.

Required qualifications, capabilities, and skills

Formal training or certification in software engineering concepts and 5+ years of applied experience.

Hands-on practical experience delivering system design, application development, testing, and operational stability. 

Advanced proficiency in one or more programming languages (Python, Go). 

Proficiency in automation and continuous delivery methods. 

Proficient in all aspects of the Software Development Life Cycle. 

Hands-on experience with Google Cloud Platform (GCP).

Strong proficiency with IaC tools (Terraform) and Google Cloud Client Libraries. 

Systematic problem-solving and troubleshooting skills in complex systems.

Excellent communication skills and the ability to present business and technical concepts to stakeholders.

Self-managed and self-motivated, with a strong sense of ownership, urgency, and drive.

Preferred qualifications, capabilities, and skills

Prior experience with Google Vertex AI, AWS Bedrock, or Azure OpenAI.

Prior experience working with GPU scheduling, model deployment, or ML training workloads.

Prior experience developing GenAI Apps, AI Agents, Vector Search, and RAG patterns. 

Extensive experience implementing advanced observability using tools like OpenTelemetry, Dynatrace, Grafana, and/or cloud-native services.

Multi-cloud experience (such as AWS or Azure) is a plus.