We have an exciting and rewarding opportunity for you to take your software engineering career to the next level.
As a Software Engineer III at JPMorgan Chase within the Corporate Sector, Infrastructure Platforms team, you serve as a seasoned member of an agile team to design and deliver trusted, market-leading technology products in a secure, stable, and scalable way. You are responsible for delivering critical technology solutions across multiple technical areas within various business functions in support of the firm’s business objectives.
You will build, deploy, and maintain robust infrastructure platforms in cloud environments, tailored for AI and machine learning workloads. These platforms must be highly scalable and resilient to enable training and inference for Large Language Models.
Job responsibilities
• Execute software solutions, design, development, and technical troubleshooting with the ability to think beyond routine or conventional approaches to build solutions or break down technical problems
• Create secure and high-quality production code and maintain algorithms that run synchronously with appropriate systems
• Produce architecture and design artifacts for complex applications while being accountable for ensuring design constraints are met by software code development
• Gather, analyze, synthesize, and develop visualizations and reporting from large, diverse data sets in service of continuous improvement of software applications and systems
• Proactively identify hidden problems and patterns in data and use these insights to drive improvements to coding hygiene and system architecture
• Contribute to software engineering communities of practice and events that explore new and emerging technologies
• Engineer infrastructure platforms that are secure, scalable, and optimized for AI and machine learning workloads
• Collaborate with AI teams to understand computational needs and translate these into infrastructure requirements
• Monitor, manage, and optimize cloud resources to maximize performance and minimize costs
• Design and implement continuous integration and delivery pipelines for machine learning workloads
• Develop automation scripts and infrastructure as code to streamline deployment and management tasks
Required qualifications, capabilities, and skills
• Formal training or certification in software engineering concepts and 3+ years of applied experience
• Good knowledge of cloud computing delivery models (IaaS, PaaS, and SaaS) and deployment models related to Public, Private, and Hybrid Cloud services
• Proficient in Linux environments, including scripting and administration
• Familiarity with cloud data services and big data processing tools
• Foundational understanding of machine learning concepts such as transformer architecture, ML training, and inference
• Experience with one or more high-performance computing and machine learning frameworks, such as vLLM, Ray, or Slurm, is preferred
• Strong hands-on coding experience with Python and/or Golang
• Experience in solutions design and engineering, including containerization (Docker, Kubernetes) and cloud service providers (AWS, Azure, GCP)
• Experience with Infrastructure as Code (Terraform, CloudFormation) and automation tools (Ansible, Chef, Puppet)
• Strong background in network architecture, database programming (SQL/NoSQL), and data modeling
• Deep understanding of cloud component architecture (microservices, containers, IaaS, storage, security) and knowledge of routing/switching technologies