We have an opportunity to impact your career and provide an adventure where you can push the limits of what's possible. Data is one of our most significant competitive assets, and within our business it is a crucial enabler for initiatives that enhance efficiency and accelerate business growth.
As a Lead Software Engineer - ETL/ELT Pipelines / Python / PySpark at JPMorgan Chase within the Asset and Wealth Management Technology Team, you will play a crucial role as part of an agile team dedicated to building a client-centric view of all investment data and unifying client data in a secure, stable, and scalable manner. As a core technical contributor, you are responsible for delivering critical technology solutions across multiple technical areas within various business functions in support of the firm’s business objectives.
Job responsibilities
- Lead the development of secure, high-quality production code, and review and debug code written by others
- Ensure data quality, integrity, and security across all data systems and platforms, and enforce data governance policies and best practices
- Design and implement scalable data solutions that align with business objectives and technology strategies, and troubleshoot technical problems with the ability to think beyond routine or conventional approaches when building and supporting solutions
- Design, develop, and optimize robust ETL/ELT pipelines using SQL, Python, and PySpark for large-scale, complex data environments
- Collaborate with cross-functional teams to understand data requirements and translate them into technical specifications
- Conduct performance tuning and optimization of data systems to ensure high availability and scalability
- Identify opportunities to eliminate or automate remediation of recurring issues to improve the overall operational stability of software applications and systems
- Stay current on emerging ETL and data engineering technologies and industry trends to drive innovation
- Work closely with stakeholders to identify opportunities for data-driven improvements and efficiencies
- Maintain detailed documentation for pipelines, data models, and integration processes
Required qualifications, capabilities, and skills
- Formal training or certification on software engineering concepts and 5+ years of applied experience
- Proven experience as a lead engineer in data management, ETL/ELT pipeline development, and large-scale data processing, with strong hands-on coding proficiency in Python, PySpark, Apache Spark, SQL, and AWS cloud services such as EMR, S3, Athena, and Redshift
- Strong understanding of data quality, security, and lineage best practices
- Hands-on experience with AWS cloud and data lake platforms such as Snowflake and Databricks
- Experience with cloud-based data warehouse migration and modernization
- In-depth knowledge of, and ability to implement, unit, integration, and functional testing strategies
- Experience providing the tools that will enable data to be made available on Mesh and distributed to meet consumer needs
- Proficiency in automation and continuous delivery methods, and understanding of agile methodologies such as CI/CD, Application Resiliency, and Security
- Excellent problem-solving skills, with the ability to optimize performance and troubleshoot complex data pipelines
- Strong communication and documentation abilities
- Ability to collaborate effectively with business and technical stakeholders
Preferred qualifications, capabilities, and skills
- Knowledge of Apache Iceberg
- In-depth knowledge of the financial services industry and IT systems