Data modeling, Data warehousing, ETL pipelines, Flink, Kafka, Spark, Kinesis, Airflow, Python, Java, Scala, Monitoring
Years of Experience: 6+
Education: Bachelor's degree in Computer Science or a related field.
Key Responsibilities:
● Design, develop, and maintain scalable data pipelines and ETL processes
● Optimize data flow and collection for cross-functional teams
● Build infrastructure required for optimal extraction, transformation, and loading of data
● Ensure data quality, reliability, and integrity across all data systems
● Collaborate with data scientists and analysts to help implement models and algorithms
● Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, etc.
● Create and maintain comprehensive technical documentation
● Evaluate and integrate new data management technologies and tools
● Monitor systems and pipelines using tools such as Datadog or Grafana
Mandatory Skills:
● Extensive experience with big data technologies (e.g., Spark, Flink, Hadoop), Terraform, CloudFormation, Kubernetes, Datadog/Grafana, and CI/CD
● Hands-on monitoring experience with at least one tool (e.g., Datadog, Grafana) is mandatory
● Experience with containerization and orchestration tools (Kubernetes)
● Experience with data modeling, data warehousing, and building ETL pipelines
● Experience with cloud platforms (AWS, Azure, or GCP) and their data services; AWS preferred
● Experience building streaming pipelines with Flink, Kafka, or Kinesis; Flink preferred
● Strong knowledge of data pipeline and workflow management tools (e.g., Airflow, Luigi, NiFi)
● Expert knowledge of SQL and experience with relational databases (e.g., PostgreSQL, Redshift, TiDB, MySQL, Oracle, Teradata)
● Proficiency in at least one programming language such as Python, Java, or Scala
● Understanding of data governance and data security principles
● Experience with version control systems (e.g., Git) and CI/CD practices
Preferred Skills:
● Basic knowledge of machine learning workflows and MLOps
● Experience with NoSQL databases (MongoDB, Cassandra, etc.)
● Familiarity with data visualization tools (Tableau, Power BI, etc.)
● Experience with real-time data processing
● Knowledge of data governance frameworks and compliance requirements (GDPR, CCPA, etc.)