Senior Data Engineer (AI / ML / Databricks / AWS)
Harvey Nash Polska • Warsaw
💰 Salary
Salary range not disclosed
📋 Information
📝 Main description / Introduction
You will join a team responsible for building an AI-driven data platform supporting advanced analytics and machine learning solutions used across the organization. The team focuses on transforming large volumes of data into scalable, high-quality datasets that power AI, ML and Generative AI use cases.
You will work closely with AI Engineers, ML Scientists and Software Engineers to design and build data infrastructure that enables cutting-edge AI applications.
This is a great opportunity for engineers who enjoy working with large-scale data platforms, distributed systems and modern cloud technologies.
Your Responsibilities
• Design and build scalable data pipelines using Databricks, Spark and AWS services
• Ingest and process large-scale structured and unstructured datasets from internal and external sources
• Develop high-performance data processing workflows using Python and distributed processing frameworks
• Collaborate with AI/ML teams to prepare datasets and infrastructure supporting machine learning models and GenAI applications
• Optimize data pipelines for performance, scalability and cost efficiency in cloud environments
• Implement data governance, access control and data lineage best practices
• Build monitoring, validation and testing frameworks ensuring data quality and reliability
• Support onboarding of new data sources and external data providers
• Evaluate emerging GenAI and LLM data infrastructure tools
Tech Stack
• Python
• Spark / Databricks
• AWS (Glue, EMR, Fargate, Step Functions)
• Delta Lake
• Distributed data processing
• Data architecture & data modeling
• Optional: Graph databases