Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.

Job Description

Your Role
Design, build, and maintain data pipelines and ETL processes using Databricks and Apache Spark.
Optimize data workflows for performance, scalability, and cost efficiency.
Implement data Lakehouse architecture and manage data ingestion from multiple sources.
Collaborate with data scientists and analysts to enable advanced analytics and machine learning workloads.
Ensure data quality, governance, and security across all data assets.
Monitor and troubleshoot Databricks clusters, jobs, and workflows.
Integrate Databricks with cloud services (AWS, Azure, or GCP) and other enterprise systems.
Document processes, standards, and best practices for data engineering.

Your Profile
Hands-on experience with Databricks, Apache Spark, and PySpark.
Strong knowledge of SQL, Python, and data modeling principles.
Experience with cloud platforms (AWS, Azure, or GCP) and their data services.
Familiarity with Delta Lake, Lakehouse architecture, and data governance.
Understanding of CI/CD pipelines and DevOps practices for data workflows.
Lead Data Engineer
CAPGEMINI ENGINEERING
Bogotá, Bogotá
Posted 13 days ago