Senior Data Engineer
Karachi
Remote
Project-based role
We are seeking a highly skilled Cloudera to Databricks Migration Specialist to drive and execute the migration of our data platform from Cloudera to Databricks. In this role, you will be responsible for planning, designing, and implementing the migration process, ensuring minimal downtime, and preserving data integrity. Your expertise in data engineering, cloud computing, and both Cloudera and Databricks platforms will be key to delivering a successful transition.
Description
- Develop and execute a clear migration strategy from Cloudera to Databricks, identifying key tasks, timelines, and resources needed for the migration process.
- Collaborate with the engineering and data teams to design an optimal architecture for Databricks, ensuring scalability, security, and performance are maintained.
- Lead the effort in migrating large datasets and complex ETL pipelines from Cloudera to Databricks while ensuring minimal data loss and downtime.
- Ensure that the migrated systems integrate seamlessly with other platforms, such as data lakes, cloud storage, and external applications.
- Optimize the performance of Databricks environments by leveraging best practices, managing resource allocation, and addressing any performance bottlenecks during and post-migration.
- Work closely with cross-functional teams, including data engineers, data scientists, cloud architects, and other stakeholders, to ensure smooth execution of the migration.
- Document the migration process, challenges, and solutions, and provide training to internal teams to ensure long-term success with Databricks.
Requirements
- 6+ years of experience in data engineering and cloud solutions, with a strong focus on Cloudera and Databricks platforms.
- Expert-level experience with Cloudera ecosystem tools (e.g., Hadoop, Hive, Impala, HBase) and Databricks platform (e.g., Apache Spark, Delta Lake, Databricks SQL).
- Hands-on experience in migrating data pipelines, workflows, and processing systems between Cloudera and Databricks, including both structured and unstructured data.
- Familiarity with cloud platforms such as AWS, Azure, or GCP, and their integration with Databricks.
- Excellent communication and collaboration skills to interact with both technical teams and nontechnical stakeholders.
- Excellent written and verbal communication skills.