We are looking for a Cloud Data Engineer with experience building data products using Databricks and related technologies for one of our prestigious clients. The position is REMOTE. The brief JD is below.
JD:
Overall Exp - min 8 yrs
Exp in Cloud Data Eng - min 6 yrs
Exp in AWS - min 5 yrs
Exp in Databricks - min 4 yrs
Exp in Data Architecture
Work Mode: REMOTE
Interview: 2/3 Rounds Virtual
NP: IMMEDIATE / Max 15 to 20 days (if your NP is more than 20 days and you are unable to join on or before 5th of June, please excuse)
Employment: On PMT FTE basis with client (No third party)
Please reply with:
T.EXP
R.EXP in Databricks
R.EXP in AWS
CCTC
ECTC
NP
UPDATED RESUME
Primary Responsibilities
Analyze and understand existing data warehouse implementations to support migration and consolidation efforts.
Reverse-engineer legacy stored procedures (PL/SQL, SQL) and translate business logic into scalable Spark SQL code within Databricks notebooks.
Design and develop data lake solutions on AWS using S3 and Delta Lake architecture, leveraging Databricks for processing and transformation.
Build and maintain robust data pipelines using ETL tools with ingestion into S3 and processing in Databricks.
Collaborate with data architects to implement ingestion and transformation frameworks aligned with enterprise standards.
Evaluate and optimize data models (Star, Snowflake, Flattened) for performance and scalability in the new platform.
Document ETL processes, data flows, and transformation logic to ensure transparency and maintainability.
Perform foundational data administration tasks including job scheduling, error troubleshooting, performance tuning, and backup coordination.
Work closely with cross-functional teams to ensure smooth transition and integration of data sources into the unified platform.
Participate in Agile ceremonies and contribute to sprint planning, retrospectives, and backlog grooming.
Triage, debug, and fix technical issues related to data lakes.
Maintain and manage code repositories such as Git.
You Must Have:
8-12 years of experience in data engineering, data warehousing, or big data platform development.
5+ years of hands-on experience with Databricks including Spark, Delta Lake, and performance optimization.
4+ years of experience designing and implementing cloud-based data lake or lakehouse architectures on Databricks.
Strong expertise in Spark technologies including Spark SQL, PySpark, or Scala.
Advanced SQL and PL/SQL skills with the ability to interpret and refactor legacy stored procedures.
Strong understanding of data modeling techniques including Star Schema, Snowflake, and modern Lakehouse modeling approaches.
Proficiency in at least one programming language (Python, Scala, Java).
Hands-on experience with data modeling and warehouse design principles.
Experience designing enterprise-scale ETL/ELT pipelines and ingestion frameworks.
Strong understanding of performance tuning, partitioning strategies, and cost optimization for large-scale data platforms.
Experience implementing CI/CD pipelines and DevOps practices for data engineering workflows.
Bachelor's degree in Computer Science, Information Technology, Data Engineering, or a related field.
Experience working in Agile environments and contributing to iterative development cycles.
We Value
Databricks Professional or Associate Cloud Certification.
Experience with enterprise data governance, metadata management, and data catalog platforms.
With regards,