Senior Data Engineer

Indsafri India Private Limited

Job Description

We are seeking a Senior Data Engineer (senior individual contributor) to design, build, and operate Databricks & Lakehouse data platforms that support analytics, AI, and Generative AI applications.

This role works within product-aligned squads and focuses on delivering high-quality, governed, and scalable data assets consumed by analytics platforms, machine learning models, and GenAI applications including LLM- and agent-based systems.

Role Clarification

This is a senior individual contributor role.

The role does not include formal people management or technical lead accountability.

The focus is on delivery, quality, and enabling AI and GenAI outcomes.

Job Description / Responsibilities

Data Engineering & Lakehouse Delivery

  • Build and maintain data pipelines and lakehouse structures
  • Deliver data solutions that support:
  • Analytics and BI
  • Machine learning workloads
  • Generative AI applications and agents
  • Apply enterprise data lake and lakehouse principles to ensure data is:
  • Reliable
  • Well-governed and aligned to the Bank's governance standards
  • Secure
  • Fit for downstream consumption
  • Translate business and analytical requirements into production-ready data solutions

Databricks & Platform Usage

  • Build and operate solutions using Databricks, including:
  • Databricks Jobs and Workflows
  • Unity Catalog
  • Databricks Bundles
  • Notebooks and shared libraries
  • Analytics and reporting tools
  • Downstream operational systems
  • Support feature-style and curated data access patterns required by AI and GenAI workloads

Generative AI Enablement

  • Build data pipelines that feed Generative AI applications, including:
  • Curated knowledge datasets
  • Structured and semi-structured data sources
  • Metadata and lineage required for AI consumption
  • Enable data patterns commonly used in GenAI, such as:
  • Retrieval-Augmented Generation (RAG)
  • Context and prompt data preparation
  • Model input, output, and feedback data flows
  • Work closely with AI Engineers and Product Owners to align data engineering deliverables to GenAI use cases. Note: this role also contributes to AI engineering development work.
  • Develop production-grade pipelines using Python, PySpark, SQL, and Apache Spark
  • Implement automated testing and CI/CD practices for data workloads
  • Ensure data solutions are:
  • Observable
  • Resilient
  • Contribute to improving data quality, reliability, and operational stability

Collaboration & Ways of Working

  • Work as a senior engineer within a cross-functional product squad
  • Collaborate closely with:
  • Product Owners
  • Analytics teams
  • Platform and security teams
  • Provide engineering input into design discussions and delivery decisions
  • Support peer reviews and shared engineering standards

Risk, Governance & Run

  • Ensure data solutions comply with enterprise security, risk, and governance standards
  • Support operational stability of data pipelines used by analytics and AI workloads
  • Participate in incident resolution and root cause analysis
  • Maintain appropriate documentation and runbooks

Background and experience required

  • Years of Experience
  • 6+ years industry experience

Must-have Skills (Mandatory Skills)

Technical Skills

  • Proven experience as a Senior / Lead Data Engineer (6+ years)
  • Hands-on experience working in Databricks environments (2+ years)
  • Strong understanding of enterprise data lake and lakehouse architecture (6+ years)
  • Proficiency in:
  • Python (3+ years)
  • SQL (3+ years)
  • Apache Spark (3+ years)
  • Experience building and operating production-grade data platforms (3+ years)
  • Experience working in enterprise or regulated environments (5+ years)

Beneficial Skills (Desired Skills)

  • Experience enabling AI, ML, or Generative AI use cases from a data engineering perspective
  • Familiarity with:
  • RAG data patterns
  • Feature-style or AI-serving datasets
  • Vector or embedding-ready data workflows
  • Experience working in Agile, product-aligned squads
  • Exposure to cloud-native data platforms (AWS or Azure)

Skills

Python, SQL, PySpark, CI/CD, Generative AI, Data Pipelines, AI, Analytics, Apache Spark, Agile, Databricks, Data Engineering, Delta Lake, Retrieval-Augmented Generation (RAG), Unity Catalog, Lakehouse data platforms, Agent-based systems, LLM, Enterprise data lake, Notebooks, Databricks Jobs and Workflows, Shared libraries, Databricks Bundles, Cloud-native data platforms (AWS or Azure)
