Senior Data Engineer
Job Description
We are seeking a Senior Data Engineer (senior individual contributor) to design, build, and operate Databricks-based lakehouse data platforms that support analytics, AI, and Generative AI applications.
This role works within product-aligned squads and focuses on delivering high-quality, governed, and scalable data assets consumed by analytics platforms, machine learning models, and GenAI applications including LLM- and agent-based systems.
Role Clarification
This is a senior individual contributor role.
The role does not include formal people management or technical lead accountability.
The focus is on delivery, quality, and enabling AI and GenAI outcomes.
Responsibilities
Data Engineering & Lakehouse Delivery
- Build and maintain data pipelines and lakehouse structures
- Deliver data solutions that support:
- Analytics and BI
- Machine learning workloads
- Generative AI applications and agents
- Apply enterprise data lake and lakehouse principles to ensure data is:
- Reliable
- Well-governed and aligned to the Bank's governance standards
- Secure
- Fit for downstream consumption
- Translate business and analytical requirements into production-ready data solutions
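To illustrate the kind of pipeline work this involves, here is a minimal sketch in plain Python (field names and the validation rule are hypothetical; production code in this role would typically use PySpark on Databricks):

```python
# Sketch of a curation step: validate raw records and keep only rows that
# are fit for downstream consumption. All names here are illustrative.

def curate(records):
    """Drop records missing required fields and normalise types."""
    required = {"id", "amount", "event_date"}
    curated = []
    for rec in records:
        if not required.issubset(rec):
            continue  # a real pipeline would quarantine these, not drop them
        curated.append({
            "id": str(rec["id"]),
            "amount": float(rec["amount"]),
            "event_date": rec["event_date"],
        })
    return curated

raw = [
    {"id": 1, "amount": "10.5", "event_date": "2024-01-01"},
    {"id": 2, "amount": "3.0"},  # missing event_date, so it is excluded
]
print(curate(raw))
```

The same validate-then-normalise shape scales up to a Spark job: the per-record logic becomes a DataFrame transformation, and the quarantine branch becomes a separate sink.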
Databricks & Platform Usage
- Build and operate solutions using Databricks, including:
- Databricks Jobs and Workflows
- Unity Catalog
- Databricks Bundles
- Notebooks and shared libraries
- Deliver data to downstream consumers, including:
- Analytics and reporting tools
- Downstream operational systems
- Support feature-style and curated data access patterns required by AI and GenAI workloads
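A feature-style access pattern can be sketched as a keyed lookup over curated rows (plain-Python illustration with hypothetical names; on Databricks this data would typically be served from Delta tables):

```python
# Sketch of a feature-style access pattern: curated rows indexed by entity
# key so AI workloads can fetch features per entity. Illustrative only.

def build_feature_store(rows, key="customer_id"):
    """Index curated rows by entity key for point lookups."""
    return {
        row[key]: {k: v for k, v in row.items() if k != key}
        for row in rows
    }

rows = [
    {"customer_id": "c1", "avg_txn": 42.0, "segment": "retail"},
    {"customer_id": "c2", "avg_txn": 7.5, "segment": "sme"},
]
features = build_feature_store(rows)
print(features["c1"])
```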
Generative AI Enablement
- Build data pipelines that feed Generative AI applications, including:
- Curated knowledge datasets
- Structured and semi-structured data sources
- Metadata and lineage required for AI consumption
- Enable data patterns commonly used in GenAI, such as:
- Retrieval-Augmented Generation (RAG)
- Context and prompt data preparation
- Model input, output, and feedback data flows
- Work closely with AI Engineers and Product Owners to align data engineering deliverables to GenAI use cases. Note: this role also involves hands-on AI engineering development work.
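The RAG-oriented data preparation described above can be sketched as chunking curated documents into retrieval records that carry the metadata a retriever needs (names and the chunk size are hypothetical; a real pipeline would also generate embeddings into a vector index):

```python
# Sketch: split a curated document into fixed-size chunks, attaching the
# metadata needed for retrieval (source id, position). Illustrative only.

def chunk_document(doc_id, text, size=40):
    """Return fixed-size text chunks with lineage metadata."""
    chunks = []
    for i in range(0, len(text), size):
        chunks.append({
            "doc_id": doc_id,
            "chunk_no": i // size,
            "text": text[i:i + size],
        })
    return chunks

chunks = chunk_document("policy-001", "Data must be governed. " * 5)
print(len(chunks))
```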
Engineering Quality & Delivery
- Develop production-grade pipelines using Python, PySpark, SQL, and Apache Spark
- Implement automated testing and CI/CD practices for data workloads
- Ensure data solutions are observable and resilient
- Contribute to improving data quality, reliability, and operational stability
Collaboration & Ways of Working
- Work as a senior engineer within a cross-functional product squad
- Collaborate closely with:
- Product Owners
- Analytics teams
- Platform and security teams
- Provide engineering input into design discussions and delivery decisions
- Support peer reviews and shared engineering standards
Risk, Governance & Run
- Ensure data solutions comply with enterprise security, risk, and governance standards
- Support operational stability of data pipelines used by analytics and AI workloads
- Participate in incident resolution and root cause analysis
- Maintain appropriate documentation and runbooks
Background and Experience Required
- Years of experience: 6+ years of industry experience
Must-have Skills
Minimum years of experience are noted per skill.
Technical Skills
- Proven experience as a Senior or Lead Data Engineer (6+ years)
- Hands-on experience working in Databricks environments (2+ years)
- Strong understanding of enterprise data lake and lakehouse architecture (6+ years)
- Proficiency in:
- Python (3+ years)
- SQL (3+ years)
- Apache Spark (3+ years)
- Experience building and operating production-grade data platforms (3+ years)
- Experience working in enterprise or regulated environments (5+ years)
Beneficial Skills
- Experience enabling AI, ML, or Generative AI use cases from a data engineering perspective
- Familiarity with:
- RAG data patterns
- Feature-style or AI-serving datasets
- Vector or embedding-ready data workflows
- Experience working in Agile, product-aligned squads
- Exposure to cloud-native data platforms (AWS or Azure)
Keywords: Python, SQL, PySpark, Apache Spark, CI/CD, Agile, Databricks, Data Engineering, Data Pipelines, Delta Lake, Lakehouse data platforms, Enterprise data lake, Unity Catalog, Databricks Jobs and Workflows, Databricks Bundles, Notebooks, shared libraries, Generative AI, Retrieval-Augmented Generation (RAG), LLM, agent-based systems, AI, analytics, cloud-native data platforms (AWS or Azure)