Posted 9 Apr 2026, 1:32 am

Offshore Pod Leads (TBD, AN, IN)

at NTT DATA

India · Full-time · Nomad Score: Low (0)

Req ID: 360093

NTT DATA strives to hire exceptional, innovative and passionate individuals who want to grow with us. If you want to be part of an inclusive, adaptable, and forward-thinking organization, apply now.

We are currently seeking Offshore Pod Leads to join our team in TBD, Andaman and Nicobar Islands (IN-AN), India (IN).

JOB DESCRIPTION

Data Engineering Pod Lead

Databricks Lakehouse Migration Program

Two Roles: Informatica Pod Lead | AWS Glue Pod Lead

Engagement Type: Contract / Staff Augmentation or Full-Time Employee (FTE) — Open

Seniority Level: Lead / Architect — 12+ years of relevant experience

Number of Openings: 2 (one per pod)

Team Size: 4–6 Data Engineers per pod lead

Cloud Platform: AWS (Glue, Redshift, S3, Kinesis Streams, IAM, CloudWatch)

Target Platform: Databricks Lakehouse (Unity Catalog, Delta Lake, Workflows)

Program Type: Client-facing migration engagement — ETL modernization

Program Context & Opportunity

Our client is undertaking a large-scale data platform modernization initiative: migrating from a legacy ETL ecosystem (Informatica PowerCenter, AWS Glue, and Amazon Kinesis Streams) that feeds Amazon Redshift to a unified Databricks Lakehouse architecture built on Delta Lake. This is a high-impact, high-visibility program requiring experienced technical leaders who can navigate complex legacy systems, architect modern solutions, and lead skilled engineering teams through the full migration lifecycle.

We are hiring two dedicated Pod Leads — one for each legacy source domain — who will be jointly accountable for technical excellence, delivery velocity, and team development throughout the engagement.

Common Responsibilities — Both Pod Leads

Technical Leadership & Architecture

  • Own the end-to-end technical design and implementation of the migration from the respective source platform to Databricks Lakehouse (Delta Lake, Unity Catalog, Databricks Workflows).
  • Conduct thorough assessments of existing ETL jobs — analyzing lineage, dependencies, transformation logic, scheduling, and data quality rules — prior to migration planning.
  • Define migration patterns, reusable frameworks, and coding standards adopted across the pod.
  • Architect scalable, cost-efficient pipelines using Databricks PySpark, Spark SQL, and Delta Live Tables (DLT) as appropriate (a brief DLT sketch follows this list).
  • Make and document key architectural decisions (ADRs) with clear rationale and trade-off analysis.
  • Drive adoption of software engineering best practices: version control (Git), CI/CD, unit testing, and code review within the pod.
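
A minimal sketch of the kind of Delta Live Tables pipeline the pods would standardize on, written in PySpark. The table names, S3 path, and columns below are illustrative assumptions, not client specifics; `spark` is the SparkSession Databricks provides inside a DLT pipeline:

    import dlt
    from pyspark.sql import functions as F

    # Bronze: raw orders landed from the legacy extract (path is hypothetical).
    @dlt.table(comment="Raw orders landed from the legacy extract")
    def orders_bronze():
        return spark.read.format("json").load("s3://example-bucket/landing/orders/")

    # Silver: cleaned and de-duplicated, with a simple data quality expectation.
    @dlt.table(comment="Cleaned, de-duplicated orders")
    @dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")
    def orders_silver():
        return (
            dlt.read("orders_bronze")
            .withColumn("order_ts", F.to_timestamp("order_ts"))
            .dropDuplicates(["order_id"])
        )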

Team Leadership & Delivery Management

  • Directly lead a pod of 4–6 Data Engineers, providing technical mentorship, task assignment, code reviews, and unblocking day-to-day impediments.
  • Manage sprint planning, backlog refinement, and progress tracking against migration milestones in close coordination with the Program Manager.
  • Hold the team accountable for quality and velocity — proactively flag risks, scope changes, and dependencies before they become blockers.
  • Conduct regular 1:1s and technical feedback sessions to support the professional growth of pod members.
  • Foster a culture of ownership, collaboration, and continuous improvement within the pod.

Client & Stakeholder Communication

  • Serve as the primary technical point of contact for your pod's workstream with the client.
  • Translate complex technical concepts and migration trade-offs into clear, concise communications for both technical and non-technical stakeholders.
  • Participate in program-level status reviews, architecture governance meetings, and client steering committees as required.
  • Manage expectations around scope, timelines, and quality, escalating issues appropriately.

Quality, Governance & Documentation

  • Ensure all migrated pipelines meet data quality, SLA, and observability requirements defined by the client.
  • Champion data governance best practices including lineage tracking, catalog registration in Databricks Unity Catalog, and access control alignment (see the example after this list).
  • Produce and maintain clear technical documentation: architecture diagrams, runbooks, migration playbooks, and handover materials.
  • Coordinate with QA/testing resources to validate migrated pipelines against source-system outputs.
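
As a small illustration of the catalog-registration and access-control bullet above, a migrated Delta table might be registered under Unity Catalog's three-level namespace and granted to an analyst group. The catalog, schema, table, location, and group names here are assumptions for illustration; `spark` is the Databricks-provided SparkSession:

    # Register a migrated Delta table under Unity Catalog (names are hypothetical).
    spark.sql("""
        CREATE TABLE IF NOT EXISTS analytics_prod.sales.orders
        USING DELTA
        LOCATION 's3://example-bucket/lakehouse/sales/orders'
    """)

    # Align access control with the client's existing analyst group.
    spark.sql("GRANT SELECT ON TABLE analytics_prod.sales.orders TO `data-analysts`")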

ROLE 1 — Informatica PowerCenter Pod Lead

Role Overview

The Informatica Pod Lead will own the migration of Informatica PowerCenter-based ETL jobs to the Databricks Lakehouse platform. This role demands deep expertise in Informatica's architecture, transformation logic, and metadata — paired with the ability to re-engineer complex legacy workflows into modern, cloud-native Databricks pipelines on AWS.

Role-Specific Responsibilities

  • Analyze and decompose Informatica PowerCenter mappings, sessions, workflows, and worklets to understand full transformation logic, source/target connectivity, and scheduling dependencies.
  • Define and execute a structured migration methodology — assess, convert, validate — for translating Informatica logic into equivalent PySpark/Spark SQL code on Databricks.
  • Identify opportunities to simplify or consolidate legacy transformations during migration rather than performing a lift-and-shift.
  • Manage Informatica repository metadata, mapping exports (XML), and PowerCenter Designer artifacts as inputs to the migration pipeline.
  • Coordinate with source system owners (databases, flat files, legacy APIs) to ensure source connectivity is preserved or rerouted through AWS S3/Glue Catalog during migration.
  • Validate migrated pipelines against Informatica source outputs using row-count reconciliation, checksum comparisons, and business rule validation (see the sketch after this list).
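
A minimal PySpark sketch of the validation approach described above (row-count reconciliation plus row-level checksums). The table names and key column are hypothetical, both tables are assumed to share the same schema, and `spark` is the Databricks-provided SparkSession:

    from pyspark.sql import functions as F

    legacy = spark.read.table("staging.informatica_customers_extract")
    migrated = spark.read.table("lakehouse.silver.customers")

    # 1) Row-count reconciliation.
    print("legacy rows:", legacy.count(), "| migrated rows:", migrated.count())

    # 2) Checksum comparison: hash every column (in a stable order) per row, then
    #    subtract to surface rows whose content differs between the two tables.
    def with_row_hash(df):
        cols = sorted(df.columns)
        return df.withColumn(
            "row_hash",
            F.sha2(
                F.concat_ws("||", *[F.coalesce(F.col(c).cast("string"), F.lit("")) for c in cols]),
                256,
            ),
        )

    mismatches = (
        with_row_hash(legacy).select("customer_id", "row_hash")
        .subtract(with_row_hash(migrated).select("customer_id", "row_hash"))
    )
    print("legacy rows with no exact match in the migrated table:", mismatches.count())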

Required Qualifications

Technical Skills

  • 12+ years in data engineering, including 5+ years of hands-on Informatica PowerCenter experience (mappings, sessions, workflows, transformations, parameter files, Workflow Monitor).
  • Strong proficiency in PySpark and Spark SQL for building production-grade ETL/ELT pipelines.
  • Hands-on experience with Databricks (Spark clusters, notebooks, Jobs/Workflows, Delta Lake) — Databricks certification preferred.
  • Solid understanding of AWS data services: S3, Redshift, Glue Data Catalog, IAM, CloudWatch.
  • Experience migrating or re-platforming Informatica workloads to a modern data platform (Databricks, Spark, or cloud-native ETL).
  • Proficiency in SQL and familiarity with Redshift-specific SQL dialects and optimization patterns.
  • Familiarity with Unity Catalog, Delta Live Tables, or similar data governance/pipeline orchestration frameworks is a strong plus.
  • Experience with CI/CD tooling (Git, GitHub Actions, Jenkins, or similar) applied to data pipeline development.

Leadership & Soft Skills

  • Proven track record leading a team of 4+ data engineers in a delivery-focused engagement or program.
  • Strong analytical and problem-solving skills with the ability to work through ambiguous, undocumented legacy systems.
  • Excellent written and verbal communication; able to present technical findings and migration plans to client stakeholders.
  • Experience working in Agile/Scrum delivery environments.
  • Consulting or client-engagement experience is a significant advantage.

ROLE 2 — AWS Glue Pod Lead

Role Overview

The AWS Glue Pod Lead will own the migration of AWS Glue-based ETL jobs (feeding Amazon Redshift) to the Databricks Lakehouse platform. This role requires deep expertise across the AWS data ecosystem — particularly Glue, Redshift, S3, and IAM — combined with the architectural vision to translate cloud-native ETL patterns into optimized Databricks pipelines that leverage the full power of Delta Lake.

Role-Specific Responsibilities

  • Audit and catalog all existing AWS Glue jobs — including PySpark and Python shell scripts, Glue Crawlers, Glue Data Catalog configurations, triggers, and job bookmarks.
  • Assess Redshift loading patterns (COPY commands, stored procedures, views, materialized views) and define equivalent target-state patterns in Databricks using Delta Lake MERGE, upsert, and partition strategies.
  • Evaluate and migrate Glue Crawlers and Glue Data Catalog schemas to Databricks Unity Catalog, ensuring metadata consistency and lineage continuity.
  • Redesign Glue workflows and triggers as Databricks Workflow DAGs, preserving scheduling intent while improving observability and retry logic.
  • Collaborate with AWS and cloud infrastructure teams to manage IAM role transitions, S3 access patterns, and network configurations during and after migration.
  • Validate migrated pipelines against Glue/Redshift source outputs, including Redshift audit tables, row counts, and business-critical KPI reconciliation.
  • Assess and migrate Amazon Kinesis Streams-based ingestion pipelines — analyzing stream consumers, shard configurations, and downstream processing logic — and re-architect them using Databricks Structured Streaming with Delta Lake as the target sink.
  • Design low-latency streaming pipeline patterns on Databricks (Auto Loader, Structured Streaming) to replace Kinesis consumer applications, ensuring at-least-once or exactly-once delivery semantics are preserved (see the sketch after this list).
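
One common pattern for the last two bullets, sketched here under stated assumptions, is to replace a Kinesis consumer with Auto Loader on Structured Streaming and upsert each micro-batch into Delta with MERGE. The S3 paths, target table, and business key are illustrative, not the client's actual design; `spark` is the Databricks-provided SparkSession:

    from delta.tables import DeltaTable

    # Incrementally pick up files landed in S3 with Auto Loader
    # (landing path and schema location are hypothetical).
    events = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/events/")
        .load("s3://example-bucket/landing/events/")
    )

    # Upsert each micro-batch into the target Delta table; MERGE on the business
    # key keeps the load idempotent and preserves effectively-once semantics.
    def upsert_batch(batch_df, batch_id):
        target = DeltaTable.forName(spark, "lakehouse.silver.events")
        (
            target.alias("t")
            .merge(batch_df.alias("s"), "t.event_id = s.event_id")
            .whenMatchedUpdateAll()
            .whenNotMatchedInsertAll()
            .execute()
        )

    (
        events.writeStream
        .foreachBatch(upsert_batch)
        .option("checkpointLocation", "s3://example-bucket/_checkpoints/events/")
        .trigger(availableNow=True)
        .start()
    )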

Required Qualifications

Technical Skills

  • 12+ years in data engineering, including 5+ years of hands-on AWS Glue experience (PySpark ETL scripts, Python shell jobs, Glue Studio, Crawlers, Data Catalog, job bookmarks, triggers).
  • Deep expertise in Amazon Redshift — data modeling, distribution/sort keys, COPY/UNLOAD operations, stored procedures, performance tuning, and Redshift Spectrum.
  • Hands-on experience with Amazon Kinesis Streams — stream consumers (KCL/Lambda/Glue Streaming), shard management, retention policies, and integration with downstream AWS services.
  • Strong AWS platform proficiency: S3, IAM, CloudWatch, Kinesis Streams, AWS Secrets Manager, Lake Formation — AWS Solutions Architect or Data Analytics certification preferred.
  • Strong proficiency in PySpark and Spark SQL for building and optimizing production pipelines on Databricks.
  • Hands-on experience with Databricks (Spark clusters, notebooks, Jobs/Workflows, Delta Lake) — Databricks certification preferred.
  • Experience migrating Glue-based workloads to Databricks or equivalent Spark-based platforms.
  • Familiarity with Unity Catalog, Delta Live Tables, and Databricks Asset Bundles is a strong plus.
  • Experience with CI/CD tooling (Git, GitHub Actions, AWS CodePipeline, or similar) applied to data pipeline development.

Leadership & Soft Skills

  • Proven track record leading a team of 4+ data engineers in a delivery-focused engagement or program.
  • Strong ability to navigate AWS service interdependencies and translate cloud infrastructure nuances into migration decisions.
  • Excellent written and verbal communication; able to present technical migration plans and risk assessments to client stakeholders.
  • Experience working in Agile/Scrum delivery environments.
  • Consulting or client-engagement experience is a significant advantage.

Preferred Qualifications — Both Roles

  • Experience with real-time or near-real-time streaming pipelines using Databricks Structured Streaming, Delta Live Tables, or Apache Kafka — particularly migrating from Amazon Kinesis-based architectures.
  • Databricks Certified Data Engineer Associate or Professional certification.
  • AWS Certified Data Analytics – Specialty or AWS Certified Solutions Architect.
  • Prior experience on a large-scale ETL migration or data platform modernization program.
  • Familiarity with data observability tools (Monte Carlo, Great Expectations, Deequ) or Databricks built-in data quality frameworks.
  • Experience with infrastructure-as-code tools such as Terraform or AWS CDK for managing Databricks workspace configurations.
  • Knowledge of data mesh principles, medallion architecture (Bronze/Silver/Gold), and lakehouse design patterns.
  • Prior consulting, systems integration, or professional services delivery experience.

Confidential — For Internal Distribution and Candidate Use Only

About NTT DATA

NTT DATA is a $30 billion business and technology services leader, serving 75% of the Fortune Global 100. We are committed to accelerating client success and positively impacting society through responsible innovation. We are one of the world's leading AI and digital infrastructure providers, with unmatched capabilities in enterprise-scale AI, cloud, security, connectivity, data centers and application services. Our consulting and industry solutions help organizations and society move confidently and sustainably into the digital future. As a Global Top Employer, we have experts in more than 50 countries. We also offer clients access to a robust ecosystem of innovation centers as well as established and start-up partners. NTT DATA is part of NTT Group, which invests over $3 billion each year in R&D.

Whenever possible, we hire locally to NTT DATA offices or client sites. This ensures we can provide timely and effective support tailored to each client’s needs. While many positions offer remote or hybrid work options, these arrangements are subject to change based on client requirements. For employees near an NTT DATA office or client site, in-office attendance may be required for meetings or events, depending on business needs. At NTT DATA, we are committed to staying flexible and meeting the evolving needs of both our clients and employees. NTT DATA recruiters will never ask for payment or banking information and will only use @nttdata.com and @talent.nttdataservices.com email addresses. If you are requested to provide payment or disclose banking information, please submit a contact us form, https://us.nttdata.com/en/contact-us.

NTT DATA endeavors to make https://us.nttdata.com accessible to any and all users. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process, please contact us at https://us.nttdata.com/en/contact-us. This contact information is for accommodation requests only and cannot be used to inquire about the status of applications. NTT DATA is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status. For our EEO Policy Statement, please click here. If you'd like more information on your EEO rights under the law, please click here. For Pay Transparency information, please click here.

