Lead Data Engineer - Scalable Data Pipelines - Contract to Hire

Remote Full-time
Lead Data Engineer (PySpark, Airflow, Azure) – Scalable Data Pipelines

We’re looking for an experienced Senior Data Engineer to design, build, and optimize large-scale data pipelines powering analytics and machine learning workloads. This role is ideal for someone who is hands-on, performance-oriented, and comfortable leading other engineers while owning end-to-end data workflows.

You’ll work on both batch and real-time processing, take ownership of Spark performance tuning, and help enforce best practices around data quality, governance, and reliability.

⸻

Responsibilities

• Design, develop, and optimize scalable data pipelines using Python, PySpark, Apache Spark, and Airflow
• Build and maintain batch and streaming data processing systems on Spark
• Design and manage Airflow DAGs to orchestrate complex, dependency-heavy workflows
• Implement data partitioning, caching, and Spark performance tuning to handle large datasets efficiently
• Ensure data quality, governance, security, and reliability across the data lifecycle
• Monitor, troubleshoot, and optimize data jobs, SLAs, and pipeline dependencies
• Manage cloud infrastructure (Azure) for data workloads, including cost optimization
• Implement CI/CD pipelines for data workflows using Git, Docker, and Infrastructure-as-Code tools
• Support analytics and ML use cases by working with structured and unstructured data
• Lead and mentor other data engineers, providing architectural guidance and code reviews
• Promote best practices in coding standards, documentation, and version control
• Collaborate effectively with distributed, remote teams in an Agile environment

⸻

✅ Requirements

• 8+ years of hands-on experience in Data Engineering
• Strong expertise with Apache Spark / PySpark, including internals such as RDDs, DataFrames, DAG execution, partitioning, shuffles, and caching
• Proven experience building and operating Airflow DAGs (scheduling, dependencies, retries, SLAs)
• Advanced Python and SQL skills with a focus on performance and maintainability
• Solid experience with Azure data and compute infrastructure
• Working knowledge of Docker, Kubernetes, Terraform, and CI/CD best practices
• Strong problem-solving skills and ability to optimize large-scale data processing systems
• Prior experience leading or mentoring engineers
• Comfortable working in Agile/Scrum environments
• Excellent communication skills and ability to collaborate with remote teams

⸻

⭐ Nice to Have

• Experience with streaming frameworks (Spark Structured Streaming, Kafka, Event Hubs)
• Familiarity with data governance, lineage, and observability tools
• Experience supporting ML or advanced analytics pipelines
• Background in cost-efficient Spark optimization at scale
