Ortseam Technologies

Machine Learning Engineer with Data Engineering

Ortseam Technologies · Hybrid - Bengaluru, India

Actively hiring · Posted 4 months ago

Job description: ML Data Engineer

Location: Bangalore

Work: Onsite (Mon - Thu) and Remote on Friday

Contract to Hire: 4-month contract to full-time employment

Job Title: Machine Learning Engineer (Data Engineering Focus), Databricks, Retail Grocery

Overview: We are seeking a Machine Learning Engineer with strong Data Engineering skills to build and operationalize scalable data and ML solutions on Databricks running on Google Cloud Platform (GCP). This role focuses on developing end-to-end data pipelines, feature engineering, and production ML workflows that power critical retail grocery use cases such as demand forecasting, personalization, promotions, pricing, and inventory optimization. You will work across data engineering, data science, and platform teams to deliver reliable, production-grade ML systems at scale.

Key Responsibilities:

Data Engineering & Platform

  • Design, build, and optimize batch and streaming data pipelines using Databricks (Spark / Structured Streaming).
  • Develop robust ETL/ELT pipelines ingesting retail data (POS, transactions, customer, inventory, promotions, supplier data).
  • Implement Delta Lake tables with best practices for performance, schema evolution, and data quality.
  • Orchestrate pipelines using Databricks Workflows and/or Cloud Composer (Airflow).
  • Ensure data reliability, observability, and cost efficiency across pipelines.

Machine Learning Engineering

  • Build and productionize ML pipelines using Databricks MLflow, Databricks Feature Store, and Spark ML / Python ML frameworks.
  • Collaborate with data scientists to convert experiments into scalable, reusable ML pipelines.
  • Deploy and manage batch and real-time inference workflows within Databricks.
  • Optimize model training and inference for performance and cost.

MLOps & Best Practices

  • Implement ML lifecycle management using MLflow (experiment tracking, model registry, versioning).
  • Enable CI/CD for data and ML pipelines using Git-based workflows.
  • Monitor model performance, data drift, and pipeline health.
  • Enforce best practices around testing, code quality, and reproducibility.

Retail Analytics & Collaboration

  • Partner with business, analytics, and product teams to translate retail grocery use cases into data and ML solutions.
  • Provide technical guidance on Spark optimization, data modeling, and ML architecture.
  • Contribute to platform standards and reusable components.

Required Qualifications

  • 3+ years of experience in Data Engineering and/or Machine Learning Engineering.
  • Strong hands-on experience with Databricks:
      • Apache Spark (PySpark / Spark SQL)
      • Delta Lake
      • MLflow
  • Strong proficiency in Python and SQL.
  • Experience with Databricks on GCP.
  • Familiarity with Cloud Composer (Airflow).
  • Experience with BigQuery, Pub/Sub, or GCP storage services.
  • Experience building production-grade data pipelines at scale.
  • Solid understanding of ML concepts, feature engineering, and model evaluation.
  • Experience deploying ML models in distributed environments.

Preferred

  • Retail, grocery, e-commerce, or CPG domain experience.
  • Experience with demand forecasting, recommendation systems, or pricing models.
  • Exposure to real-time/streaming ML use cases.

What Success Looks Like

  • ML solutions move from prototype to production efficiently.
  • Data pipelines are scalable, reliable, and cost-optimized.
  • Business teams rely on ML outputs for core retail decisions.
  • The Databricks platform supports multiple ML use cases with minimal friction.

Tech Stack

Cloud & Platform: GCP, Databricks

Data Processing & Storage: Apache Spark (PySpark, Spark SQL), Delta Lake

Programming Languages: Python, SQL

Machine Learning & MLOps: MLflow, Databricks Feature Store, Spark ML

Orchestration & Streaming: Databricks Workflows, Airflow (Cloud Composer), Pub/Sub

Analytics & Data Services: BigQuery
