AION · 2h ago
Career Pages

Machine Learning Engineer

Bengaluru, Karnataka, India
Full Time
Mid Level

Full Job Description

We are seeking a hands-on Machine Learning Engineer with 4-6 years of experience to join our team in Bengaluru, Karnataka, India. This role focuses on building and fine-tuning large language models (LLMs) and transformer-based models, tackling complex problems at the intersection of ML research and production systems.

You will be involved in the entire ML development lifecycle, including data preparation, model fine-tuning, evaluation, and optimization. A strong understanding of what drives model performance and how to systematically improve it through experimentation is key. Experience with LLM fine-tuning techniques (LoRA, QLoRA), RLHF pipelines, and comprehensive model evaluation is highly desirable. We are looking for an individual with strong ownership, initiative, and a passion for developing production-ready ML models that will impact thousands of developers worldwide.

What You'll Do:

ML Model Development & Optimization

  • Design and implement end-to-end LLMOps pipelines for model training, fine-tuning, and evaluation.
  • Fine-tune and customize LLMs (e.g., Llama, Mistral, Gemma) using full fine-tuning and PEFT techniques (LoRA, QLoRA) with tools such as Unsloth, Axolotl, and HuggingFace Transformers.
  • Implement Reinforcement Learning from Human Feedback (RLHF) pipelines for model alignment and preference optimization.
  • Design experiments for automated hyperparameter tuning, training strategies, and model selection.
  • Prepare and validate training datasets, ensuring data quality, preprocessing, and format correctness.
  • Build comprehensive model evaluation systems covering standard metrics (BLEU, ROUGE, perplexity, accuracy) as well as custom ones, and develop synthetic data generation pipelines.
  • Optimize model accuracy, token efficiency, and training performance through systematic experimentation.
  • Design and maintain prompt engineering workflows with version control systems.
  • Deploy models using vLLM with multi-adapter LoRA serving, hot-swapping, and basic optimizations like speculative decoding, continuous batching, and KV cache management.

ML Operations & Technical Leadership

  • Set up ML-specific monitoring for model quality, drift detection, and performance tracking, with automated retraining triggers.
  • Manage model versioning, artifact storage, lineage tracking, and reproducibility using experiment tracking tools.
  • Debug production model issues and optimize cost-performance trade-offs for training and inference.
  • Collaborate with infrastructure engineers on ML-specific compute requirements and deployment pipelines.
  • Document model development processes and share knowledge through internal tech talks.

Technical Skills & Experience:

We encourage you to apply if you meet some of these requirements and are eager to learn the rest.

  • 4-6 years of hands-on experience in machine learning engineering or applied ML roles.
  • Strong fine-tuning experience with modern LLMs, including practical knowledge of transformer architectures, attention mechanisms, and PEFT techniques (LoRA/QLoRA).
  • Deep understanding of transformer architectures and their modern variants (MoE, Grouped-Query Attention), efficient attention implementations such as Flash Attention, and alternative architectures such as state space models.
  • Production ML experience, including building and fine-tuning models for real-world applications.
  • Proficiency in Python and ML frameworks such as PyTorch, HuggingFace Transformers, PEFT, and TRL, with hands-on experience in tools like Unsloth and Axolotl.
  • Experience building model evaluation systems with metrics like BLEU, ROUGE, perplexity, and accuracy.
  • Hands-on experience with prompt engineering, synthetic data generation, and data preprocessing pipelines.
  • Basic deployment experience with vLLM, including multi-adapter serving, hot-swapping, and inference optimizations.
  • Understanding of GPU computing concepts such as memory management, multi-GPU training, mixed precision, and gradient accumulation.
  • Strong debugging skills for training failures, OOM errors, convergence issues, and data quality problems.
  • Experience with model alignment techniques (RLHF, DPO) and implementing RLHF pipelines is highly desirable.
  • Experience with distributed training (DeepSpeed, FSDP, DDP) is a plus.
  • Knowledge of model quantization techniques (GPTQ, AWQ) and their impact on model quality is desirable.
  • Prior experience with AWS SageMaker and with experiment tracking tools (MLflow, Weights & Biases) is a strong plus.
  • Exposure to cloud platforms (AWS/GCP/Azure) for training workloads is beneficial.
  • Familiarity with Docker containerization for reproducible training environments.

Preferred Attributes:

  • High ownership, self-driven, and a bias for action.
  • Strong strategic thinking and the ability to connect technical decisions to business impact.
  • Excellent communication and mentoring skills.
  • Thrives in ambiguous, fast-paced environments and early-stage startup cultures.

Why Join AION?

  • Work directly with high-pedigree founders shaping technical and product strategy.
  • Contribute to building the infrastructure powering the future of AI compute globally.
  • Significant ownership and impact with equity reflective of your contributions.
  • Competitive compensation, flexible work options, and wellness benefits.

If you are a machine learning engineer ready to lead ML-as-a-Service (MLaaS) architecture and scale next-generation AI infrastructure, we encourage you to apply. Please include the following in your application summary:

  • Your resume highlighting relevant projects and leadership experience.
  • Links to products, code (GitHub), or demos you have built.
  • A brief note explaining why AION’s mission excites you.

Company

AION

AION is pioneering a decentralized AI cloud platform designed for high-performance computing (HPC). We are transforming the future of compute by democratizing access and offering managed services, aim...
