Machine Learning Engineer Intern

GMI cloud

About Us

At GMI, we are at the forefront of scalable AI infrastructure solutions. Our platforms power state-of-the-art machine learning, enabling cutting-edge applications in the generative AI domain. As a fast-moving and innovative team, we thrive on leveraging open-source solutions and industry best practices to deliver robust, high-performance AI systems for our clients.

About the Role

We are seeking a Machine Learning Engineer Intern to focus on adapting and optimizing open-source foundation models for our GPU inference platform. You will work closely with experienced engineers and AI researchers, gaining hands-on exposure to large-scale model deployment techniques. This is an opportunity to build valuable skills in model optimization, GPU acceleration, and systems-level engineering while contributing to the next generation of AI-powered products.

Key Responsibilities

  • Model Adaptation & Integration: Adapt open-source foundation models (e.g., LLMs, vision transformers, multimodal models) to run efficiently on our custom GPU inference infrastructure
  • Performance Optimization: Identify bottlenecks in model inference pipelines, implement GPU kernels, and optimize code to reduce latency and improve throughput
  • Platform Tooling & Automation: Develop scripts and tooling for automating model conversion, quantization, and configuration processes to streamline deployment workflows
  • Testing & Validation: Implement benchmarking tests and validation suites to ensure model accuracy, reliability, and performance meet internal standards
  • Collaboration with Cross-Functional Teams: Work closely with machine learning researchers, MLOps engineers, and infrastructure teams to refine performance strategies and ensure smooth integration of foundation models into production environments
  • Documentation & Knowledge Sharing: Document adaptation procedures, best practices, and lessons learned. Contribute to internal knowledge bases and present findings in team meetings

Qualifications

  • Educational Background: Currently pursuing a graduate degree in Computer Science, Electrical Engineering, or a related technical field
  • Programming Skills: Proficiency in Python; familiarity with Go and CUDA is a plus
  • Foundational Knowledge in Machine Learning: Understanding of attention-based models, PyTorch, and GPU-accelerated computing
  • Problem-Solving Mindset: Strong analytical skills, with the ability to troubleshoot performance issues and propose innovative optimization strategies
  • Team Player: Excellent communication skills, eagerness to learn, and the ability to collaborate effectively with diverse teams

What You’ll Gain

  • Real-world exposure to large-scale, production-grade AI deployments
  • Hands-on experience with state-of-the-art models and GPU acceleration techniques
  • Mentorship from experienced engineers and researchers
  • Opportunities to impact performance-critical aspects of cutting-edge AI products

If you’re passionate about AI systems engineering and excited to work at the intersection of machine learning and high-performance computing, we encourage you to apply!

Location

  • Mountain View, CA

Job type

  • Internship

Role

Engineering

Keywords

  • LLMs
  • On-site
  • Internship