AI Scientist - Palo Alto (Internship, PhD) at Mistral AI

About Mistral

At Mistral AI, we are a tight-knit, nimble team dedicated to bringing our cutting-edge AI technology to the world. Our mission is to make AI ubiquitous and open.
We are creative, low-ego, team-spirited, and have been passionate about AI for years.
We hire people who thrive in competitive environments because they find them more fun to work in.
We hire passionate women and men from all over the world.
Our teams are distributed across France, the UK, and the USA.

Role Summary

You will work with the fine-tuning team to develop state-of-the-art generative models.
You will run autonomous workstreams under the supervision of experienced scientists.
Location: Paris, France.
Internship Duration: 3 to 6 months.
We are open to CIFRE programs as a continuation after the internship.

Key Responsibilities

Explore state-of-the-art LLM algorithms for fine-tuning under the supervision of top-level scientists.
Assist in the design and implementation of machine learning models and algorithms.
Conduct research on the latest advancements in natural language processing and LLMs.
Contribute to the development and optimization of our LLM systems.
Collaborate with cross-functional teams to integrate LLM technologies into various applications.
Perform data analysis and visualization to support research and development efforts.
Document research findings and contribute to technical reports and publications.
Participate in team meetings and brainstorming sessions to share ideas and insights.

Qualifications

Currently pursuing a PhD at a tier-1 engineering school or university (priority given to candidates near completion).
Strong scientific understanding of generative AI.
Broad knowledge of AI with specific interest in fine-tuning and using language models for applications.
Proficiency in Python, with experience in libraries such as TensorFlow or PyTorch.
Familiarity with natural language processing techniques and machine learning algorithms.
Ability to design complex software and make it usable in production.
Competence in navigating the full MLOps technical stack, focusing on architecture development and model evaluation.
Previous experience with LLMs or related technologies.
Knowledge of deep learning frameworks and techniques.
Experience with version control systems (e.g., Git) and a Linux shell environment.