Python Engineer - Reinforcement Learning Environments for LLMs (Contractor)
Camplight
Software Engineering, Data Science
United States · Remote
Posted on Jan 21, 2026
Are you a Python engineer with a background in AI/ML who wants to work on the systems that train large language models?
Project Overview:
You’ll be part of a team creating reinforcement learning environments used directly in advanced LLM training pipelines. Your work will influence how models learn complex behaviors and how those behaviors are evaluated and improved over time.
What You’ll Be Doing:
- Design and implement RL environments for LLM training
- Create task prompts that specify desired model behavior
- Build judges and automated evaluation logic
- Design and integrate tool interfaces for model interaction
- Work with data to create diverse, high-quality tasks
- Run, debug, and improve environments inside virtualized execution setups
- Collaborate with AI/ML engineers and researchers on reward signals and evaluation criteria
- Improve robustness, reproducibility, and diversity of the environment suite
Requirements:
- 5+ years of professional Python programming experience
- Background in AI / Machine Learning (industry, research, or advanced projects)
- Solid understanding of reinforcement learning concepts
- Experience with RL environments or frameworks (e.g. OpenAI Gym or similar)
- Experience building systems involving automated evaluation, validation, or judging logic
- Experience working with large language models
- Experience designing custom RL environments for training
- Experience with virtual machines, sandboxed execution, or containerized environments
- Background in AI research, ML infrastructure, or training pipelines
What do we offer?
We focus on health, wealth, and empowering relationships:
- Fully remote work with flexible work hours
- Competitive salary
- Opportunity to become a co-owner of the cooperative
- Individual career development plan
- Friendly team and company culture
- Prioritization of mental and physical health in the workplace
- Empowering relationships among engineers in a collaborative, growth-oriented environment
What does the interview process look like?
- Initial Interview: A 45-minute cultural and technical interview focusing on DevOps experience, leadership approach, and operational mindset.
- Technical Deep Dive: You can choose between two options:
  - Homework Assignment: A short practical assignment (~2 hours) followed by a 1-hour discussion.
  - Live Technical Session: A 2-hour session focused on CI/CD, infrastructure, and incident-handling scenarios.
Regardless of the outcome, we will provide you with constructive feedback to help you grow.