About Gradient
Gradient builds AI systems for systematic investing. Our first product is a long/short equity fund powered end-to-end by our models. We move fast, measure reality, and iterate relentlessly.
We’re looking for someone who can turn ambiguous goals into high-signal experiments, build the post-training/RL stack around them, and extract insights that actually change decisions.
About the Role
You’ll work directly with the founders to improve our post-training infrastructure and RL training environment, and to run the experiments that push our models toward real alpha.
This is a high-agency role at the intersection of:
- Post-training (SFT/RL, reward shaping, data filtering)
- Scalable experimentation (ablations, evals, reproducibility)
- The messy reality of financial data and backtests
What You Will Do
- Improve infrastructure for post-training
  - Training recipes, data pipelines, experiment tracking
  - Tooling that makes running the “next best experiment” cheap and reliable
- Build out our environment for RL training
  - Reward signal design, verifiers/constraints, rollout + sampling strategies
  - Evaluation harnesses that capture what “better” means (beyond a single metric)
- Design experiments and run ablations
  - Propose hypotheses, isolate variables, prioritize by expected insight per compute €
  - Build the “ablation ladder” that quickly explains performance shifts
- Fine-tune models to find alpha
  - SFT + RL-style training, dataset construction, filtering, curriculum ideas
  - Iterate on training stability, generalization, and robustness
- Analyze results and distill insights
  - Write short, high-clarity memos: what we tried → what happened → what it means → what we do next
What We Are Looking For
You don’t need a finance background, but you should be curious about markets and excited to learn fast.
Must-Haves
- Outstanding analytical ability — you can reason from messy evidence to crisp next actions
- Deep understanding of modern AI training (post-training + RL concepts)
- Strong intuition for experiment design and compute efficiency
  - You don’t run obviously flawed experiments
  - You know what to measure, what to freeze, and what to change
- Strong Python skills and comfort with the modern ML stack
Traits We Care About a Lot
- High agency: you don’t wait for specs; you define the work and ship
- Taste: you know what “good” looks like in evals, tooling, and results
- Scientific discipline: you can say “we don’t know yet” and design the test that makes it knowable
Nice to Have
- Experience with RLHF/GRPO-style methods, reward modeling, preference optimization
- Distributed training / large-scale rollouts / inference throughput optimization
- Experience building eval harnesses that guide iteration
What We Offer
- Freedom: we don’t care where or when you work — as long as you deliver
- No corporate bullshit: no bureaucracy or time-wasting busywork
- Leverage: work directly with the founders; your work changes the roadmap immediately
- A hard, interesting problem: RL + post-training applied to real financial prediction and portfolio outcomes
- Competitive compensation + meaningful upside (details depend on seniority and fit)
About Gradient Technologies
Gradient Technologies GmbH is a Hamburg-based AI and software company focused on financial analysis. We build AI systems that process large volumes of financial data to generate equity return forecasts and construct market-neutral investment strategies.
Our philosophy: Efficiency over complexity. Do the obvious things well. Follow the gradient.
How to Apply
Send your CV to contact@gradtec.ai. That’s it — no cover letter needed.