Joe Clinton

Video Models for VLA • Scalable Robot Learning • Offline RL

PhD Candidate, RoMA Lab, UCL

I research foundational AI for robotics, focusing on video model backbones for Vision-Language-Action models. I believe video generation architectures encode dynamics in ways VLMs cannot, enabling better generalisation and scaling toward embodied AGI. Supervised by Chris Xiaoxuan Lu and Dimitrios Kanoulas. Funded by an EPSRC Landscape Award.

Publication

Planning Transformer: Long-Horizon Offline Reinforcement Learning with Planning Tokens

Joseph Clinton, Robert Lieck

arXiv 2024

Supervised learning approaches to offline reinforcement learning, particularly those utilizing the Decision Transformer, have shown effectiveness in continuous environments and for sparse rewards. However, they often struggle with long-horizon tasks due to the high compounding error of auto-regressive models. To overcome this limitation, we go beyond next-token prediction and introduce Planning Tokens, which contain high-level, long time-scale information about the agent's future. Predicting dual time-scale tokens at regular intervals enables our model to use these long-horizon Planning Tokens as a form of implicit planning to guide its low-level policy and reduce compounding error. This architectural modification significantly enhances performance on long-horizon tasks, establishing a new state-of-the-art in complex D4RL environments. Additionally, we demonstrate that Planning Tokens improve the interpretability of the model's policy through interpretable plan visualisations and attention maps.

Research Projects

Planning Transformer

Novel enhancement to the Transformer architecture for offline RL that significantly improves long-horizon decision making. Master's thesis achieving state-of-the-art on D4RL benchmarks.

PyTorch • Offline RL • Transformers

hand-teleop

Turn your webcam into real-time robot joint positions. Designed for LeRobot, with a WiLoR GPU backend, Kalman smoothing, and plug-and-play integration. 48 GitHub stars.

Python • Computer Vision • Robotics
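
To illustrate the Kalman smoothing step mentioned above, here is a minimal sketch of a scalar Kalman filter applied to one joint angle. It assumes a random-walk motion model, and the class name `ScalarKalman` and the noise variances are hypothetical; the state model and tuning actually used in hand-teleop are not stated here.

```python
class ScalarKalman:
    """Scalar Kalman filter with a random-walk state model (illustrative only)."""

    def __init__(self, process_var: float = 1e-3, measurement_var: float = 1e-2):
        self.q = process_var      # assumed variance of joint motion between frames
        self.r = measurement_var  # assumed variance of the pose-estimate noise
        self.x = None             # current state estimate (joint angle)
        self.p = 1.0              # current estimate variance

    def update(self, z: float) -> float:
        """Fold one noisy measurement z into the estimate and return it."""
        if self.x is None:
            # Initialise the state from the first measurement.
            self.x = z
            return self.x
        self.p += self.q                 # predict: uncertainty grows over time
        k = self.p / (self.p + self.r)   # Kalman gain, strictly between 0 and 1
        self.x += k * (z - self.x)       # correct towards the measurement
        self.p *= 1.0 - k                # uncertainty shrinks after the update
        return self.x


if __name__ == "__main__":
    kf = ScalarKalman()
    noisy_angles = [0.50, 0.52, 0.47, 0.90, 0.51]  # one outlier frame
    print([round(kf.update(a), 3) for a in noisy_angles])
```

A larger `measurement_var` relative to `process_var` gives smoother but laggier output, which is the usual trade-off when filtering teleoperation signals.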

PromptMonkey

Multi-agent, many-shot code generation that placed 4th out of 800+ participants in the NeurIPS 2024 Meta Hacker Cup AI Track. Extended MapCoder with careful prompt engineering, Codestral-22B, Maj@128 voting, and vLLM parallel inference (2,000 tokens/s).

LLM Agents • vLLM • Prompt Engineering
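
As a rough illustration of the Maj@128 voting idea, the sketch below picks the candidate program whose output is most common among all sampled candidates. The function `majority_vote` and the `run` callback are hypothetical names introduced here; how PromptMonkey actually executes and compares candidates is not described above.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def majority_vote(
    candidates: Sequence[str],
    run: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return a candidate whose output matches the most common output.

    `run` executes one candidate on the test input and returns its output,
    or None if the candidate crashed or timed out (assumed convention).
    """
    outputs = [run(candidate) for candidate in candidates]
    # Count only successful runs; failures cannot win the vote.
    counts = Counter(output for output in outputs if output is not None)
    if not counts:
        return None
    winning_output, _ = counts.most_common(1)[0]
    # Return the first candidate that produced the winning output.
    return candidates[outputs.index(winning_output)]


if __name__ == "__main__":
    # Stand-in for real execution: map each candidate to a canned output.
    fake_outputs = {"prog_a": "41", "prog_b": "42", "prog_c": "42", "prog_d": None}
    print(majority_vote(list(fake_outputs), fake_outputs.get))
```

With 128 samples, the same function applies unchanged; only the length of `candidates` grows.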

Hint Distill

Self-supervised hint distillation for improving LLMs on code generation. Fine-tunes Qwen3-4B with KL-divergence distillation, in which the model with hint access serves as its own teacher.

LLM Post-Training • Distillation • PyTorch
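
The KL-divergence objective can be sketched in plain Python for a single token position. This assumes the teacher distribution comes from the model prompted with the hint and the student distribution from the same model without it, and it uses the forward direction KL(teacher || student); both choices, and the names `softmax`, `kl_divergence`, and `distillation_loss`, are assumptions for illustration rather than the project's confirmed setup.

```python
import math
from typing import Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> list[float]:
    """Convert logits to probabilities, subtracting the max for stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) over a shared vocabulary; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    temperature: float = 1.0,
) -> float:
    """Per-token loss pulling the student towards the (fixed) teacher."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return kl_divergence(teacher, student)


if __name__ == "__main__":
    with_hint = [2.0, 0.0, -1.0]    # hypothetical logits with the hint in context
    without_hint = [0.0, 0.0, 0.0]  # hypothetical logits without the hint
    print(distillation_loss(with_hint, without_hint))
```

In a real training loop the same quantity would be computed over full vocabularies and averaged across token positions, with gradients flowing only through the student.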

vpct-text

Prime verifiers environment for VPCT scenes. Placed 3rd in the Iterate London RL Environment Hackathon (Dec 2025). Scores model outputs that predict final bucket positions.

RL Environments • Python • Hackathon Winner

Open Source

LeRobot Contributions

Contributor to Hugging Face's open-source VLA framework for robot learning. Contributed a BlockPush environment for benchmarking manipulation policies, assisted in porting HIL-SERL for online RL fine-tuning, and improved training speed through data loading optimizations.

Python • Imitation Learning • VLA

robot-arm-viewer

Browser-based URDF viewer with IK-driven click-and-drag controls. Compare low-cost robot arms, export DAE models, and visualize reachable workspaces.

Three.js • URDF • IK

Latent Diffusion Slim

State-of-the-art latent diffusion model trained on FFHQ for photo-realistic face generation up to 128x128px. Optimized for single GPU training. 100% coursework grade.

Diffusion Models • PyTorch • Generative AI

Dirty Dish Detection

Hackathon project detecting when housemates leave dirty dishes using a hybrid of YOLO and traditional computer vision state tracking algorithms.

YOLO • Computer Vision • PyTorch

SO100 Camera Mount

Open source snap-on camera mount for SO100 robot arm and U20CAM camera. Optimized 30-degree angle based on testing. Parametric Fusion 360 design for easy customization.

CAD • Robotics • Hardware

Scratch Addons

Core contributor to browser extension with 593,000+ users. Developed 8 addons including a profiling tool for performance analysis of Scratch projects.

JavaScript • Browser Extension • Profiling

Side Projects

Qualicoder

AI-first qualitative coding platform for transcript analysis. Helping consultants efficiently analyze interview and research data.

AI Startup • React • NLP

Cluque

Daily cryptic puzzle game with social features. Play the global challenge or compete with friends in groups. Built with React and Firebase subscriptions.

React • Firebase • Consumer App

IBrecap.com

Non-profit IB revision platform I founded and maintain. 1.6 million page views, #4 Google result for "IB revision". Full-stack PHP/SQL with custom CMS.

PHP • SQL • Full-Stack

3D Game Engine

First complete 3D graphics and physics engine for Scratch. Built over 2 years with innovative binary space partitioning for efficiency. Demonstrates deep graphics understanding.

Scratch • 3D Graphics • Physics