PhD student · University of Toronto

Evgenii Opryshko

I work on Reinforcement Learning and Large Language Models, with a focus on reward hacking and better alignment (RLHF).

Computer Science graduate student in the Toronto Intelligent Systems Lab.

Publications

My publications span reinforcement learning, robotics, reward hacking, and LLM reasoning.

Test-Time Graph Search for Goal-Conditioned Reinforcement Learning

Evgenii Opryshko, Junwei Quan, Claas Voelcker, Yilun Du, Igor Gilitschenski · ICML 2026

Modification-Considering Value Learning for Reward Hacking Mitigation in RL

Evgenii Opryshko, Umangi Jain, Igor Gilitschenski · RLC 2026

Update-Free On-Policy Steering via Verifiers

Maria Attarian, Ian Vyse, Claas Voelcker, Jasper Gerigk, Evgenii Opryshko, Anas Almasri, Sumeet Singh, Yilun Du, Igor Gilitschenski · arXiv 2026

Robust Reasoning Benchmark

Pavel Golikov, Evgenii Opryshko, Gennady Pekhimenko, Mark C. Jeffrey · arXiv 2026

Experience

My path spans AI safety research and production engineering: from reward hacking mitigation in RLHF to shipping mobile products used by tens of millions of people.

CHAI, UC Berkeley

Research Intern

  • Worked with Michael Cohen in Stuart Russell's group on preventing reward hacking during RLHF post-training of large language models.
  • Designed a co-training scheme for a reward model and policy LLM to encourage pessimistic reward estimates on model-generated outputs.
  • Built the full research pipeline from proposal to implementation, including reward model and LLM training; the project continues as part of my PhD research.

Yandex

Senior Software Engineer

  • Led full-cycle development of major features in Yandex Browser and Yandex Search App, products with 100M+ installations and tens of millions of daily active users.
  • Drove architecture and implementation for personal data storage, main UI redesigns, tab manager redesign, recommendations, weather updates, analytics monitoring, and messenger integration.
  • Collaborated across Search App, Taxi, and Market on backend-driven UI infrastructure, including page caching and multi-screen navigation.
  • Improved testing infrastructure and release reliability, conducted technical interviews, and served as an admission committee member, mentor, and lecturer for the Yandex Mobile Development School.

Ronas IT

Mobile Developer

  • Architected and shipped Android, iOS, and Flutter applications for international clients, often owning the full lifecycle from requirements to release.
  • Worked directly with customers to translate business needs into technical user stories and product decisions.
  • Created an Android application skeleton with architecture, common libraries, screen generation scripts, navigation patterns, and network caching utilities.

WeCanDevelopIt

Software Developer

  • Started in PHP, then expanded into Node.js and Android development across backend, frontend, and mobile projects for UK and US clients.
  • Refactored legacy codebases, adopted Git workflows, and implemented architecture patterns and dependency injection in mobile applications.