I aim to make the impact of AI on our society visible, demonstrate novel yet realistic risks, and build interventions that target those risks. I have a particular interest in how AI-powered tools and infrastructure shape how people consume and interpret information.
I am primarily a computer scientist, evaluating and intervening on AI systems to observe and control their behaviour. However, I draw on social science methodologies (such as experimental design for valid causal inference), as well as social-scientific insights (such as theories of intermediation in media consumption), when investigating the possible impacts of AI systems. I appreciate the generality and power of mathematical formalism, but strongly believe in avoiding abstraction traps (see here and, more recently, here).
I have a long-term goal of providing technical expertise to improve the design of regulation.
Current Projects
LLMs are increasingly used both as 'neutral' infrastructure (such as summarisation tools for journalists) and by individuals weighing political decisions. Political bias in LLMs is therefore highly consequential, and there has been a flurry of research in this area from both frontier labs and academics. One important and unstudied area is the bias of LLMs when processing documents, which they do routinely when summarising web search results or pulling from databases. My thesis explores this area, investigating what causes bias to be exacerbated or moderated when LLMs are used to summarise documents. With Paul Röttger (Oxford Internet Institute).
Existing research has studied whether people can distinguish AI-generated from human-created art, finding the two close to indistinguishable for certain models (see here and here). However, no one has yet directly asked people to rate images by how similar they are to an artist's style. We devise a two-alternative forced-choice (2AFC) experiment that directly measures the stylistic fidelity of models prompted to imitate an artist's style. This provides a novel evaluation of models' capability in this domain, disentangled from how good they are at producing technically correct images (e.g. anatomically correct hands). With Raphaël Millière (Oxford Institute for Ethics in AI) and collaborators.
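Measuring stylistic fidelity from 2AFC responses reduces to estimating a choice proportion. A minimal sketch of that idea follows; the function name, trial data, and the Wald interval are illustrative assumptions, not the study's actual analysis pipeline:

```python
# Hypothetical sketch of 2AFC analysis (not the study's actual code).
# In each trial a participant sees two images, order randomised, and picks
# the one that looks more like the target artist's style; 1 = AI imitation
# chosen, 0 = the comparison image chosen.
from math import sqrt

def fidelity_rate(choices):
    """Return the proportion of trials where the AI image was chosen,
    with an approximate 95% Wald confidence interval."""
    n = len(choices)
    p = sum(choices) / n
    se = sqrt(p * (1 - p) / n)
    return p, (p - 1.96 * se, p + 1.96 * se)

# Under perfect imitation, p should hover around 0.5: participants
# cannot tell which image is "more in the artist's style".
p, (lo, hi) = fidelity_rate([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
```

In practice one would use an exact binomial interval and model participant-level variation, but the core quantity is this choice proportion.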
LLM-controlled agents currently fail at complex negotiation problems. This is especially risky in contexts where they make promises that their principals (such as individuals or corporations) are held to. Can we use neurosymbolic methods (where a sub-problem is solved by an old-school symbolic algorithm) to provide security and robustness guarantees for such systems? With Ilia Shumailov and Christian de Witt (Oxford Witt Lab).
Publications
The Digital Services Act makes platform transparency reporting mandatory and standardised, but the metrics it requires still fall short of what is needed for real accountability. Counts of removals and appeals alone cannot tell us whether content moderation systems are accurate, proportionate, or effective, making the absence of evaluation metrics such as precision and recall increasingly difficult to justify under the DSA’s risk-based logic. While these metrics are unlikely to surface in baseline transparency reports under Articles 15 and 24, the post argues they may yet emerge through heightened scrutiny of the largest online platforms and search engines (VLOPSEs), as regulatory expectations take shape through enforcement, systemic risk reporting, audits, and related obligations.
This paper presents computational models of human decision-making and their applications to IoT systems. We explore how understanding human behavioural patterns can inform the design of more effective and user-centred Internet of Everything architectures. Co-authored with Setareh Maghsudi.