argmax(blog)

Hi, I’m Max Shen. Here, I share my wanderings and wonderings. For more about me, see www.maxwshen.com

A quantitative measure of mode-seeking vs. mode-covering preferences

Another view

September 19, 2025 · 838 words · Max Shen

Stein discrepancy is mode-seeking

Stein discrepancy is more mode-seeking than reverse KL

September 18, 2025 · 778 words · Max Shen

Sampler evaluation: No mode-covering without importance sampling

An impossibility result

September 17, 2025 · 936 words · Max Shen

Defining mode-covering behavior with gradient analysis

What does mode-seeking vs. mode-covering behavior in distributional divergences really mean?

September 16, 2025 · 573 words · Max Shen

Outranking leaderboard models on test sets, with only their predictions

A brainteaser

January 21, 2025 · 1415 words · Max Shen

On diversity and many-model ensembling: AI government & AI-augmented public goods funding

When AI agents outnumber voters, how should we elect an AI government?

January 14, 2025 · 3452 words · Max Shen

The deep retrieval + remixing hypothesis

How do modern deep generative models produce high-quality outputs?

March 28, 2024 · 1914 words · Max Shen

What is data? The classical and postmodern views

Is data sacred?

March 27, 2024 · 628 words · Max Shen

On Creating AGI within Capitalism -- The Greatest Threat to AGI is ANI.

A speculative argument

September 30, 2022 · 1567 words · Max Shen

Implicit regularization of SGD in a linear model

A brief mathematical result

September 28, 2022 · 595 words · Max Shen

Treating high-dimensional prediction/generation as optimization

Exploring connections between diffusion models, AlphaFold’s recycling, structured prediction energy networks, and manifold learning.

August 29, 2022 · 2154 words · Max Shen

Primer on Score-based Generative Models

An introduction

August 14, 2022 · 1692 words · Max Shen

Implicit Differentiation through Equilibria

An introduction

July 20, 2022 · 1823 words · Max Shen, Jan-Christian Huetter

On KL Divergence in Discrete Spaces

KL behaves differently in discrete spaces than in continuous spaces.

July 12, 2022 · 2237 words · Max Shen, Nathaniel Diamant