Based on your interest in "Splitting Adam," you are likely referring to research on the Adam optimizer, which is widely used in machine learning. There isn't a single paper with that exact title, but several papers analyze the algorithm by splitting it into its components or by studying its behavior in detail:

1. Dissecting Adam: The Sign, Magnitude and Variance of Stochastic Gradients
It separates Adam's update into two parts: the update direction, given by the sign of the stochastic gradient, and the per-coordinate step size, which is scaled by an estimate of the gradient's relative variance.
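
To make the sign/variance split concrete, here is a minimal NumPy sketch. It assumes the standard Adam moment estimates with bias correction and leaves out the small epsilon term; the function names `adam_moments` and `split_update` are hypothetical and chosen only for illustration. It also treats `v_hat - m_hat**2` as a rough variance estimate, which is a simplification when the two decay rates differ.

```python
import numpy as np

def adam_moments(grads, beta1=0.9, beta2=0.999):
    """Return bias-corrected first and second moment estimates (illustrative helper)."""
    m = np.zeros_like(grads[0])
    v = np.zeros_like(grads[0])
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g      # running mean of gradients
        v = beta2 * v + (1 - beta2) * g**2   # running mean of squared gradients
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    return m_hat, v_hat

def split_update(m_hat, v_hat):
    """Split m_hat / sqrt(v_hat) into a sign and a variance-based scale."""
    sign = np.sign(m_hat)                        # direction part
    rel_var = (v_hat - m_hat**2) / m_hat**2      # estimated relative variance (assumes m_hat != 0)
    scale = 1.0 / np.sqrt(1.0 + rel_var)         # step-size part
    return sign, scale

# Synthetic noisy gradients, used only to exercise the identity.
rng = np.random.default_rng(0)
grads = [rng.normal(loc=1.0, scale=2.0, size=4) for _ in range(50)]

m_hat, v_hat = adam_moments(grads)
sign, scale = split_update(m_hat, v_hat)
adam_direction = m_hat / np.sqrt(v_hat)  # Adam's update direction, epsilon omitted

# The product of the two parts reproduces the usual Adam direction.
assert np.allclose(sign * scale, adam_direction)
```

The identity holds because `1 + rel_var` equals `v_hat / m_hat**2`, so the scale is `|m_hat| / sqrt(v_hat)`. Coordinates with a high estimated relative variance get a smaller step, while the sign alone decides the direction.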

If you are coming from a statistics or rare-event simulation background, "ADAM" most likely refers to the ADAptive Multilevel splitting algorithm, which is used to estimate rare-event probabilities.