Arrow Research search

Author name cluster

Markus Meister

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
2 author rows

Possible papers (3)

ICML 2021 · Conference Paper

Learning by Turning: Neural Architecture Aware Optimisation

  • Yang Liu
  • Jeremy Bernstein
  • Markus Meister
  • Yisong Yue

Descent methods for deep networks are notoriously capricious: they require careful tuning of step size, momentum and weight decay, and which method will work best on a new benchmark is a priori unclear. To address this problem, this paper conducts a combined study of neural architecture and optimisation, leading to a new optimiser called Nero: the neuronal rotator. Nero trains reliably without momentum or weight decay, works in situations where Adam and SGD fail, and requires little to no learning rate tuning. Also, Nero’s memory footprint is square root that of Adam or LAMB. Nero combines two ideas: (1) projected gradient descent over the space of balanced networks; (2) neuron-specific updates, where the step size sets the angle through which each neuron’s hyperplane turns. The paper concludes by discussing how this geometric connection between architecture and optimisation may impact theories of generalisation in deep learning.
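The abstract above describes Nero as combining projected gradient descent over the space of balanced networks with per-neuron updates whose step size acts like a rotation angle. The sketch below is a minimal illustration of that idea, not the authors' implementation: the constraint set (zero-mean, unit-norm rows), the row-wise gradient normalisation, and all names and values are assumptions made for this example.

```python
# Hedged sketch of the Nero idea from the abstract: per-neuron updates on a
# "balanced" constraint set (assumed here to be zero-mean, unit-norm rows),
# where eta controls roughly how far each neuron's weight vector turns.
import numpy as np

def project_balanced(W, eps=1e-8):
    """Project each row (one neuron's incoming weights) to zero mean, unit norm."""
    W = W - W.mean(axis=1, keepdims=True)
    return W / (np.linalg.norm(W, axis=1, keepdims=True) + eps)

def nero_like_step(W, grad, eta=0.01, eps=1e-8):
    """One projected, neuron-wise gradient step.

    Normalising the gradient row-wise makes eta behave like an angular step:
    each neuron's weight vector moves by a comparable amount before being
    projected back onto the constraint set.
    """
    g = grad / (np.linalg.norm(grad, axis=1, keepdims=True) + eps)
    return project_balanced(W - eta * g)

# Toy usage: one linear layer with a quadratic loss.
rng = np.random.default_rng(0)
W = project_balanced(rng.standard_normal((4, 8)))   # 4 neurons, 8 inputs
x, y = rng.standard_normal(8), rng.standard_normal(4)
for _ in range(100):
    grad = np.outer(W @ x - y, x)                   # d/dW of 0.5*||Wx - y||^2
    W = nero_like_step(W, grad, eta=0.05)
```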

NeurIPS 2020 · Conference Paper

Learning compositional functions via multiplicative weight updates

  • Jeremy Bernstein
  • Jiawei Zhao
  • Markus Meister
  • Ming-Yu Liu
  • Anima Anandkumar
  • Yisong Yue

Compositionality is a basic structural feature of both biological and artificial neural networks. Learning compositional functions via gradient descent incurs well-known problems like vanishing and exploding gradients, making careful learning rate tuning essential for real-world applications. This paper proves that multiplicative weight updates satisfy a descent lemma tailored to compositional functions. Based on this lemma, we derive Madam, a multiplicative version of the Adam optimiser, and show that it can train state-of-the-art neural network architectures without learning rate tuning. We further show that Madam is easily adapted to train natively compressed neural networks by representing their weights in a logarithmic number system. We conclude by drawing connections between multiplicative weight updates and recent findings about synapses in biology.
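As a rough illustration of the multiplicative weight updates discussed above, the sketch below scales each weight by an exponential factor rather than shifting it additively, so every step changes a weight by a bounded relative amount. The Adam-style second-moment normalisation, the clipping, and the parameter names are assumptions for illustration, not the exact Madam rule from the paper.

```python
# Hedged sketch of a multiplicative weight update in the spirit of the
# abstract; not the authors' Madam optimiser.
import numpy as np

def multiplicative_step(w, grad, v, eta=0.01, beta2=0.999, eps=1e-8):
    """One multiplicative update on weights w given gradient grad.

    v is a running estimate of the squared gradient (Adam-style). The update
    multiplies each weight by exp(-eta * normalised gradient * sign(w)),
    which preserves the sign of w and keeps relative changes near eta.
    """
    v = beta2 * v + (1 - beta2) * grad**2
    g_hat = grad / (np.sqrt(v) + eps)
    w = w * np.exp(-eta * np.clip(g_hat, -1, 1) * np.sign(w))
    return w, v

# Toy usage: fit a single positive scale factor toward 0.5.
w, v = np.array([2.0]), np.zeros(1)
for _ in range(200):
    grad = 2 * (w - 0.5)            # d/dw of (w - 0.5)^2
    w, v = multiplicative_step(w, grad, v, eta=0.05)
```

Because the update is additive in log|w|, this style of rule pairs naturally with the logarithmic number system representation mentioned in the abstract.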

NeurIPS 1997 · Conference Paper

Refractoriness and Neural Precision

  • Michael Berry
  • Markus Meister

The relationship between a neuron's refractory period and the precision of its response to identical stimuli was investigated. We constructed a model of a spiking neuron that combines probabilistic firing with a refractory period. For realistic refractoriness, the model closely reproduced both the average firing rate and the response precision of a retinal ganglion cell. The model is based on a "free" firing rate, which exists in the absence of refractoriness. This function may be a better description of a spiking neuron's response than the peri-stimulus time histogram.
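A minimal sketch of the kind of model the abstract describes: spikes are drawn probabilistically from a "free" firing rate, scaled by a recovery function that suppresses firing after each spike. The specific recovery function (an absolute refractory period followed by exponential recovery) and all parameter values are illustrative assumptions, not the fitted retinal ganglion cell model from the paper.

```python
# Hedged sketch of a probabilistic spiking neuron with refractoriness,
# driven by a "free" firing rate as described in the abstract.
import numpy as np

def simulate_spikes(free_rate, dt=0.001, t_abs=0.002, t_rel=0.005, seed=0):
    """Simulate a spike train from a free firing rate (Hz, one value per bin).

    After each spike, firing probability is zero for t_abs seconds (absolute
    refractory period) and then recovers exponentially with time constant
    t_rel (relative refractoriness).
    """
    rng = np.random.default_rng(seed)
    spikes = np.zeros(len(free_rate), dtype=bool)
    last_spike = -np.inf
    for i, r in enumerate(free_rate):
        since = i * dt - last_spike
        if since < t_abs:
            recovery = 0.0
        else:
            recovery = 1.0 - np.exp(-(since - t_abs) / t_rel)
        if rng.random() < r * dt * recovery:
            spikes[i] = True
            last_spike = i * dt
    return spikes

# Toy usage: repeated trials of the same 1-second stimulus-driven rate,
# as one would do to compare response precision across identical stimuli.
t = np.arange(0, 1.0, 0.001)
free_rate = 50 + 40 * np.sin(2 * np.pi * 4 * t)   # Hz
trials = [simulate_spikes(free_rate, seed=s) for s in range(20)]
```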