Maria-Florina Balcan
(Carnegie Mellon University), Data-driven algorithm design
Data-driven algorithm design for combinatorial problems is an
important aspect of modern data science. Rather than using off-the-shelf
algorithms that only have worst-case performance guarantees,
practitioners typically optimize over large families of parameterized
algorithms and tune the parameters of these algorithms using a
training set of problem instances from their domain to determine a
configuration with high expected performance over future instances.
However, most of this work comes with no performance guarantees. The
challenge is that for many combinatorial problems, including
partitioning and subset selection problems, a small tweak to the
parameters can cause a cascade of changes in the algorithm's behavior,
so the algorithm's performance is a discontinuous function of its
parameters.
In this talk, I will present new work that helps put data-driven
combinatorial algorithm selection on firm foundations. This includes
strong computational and statistical performance guarantees, both for
the batch and online scenarios where a collection of typical problem
instances from the given application are presented either all at once
or in an online fashion, respectively. I will describe both specific
examples (for clustering, partitioning, and subset selection problems)
and general principles that emerge in this context (including general
techniques for sample complexity guarantees in the batch setting and
no-regret guarantees in the online setting).
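The batch setting described above can be illustrated with a toy sketch: a family of greedy knapsack heuristics parameterized by an exponent rho (a standard example of a parameterized algorithm family; the function names, instance distribution, and grid search below are illustrative, not the methods of the talk). Note how nearby values of rho can reorder items and change the output discontinuously, which is exactly the difficulty the abstract mentions.

```python
import random

def greedy_knapsack(values, weights, capacity, rho):
    """Greedy heuristic: pack items in decreasing order of value / weight**rho.
    rho = 0 ranks by value alone; rho = 1 ranks by value density."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i] ** rho,
                   reverse=True)
    total_value, total_weight = 0, 0
    for i in order:
        if total_weight + weights[i] <= capacity:
            total_value += values[i]
            total_weight += weights[i]
    return total_value

def tune(instances, grid):
    """Batch tuning: pick the parameter with the best average
    performance over the training instances (empirical risk minimization)."""
    def avg(rho):
        return sum(greedy_knapsack(*inst, rho) for inst in instances) / len(instances)
    return max(grid, key=avg)

# A synthetic "application domain": random knapsack instances.
random.seed(0)
instances = []
for _ in range(20):
    n = 30
    weights = [random.randint(1, 20) for _ in range(n)]
    values = [random.randint(1, 50) for _ in range(n)]
    instances.append((values, weights, 60))

grid = [i / 10 for i in range(21)]   # candidate rho values in [0, 2]
best_rho = tune(instances, grid)
```

The tuned configuration is only guaranteed to be good on average over the training set; the sample complexity results in the talk address when this generalizes to future instances.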
Nikhil Bansal
(CWI and Eindhoven University of Technology), Discrepancy, Rounding and Approximation
Discrepancy theory deals with the following question: Given a set
system on some universe of elements, color the elements red and blue
so that each set in the system is colored as evenly as possible. I
will give an overview of discrepancy and describe some of its
applications. I will then focus on some results and techniques in
discrepancy and, in particular, show how they can be used to design new
general rounding techniques, leading to improved approximation
guarantees for various algorithmic problems.
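The discrepancy of a set system, as defined above, can be computed by brute force for tiny instances (this enumeration is exponential in the universe size and is meant only to make the definition concrete):

```python
from itertools import product

def discrepancy(n, sets):
    """min over colorings chi: {0,...,n-1} -> {-1,+1} (red/blue)
    of max over S in the system of |sum_{i in S} chi(i)|."""
    best = float("inf")
    for chi in product((-1, 1), repeat=n):
        best = min(best, max(abs(sum(chi[i] for i in S)) for S in sets))
    return best

# Three overlapping sets on 4 elements: the alternating coloring
# +, -, +, - balances every set exactly, so the discrepancy is 0.
sets = [{0, 1}, {1, 2}, {2, 3}]
print(discrepancy(4, sets))  # 0
```

Any set of odd size forces discrepancy at least 1, so perfect balance is not always achievable; the interesting theory concerns how small the imbalance can be made for large structured systems.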
Maria Chudnovsky
(Princeton University), Induced subgraphs and tree decompositions
Tree decompositions are a powerful tool in structural graph theory
that is traditionally used in the context of forbidden graph minors.
Connecting tree decompositions and forbidden induced subgraphs has so
far remained out of reach. Recently we obtained several results in
this direction; the talk will be a survey of these results.
Nicole Immorlica
(Microsoft Research), Incentivizing Exploration with Selective Data Disclosure
We study the design of rating systems that incentivize efficient
social learning. Agents arrive sequentially and choose actions, each
of which yields a reward drawn from an unknown distribution. A policy
maps the rewards of previously chosen actions to messages for arriving
agents. The regret of a policy is the difference, over all rounds,
between the expected reward of the best action and the reward induced
by the policy. Prior work proposes policies that recommend a single
action to each agent, obtaining optimal regret under standard
rationality assumptions. We instead assume a frequentist behavioral
model and, accordingly, restrict attention to disclosure policies that
use messages consisting of the actions and rewards from a subsequence
of past agents, chosen ex ante. We design a policy with optimal regret
in the worst case over reward distributions. Our research suggests
three components of effective policies: independent focus groups,
group aggregators, and interlaced information structures.
Joint work with Jieming Mao, Alex Slivkins and Steven Wu.
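The regret benchmark used above can be made concrete with a toy two-armed simulation. The explore-then-exploit rule below is a standard bandit baseline chosen purely to illustrate the definition of regret; it is not one of the disclosure policies from the talk, and all names and parameters are illustrative.

```python
import random

def run_policy(means, T, policy):
    """Play T rounds; each round the policy maps the history of
    (action, reward) pairs to an arm, and a Bernoulli reward is drawn."""
    history, total = [], 0.0
    for _ in range(T):
        arm = policy(history)
        r = 1.0 if random.random() < means[arm] else 0.0
        history.append((arm, r))
        total += r
    return total

def explore_then_exploit(history, k=2, n0=30):
    """Baseline: try each of k arms n0 times, then play the
    empirically best arm."""
    if len(history) < k * n0:
        return len(history) % k
    est = [sum(r for a, r in history if a == arm) /
           max(1, sum(1 for a, r in history if a == arm))
           for arm in range(k)]
    return max(range(k), key=lambda a: est[a])

random.seed(1)
means = [0.4, 0.6]
T = 2000
reward = run_policy(means, T, explore_then_exploit)
# Regret: expected reward of the best fixed action over all rounds,
# minus the reward the policy actually induced.
regret = max(means) * T - reward
```

A low-regret policy is one whose regret grows sublinearly in T; the talk's contribution is achieving this with messages restricted to raw action-reward data from ex-ante-chosen subsequences of past agents.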
Eduardo Sany Laber
(Pontifical Catholic University of Rio de Janeiro), On the Price of Explainability for some Clustering Problems
Machine learning models and algorithms are used in a number of
systems that make decisions affecting our lives. Explainable
methods are therefore desirable, so that people can gain a better
understanding of their behaviour. However, we may be forced to sacrifice
quality and/or efficiency in order to achieve explainability. In this
talk we investigate, from a theoretical perspective, the price of
explainability for some clustering problems.
Alexander Razborov
(The University of Chicago), Theons and Quasi-Randomness
There are two known approaches to the theory of limits of discrete
combinatorial objects: geometric (graph limits) and algebraic (flag
algebras). In the first part of the talk we present a general
framework intending to combine useful features of both theories and
compare it with previous attempts of this kind. Our main objects are
\(T\)-ons, for a universal relational first-order theory \(T\); they
generalize all previously considered partial cases, some of them (like
permutons) in a rather nontrivial way.
In the second part we apply this framework to offer a new perspective
on quasi-randomness for combinatorial objects more complicated than
ordinary graphs. Our quasi-randomness properties are natural in the
sense that they do not use ad hoc densities and they are preserved
under the operation of defining combinatorial structures of one kind
from structures of a different kind. One key concept in this theory is
that of unique coupleability, roughly meaning that any alignment of two
objects on the same ground set should “look random”.
Based on two joint papers with Leonardo Coregliano: Russian
Mathematical Surveys (2020, 75(4))
and arXiv:2012.11773.
Luca Trevisan
(Bocconi University), Graph and Hypergraph Sparsification
A graph \(H\) is a sparsifier of a graph \(G\) if \(H\) has far fewer
edges than \(G\) and, in an appropriate technical sense, \(H\)
“approximates” \(G\). Sparsifiers are useful as compressed
representations of graphs and to speed up certain graph algorithms. In
a “cut sparsifier,” the notion of approximation is that
every cut is crossed by approximately the same number of edges in
\(G\) as in \(H\). In a “spectral sparsifier” a stronger,
linear-algebraic, notion of approximation holds. Similar definitions
can be given for hypergraphs.
We discuss recent progress on constructions and lower bounds for
graph and hypergraph sparsification, and we point out some challenging
open problems.
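The cut-sparsifier condition can be checked directly on small graphs by enumerating all cuts (a brute-force illustration of the definition, feasible only for tiny vertex sets):

```python
from itertools import combinations

def cut_weight(edges, S):
    """Total weight of edges crossing the cut (S, complement of S)."""
    return sum(w for u, v, w in edges if (u in S) != (v in S))

def is_cut_sparsifier(n, G, H, eps):
    """Check that every nontrivial cut has H-weight within a (1 +/- eps)
    factor of its G-weight. Each cut is visited twice (S and its
    complement), which is harmless for a correctness check."""
    for k in range(1, n):
        for S in combinations(range(n), k):
            S = set(S)
            g, h = cut_weight(G, S), cut_weight(H, S)
            if not ((1 - eps) * g <= h <= (1 + eps) * g):
                return False
    return True

# G = unit-weight 4-cycle; trivially, G is an exact sparsifier of itself.
G = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
print(is_cut_sparsifier(4, G, G, 0.0))  # True
```

A spectral sparsifier strengthens this: the quadratic forms of the two graph Laplacians must be close on all real vectors, not just on the 0/1 indicator vectors of cuts.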
Bianca Zadrozny
(IBM Research Brazil), Evaluating classifier learning methods under covariate shift and spatial correlation
Classifier learning methods commonly assume that the training data
consist of randomly drawn examples from the same distribution as the
test examples about which the learned model is expected to make
predictions. In the real world, however, the joint distribution of the
model's inputs and outputs often differs between training and test
data, a problem known as sample selection bias or dataset shift. In
this talk, I will review existing methods for dealing with this
problem, in particular the special case known as covariate shift,
where only the input distribution changes and the conditional
distribution of the output for a given input is assumed to remain
fixed. I will then introduce the problem of covariate shift in
geospatial data and illustrate the challenges it poses by assessing
existing methods for evaluating the accuracy of classifier learning
methods under covariate shift and spatial correlation.
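A standard correction for covariate shift is importance weighting: each training point is weighted by the density ratio p_test(x) / p_train(x), which leaves the conditional P(y | x) untouched. The sketch below assumes, purely for illustration, that both input densities are known 1-D Gaussians; in practice the ratio must be estimated, and all function names here are illustrative.

```python
import math
import random

def gaussian_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def estimate_test_mean(data, mu_train, mu_test):
    """Estimate E_test[y] from training pairs (x, y) using
    importance weights w(x) = p_test(x) / p_train(x)."""
    num = den = 0.0
    for x, y in data:
        w = gaussian_pdf(x, mu_test) / gaussian_pdf(x, mu_train)
        num += w * y
        den += w
    return num / den

random.seed(0)
mu_train, mu_test = -1.0, 1.0
# Covariate shift: only the input distribution moves; the labeling
# rule y = 1[x > 0] (i.e., P(y | x)) is the same in both domains.
data = [(x, 1.0 if x > 0 else 0.0)
        for x in (random.gauss(mu_train, 1.0) for _ in range(5000))]

est_unweighted = sum(y for _, y in data) / len(data)
est_weighted = estimate_test_mean(data, mu_train, mu_test)
```

The unweighted estimate reflects the training distribution (where positive labels are rare), while the weighted one shifts mass toward the test region; the spatially correlated geospatial setting of the talk is harder still, because nearby examples are not independent.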
