Project for the Markov Chain Monte Carlo Course

 

Goal

 

The goal of the project is to give you some (more) experience in reading a research paper, and at the same time to have you learn about a topic that we didn't manage to cover in the class, at a much deeper level than we could go into in the class. To this end, you will choose a paper, study it, and know it well; then we will meet, you will present the paper you've read and we will discuss it. Part of the goal is to give you more practice in reading papers, and in particular to give you a sense of what it feels like to know a paper or a subject in depth, even better than you'd need to know it for an exam or homework. It often surprises students how much work it takes to really know a paper and what it actually means to “know” a paper.

 

Details

 

Step 1: choose a paper, and email me the title and a link to an online version (or a reference to the dead-tree version). I’ll look at it, and give you an “okay”.

Which paper can you choose? You can generally choose any research paper in the MCMC area. It is conditioned on my approval, but I won’t be too picky, as long as it hasn’t been covered in class, and is not too easy/hard. I enclose below a list of “suggested” papers, as well as links to help you get started if you want to find a paper by yourself. Make sure to email me for approval even if you choose one of those, since I want to make sure no two students study the same paper. Email me before you start reading the paper too deeply, so that you won’t do wasted work in the (rare) case that I don’t okay it. When you get an "okay" from me, it is time to move on to step 2.

·         If you’re interested in a particular paper but it looks too big and scary, try to understand the structure and if it can be cut up in a reasonable way. If so, you can definitely suggest that your project will only be on a part of it (in fact, after you send me your paper suggestion, I will sometimes suggest to you to only study a part of it). If I think your suggestion is too small, we can discuss it. The idea here is that you should know something about the other parts of the paper as well, but maybe not know them completely. After all, the goal of the project is for you to learn something and have experience in reading and presenting a paper, not to kill you with too much work.

In any case, unless we agree otherwise, I expect you to know the entire paper.

 

 

Step 2:

Once you got an "okay” from me, read the paper in depth, and prepare to present it and to answer questions about it. You should know the paper almost as well as the authors. Meaning you should know all the details, but also know what those details really mean, how they fit in the general scheme of things, which ones are important and which are unimportant, etcetera. You should know why the result is proved in the way it is, and not in any one of the obviously-simpler ways that the authors could have tried (usually but not always it is because the simpler approach did not work, but the authors don't often tell you about this, so you need to figure this out for yourself). You should think what variants or strengthenings of the results could be true using the same techniques, and have some idea whether these variants or strengthenings really are true and are they provable using the same techniques. You should know which ideas are new in this paper, and which are taken from other papers. You should have a reasonable idea of the previous work and what the background of the work is (no need to know all the details, just background to some extent).

Of course, I don't expect you to know all of these things perfectly: often the authors don't know these very well either. Some of these questions have complicated, or even impossible-to-know answers (some simple approaches don't work because of some weird reason that might or might not be fixable, and no one really knows). I do expect you to make a very good effort to know all these things. For example, for most papers you can come up with multiple "proof sketches" that would prove the results in the paper in a simpler way, and you should come up with some of them, and know why they fail (or if they succeed).

 

Step 3:

When you are ready to schedule a time to meet to discuss the paper, email me and suggest a few possible times. You can discuss scheduling at any earlier point: as well, if you have a guess for when you will be ready.

 

Step 4:

Come in the time we set, and present the paper. This should take around an hour. It could be shorter, and might be longer (we will try to keep it to under an hour). You should prepare a detailed presentation of the paper (on blackboard, no powerpoint or such, roughly 45 minutes, covering all of the main ideas in whatever amount of detail fits in 45 minutes), but we will likely veer off at some point and I will ask you all kinds of difficult questions about the paper. We might continue more or less along the lines of the presentation, or we might go off-course completely, so there's no guarantee you really will get to give the whole presentation. The presentation is just so we have a "default" course of action.

Note that you are not graded on presentation skills, so the grade is not about the quality of the presentation (but do try to prepare a good presentation, even just for the sake of understanding the paper better). The grade is about the degree to which you know the paper (as best as I can judge), and only about that. I might ask you all kinds of hard questions about the paper that you might need time to think about; you can take time-outs at any point in the meeting (within reason) to think about these questions: sometimes it's hard to answer difficult questions on the spot.

 

Remarks

 

·         Ideally, doing the project might also give you ideas for research projects of your own. So when you read the paper, keep thinking about what further questions can be asked, what can be done with the techniques shown, etcetera. Most of this kind of thinking is useful anyway for understanding the paper better.

·         I don't expect you to make an inhumane effort: generally, the worse-written the paper is, and the more difficult the subject matter is, I will cut you more slack on what I expect.

·         Finally, remember that a lot of additional material about many research papers, e.g. slides, surveys, lecture notes, and so on, can be found online. Such sources might greatly help in understanding the result. So google a lot!

·         About working in pairs: There are some options here for working as a pair of students. In general I don't encourage it, but if two students really want to work on a research project together, or read two papers that are somewhat related, just ask me. I'll expect each student to present separately.

 

Source List

 

To find a paper to present, you can look in the list of general sources in the course homepage (I put in an expanded list)

Also, you can look at the homepages of various people in the field:

 

Homepages of people in the field

 

Mark Jerrum:  http://www.maths.qmul.ac.uk/~mj/

Eric Vigoda:  http://www.cc.gatech.edu/~vigoda/

Alistair Sinclair:  http://www.cs.berkeley.edu/~sinclair/

David Aldous:  http://www.stat.berkeley.edu/~aldous/

Ravi Kannan:  http://research.microsoft.com/en-us/um/people/kannan/

Dana Randall:  http://people.math.gatech.edu/~randall/

Dror Weitz:  http://dimacs.rutgers.edu/~dror/

Santosh Vempala:  http://www-math.mit.edu/~vempala/papers/rw.html

Mark Huber:  http://www.cmc.edu/pages/faculty/MHuber/index.html

Alan Frieze:  http://www.math.cmu.edu/~af1p/

Tom Hayes:  http://www.cs.unm.edu/~hayes/

Laszlo Lovasz (direct link to paper list on this topic)  http://www.cs.elte.hu/~lovasz/randwalk-papers.html

The Perfect Sampling Homepage: http://dimacs.rutgers.edu/~dbwilson/exact/

Sinclair’s list of projects in his course: http://www.cs.berkeley.edu/~sinclair/cs294/projlist.html

 

Recommended Papers

 

Here is a list of "recommended" papers, partitioned by topic. The partitioning is kind of arbitrary. In fact, the choice of papers is kind of arbitrary: feel free to choose anything else. Finally, the link I give is not necessarily to the best version. When you choose a paper, do at least a brief literature survey, see what versions there are, etc., and choose a good version. In the email you write to me, send me also a link to the version you chose. Journal versions are usually preferable to conference versions.

 

Sampling k-Colorings

 

1.       Coupling with the Stationary Distribution and Improved Sampling for Colorings and Independent Sets. 
T. Hayes, and E. Vigoda
http://www.cc.gatech.edu/~vigoda/Stationarity.pdf

 

2.       Randomly coloring constant degree graphs. 
M. Dyer, A. Frieze, T. Hayes, and E. Vigoda
http://www.cc.gatech.edu/~vigoda/ConstDegree.pdf

 

3.       Improved bounds for sampling colorings. 
E. Vigoda 
http://www.cc.gatech.edu/~vigoda/coloring.ps

 

4.       Randomly coloring planar graphs with fewer colors than the maximum degree. 
T. Hayes, J. Vera and E. Vigoda. 
http://arxiv.org/abs/0706.1530

 

5.       A Non-Markovian Coupling for Randomly Sampling Colorings. 
T. Hayes and E. Vigoda

http://www.cc.gatech.edu/~vigoda/NonMarkovian.pdf

(somewhat harder than most of the others; beware)

 

 

Volume Estimation and Random Walks in Polytopes

 

6.       Hit-and-run from a corner

Lovasz and Vempala

http://www-math.mit.edu/~vempala/papers/start.pdf

 

7.       Fast Algorithms for Logconcave Functions: Sampling, Rounding, Integration and Optimization

Lovasz and Vempala

http://www-math.mit.edu/~vempala/papers/integration.pdf

 

 

Approximating the Permanent and Sampling Perfect Matchings

 

8.       A polynomial-time approximation algorithm for the permanent of a matrix with non-negative entries. 
M. Jerrum, A. Sinclair, and E. Vigoda 
http://www.cc.gatech.edu/~vigoda/Permanent.pdf

(also discussed in some of the lecture notes)

 

9.       Exact sampling from perfect matchings in dense regular graphs
M. Huber

http://www.cmc.edu/pages/faculty/MHuber/Research/papers/huber2006a.pdf

 

 

Sampling Contingency Tables

 

10.   Rapidly mixing Markov chains for sampling contingency tables with a constant number of rows
Mary Cryan, Martin Dyer, Leslie Ann Goldberg, Mark Jerrum and Russell Martin,
http://homepages.inf.ed.ac.uk/mrj/papers/CDGJM06.pdf

 

11.   Sampling Binary Contingency Tables with a Greedy Start. 
I. Bezakova, N. Bhatnagar, and E. Vigoda 
http://www.cc.gatech.edu/~vigoda/BinaryContTables.pdf

 

 

MCMC on trees

 

12.   Fast mixing for independent sets, colorings and other models on trees

Fabio Martinelli, Alistair Sinclair and Dror Weitz

http://dimacs.rutgers.edu/~dror/pubs/fast_tree.pdf

(it’s a little long and it’s splittable so if you choose this paper, you can choose just part of it that looks interesting to you and tell me which part you chose, I’ll check if it’s ok and tell you)

 

13.   The Ising model on trees: Boundary conditions and mixing time
Fabio Martinelli, Alistair Sinclair and Dror Weitz

http://sunsite.berkeley.edu/TechRepPages/CSD-03-1256

 

14.   Counting independent sets up to the tree threshold

Dror Weitz

http://dimacs.rutgers.edu/~dror/pubs/ind_from_tree.pdf

 

15.   Phase Transition for Glauber Dynamics for Independent Sets on Regular Trees 
R. Restrepo, D. Stefankovic, J. C. Vera, E. Vigoda, and L. Yang 
http://arxiv.org/abs/1007.2255

 

16.   Phase Transition for the Mixing Time of the Glauber Dynamics for Coloring Regular Trees 
P. Tetali, J. C. Vera, E. Vigoda and L. Yang 
http://arxiv.org/abs/0908.2665

 

17.   Reconstruction for Colorings on Trees 
N. Bhatnagar, J. Vera, E. Vigoda, and D. Weitz 
http://arxiv.org/abs/0711.3664

 

 

Sampling Other Combinatorial Objects

 

18.   Random walks on truncated cubes and sampling 0-1 knapsack solutions

Ben Morris and Alistair Sinclair

http://www.cs.berkeley.edu/~sinclair/knapsack.ps

 

19.   A Deterministic Polynomial-time Approximation Scheme for Counting Knapsack Solutions 
D. Stefankovic, S. Vempala, and E. Vigoda 
http://arxiv.org/abs/1008.1687

 

20.   Fast perfect sampling from linear extensions
M. L. Huber
http://www.cmc.edu/pages/faculty/MHuber/Research/papers/huber2006b.pdf

 

 

Lower Bounds on Mixing time

 

21.   Coupling vs. conductance for the Jerrum-Sinclair chain

Kumar and Ramesh

http://ndssl.vbi.vt.edu/people/vskumar/publications/J10.pdf

 

22.   Torpid Mixing of Some Monte Carlo Markov Chain Algorithms in Statistical Physics 
C. Borgs, J. T. Chayes, J-H. Kim, A. Frieze, P. Tetali, E. Vigoda, and V. H. Vu 
http://www.cc.gatech.edu/~vigoda/torpid.pdf

 

23.   A general lower bound for mixing of single site dynamics on graphs

Tom Hayes and Alistair Sinclair

http://arxiv.org/abs/math.PR/0507517

 

 

 

Miscelanious

 

24.   Adaptive Simulated Annealing: A Near-optimal Connection between Sampling and Counting 
D. Stefankovic, S. Vempala, and E. Vigoda 
http://arxiv.org/abs/cs.DS/0612058

 

25.   Mixing in time and space for lattice spin systems: A combinatorial view. 
M. Dyer, A. Sinclair, E. Vigoda, and D. Weitz 
http://www.cc.gatech.edu/~vigoda/connections.ps

 

26.   Combinatorial criteria for Uniqueness of Gibbs Measures
Dror Weitz

 http://dimacs.rutgers.edu/~dror/pubs/unique_cond.pdf

 

27.   Rapid Mixing of Several Markov Chains for a Hard-Core Model

R. Kannan, M. W. Mahoney, and R. Montenegro

http://research.microsoft.com/en-us/um/people/kannan/Papers/Hardcore.pdf