Description
• You should write the solutions on your own. Dishonest behaviour and cheating in the assignment will be penalized with extreme measures. See: https://www.cse.iitk.ac.in/pages/AntiCheatingPolicy. html
1. Your LaTeX-ed main answer script compiled as a .pdf file with the name [Your roll number].pdf. Using LaTeX is compulsory.
2. Your LaTeX code for the main answer script as .tex file with the name [Your roll number].tex.
3. The program file for part (d) of Question 3 with the name [Your roll number] Q3 (with extension .py, .c etc according to your choice of the programming language)
4. A readme document explaining how the program file can be executed.
1. (15+15 marks) Endsem allocation
(a) You are really busy, and just randomly allocated each student to set 1 or set 2. Show that theexpected value of disrupted friendship connections is .
(b) Getting expected value is not enough, you need to find a proper allocation. But you cannot goover all the 2n allocations as n ≈ 150. Using the construction for pairwise independence given in class, show that you can find an allocation with at least half of the friendship connections disrupted in poly(n)-time.
2. (5+10+10+15 marks) Estimating the number of tickets
You are given a bag full of N tickets numbered 1,…,N(N is unknown to you). You can take out tickets one at a time, note their label, and put them back in the bag. Your task is to estimate N. We will do this in the same way as we estimated π in lecture:
(a) Assume you drew out k tickets. What will be the expected value of the mean of these tickets ? Calculate N in terms of this mean, call this N˜.
(b) Chernoff bound can be extended to work on the case when the Random Variables take valuesother than {0,1}. This is known as Hoeffding’s inequality. Use it to find a lower bound on the probability that the error in N, using the above calculation, will be less than δN(δ < 1/2). (in terms of N,δ,k)
(c) Assume k,N are odd. In calculation of part (a), instead of using the value of mean, we use the median of the labels of tickets drawn. Prove a lower bound of 1 on the probability that the error in N using the median will be less than δN(δ < 1/2). (in terms of N,δ,k)
1
(d) Start with a random hidden value of N in range 104 − 106. Write a function that gives k values from [N] when queried with equal probability. Use these values to calculate N˜ as in part (a) and (c), and plot them with respect to increasing k ≤ 1000. Repeat this estimation for a total of 3 different N, and put the plots in the main answer file. Submit the code you used to generate these plots, along with a readme on how to execute the code, zipped together with the main answer file into a single .zip file.
3. (15+15 marks) Markov Chain
Consider a homogeneous regular Markov chain with state space S of size |S|, and transition matrix M. Suppose that M is symmetric and entry-wise positive.
(a) Show that all the eigenvalues of M are bounded by 1 and that the uniform distribution is the unique stationary probability distribution for M.
(b) Starting from the stationary distribution, express the probability of returning to the same stateas the state at t = 0 after n ∈ N steps in terms of the eigenvalues of M. Compute the limit of the above probability as n → ∞.
You might find the second part to be easier than the first. Feel free to assume the first part and finish the second part (even when you can’t prove the first part).
4. (Optional) DNF Counting
Given a DNF formula F of n variables, how can we estimate the number of satisfying assignments without considering all cases? A natural approach is to randomly sample m assignments uniformly from the set of all possible assignments. Let X be the number of satisfying assignments among the m samples.
(a) Derive an unbiased estimate for the total number of satisfying assignments in terms of X,m,n.
(b) Construct one DNF formula each for number of variables n for n = 6,8,10,12 having n clauses with each clause composed of subsets of n − 4 variables.
(c) Simulate the above algorithm on the constructed DNF formulas with m = 4n. Compare the estimate from the algorithm against the actual number satisfying assignments for each of the constructed formula. Plot the estimate and the number of satisfying assignments on the y-axis vs n on the x-axis..
(d) Explain qualitatively why the above scheme requires a large number of samples to produce accurateestimates.
2




Reviews
There are no reviews yet.