Sale!

EE675 Assignment 1 Solved

Original price was: $35.00.Current price is: $30.00.

Download Details:

  • Name: A1-r6ycu6.zip
  • Type: zip
  • Size: 642.80 KB

Category: Tag:

Description

5/5 - (1 vote)

Part-A

(Chernoff Bound) [10 Marks] Suppose 𝑋1, 𝑋2, β‹― , 𝑋𝑛 are i.i.d. copies of a (0, 𝜎2
) r.v. Then
for 𝑋 =
1
𝑛 βˆ‘
𝑛

𝑖=1 𝑋𝑖 we know that
β„™[𝑋 β‰₯ πœ€] ≀ exp(
βˆ’π‘›πœ€2
2𝜎
2 )
.

Write a Python code to run Monte Carlo simulations that verify the inequality. Specifically,
for a given πœ€ and 𝜎, generate 𝑛 samples from the zero mean Gaussian distribution π‘₯1, π‘₯2, … , π‘₯𝑛
and check whether the sample average is more than πœ€. Repeat this experiment 500 times and
1

observe in how many experiments out of those 500 experiments, the sample average is more
than πœ€. This will gives us an empirical estimate of 𝑃[𝑋 β‰₯ πœ€].
Take 𝜎 = 0.1, πœ€ = 0.05, and plot the empirical estimate as a function of 𝑛 ∈ {100, 200, … , 1000}.
In the same plot, include the Chernoff upper bound as a function of 𝑛.

Part-B

(Multi arm bandits | Explore-then-Commit and UCB) [20 Marks] Consider a two-armed
Bernoulli bandit scenario with true means given by πœ‡1 =
1
2
, πœ‡2 =
1
2

+ Ξ”, for some Ξ” <
1
2
. Let the

time horizon be 𝑇 = 10000.
1. Take Ξ” = 1
4
and run the Monte Carlo simulations to estimate the expected regret of the
ETC algorithm which explores each arm π‘š = 𝑇
2/3(log 𝑇 )

1/3 times before committing.
Specifically, you run the ETC algorithm to compute the sample regret
πœ‡2 β‹… 𝑇 βˆ’
𝑇
βˆ‘
𝑑=1
𝑅𝑑
,
where 𝑅𝑑
is the reward obtained in time step 𝑑.

Repeat this experiment 500 times and estimate the expected regret by taking the average
of the sample regrets you obtained in all those 500 experiments. [5 Marks]
2. Repeat the above for various values of Ξ” ∈ {0.05, 0.1, 0.2, 0.3, 0.4, 0.45} and plot the estimated regret as a function of Ξ” and verify whether it satisfies the regret upper bound

we derived in class. [5 Marks]
3. Repeat the experiment with the UCB algorithm and plot the comparison with ETC. [10
Marks]

4. (Bonus) In the ETC algorithm, assume that we know Ξ”, and choose a better π‘š as
function of Ξ” and repeat the experiments and compare with UCB. What did you observe?
Hint: Check how many samples of exploration are required to make πœ€ < Ξ”
2 with a high

probability of 1 βˆ’ 1
𝑇
. [5 Marks]
2