Commit da60e51

Updates dl/reinforcement/reinforcement.md
Auto commit by GitBook Editor
1 parent 53d5eba commit da60e51

1 file changed: +1 -5 lines changed

dl/reinforcement/reinforcement.md

@@ -17,7 +17,7 @@ Log-Likelihood: compute the probability of each action, $$\log\pi_\theta(a|s) = \log[P_\
 
 **Diagonal Gaussian policies are typically used in continuous action spaces**
 
-Sampling: a random action is generated as $$a = \mu_\theta(s) +\delta_\theta(s)\odot z$$ $$z\sim N(0,I)$$
+Sampling: a random action is generated as $$a = \mu_\theta(s) +\delta_\theta(s)\odot z$$$$z\sim N(0,I)$$
 
 Log-Likelihood: $$\log\pi_\theta(a|s) = -\frac{1}{2}\left(\sum_{i=1}^{k}\left(\frac{(a_i-\mu_i)^2}{\delta_i^2} + 2\log\delta_i\right) + k\log 2\pi\right)$$
 
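The sampling rule and log-likelihood touched by this hunk can be checked numerically. Below is a minimal sketch in plain Python, assuming the policy network has already produced a mean vector and a per-dimension standard deviation for one state. The names `sample_action`, `log_likelihood`, `mu`, and `sigma` are hypothetical and do not come from the file. The log-density is written in the standard diagonal Gaussian form, which carries a $$2\log\delta_i$$ term for each action dimension.

```python
import math
import random


def sample_action(mu, sigma, rng=random):
    """Draw a = mu + sigma * z elementwise, with z ~ N(0, I)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def log_likelihood(action, mu, sigma):
    """Log-density of `action` under a diagonal Gaussian N(mu, diag(sigma^2))."""
    k = len(action)
    # Per-dimension squared z-score plus the 2*log(sigma_i) normalization term.
    total = sum(
        ((a - m) / s) ** 2 + 2.0 * math.log(s)
        for a, m, s in zip(action, mu, sigma)
    )
    return -0.5 * (total + k * math.log(2.0 * math.pi))


# Hypothetical policy outputs for one state with a 2-dimensional action.
mu = [0.5, -1.0]
sigma = [1.0, 0.2]

action = sample_action(mu, sigma, random.Random(0))  # seeded for repeatability
logp = log_likelihood(action, mu, sigma)
```

Because the covariance is diagonal, the log-likelihood is a sum of independent one-dimensional Gaussian log-densities, so a single pass over the action dimensions is enough.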

@@ -122,7 +122,3 @@ $$s_{t+1} \sim P(\cdot|s_t, a_t)$$
 | :--- |
 
 
-
-
-[^1]: Enter footnote here.
-