1) See hw1 for installation instructions. You do NOT have to redo the setup.


##############################################
##############################################


2) Code:

-------------------------------------------

Files to look at, even though there are no explicit 'TODO' markings:
- scripts/run_hw2_policy_gradient.py

-------------------------------------------

Relevant code from the first homework has already been filled in for you in the following files:
- infrastructure/rl_trainer.py
- infrastructure/utils.py
- policies/MLP_policy.py

-------------------------------------------

Blanks to be filled in for this assignment are marked with 'TODO'.

The following files contain them (a rough sketch of one such computation appears after this list):
- agents/pg_agent.py
- policies/MLP_policy.py
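
For orientation only, here is a minimal sketch of the kind of reward-to-go computation involved; the function name, signature, and use of a discount factor are assumptions for illustration, not the assignment's actual API:

import numpy as np

def reward_to_go(rewards, gamma=1.0):
    # Entry t is the discounted sum of rewards from timestep t to the
    # end of the trajectory: sum over t' >= t of gamma^(t'-t) * r_{t'}.
    rtg = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg

# Example: reward_to_go([1.0, 1.0, 1.0], gamma=1.0) -> array([3., 2., 1.])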


##############################################
##############################################


3) Run the code with the following commands:

$ python cs285/scripts/run_hw2_policy_gradient.py --env_name CartPole-v1 --exp_name test_pg_cartpole
$ python cs285/scripts/run_hw2_policy_gradient.py --env_name InvertedPendulum-v2 --exp_name test_pg_pendulum

Relevant flags when running the commands above (see the assignment PDF for more info); an example combining them is shown below:
-n   number of policy training iterations
-rtg use reward-to-go when computing the returns
-dsa do not standardize the advantage values
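
For example (illustrative only; the experiment name is arbitrary and the flag combination simply exercises the options listed above):

$ python cs285/scripts/run_hw2_policy_gradient.py --env_name CartPole-v1 -n 100 -rtg --exp_name test_pg_cartpole_rtg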

##############################################


4) Visualize a saved tensorboard event file:

$ cd cs285/data/<your_log_dir>
$ tensorboard --logdir .

Then navigate to the shown URL to see scalar summaries as plots (in the 'scalars' tab), as well as videos (in the 'images' tab).
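
If you prefer to pull the logged scalars into Python for plotting, a minimal sketch using tensorboard's EventAccumulator follows (the scalar tag name is a placeholder; list ea.Tags() to see what your run actually logged):

from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

ea = EventAccumulator('cs285/data/<your_log_dir>')  # same directory as above
ea.Reload()                                         # parse the event file
print(ea.Tags()['scalars'])                         # available scalar tags
events = ea.Scalars('Eval_AverageReturn')           # placeholder tag name
steps = [e.step for e in events]
values = [e.value for e in events]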