berkeley-deep-RL-pytorch-starter/hw1 at master · mdeib/berkeley-deep-RL-pytorch-starter

README.txt

1) install package by running:

$ python setup.py develop

##############################################
##############################################

2) install mujoco:
$ cd ~
$ mkdir .mujoco
$ cd <location_of_your_mjkey.txt>
$ cp mjkey.txt ~/.mujoco/
$ cd <this_repo>/downloads
$ cp -r mjpro150 ~/.mujoco/

add the following to the bottom of your bashrc:
export LD_LIBRARY_PATH=~/.mujoco/mjpro150/bin/

NOTE IF YOU'RE USING A MAC:
The provided mjpro150 folder is for Linux. 
Please download the OSX version yourself, from https://www.roboti.us/index.html
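
A quick way to confirm step 2 worked is to check the three things it sets up: the key file, the mjpro150 bin folder, and the LD_LIBRARY_PATH entry. The sketch below is not part of the repo; the function name check_mujoco_install is hypothetical, and it assumes the Linux layout described above.

```python
import os
from pathlib import Path


def check_mujoco_install(home, ld_library_path):
    """Return a list of problems found; an empty list means the layout looks right.

    home            -- the home directory that should contain .mujoco
    ld_library_path -- the value of the LD_LIBRARY_PATH environment variable
    """
    root = Path(str(home)) / ".mujoco"
    bin_dir = root / "mjpro150" / "bin"
    problems = []

    # The license key copied in step 2.
    if not (root / "mjkey.txt").is_file():
        problems.append("missing key file: {}".format(root / "mjkey.txt"))

    # The binaries copied from <this_repo>/downloads.
    if not bin_dir.is_dir():
        problems.append("missing folder: {}".format(bin_dir))

    # The export line from the bashrc (bash expands ~ when it assigns the variable).
    wanted = os.path.normpath(str(bin_dir))
    entries = [os.path.normpath(p) for p in ld_library_path.split(os.pathsep) if p]
    if wanted not in entries:
        problems.append("LD_LIBRARY_PATH does not contain {}".format(wanted))

    return problems


if __name__ == "__main__":
    found = check_mujoco_install(Path.home(), os.environ.get("LD_LIBRARY_PATH", ""))
    for problem in found:
        print("problem:", problem)
    if not found:
        print("mujoco layout looks fine")
```

Run it in a new terminal, so that the bashrc change has been picked up.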

##############################################
##############################################

3) install other dependencies:

-------------------

a) [PREFERRED] Option A:

i) install anaconda, if you don't already have it:
Download Anaconda2 (suggested v5.2 for linux): https://www.continuum.io/downloads
$ cd Downloads
$ bash Anaconda2-5.2.0-Linux-x86_64.sh #file name might be slightly different, but follows this format

Note that this install will modify the PATH variable in your bashrc.
You need to open a new terminal for that path change to take place (to be able to find 'conda' in the next step).

ii) create a conda env that will contain python 3:
$ conda create -n cs285_env python=3.5

iii) activate the environment (do this every time you open a new terminal and want to run code):
$ source activate cs285_env

iv) install the requirements into this conda env
$ pip install --requirement requirements.txt

v) install the appropriate version of pytorch (1.5.0+cu101 was used here, but the right build depends on your device), plus some version of tensorflow to run tensorboard

vi) allow your code to be able to see 'cs285'
$ cd <path_to_hw>
$ pip install -e .

Note: This conda environment requires activating it every time you open a new terminal (in order to run code), but the benefit is that the required dependencies for this codebase will not affect existing/other versions of things on your computer. This stand-alone environment will have everything that is necessary.
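
To see whether the activated environment can actually find what the steps above install, something like the following may help. It is only a sketch: the helper name missing_modules is hypothetical, and the list of names covers just the packages this README mentions ('torch' is the import name of pytorch); requirements.txt may pull in more.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Names taken from this README: pytorch, tensorboard, and the cs285 package.
    wanted = ["torch", "tensorboard", "cs285"]
    missing = missing_modules(wanted)
    if missing:
        print("not importable in this environment:", ", ".join(missing))
    else:
        print("all of", wanted, "are importable")
```

If 'cs285' is reported as missing, step vi) has probably not been run inside this environment.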

-------------------

b) Option B:

i) install dependencies locally, by running:
$ pip install -r requirements.txt

ii) install the appropriate version of pytorch (1.5.0+cu101 was used here, but the right build depends on your device), plus some version of tensorflow to run tensorboard

iii) set path to cs285 folder in run_hw1_behavior_cloning.py 
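
One possible way to do step iii) is to put the folder that contains 'cs285' on sys.path before anything from cs285 is imported. This is a sketch under that assumption, not the code the script ships with; the helper name add_repo_to_path is hypothetical.

```python
import os
import sys


def add_repo_to_path(hw_root):
    """Put hw_root (the folder that contains 'cs285') at the front of sys.path, once."""
    hw_root = os.path.abspath(hw_root)
    if hw_root not in sys.path:
        sys.path.insert(0, hw_root)
    return hw_root


# Inside cs285/scripts/run_hw1_behavior_cloning.py the hw folder is two levels up,
# so the call could look like this (left commented out here on purpose):
# add_repo_to_path(os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", ".."))
```

With Option A this is not needed, because 'pip install -e .' already makes cs285 importable.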

##############################################
##############################################

4) code:

Blanks to be filled in are marked with "TODO"
The following files have blanks in them:
- scripts/run_hw1_behavior_cloning.py
- infrastructure/rl_trainer.py
- agents/bc_agent.py
- policies/MLP_policy.py
- infrastructure/replay_buffer.py
- infrastructure/utils.py

NOTE - tf_utils.py was deleted in the pytorch version

See the code + the hw pdf for more details.
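
To keep track of which blanks are still open, the "TODO" markers can be listed with a few lines of Python. A minimal sketch; the function name find_todos is hypothetical, and it simply searches every .py file under a folder for the string "TODO".

```python
from pathlib import Path


def find_todos(root):
    """Return (file, line number, stripped line) for every line containing 'TODO'."""
    hits = []
    for path in sorted(Path(str(root)).rglob("*.py")):
        # errors="replace" keeps the scan going if a file has odd bytes in it.
        lines = path.read_text(errors="replace").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if "TODO" in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling find_todos("cs285") from the hw1 folder and printing the result gives one entry per remaining blank.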

##############################################
##############################################

5) run code: 

Run the following command(s) for Section 1 (Behavior Cloning):
(The commands are identical apart from the environment; there is one per env)

$ python cs285/scripts/run_hw1_behavior_cloning.py --expert_policy_file cs285/policies/experts/Ant.pkl --env_name Ant-v2 --exp_name test_bc_ant --n_iter 1 --expert_data cs285/expert_data/expert_data_Ant-v2.pkl
$ python cs285/scripts/run_hw1_behavior_cloning.py --expert_policy_file cs285/policies/experts/HalfCheetah.pkl --env_name HalfCheetah-v2 --exp_name test_bc_halfcheetah --n_iter 1 --expert_data cs285/expert_data/expert_data_HalfCheetah-v2.pkl
$ python cs285/scripts/run_hw1_behavior_cloning.py --expert_policy_file cs285/policies/experts/Hopper.pkl --env_name Hopper-v2 --exp_name test_bc_hopper --n_iter 1 --expert_data cs285/expert_data/expert_data_Hopper-v2.pkl
$ python cs285/scripts/run_hw1_behavior_cloning.py --expert_policy_file cs285/policies/experts/Humanoid.pkl --env_name Humanoid-v2 --exp_name test_bc_humanoid --n_iter 1 --expert_data cs285/expert_data/expert_data_Humanoid-v2.pkl
$ python cs285/scripts/run_hw1_behavior_cloning.py --expert_policy_file cs285/policies/experts/Walker2d.pkl --env_name Walker2d-v2 --exp_name test_bc_walker2d --n_iter 1 --expert_data cs285/expert_data/expert_data_Walker2d-v2.pkl

Run the following command for Section 2 (DAGGER):
(NOTE the --do_dagger flag and the higher value for --n_iter)

$ python cs285/scripts/run_hw1_behavior_cloning.py --expert_policy_file cs285/policies/experts/Walker2d.pkl --env_name Walker2d-v2 --exp_name test_dagger_walker --n_iter 10 --do_dagger --expert_data cs285/expert_data/expert_data_Walker2d-v2.pkl
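
Since the commands above differ only in the environment name, --exp_name, --n_iter and --do_dagger, they can also be generated rather than typed. The sketch below only builds the command strings shown in this section; the helper name bc_command is hypothetical.

```python
ENVS = ["Ant", "HalfCheetah", "Hopper", "Humanoid", "Walker2d"]


def bc_command(env, n_iter=1, do_dagger=False, exp_name=None):
    """Build the run command for one environment as a single string."""
    if exp_name is None:
        exp_name = "test_bc_" + env.lower()
    args = [
        "python", "cs285/scripts/run_hw1_behavior_cloning.py",
        "--expert_policy_file", "cs285/policies/experts/{}.pkl".format(env),
        "--env_name", "{}-v2".format(env),
        "--exp_name", exp_name,
        "--n_iter", str(n_iter),
    ]
    if do_dagger:
        args.append("--do_dagger")
    args += ["--expert_data", "cs285/expert_data/expert_data_{}-v2.pkl".format(env)]
    return " ".join(args)


if __name__ == "__main__":
    # Section 1: one behavior cloning run per environment.
    for env in ENVS:
        print(bc_command(env))
    # Section 2: the DAgger run on Walker2d.
    print(bc_command("Walker2d", n_iter=10, do_dagger=True, exp_name="test_dagger_walker"))
```

The printed lines can be pasted into a terminal from the hw1 folder, or handed to subprocess if you prefer to launch the runs from Python.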

##############################################
##############################################

6) visualize saved tensorboard event file:

$ cd cs285/data/<your_log_dir>
$ tensorboard --logdir .

Then open the URL that tensorboard prints to see the scalar summaries as plots (in the 'scalar' tab) and the videos (in the 'images' tab).
