"""
Copied and adapted from https://github.com/mila-iqia/babyai.
Levels described in the Baby AI ICLR 2019 submission.
The instructions are a synthesis of those from `PutNext`, `Open`, `GoTo`, and `Pickup`.
"""
from __future__ import annotations

from minigrid.envs.babyai.core.levelgen import LevelGen


class Synth(LevelGen):
    """

    ## Description

    Union of all instructions from PutNext, Open, GoTo and PickUp.
    The agent may need to move objects around. The agent may have
    to unlock the door, but only if the instruction explicitly
    refers to it.

    Competencies: Maze, Unblock, Unlock, GoTo, PickUp, PutNext, Open

    ## Mission Space

    "go to the {color} {type}"

    or

    "pick up a/the {color} {type}"

    or

    "open the {color} door"

    or

    "put the {color} {type} next to the {color} {type}"

    {color} is the color of the object. Can be "red", "green", "blue", "purple",
    "yellow" or "grey".

    {type} is the type of the object. Can be "ball", "box" or "key".

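The four templates above can be expanded mechanically. The following is a minimal sketch for illustration, not the grammar that `LevelGen` actually samples from: the helper name `synth_missions` is hypothetical, and it assumes that "a/the" means either article may appear.

```python
from itertools import product

COLORS = ["red", "green", "blue", "purple", "yellow", "grey"]
TYPES = ["ball", "box", "key"]


def synth_missions():
    # Hypothetical helper: expand the four mission templates listed above.
    missions = []
    for color, kind in product(COLORS, TYPES):
        missions.append(f"go to the {color} {kind}")
        for article in ("a", "the"):
            missions.append(f"pick up {article} {color} {kind}")
    for color in COLORS:
        missions.append(f"open the {color} door")
    # Assumption: every ordered pair of descriptions is allowed, including
    # pairs that describe the same object twice.
    descriptions = list(product(COLORS, TYPES))
    for (c1, t1), (c2, t2) in product(descriptions, descriptions):
        missions.append(f"put the {c1} {t1} next to the {c2} {t2}")
    return missions


missions = synth_missions()
print(len(missions))
print(missions[0])
```

Under these assumptions the "put ... next to ..." template dominates the count, because it combines two independent object descriptions.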

    ## Action Space

    | Num | Name         | Action                  |
    |-----|--------------|-------------------------|
    | 0   | left         | Turn left               |
    | 1   | right        | Turn right              |
    | 2   | forward      | Move forward            |
    | 3   | pickup       | Pick up an object       |
    | 4   | drop         | Drop the carried object |
    | 5   | toggle       | Toggle (open a door)    |
    | 6   | done         | Unused                  |

    ## Observation Encoding

    - Each tile is encoded as a 3-dimensional tuple:
        `(OBJECT_IDX, COLOR_IDX, STATE)`
    - The `OBJECT_TO_IDX` and `COLOR_TO_IDX` mappings can be found in
        [minigrid/minigrid.py](minigrid/minigrid.py)
    - `STATE` refers to the door state with 0=open, 1=closed and 2=locked

    ## Rewards

    A reward of '1 - 0.9 * (step_count / max_steps)' is given for success, and '0' for failure.

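The reward expression above can be checked with a few lines of plain Python. This is only a sketch of the formula as written in this docstring; the function names `success_reward` and `episode_reward` are hypothetical and are not part of the library.

```python
def success_reward(step_count, max_steps):
    # Reward for a successful episode, following the expression above.
    return 1 - 0.9 * (step_count / max_steps)


def episode_reward(success, step_count, max_steps):
    # A failed episode (including a timeout) yields 0.
    if not success:
        return 0.0
    return success_reward(step_count, max_steps)


for steps in (0, 50, 100):
    print(steps, episode_reward(True, steps, 100))
print("failure", episode_reward(False, 50, 100))
```

The formula implies that a success is always worth at least 0.1, so even a slow success scores higher than a failure.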

    ## Termination

    The episode ends if any one of the following conditions is met:

    1. The agent achieves the task.
    2. Timeout (see `max_steps`).

    ## Registered Configurations

    - `BabyAI-Synth-v0`
    - `BabyAI-SynthS5R2-v0`

    """

    def __init__(self, room_size=8, num_rows=3, num_cols=3, num_dists=18, **kwargs):
        # We add many distractors to increase the probability
        # of ambiguous locations within the same room
        super().__init__(
            room_size=room_size,
            num_rows=num_rows,
            num_cols=num_cols,
            num_dists=num_dists,
            instr_kinds=["action"],
            locations=False,
            unblocking=True,
            implicit_unlock=False,
            **kwargs,
        )


class SynthLoc(LevelGen):
    """

    ## Description

    Like Synth, but a significant share of object descriptions involves
    location language like in PickUpLoc. No implicit unlocking.

    Competencies: Maze, Unblock, Unlock, GoTo, PickUp, PutNext, Open, Loc

    ## Mission Space

    "go to the {color} {type} {location}"

    or

    "pick up a/the {color} {type} {location}"

    or

    "open the {color} door {location}"

    or

    "put the {color} {type} {location} next to the {color} {type} {location}"

    {color} is the color of the object. Can be "red", "green", "blue", "purple",
    "yellow" or "grey".

    {type} is the type of the object. Can be "ball", "box" or "key".

    {location} can be " ", "in front of you", "behind you", "on your left"
    or "on your right"

    ## Action Space

    | Num | Name         | Action                  |
    |-----|--------------|-------------------------|
    | 0   | left         | Turn left               |
    | 1   | right        | Turn right              |
    | 2   | forward      | Move forward            |
    | 3   | pickup       | Pick up an object       |
    | 4   | drop         | Drop the carried object |
    | 5   | toggle       | Toggle (open a door)    |
    | 6   | done         | Unused                  |

    ## Observation Encoding

    - Each tile is encoded as a 3-dimensional tuple:
        `(OBJECT_IDX, COLOR_IDX, STATE)`
    - The `OBJECT_TO_IDX` and `COLOR_TO_IDX` mappings can be found in
        [minigrid/minigrid.py](minigrid/minigrid.py)
    - `STATE` refers to the door state with 0=open, 1=closed and 2=locked

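To make the tile encoding above concrete, here is a minimal sketch that turns one `(OBJECT_IDX, COLOR_IDX, STATE)` tuple into a readable label. The helper `describe_tile` and the two small lookup tables are hypothetical placeholders; only the door-state values (0=open, 1=closed, 2=locked) come from this docstring, and the real index values should be taken from `OBJECT_TO_IDX` and `COLOR_TO_IDX`.

```python
DOOR_STATES = {0: "open", 1: "closed", 2: "locked"}


def describe_tile(tile, idx_to_object, idx_to_color):
    # tile is an (OBJECT_IDX, COLOR_IDX, STATE) tuple as described above.
    object_idx, color_idx, state = tile
    name = idx_to_object[object_idx]
    color = idx_to_color[color_idx]
    # STATE is only interpreted for doors in this sketch.
    if name == "door":
        return f"{DOOR_STATES[state]} {color} {name}"
    return f"{color} {name}"


# Placeholder lookup tables for illustration only. They are NOT the
# library's index values; invert the real mappings instead.
example_objects = {0: "door", 1: "key"}
example_colors = {0: "red", 1: "green"}

print(describe_tile((0, 1, 2), example_objects, example_colors))
print(describe_tile((1, 0, 0), example_objects, example_colors))
```

Passing the lookup tables as arguments keeps the sketch independent of the library's actual index assignments.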

    ## Rewards

    A reward of '1 - 0.9 * (step_count / max_steps)' is given for success, and '0' for failure.

    ## Termination

    The episode ends if any one of the following conditions is met:

    1. The agent achieves the task.
    2. Timeout (see `max_steps`).

    ## Registered Configurations

    - `BabyAI-SynthLoc-v0`

    """

    def __init__(self, **kwargs):
        # We add many distractors to increase the probability
        # of ambiguous locations within the same room
        super().__init__(
            instr_kinds=["action"],
            locations=True,
            unblocking=True,
            implicit_unlock=False,
            **kwargs,
        )


class SynthSeq(LevelGen):
    """

    ## Description

    Like SynthLoc, but now with multiple commands, combined just like in GoToSeq.
    No implicit unlocking.

    Competencies: Maze, Unblock, Unlock, GoTo, PickUp, PutNext, Open, Loc, Seq

    ## Mission Space

    Action mission space:

    "go to the {color} {type} {location}"

    or

    "pick up a/the {color} {type} {location}"

    or

    "open the {color} door {location}"

    or

    "put the {color} {type} {location} next to the {color} {type} {location}"

    {color} is the color of the object. Can be "red", "green", "blue", "purple",
    "yellow" or "grey".

    {type} is the type of the object. Can be "ball", "box" or "key".

    {location} can be " ", "in front of you", "behind you", "on your left"
    or "on your right"

    And mission space:

    Two action missions concatenated with "and"

    Example:

    go to the green key
    and
    put the box next to the yellow ball

    Sequence mission space:

    Two missions, each either an action mission or an "and" mission,
    concatenated with ", then" or "after you".

    Example:

    open a red door and go to the ball on your left
    after you
    put the grey ball next to a door

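The two composition rules above amount to simple string concatenation. The sketch below reproduces the sequence example from this docstring; the helper names `and_mission` and `sequence_mission` are hypothetical and only illustrate the surface form of the mission text, not how the level generator builds or verifies instructions.

```python
SEQUENCE_CONNECTORS = (", then", "after you")


def and_mission(first, second):
    # Two action missions concatenated with "and".
    return f"{first} and {second}"


def sequence_mission(first, second, connector):
    # Two missions (action or "and" missions) joined by one of the
    # two connectors listed above.
    if connector not in SEQUENCE_CONNECTORS:
        raise ValueError(f"unknown connector: {connector!r}")
    if connector == ", then":
        # The comma attaches directly to the first mission.
        return f"{first}, then {second}"
    return f"{first} after you {second}"


combined = and_mission("open a red door", "go to the ball on your left")
mission = sequence_mission(combined, "put the grey ball next to a door", "after you")
print(mission)
```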

    ## Action Space

    | Num | Name         | Action                  |
    |-----|--------------|-------------------------|
    | 0   | left         | Turn left               |
    | 1   | right        | Turn right              |
    | 2   | forward      | Move forward            |
    | 3   | pickup       | Pick up an object       |
    | 4   | drop         | Drop the carried object |
    | 5   | toggle       | Toggle (open a door)    |
    | 6   | done         | Unused                  |

    ## Observation Encoding

    - Each tile is encoded as a 3-dimensional tuple:
        `(OBJECT_IDX, COLOR_IDX, STATE)`
    - The `OBJECT_TO_IDX` and `COLOR_TO_IDX` mappings can be found in
        [minigrid/minigrid.py](minigrid/minigrid.py)
    - `STATE` refers to the door state with 0=open, 1=closed and 2=locked

    ## Rewards

    A reward of '1 - 0.9 * (step_count / max_steps)' is given for success, and '0' for failure.

    ## Termination

    The episode ends if any one of the following conditions is met:

    1. The agent achieves the task.
    2. Timeout (see `max_steps`).

    ## Registered Configurations

    - `BabyAI-SynthSeq-v0`

    """

    def __init__(self, **kwargs):
        # We add many distractors to increase the probability
        # of ambiguous locations within the same room
        super().__init__(
            locations=True, unblocking=True, implicit_unlock=False, **kwargs
        )


class MiniBossLevel(LevelGen):
    """

    ## Description

    Command can be any sentence drawn from the Baby Language grammar.
    Union of all competencies. This level is a superset of all other levels.
    Compared to BossLevel this has a smaller room and a lower probability of
    locked rooms.

    ## Mission Space

    Action mission space:

    "go to the {color} {type} {location}"

    or

    "pick up a/the {color} {type} {location}"

    or

    "open the {color} door {location}"

    or

    "put the {color} {type} {location} next to the {color} {type} {location}"

    {color} is the color of the object. Can be "red", "green", "blue", "purple",
    "yellow" or "grey".

    {type} is the type of the object. Can be "ball", "box" or "key".

    {location} can be " ", "in front of you", "behind you", "on your left"
    or "on your right"

    And mission space:

    Two action missions concatenated with "and"

    Example:

    go to the green key
    and
    put the box next to the yellow ball

    Sequence mission space:

    Two missions, each either an action mission or an "and" mission,
    concatenated with ", then" or "after you".

    Example:

    open a red door and go to the ball on your left
    after you
    put the grey ball next to a door

    ## Action Space

    | Num | Name         | Action                  |
    |-----|--------------|-------------------------|
    | 0   | left         | Turn left               |
    | 1   | right        | Turn right              |
    | 2   | forward      | Move forward            |
    | 3   | pickup       | Pick up an object       |
    | 4   | drop         | Drop the carried object |
    | 5   | toggle       | Toggle (open a door)    |
    | 6   | done         | Unused                  |

    ## Observation Encoding

    - Each tile is encoded as a 3-dimensional tuple:
        `(OBJECT_IDX, COLOR_IDX, STATE)`
    - The `OBJECT_TO_IDX` and `COLOR_TO_IDX` mappings can be found in
        [minigrid/minigrid.py](minigrid/minigrid.py)
    - `STATE` refers to the door state with 0=open, 1=closed and 2=locked

    ## Rewards

    A reward of '1 - 0.9 * (step_count / max_steps)' is given for success, and '0' for failure.

    ## Termination

    The episode ends if any one of the following conditions is met:

    1. The agent achieves the task.
    2. Timeout (see `max_steps`).

    ## Registered Configurations

    - `BabyAI-MiniBossLevel-v0`

    """

    def __init__(self, **kwargs):
        super().__init__(
            num_cols=2,
            num_rows=2,
            room_size=5,
            num_dists=7,
            locked_room_prob=0.25,
            **kwargs,
        )


class BossLevel(LevelGen):
    """

    ## Description

    Command can be any sentence drawn from the Baby Language grammar.
    Union of all competencies. This level is a superset of all other levels.

    ## Mission Space

    Action mission space:

    "go to the {color} {type} {location}"

    or

    "pick up a/the {color} {type} {location}"

    or

    "open the {color} door {location}"

    or

    "put the {color} {type} {location} next to the {color} {type} {location}"

    {color} is the color of the object. Can be "red", "green", "blue", "purple",
    "yellow" or "grey".

    {type} is the type of the object. Can be "ball", "box" or "key".

    {location} can be " ", "in front of you", "behind you", "on your left"
    or "on your right"

    And mission space:

    Two action missions concatenated with "and"

    Example:

    go to the green key
    and
    put the box next to the yellow ball

    Sequence mission space:

    Two missions, each either an action mission or an "and" mission,
    concatenated with ", then" or "after you".

    Example:

    open a red door and go to the ball on your left
    after you
    put the grey ball next to a door

    ## Action Space

    | Num | Name         | Action                  |
    |-----|--------------|-------------------------|
    | 0   | left         | Turn left               |
    | 1   | right        | Turn right              |
    | 2   | forward      | Move forward            |
    | 3   | pickup       | Pick up an object       |
    | 4   | drop         | Drop the carried object |
    | 5   | toggle       | Toggle (open a door)    |
    | 6   | done         | Unused                  |

    ## Observation Encoding

    - Each tile is encoded as a 3-dimensional tuple:
        `(OBJECT_IDX, COLOR_IDX, STATE)`
    - The `OBJECT_TO_IDX` and `COLOR_TO_IDX` mappings can be found in
        [minigrid/minigrid.py](minigrid/minigrid.py)
    - `STATE` refers to the door state with 0=open, 1=closed and 2=locked

    ## Rewards

    A reward of '1 - 0.9 * (step_count / max_steps)' is given for success, and '0' for failure.

    ## Termination

    The episode ends if any one of the following conditions is met:

    1. The agent achieves the task.
    2. Timeout (see `max_steps`).

    ## Registered Configurations

    - `BabyAI-BossLevel-v0`

    """

    def __init__(self, **kwargs):
        super().__init__(**kwargs)


class BossLevelNoUnlock(LevelGen):
    """

    ## Description

    Command can be any sentence drawn from the Baby Language grammar.
    Union of all competencies. This level is a superset of all other levels.
    No implicit unlocking.

    ## Mission Space

    Action mission space:

    "go to the {color} {type} {location}"

    or

    "pick up a/the {color} {type} {location}"

    or

    "open the {color} door {location}"

    or

    "put the {color} {type} {location} next to the {color} {type} {location}"

    {color} is the color of the object. Can be "red", "green", "blue", "purple",
    "yellow" or "grey".

    {type} is the type of the object. Can be "ball", "box" or "key".

    {location} can be " ", "in front of you", "behind you", "on your left"
    or "on your right"

    And mission space:

    Two action missions concatenated with "and"

    Example:

    go to the green key
    and
    put the box next to the yellow ball

    Sequence mission space:

    Two missions, each either an action mission or an "and" mission,
    concatenated with ", then" or "after you".

    Example:

    open a red door and go to the ball on your left
    after you
    put the grey ball next to a door

    ## Action Space

    | Num | Name         | Action                  |
    |-----|--------------|-------------------------|
    | 0   | left         | Turn left               |
    | 1   | right        | Turn right              |
    | 2   | forward      | Move forward            |
    | 3   | pickup       | Pick up an object       |
    | 4   | drop         | Drop the carried object |
    | 5   | toggle       | Toggle (open a door)    |
    | 6   | done         | Unused                  |

    ## Observation Encoding

    - Each tile is encoded as a 3-dimensional tuple:
        `(OBJECT_IDX, COLOR_IDX, STATE)`
    - The `OBJECT_TO_IDX` and `COLOR_TO_IDX` mappings can be found in
        [minigrid/minigrid.py](minigrid/minigrid.py)
    - `STATE` refers to the door state with 0=open, 1=closed and 2=locked

    ## Rewards

    A reward of '1 - 0.9 * (step_count / max_steps)' is given for success, and '0' for failure.

    ## Termination

    The episode ends if any one of the following conditions is met:

    1. The agent achieves the task.
    2. Timeout (see `max_steps`).

    ## Registered Configurations

    - `BabyAI-BossLevelNoUnlock-v0`

    """

    def __init__(self, **kwargs):
        super().__init__(locked_room_prob=0, implicit_unlock=False, **kwargs)