Agent experience is divided into atomic episodes, in each the agent perpect and perform a single action. The next episode does not depend on the actions taken in previous episodes.
Many classification tasks are episodic. For example, an agent that has to spot defective parts on an assembly line bases each decision on the current part, regardless of previous decisions; moreover, the current decision doesn’t affect whether the next part is defective.
---
#intro