Writing the Code

Tutorial · Intermediate · +10 XP · 40 mins · Unity Technologies

In this tutorial you’ll write all of the C# code needed for the penguin ML-Agents. These scripts manage Scene setup (such as randomized placement of the penguin agent), penguin decision making, fish movement, and interaction between the penguin agent and the Scene.

1. Getting Started

First, you’ll create all of the C# scripts needed for this project. After you’ve created them, we’ll walk through the code for each.

1. Create a new folder in Unity called Scripts inside the Penguin folder.

2. Create four new C# scripts inside the Scripts folder (Figure 01):
a. PenguinAcademy
b. PenguinArea
c. PenguinAgent
d. Fish

Figure 01: C# scripts in the Scripts folder

2. PenguinAcademy.cs

PenguinAcademy builds on the Academy class, which is part of the MLAgents namespace. If you would like to learn more about it, you can find it in ML-Agents\Scripts\Academy.cs. ML-Agents requires an Academy because it manages machine learning training and inference; however, Academy is an abstract class, so it cannot be added directly to the Scene. To work around this, you will create a PenguinAcademy class that inherits from Academy and adds only a pair of properties that track curriculum values.

1. Open PenguinAcademy.cs.

2. Add a using statement for MLAgents.

3. Update the class definition to inherit from Academy instead of MonoBehaviour.

4. Add two float properties that hold the fish speed and the acceptable radius for feeding the baby (more on this later).

5. Override the InitializeAcademy() function.

6. Register two callbacks so that whenever the Academy resets with a new value from the curriculum, it’s accessible.
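
A minimal sketch of what steps 2 through 6 could look like, assuming a pre-1.0 ML-Agents release in which Academy exposes an overridable InitializeAcademy() and a FloatProperties member with RegisterCallback. The property names (FishSpeed, FeedRadius) and the keys ("fish_speed", "feed_radius") are assumptions; the keys would need to match the parameter names in your curriculum configuration.

```csharp
using MLAgents;

// Sketch only: assumes a pre-1.0 ML-Agents Academy API with FloatProperties.
public class PenguinAcademy : Academy
{
    // Current fish speed, updated from the curriculum (name is an assumption)
    public float FishSpeed { get; private set; }

    // Current acceptable feeding radius, updated from the curriculum (name is an assumption)
    public float FeedRadius { get; private set; }

    // Called once when the Academy is first initialized
    public override void InitializeAcademy()
    {
        FishSpeed = 0f;
        FeedRadius = 0f;

        // The keys below are assumptions; they must match the curriculum's parameter names.
        FloatProperties.RegisterCallback("fish_speed", value => { FishSpeed = value; });
        FloatProperties.RegisterCallback("feed_radius", value => { FeedRadius = value; });
    }
}
```

Keeping the setters private means only the callbacks can change these values, while other scripts can still read them.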

3. PenguinArea.cs

The PenguinArea (Figure 02) will manage a training area with one penguin, one baby, and multiple fish. It has the responsibilities of removing fish, spawning fish, and random placement of the penguins. There might be multiple PenguinAreas in a Scene for more efficient training.

Figure 02: A single PenguinArea with penguins and fish, which you will be creating later and attaching the PenguinArea.cs script to.

1. Open PenguinArea.cs.

2. Delete the Start() function.

3. Delete the Update() function.

4. Add using statements for MLAgents and TMPro.

5. Update the class definition to inherit from Area instead of MonoBehaviour.

6. Add the following variables inside the class (between { }).

These variables will keep track of important objects in the Scene. You will hook up objects to the public variables in Unity later in this tutorial.

7. Add a new ResetArea() function inside the class after the last private variable.

The functions in the code above do not exist yet, but we will create them later in this script.

8. Add a new RemoveSpecificFish() function.

9. Add a new FishRemaining() function.

When the penguin catches a fish, the PenguinAgent script will call RemoveSpecificFish() to remove it from the water.

The next few functions will handle placement of the animals in the area. It makes the most sense to spawn fish in the water and place the baby penguin on land. The penguin can move between land and water, so it can be placed in either. In Figure 03 below, you can see where the script will randomly position each type of animal.

Figure 03: Placement regions for the penguin, the baby penguin, and fish.

10. Add a new ChooseRandomPosition() function.

This function uses special radius and angle limits to pick a random position within wedges around the central point in the area. Read the comments in the code for more detail.
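
One way such a function could work is sketched below, using only standard UnityEngine calls. The class name PlacementSketch and the parameter names are assumptions, and this is not necessarily the tutorial's exact listing. Angles are measured in degrees around the Y axis.

```csharp
using UnityEngine;

// Sketch only: class and parameter names are assumptions.
public static class PlacementSketch
{
    // Picks a point inside a wedge around "center", bounded by two angles and two radii.
    public static Vector3 ChooseRandomPosition(
        Vector3 center, float minAngle, float maxAngle, float minRadius, float maxRadius)
    {
        float radius = minRadius;
        float angle = minAngle;

        // Only randomize when the range is non-empty
        if (maxRadius > minRadius)
        {
            radius = Random.Range(minRadius, maxRadius);
        }
        if (maxAngle > minAngle)
        {
            angle = Random.Range(minAngle, maxAngle);
        }

        // Rotate the forward vector around Y by "angle" degrees,
        // scale it by "radius", and offset it from the center.
        return center + Quaternion.Euler(0f, angle, 0f) * Vector3.forward * radius;
    }
}
```

For example, PlacementSketch.ChooseRandomPosition(transform.position, 0f, 360f, 0f, 9f) would pick a point anywhere within 9 units of the area's center (the values are illustrative). Sampling the radius uniformly places points somewhat more densely near the center, which is usually acceptable for spawning.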

11. Add a new RemoveAllFish() function.

The ResetArea() function calls RemoveAllFish() to make sure no fish are in the area before spawning new fish.

12. Add a new PlacePenguin() function.

13. Add a new PlaceBaby() function.

These functions place the penguins. In both cases, they set rigidbody velocities to zero because unexpected things can happen when training for long periods of time at 100x speed. For example, the penguin could fall through the floor, then accelerate downward. When the area resets, the position would be reset, but if the downward velocity is not reset, the penguin might blast through the ground.
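
The velocity reset described above can be sketched as a small helper. The name PlaceAtRest and its parameters are hypothetical; the Rigidbody properties used are standard Unity API.

```csharp
using UnityEngine;

// Sketch only: helper and parameter names are hypothetical.
public static class PlacementResetSketch
{
    // Moves a body to a new pose and clears any motion left over from the previous episode.
    public static void PlaceAtRest(Rigidbody body, Vector3 position, Quaternion rotation)
    {
        // Without these two lines, a body that was falling when the area reset
        // would keep its old momentum at the new position.
        body.velocity = Vector3.zero;
        body.angularVelocity = Vector3.zero;

        body.transform.position = position;
        body.transform.rotation = rotation;
    }
}
```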

14. Add a new SpawnFish() function.

This code places a specified number of fish in the area and sets their default swim speed. See the comments in the code for more detail.

15. Add a new Start() function.

This function finds the PenguinAcademy in the Scene and resets the academy. Note that there should only be one PenguinAcademy in the Scene that controls one or many PenguinAreas.

16. Add a new Update() function.

This function updates the cumulative reward display text on the back wall of the area every frame. It is not necessary for training, but it helps you see how well the penguins are performing.

That’s all for the PenguinArea script!

4. PenguinAgent.cs

The PenguinAgent class, which inherits from the Agent class, is where the cool stuff happens. It handles observing the environment, taking action, interacting, and accepting player input.

1. Open PenguinAgent.cs.

2. Delete the Start() function.

3. Delete the Update() function.

4. Add a using statement for MLAgents.

5. Change the class definition to inherit from Agent instead of MonoBehaviour.

6. Add public variables to keep track of the move and turn speed of the penguin agent as well as the Prefabs for the heart and regurgitated fish.

7. Add private variables to keep track of the agent's state and the Scene objects it interacts with.

InitializeAgent() is called once, automatically, when the agent wakes up. It is not called every time the agent is reset, which is why there is a separate AgentReset() function. We’ll use InitializeAgent() to find a few objects in our Scene.

8. Override InitializeAgent()

AgentAction() is where the agent receives and responds to commands. These commands may originate from a neural network or a human player, but this function treats them the same.

The vectorAction parameter is an array of numerical values that correspond to actions the agent should take. For this project, we are using "discrete" actions, which means each integer value (e.g., 0, 1, 2, …) corresponds to a choice. The alternative is "continuous" actions, which instead allows a choice of any fractional value between -1 and +1 (e.g., -0.7, 0.23, 0.4, …). Discrete actions allow only one choice at a time with no in-between.

In this case:

  • vectorAction[0] can either be 0 or 1, indicating whether to remain in place (0) or move forward at full speed (1).
  • vectorAction[1] can either be 0, 1, or 2, indicating whether to not turn (0), turn in the negative direction (1), or turn in the positive direction (2).

The neural network, when trained, actually has no concept of what these actions do. It only knows that when it sees the environment a certain way, some actions tend to result in more reward points. This is why it will be very important to create an effective observation of the environment later in this script.

After interpreting the vector actions, the AgentAction() function applies the movement and rotation and then adds a small negative reward. This small negative reward encourages the agent to complete its task as quickly as possible.

In this case, a reward of -1 / 5000 is given for each of the 5,000 steps. If the penguin finishes early — in 3,000 steps, for example — the negative reward added from this line of code would be -3000 / 5000 = -0.6. If the penguin takes all 5,000 steps, the total negative reward would be -5000 / 5000 = -1.
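
The arithmetic above can be checked with a few lines of plain C#. The class and method names are hypothetical, and the loop simply mirrors adding one small penalty per step.

```csharp
// Sketch only: names are hypothetical; no Unity dependency.
public static class StepPenaltySketch
{
    // Total time penalty after "stepsTaken" steps when each step costs -1 / maxSteps.
    public static float TotalPenalty(int stepsTaken, int maxSteps)
    {
        float perStep = -1f / maxSteps;
        float total = 0f;
        for (int i = 0; i < stepsTaken; i++)
        {
            total += perStep; // one small negative reward per step
        }
        return total;
    }
}
```

Calling StepPenaltySketch.TotalPenalty(3000, 5000) gives approximately -0.6 and TotalPenalty(5000, 5000) gives approximately -1, up to floating-point rounding.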

9. Override AgentAction()
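
A sketch of how this override could look, assuming the pre-1.0 Agent API in which AgentAction() receives a float array. The class name, the default speeds, and the maxEpisodeSteps field are assumptions for illustration; the real script would read the step limit from wherever your agent stores it.

```csharp
using MLAgents;
using UnityEngine;

// Sketch only: assumes the pre-1.0 Agent API; field names and values are illustrative.
public class AgentActionSketch : Agent
{
    public float moveSpeed = 5f;        // units per second (illustrative)
    public float turnSpeed = 180f;      // degrees per second (illustrative)
    public int maxEpisodeSteps = 5000;  // hypothetical field holding the step limit

    private Rigidbody body;

    public override void InitializeAgent()
    {
        base.InitializeAgent();
        body = GetComponent<Rigidbody>();
    }

    public override void AgentAction(float[] vectorAction)
    {
        // Branch 0: 0 = stay in place, 1 = move forward at full speed
        float forwardAmount = vectorAction[0];

        // Branch 1: 0 = no turn, 1 = negative direction, 2 = positive direction
        float turnAmount = 0f;
        if (vectorAction[1] == 1f)
        {
            turnAmount = -1f;
        }
        else if (vectorAction[1] == 2f)
        {
            turnAmount = 1f;
        }

        // Apply movement and rotation, scaled by the physics time step
        body.MovePosition(transform.position
            + transform.forward * forwardAmount * moveSpeed * Time.fixedDeltaTime);
        transform.Rotate(transform.up * turnAmount * turnSpeed * Time.fixedDeltaTime);

        // Small time penalty: -1 / maxEpisodeSteps per step
        AddReward(-1f / maxEpisodeSteps);
    }
}
```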

The Heuristic() function allows control of the agent without a neural network. This function will read inputs from the human player via the keyboard, convert them into actions, and return a list of those actions.

In our project:

  • The default forwardAction will be 0, but if the player presses 'W' on the keyboard, this value will be set to 1.
  • The default turnAction will be 0, but if the player presses 'A' or 'D' on the keyboard, the value will be set to 1 or 2 respectively to turn left or right.

10. Override the Heuristic() function.
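
The key mapping described above might be sketched as follows, assuming the pre-1.0 Agent API in which Heuristic() takes no arguments and returns a float array. The class name is hypothetical.

```csharp
using MLAgents;
using UnityEngine;

// Sketch only: assumes the pre-1.0 Agent API; class name is hypothetical.
public class HeuristicSketch : Agent
{
    public override float[] Heuristic()
    {
        float forwardAction = 0f; // default: stay in place
        float turnAction = 0f;    // default: do not turn

        if (Input.GetKey(KeyCode.W))
        {
            forwardAction = 1f;   // move forward
        }

        if (Input.GetKey(KeyCode.A))
        {
            turnAction = 1f;      // turn left
        }
        else if (Input.GetKey(KeyCode.D))
        {
            turnAction = 2f;      // turn right
        }

        // The order must match how AgentAction() reads the array
        return new float[] { forwardAction, turnAction };
    }
}
```

Because the returned array has the same layout as the actions a neural network would produce, AgentAction() does not need to know which source the values came from.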

The base Agent class calls the AgentReset() function automatically when the agent is done feeding the baby all of the fish. We will use it to empty the penguin’s belly and reset the area.

The penguin agent observes the environment in two different ways. The first way is with raycasts. This is like shining a bunch of laser pointers out from the penguin and seeing if they hit anything. It's similar to LIDAR, which is used by autonomous cars and robots. Raycast observations are added via a RayPerceptionSensor component as of version 0.12, which we'll add in the Unity Editor later.

The second way the agent observes the environment is with numerical values. Whether it's a true/false value, a distance, an XYZ position in space, or a quaternion rotation, you can convert an observation into a list of numbers and add it as an observation for the agent. Check out the comments in the code to understand what we're adding.

You need to be very thoughtful when choosing what to observe. If the agent doesn't have enough information about its environment, it will not be able to complete its task. Imagine your agent is floating in space, blindfolded. What would it need to be told about its environment to make an intelligent decision?

This penguin agent, as currently implemented, doesn't have any memory. We need to help it out by telling it where things are every update step so that it can make a decision. It’s possible to use memory in ML-Agents, but that’s beyond the scope of this tutorial. You can read more about it in the ML-Agents Recurrent Neural Network documentation.

11. Override the CollectObservations() function.
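
As a sketch of numerical observations, the override below adds eight values using AddVectorObs, assuming the pre-1.0 Agent API in which CollectObservations() takes no arguments. The class name, the baby reference, and the isFull flag are assumptions, and the particular set of observations is one plausible choice rather than a confirmed copy of the tutorial's listing.

```csharp
using MLAgents;
using UnityEngine;

// Sketch only: assumes the pre-1.0 Agent API; names are assumptions.
public class ObservationSketch : Agent
{
    public GameObject baby;        // hypothetical reference, assigned in the Inspector
    private bool isFull = false;   // hypothetical flag: true after the penguin eats a fish

    public override void CollectObservations()
    {
        // Whether the penguin has eaten a fish (1 value)
        AddVectorObs(isFull);

        // Distance to the baby (1 value)
        AddVectorObs(Vector3.Distance(baby.transform.position, transform.position));

        // Direction to the baby (3 values)
        AddVectorObs((baby.transform.position - transform.position).normalized);

        // Direction the penguin is facing (3 values)
        AddVectorObs(transform.forward);

        // 1 + 1 + 3 + 3 = 8 values in total
    }
}
```

With this set of observations, the vector observation size configured for the agent would need to be 8 so that it matches the number of values added here.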

Next we'll add FixedUpdate(), which checks whether the penguin is close enough to the baby and, if so, tries to regurgitate the fish to feed it. RegurgitateFish() itself checks that the penguin has a full belly before doing anything.

12. Add a new FixedUpdate() function.

Next we'll implement OnCollisionEnter() and test for collisions with items that have the tag "fish" or "baby" and respond accordingly.

13. Add a new OnCollisionEnter() function.
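
A sketch of the tag checks, using only standard Unity calls. The class name is hypothetical, and EatFish() and RegurgitateFish() are reduced to placeholder bodies here so that the example compiles on its own.

```csharp
using UnityEngine;

// Sketch only: class name is hypothetical; the two handlers are placeholders.
public class CollisionSketch : MonoBehaviour
{
    // Unity calls this when this object's collider or rigidbody starts touching another one
    private void OnCollisionEnter(Collision collision)
    {
        if (collision.transform.CompareTag("fish"))
        {
            // Try to eat the fish we bumped into
            EatFish(collision.gameObject);
        }
        else if (collision.transform.CompareTag("baby"))
        {
            // Try to feed the baby
            RegurgitateFish();
        }
    }

    private void EatFish(GameObject fishObject)
    {
        Debug.Log("Would eat " + fishObject.name);
    }

    private void RegurgitateFish()
    {
        Debug.Log("Would feed the baby");
    }
}
```

The "fish" and "baby" tags need to be defined in the project and assigned to the corresponding objects for these checks to succeed.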

Now we can add a function to eat fish, assuming the penguin doesn't already have a full stomach. It will remove that fish from the area and get a reward.

14. Add a new EatFish() function.

Finally, we'll add a function to regurgitate fish and feed the baby. We’ll spawn a regurgitated fish blob on the ground as well as a heart floating in the air to show how much the baby loves its parent for feeding it. We’ll also set an auto-destroy timer. The agent gets a reward, and if there are no fish remaining, we call Done(), which will automatically call AgentReset().

15. Create a new RegurgitateFish() function.

That's all for the PenguinAgent script!

5. Fish.cs

The Fish class will attach to each fish and make it swim. Unity doesn’t have water physics built in, so to keep things simple our code just moves each fish in a straight line toward a target destination.

1. Open Fish.cs.

2. Delete the Start() function.

3. Delete the Update() function.

4. Add several variables as shown.

Here’s an overview of the variables:

  • fishSpeed controls the average speed of the fish.
  • randomizedSpeed is a slightly altered speed that we will change randomly each time a new swim destination is picked.
  • nextActionTime is used to trigger the selection of a new swim destination.
  • targetPosition is the position of the destination the fish is swimming toward.

FixedUpdate() is called at a fixed interval (0.02 seconds by default), independent of frame rate, so the code behaves consistently even when the agent is training at an increased game speed, which is common for training ML-Agents. In it, we check if the fish should swim and, if so, call the Swim() function.

5. Add a new FixedUpdate() function.

Next, we’ll add swim functionality. At any given update, the fish will either pick a new speed and destination, or move toward its current destination.

When it is time to take a new action, the fish will:

  • Choose a new randomized speed between 50% and 150% of the average fish speed.
  • Pick a new random target position (in the water) to swim toward.
  • Rotate the fish to face the target.
  • Calculate the time needed to get there.

Otherwise, the fish will move toward the target and make sure it doesn't swim past it.

6. Add a new Swim() function.
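
The behavior described above could be sketched as follows. The variable names follow the overview earlier in this section, but the class name, the default speed, and the PickTargetInWater() helper are hypothetical stand-ins for however your area chooses a position in the water. This version uses Vector3.MoveTowards so the fish can never swim past its target, which may differ from the tutorial's original approach.

```csharp
using UnityEngine;

// Sketch only: class name, default speed, and PickTargetInWater() are hypothetical.
public class FishSwimSketch : MonoBehaviour
{
    public float fishSpeed = 1f;          // average speed (illustrative default)

    private float randomizedSpeed = 0f;   // speed used for the current destination
    private float nextActionTime = -1f;   // when to pick a new destination
    private Vector3 targetPosition;       // destination the fish is swimming toward

    private void FixedUpdate()
    {
        if (fishSpeed > 0f)
        {
            Swim();
        }
    }

    private void Swim()
    {
        if (Time.fixedTime >= nextActionTime)
        {
            // Choose a speed between 50% and 150% of the average
            randomizedSpeed = fishSpeed * Random.Range(0.5f, 1.5f);

            // Pick a new destination and face it
            targetPosition = PickTargetInWater();
            transform.rotation = Quaternion.LookRotation(
                targetPosition - transform.position, Vector3.up);

            // Schedule the next decision for when the fish should arrive
            float timeToTarget =
                Vector3.Distance(transform.position, targetPosition) / randomizedSpeed;
            nextActionTime = Time.fixedTime + timeToTarget;
        }
        else
        {
            // Step toward the target without overshooting it
            transform.position = Vector3.MoveTowards(
                transform.position, targetPosition, randomizedSpeed * Time.fixedDeltaTime);
        }
    }

    // Hypothetical stand-in: picks a point within 5 units at the fish's current depth
    private Vector3 PickTargetInWater()
    {
        Vector2 offset = Random.insideUnitCircle * 5f;
        return transform.position + new Vector3(offset.x, 0f, offset.y);
    }
}
```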

That’s all for the Fish script!

6. Conclusion

You should now have all of the code you need to train the penguins to catch fish and feed their babies. In the next tutorial, you will set up your Scene to use this code.
