This is the repo for AI for Oceans from Code.org.
Like the Dance Party repo, it is a standalone repo that is published as an NPM package, and consumed by the main repo.
AI for Oceans was produced for the Hour of Code in 2019. This module provides the student experience for the 5 interactive levels in the AI for Oceans script at https://studio.code.org/s/oceans.
We have measured over one million unique completions of the script.
These 5 levels are invoked with a "mode" (stored internally as appMode) parameter:
- fishvtrash: The user trains the AI to differentiate between fish and trash, and then examines the results.
- creaturesvtrashdemo: The concept of non-fish sea creatures is introduced to show that AI is only as good as its training. In this mode, the experience is abbreviated: the user doesn't do any training; instead, the mode demonstrates what happens when fish-specific training encounters non-fish.
- creaturesvtrash: The user trains the AI again, this time encountering fish, non-fish creatures, and trash.
- short: The user chooses one of six adjectives and then categorizes fish based on it. The AI is trained on which fish do or do not fit this arbitrary category, and then demonstrates this training.
- long: The user chooses one of fifteen adjectives. With more subjectivity in this list, the user can explore more subtle implications of training and categorization.
Adapted from content at https://code.org/oceans:
Levels 2-4 (fishvtrash, creaturesvtrashdemo, creaturesvtrash) use a pretrained model provided by the TensorFlow MobileNet project. A MobileNet model is a convolutional neural network that has been trained on ImageNet, a dataset of over 14 million images hand-annotated with words such as "balloon" or "strawberry". In order to customize this model with the labeled training data the student generates in this activity, we use a technique called Transfer Learning. Each image in the training dataset is fed to MobileNet, as pixels, to obtain a list of annotations that are most likely to apply to it. Then, for a new image, we feed it to MobileNet and compare its resulting list of annotations to those from the training dataset. We classify the new image with the same label (such as "fish" or "not fish") as the images from the training set with the most similar results.
Levels 6-8 (short, long) use a Support-Vector Machine (SVM). We look at each component of the fish (such as eyes, mouth, body) and assemble all of the metadata for the components (such as number of teeth, body shape) into a vector of numbers for each fish. We use these vectors to train the SVM. Based on the training data, the SVM separates the "space" of all possible fish into two parts, which correspond to the classes we are trying to learn (such as "blue" or "not blue").
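The "most similar results" comparison described above for the MobileNet levels is essentially a nearest-neighbor lookup. The following is a minimal sketch of that idea, not the app's actual code: it assumes each image has already been reduced to a numeric vector (in the app this would come from MobileNet; here the vectors are made up), and the function names squaredDistance and classify are hypothetical.

```javascript
// Hypothetical sketch: k-nearest-neighbor classification over feature vectors.
// In the real app the vectors would come from MobileNet; these are made up.

// Squared Euclidean distance between two equal-length vectors.
function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// examples: [{vector: number[], label: string}]
// Returns the most common label among the k closest training examples.
function classify(examples, vector, k = 3) {
  const nearest = examples
    .map(ex => ({label: ex.label, dist: squaredDistance(ex.vector, vector)}))
    .sort((x, y) => x.dist - y.dist)
    .slice(0, k);

  // Count one vote per neighbor.
  const votes = {};
  for (const n of nearest) {
    votes[n.label] = (votes[n.label] || 0) + 1;
  }
  return Object.keys(votes).reduce((best, label) =>
    votes[label] > votes[best] ? label : best
  );
}

// Labeled examples, as a student might produce during training.
const trainingSet = [
  {vector: [0.9, 0.1], label: 'fish'},
  {vector: [0.8, 0.2], label: 'fish'},
  {vector: [0.7, 0.3], label: 'fish'},
  {vector: [0.1, 0.9], label: 'not fish'},
  {vector: [0.2, 0.8], label: 'not fish'}
];

const prediction = classify(trainingSet, [0.85, 0.15]);
console.log(prediction);
```

With only a handful of training examples, a small k keeps a single mislabeled item from dominating the vote.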
The AI for Oceans script presents a linear narrative structure. This app is designed to deliver the interactive levels for this script, one mode at a time, with no need to persist data to the browser or server between each level.
The app itself presents a variety of "scenes", with each mode using a different subset. The scenes (known as currentMode internally) are as follows:
A simple "loading" screen, used when loading or processing data.
The user selects from a list of adjectives for the short & long modes.
The user trains the AI by choosing one of two options (true or false) for each item (fish, non-fish sea creatures, trash).
The user watches A.I. (the "bot") categorizing items, one at a time.
The user is shown the result of the predictions. The user can toggle between the matching & non-matching sets.
In the short and long modes, the pond also has a metapanel which can show general information about the ML processing or, when a fish is selected, specific information about that fish's categorization.
The app uses three layers in the DOM. Underneath, one canvas contains the scene's background image, while another canvas contains all the sprites. On top, the app uses React to render HTML elements for the user interface, implemented here.
The app is fully responsive by scaling the canvases and also scaling the size of the HTML elements correspondingly. This way, the UI simply shrinks to match the underlying canvases.
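One way to keep the canvases and the HTML overlay in step is to derive a single scale factor from the container size and apply it to both. The sketch below illustrates that idea only; the native dimensions and the names computeScale and scaledStyles are assumptions, not values or functions taken from this repo.

```javascript
// Hypothetical sketch: one scale factor drives both canvas and UI sizes.
const NATIVE_WIDTH = 1024; // assumed native canvas size, for illustration only
const NATIVE_HEIGHT = 576;

function computeScale(containerWidth, containerHeight) {
  // Use the smaller ratio so the whole scene always fits in the container.
  return Math.min(
    containerWidth / NATIVE_WIDTH,
    containerHeight / NATIVE_HEIGHT
  );
}

function scaledStyles(containerWidth, containerHeight, baseFontSizePx = 16) {
  const scale = computeScale(containerWidth, containerHeight);
  return {
    canvasWidth: Math.round(NATIVE_WIDTH * scale),
    canvasHeight: Math.round(NATIVE_HEIGHT * scale),
    // HTML elements sized relative to this font size shrink with the canvas.
    uiFontSizePx: baseFontSizePx * scale
  };
}

const halfSize = scaledStyles(512, 400);
console.log(halfSize);
```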
The animation is designed to be smooth and frame-rate independent.
The prediction screen notably renders the progression based on the concept of a "current offset in time", making it possible to pause, and even reverse the animation, as well as adjust its speed.
All items have simple "bobbing" animations, using offsets cycling in a sine loop, such as here.
The fish pause under the scanner using a simple S-curve adjustment to their movement, implemented here.
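The animation ideas above share one property: positions are computed as pure functions of a time offset rather than accumulated per frame, which is what makes pausing, reversing, and speed changes straightforward. The following sketch shows one possible shape for such helpers. The function names, default values, and the choice of smoothstep as the S-curve are assumptions for illustration, not the app's actual implementation.

```javascript
// Hypothetical sketch of time-based animation helpers.

// Vertical "bobbing" offset in pixels at time tMs (milliseconds).
// The phase lets each item bob out of step with its neighbors.
function bobOffset(tMs, periodMs = 2000, amplitudePx = 5, phase = 0) {
  return amplitudePx * Math.sin((2 * Math.PI * tMs) / periodMs + phase);
}

// Smoothstep S-curve: maps 0..1 to 0..1 with zero slope at both ends.
// Inputs outside 0..1 are clamped.
function sCurve(x) {
  const c = Math.min(1, Math.max(0, x));
  return c * c * (3 - 2 * c);
}

// Horizontal position of an item travelling from startX to endX.
// Because of the S-curve, the item eases out of its start position and slows
// to rest at endX (for example, a position under the scanner).
function itemX(timeOffsetMs, startX, endX, travelMs) {
  return startX + (endX - startX) * sCurve(timeOffsetMs / travelMs);
}

// The same time offset always yields the same position, so stepping the
// offset backwards simply plays the motion in reverse.
console.log(itemX(250, 0, 100, 1000), itemX(500, 0, 100, 1000));
```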
After initial playtests, we identified a need to slow the pacing of the tutorial and tell a clear story. The solution we adopted was text boxes with "typing" text, reminiscent of old-school computer games.
"The Guide" is the implementation of this solution, and was designed to be a simple but flexible system that allowed us to add a variety of text for every step and situation encountered in the tutorial.
Each piece of Guide text is declared, along with the app state needed for it to show (which can even include code for more expressiveness), here.
This simple system enabled the team to add a detailed narrative voice to the script, and allowed a variety of team members to contribute text.
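To illustrate the declarative approach described above, here is a minimal sketch of guide entries paired with the app state they require. The field names (id, text, when, condition), the scene names, and the sample text are all hypothetical and may not match the real declarations.

```javascript
// Hypothetical sketch of declarative guide text; field names are made up.
const guides = [
  {
    id: 'train-intro',
    text: 'Is this a fish? Teach A.I. by picking a label.',
    when: {scene: 'train'},
    // Optional code for conditions a plain state match cannot express.
    condition: state => state.trainedCount === 0
  },
  {
    id: 'train-keep-going',
    text: 'Keep going! More examples help A.I. learn.',
    when: {scene: 'train'},
    condition: state => state.trainedCount > 0 && state.trainedCount < 5
  },
  {
    id: 'predict-intro',
    text: 'Now watch A.I. sort items on its own.',
    when: {scene: 'predict'}
  }
];

// Returns the first guide whose required state matches and that has not
// already been dismissed, or null if none applies.
function currentGuide(state, dismissedIds = new Set()) {
  const found = guides.find(guide => {
    if (dismissedIds.has(guide.id)) {
      return false;
    }
    const matches = Object.keys(guide.when).every(
      key => state[key] === guide.when[key]
    );
    return matches && (!guide.condition || guide.condition(state));
  });
  return found || null;
}

console.log(currentGuide({scene: 'train', trainedCount: 0}).text);
```

Keeping the entries as plain data means a contributor can add or reword text without touching the lookup logic.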
We also use modal popups to give extra information.
The app's runtime state is stored in a very simple module here. Updates to state trigger a React render, unless deliberately skipped.
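A state module of this kind can be very small. The sketch below shows the general pattern under stated assumptions: the names getState, setState, setOnChange, and the skipRender option are hypothetical, and the render callback stands in for a React render call.

```javascript
// Hypothetical sketch of a minimal state module.
let state = {scene: 'loading', trainedCount: 0};
let onChange = null;

function getState() {
  return state;
}

// Register the function that re-renders the UI (for example a React render).
function setOnChange(callback) {
  onChange = callback;
}

// Merge a partial update into the state. The change callback runs unless
// the caller deliberately skips it.
function setState(partial, options = {}) {
  state = {...state, ...partial};
  if (!options.skipRender && onChange) {
    onChange(state);
  }
  return state;
}

// Usage: count how many renders the updates trigger.
let renderCount = 0;
setOnChange(() => {
  renderCount++;
});
setState({scene: 'train'}); // triggers a render
setState({trainedCount: 1}, {skipRender: true}); // updates silently
console.log(renderCount, getState());
```

Skipping the render is useful for high-frequency updates, such as per-frame animation values, that the canvas draws on its own.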
The full functionality of this app is enabled when hosted by https://studio.code.org. The main repo loads this app via code here. Specific parameters passed in during initialization, here, include a foreground and background canvas, the appMode, a callback when the user continues to the next level, callbacks for loading & playing sound effects, and localized strings.
If Google Analytics is available on the page, the app generates a synthetic page view for each scene, allowing for an understanding of usage and duration of each scene in the script.
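A synthetic page view can be sent with the analytics.js command queue when it exists on the page. The sketch below is illustrative only: the function name reportScene and the virtual path format are assumptions, and the ga function is passed in as a parameter so the example also runs outside a browser.

```javascript
// Hypothetical sketch: report a synthetic page view when the scene changes.
// `ga` is the analytics.js command queue function, if the host page has it.
function reportScene(ga, appMode, scene) {
  if (typeof ga !== 'function') {
    return false; // analytics is not available; do nothing
  }
  // The virtual path format is an assumption made for this illustration.
  ga('send', 'pageview', `/oceans/${appMode}/${scene}`);
  return true;
}

// Usage with a stand-in recorder. In a browser, window.ga would be passed.
const recorded = [];
const fakeGa = (...args) => recorded.push(args);

const sent = reportScene(fakeGa, 'fishvtrash', 'train');
const skipped = reportScene(undefined, 'fishvtrash', 'train');
console.log(sent, skipped, recorded);
```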
The documentation for common operations for AI Lab is comprehensive and should apply to this project too: https://github.com/code-dot-org/ml-playground#common-operations
Steps to get up and running:
git clone git@github.com:code-dot-org/ml-activities.git
cd ml-activities
nvm install
nvm use
npm install -g yarn
yarn
yarn start
At this point the app will be running at http://localhost:8080 with live-reloading on file changes.
Integration with local code-dot-org repo
Similar to https://github.com/code-dot-org/dance-party, ml-activities is built from a small repo as an app which is then used by the code.org dashboard to run individual levels in a script.
If you want to make changes locally in ml-activities and have them show up in your apps build, do the following:
- In the ml-activities root directory
yarn link
- In the code-dot-org apps/ directory
yarn link @code-dot-org/ml-activities
This will set up a symlink in apps/node_modules/@code-dot-org to point at your local changes. Run yarn build in ml-activities, and then the code-dot-org apps build should pick up the changes (generated in ml-activities' dist/) next time it occurs (including in an already-running yarn start build in code-dot-org).
- Note that ml-activities' yarn start can be left running when yarn build is run. But a new invocation of yarn start will intentionally clear the dist/ directory populated by yarn build to ensure we don't have outdated assets left in it.
- If you want to go back to using the published module, in the code-dot-org apps/ directory run yarn unlink @code-dot-org/ml-activities. You'll be given additional instructions on how to force the module to be rebuilt after that.
First, ensure you have the main branch checked out locally, and that it's up to date.
To publish a new version, the following command should work:
npm version 0.0.29
Replace 0.0.29 with the new version number that should be published.
Note: make sure you are logged into npm first. If not, the command may fail with a misleading E404 error. You can check whether you're logged in with npm whoami, and if you are not, you can use npm login.
All fish components live in public/images/fish in their respective folders (e.g. bodies live in body/). Although the fish face right in most of the tutorial, they are built as if they face left in order to simplify the math for the anchor points. This means that all components should be oriented as if the fish is facing left, which might require flipping new assets horizontally. After adding the assets, they will need to be added to src/utils/fishData.js. bin/determineKnnData.js will output some of the lines that will be needed in fishData.
All components can define exclusions, which are modes that the component won't be used in. Components appear in all modes by default.
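As an illustration of how exclusions might be applied, the sketch below filters a component list by mode. The entry shape (an id plus an optional exclusions array) and the function name componentsForMode are assumptions; the real entries in fishData.js may look different.

```javascript
// Hypothetical component entries; the real shape in fishData.js may differ.
const dorsalFins = [
  {id: 'finA'}, // no exclusions: appears in every mode
  {id: 'finB', exclusions: ['fishvtrash']},
  {id: 'finC', exclusions: ['short', 'long']}
];

// Keep only the components that are allowed in the given mode.
function componentsForMode(components, appMode) {
  return components.filter(
    component => !(component.exclusions || []).includes(appMode)
  );
}

console.log(componentsForMode(dorsalFins, 'short').map(c => c.id));
```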
Some components need more configuration:
Bodies need an anchor point for the body itself and for each of the other components, relative to the bounds of the body image. A face anchor point is used for both the eyes and the mouth. The eyes and mouth are arranged with respect to each other and the defined anchor point. The tail Y anchor point is set from where the center of the component should be.
Some dorsal fins define an x-adjustment to shift the anchor point. This is useful for dorsal fins that might look odd if not positioned correctly (e.g. symmetrical ones).
By default, this tutorial is in English. The strings live at i18n/oceans.json and should not be moved without corresponding changes to the I18n pipeline in code-dot-org. Translations can be passed into the app using the i18n param. If any translations are missing, the English string will be used as a default. This also means that adding a new string is safe and does not require any further steps.
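The fallback behavior described above can be sketched as a lookup that prefers the passed-in translation and otherwise returns the English string. The function name createTranslator and the sample keys are hypothetical and are not taken from i18n/oceans.json.

```javascript
// Hypothetical sketch of English fallback for missing translations.
// The keys below are made up for illustration.
const english = {
  continueButton: 'Continue',
  trainMoreButton: 'Train More'
};

// Returns a lookup function that prefers the translation for a key and
// falls back to the default (English) string when it is missing or empty.
function createTranslator(defaults, translations = {}) {
  return key => {
    const translated = translations[key];
    return translated !== undefined && translated !== ''
      ? translated
      : defaults[key];
  };
}

// A partial translation set: only one of the two strings is translated.
const t = createTranslator(english, {continueButton: 'Continuar'});
console.log(t('continueButton'), '|', t('trainMoreButton'));
```

Because a missing key resolves to English, a newly added string shows up untranslated rather than blank until translations arrive.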
We currently have support for two machine learning algorithms: k nearest neighbor (KNN) and support vector machine (SVM). We also have a MobileNet model that is saved at src/oceans/model.json (it's saved here to avoid a call to googleapis.com).