solution:
metadata:
id: living-canvas
title: Living Canvas, a web-based puzzle game powered by Generative AI
description: Learn how to create an AI-powered web game using Angular, PhaserJS, Gemini, Imagen, Veo and Firebase App Hosting.
fallbackCta:
title: Launch in Firebase Studio
icon: firebase-studio
url: https://studio.firebase.google.com/import?url=https%3A%2F%2Fgithub.com%2FFirebaseExtended%2Fsolution-living-canvas
accentColor:
foreground: "white"
background: "#141e2a"
sections:
- type: watch
youtubeId: dN1DbYyopks
title: Living Canvas, a web-based puzzle game powered by Generative AI
description: Learn how to create an AI-powered web game using Angular, PhaserJS, Gemini, Imagen, Veo and Firebase App Hosting.
technologies:
- icon: /external-assets/angular.svg
label: Angular
url: https://angular.io
- icon: /external-assets/firebase-apphosting.svg
label: Firebase App Hosting
url: https://firebase.google.com/docs/app-hosting
- icon: /external-assets/gemini.svg
label: Gemini
url: https://ai.google.dev/
- icon: /external-assets/vertex-ai.svg
label: Vertex AI
url: https://cloud.google.com/vertex-ai
- icon: /external-assets/googlecloud.svg
label: Imagen
url: https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview
- icon: /external-assets/veo.svg
label: Veo
url: https://ai.google.dev/gemini-api/docs/video
- type: explore
mode: video
subsections:
- id: video-1
title: Gemini image analysis and image generation
description: |
The goal of every puzzle is to get the level's key to the lock. Here, the user draws a fire on the canvas; Gemini analyses the drawing and transforms it into a fire graphic via image generation. The resulting object has the burning property attached to it, which causes the ice in the level to melt.
videoPath: external-assets/living-canvas/living-canvas-1.mp4
orientation: landscape
frame: laptop
layout: sidebyside
logs:
- timestamp: 10000
summary: Extracting canvas data to send to server
imageData: /external-assets/living-canvas/message_asset_1_1.png
sequence:
- angular
- apphosting
inspect: angular-apphosting/angular-apphosting-0
- timestamp: 11000
summary: Creating the game object with default properties and waiting for Gemini's analysis
sequence:
- angular
- apphosting
inspect: angular-apphosting/angular-apphosting-1
- timestamp: 12000
summary: Uploading user drawing from browser to Firebase App Hosting backend.
imageData: /external-assets/living-canvas/message_asset_1_1.png
sequence:
- angular
- apphosting
inspect: vertexai-gemini/vertexai-gemini-0
- timestamp: 14000
summary: Gemini response to Vertex AI and Firebase App Hosting. Interpreted as "Fire".
detail: |
{
...
burning: true,
...
}
imageData: ""
sequence:
- gemini
- vertexai
- apphosting
- angular
inspect: vertexai-gemini/vertexai-gemini-0
- timestamp: 17500
summary: Gemini image generation response to Vertex AI and Firebase App Hosting.
imageData: /external-assets/living-canvas/message_asset_1_2.png
sequence:
- gemini
- vertexai
- apphosting
- angular
inspect: vertexai-gemini/vertexai-gemini-1
- id: video-2
title: Image and video generation with Imagen and Veo
description: |
The user draws a picture of a heart and an image generation request is sent to Imagen. Upon receiving the generated image, we place the gameplay object in the world with the higher-fidelity graphic. Simultaneously, we send the upgraded graphic to Veo to generate an animated video based on it. The frames of the resulting video are extracted and used to animate the object in the game world.
Note: the processing time of the video generation has been shortened for illustrative purposes.
videoPath: external-assets/living-canvas/living-canvas-2.mp4
orientation: landscape
frame: laptop
layout: sidebyside
logs:
- timestamp: 6000
summary: Extracting canvas data to send to server
imageData: /external-assets/living-canvas/message_asset_2_1.png
sequence:
- angular
- apphosting
inspect: angular-apphosting/angular-apphosting-0
- timestamp: 7000
summary: Creating the game object with default properties and waiting for Gemini's analysis
sequence:
- angular
- apphosting
inspect: angular-apphosting/angular-apphosting-1
- timestamp: 8000
summary: Uploading user drawing from browser to Firebase App Hosting backend.
imageData: /external-assets/living-canvas/message_asset_2_1.png
sequence:
- angular
- apphosting
inspect: vertexai-gemini/vertexai-gemini-0
- timestamp: 10000
summary: Gemini response to Vertex AI and Firebase App Hosting. Interpreted as "Heart".
detail: |
{
...
heals: true,
...
}
imageData: ""
sequence:
- gemini
- vertexai
- apphosting
- angular
inspect: vertexai-gemini/vertexai-gemini-0
- timestamp: 12000
summary: Imagen generation response to Firebase App Hosting.
imageData: /external-assets/living-canvas/message_asset_2_2.png
sequence:
- imagen
- vertexai
- apphosting
- angular
inspect: vertexai-imagen
- timestamp: 22000
summary: Veo video generation response to Firebase App Hosting.
imageData: /external-assets/living-canvas/message_asset_2_x.gif
sequence:
- veo
- geminidevapi
- apphosting
- angular
inspect: apphosting-veo
- id: video-3
title: Function interpretation from natural language text with Gemini
description: |
The user writes a freeform text command that they want to see happen in the game world.
We send that command to the server for Gemini to analyse the text and transform it into a JSON representation of the command which the game engine will then process as a function call.
videoPath: external-assets/living-canvas/living-canvas-3.mp4
orientation: landscape
frame: laptop
layout: sidebyside
logs:
- timestamp: 8000
summary: Uploading user command to Firebase App Hosting backend with text "douse the fires"
sequence:
- angular
- apphosting
- vertexai
- gemini
inspect: vertexai-gemini/vertexai-gemini-2
- timestamp: 9000
summary: Analysis from Gemini 2.5 Flash on "douse the fires"
detail: |
{
...
verb: "douse",
target: "fires",
...
}
sequence:
- gemini
- vertexai
- apphosting
- angular
inspect: vertexai-gemini/vertexai-gemini-2
- type: inspect
subsections:
- id: vertexai-gemini
title: Vertex AI and Gemini
mode: html
examples:
- title: Image analysis with Gemini
icons:
- /external-assets/firebase-apphosting.svg
- /external-assets/vertex-ai.svg
- /external-assets/gemini.svg
mode: markdown
info: |
When the user’s drawing arrives on the server, we send it to Gemini for a series of analysis steps.
Firstly, we ask Gemini if the image matches any of the predefined object and property mappings we already have. These mappings are defined in the `ai-config.json` configuration file and are injected into the `analysis_initialGuess` prompt.
We provide predefined mappings and a set of expected objects to ensure a more reliable gameplay experience for the majority of puzzles. For example, a drawing of fire should always result in the burning property being attached to that object, so including this shortcut step makes the analysis pipeline faster and more reliable for our users.
However, if the image does not match any of the predefined object types, we instead ask Gemini what it thinks the image is via the `analysis_genericGuess` prompt. Gemini returns a one- or two-word description of the drawing.
We then give this description to Gemini along with the full set of properties that can be attached in the game. Using the `analysis_attributesGuess` prompt, we ask Gemini which of these properties make sense for the described object. This lets the player draw anything they want – regardless of whether it was an object we anticipated in our design – and attaches properties completely dynamically.
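The three-step fallback described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the function and parameter names are hypothetical, while the real prompts live in `ai-config.json` and the analysis code in `server/helpers/ai-analysis.ts`.

```typescript
type Properties = Record<string, boolean>;

// Step 1 tries the predefined mappings (analysis_initialGuess); steps 2 and 3
// fall back to a free-form description (analysis_genericGuess) followed by
// dynamic property selection (analysis_attributesGuess).
async function analyseDrawing(
  matchKnownType: () => Promise<string | null>,
  describeImage: () => Promise<string>,
  guessAttributes: (description: string) => Promise<Properties>,
  knownMappings: Record<string, Properties>,
): Promise<{ type: string; properties: Properties }> {
  const known = await matchKnownType();
  if (known && knownMappings[known]) {
    // Shortcut: a predefined object gets its predefined properties.
    return { type: known, properties: knownMappings[known] };
  }
  // Fallback: describe the drawing, then pick properties for that description.
  const description = await describeImage();
  return { type: description, properties: await guessAttributes(description) };
}
```

The shortcut keeps the common case to a single model call, while the fallback path preserves the "draw anything" experience.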
code:
language: javascript
file: external-assets/living-canvas/vertexai-gemini-1.ts
region_tag: image_to_config
links:
- url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/helpers/ai-analysis.ts
type: github
- url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/ai-config.json
type: other
label: Explore the AI prompts
- url: https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal#gemini-text-and-image-samples-nodejs
type: other
label: Related Gemini docs
- title: Multimodal image generation with Gemini and input imagery
icons:
- /external-assets/firebase-apphosting.svg
- /external-assets/vertex-ai.svg
- /external-assets/gemini.svg
mode: markdown
info: |
After analysing the user's drawing, we send a request to one of our image generation models to upgrade the user's drawing into a higher fidelity graphic. In this example, we'll look at the Gemini image generation backend.
Gemini 2.0 Flash with image generation enables us to provide multimodal input to the model and receive a generated image as a result.
Here we construct the multimodal request by combining the text prompt with the base64 image data of the user's drawing. The text prompt references the type of object we want to create (e.g. fire, magnet) along with the visual style we want for the object. We know what the drawing is from the initial analysis steps Gemini handled earlier.
The full prompt is constructed as follows:
`"Generate an image of a TYPE, centered on a coloured background in a similar 2D side-on view with the following visual style: VISUAL_STYLE."`
For reference, the different visual styles are described as follows:
* **Realistic**: `"realistic, like 3D renderings from a 3D movie."`
* **Cartoon**: `"cartoon, like a comic book. Poppy art style. Strong solid colours."`
* **Retro**: `"8-bit retro pixel video game art. Sprites should be very low resolution. As if they are 16 pixels by 16 pixels but resized."`
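A minimal sketch of how the prompt and multimodal request could be assembled. The style strings are quoted from this page; the function and map names are illustrative, and the request shape follows the Gemini API's `text`/`inlineData` parts.

```typescript
// Visual style descriptions, keyed by the style the player selected.
const VISUAL_STYLES: Record<string, string> = {
  realistic: "realistic, like 3D renderings from a 3D movie.",
  cartoon: "cartoon, like a comic book. Poppy art style. Strong solid colours.",
  retro:
    "8-bit retro pixel video game art. Sprites should be very low resolution. " +
    "As if they are 16 pixels by 16 pixels but resized.",
};

// Fill the prompt template with the object type and the chosen style.
function buildGenerationPrompt(type: string, styleKey: string): string {
  const style = VISUAL_STYLES[styleKey] ?? VISUAL_STYLES["cartoon"];
  return (
    `Generate an image of a ${type}, centered on a coloured background ` +
    `in a similar 2D side-on view with the following visual style: ${style}`
  );
}

// The multimodal request pairs the prompt with the drawing's base64 data.
function buildMultimodalParts(prompt: string, base64Png: string) {
  return [
    { text: prompt },
    { inlineData: { mimeType: "image/png", data: base64Png } },
  ];
}
```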
code:
language: javascript
file: external-assets/living-canvas/vertexai-gemini-3.ts
region_tag: image_generation
links:
- url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/helpers/gemini-generation.ts
type: github
- url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/ai-config.json
type: other
label: Explore the AI prompts
- url: https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal#gemini-text-and-image-samples-nodejs
type: other
label: Related Gemini docs
- title: Generating a programmatic command with Gemini
icons:
- /external-assets/firebase-apphosting.svg
- /external-assets/vertex-ai.svg
- /external-assets/gemini.svg
mode: markdown
info: |
Another feature of the puzzle experience is the 'Spell Casting' ability. Besides drawing on the canvas to create objects, the user has a button in the Angular UI which opens a modal dialog.
Whatever text the user types into this modal dialog is then sent to the server for analysis and transformation by Gemini. We prompt Gemini to transform the natural language text into a command that the user wants to see happen in the game world. In other words, we cast a spell.
For example, if the user writes “douse all the fires” or “destroy the magnet”, Gemini transforms the command into a JSON representation which the game engine can process as a function call. In the case of "destroy the magnet", Gemini responds with `{ "verb": "destroy", "target": "magnet" }`.
The verbs recognised in these commands map to predefined functions, but the objects a command can target are provided to Gemini based on what currently exists in the game world. This helps scope the AI’s analysis.
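As a sketch, a dispatcher for the resulting JSON command might look like this. The object shape and method names are hypothetical, not the actual game engine API.

```typescript
interface SpellCommand {
  verb: string;
  target: string;
}

interface WorldObject {
  kind: string;
  extinguish(): void;
  destroy(): void;
}

// Apply a spell command to the world; returns how many objects it affected.
function applySpell(cmd: SpellCommand, world: WorldObject[]): number {
  // "fires" should match objects of kind "fire", so accept a plural target.
  const targets = world.filter(
    (o) => o.kind === cmd.target || `${o.kind}s` === cmd.target,
  );
  for (const obj of targets) {
    switch (cmd.verb) {
      case "douse":
        obj.extinguish();
        break;
      case "destroy":
        obj.destroy();
        break;
      default:
        break; // unrecognised verbs are ignored
    }
  }
  return targets.length;
}
```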
code:
language: javascript
file: external-assets/living-canvas/vertexai-gemini-2.ts
region_tag: text_to_command
links:
- url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/helpers/ai-analysis.ts
type: github
- url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/ai-config.json
type: other
label: Explore the AI prompts
- id: vertexai-imagen
title: Vertex AI and Imagen
mode: html
examples:
- title: Image generation with Imagen 3.0
icons:
- /external-assets/firebase-apphosting.svg
- /external-assets/vertex-ai.svg
- /external-assets/googlecloud.svg
- /external-assets/imagen.svg
mode: markdown
info: |
After analysing the user's drawing, we send a request to one of our image generation models to upgrade the user's drawing into a higher fidelity graphic. In this example, we'll look at the Imagen model.
Imagen is one of our best image generation models, and in this code we call the API from Firebase App Hosting using our Google Cloud credentials.
The exact prompt is constructed by referencing the type of object we want to create (e.g. fire, magnet) along with information about the visual style we want for the object.
The full prompt is constructed as follows:
`"Generate an image of a TYPE, centered on a coloured background in a similar 2D side-on view with the following visual style: VISUAL_STYLE."`
For reference, the different visual styles are described as follows:
* **Realistic**: `"realistic, like 3D renderings from a 3D movie."`
* **Cartoon**: `"cartoon, like a comic book. Poppy art style. Strong solid colours."`
* **Retro**: `"8-bit retro pixel video game art. Sprites should be very low resolution. As if they are 16 pixels by 16 pixels but resized."`
code:
language: javascript
file: external-assets/living-canvas/vertexai-imagen.ts
region_tag: image_generation
links:
- url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/helpers/imagen-generation.ts
type: github
- url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/ai-config.json
type: other
label: Explore the AI prompts
- url: https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview
type: other
label: Related Imagen docs
- id: apphosting-veo
title: Firebase App Hosting, Gemini Developer API and Veo
mode: html
examples:
- title: Video generation with Veo and frame extraction for animation
icons:
- /external-assets/firebase-apphosting.svg
- /external-assets/googlecloud.svg
- /external-assets/veo.svg
mode: markdown
info: |
Using Veo we can upgrade an existing image generated via Imagen to an animated version which is swapped out on the PhaserJS object in real-time.
Veo is a state-of-the-art multimodal video generation model. Note that we read the existing image into the `imageBuffer` variable and then upload this image along with a text prompt to Veo.
Our text prompt is designed to let Veo focus on the image itself while guiding the model with some constraints, such as highlighting that we want the animation to be 'gently moving' and that the image should float in place and always be visible, centered and without zoom or cropping. These subtle suggestions help ensure a consistent result.
The video returned from Veo must be either portrait or landscape. After receiving the video file we use ffmpeg on the server to crop the video to a square and apply the borders used by our other game objects via the Sharp image editing library. View the code on GitHub for more detail on the video frame extraction process.
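The square crop and frame sampling can be sketched with two small helpers. These names are hypothetical; the real pipeline drives ffmpeg and Sharp on the server.

```typescript
// Centered square crop, in the shape of ffmpeg's crop filter (crop=w:h:x:y).
function squareCrop(width: number, height: number) {
  const side = Math.min(width, height);
  return {
    w: side,
    h: side,
    x: Math.floor((width - side) / 2),
    y: Math.floor((height - side) / 2),
  };
}

// Evenly spaced timestamps (in seconds) at which to extract sprite frames,
// sampled at the middle of each interval to avoid the clip's very edges.
function frameTimestamps(durationSec: number, frameCount: number): number[] {
  if (durationSec <= 0 || frameCount <= 0) return [];
  const step = durationSec / frameCount;
  return Array.from({ length: frameCount }, (_, i) => (i + 0.5) * step);
}
```

For a 1280×720 landscape video this yields a 720×720 crop offset by 280 pixels horizontally, and an 8-second clip sampled into 4 frames gives timestamps at 1, 3, 5 and 7 seconds.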
code:
language: javascript
file: external-assets/living-canvas/vertexai-veo.ts
region_tag: video_generation
links:
- url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/helpers/veo-generation.ts
type: github
- url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/ai-config.json
type: other
label: Explore the AI prompts
- url: https://cloud.google.com/vertex-ai/generative-ai/docs/video/generate-videos
type: other
label: Related Veo docs
- id: angular-apphosting
title: Angular and PhaserJS with Firebase App Hosting
examples:
- title: Extracting canvas data to send to server
icons:
- /external-assets/chrome.svg
- /external-assets/firebase-apphosting.svg
mode: markdown
info: |
The user's drawing is captured on a hidden, transparent canvas element. To prepare the image for upload we copy the image data to a temporary canvas with a white background and additional padding around it.
The padding and white background improve Gemini's recognition and understanding of the image data. On the client we convert the image to base64 before uploading, which simplifies our pipeline.
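A minimal sketch of the client-side helpers involved. The names are illustrative; the real code lives in `HiddenCanvas.ts`.

```typescript
// canvas.toDataURL() yields "data:image/png;base64,<payload>"; the server
// only needs the payload, so strip everything up to the comma.
function dataUrlToBase64(dataUrl: string): string {
  const comma = dataUrl.indexOf(",");
  if (comma === -1) throw new Error("not a data URL");
  return dataUrl.slice(comma + 1);
}

// Dimensions of the temporary white canvas once padding is added on all sides.
function paddedSize(width: number, height: number, padding: number) {
  return { width: width + 2 * padding, height: height + 2 * padding };
}
```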
code:
language: javascript
file: external-assets/living-canvas/angular-apphosting-1.ts
region_tag: get_base64_data
links:
- url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/client/src/game/HiddenCanvas.ts#L131
type: github
- title: Instantiating the game object based on Gemini's analysis
icons:
- /external-assets/chrome.svg
- /external-assets/firebase-apphosting.svg
mode: markdown
info: |
In this code snippet we create the PhaserJS game object and place it in the world with all the properties provided by Gemini's analysis of the image. However, this process takes time as we must complete the Gemini analysis on the server and simultaneously request a higher fidelity image from the image generation backend.
Therefore, we begin by creating the object with default properties so that it appears as a static image showing only the user's drawing. The graphic fades in and out while we wait on the server's processing. Once the analysis returns from Gemini we merge the new properties onto the object and apply them. Later, we receive the upgraded image from the image generation model and update the object's sprite.
Finally, the object becomes fully interactive once all the processing is complete.
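The create-then-merge lifecycle can be sketched as follows. Property names other than `burning` and `heals` are illustrative, not the actual game object shape.

```typescript
interface GameObjectProps {
  interactive: boolean;
  texture: string;
  burning?: boolean;
  heals?: boolean;
}

// Default state: a static, non-interactive image of the user's own drawing.
const DEFAULT_PROPS: GameObjectProps = {
  interactive: false,
  texture: "user-drawing",
};

// Overlay Gemini's analysis onto the current properties once it arrives;
// values the analysis did not mention are kept as-is.
function mergeAnalysis(
  current: GameObjectProps,
  analysis: Partial<GameObjectProps>,
): GameObjectProps {
  return { ...current, ...analysis };
}
```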
code:
language: javascript
file: external-assets/living-canvas/angular-apphosting-3.ts
region_tag: process_canvas
links:
- url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/client/src/game/LivingCanvas.ts#L965
type: github
- type: quiz
questions:
- title: Why is Angular a useful framework for integrating into a web game such as Living Canvas?
answers:
- answer: Angular is optimised for game development
- answer: Game engines are not typically optimised for UI development
correct: true
- answer: Angular has a built-in physics engine
- title: When deciding which gameplay properties to attach to the user's drawing, why does the Gemini image analysis process involve up to three API calls?
answers:
- answer: Making 3 calls in a row is faster than a single call
- answer: Gemini always requires 3 calls to the API to process an image
- answer: Separating the prompts improved the accuracy of the analysis by narrowing the scope of possibilities at each step
correct: true
- title: Which model was best suited for transforming natural language text into the JSON command structure used to call game functions for the "spell casting" feature?
answers:
- answer: Imagen
- answer: Gemini Flash with image generation
- answer: Gemini Flash
correct: true
- answer: Veo
- title: To achieve the best user experience with the animated sprite generation pipeline, which model or combination of models was used?
answers:
- answer: Gemini and Imagen
- answer: Veo and Imagen
correct: true
- answer: Veo
- answer: Gemini
- type: build
promoType: firebase-studio
links:
- url: https://github.com/FirebaseExtended/solution-living-canvas
label: Open on GitHub
- url: https://developers.google.com/solutions
label: Explore more Solutions
architecture:
entities:
- id: angular
icon: /external-assets/angular.svg
label: Angular and PhaserJS
x: 0
y: 0
connections:
- from: angular
to: apphosting
inspect: angular-apphosting
- id: apphosting
icon: /external-assets/firebase-apphosting.svg
label: Firebase App Hosting
x: 2
y: 0
connections:
- from: apphosting
to: vertexai
- from: apphosting
to: geminidevapi
inspect: apphosting-veo
- id: vertexai
icon: /external-assets/vertex-ai.svg
label: Vertex AI
x: 4
y: 0
connections:
- from: vertexai
to: gemini
inspect: vertexai-gemini
- from: vertexai
to: imagen
inspect: vertexai-imagen
- id: geminidevapi
icon: /external-assets/gemini-dev-api.svg
label: Gemini Developer API
x: 4
y: 2
connections:
- from: geminidevapi
to: veo
- id: gemini
icon: /external-assets/gemini.svg
label: Gemini
x: 6
y: -2
- id: imagen
icon: /external-assets/imagen.svg
label: Imagen
x: 6
y: 0
- id: veo
icon: /external-assets/veo.svg
label: Veo
x: 6
y: 2
badges:
startBadge: https://developers.google.com/profile/badges/playlists/solutions/living-canvas/view
exploreBadge: https://developers.google.com/profile/badges/playlists/solutions/living-canvas/learn
quizBadge: https://developers.google.com/profile/badges/playlists/solutions/living-canvas/quiz
buildBadge: https://developers.google.com/profile/badges/playlists/solutions/living-canvas/action