solution: metadata: id: living-canvas title: Living Canvas, a web-based puzzle game powered by Generative AI description: Learn how to create an AI-powered web game using Angular, PhaserJS, Gemini, Imagen, Veo and Firebase App Hosting. fallbackCta: title: Launch in Firebase Studio icon: firebase-studio url: https://studio.firebase.google.com/import?url=https%3A%2F%2Fgithub.com%2FFirebaseExtended%2Fsolution-living-canvas accentColor: foreground: "white" background: "#141e2a" sections: - type: watch youtubeId: dN1DbYyopks title: Living Canvas, a web-based puzzle game powered by Generative AI description: Learn how to create an AI-powered web game using Angular, PhaserJS, Gemini, Imagen, Veo and Firebase App Hosting. technologies: - icon: /external-assets/angular.svg label: Angular url: https://angular.io - icon: /external-assets/firebase-apphosting.svg label: Firebase App Hosting url: https://firebase.google.com/docs/app-hosting - icon: /external-assets/gemini.svg label: Gemini url: https://ai.google.dev/ - icon: /external-assets/vertex-ai.svg label: Vertex AI url: https://cloud.google.com/vertex-ai - icon: /external-assets/googlecloud.svg label: Imagen url: https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview - icon: /external-assets/veo.svg label: Veo url: https://ai.google.dev/gemini-api/docs/video - type: explore mode: video subsections: - id: video-1 title: Gemini image analysis and image generation description: | The goal in every puzzle is for the user to find a way for the key in the level to reach the lock goal. The user draws a picture of a fire on the canvas; Gemini analyses the image and transforms it into a fire graphic with image generation. The object has the burning property attached to it, which causes the ice to melt.
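Roughly, the default-plus-analysis flow works like this: the object is created with safe default properties, then the properties Gemini attaches (such as `burning`) are merged in when analysis completes. A minimal sketch; `ObjectProperties`, `defaultProperties` and `applyAnalysis` are illustrative names, not the actual game code:

```typescript
// Illustrative sketch: a game object starts with default properties and is
// upgraded once Gemini's analysis arrives. Names and shapes are assumptions.
interface ObjectProperties {
  burning: boolean;
  heals: boolean;
}

const defaultProperties: ObjectProperties = { burning: false, heals: false };

// Gemini returns a partial property set, e.g. { burning: true } for "Fire".
function applyAnalysis(analysis: Partial<ObjectProperties>): ObjectProperties {
  // Defaults keep the object usable while we wait; analysis overrides them.
  return { ...defaultProperties, ...analysis };
}
```

Under this sketch, a drawing analysed as "Fire" would yield an object whose `burning` flag is set while every unmentioned property keeps its default.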
videoPath: external-assets/living-canvas/living-canvas-1.mp4 orientation: landscape frame: laptop layout: sidebyside logs: - timestamp: 10000 summary: Extracting canvas data to send to server imageData: /external-assets/living-canvas/message_asset_1_1.png sequence: - angular - apphosting inspect: angular-apphosting/angular-apphosting-0 - timestamp: 11000 summary: Creating the game object with default properties and waiting for Gemini's analysis sequence: - angular - apphosting inspect: angular-apphosting/angular-apphosting-1 - timestamp: 12000 summary: Uploading user drawing from browser to Firebase App Hosting backend. imageData: /external-assets/living-canvas/message_asset_1_1.png sequence: - angular - apphosting inspect: vertexai-gemini/vertexai-gemini-0 - timestamp: 14000 summary: Gemini response to Vertex AI and Firebase App Hosting. Interpreted as "Fire". detail: | { ... burning: true, ... } imageData: "" sequence: - gemini - vertexai - apphosting - angular inspect: vertexai-gemini/vertexai-gemini-0 - timestamp: 17500 summary: Gemini image generation response to Vertex AI and Firebase App Hosting. imageData: /external-assets/living-canvas/message_asset_1_2.png sequence: - gemini - vertexai - apphosting - angular inspect: vertexai-gemini/vertexai-gemini-1 - id: video-2 title: Image and video generation with Imagen and Veo description: | The user draws a picture of a heart and an image generation request is sent to Imagen. Upon receiving the generated image, we place the gameplay object in the world with a higher fidelity image. Simultaneously, we send the upgraded graphic to Veo to generate an animated video based on that graphic. The resultant video file's frames are extracted and used to animate the object in the game world. Note: the processing time of the video generation has been shortened for illustrative purposes.
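The two-stage upgrade described above, placing the static Imagen graphic first and swapping in the Veo animation when it arrives, can be sketched as below. `generateImage`, `generateVideo` and `setSprite` are hypothetical stand-ins for the real generation calls and the PhaserJS sprite update:

```typescript
// Illustrative sketch of the two-stage sprite upgrade: the static image lands
// first, and the (slower) video replaces it later. All names are assumptions.
type Sprite = { kind: "static" | "animated"; source: string };

async function upgradeSprite(
  generateImage: () => Promise<string>,
  generateVideo: () => Promise<string>,
  setSprite: (sprite: Sprite) => void
): Promise<void> {
  // Kick off the video generation immediately so it runs in parallel.
  const videoPromise = generateVideo();
  // Show the higher fidelity static image as soon as Imagen returns it.
  setSprite({ kind: "static", source: await generateImage() });
  // When Veo finishes, its extracted frames replace the static image.
  setSprite({ kind: "animated", source: await videoPromise });
}
```

The key design point is that the user never waits on the slowest model: the object is playable with the static image while the animation generates in the background.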
videoPath: external-assets/living-canvas/living-canvas-2.mp4 orientation: landscape frame: laptop layout: sidebyside logs: - timestamp: 6000 summary: Extracting canvas data to send to server imageData: /external-assets/living-canvas/message_asset_2_1.png sequence: - angular - apphosting inspect: angular-apphosting/angular-apphosting-0 - timestamp: 7000 summary: Creating the game object with default properties and waiting for Gemini's analysis sequence: - angular - apphosting inspect: angular-apphosting/angular-apphosting-1 - timestamp: 8000 summary: Uploading user drawing from browser to Firebase App Hosting backend. imageData: /external-assets/living-canvas/message_asset_2_1.png sequence: - angular - apphosting inspect: vertexai-gemini/vertexai-gemini-0 - timestamp: 10000 summary: Gemini response to Vertex AI and Firebase App Hosting. Interpreted as "Heart". detail: | { ... heals: true, ... } imageData: "" sequence: - gemini - vertexai - apphosting - angular inspect: vertexai-gemini/vertexai-gemini-0 - timestamp: 12000 summary: Imagen generation response to Firebase App Hosting. imageData: /external-assets/living-canvas/message_asset_2_2.png sequence: - imagen - vertexai - apphosting - angular inspect: vertexai-imagen - timestamp: 22000 summary: Veo video generation response to Firebase App Hosting. imageData: /external-assets/living-canvas/message_asset_2_x.gif sequence: - veo - geminidevapi - apphosting - angular inspect: apphosting-veo - id: video-3 title: Function interpretation from natural language text with Gemini description: | The user writes a freeform text command that they want to see happen in the game world. We send that command to the server for Gemini to analyse the text and transform it into a JSON representation of the command, which the game engine will then process as a function call.
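The text-to-function step described above can be sketched as follows, assuming Gemini has already returned its `{ "verb": ..., "target": ... }` JSON; the verb table and the "fizzle" fallback are illustrative assumptions, not the actual engine code:

```typescript
// Illustrative sketch: dispatching Gemini's JSON command as a function call.
// The verb handlers and fallback behaviour are assumptions for this sketch.
interface SpellCommand {
  verb: string;
  target: string;
}

const verbHandlers: Record<string, (target: string) => string> = {
  douse: (target) => `doused ${target}`,
  destroy: (target) => `destroyed ${target}`,
};

function castSpell(commandJson: string, worldObjects: string[]): string {
  const command = JSON.parse(commandJson) as SpellCommand;
  const handler = verbHandlers[command.verb];
  // Only predefined verbs and objects currently in the world are valid.
  if (!handler || !worldObjects.includes(command.target)) {
    return "fizzle";
  }
  return handler(command.target);
}
```

Validating the parsed verb and target against known values keeps the model's freeform output from ever invoking anything outside the engine's predefined functions.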
videoPath: external-assets/living-canvas/living-canvas-3.mp4 orientation: landscape frame: laptop layout: sidebyside logs: - timestamp: 8000 summary: Uploading user command to Firebase App Hosting backend with text "douse the fires" sequence: - angular - apphosting - vertexai - gemini inspect: vertexai-gemini/vertexai-gemini-2 - timestamp: 9000 summary: Analysis from Gemini 2.5 Flash on "douse the fires" detail: | { ... verb: "douse", target: "fires", ... } sequence: - gemini - vertexai - apphosting - angular inspect: vertexai-gemini/vertexai-gemini-2 - type: inspect subsections: - id: vertexai-gemini title: Vertex AI and Gemini mode: html examples: - title: Image analysis with Gemini icons: - /external-assets/firebase-apphosting.svg - /external-assets/vertex-ai.svg - /external-assets/gemini.svg mode: markdown info: | When the user’s drawing arrives on the server, we send it to Gemini for a series of analysis steps. Firstly, we ask Gemini if the image matches any of the predefined object and property mappings we already have. These mappings are defined in the `ai-config.json` configuration file and are injected into the `analysis_initialGuess` prompt. We provide predefined mappings and a set of expected objects to ensure a more reliable gameplay experience for the majority of puzzles. For example, we know that a drawing of fire should always result in the burning property being attached to that object, so including this shortcut step makes the analysis pipeline more efficient and reliable for our users. However, if the image does not match any of the predefined object types, then we ask Gemini what it thinks the image is instead via the `analysis_genericGuess` prompt. Gemini returns a one- or two-word description of the drawing. We then give this description to Gemini along with the full set of properties that can be attached in the game.
Using the prompt `analysis_attributesGuess`, we ask Gemini which of these properties make sense for that described object. This enables the player to draw anything they want – regardless of whether it was an object we anticipated in our design – and attaches properties completely dynamically. code: language: javascript file: external-assets/living-canvas/vertexai-gemini-1.ts region_tag: image_to_config links: - url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/helpers/ai-analysis.ts type: github - url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/ai-config.json type: other label: Explore the AI prompts - url: https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal#gemini-text-and-image-samples-nodejs type: other label: Related Gemini docs - title: Multimodal image generation with Gemini and input imagery icons: - /external-assets/firebase-apphosting.svg - /external-assets/vertex-ai.svg - /external-assets/gemini.svg mode: markdown info: | After analysing the user's drawing, we send a request to one of our image generation models to upgrade it into a higher fidelity graphic. In this example, we'll look at the Gemini image generation backend. Gemini 2.0 Flash with image generation enables us to provide multimodal input to the model and receive a generated image as a result. Here we construct the multimodal request by combining the text and base64 image data of the user's drawing. The text prompt is constructed by referencing the type of object we want to create (e.g. fire, magnet) along with information about the visual style we want for the object. We know what the drawing is based on the initial analysis steps handled by Gemini earlier.
The full prompt is constructed as follows: `"Generate an image of a TYPE, centered on a coloured background in a similar 2D side-on view with the following visual style: VISUAL_STYLE."` For reference, the different visual styles are described as follows: * **Realistic**: `"realistic, like 3D renderings from a 3D movie."` * **Cartoon**: `"cartoon, like a comic book. Poppy art style. Strong solid colours."` * **Retro**: `"8-bit retro pixel video game art. Sprites should be very low resolution. As if they are 16 pixels by 16 pixels but resized."` code: language: javascript file: external-assets/living-canvas/vertexai-gemini-3.ts region_tag: image_generation links: - url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/helpers/gemini-generation.ts type: github - url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/ai-config.json type: other label: Explore the AI prompts - url: https://cloud.google.com/vertex-ai/generative-ai/docs/start/quickstarts/quickstart-multimodal#gemini-text-and-image-samples-nodejs type: other label: Related Gemini docs - title: Generating a programmatic command with Gemini icons: - /external-assets/firebase-apphosting.svg - /external-assets/vertex-ai.svg - /external-assets/gemini.svg mode: markdown info: | Another feature of the puzzle experience is the addition of the 'Spell Casting' ability. In addition to being able to draw on the canvas to create objects, we provide the user with a button in the Angular UI which shows a modal dialog. Whatever text the user types into this modal dialog is then sent to the server for analysis and transformation by Gemini. We prompt Gemini to transform the natural language text into a command that the user wants to see happen in the game world. In other words, we cast a spell.
For example, if the user writes “douse all the fires” or “destroy the magnet” then Gemini will transform these commands into a JSON representation, which the game engine can process as a function call. In the case of "destroy the magnet", Gemini will respond with `{ "verb": "destroy", "target": "magnet" }`. The verbs recognized in these commands are predefined functions, but the objects that the command can target are provided to Gemini based on what objects currently exist in the game world. This helps scope the AI’s analysis. code: language: javascript file: external-assets/living-canvas/vertexai-gemini-2.ts region_tag: text_to_command links: - url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/helpers/ai-analysis.ts type: github - url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/ai-config.json type: other label: Explore the AI prompts - id: vertexai-imagen title: Vertex AI and Imagen mode: html examples: - title: Image generation with Imagen 3.0 icons: - /external-assets/firebase-apphosting.svg - /external-assets/vertex-ai.svg - /external-assets/googlecloud.svg - /external-assets/imagen.svg mode: markdown info: | After analysing the user's drawing, we send a request to one of our image generation models to upgrade it into a higher fidelity graphic. In this example, we'll look at the Imagen model. Imagen is one of our best image generation models, and in this code we call the API from Firebase App Hosting via our Google Cloud credentials. The exact prompt is constructed by referencing the type of object we want to create (e.g. fire, magnet) along with information about the visual style we want for the object.
The full prompt is constructed as follows: `"Generate an image of a TYPE, centered on a coloured background in a similar 2D side-on view with the following visual style: VISUAL_STYLE."` For reference, the different visual styles are described as follows: * **Realistic**: `"realistic, like 3D renderings from a 3D movie."` * **Cartoon**: `"cartoon, like a comic book. Poppy art style. Strong solid colours."` * **Retro**: `"8-bit retro pixel video game art. Sprites should be very low resolution. As if they are 16 pixels by 16 pixels but resized."` code: language: javascript file: external-assets/living-canvas/vertexai-imagen.ts region_tag: image_generation links: - url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/helpers/imagen-generation.ts type: github - url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/ai-config.json type: other label: Explore the AI prompts - url: https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview type: other label: Related Imagen docs - id: apphosting-veo title: Firebase App Hosting, Gemini Developer API and Veo mode: html examples: - title: Video generation with Veo and frame extraction for animation icons: - /external-assets/firebase-apphosting.svg - /external-assets/googlecloud.svg - /external-assets/veo.svg mode: markdown info: | Using Veo, we can upgrade an existing image generated via Imagen to an animated version which is swapped out on the PhaserJS object in real-time. Veo is a state-of-the-art multimodal video generation model. Note that we read the existing image into the `imageBuffer` variable and then upload this image along with a text prompt to Veo. Our text prompt is designed to let Veo focus on the image itself but guide the model with some constraints, such as highlighting that we want the animation to be 'gently moving' and that the image should float in place and always be visible, centered and without zoom or cropping.
These subtle suggestions help ensure an optimal result. The video returned from Veo must be either portrait or landscape. After receiving the video file, we use ffmpeg on the server to crop the video to a square and apply the borders used by our other game objects via the Sharp image editing library. View the code on GitHub for more detail on the video frame extraction process. code: language: javascript file: external-assets/living-canvas/vertexai-veo.ts region_tag: video_generation links: - url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/helpers/veo-generation.ts type: github - url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/server/ai-config.json type: other label: Explore the AI prompts - url: https://cloud.google.com/vertex-ai/generative-ai/docs/video/generate-videos type: other label: Related Veo docs - id: angular-apphosting title: Angular and PhaserJS with Firebase App Hosting examples: - title: Extracting canvas data to send to server icons: - /external-assets/chrome.svg - /external-assets/firebase-apphosting.svg mode: markdown info: | The user's drawing is captured on a hidden, transparent canvas element. To prepare the image for upload, we copy the image data to another temporary canvas with a white background and additional padding. The additional padding and white background improve the recognition and understanding of the image data for Gemini's analysis. On the client, we convert the image to base64 before uploading to simplify our pipeline.
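The client-side base64 conversion mentioned above boils down to stripping the data-URL header that `canvas.toDataURL()` produces before the payload is uploaded. A minimal sketch; the helper name is an assumption, not the actual `HiddenCanvas` code:

```typescript
// Illustrative sketch: extract the raw base64 payload from a canvas data URL
// such as "data:image/png;base64,...". The helper name is an assumption.
function dataUrlToBase64(dataUrl: string): string {
  const marker = ";base64,";
  const markerIndex = dataUrl.indexOf(marker);
  if (markerIndex === -1) {
    throw new Error("Expected a base64-encoded data URL");
  }
  // Everything after the marker is the base64 image data to upload.
  return dataUrl.slice(markerIndex + marker.length);
}
```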
code: language: javascript file: external-assets/living-canvas/angular-apphosting-1.ts region_tag: get_base64_data links: - url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/client/src/game/HiddenCanvas.ts#L131 type: github - title: Instantiating the game object based on Gemini's analysis icons: - /external-assets/chrome.svg - /external-assets/firebase-apphosting.svg mode: markdown info: | In this code snippet, we create the PhaserJS game object and place it in the world with all the properties provided by Gemini's analysis of the image. However, this process takes time as we must complete the Gemini analysis on the server and simultaneously request a higher fidelity image from the image generation backend. Therefore, we begin by creating the object with default properties that result in it appearing as a static image with just the user's drawing represented on the object. The graphic fades in and out as we wait on the server's processing. Once the analysis returns from Gemini, we merge the properties on the object and apply them. Later, we receive the upgraded image from the image generation model and update the object's sprite. Finally, the object becomes truly interactive now that all the processing is complete. code: language: javascript file: external-assets/living-canvas/angular-apphosting-3.ts region_tag: process_canvas links: - url: https://github.com/FirebaseExtended/solution-living-canvas/blob/main/client/src/game/LivingCanvas.ts#L965 type: github - type: quiz questions: - title: Why is Angular a useful framework for integrating into a web game such as Living Canvas? answers: - answer: Angular is optimised for game development - answer: Game engines are not typically optimised for UI development correct: true - answer: Angular has a built-in physics engine - title: When deciding which gameplay properties to attach to the user's drawing, why did the Gemini image analysis process involve potentially 3 calls to the API?
answers: - answer: Making 3 calls in a row is faster than a single call - answer: Gemini always requires 3 calls to the API to process an image - answer: Separating the prompts improved the accuracy of the analysis by narrowing the scope of possibilities at each step correct: true - title: Which model was best suited for processing the natural language text into the JSON command structure used to call the game function for the "spell casting" feature? answers: - answer: Imagen - answer: Gemini Flash with image generation - answer: Gemini Flash correct: true - answer: Veo - title: To achieve the best user experience with the animated sprite generation pipeline, which model or combination of models was used? answers: - answer: Gemini and Imagen - answer: Veo and Imagen correct: true - answer: Veo - answer: Gemini - type: build promoType: firebase-studio links: - url: https://github.com/FirebaseExtended/solution-living-canvas label: Open on GitHub - url: https://developers.google.com/solutions label: Explore more Solutions architecture: entities: - id: angular icon: /external-assets/angular.svg label: Angular and PhaserJS x: 0 y: 0 connections: - from: angular to: apphosting inspect: angular-apphosting - id: apphosting icon: /external-assets/firebase-apphosting.svg label: Firebase App Hosting x: 2 y: 0 connections: - from: apphosting to: vertexai - from: apphosting to: geminidevapi inspect: apphosting-veo - id: vertexai icon: /external-assets/vertex-ai.svg label: Vertex AI x: 4 y: 0 connections: - from: vertexai to: gemini inspect: vertexai-gemini - from: vertexai to: imagen inspect: vertexai-imagen - id: geminidevapi icon: /external-assets/gemini-dev-api.svg label: Gemini Developer API x: 4 y: 2 connections: - from: geminidevapi to: veo - id: gemini icon: /external-assets/gemini.svg label: Gemini x: 6 y: -2 - id: imagen icon: /external-assets/imagen.svg label: Imagen x: 6 y: 0 - id: veo icon: /external-assets/veo.svg label: Veo x: 6 y: 2 badges: startBadge:
https://developers.google.com/profile/badges/playlists/solutions/living-canvas/view exploreBadge: https://developers.google.com/profile/badges/playlists/solutions/living-canvas/learn quizBadge: https://developers.google.com/profile/badges/playlists/solutions/living-canvas/quiz buildBadge: https://developers.google.com/profile/badges/playlists/solutions/living-canvas/action