
AI Integration

The AI Integration Microservice serves as a centralized platform designed to manage and simplify integrations with various AI providers. Currently, the microservice is integrated with OpenAI, allowing interaction with its AI capabilities.

How to Create an Assistant

  1. Create an OpenAI Account
  • Visit OpenAI Signup and follow the instructions to create an account.
  2. Create an OpenAI API Key

Screenshot of the Create API Key button in the upper right of the screen

  • Provide a name, select the project the key will belong to, and set the desired permission level (restricted, read-only, or full permissions).

  • Copy and save the API key in a secure location. You will not be able to view the key again, so be sure to store it safely for future use.

  3. Create an Assistant

Screenshot of the assistant creation screen in the OpenAI dashboard

  • Set a name for the assistant.

  • Write detailed system instructions for how the assistant should interact; these instructions must not exceed 256,000 characters.

  • Add the necessary files for the assistant’s knowledge base and activate the File Search tool. The resources required depend on the tools the assistant will use. For example, the code_interpreter tool requires a list of file IDs, while the file_search tool requires a list of vector store IDs. A programmatic sketch of these steps follows the screenshot below.

Screenshot of the assistant configuration panel in the OpenAI dashboard
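
If you prefer to create the assistant programmatically instead of through the dashboard, the sketch below shows roughly how this looks with the official OpenAI Node.js SDK. It is a minimal sketch only: the file name, vector store name, assistant name, instructions, and model are placeholders, and the exact SDK methods may differ between SDK versions.

  import fs from 'node:fs';
  import OpenAI from 'openai';

  // Read the API key created in step 2 from an environment variable
  // instead of hard-coding it in source.
  const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

  async function createAssistant() {
    // Upload a knowledge-base file and put it in a vector store,
    // which is what the file_search tool reads from.
    const file = await openai.files.create({
      file: fs.createReadStream('knowledge-base.pdf'), // placeholder file
      purpose: 'assistants',
    });

    const vectorStore = await openai.beta.vectorStores.create({
      name: 'xr-agent-knowledge', // placeholder name
      file_ids: [file.id],
    });

    // Create the assistant: name, system instructions (up to 256,000
    // characters), the file_search tool, and the vector store it uses.
    const assistant = await openai.beta.assistants.create({
      name: 'XR Guide', // placeholder name
      instructions: 'You are a helpful guide inside an XR scene.',
      model: 'gpt-4o', // placeholder model
      tools: [{ type: 'file_search' }],
      tool_resources: {
        file_search: { vector_store_ids: [vectorStore.id] },
      },
    });

    // The assistant ID (asst_...) is what the XR Editor asks for later.
    console.log(assistant.id);
  }

  createAssistant();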

How to Create Agents in the XR Editor

  1. Create a Project in the XR Editor
  • Open your project in the XR Editor.

  • In the Elements menu, select AI Agent.

Screenshot of the AI Agent option in the Elements menu of the XR Editor

  2. Set Parameters for the Agent

In the Properties Panel of the AI Agent, you can configure the parameters to connect your assistant.

Screenshot of the AI Agent Properties Panel in the XR Editor

  • Provider: Select the service that the agent will use to process user requests. Currently, only OpenAI is supported.

  • OpenAI API Key: Input the secret API key you created earlier. Make sure to encrypt this key before publishing the project.

  • Agent Name: The name that will appear in the chat interface.

  • Assistant ID: The identifier of the OpenAI assistant that the agent will connect to. For OpenAI, it starts with asst_, followed by a unique identifier. You can find this ID below the "Name" input field on the OpenAI assistant page.

  • Voice: Choose the voice strategy for the agent's responses:

    • OpenAI: Uses OpenAI’s voice generation.

    • WebAPI: Uses browser-based voice synthesis. Some browsers do not support this feature, and the available voices can sound robotic in others; Google Chrome typically offers the best performance (see the sketch at the end of this section).

    • Silent: Disables the voice feature. The agent will only communicate through text chat.

  • Emit Events: Enable this to allow the agent to emit events for use in scripting. When enabled, the agent emits the following events: user-enter, user-leave, agent-talk-start, agent-talk-talking, agent-talk-end, and agent-thinking. A combined scripting sketch appears at the end of this section.

    • user-enter: Emitted when a user enters the proximity region of the agent. This event is dispatched only if 'Emit Events' is enabled and proximity is activated in the AI Agent component.

    • user-leave: This is emitted when the user leaves the proximity region, if 'Emit events' is enabled and proximity is being used.

    • agent-talk-start: Emitted once each time the agent starts speaking. When audio is replayed with the play button, it is also emitted whenever the voice synthesis sequence starts.

    • agent-talk-talking: Emitted on every frame while the agent is speaking, carrying the modulated amplitude of the sound wave. This is a float value indicating the audio intensity (the same kind of value the avatars already use to animate their scale while speaking). Handlers subscribed to this event receive that float as a parameter.

      Example:

      bot.addEventListener('agent-talk-talking', (amp) => { ... })

      Here, amp is similar to the value used in MUDz for the scale animation of avatars when they speak; it carries the audio level of the agent's voice.

    • agent-talk-end: This is emitted when the agent stops speaking.

    • agent-thinking: This is emitted when the agent is thinking, i.e., when the dots are shown in the chat.

  • Enable Proximity: This feature triggers events when a user enters or leaves the agent's proximity range. You can also configure the agent to send a message when a user enters or leaves the proximity area.
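
For context on the WebAPI voice option described above, the sketch below shows the browser feature that this kind of synthesis is typically based on, the SpeechSynthesis interface of the Web Speech API. The greeting text is a placeholder, and the XR Editor drives this internally, so you normally do not need to call it yourself.

  // Browser-based voice synthesis as exposed by the Web Speech API.
  // Voice availability and quality vary by browser, which is why some
  // voices can sound robotic or be missing entirely.
  const utterance = new SpeechSynthesisUtterance('Hello, I am your XR agent.'); // placeholder text
  const voices = window.speechSynthesis.getVoices();
  if (voices.length > 0) {
    utterance.voice = voices[0]; // pick any available voice
  }
  window.speechSynthesis.speak(utterance);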
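
Putting the events and proximity options together, a scripting sketch might look like the following. It assumes that bot is a reference to the AI Agent element obtained through the editor's scripting API (as in the example above), that 'Emit Events' and 'Enable Proximity' are both enabled, and that avatarMesh and the scale factor are illustrative placeholders.

  // bot: reference to the AI Agent element (assumed, as in the example above).
  // avatarMesh and the scale factor below are illustrative placeholders.

  bot.addEventListener('user-enter', () => {
    console.log('A user entered the agent proximity region');
  });

  bot.addEventListener('user-leave', () => {
    console.log('A user left the agent proximity region');
  });

  bot.addEventListener('agent-talk-start', () => {
    console.log('The agent started speaking');
  });

  // amp is the float amplitude of the current audio frame; here it drives
  // a simple scale animation, similar to how avatars animate while speaking.
  bot.addEventListener('agent-talk-talking', (amp) => {
    avatarMesh.scale.setScalar(1 + amp * 0.2);
  });

  bot.addEventListener('agent-talk-end', () => {
    avatarMesh.scale.setScalar(1); // reset the scale when speech ends
  });

  bot.addEventListener('agent-thinking', () => {
    console.log('The agent is thinking (typing dots shown in the chat)');
  });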