Improve Workspace agent functions and prompt #14426

Merged
merged 4 commits into from
Nov 12, 2024
Changes from 2 commits
3 changes: 2 additions & 1 deletion packages/ai-workspace-agent/src/browser/frontend-module.ts
@@ -17,12 +17,13 @@ import { ContainerModule } from '@theia/core/shared/inversify';
import { ChatAgent } from '@theia/ai-chat/lib/common';
import { Agent, ToolProvider } from '@theia/ai-core/lib/common';
import { WorkspaceAgent } from './workspace-agent';
import { FileContentFunction, GetWorkspaceFileList } from './functions';
import { FileContentFunction, GetWorkspaceDirectoryStructure, GetWorkspaceFileList } from './functions';

export default new ContainerModule(bind => {
bind(WorkspaceAgent).toSelf().inSingletonScope();
bind(Agent).toService(WorkspaceAgent);
bind(ChatAgent).toService(WorkspaceAgent);
bind(ToolProvider).to(GetWorkspaceFileList);
bind(ToolProvider).to(FileContentFunction);
bind(ToolProvider).to(GetWorkspaceDirectoryStructure);
});
171 changes: 132 additions & 39 deletions packages/ai-workspace-agent/src/browser/functions.ts
@@ -19,11 +19,62 @@ import { inject, injectable } from '@theia/core/shared/inversify';
import { FileService } from '@theia/filesystem/lib/browser/file-service';
import { FileStat } from '@theia/filesystem/lib/common/files';
import { WorkspaceService } from '@theia/workspace/lib/browser';
import { FILE_CONTENT_FUNCTION_ID, GET_WORKSPACE_FILE_LIST_FUNCTION_ID } from '../common/functions';
import { FILE_CONTENT_FUNCTION_ID, GET_WORKSPACE_DIRECTORY_STRUCTURE_FUNCTION_ID, GET_WORKSPACE_FILE_LIST_FUNCTION_ID } from '../common/functions';

function shouldExclude(stat: FileStat): boolean {
const excludedFolders = ['node_modules', 'lib'];
Contributor

I know this was there before, but it'd be better to at least put the function as a method of the tool, so adopters can at least customize this hard-coded list of folders to be excluded.
Ideally it should even be an injectable service, maybe with a default implementation that either provides a commonly useful list or looks into the gitignore.

Contributor Author

Maybe two settings:
consider .gitignore: boolean
ignore directories: String[]
plus an injectable service?

Contributor

Yeah, that would be great plus extra points :-)
To me it'd be important that platform adopters can easily customize the filter list via injection and that there are reasonable defaults for Theia IDE. Configuration options for the Theia IDE user are then extra nice on top.

Contributor Author

OK, did the services and added user settings as a follow-up (#14119)

return stat.resource.path.base.startsWith('.') || excludedFolders.includes(stat.resource.path.base);
}
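
The injectable exclusion service discussed in the review thread could look roughly like the sketch below. The names `WorkspaceExclusionFilter` and `DefaultWorkspaceExclusionFilter` are hypothetical and not part of Theia; the inversify binding is left out so the example stays self-contained, and the default list simply mirrors the hard-coded `excludedFolders` above.

```typescript
// Hypothetical names throughout: this is an illustration of the idea from the
// review thread, not the service that was eventually added to Theia.
interface WorkspaceExclusionFilter {
    /** Returns true if an entry with this base name should be hidden from the agent. */
    shouldExclude(baseName: string): boolean;
}

class DefaultWorkspaceExclusionFilter implements WorkspaceExclusionFilter {
    protected readonly excludedFolders: readonly string[];
    protected readonly excludeHidden: boolean;

    constructor(excludedFolders: readonly string[] = ['node_modules', 'lib'], excludeHidden: boolean = true) {
        this.excludedFolders = excludedFolders;
        this.excludeHidden = excludeHidden;
    }

    shouldExclude(baseName: string): boolean {
        // Hidden entries (dot-prefixed) are skipped only when configured to be.
        if (this.excludeHidden && baseName.startsWith('.')) {
            return true;
        }
        return this.excludedFolders.includes(baseName);
    }
}

// An adopter could supply a different list without touching the tools themselves.
const defaultFilter = new DefaultWorkspaceExclusionFilter();
const customFilter = new DefaultWorkspaceExclusionFilter(['dist', 'target'], false);
```

In a Theia application the default implementation would presumably be bound in a `ContainerModule` and rebound by adopters; that wiring is an assumption and is not shown in this diff.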

@injectable()
export class GetWorkspaceDirectoryStructure implements ToolProvider {
static ID = GET_WORKSPACE_DIRECTORY_STRUCTURE_FUNCTION_ID;

getTool(): ToolRequest {
return {
id: GetWorkspaceDirectoryStructure.ID,
name: GetWorkspaceDirectoryStructure.ID,
description: `Retrieve the complete directory structure of the workspace, listing only directories (no file contents). This structure excludes specific directories,
such as node_modules and hidden files, ensuring paths are within workspace boundaries.`,
handler: () => this.getDirectoryStructure()
};
}

@inject(WorkspaceService)
protected workspaceService: WorkspaceService;

@inject(FileService)
protected readonly fileService: FileService;

private async getDirectoryStructure(): Promise<string[]> {
const wsRoots = await this.workspaceService.roots;

if (wsRoots.length === 0) {
throw new Error('Workspace root not found');
}

const workspaceRootUri = wsRoots[0].resource;
Contributor

By looking only at the first workspace root, we don't really support multi-root workspaces. I think it'd be good to try supporting them here. We could make this explicit to the LLM:

{
  "<path-of-workspace-root[0]>": [ <file1>, ... ],
  "<path-of-workspace-root[1]>": [ <file1>, ... ],
}

And then in the FileContentFunction we'd need the workspace root as a parameter, or request absolute paths again? Maybe that would work?

Contributor Author

I would prefer making "multiple workspaces support" a follow-up, added it here: #14119
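
For reference, the per-root mapping proposed in this thread can be sketched as below. This is only an illustration of the idea: `MultiRootListing`, `listAllRoots`, and `resolveInRoot` are hypothetical names, and `listRoot` stands in for whatever traversal lists a single root.

```typescript
// Hypothetical shape: each workspace root path maps to the entries found beneath it.
type MultiRootListing = Record<string, string[]>;

// `listRoot` is a stand-in for a real traversal (for example one based on FileService).
function listAllRoots(rootPaths: string[], listRoot: (rootPath: string) => string[]): MultiRootListing {
    const listing: MultiRootListing = {};
    for (const rootPath of rootPaths) {
        listing[rootPath] = listRoot(rootPath);
    }
    return listing;
}

// A file request that names its root explicitly, so a relative path is never
// ambiguous between roots.
function resolveInRoot(listing: MultiRootListing, rootPath: string, file: string): string {
    const entries = listing[rootPath];
    if (entries === undefined || !entries.includes(file)) {
        throw new Error(`Unknown file '${file}' in root '${rootPath}'`);
    }
    return `${rootPath}/${file}`;
}

// Made-up sample data for two roots.
const fakeWorkspace: Record<string, string[]> = {
    '/home/user/app': ['package.json', 'src/index.ts'],
    '/home/user/docs': ['README.md']
};
const multiRootListing = listAllRoots(Object.keys(fakeWorkspace), root => fakeWorkspace[root] ?? []);
```

Passing the root alongside the relative path corresponds to the "root as a parameter" option raised above; requesting absolute paths would be the alternative.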


return this.buildDirectoryStructure(workspaceRootUri);
}

private async buildDirectoryStructure(uri: URI, prefix: string = ''): Promise<string[]> {
const stat = await this.fileService.resolve(uri);
const result: string[] = [];

if (stat && stat.isDirectory && stat.children) {
for (const child of stat.children) {
if (!child.isDirectory || shouldExclude(child)) { continue; }
const path = `${prefix}${child.resource.path.base}/`;
result.push(path);
result.push(...await this.buildDirectoryStructure(child.resource, path));
}
}

return result;
}
}
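
To make the output format of `buildDirectoryStructure` concrete, here is a minimal synchronous sketch of the same depth-first traversal over an in-memory tree. `FakeEntry` and the sample data are hypothetical stand-ins for the `FileStat` objects returned by `FileService.resolve`; the exclusion rule mirrors `shouldExclude` above.

```typescript
// Hypothetical in-memory stand-in for a resolved FileStat tree.
interface FakeEntry {
    name: string;
    isDirectory: boolean;
    children?: FakeEntry[];
}

const excludedNames = ['node_modules', 'lib'];

function isExcluded(entry: FakeEntry): boolean {
    return entry.name.startsWith('.') || excludedNames.includes(entry.name);
}

// Depth-first, pre-order: each directory is emitted before its subdirectories,
// and every emitted path ends with a trailing slash.
function buildStructure(dir: FakeEntry, prefix: string = ''): string[] {
    const result: string[] = [];
    for (const child of dir.children ?? []) {
        if (!child.isDirectory || isExcluded(child)) {
            continue;
        }
        const path = `${prefix}${child.name}/`;
        result.push(path);
        result.push(...buildStructure(child, path));
    }
    return result;
}

const sampleRoot: FakeEntry = {
    name: 'workspace', isDirectory: true, children: [
        { name: '.git', isDirectory: true, children: [] },
        { name: 'node_modules', isDirectory: true, children: [] },
        { name: 'README.md', isDirectory: false },
        {
            name: 'packages', isDirectory: true, children: [
                {
                    name: 'core', isDirectory: true, children: [
                        { name: 'src', isDirectory: true, children: [] },
                        { name: 'lib', isDirectory: true, children: [] }
                    ]
                }
            ]
        }
    ]
};

const directoryStructure = buildStructure(sampleRoot);
// directoryStructure: ['packages/', 'packages/core/', 'packages/core/src/']
```

Files, hidden directories, `node_modules`, and `lib` never appear in the result, so the list stays small even for large workspaces.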

/**
* A Function that can read the contents of a File from the Workspace.
*/
@injectable()
export class FileContentFunction implements ToolProvider {
static ID = FILE_CONTENT_FUNCTION_ID;
@@ -32,13 +83,15 @@ export class FileContentFunction implements ToolProvider {
return {
id: FileContentFunction.ID,
name: FileContentFunction.ID,
description: 'Get the content of the file',
description: `Return the content of a specified file within the workspace. The file path must be provided relative to the workspace root. Only files within
workspace boundaries are accessible; attempting to access files outside the workspace will return an error.`,
parameters: {
type: 'object',
properties: {
file: {
type: 'string',
description: 'The path of the file to retrieve content for',
description: `The relative path to the target file within the workspace. This path is resolved from the workspace root, and only files within the workspace boundaries
are accessible. Attempting to access paths outside the workspace will result in an error.`,
}
}
},
@@ -61,15 +114,36 @@ export class FileContentFunction implements ToolProvider {
}

private async getFileContent(file: string): Promise<string> {
const uri = new URI(file);
const fileContent = await this.fileService.read(uri);
return fileContent.value;
const wsRoots = await this.workspaceService.roots;

if (wsRoots.length === 0) {
throw new Error('Workspace root not found');
}

const workspaceRootUri = wsRoots[0].resource;

const targetUri = workspaceRootUri.resolve(file);

if (!targetUri.toString().startsWith(workspaceRootUri.toString())) {
throw new Error('Access outside of the workspace is not allowed');
}

try {
const fileStat = await this.fileService.resolve(targetUri);

if (!fileStat || fileStat.isDirectory) {
return JSON.stringify({ error: 'File not found' });
}

const fileContent = await this.fileService.read(targetUri);
return fileContent.value;

} catch (error) {
return JSON.stringify({ error: 'File not found' });
}
}
}
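
The workspace boundary check above compares string prefixes. The sketch below uses plain POSIX-style path strings, not Theia's `URI` API, to show one edge case such a check can have: a sibling directory whose name merely starts with the root's name. Whether this applies to the code in the diff depends on how `URI.resolve` and `toString` behave, which the diff does not show; all helper names here are hypothetical.

```typescript
// Generic path helpers for illustration only; not Theia's URI API.
function normalizePath(path: string): string {
    const segments: string[] = [];
    for (const segment of path.split('/')) {
        if (segment === '' || segment === '.') {
            continue;
        }
        if (segment === '..') {
            segments.pop();
        } else {
            segments.push(segment);
        }
    }
    return '/' + segments.join('/');
}

function resolveFromRoot(root: string, relative: string): string {
    return normalizePath(`${root}/${relative}`);
}

// Plain prefix test, analogous to a startsWith comparison on full strings.
function isInsideByPrefix(root: string, target: string): boolean {
    return target.startsWith(root);
}

// Segment-aware test: the target must be the root itself or lie below `root/`.
function isInsideBySegments(root: string, target: string): boolean {
    const normalizedRoot = normalizePath(root);
    const prefix = normalizedRoot === '/' ? '/' : normalizedRoot + '/';
    return target === normalizedRoot || target.startsWith(prefix);
}

const boundaryRoot = '/home/user/ws';
const insideTarget = resolveFromRoot(boundaryRoot, 'src/index.ts');
const siblingTarget = resolveFromRoot(boundaryRoot, '../ws-backup/secret.txt');
const parentTarget = resolveFromRoot(boundaryRoot, '../../etc/passwd');
```

Here `siblingTarget` resolves to `/home/user/ws-backup/secret.txt`, which passes the plain prefix test but fails the segment-aware one; `insideTarget` passes both and `parentTarget` fails both.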

/**
* A Function that lists all files in the workspace.
*/
@injectable()
export class GetWorkspaceFileList implements ToolProvider {
static ID = GET_WORKSPACE_FILE_LIST_FUNCTION_ID;
@@ -78,9 +152,22 @@ export class GetWorkspaceFileList implements ToolProvider {
return {
id: GetWorkspaceFileList.ID,
name: GetWorkspaceFileList.ID,
description: 'List all files in the workspace',

handler: () => this.getProjectFileList()
parameters: {
type: 'object',
properties: {
path: {
type: 'string',
description: `Optional relative path to a directory within the workspace. If no path is specified, the function lists contents directly in the workspace
root. Paths are resolved within workspace boundaries only; paths outside the workspace or unvalidated paths will result in an error.`
}
}
},
description: `List files and directories within a specified workspace directory. Paths are relative to the workspace root, and only workspace-contained paths are
allowed. If no path is provided, the root contents are listed. Paths outside the workspace will result in an error.`,
handler: (arg_string: string) => {
const args = JSON.parse(arg_string);
return this.getProjectFileList(args.path);
}
};
}
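
The handler above passes `arg_string` straight to `JSON.parse`. The diff does not show whether the framework can ever deliver an empty or malformed argument string, so the following is only a defensive sketch under that assumption; `ListArgs` and `parseListArgs` are hypothetical names.

```typescript
// Hypothetical argument shape for a tool with one optional string parameter.
interface ListArgs {
    path?: string;
}

// Parses the raw argument string defensively: empty input means "no arguments",
// and a non-string `path` is rejected instead of being passed along.
function parseListArgs(argString: string | undefined): ListArgs {
    if (argString === undefined || argString.trim() === '') {
        return {};
    }
    const parsed: unknown = JSON.parse(argString); // throws SyntaxError on malformed JSON
    if (typeof parsed !== 'object' || parsed === null) {
        throw new Error('Tool arguments must be a JSON object');
    }
    const path = (parsed as { path?: unknown }).path;
    if (path === undefined) {
        return {};
    }
    if (typeof path !== 'string') {
        throw new Error('"path" must be a string');
    }
    return { path };
}

const withPath = parseListArgs('{"path": "packages/core"}');
const withoutPath = parseListArgs('{}');
const fromEmpty = parseListArgs('');
```

With a helper like this, the handler could call `this.getProjectFileList(parseListArgs(arg_string).path)` and still list the workspace root when the model omits the optional parameter.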

@@ -90,45 +177,51 @@ export class GetWorkspaceFileList implements ToolProvider {
@inject(FileService)
protected readonly fileService: FileService;

async getProjectFileList(): Promise<string[]> {
// Get all files from the workspace service as a flat list of qualified file names
async getProjectFileList(path?: string): Promise<string[]> {
const wsRoots = await this.workspaceService.roots;
const result: string[] = [];
for (const root of wsRoots) {
result.push(...await this.listFilesRecursively(root.resource));

if (wsRoots.length === 0) {
throw new Error('Workspace root not found');
}

const workspaceRootUri = wsRoots[0].resource;
const targetUri = path ? workspaceRootUri.resolve(path) : workspaceRootUri;

if (!targetUri.toString().startsWith(workspaceRootUri.toString())) {
throw new Error('Access outside of the workspace is not allowed');
}

try {
const stat = await this.fileService.resolve(targetUri);
if (!stat || !stat.isDirectory) {
return ['Error: Directory not found'];
}
return await this.listFilesDirectly(targetUri, workspaceRootUri);

} catch (error) {
return ['Error: Directory not found'];
}
return result;
}

private async listFilesRecursively(uri: URI): Promise<string[]> {
private async listFilesDirectly(uri: URI, workspaceRootUri: URI): Promise<string[]> {
const stat = await this.fileService.resolve(uri);
const result: string[] = [];

if (stat && stat.isDirectory) {
if (this.exclude(stat)) {
if (shouldExclude(stat)) {
return result;
}
const children = await this.fileService.resolve(uri);
if (children.children) {
for (const child of children.children) {
result.push(child.resource.toString());
result.push(...await this.listFilesRecursively(child.resource));
const relativePath = workspaceRootUri.relative(child.resource);
if (relativePath) {
result.push(relativePath.toString());
}
}
}
}
return result;
}

// Exclude folders which are not relevant to the AI Agent
private exclude(stat: FileStat): boolean {
if (stat.resource.path.base.startsWith('.')) {
return true;
}
if (stat.resource.path.base === 'node_modules') {
return true;
}
if (stat.resource.path.base === 'lib') {
return true;
}
return false;
return result;
}
}
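
The change from `listFilesRecursively` to `listFilesDirectly` replaces a flat recursive list of absolute URIs with the direct children of one directory, reported relative to the workspace root. The sketch below illustrates that behavior on plain path strings; `relativeTo`, `listDirectly`, and the sample paths are hypothetical and do not use Theia's `URI.relative`.

```typescript
// Path of `child` relative to `root`, or undefined if `child` is not below `root`.
function relativeTo(root: string, child: string): string | undefined {
    const prefix = root.endsWith('/') ? root : root + '/';
    return child.startsWith(prefix) ? child.slice(prefix.length) : undefined;
}

// Lists only the direct children of `dir`, each reported relative to `root`.
function listDirectly(root: string, dir: string, allPaths: string[]): string[] {
    const dirPrefix = dir.endsWith('/') ? dir : dir + '/';
    const result: string[] = [];
    for (const candidate of allPaths) {
        if (!candidate.startsWith(dirPrefix)) {
            continue;
        }
        const rest = candidate.slice(dirPrefix.length);
        // Skip the directory itself and anything nested more than one level deep.
        if (rest === '' || rest.includes('/')) {
            continue;
        }
        const relative = relativeTo(root, candidate);
        if (relative !== undefined) {
            result.push(relative);
        }
    }
    return result;
}

// Made-up workspace contents.
const samplePaths = [
    '/ws/package.json',
    '/ws/src',
    '/ws/src/index.ts',
    '/ws/src/util',
    '/ws/src/util/a.ts'
];

const rootListing = listDirectly('/ws', '/ws', samplePaths);
const srcListing = listDirectly('/ws', '/ws/src', samplePaths);
// rootListing: ['package.json', 'src']
// srcListing: ['src/index.ts', 'src/util']
```

Because every result is relative to the workspace root, an entry returned here can be passed back unchanged as the `path` or `file` argument of the other tools.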
1 change: 1 addition & 0 deletions packages/ai-workspace-agent/src/common/functions.ts
@@ -15,3 +15,4 @@
// *****************************************************************************
export const FILE_CONTENT_FUNCTION_ID = 'getFileContent';
export const GET_WORKSPACE_FILE_LIST_FUNCTION_ID = 'getWorkspaceFileList';
export const GET_WORKSPACE_DIRECTORY_STRUCTURE_FUNCTION_ID = 'getWorkspaceDirectoryStructure';
51 changes: 16 additions & 35 deletions packages/ai-workspace-agent/src/common/template.ts
@@ -14,50 +14,31 @@
// SPDX-License-Identifier: EPL-2.0 OR GPL-2.0-only WITH Classpath-exception-2.0
// *****************************************************************************
import { PromptTemplate } from '@theia/ai-core/lib/common';
import { GET_WORKSPACE_FILE_LIST_FUNCTION_ID, FILE_CONTENT_FUNCTION_ID } from './functions';
import { GET_WORKSPACE_FILE_LIST_FUNCTION_ID, FILE_CONTENT_FUNCTION_ID, GET_WORKSPACE_DIRECTORY_STRUCTURE_FUNCTION_ID } from './functions';

export const workspaceTemplate = <PromptTemplate>{
id: 'workspace-system',
template: `# Instructions

You are an AI assistant integrated into the Theia IDE, specifically designed to help software developers by
providing concise and accurate answers to programming-related questions. Your role is to enhance the
developer's productivity by offering quick solutions, explanations, and best practices.
Keep responses short and to the point, focusing on delivering valuable insights, best practices and
simple solutions.
You are specialized in providing insights based on the Theia IDE's workspace and its files.
Use the following functions to access the workspace:
- ~{${GET_WORKSPACE_FILE_LIST_FUNCTION_ID}}
- ~{${FILE_CONTENT_FUNCTION_ID}}. Never shorten the file paths when using this function.
You are an AI assistant integrated into Theia IDE, designed to assist software developers with concise answers to programming-related questions. Your goal is to enhance
productivity with quick, relevant solutions, explanations, and best practices. Keep responses short, delivering valuable insights and direct solutions.

## Guidelines
Use the following functions to interact with the workspace files as needed:
- **~{${GET_WORKSPACE_DIRECTORY_STRUCTURE_FUNCTION_ID}}**: Returns the complete directory structure.
- **~{${GET_WORKSPACE_FILE_LIST_FUNCTION_ID}}**: Lists files and directories in a specific directory.
- **~{${FILE_CONTENT_FUNCTION_ID}}**: Retrieves the content of a specific file.

1. **Understand Context:**
- **Always answer in context of the workspace and its files. Avoid general answers**.
- Use the provided functions to access the workspace files. **Never assume the workspace structure or file contents.**
- Tailor responses to be relevant to the programming language, framework, or tools like Eclipse Theia used in the workspace.
- Ask clarifying questions if necessary to provide accurate assistance. Always assume it is okay to read additional files from the workspace.
### Workspace Navigation Guidelines

2. **Provide Clear Solutions:**
- Offer direct answers or code snippets that solve the problem or clarify the concept.
- Avoid lengthy explanations unless necessary for understanding.
- Provide links to official documentation for further reading when applicable.
1. **Confirm Paths**: Always verify paths by listing directories or files as you navigate. Avoid assumptions based on user input alone.
2. **Start from Root**: Begin at the root and navigate subdirectories step-by-step.

3. **Support Multiple Languages and Tools:**
- Be familiar with popular programming languages, frameworks, IDEs like Eclipse Theia, and command-line tools.
- Adapt advice based on the language, environment, or tools specified by the developer.
### Response Guidelines

4. **Facilitate Learning:**
- Encourage learning by explaining why a solution works or why a particular approach is recommended.
- Keep explanations concise and educational.

5. **Maintain Professional Tone:**
- Communicate in a friendly, professional manner.
- Use technical jargon appropriately, ensuring clarity for the target audience.

6. **Stay on Topic:**
- Limit responses strictly to topics related to software development, frameworks, Eclipse Theia, terminal usage, and relevant technologies.
- Politely decline to answer questions unrelated to these areas by saying, "I'm here to assist with programming-related questions.
For other topics, please refer to a specialized source."
1. **Contextual Focus**: Provide answers relevant to the workspace, avoiding general advice. Use provided functions without assuming file structure or content.
2. **Clear Solutions**: Offer direct answers and concise explanations. Link to official documentation as needed.
3. **Tool & Language Adaptability**: Adjust guidance based on the programming language, framework, or tool specified by the developer.
4. **Supportive Tone**: Maintain a friendly, professional tone with clear, accurate technical language.
5. **Stay Relevant**: Limit responses to software development, frameworks, Theia, terminal usage, and related technologies. Decline unrelated questions politely.
`
};
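
The template mixes two kinds of braces: `${...}` is ordinary JavaScript interpolation of the function ID constants, while the surrounding `~{` and `}` remain in the resulting string as markers. The sketch below shows the resulting text; how Theia resolves those markers at runtime is not shown in this diff, so `referencedFunctionIds` is a purely illustrative helper and the constants are local copies of the IDs from `functions.ts`.

```typescript
// Local copies of the IDs defined in common/functions.ts.
const FILE_LIST_ID = 'getWorkspaceFileList';
const FILE_CONTENT_ID = 'getFileContent';

// `${...}` is evaluated when this expression runs; `~{` and `}` are plain characters.
const promptFragment = `Use the following functions:
- **~{${FILE_LIST_ID}}**: Lists files and directories in a specific directory.
- **~{${FILE_CONTENT_ID}}**: Retrieves the content of a specific file.`;

// Illustrative helper: collects the IDs referenced via ~{...} in a template string.
function referencedFunctionIds(template: string): string[] {
    const ids: string[] = [];
    const pattern = /~\{([^}]+)\}/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(template)) !== null) {
        ids.push(match[1]);
    }
    return ids;
}

const referencedIds = referencedFunctionIds(promptFragment);
// referencedIds: ['getWorkspaceFileList', 'getFileContent']
```

After interpolation the fragment contains the literal text `~{getWorkspaceFileList}` and `~{getFileContent}`, which is what the prompt consumer sees.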