|
12 | 12 | - [Advanced Setup](#-advanced-setup) |
13 | 13 | - [Development](#-development) |
14 | 14 | - [Testing LLM Agents](#-testing-llm-agents) |
| 15 | +- [Function Testing with ftester](#-function-testing-with-ftester) |
15 | 16 | - [Building](#%EF%B8%8F-building) |
16 | 17 | - [Credits](#-credits) |
17 | 18 | - [License](#-license) |
@@ -808,6 +809,243 @@ simple_json: |
808 | 809 |
|
809 | 810 | This tool helps ensure your AI agents are using the most effective models for their specific tasks, improving reliability while optimizing costs. |
810 | 811 |
|
| 812 | +## 🔍 Function Testing with ftester |
| 813 | + |
| 814 | +PentAGI includes `ftester`, a utility for debugging, testing, and developing individual functions and AI agent behaviors. Whereas `ctester` tests LLM capabilities, `ftester` lets you invoke individual system functions and AI agent components directly, with precise control over the execution context. |
| 815 | + |
| 816 | +### Key Features |
| 817 | + |
| 818 | +- **Direct Function Access**: Test individual functions without running the entire system |
| 819 | +- **Mock Mode**: Test functions without a live PentAGI deployment using built-in mocks |
| 820 | +- **Interactive Input**: Fill function arguments interactively for exploratory testing |
| 821 | +- **Detailed Output**: Color-coded terminal output with formatted responses and errors |
| 822 | +- **Context-Aware Testing**: Debug AI agents within the context of specific flows, tasks, and subtasks |
| 823 | +- **Observability Integration**: All function calls are logged to Langfuse and the observability stack |
| 824 | + |
| 825 | +### Usage Modes |
| 826 | + |
| 827 | +#### Command Line Arguments |
| 828 | + |
| 829 | +Run `ftester` with a specific function and its arguments directly from the command line: |
| 830 | + |
| 831 | +```bash |
| 832 | +# Basic usage with mock mode |
| 833 | +cd backend |
| 834 | +go run cmd/ftester/main.go [function_name] -[arg1] [value1] -[arg2] [value2] |
| 835 | + |
| 836 | +# Example: Test terminal command in mock mode |
| 837 | +go run cmd/ftester/main.go terminal -command "ls -la" -message "List files" |
| 838 | + |
| 839 | +# Using a real flow context |
| 840 | +go run cmd/ftester/main.go -flow 123 terminal -command "whoami" -message "Check user" |
| 841 | + |
| 842 | +# Testing AI agent in specific task/subtask context |
| 843 | +go run cmd/ftester/main.go -flow 123 -task 456 -subtask 789 pentester -message "Find vulnerabilities" |
| 844 | +``` |
| 845 | + |
| 846 | +#### Interactive Mode |
| 847 | + |
| 848 | +Run `ftester` with a function name but no function arguments to be prompted for each argument interactively: |
| 849 | + |
| 850 | +```bash |
| 851 | +# Start interactive mode |
| 852 | +go run cmd/ftester/main.go [function_name] |
| 853 | + |
| 854 | +# For example, to interactively fill browser tool arguments |
| 855 | +go run cmd/ftester/main.go browser |
| 856 | +``` |
| 857 | + |
| 858 | +<details> |
| 859 | +<summary><b>Available Functions</b> (click to expand)</summary> |
| 860 | + |
| 861 | +### Environment Functions |
| 862 | +- **terminal**: Execute commands in a container and return the output |
| 863 | +- **file**: Perform file operations (read, write, list) in a container |
| 864 | + |
| 865 | +### Search Functions |
| 866 | +- **browser**: Access websites and capture screenshots |
| 867 | +- **google**: Search the web using Google Custom Search |
| 868 | +- **duckduckgo**: Search the web using DuckDuckGo |
| 869 | +- **tavily**: Search using Tavily AI search engine |
| 870 | +- **traversaal**: Search using Traversaal AI search engine |
| 871 | +- **perplexity**: Search using Perplexity AI |
| 872 | + |
| 873 | +### Vector Database Functions |
| 874 | +- **search_in_memory**: Search for information in vector database |
| 875 | +- **search_guide**: Find guidance documents in vector database |
| 876 | +- **search_answer**: Find answers to questions in vector database |
| 877 | +- **search_code**: Find code examples in vector database |
| 878 | + |
| 879 | +### AI Agent Functions |
| 880 | +- **advice**: Get expert advice from an AI agent |
| 881 | +- **coder**: Request code generation or modification |
| 882 | +- **maintenance**: Run system maintenance tasks |
| 883 | +- **memorist**: Store and organize information in vector database |
| 884 | +- **pentester**: Perform security tests and vulnerability analysis |
| 885 | +- **search**: Complex search across multiple sources |
| 886 | + |
| 887 | +### Utility Functions |
| 888 | +- **describe**: Show information about flows, tasks, and subtasks |
| 889 | + |
| 890 | +</details> |
| 891 | + |
| 892 | +<details> |
| 893 | +<summary><b>Debugging Flow Context</b> (click to expand)</summary> |
| 894 | + |
| 895 | +The `describe` function provides detailed information about tasks and subtasks within a flow. This is particularly useful for diagnosing a flow that has failed or stalled. |
| 896 | + |
| 897 | +```bash |
| 898 | +# List all flows in the system |
| 899 | +go run cmd/ftester/main.go describe |
| 900 | + |
| 901 | +# Show all tasks and subtasks for a specific flow |
| 902 | +go run cmd/ftester/main.go -flow 123 describe |
| 903 | + |
| 904 | +# Show detailed information for a specific task |
| 905 | +go run cmd/ftester/main.go -flow 123 -task 456 describe |
| 906 | + |
| 907 | +# Show detailed information for a specific subtask |
| 908 | +go run cmd/ftester/main.go -flow 123 -task 456 -subtask 789 describe |
| 909 | + |
| 910 | +# Show verbose output with full descriptions and results |
| 911 | +go run cmd/ftester/main.go -flow 123 describe -verbose |
| 912 | +``` |
| 913 | + |
| 914 | +Use this output to identify the exact point where a flow is stuck, then resume processing by invoking the appropriate agent function directly. |
| 915 | + |
| 916 | +</details> |
| 917 | + |
| 918 | +<details> |
| 919 | +<summary><b>Function Help and Discovery</b> (click to expand)</summary> |
| 920 | + |
| 921 | +Each function has a help mode that shows available parameters: |
| 922 | + |
| 923 | +```bash |
| 924 | +# Get help for a specific function |
| 925 | +go run cmd/ftester/main.go [function_name] -help |
| 926 | + |
| 927 | +# Examples: |
| 928 | +go run cmd/ftester/main.go terminal -help |
| 929 | +go run cmd/ftester/main.go browser -help |
| 930 | +go run cmd/ftester/main.go describe -help |
| 931 | +``` |
| 932 | + |
| 933 | +You can also run ftester without arguments to see a list of all available functions: |
| 934 | + |
| 935 | +```bash |
| 936 | +go run cmd/ftester/main.go |
| 937 | +``` |
| 938 | + |
| 939 | +</details> |
| 940 | + |
| 941 | +<details> |
| 942 | +<summary><b>Output Format</b> (click to expand)</summary> |
| 943 | + |
| 944 | +The `ftester` utility uses color-coded output to make interpretation easier: |
| 945 | + |
| 946 | +- **Blue headers**: Section titles and key names |
| 947 | +- **Cyan [INFO]**: General information messages |
| 948 | +- **Green [SUCCESS]**: Successful operations |
| 949 | +- **Red [ERROR]**: Error messages |
| 950 | +- **Yellow [WARNING]**: Warning messages |
| 951 | +- **Yellow [MOCK]**: Indicates mock mode operation |
| 952 | +- **Magenta values**: Function arguments and results |
| 953 | + |
| 954 | +JSON and Markdown responses are automatically formatted for readability. |
| 955 | + |
| 956 | +</details> |
| 957 | + |
| 958 | +<details> |
| 959 | +<summary><b>Advanced Usage Scenarios</b> (click to expand)</summary> |
| 960 | + |
| 961 | +### Debugging Stuck AI Flows |
| 962 | + |
| 963 | +When PentAGI gets stuck in a flow: |
| 964 | + |
| 965 | +1. Pause the flow through the UI |
| 966 | +2. Use `describe` to identify the current task and subtask |
| 967 | +3. Directly invoke the agent function with the same task/subtask IDs |
| 968 | +4. Examine the detailed output to identify the issue |
| 969 | +5. Resume the flow or manually intervene as needed |
| 970 | + |
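Steps 2 and 3 above can be sketched as a pair of commands. The helper below is a hypothetical convenience function, not part of `ftester`; it only prints the two invocations (using flags shown elsewhere in this section) so they can be reviewed before being run from the `backend` directory. The IDs, agent name, and message are placeholders.

```bash
# Hypothetical helper (not part of ftester): print the commands for steps 2 and 3.
# Usage: stuck_flow_cmds <flow_id> <task_id> <subtask_id> <agent> <message>
stuck_flow_cmds() {
  local flow="$1" task="$2" subtask="$3" agent="$4" message="$5"
  # Step 2: list the flow's tasks and subtasks with full descriptions and results
  echo "go run cmd/ftester/main.go -flow $flow describe -verbose"
  # Step 3: invoke the agent in the same task/subtask context
  echo "go run cmd/ftester/main.go -flow $flow -task $task -subtask $subtask $agent -message \"$message\""
}

# Placeholder IDs; take the real ones from the describe output
stuck_flow_cmds 123 456 789 pentester "Find vulnerabilities"
```

Printing rather than executing keeps the sketch safe to try against a production deployment; copy the printed lines once the IDs look right.
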
| 971 | +### Testing Environment Variables |
| 972 | + |
| 973 | +Verify that API keys and external services are configured correctly: |
| 974 | + |
| 975 | +```bash |
| 976 | +# Test Google search API configuration |
| 977 | +go run cmd/ftester/main.go google -query "pentesting tools" |
| 978 | + |
| 979 | +# Test browser access to external websites |
| 980 | +go run cmd/ftester/main.go browser -url "https://example.com" |
| 981 | +``` |
| 982 | + |
| 983 | +### Developing New AI Agent Behaviors |
| 984 | + |
| 985 | +When developing new prompt templates or agent behaviors: |
| 986 | + |
| 987 | +1. Create a test flow in the UI |
| 988 | +2. Use ftester to directly invoke the agent with different prompts |
| 989 | +3. Observe responses and adjust prompts accordingly |
| 990 | +4. Check Langfuse for detailed traces of all function calls |
| 991 | + |
| 992 | +### Verifying Docker Container Setup |
| 993 | + |
| 994 | +Ensure containers are properly configured: |
| 995 | + |
| 996 | +```bash |
| 997 | +go run cmd/ftester/main.go -flow 123 terminal -command "env | grep -i proxy" -message "Check proxy settings" |
| 998 | +``` |
| 999 | + |
| 1000 | +</details> |
| 1001 | + |
| 1002 | +<details> |
| 1003 | +<summary><b>Docker Container Usage</b> (click to expand)</summary> |
| 1004 | + |
| 1005 | +If you have PentAGI running in Docker, you can use ftester from within the container: |
| 1006 | + |
| 1007 | +```bash |
| 1008 | +# Run ftester inside the running PentAGI container |
| 1009 | +docker exec -it pentagi /opt/pentagi/bin/ftester [arguments] |
| 1010 | + |
| 1011 | +# Examples: |
| 1012 | +docker exec -it pentagi /opt/pentagi/bin/ftester -flow 123 describe |
| 1013 | +docker exec -it pentagi /opt/pentagi/bin/ftester -flow 123 terminal -command "ps aux" -message "List processes" |
| 1014 | +``` |
| 1015 | + |
| 1016 | +This is particularly useful for production deployments where you don't have a local development environment. |
| 1017 | + |
| 1018 | +</details> |
| 1019 | + |
| 1020 | +<details> |
| 1021 | +<summary><b>Integration with Observability Tools</b> (click to expand)</summary> |
| 1022 | + |
| 1023 | +All function calls made through ftester are logged to: |
| 1024 | + |
| 1025 | +1. **Langfuse**: Captures the entire AI agent interaction chain, including prompts, responses, and function calls |
| 1026 | +2. **OpenTelemetry**: Records metrics, traces, and logs for system performance analysis |
| 1027 | +3. **Terminal Output**: Provides immediate feedback on function execution |
| 1028 | + |
| 1029 | +To access detailed logs: |
| 1030 | + |
| 1031 | +- Check Langfuse UI for AI agent traces (typically at `http://localhost:4000`) |
| 1032 | +- Use Grafana dashboards for system metrics (typically at `http://localhost:3000`) |
| 1033 | +- Examine terminal output for immediate function results and errors |
| 1034 | + |
| 1035 | +</details> |
| 1036 | + |
| 1037 | +### Command-line Options |
| 1038 | + |
| 1039 | +`ftester` accepts the following global options, placed before the function name: |
| 1040 | + |
| 1041 | +- `-env <path>` - Path to environment file (optional, default: `.env`) |
| 1042 | +- `-provider <type>` - Provider type to use (default: `custom`, options: `openai`, `anthropic`, `custom`) |
| 1043 | +- `-flow <id>` - Flow ID for testing (`0` uses mocks; default: `0`) |
| 1044 | +- `-task <id>` - Task ID for agent context (optional) |
| 1045 | +- `-subtask <id>` - Subtask ID for agent context (optional) |
| 1046 | + |
| 1047 | +Function-specific arguments are passed after the function name using `-name value` format. |
| 1048 | + |
811 | 1049 | ## 🏗️ Building |
812 | 1050 |
|
813 | 1051 | ### Building Docker Image |
|