konradodwrot/ai_sandbox_containerized

Containerized AI Sandbox Development Environment

A containerized development environment designed to balance convenience and safety.

Features

  • Network isolation - The primary workspace container sits in a private network with no direct internet access. HTTPS (443) and SSH (22) traffic goes through a Squid proxy with domain allowlisting that is configured outside the AI agent's reach. The proxy is attached to two networks, private and public, which gives the private workspace node indirect internet access. Unsafe traffic, for example plain HTTP, is rejected by the proxy.
  • Filesystem isolation - The ./projects directory is mounted from the host at /root/workspace inside the container. The source host directory is configurable via the WORKSPACE_MOUNT environment variable.
  • Dev Container - Open the container filesystem, or a chosen directory within the container, using the VSCode Dev Containers extension. Network traffic from the integrated terminal is proxied and filtered as well.
  • Statefulness - Data you work on inside the container is retained in the persistent named volume workspace unless you run make cleanup.
  • Tested Functionality - Setup and functionality of the sandbox are validated in CI for Apple Silicon (darwin, arm64) and Linux (amd64).
  • Portability - Supported architectures: AArch64 (aka ARM64), x86-64 (aka x64, x86_64, AMD64, Intel 64). Supported operating systems: macOS, Linux, Windows (WSL).
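
The WORKSPACE_MOUNT fallback from the filesystem isolation item can be pictured with the small sketch below. It only illustrates the conventional "use the variable, else the default" pattern; how the compose file in this repository actually wires the variable is not shown here, and workspace_mount_spec is a hypothetical helper name.

```shell
# Build the bind-mount spec "<host dir>:<container dir>".
# Falls back to ./projects when WORKSPACE_MOUNT is unset or empty
# (assumed behaviour, mirroring the default described above).
workspace_mount_spec() {
    printf '%s:/root/workspace\n' "${WORKSPACE_MOUNT:-./projects}"
}
```

With WORKSPACE_MOUNT unset the function prints the default ./projects mapping; with WORKSPACE_MOUNT=/tmp/code it prints a mapping from /tmp/code instead.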

Quick Start

  1. Install requirements:

    • docker
    • docker compose
    • GNU make
  2. Set up environment variables: make prepare-env creates env/workspace.env from the template; fill it in with your config values.

  3. Generate SSH keys, SSH config and git config with make prepare-ssh GIT_USERNAME='<git_username>' GIT_EMAIL='<git_email>', then register the public key configs/.ssh/id_ed25519.pub with GitHub.

  4. Build and run the workspace container and proxy: make up

  5. Open a shell inside the workspace container: make shell

  6. Optional: using VSCode and the Dev Containers extension, you can open a directory directly in the container.
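
Before running the make targets it can help to confirm that the requirements from step 1 are on the PATH. A minimal sketch, where missing_tools is a hypothetical helper and not part of this repository:

```shell
# Print each command name from the arguments that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call: prints nothing when both tools are installed.
# missing_tools docker make
```

docker compose is a subcommand of docker rather than a separate binary, so it is checked separately with docker compose version.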

Commands

Useful commands are defined as Makefile targets; run make help to list them with their descriptions.

  • make build
  • make up
  • make down
  • make clean
  • make shell
  • make logs-proxy
  • make setup

DNS Whitelisting

Requests are denied unless the requested domain is present in allowed-sites.txt. Both SSH and HTTPS traffic are filtered.
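
To reason about which hosts an entry covers, the matching can be approximated locally. The sketch below assumes allowed-sites.txt holds one domain per line and is consumed as a Squid dstdomain-style list, where a leading dot (as in .google.com) matches the domain itself and all of its subdomains. is_allowed is a hypothetical helper that imitates that rule; it is not Squid's implementation.

```shell
# Return 0 if domain $1 matches an entry in allowlist file $2.
# ".example.com" matches example.com and every subdomain;
# "example.com" (no leading dot) matches only that exact host.
is_allowed() {
    domain="$1"
    while IFS= read -r entry || [ -n "$entry" ]; do
        case "$entry" in
            ''|'#'*) continue ;;            # skip blank lines and comments
        esac
        case "$entry" in
            .*)
                bare="${entry#.}"           # entry without the leading dot
                case "$domain" in
                    "$bare"|*".$bare") return 0 ;;
                esac
                ;;
            *)
                [ "$domain" = "$entry" ] && return 0
                ;;
        esac
    done < "$2"
    return 1
}
```

For example, is_allowed www.google.com allowed-sites.txt succeeds when .google.com is listed, while a lookalike such as notgoogle.com does not match that entry.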

Demo

  1. HTTP traffic is always blocked, even if the domain is whitelisted
root@3cec72c52b4a:~/workspace# curl -I http://google.com 
# HTTP/1.1 403 Forbidden

# proxy denied the request
# proxy-1  | 1776009627.004      0 172.21.0.2 TCP_DENIED/403 332 HEAD http://google.com/ - HIER_NONE/- text/html
  2. HTTPS traffic is denied unless the domain is in allowed-sites.txt

ALLOW

# .google.com is present in allowed-sites.txt
curl -I https://google.com
# HTTP/1.1 200 Connection established

# proxy allowed the request
make proxy-logs
# 193 172.21.0.2 TCP_TUNNEL/200 6498 CONNECT google.com:443 - HIER_DIRECT/142.250.109.138 -

DENY

# .instagram.com is not present in allowed-sites.txt
curl -I https://www.instagram.com
# HTTP/1.1 403 Forbidden

# proxy blocked the request
make proxy-logs
# 0 172.20.0.2 TCP_DENIED/403 3406 CONNECT www.instagram.com:443 - HIER_NONE/- text/html
  3. SSH traffic is denied unless the domain is in allowed-sites.txt

ALLOW

# .github.qkg1.top is present in allowed-sites.txt
git clone https://github.com/a2aproject/A2A.git
# Cloning into 'A2A'...
# Resolving deltas: 100% (7758/7758), done.

# proxy allowed the request
make proxy-logs
# 1417 172.21.0.2 TCP_TUNNEL/200 29254299 CONNECT github.com:443 - HIER_DIRECT/140.82.121.4 -

DENY

# .gitlab.com is not present in allowed-sites.txt
git clone git@gitlab.com:cryptsetup/cryptsetup.git
# nc: Proxy error: "HTTP/1.1 403 Forbidden"
# fatal: Could not read from remote repository.

# proxy blocked the request
make proxy-logs
# 0 172.21.0.2 TCP_DENIED/403 3373 CONNECT gitlab.com:22 - HIER_NONE/- text/html
  4. Other protocols are denied

ICMP - Squid can't handle ICMP, so pings won't work.

Other protocols require additional configuration and are otherwise ignored by Squid. This makes sense: enable only the protocols and ports that are safe to use.

curl -v ftp://ftp.debian.org
# * Could not resolve host: ftp.debian.org
# * Closing connection 0

make proxy-logs
# no log
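
The log lines in this demo can be condensed into a quick allowed/denied overview. The sketch below is a hypothetical helper, not part of the repository. It locates the Squid result field (for example TCP_DENIED/403) by pattern, because the prefix differs between raw log lines and lines prefixed with the container name, and it reports every result other than TCP_DENIED as allowed, which is a simplification.

```shell
# Read Squid access-log lines on stdin and print "<verdict> <method> <target>".
summarize_proxy_log() {
    awk '{
        for (i = 1; i <= NF; i++) {
            # The result field looks like TCP_TUNNEL/200 or TCP_DENIED/403.
            if ($i ~ /^TCP_[A-Z_]+\/[0-9]+$/) {
                verdict = ($i ~ /^TCP_DENIED/) ? "DENIED" : "ALLOWED"
                # Result is followed by the byte count, then method and target.
                print verdict, $(i + 2), $(i + 3)
                break
            }
        }
    }'
}
```

It can be fed from the log target used in the demo, for example make proxy-logs | summarize_proxy_log.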

Identity Patterns

SSH GitHub Access

‼️ IMPORTANT - make sure to enable protected branches for your repositories, and require pull requests for integrating changes.

If the SSH key is added to your primary GitHub account, it operates with an ⚠️ owner identity.

Primary GitHub Account

  1. Add the SSH key to your primary account both as a signing key and as an authentication key (add the same key twice).
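
For the signing role to take effect, git also has to be told to sign with the SSH key. make prepare-ssh generates a git config, but whether it already enables signing is not stated here, so the following is only a sketch of the relevant git settings. configure_ssh_signing is a hypothetical helper, and the key path in the example call is an assumption.

```shell
# Write SSH commit-signing settings into a git config file.
# $1: path to the public key, $2: git config file to update.
configure_ssh_signing() {
    git config --file "$2" gpg.format ssh          # sign with SSH instead of GPG
    git config --file "$2" user.signingkey "$1"    # key used for signing
    git config --file "$2" commit.gpgsign true     # sign every commit
}

# Example call (hypothetical paths):
# configure_ssh_signing "$HOME/.ssh/id_ed25519.pub" "$HOME/.gitconfig"
```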

Collaborator GitHub Account

This is the safer approach: you can control what the collaborator account can do using GitHub's granular permission model.

  1. Create a collaborator GitHub account (a completely new account).
  2. Invite the collaborator account to your projects (project > settings > collaborators), or to an organization via the GitHub organization settings (you will need to create an organization and transfer the projects to it).
  3. Add the SSH keys to the newly created GitHub account.

Footnote

Network Proxy

The approach of this sandbox is limited in the sense that containerization technologies such as Docker don't offer direct low-level access to network configuration without additional capabilities such as NET_ADMIN. Adding these capabilities would allow AI agents to reconfigure network settings, which is a major security concern and not a viable approach.

Proxy environment variables are a convenient way to configure proxying. Most tools respect them, but some don't, which can create configuration overhead.
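
As an illustration of that overhead: curl and git over HTTPS pick up HTTPS_PROXY, while OpenSSH does not read proxy environment variables and needs an explicit ProxyCommand. The sketch below assumes a hypothetical proxy address http://proxy:3128 (3128 is Squid's default port; the actual service name and port in this repository are not shown here) and the OpenBSD variant of nc, which the nc: Proxy error message in the demo suggests is in use. ssh_proxy_command is a hypothetical helper.

```shell
# Hypothetical proxy address; replace with the real service name and port.
export HTTPS_PROXY="${HTTPS_PROXY:-http://proxy:3128}"
export https_proxy="$HTTPS_PROXY"

# Build an ssh ProxyCommand from a proxy URL such as http://proxy:3128.
# ssh expands %h and %p to the target host and port at connect time.
ssh_proxy_command() {
    hostport="${1#*://}"        # strip the scheme, if any
    hostport="${hostport%%/*}"  # strip any trailing path
    printf 'nc -X connect -x %s %%h %%p\n' "$hostport"
}
```

A one-off connection could then look like ssh -o ProxyCommand="$(ssh_proxy_command "$HTTPS_PROXY")" -T git@github.com; a persistent setup would put the same ProxyCommand line into the ssh config.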

For a complete proxy configuration that is transparent to tools, one could use true virtual machines (for example QEMU/KVM, LXD, or Parallels) and configure networking at the operating-system level, e.g. with iptables/nftables or similar.

This approach is still better (or at least more trustworthy) than handcrafted sandboxes that don't leverage any kind of containerization or virtualization.