A simple Streamlit app that uses AI image editing models (via fal.ai) to remove burned-in text, annotations, measurement lines, and other non-anatomical markings from medical images — purely through prompting, no computer vision or masking required.
Medical image datasets often have labels, measurement lines, patient info, arrows, and other annotations burned directly into the image pixels. When training a classifier on such data, the model can learn to associate these markings with specific diagnoses (shortcut learning / overfitting) rather than the actual anatomy.
This tool preprocesses a dataset by sending each image through a generative AI model with a cleaning prompt, producing annotation-free versions ready for training.
| Upload & Settings | Before / After Results |
|---|---|
| ![]() | ![]() |
| Tier | Model | Price | Approach |
|---|---|---|---|
| Budget | FLUX.1 Dev | ~$0.025/image | Strength-based img2img — regenerates the full image guided by the prompt |
| Standard | FLUX.1 Kontext Pro | ~$0.04/image | Instruction-based editing — understands what to remove and what to keep |
| Premium | FLUX.2 Pro | ~$0.03+/MP | Latest FLUX editor, scales with output resolution |
| — | Nano Banana 2 (Google) | ~$0.08/image | Gemini 3.1 Flash Image — instruction-based, high fidelity |
Recommendation: Start with Standard (Kontext Pro) — it's instruction-based, so it actually understands the edit rather than regenerating the whole image. Test on a handful of images before running a full dataset.
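The per-tier prices in the table translate into a batch estimate roughly as follows. This is a minimal sketch, not the app's actual cost logic: the `PRICES` dict, its keys, and `estimate_cost` are hypothetical names, and the per-megapixel tier is treated as a flat $0.03 per megapixel, which should be read as a lower bound given the "+" in the table.

```python
# Prices copied from the table above; the dict layout and key names are assumptions.
PRICES = {
    "budget": {"unit": "image", "usd": 0.025},      # FLUX.1 Dev
    "standard": {"unit": "image", "usd": 0.04},     # FLUX.1 Kontext Pro
    "premium": {"unit": "megapixel", "usd": 0.03},  # FLUX.2 Pro (lower bound)
    "nano-banana-2": {"unit": "image", "usd": 0.08},
}


def estimate_cost(tier, image_sizes):
    """Return a rough USD estimate for a batch.

    image_sizes is a list of (width, height) tuples in pixels.
    """
    price = PRICES[tier]
    if price["unit"] == "image":
        # Flat price per image, regardless of resolution.
        units = len(image_sizes)
    else:
        # Per-megapixel pricing: sum the pixel counts and convert to megapixels.
        units = sum(w * h for w, h in image_sizes) / 1_000_000
    return round(price["usd"] * units, 4)
```

Running an estimate like this on the handful of test images first gives a sense of what the full dataset would cost before committing to a tier.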
1. Clone the repo
git clone https://github.com/your-username/ImageAICleaner.git
cd ImageAICleaner
2. Create a virtual environment and install dependencies
python3 -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
3. Get a fal.ai API key
Sign up at fal.ai and grab a key from your dashboard.
4. Run
streamlit run app.py
The app opens at http://localhost:8501. Enter your API key in the sidebar; it's never stored anywhere.
You can also set the key as an environment variable to skip typing it each time:
export FAL_KEY=your_key_here
- Select a model tier in the sidebar
- Upload your images (PNG, JPG, TIFF, BMP, WEBP)
- Review the estimated cost
- Edit the prompt if needed (a good default is already provided)
- Click Clean Images — results appear one by one as they finish
- Download all cleaned images as a ZIP
- Results are stochastic — the same image may produce slightly different outputs each run
- For the Budget (FLUX Dev) model, raise the Strength slider to allow more aggressive cleaning; lower values preserve more of the original
- Generative editing may alter anatomy, particularly where an annotation overlaps an anatomically relevant feature, so always sanity-check results before using them for training
- Images are temporarily uploaded to fal.ai's CDN for processing and are not stored permanently
- No patient data should be sent to external APIs without ensuring it has been de-identified first
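One crude way to support the sanity check mentioned above is to flag outputs that differ from their originals by more than a small annotation would explain. The sketch below is an illustration only, not part of the app: `changed_fraction` and `needs_review` are hypothetical helpers, the thresholds are arbitrary placeholders, and both inputs are assumed to be flat 8-bit grayscale buffers of identical dimensions (for example, from Pillow's `Image.convert("L").tobytes()` after resizing the cleaned image to match).

```python
def changed_fraction(original, cleaned, tolerance=10):
    """Fraction of pixels whose 8-bit grayscale value moved by more than `tolerance`."""
    if len(original) != len(cleaned):
        raise ValueError("images must have the same number of pixels")
    if not original:
        return 0.0
    # Iterating over bytes yields ints in 0..255, so plain subtraction works.
    changed = sum(1 for a, b in zip(original, cleaned) if abs(a - b) > tolerance)
    return changed / len(original)


def needs_review(original, cleaned, max_fraction=0.15, tolerance=10):
    """Flag a pair for manual inspection when too much of the image changed."""
    return changed_fraction(original, cleaned, tolerance) > max_fraction
```

A low score does not prove the anatomy is intact; it only helps prioritize which image pairs to inspect by eye first.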
The model list lives at the top of app.py in the MODELS dict. Each entry needs:
"Label — Name (~$price)": {
    "endpoint": "fal-ai/model-id",
    "description": "Shown in the sidebar.",
    "prompt_style": "instructive",  # or "descriptive"
    "image_url_key": "image_urls",  # omit if the model uses "image_url" (singular)
    "params": {
        "guidance_scale": 3.5,  # any extra API params with their defaults
    },
},
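To make the role of each field concrete, here is a sketch of how one entry might be turned into a request payload. This is an assumption about how app.py consumes the dict, not a copy of its code: `build_arguments` is a hypothetical helper, and the rule that the plural key takes a list while the singular key takes a string is a guess based on the key names.

```python
def build_arguments(model_cfg, prompt, image_url):
    """Assemble a request payload from one MODELS entry (hypothetical helper)."""
    # Start from the entry's default extra parameters, copied so the
    # shared MODELS dict is never mutated.
    args = dict(model_cfg.get("params", {}))
    args["prompt"] = prompt
    # Fall back to the singular key when the entry omits "image_url_key".
    key = model_cfg.get("image_url_key", "image_url")
    # Assumption: the plural key expects a list of URLs, the singular one a string.
    args[key] = [image_url] if key == "image_urls" else image_url
    return args
```

The resulting dict could then be passed to the fal.ai Python client, for example `fal_client.subscribe(model_cfg["endpoint"], arguments=args)`, though the exact call the app makes may differ.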
