MAM-AI is a smart search application developed for nurses and midwives in Zanzibar. This repository contains the MVP we submitted to the Gemma3n Kaggle challenge.
There are three main folders: `rag`, `app`, and `finetune`. `rag` is used for document preprocessing, `app`
contains the actual app, and `finetune` holds the Gemma3n finetune, which we sadly did not manage to deploy
into the Android app (yet).
Under the GitHub releases tab there is an APK to install :) It may not work on an emulator (according to the Google AI Edge RAG documentation), since the underlying inference library needs real hardware.
You can also build it yourself with `flutter build apk` in the `app/` directory.
This is a rough sketch of how you could reproduce what we created in this project.
Prepare RAG docs:
- Curate documents you want to include (or use the Google Drive link from the writeup)
- Run MMORE over these documents to extract their text
- Chunk the documents using the scripts in `rag/`
- Copy the chunks to `mamai_trim.txt` in the `assets` folder of the Android app and uncomment the `memorizeChunks()` call
- Run the app and wait for it to incorporate the chunks into the sqlite db
- Re-comment the `memorizeChunks()` call
- Use `adb` to pull the `embeddings.sqlite` for redistribution
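The chunking and memorization steps above can be sketched in Python. The chunk size, overlap, table schema, and the placeholder embedding are all assumptions for illustration; in the real app the chunks are embedded on-device by the Gecko model.

```python
import sqlite3

def chunk_text(text, size=500, overlap=50):
    # Fixed-size chunks with overlap; the actual sizes used by the
    # rag/ scripts may differ.
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def toy_embed(chunk, dim=8):
    # Placeholder vector: NOT the Gecko model, just a stand-in so the
    # storage step below is runnable.
    return [float((hash(chunk) >> i) & 0xFF) for i in range(0, dim * 8, 8)]

def memorize_chunks(text, db_path="embeddings.sqlite"):
    # Roughly what memorizeChunks() does: embed each chunk and persist
    # it to the sqlite db (schema here is an assumption).
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS embeddings (chunk TEXT, vector TEXT)")
    for chunk in chunk_text(text):
        con.execute("INSERT INTO embeddings VALUES (?, ?)",
                    (chunk, ",".join(map(str, toy_embed(chunk)))))
    con.commit()
    con.close()
```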
How we developed the app:
- Flutter frontend with regular Flutter <-> Android FFI/bridging
- Built this out to meet our needs
- Adapted the Google AI Edge RAG library for the Android backend, which runs the LLM
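The retrieval step the Android backend performs can be approximated in Python: embed the query, then rank stored chunks by cosine similarity. The `(chunk, vector)` schema and the comma-separated vector encoding are assumptions; the app itself does this through the Google AI Edge RAG library with Gecko embeddings.

```python
import math
import sqlite3

def cosine(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k_chunks(query_vec, db_path, k=3):
    # Load every stored (chunk, vector) row and return the k chunks
    # most similar to the query vector (brute-force scan).
    con = sqlite3.connect(db_path)
    rows = con.execute("SELECT chunk, vector FROM embeddings").fetchall()
    con.close()
    scored = [(cosine(query_vec, [float(v) for v in vec.split(",")]), chunk)
              for chunk, vec in rows]
    return [chunk for _, chunk in sorted(scored, reverse=True)[:k]]
```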
Serving the remote files to the users:
- Start an nginx server with a self-signed cert and the files (see below) in `/var/www/html/`
Finetuning (not included in app):
- Finetuning Dataset
- Finetuned Model
- Set up a Python 3.10 virtual environment: `python3.10 -m venv .venv`
- Activate it (Windows: `.venv\Scripts\activate`, Linux: `source .venv/bin/activate`)
- Install core dependencies: `pip install -r requirements.txt`
- Run the training script: `python train.py`
Note: because Gemma3n's license must be accepted before use, we do not provide the model files in this repo. Instead, the app fetches copies of the various models from a temporary VPS the first time it launches; this is done purely for simplicity's sake, and the VPS will only remain up during judging. If you are uncomfortable with this, you can simply replace the link with your own endpoint.
They are as follows:
- `Gecko_1024_quant.tflite`: embedding model from litert-community/Gecko-110m-en
- `sentencepiece.model`: tokenizer from litert-community/Gecko-110m-en
- `gemma-3n-E4B-it-int4.task`: Gemma3n E4B
- `embeddings.sqlite`: pre-computed document embeddings
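A client fetching these files from the VPS could verify them like this. The URL and checksum below are placeholders, not real values; passing the server's self-signed certificate as `cafile` lets the TLS handshake succeed without disabling verification.

```python
import hashlib
import ssl
import urllib.request
from pathlib import Path

def sha256_of(path):
    # Stream the file through SHA-256 so large models don't sit in memory twice.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 16), b""):
            h.update(block)
    return h.hexdigest()

def fetch(url, dest, expected_sha256=None, cafile=None):
    # cafile: path to the server's self-signed certificate (hypothetical).
    ctx = ssl.create_default_context(cafile=cafile)
    with urllib.request.urlopen(url, context=ctx) as resp:
        Path(dest).write_bytes(resp.read())
    if expected_sha256 and sha256_of(dest) != expected_sha256:
        raise ValueError(f"checksum mismatch for {dest}")
```

For example, `fetch("https://your-vps.example/embeddings.sqlite", "embeddings.sqlite", cafile="server.crt")`, where the hostname and cert path are stand-ins for your own endpoint.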