Some ammico components require tensorflow (e.g. the Emotion detector), others pytorch (e.g. the Summary detector). Sometimes there are compatibility problems between these two frameworks. To avoid such problems on your machine, you can prepare a proper environment before installing the package (you need conda on your machine):
1. First, install tensorflow (https://www.tensorflow.org/install/pip):

   - Create a new environment with python and activate it:

     ```
     conda create -n ammico_env python=3.10
     conda activate ammico_env
     ```

   - Install cudatoolkit from conda-forge:

     ```
     conda install -c conda-forge cudatoolkit=11.8.0
     ```

   - Install nvidia-cudnn-cu11 from pip:

     ```
     python -m pip install nvidia-cudnn-cu11==8.6.0.163
     ```

   - Add a script that runs when the conda environment `ammico_env` is activated, to put the right libraries on your `LD_LIBRARY_PATH`:

     ```
     mkdir -p $CONDA_PREFIX/etc/conda/activate.d
     echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
     echo 'export LD_LIBRARY_PATH=$CUDNN_PATH/lib:$CONDA_PREFIX/lib/:$LD_LIBRARY_PATH' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
     source $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
     ```

   - Deactivate and re-activate the conda environment so the script above is called:

     ```
     conda deactivate
     conda activate ammico_env
     ```

   - Install tensorflow:

     ```
     python -m pip install tensorflow==2.15
     ```

2. Second, install pytorch for the same CUDA version as above:

   ```
   python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
   ```

3. Finally, install ammico:

   ```
   python -m pip install ammico
   ```
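As a quick sanity check (an illustrative sketch, not part of ammico), you can verify that both frameworks are visible to the interpreter in the activated environment:

```python
# Sanity check: confirm that tensorflow and pytorch are both installed
# in the active environment. Illustrative sketch only, not ammico code.
import importlib.util


def is_installed(name: str) -> bool:
    """Return True if the package can be found by the current interpreter."""
    return importlib.util.find_spec(name) is not None


for pkg in ("tensorflow", "torch"):
    status = "found" if is_installed(pkg) else "MISSING"
    print(f"{pkg}: {status}")
```

If both are found, you can additionally confirm that each framework sees the GPU, e.g. with `python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"` and `python -c "import torch; print(torch.cuda.is_available())"`.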
If you are using micromamba, you can prepare the environment with just one command:

```
micromamba create --no-channel-priority -c nvidia -c pytorch -c conda-forge -n ammico_env "python=3.10" pytorch torchvision torchaudio pytorch-cuda "tensorflow-gpu<=2.12.3" "numpy<=1.23.4"
```
To make pycocotools work on Windows, you may need to install vs_BuildTools.exe from https://visualstudio.microsoft.com/visual-cpp-build-tools/ and choose the following components:

- Visual Studio extension development
- MSVC v143 - VS 2022 C++ x64/x86 build tools
- Windows 11 SDK (for Windows 11) or Windows 10 SDK (for Windows 10)

Be careful: this requires around 7 GB of disk space.
You have to accept the privacy statement of ammico to run this type of analysis.
According to the Google Cloud Vision API documentation, the images that are uploaded and analysed are not stored and not shared with third parties:
We won't make the content that you send available to the public. We won't share the content with any third party. The content is only used by Google as necessary to provide the Vision API service. Vision API complies with the Cloud Data Processing Addendum.
For online (immediate response) operations (`BatchAnnotateImages` and `BatchAnnotateFiles`), the image data is processed in memory and not persisted to disk. For asynchronous offline batch operations (`AsyncBatchAnnotateImages` and `AsyncBatchAnnotateFiles`), we must store that image for a short period of time in order to perform the analysis and return the results to you. The stored image is typically deleted right after the processing is done, with a failsafe Time to live (TTL) of a few hours. Google also temporarily logs some metadata about your Vision API requests (such as the time the request was received and the size of the request) to improve our service and combat abuse.
You have to accept the privacy statement of ammico to run this type of analysis.
According to Google Translate, the data is not stored after processing and not made available to third parties:
We will not make the content of the text that you send available to the public. We will not share the content with any third party. The content of the text is only used by Google as necessary to provide the Cloud Translation API service. Cloud Translation API complies with the Cloud Data Processing Addendum.
When you send text to Cloud Translation API, text is held briefly in-memory in order to perform the translation and return the results to you.
Some features of ammico require internet access; a general answer is not possible, since some services require an internet connection while others can be used offline:
- Text extraction: To extract text from images, and translate the text, the data needs to be processed by Google Cloud Vision and Google Translate, which run in the cloud. Without internet access, text extraction and translation are not possible.
- Image summary and query: After initial loading and caching of the model, image summarization and VQA can work fully offline.
- Video summary and query: After initial loading and caching of the model, video summarization and VQA can work fully offline.
- Video summary and query with audio: After the WhisperX model (and optional language assets) is downloaded, audio transcription and combined video+audio summarization also work offline.
- Multimodal search: After initial loading and caching of the model, multimodal search can work fully offline.
- Color analysis: The `color` module does not require an internet connection.
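For the components that can run offline, the models must have been downloaded and cached once while online. As a sketch (assuming the models are fetched via the Hugging Face Hub, which is an assumption here and not guaranteed for every ammico detector), you can force the cached copies to be used:

```python
# Force offline use of previously cached models. HF_HUB_OFFLINE and
# TRANSFORMERS_OFFLINE are standard Hugging Face environment variables;
# they must be set before the model-loading libraries are imported.
import os

os.environ["HF_HUB_OFFLINE"] = "1"        # hub client will not attempt downloads
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # transformers loads local files only

print(os.environ["HF_HUB_OFFLINE"], os.environ["TRANSFORMERS_OFFLINE"])
```

With these set, a model that is not yet in the local cache will raise an error instead of triggering a download, which makes missing assets easy to spot before going offline.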
