Mohamed Kandil

AI / NLP Engineer | Arabic LLMs · RAG Systems · Applied AI

🧠 About Me

class AINLPEngineer:
    def __init__(self):
        self.name       = "Mohamed Kandil"
        self.location   = "Kafr El-Sheikh, Egypt 🇪🇬"
        self.title      = "AI / NLP Engineer"
        self.focus      = ["Arabic LLMs", "RAG Systems", "Applied AI", "Arabic NLP"]

    def currently_building(self):
        return {
            "Athar"        : "Arabic RAG system over large-scale domain-specific corpora — 4,500+ HF downloads",
            "Baligh"       : "Arabic LLM assistant — curated knowledge grounding + SFT alignment",
            "EgyptianAgent": "Egyptian Arabic function-calling model (FunctionGemma fine-tune)",
        }

    def open_source(self):
        return {
            "datasets" : "4,500+ downloads · 11.5M+ indexed vectors published on Hugging Face",
            "models"   : "huggingface.co/Kandil7",
            "github"   : "github.com/Kandil7",
        }

🚀 Featured Projects

📚 Athar — Arabic RAG Knowledge System

Large-scale Arabic retrieval-augmented generation

🔍 Hybrid retrieval: dense embeddings + BM25 via Qdrant
📖 Domain-specific Arabic knowledge corpora (Shamela4)
✅ Citation-based, grounded question answering
📊 4,500+ HF downloads · 11.5M+ vectors indexed

Tech: Qdrant · BM25 · Arabic NLP · FastAPI · PyArrow
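
One common way to combine a dense ranking with a BM25 ranking is reciprocal rank fusion (RRF). The sketch below is illustrative only: the source does not say which fusion method Athar uses, and the function name, the `k=60` constant, and the document IDs are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical document IDs, best match first in each list.
dense_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

Here `doc_b` ranks first because both lists place it near the top, while documents found by only one retriever fall to the bottom.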

🤖 Baligh — Arabic LLM Assistant

Retrieval-grounded Arabic language model

🧩 SFT on curated Arabic knowledge datasets
🛡️ Controlled, hallucination-resistant responses
⚙️ QLoRA fine-tuning via Unsloth on Qwen2.5-1.5B
🌙 Aligned for structured Arabic knowledge tasks

Tech: Unsloth · QLoRA · Qwen2.5 · Hugging Face TRL

📱 Egyptian Mobile Action Assistant

Dialect-specific function-calling LLM

🇪🇬 Fine-tuned FunctionGemma for Egyptian Arabic
📲 Mobile action understanding and function calling
🗣️ Dialect-specific instruction following
🏗️ Custom dataset for Egyptian voice commands

Tech: FunctionGemma · QLoRA · Egyptian Arabic NLP
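
On the application side, a function-calling model's output still has to be validated and routed to real code. The sketch below assumes the model output has already been converted to a JSON object with `name` and `arguments` keys; the source does not show the model's raw output format, and the action names and parameters are hypothetical.

```python
import json


# Hypothetical mobile actions; names and parameters are illustrative only.
def set_alarm(hour: int, minute: int) -> str:
    return f"alarm set for {hour:02d}:{minute:02d}"


def call_contact(name: str) -> str:
    return f"calling {name}"


ACTIONS = {"set_alarm": set_alarm, "call_contact": call_contact}


def dispatch(model_output: str) -> str:
    """Parse a JSON function call and run the matching action."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return "error: output is not valid JSON"
    if not isinstance(call, dict):
        return "error: expected a JSON object"
    action = ACTIONS.get(call.get("name"))
    if action is None:
        return f"error: unknown action {call.get('name')!r}"
    try:
        # Missing or unexpected arguments raise TypeError and are reported.
        return action(**call.get("arguments", {}))
    except TypeError as exc:
        return f"error: bad arguments ({exc})"


print(dispatch('{"name": "set_alarm", "arguments": {"hour": 7, "minute": 5}}'))
```

Returning an error string instead of raising keeps a malformed generation from crashing the app, which matters when the model runs on-device.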

🪪 Egyptian ID OCR System

Production-ready Arabic OCR pipeline

🎯 YOLO detection → OCR ensemble → JSON API
🔥 PaddleOCR + EasyOCR with confidence gating
✅ ~92% field-level accuracy on real samples
🏦 Ready for fintech / KYC applications

Tech: YOLO · PaddleOCR · EasyOCR · OpenCV · FastAPI
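
Confidence gating over an OCR ensemble can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes each engine returns a text reading with a confidence in [0, 1] per field, and the `OcrResult` type, the 0.80 threshold, and the sample readings are hypothetical.

```python
from typing import NamedTuple, Optional


class OcrResult(NamedTuple):
    text: str
    confidence: float  # assumed to be in [0, 1]


def gate_field(candidates: list[OcrResult], threshold: float = 0.80) -> Optional[str]:
    """Return the most confident reading, or None if none clears the threshold."""
    if not candidates:
        return None
    best = max(candidates, key=lambda r: r.confidence)
    return best.text if best.confidence >= threshold else None


def build_record(fields: dict[str, list[OcrResult]], threshold: float = 0.80) -> dict:
    """Gate each field independently; None marks a field for manual review."""
    return {name: gate_field(cands, threshold) for name, cands in fields.items()}


# Hypothetical readings from two engines for two fields.
readings = {
    "full_name": [OcrResult("محمد", 0.93), OcrResult("محمل", 0.71)],
    "id_number": [OcrResult("2990", 0.64), OcrResult("2998", 0.58)],
}
record = build_record(readings)
print(record)
```

Gating per field rather than per document means one low-confidence field does not discard the readings that were extracted cleanly.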

🌿 Plant Disease Detection

Deep learning computer vision classifier

🍃 Automated leaf image classification pipeline
🔬 CNN-based deep learning with preprocessing
📊 Multi-class plant disease identification

Tech: PyTorch · OpenCV · TensorFlow

🎙️ Long-Form Speech Transcription

Whisper-based video transcription pipeline

⏱️ Processes multi-hour audio/video content
📄 Structured text output with timestamps
🌐 Supports Arabic and multilingual audio

Tech: Whisper · Python · Audio Processing
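
For multi-hour recordings, one plausible approach is to transcribe the audio in chunks and then shift each chunk's segment times back onto the full timeline. The sketch below covers only that merging and formatting step. It assumes each segment is a dict with `start`, `end`, and `text` keys and that chunk offsets are known; the chunking strategy itself is an assumption, not something the source describes.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def merge_chunks(chunks):
    """Shift each chunk's segments by the chunk's offset into the full audio.

    `chunks` is a list of (offset_seconds, segments) pairs.
    """
    merged = []
    for offset, segments in chunks:
        for seg in segments:
            merged.append({
                "start": seg["start"] + offset,
                "end": seg["end"] + offset,
                "text": seg["text"].strip(),
            })
    return sorted(merged, key=lambda s: s["start"])


def to_lines(segments):
    """Format merged segments as '[start - end] text' lines."""
    return [
        f"[{format_timestamp(s['start'])} - {format_timestamp(s['end'])}] {s['text']}"
        for s in segments
    ]


# Hypothetical segments from two chunks, the second starting one hour in.
chunks = [
    (0.0, [{"start": 1.0, "end": 4.5, "text": " hello"}]),
    (3600.0, [{"start": 2.0, "end": 5.0, "text": " world"}]),
]
lines = to_lines(merge_chunks(chunks))
print("\n".join(lines))
```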

🤗 Open-Source Contributions on Hugging Face

Artifact	Type	Highlights
Athar-Shamela4	📦 Dataset	4,500+ downloads · Large Arabic knowledge corpus
Athar-Embeddings	📦 Dataset	1,000+ downloads · 3.26M embedding rows
shamela-vectors	📦 Dataset	11.5M rows · Ready-to-index Arabic vectors
Athar-RAG-Hub	📦 Dataset	5,850 curated RAG QA pairs
Athar-Mini-Dataset-v2	📦 Dataset	80+ downloads · Evaluation-ready
egyptian-voice-commands	📦 Dataset	Egyptian Arabic commands dataset
Baligh-1.5B	🤖 Model	Arabic LLM · SFT on curated knowledge
functiongemma-270m-egyptian	🤖 Model	46+ downloads · Egyptian function calling

🎯 Current Focus

Building:
  - Athar v2: GraphRAG + agentic retrieval layer for Arabic corpora
  - Baligh v1: Improved SFT alignment + Arabic evaluation benchmarks
  - Arabic RAG evaluation suite (RAGAS-based for Arabic QA)

Learning:
  - Advanced RAG patterns (HippoRAG, EHRAG, atomic retrieval)
  - LLM post-training strategies (DPO, ORPO, RLHF)
  - Production ML deployment and MLOps at scale

Open Source:
  - Publishing Arabic NLP datasets on Hugging Face
  - Building open evaluation benchmarks for Arabic QA
  - Contributing to the Arabic LLM tooling ecosystem

💬 Let's Connect!

💼 Open to collaborations, full-time roles, and freelance AI/NLP projects
