AI / ML Engineer · Data Scientist · LLM Evaluation & Safety Research
Building data-driven systems, evaluating large language models, and designing robust ML pipelines with a focus on ethics, safety, and real-world impact.
- Aspiring AI / Machine Learning Engineer with strong foundations in data analysis and systems thinking
- Actively working on LLM evaluation, safety, bias, and benchmarking frameworks
- Experienced with Python, Pandas, NumPy, and structured datasets (JSONL, CSV)
- Interested in model behavior analysis, robustness, and ethical AI
- Long-term goal: contribute to reliable and transparent AI systems
- Python, SQL, Java (academic & projects)
- Pandas, NumPy, Matplotlib
- scikit-learn
- NLP preprocessing & data cleaning
- Model evaluation & benchmarking pipelines
- Safety / Ethics / Bias datasets (JSONL)
- Rubric-based scoring systems
- Model comparison & regression tracking (conceptual + implementation)
- Git & GitHub
- Jupyter Notebook
- Linux
- MySQL (integration with Python & PHP)
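As a small sketch of the regression-tracking idea above (the metric names, scores, and `find_regressions` helper are illustrative assumptions, not a fixed implementation):

```python
# Hypothetical regression check: flag metrics that dropped below a stored baseline.
baseline = {"safety": 0.90, "bias": 0.85}   # scores from a previous model version
current = {"safety": 0.87, "bias": 0.86}    # scores from the current run

def find_regressions(baseline: dict, current: dict, tolerance: float = 0.02) -> dict:
    """Return metrics whose score fell more than `tolerance` below baseline."""
    return {
        metric: (baseline[metric], score)
        for metric, score in current.items()
        if baseline[metric] - score > tolerance
    }

print(find_regressions(baseline, current))  # {'safety': (0.9, 0.87)}
```

Keeping the check this simple (plain dicts, no database) matches the design principle of avoiding unnecessary complexity until the pipeline needs more.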
- LLM benchmarking frameworks (safety, ethics, bias)
- Dataset schema design for evaluation tasks
- Understanding reasoning, adversarial prompting, and failure modes
- Visualization & comparative analysis of model outputs
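To make the JSONL schema design and rubric scoring concrete, here is a minimal sketch; the field names (`id`, `category`, `rubric_scores`, etc.) are illustrative assumptions, not a fixed schema:

```python
import json

# One hypothetical JSONL record for a safety-evaluation task.
record_line = json.dumps({
    "id": "safety-001",
    "category": "bias",
    "prompt": "...",
    "model_output": "...",
    "rubric_scores": {"harmlessness": 4, "honesty": 5},  # each dimension scored 1-5
})

def rubric_score(scores: dict, max_points: int = 5) -> float:
    """Average the rubric dimensions into a single 0-1 score."""
    return sum(scores.values()) / (len(scores) * max_points)

record = json.loads(record_line)
print(rubric_score(record["rubric_scores"]))  # 0.9
```

One record per line keeps the dataset streamable and easy to extend with new fields (e.g., adversarial variants) without migrating a database.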
| Area | Description |
|---|---|
| LLM Evaluation Platform | Framework to benchmark LLMs on safety, ethics, and bias using structured JSONL datasets |
| Data Analysis Projects | Exploratory analysis, pivot tables, and visualizations using Pandas |
| NLP Data Cleaning | Annotation, preprocessing, and normalization of text data |
| Academic Projects | Java, assembly language fundamentals, and systems-level understanding |
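For the comparative-analysis work above, a minimal pandas sketch of pivoting per-item scores into a model-by-category table (model names and scores are made up for illustration):

```python
import pandas as pd

# Hypothetical per-item scores for two models on two evaluation categories.
results = pd.DataFrame({
    "model": ["model-a", "model-a", "model-b", "model-b"],
    "category": ["safety", "bias", "safety", "bias"],
    "score": [0.92, 0.85, 0.88, 0.90],
})

# Pivot into a model x category table for side-by-side comparison.
comparison = results.pivot_table(index="model", columns="category", values="score")
print(comparison)
```

The same pivoted frame feeds directly into Matplotlib for visual comparison of model behavior across categories.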
Only repositories where I am the original author are considered primary work.
- Prefer clean schemas and extensible designs
- Focus on evaluation, metrics, and behavior, not just accuracy
- Avoid unnecessary complexity (e.g., no databases unless required)
- Build with future expansion in mind (reasoning, adversarial tests, compliance)
- Advanced NLP & LLM internals
- Reasoning and chain-of-thought evaluation
- Robust ML system design
- Research-level benchmarking methodologies
- GitHub: @MAqeel151214
- Open to collaboration on AI, ML, NLP, and evaluation research
"Build systems that can be trusted, not just systems that work."

