How much misinformation do recommendation algorithms actually surface? This project benchmarks 15 Top-N recommendation algorithms, measuring both their tendency to recommend misinformation-containing videos (via the NS and SERP-MS metrics) and standard recommendation quality (NDCG@10).
📄 Paper: Misinformation in Video Recommendations: An Exploration of Top-N Recommendation Algorithms — B. Hornig, M.S. Pera, J. Scholtes (ROMCIR workshop at ECIR '24)
Given a dataset of YouTube videos annotated for misinformation, this pipeline:
- Classifies videos as misinformation or not, using a classifier trained on video metadata and transcripts.
- Runs 15 recommendation algorithms (neighborhood-based, matrix factorization, neural, and hybrid approaches) using the Elliot recommendation framework.
- Evaluates each algorithm on both recommendation accuracy and misinformation prevalence in the generated recommendations.
Key findings:
- SVD++, nearest-neighbor methods, Deep MF, and Non-Negative MF consistently delivered strong recommendation quality while surfacing less misinformation.
- Field-aware FM, LogMF, and standard MF performed poorly on both fronts.
- The analysis covered over 15,000 videos, revealing that the choice of algorithm significantly affects how much misinformation users encounter.
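To make the two evaluation axes concrete, here is a minimal sketch of the kind of metrics involved: a rank-weighted stance score in the spirit of SERP-MS, and a standard NDCG@k. This is an illustration with assumed label conventions (stances in {-1, 0, 1}), not the paper's exact implementation; the precise NS/SERP-MS definitions used in the experiments may differ.

```python
import math

def serp_ms(stances):
    """Rank-weighted misinformation score for one top-N list.

    `stances` holds one label per recommended video, ordered by rank:
    -1 (debunking), 0 (neutral), 1 (promoting misinformation).
    Higher-ranked items weigh more; the result lies in [-1, 1].
    """
    n = len(stances)
    weighted = sum(x * (n - r) for r, x in enumerate(stances))
    return weighted / (n * (n + 1) / 2)

def ndcg_at_k(relevances, k=10):
    """NDCG@k for one user's ranked list of graded relevances."""
    def dcg(rels):
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0
```

For example, `serp_ms([1, 0, -1])` returns 1/3 (the promoting video at rank 1 outweighs the debunking one at rank 3), and a list ranked in ideal relevance order gives `ndcg_at_k` of 1.0.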
```
├── classifier/                          # Misinformation classifier
├── dataset/                             # Data preparation and storage
├── elliot_configs/                      # Elliot framework experiment configurations
├── elliot_external/                     # Custom extensions for Elliot
├── classify_videos.py                   # Run the misinformation classifier
├── run_non_hybrid_experiments.py        # Execute recommendation experiments
├── evaluate_recommendation_results.py   # Analyze results
└── visualize_experiment_results.py      # Generate figures
```
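For orientation, the files in `elliot_configs/` follow Elliot's YAML experiment schema, which typically looks like the sketch below. The dataset path, split ratio, and model hyperparameters here are placeholders for illustration, not the values used in this project; consult the actual config files and the Elliot documentation for the real settings.

```yaml
experiment:
  data_config:
    strategy: dataset
    dataset_path: ./dataset/ratings.tsv   # placeholder path
  splitting:
    test_splitting:
      strategy: random_subsampling
      test_ratio: 0.2
  top_k: 10
  models:
    ItemKNN:                              # one of the neighborhood-based baselines
      neighbors: 50
      similarity: cosine
  evaluation:
    simple_metrics: [nDCG]
```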
```bibtex
@inproceedings{hornig2024misinformation,
  title={Misinformation in Video Recommendations: An Exploration of Top-N Recommendation Algorithms},
  author={Hornig, Benedikt and Pera, Maria Soledad and Scholtes, Johannes},
  booktitle={ROMCIR Workshop at ECIR '24},
  year={2024}
}
```