This code is intended for calculating inter-annotator agreement when annotating (parallel learner) treebanks in the UD format.
The implementation of Krippendorff's alpha for dependency trees (syn_agreement.py) comes from Cora Haiber's adaptation (https://gitlab.ruhr-uni-bochum.de/comphist/lrec-coling_2024_hgb/-/tree/main/scripts/alpha_agreement) of Arne Skjærholt's code (https://github.com/arnsholt/syn-agreement/). This version also includes changes that make the code importable as a module. See also LICENSES below.
To run the code, you need one or more files with annotations per annotator, as well as three helper .json files:
- batches.json, which needs to contain a list of dictionaries, where each dictionary constitutes an annotation batch (timestep). Each key in a dictionary specifies an annotation pair using (code)names separated by a hyphen (e.g. 'a1-a2'), or 'everyone' if everyone annotated the given sentences; the value of that key is a list of the sentence IDs annotated in that batch.
- paths.json, which is a list of paths to all of the .conllu files that are to be included in the analysis.
- usernames.json, which is a dictionary where each key is the (code)name of an annotator as used in batches.json and the value is the annotator name included in the name of the .conllu file(s). The (code)name and the file-name form should ideally be different.
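As an illustration, the three helper files described above could be generated with a short Python script. This is only a sketch: the codenames (a1, a2, a3), annotator names, sentence IDs, and .conllu paths are all made up, and the check_codenames helper is a hypothetical sanity check that is not part of iaa.py. It also assumes that codenames themselves contain no hyphens.

```python
import json
from pathlib import Path


def write_helper_files(target_dir):
    """Write illustrative batches.json, paths.json and usernames.json files."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # One dictionary per annotation batch (timestep). Each key is either a pair
    # of codenames joined by a hyphen or 'everyone'; each value lists the
    # sentence IDs annotated in that batch. All IDs here are invented.
    batches = [
        {"a1-a2": ["sent1", "sent2"], "a2-a3": ["sent3"]},
        {"everyone": ["sent4", "sent5"]},
    ]
    # Paths to every .conllu file to include in the analysis (invented).
    paths = [
        "annotations/alice.conllu",
        "annotations/bob.conllu",
        "annotations/carol.conllu",
    ]
    # Codename used in batches.json -> annotator name used in the file names.
    usernames = {"a1": "alice", "a2": "bob", "a3": "carol"}

    contents = {
        "batches.json": batches,
        "paths.json": paths,
        "usernames.json": usernames,
    }
    for filename, content in contents.items():
        text = json.dumps(content, indent=2)
        (target / filename).write_text(text, encoding="utf-8")
    return contents


def check_codenames(batches, usernames):
    """Return codenames used in batches that are missing from usernames."""
    missing = set()
    for batch in batches:
        for key in batch:
            if key == "everyone":
                continue
            # Assumes codenames contain no hyphens themselves.
            missing.update(n for n in key.split("-") if n not in usernames)
    return sorted(missing)
```

Calling write_helper_files('.') would place the three files in the current directory; check_codenames can then be run on the loaded batches.json and usernames.json to catch codenames that appear in a batch but have no entry in usernames.json.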
Calculating agreement per sentence:
- python3 iaa.py --annotators 'usernames.json' --paths 'paths.json' --batches 'batches.json' --truncated_id --parallel --output 'ud_swell_per_sent_agr' --per_sent
Calculating dependency relation agreement per annotation batch:
- python3 iaa.py --annotators 'usernames.json' --paths 'paths.json' --batches 'batches.json' --truncated_id --ann_type 'deprel' --parallel --output 'deprel-batches.json'
Calculating lemma agreement relative to sentence length:
- python3 iaa.py --annotators 'usernames.json' --paths 'paths.json' --bins --truncated_id --ann_type 'lemma' --parallel --output 'lemma-bins.json'
The files syn_agreement.py and conll.py are (c) 2014 Arne Skjærholt and released under the GNU GPL version 2 or later: http://gnu.org/licenses/gpl.html. The Python 3 version was made possible by Cora Haiber, with adaptations from Maria Irena Szawerna.
The code in alpha.py is (c) 2011-2014 Thomas Grill and released under the Creative Commons Attribution-ShareAlike licence: http://creativecommons.org/licenses/by-sa/3.0/
The code in iaa.py is (c) 2026 Maria Irena Szawerna and released under the GNU GPL version 3 or later: https://www.gnu.org/licenses/gpl-3.0.html