This issue came to light when using a Pepper importer which relied on SCorpusGraphImpl's default implementation of naming documents:
String namePart = null;
namePart = document.getName();
if (Strings.isNullOrEmpty(namePart)) {
namePart = "doc_" + getCorpora().size();
}
GraphFactory.createIdentifier(document, URI.createURI(corpus.getId() + "/" + namePart).toString());
Relying on this implementation produced ConcurrentModificationExceptions at runtime. These seem to go away when document names are set explicitly, in this case by falling back on PepperImporterImpl's default implementation of naming documents in #importCorpusStructureRec(URI currURI, SCorpus parent):
SDocument sDocument = null;
if (docFile.isDirectory()) {
sDocument = getCorpusGraph().createDocument(parent, currURI.lastSegment());
} else {
// if uri is a file, cut off file ending
sDocument = getCorpusGraph().createDocument(parent,
currURI.lastSegment().replace("." + currURI.fileExtension(), ""));
}
The line namePart = "doc_" + getCorpora().size(); in particular doesn't quite sit right: at any time a document named doc_n could be created while getCorpora().size() < n, and once getCorpora().size() == n, the next unnamed document would also be named doc_n, so we'd have two documents of the same name. That name is translated into the document ID, which in turn is used by Pepper to calculate execution paths. If the two documents sit in the same corpus, this is likely to trigger a ConcurrentModificationException on either document's lists of nodes, etc., as encountered above.
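The collision described above can be reproduced without Salt or Pepper on the classpath. The following is only a minimal sketch: the class DocNameCollision and its methods fallbackName and documentId are hypothetical stand-ins that mimic the quoted naming logic with plain strings, and the corpus IDs are made up for illustration. It is not the actual SCorpusGraphImpl code.

```java
import java.util.ArrayList;
import java.util.List;

public class DocNameCollision {

    // Hypothetical stand-in for the quoted fallback:
    // keep an explicit name, otherwise use "doc_" + number of corpora.
    static String fallbackName(String name, int corpusCount) {
        return (name == null || name.isEmpty()) ? "doc_" + corpusCount : name;
    }

    // Hypothetical stand-in for the identifier: corpus ID + "/" + name part.
    static String documentId(String corpusId, String name, int corpusCount) {
        return corpusId + "/" + fallbackName(name, corpusCount);
    }

    public static void main(String[] args) {
        List<String> corpora = new ArrayList<>();
        corpora.add("corpus_a"); // getCorpora().size() == 1

        // A document explicitly named doc_2 is created while size() < 2.
        String explicit = documentId("corpus_a", "doc_2", corpora.size());

        corpora.add("corpus_b"); // getCorpora().size() == 2

        // An unnamed document in the same corpus now falls back to doc_2.
        String generated = documentId("corpus_a", null, corpora.size());

        System.out.println(explicit + " vs " + generated
                + " -> same ID: " + explicit.equals(generated));
    }
}
```

Under these assumptions, both documents end up with the ID corpus_a/doc_2, which is the situation suspected of causing the exceptions. A counter tied to the number of corpora cannot guarantee unique document names; checking the generated name against the names already in use would be one way to avoid this.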