MaastrichtU-CDS/nlp-datau

NLP-DATAU

NLP Data Utilities for processing NLP-data and analyzing results.

Notebooks

Pre-processing

ElasticSearch

  • XML documents to ElasticSearch link
  • ElasticSearch documents to Excel link
  • Prepare a dataset for doccano annotation using ElasticSearch link
  • Template parsing link

Processing

  • Context Extraction using spaCy & pyContextNLP link

Post-processing

Stats

  • Calculate NLP statistics over classification results in Excel format link

Dependencies

REQUIRED:

RECOMMENDED:

Install

  1. Clone or download (button) this repository

    git clone https://github.com/putssander/nlp-datau.git

Run

  1. Navigate to the cloned or downloaded project using the terminal or cmd

  2. Create the network (if it does not exist)

     docker network create nlp-datau-network    
    
  3. Start docker-compose

     docker-compose up
    
  4. Find the Jupyter link in the log output and open it in the browser.

     jupyter-nlp-datau             | [I 12:06:45.669 NotebookApp]  or http://127.0.0.1:8888/?token=0c01e853a34a4bb0db3a542ca15c3af036cab7a11fd64bb2
    
  5. Navigate to the desired notebook in the browser (in the notebooks directory)

  6. Data can be copied to the resources/data folder (the folder needs to be created first)
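
Steps 2, 4 and 6 above can be scripted. The following is a minimal shell sketch, not part of the repository: the function names `ensure_network`, `ensure_data_dir` and `jupyter_url` are hypothetical helpers, and the URL pattern is assumed from the sample log line in step 4 (host 127.0.0.1, hexadecimal token). Adjust the pattern if your Jupyter log output looks different.

```shell
#!/bin/sh
# Hypothetical helpers for the Run steps; the names are illustrative only.

# Step 2: create the Docker network only if it does not exist yet.
ensure_network() {
    docker network inspect "$1" >/dev/null 2>&1 || docker network create "$1"
}

# Step 6: create the data folder if it is missing (default: resources/data).
ensure_data_dir() {
    mkdir -p "${1:-resources/data}"
}

# Step 4: read log lines on stdin and print the first Jupyter URL with a token.
# Assumes the URL format shown in the sample log line of step 4.
jupyter_url() {
    grep -o 'http://127\.0\.0\.1:[0-9]*/?token=[0-9a-f]*' | head -n 1
}

# Possible usage from the project directory (commented out so that
# sourcing this file does not start any containers):
#   ensure_network nlp-datau-network
#   ensure_data_dir
#   docker-compose up -d
#   docker-compose logs | jupyter_url
```

Running `docker-compose up -d` starts the containers in the background, so the log has to be read separately with `docker-compose logs`. The token may take a few seconds to appear after startup.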
