(2023-09-12) 20230912-models.zip: Latest tagging, lemmatization and dependency parsing models, to be used in a stanza 1.5.* pipeline.
(2023-09-11) elNER18-bert.pt: Named entity recognition model, to be used with flair. Trained and evaluated (0.92 micro F-score) on the elNER18 dataset (Bartziokas et al., 2020), using bert-base-greek-uncased-v1 (Koutsikakis et al., 2020) as the pre-trained language model.
Web application and API for the toolkit: http://nlp.ilsp.gr/nws/
Pre-trained embeddings: 2020.el.fasttext.skipgram.100.vec.gz
Processed versions, with boilerplate removed, of crawled corpora that have been acquired from web sites with open content: greek_corpus.tar.gz
Tagging and lemmatization models, to be used in a stanfordnlp pipeline: pos_lemma_models.tar.gz
Dependency parsing model, for use with stanfordnlp: depparse_models.tar.gz
Text classification module, for use with fasttext: news_classifier_model.tar.gz
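The pre-trained embeddings listed above can be inspected without installing any NLP library. The sketch below assumes that 2020.el.fasttext.skipgram.100.vec.gz uses the standard fastText text format: a header line giving the vocabulary size and the dimension, followed by one token and its vector per line. The file name suggests this layout, but the listing does not confirm it. The helper names load_vec and cosine are illustrative and are not part of the toolkit.

```python
import gzip
import math
from typing import Dict, List, Optional


def load_vec(path: str, limit: Optional[int] = None) -> Dict[str, List[float]]:
    """Read a gzip-compressed fastText .vec file into a token -> vector dict.

    Assumes the standard text layout: a "<vocab size> <dimension>" header,
    then one "<token> <v1> ... <vN>" entry per line.
    """
    vectors: Dict[str, List[float]] = {}
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        # Header line: vocabulary size (unused here) and vector dimension.
        _, dim = (int(field) for field in handle.readline().split())
        for line in handle:
            if limit is not None and len(vectors) >= limit:
                break
            if not line.strip():
                continue  # skip blank lines
            parts = line.rstrip().split(" ")
            # The last `dim` fields are the vector; the rest is the token.
            word = " ".join(parts[:-dim])
            vectors[word] = [float(value) for value in parts[-dim:]]
    return vectors


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Assumes the archive has been downloaded to the working directory.
    vecs = load_vec("2020.el.fasttext.skipgram.100.vec.gz", limit=50000)
    first = next(iter(vecs))
    print(len(vecs), "vectors loaded; dimension:", len(vecs[first]))
```

The limit argument keeps memory use modest when only the first entries are needed. For full-vocabulary work, a dedicated vector library is likely a better fit than plain Python lists.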
License: CC BY-NC-SA 4.0