ILSP NLP Web Services

This site hosts Natural Language Processing services developed by the NLP group of the Institute for Language and Speech Processing. These services are free to use for research purposes.

Category | Service name | Description |
---|---|---|
Get started | ||
ilsp_nlp (WSDL) | Uses ILSP NLP tools to process Greek texts. Input is either plain text or an XCES document with text segmented into paragraphs. The service detects paragraph, sentence, and token boundaries and generates POS and lemma annotations for each token. The default output is an XCES document. Standoff alternatives include UIMA and GATE files. For more documentation and ways to cite this work, please visit http://goo.gl/1voev1 . | |
ilsp_nlp_depparse_ud (WSDL) | This service uses ILSP NLP tools to syntactically analyze Greek texts and to generate representations compatible with the Universal Dependencies schema. The input is plain text. The service detects paragraph, sentence, and token boundaries and generates POS and lemma annotations for each token. It then parses sentences into dependency trees using a dependency parser trained on the Greek Dependency Treebank. The default output is a CoNLL 2007 document. Another output option is XML files that can be viewed and edited in the TrEd tree editor. A final alternative is UIMA standoff files. For more documentation and ways to cite this work, please visit http://gdt.ilsp.gr and http://www.aclweb.org/anthology/W17-0413.pdf . | |
ILSP | ||
ilsp_chunker (WSDL) | Chunker for Greek texts. Input by default is an XCES document with POS-tagged tokens. Output is a standoff document with chunk annotations. | |
ilsp_depparser (WSDL) | Dependency parser for Greek texts. Input is an XCES document with POS-tagged and lemmatized tokens. Output is a standoff document with labelled dependencies between tokens. | |
ilsp_fbt (WSDL) | FBT part-of-speech tagger for Greek texts. Input is an XCES document with sentence and token boundaries recognised. Output is an XCES document with POS tags assigned to each token. | |
ilsp_lemmatizer (WSDL) | Lemmatizer for Greek texts. Input is an XCES document with POS-tagged tokens. Output is an XCES document with lemmas assigned to each token. | |
ilsp_nerc (WSDL) | Named Entity Recognizer for Greek texts. Input by default is an XCES document with POS-tagged tokens. Output is a standoff document with NE annotations. | |
ilsp_trans (WSDL) | Simple transliterator for Greek texts. Input is a UTF-8 document with Greek text. Output is a UTF-8 document with the transliteration result. | |
ilsp_wikipedia (WSDL) | Using the English Wikipedia category graph and pages as a pivot, this web service extracts multilingual domain-related term and URL lists from Wikipedia. | |
Test and older services | ||
ilsp_nlp_chunks (WSDL) | This service uses ILSP NLP tools to recognize chunks in Greek texts. The input is plain text. The pipeline behind the service detects paragraph, sentence, and token boundaries and generates POS and lemma annotations for each token. It then recognizes chunks based on a grammar compiled into FSTs. The output is a GATE XML document. For more documentation and ways to cite this work, please read http://goo.gl/6ozYut . | |
ilsp_nlp_depparse (WSDL) | This service uses ILSP NLP tools to syntactically analyze Greek texts. The input is plain text. The service detects paragraph, sentence, and token boundaries and generates POS and lemma annotations for each token. It then parses sentences into dependency trees using a dependency parser trained on the Greek Dependency Treebank. The default output is a CoNLL 2007 document. Another output option is XML files that can be viewed and edited in the TrEd tree editor. A final alternative is UIMA standoff files. For more documentation and ways to cite this work, please visit http://gdt.ilsp.gr and http://goo.gl/gFwxw4 . | |
ilsp_nlp_depparse_ud_const (WSDL) | This service, which is still under development, uses ILSP NLP tools to syntactically analyze Greek texts and visualize them as both dependency and constituency trees. | |
ilsp_nlp_nes (WSDL) | This service uses ILSP NLP tools to recognize named entities in Greek texts. The input is plain text. The service detects paragraph, sentence, and token boundaries and generates POS and lemma annotations for each token. It then recognizes NEs based on an approach described in http://goo.gl/R2HXa6. The output is a GATE XML document. | |
ilsp_sst (WSDL) | Sentence splitter and tokenizer for Greek texts. Input by default is plain text. Output by default is a CoNLL-U document with sentence and token boundaries recognized. | |
ilsp_wikipedia_seed_urls (WSDL) | Using the English Wikipedia category graph and pages as a pivot, this web service extracts multilingual domain-related term and URL lists from Wikipedia. | |
syllabification (WSDL) | Performs syllabification on lists of Greek words using rules created by the Greek LaTeX community. |
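
Several of the services listed above return CoNLL-style output (CoNLL 2007 for the dependency parsing services, CoNLL-U for ilsp_sst). The following is a minimal Python sketch of how such output might be read once it has been saved locally. It assumes the standard ten tab-separated columns of these formats, with the token ID, form, lemma, coarse POS tag, head, and dependency relation in columns 1, 2, 3, 4, 7, and 8. The names `Token`, `parse_conll`, and `SAMPLE` are hypothetical, and the sample sentence and its tags are made up for illustration; they are not actual service output.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    index: int    # 1-based position of the token in its sentence
    form: str     # surface form
    lemma: str    # lemma
    pos: str      # coarse POS tag (column 4)
    head: int     # index of the head token, 0 for the root, -1 if unspecified
    deprel: str   # dependency relation to the head


def parse_conll(text: str) -> List[List[Token]]:
    """Split CoNLL-style text into sentences, each a list of Token records."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # CoNLL-U comment lines carry metadata, not tokens.
            continue
        cols = line.split("\t")
        # Skip malformed lines and non-integer IDs (e.g. CoNLL-U ranges like "1-2").
        if len(cols) < 8 or not cols[0].isdigit():
            continue
        head = int(cols[6]) if cols[6].isdigit() else -1
        current.append(Token(int(cols[0]), cols[1], cols[2], cols[3], head, cols[7]))
    if current:
        sentences.append(current)
    return sentences


# Illustrative input only: an invented sentence with invented tags.
SAMPLE = (
    "1\tΗ\tο\tAt\tAt\t_\t2\tAtr\t_\t_\n"
    "2\tγάτα\tγάτα\tNo\tNo\t_\t3\tSb\t_\t_\n"
    "3\tκοιμάται\tκοιμάμαι\tVb\tVb\t_\t0\tPred\t_\t_\n"
    "\n"
)

if __name__ == "__main__":
    for sentence in parse_conll(SAMPLE):
        roots = [tok.form for tok in sentence if tok.head == 0]
        print(len(sentence), "tokens; root:", roots)
```

Keeping each token as a small named record makes it straightforward to look up a token's head by index or to filter tokens by POS tag or dependency relation.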