Vassilis Papavassiliou, 2016-05-31 05:30 PM
Segment Alignment
This module uses the maligna aligner to identify segment pairs in each detected document pair. It generates one TMX file for each cesAlign file (e.g. eng-12_ell-18_x.tmx for eng-12_ell-18_x.xml).
java -Dlog4j.configuration=file:/opt/ilsp-fc/log4j.xml \
  -jar /opt/ilsp-fc/ilsp-fc-2.2.2-jar-with-dependencies.jar \
  -align -lang "eng;lv" -oxslt \
  -i (fullpath of dir with the generated cesAlign) \
  -bs (fullpath and basename on which all files for easier content navigation will be generated) \
  &>"/var/www/tests/eng-ita/log-align_www_esteri_it_eng-ita"
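When checking a finished run, it can help to compute which TMX file is expected for a given cesAlign file. The helper below is a minimal sketch, not part of ilsp-fc: the function name tmx_name_for is hypothetical, and it assumes that only the file extension changes, as in the eng-12_ell-18_x example above.

```shell
# Hypothetical helper (not part of ilsp-fc): derive the expected TMX file name
# from a cesAlign file name. Assumption: only the extension changes, as in
# eng-12_ell-18_x.xml -> eng-12_ell-18_x.tmx.
tmx_name_for() {
  # Strip a trailing ".xml" from the first argument and append ".tmx".
  printf '%s\n' "${1%.xml}.tmx"
}

tmx_name_for "eng-12_ell-18_x.xml"
```

The expected name can then be tested with [ -f ... ] for each cesAlign file to spot document pairs for which no TMX file was written.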
Options
-align : run segment alignment.
-i : crawl path up to the directory auto-generated by the crawl module.
-lang : two- or three-letter ISO code(s) of the target language(s), e.g. el (for a monolingual crawl for Greek content) or en;el (for a bilingual crawl). CesDoc files will be generated only for crawled web documents that are in the targeted language(s).
-bs : basename to be used in generating all files for easier content navigation.
-oxslt : export crawl results with the help of an XSLT file for easier examination of results.
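Since -lang expects two- or three-letter codes separated by a semicolon, a wrapper script could check the value before starting the Java process. The function below is a hypothetical pre-flight check and not something ilsp-fc provides; it only tests the shape of the value (letters and separators), not whether each code is a real ISO language code.

```shell
# Hypothetical pre-flight check (not part of ilsp-fc): succeed only if the
# value is one or more two- or three-letter codes separated by ';'.
# Note: this checks the format only, not membership in the ISO code lists.
valid_lang_arg() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z]{2,3}(;[A-Za-z]{2,3})*$'
}

if valid_lang_arg "eng;lv"; then
  echo "lang value looks well-formed"
fi
```

Remember to quote the value on the command line (as in "eng;lv"), because an unquoted semicolon ends the command in most shells.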