Segment Alignment » History » Version 3

Vassilis Papavassiliou, 2016-05-31 05:30 PM

# Segment Alignment

This step uses the maligna aligner to identify segment pairs in each detected document pair. It generates a TMX file for each cesAlign file (e.g. eng-12_ell-18_x.tmx for eng-12_ell-18_x.xml).

```
java -Dlog4j.configuration=file:/opt/ilsp-fc/log4j.xml \
-jar /opt/ilsp-fc/ilsp-fc-2.2.2-jar-with-dependencies.jar \
-align -lang "eng;lv" -oxslt -i (full path of dir with the generated cesAlign files) \
-bs (full path and basename on which all files for easier content navigation will be generated) \
&>"/var/www/tests/eng-ita/log-align_www_esteri_it_eng-ita"
```
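
Since one TMX file is expected per cesAlign file, a quick check after a run is to list the cesAlign files that have no TMX counterpart. The sketch below is not part of ilsp-fc. The function name `missing_tmx` is hypothetical, and it assumes that each TMX file is written next to its cesAlign file and that cesAlign files can be recognised by an underscore in their name, as in the example above. Adjust the glob pattern if your layout differs.

```shell
# List cesAlign XML files in a directory that have no matching TMX file.
# Assumptions (not confirmed by the tool's documentation):
#   - the TMX file sits in the same directory as the cesAlign file
#   - cesAlign file names contain an underscore, e.g. eng-12_ell-18_x.xml
missing_tmx() {
    dir=$1
    for xml in "$dir"/*_*.xml; do
        [ -e "$xml" ] || continue          # the glob matched nothing
        tmx="${xml%.xml}.tmx"              # same basename, .tmx extension
        [ -e "$tmx" ] || printf '%s\n' "$xml"
    done
}
```

Calling `missing_tmx` with the directory passed to `-i` prints one path per cesAlign file that is still missing its TMX file, and prints nothing when every pair was aligned.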

## Options

```
-align  : perform segment alignment

-i      : crawlpath up to the auto-generated dir by the crawl module

-lang   : two- or three-letter ISO code(s) of the target language(s),
          e.g. el (for a monolingual crawl for Greek content) or en;el (for a bilingual crawl).
          CesDoc files will be generated only for crawled web documents that are in the targeted language(s)

-bs     : basename to be used in generating all files for easier content navigation

-oxslt  : export crawl results with the help of an XSLT file for better examination of results
```
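
One practical detail when passing these options from a shell: the `-lang` value for a bilingual run contains a semicolon, which the shell treats as a command separator unless the value is quoted. The following sketch only prints the argument list, one argument per line, so the quoting can be inspected before anything is run. The function name `align_cmd` and its three parameters are hypothetical. The jar path is the one used in the command above.

```shell
# Print the aligner invocation, one argument per line, without running it.
# The three parameters are placeholders supplied by the caller.
align_cmd() {
    lang=$1     # e.g. "eng;lv"; keep it quoted, ';' would otherwise end the command
    indir=$2    # directory with the generated cesAlign files
    base=$3     # full path and basename for the generated files
    printf '%s\n' java -jar /opt/ilsp-fc/ilsp-fc-2.2.2-jar-with-dependencies.jar \
        -align -lang "$lang" -oxslt -i "$indir" -bs "$base"
}
```

For example, `align_cmd "eng;lv" /path/to/xml /path/to/output/basename` shows `eng;lv` as a single argument. Without the quotes the shell would try to run `lv` as a separate command.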