Sentence Alignment Setup » History » Version 4

Prokopis Prokopidis, 2014-08-14 01:41 PM

h1. Sentence Alignment Setup (Linux only)

To obtain sentence alignments as the output of bilingual crawls, an external aligner is required. For the current version of ILSP-FC:
* download the hunalign-1.1 source code from http://mokk.bme.hu/en/resources/hunalign/
* follow the instructions on the hunalign page for building hunalign
* put the hunalign directory containing the hunalign executable next to the runnable ilsp-fc jar.

For example, if you run ilsp-fc from:
<pre>~/ilsp-fc/ilsp-fc-2.2-jar-with-dependencies.jar</pre>

you should have a hunalign directory at:

<pre>~/ilsp-fc/hunalign-1.1/</pre>

with the suggested hunalign directory structure, including:

<pre>~/ilsp-fc/hunalign-1.1/dict
~/ilsp-fc/hunalign-1.1/linux/src/hunalign/hunalign</pre>
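
The layout above can be checked before starting a crawl. The following is only a minimal sketch: <code>check_hunalign_layout</code> is a hypothetical helper (not part of ILSP-FC or hunalign), and it assumes the jar directory is <code>~/ilsp-fc</code> unless another directory is passed as the first argument. It only tests for the two paths listed above.

```shell
#!/bin/sh
# Hypothetical helper: verifies the hunalign layout described on this page.
# Usage (assumed): check_hunalign_layout ~/ilsp-fc && echo "hunalign layout looks OK"
check_hunalign_layout() {
    base="${1:-$HOME/ilsp-fc}"      # directory that holds the runnable ilsp-fc jar
    hun="$base/hunalign-1.1"        # hunalign directory placed next to the jar
    exe="$hun/linux/src/hunalign/hunalign"
    status=0

    # The dict directory that comes with the hunalign directory structure.
    if [ ! -d "$hun/dict" ]; then
        echo "missing directory: $hun/dict" >&2
        status=1
    fi

    # The executable produced by building hunalign.
    if [ ! -f "$exe" ] || [ ! -x "$exe" ]; then
        echo "missing or non-executable file: $exe" >&2
        status=1
    fi

    return "$status"
}
```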

You are now ready to produce TMX files from bilingually crawled data using the <code>-align</code>, <code>-dict</code>, <code>-oft</code> and <code>-ofth</code> options described in the [[GettingStarted|Getting Started]] part of the documentation.