Version 2 (Prokopis Prokopidis, 2014-08-13 02:49 PM) → Version 3/12 (Vassilis Papavassiliou, 2014-08-13 05:21 PM)

h1. Sentence Alignment Setup (Linux only)

To obtain sentence alignments as the output of bilingual crawls, ILSP-FC needs an external aligner. For the current version of ILSP-FC:
* download the hunalign-1.1 source code from http://mokk.bme.hu/en/resources/hunalign/
* follow the instructions on the hunalign page for building hunalign
* put the hunalign directory containing the hunalign executable next to the runnable ilsp-fc jar.

For example, if you run ilsp-fc from:
<pre>~/ilsp-fc/ilsp-fc-2.2-jar-with-dependencies.jar</pre>

you should have a hunalign directory at

<pre>~/ilsp-fc/hunalign-1.1/</pre>

with the suggested hunalign directory structure, including

<pre>~/ilsp-fc/hunalign-1.1/dict
~/ilsp-fc/hunalign-1.1/linux/src/hunalign/hunalign</pre>
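The layout above can be checked before starting a crawl. The following is a minimal sketch, not part of ILSP-FC or hunalign: the function name <code>check_hunalign_layout</code> is hypothetical, and it assumes only the two paths listed above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with ILSP-FC or hunalign).
# Checks that a hunalign-1.1 directory with the expected layout sits in
# the directory given as the first argument (the one holding the jar).
check_hunalign_layout() {
    base="$1"
    hun="$base/hunalign-1.1"
    bin="$hun/linux/src/hunalign/hunalign"
    status=0

    # The dictionary directory must exist.
    if [ ! -d "$hun/dict" ]; then
        echo "missing dictionary directory: $hun/dict" >&2
        status=1
    fi

    # The hunalign binary must be a regular, executable file.
    if [ ! -f "$bin" ] || [ ! -x "$bin" ]; then
        echo "missing or non-executable binary: $bin" >&2
        status=1
    fi

    return "$status"
}
```

With the example paths above, <code>check_hunalign_layout ~/ilsp-fc && echo "hunalign layout OK"</code> would print the confirmation only when both paths are in place.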

You are now ready to produce TMX files from bilingual crawl data using the <code>-align</code>, <code>-dict</code>, <code>-oft</code> and <code>-ofth</code> options described in the [[GettingStarted|Getting Started]] part of the documentation.