How to convert a TBX file into a Translation Memory
Thread poster: Adrián LY
Adrián LY
Spain
Local time: 23:14
English to Spanish
Mar 29, 2021

Hello guys.

A while ago, I was looking for free translation memories for Trados and came across a TBX file (IATE, a free file provided by the European Union). I tried to convert it to .sdltb using Glossary Converter, but it gave me all kinds of errors (the file is too big, the format is incorrect, etc.). I know this TM is used by Wordfast Anywhere (it has about 700,000 terms), but I don't know how to make it work.

My questions are:

Is there a way to convert the TBX file without Glossary Converter?

Could I extract the TM from Wordfast Anywhere somehow?

Thank you!


 
Andriy Yasharov  Identity Verified
Ukraine
Local time: 00:14
Member (2008)
English to Russian
+ ...
Goldpan TMX/TBX Editor Mar 29, 2021

Goldpan TMX/TBX Editor can help with your task.

https://logrusglobal.com/goldpan.html


 
Adrián LY
Spain
Local time: 23:14
English to Spanish
TOPIC STARTER
Privacy concerns Mar 29, 2021

Andriy Yasharov wrote:

Goldpan TMX/TBX Editor can help with your task.

https://logrusglobal.com/goldpan.html


That program looks good, but it asks for far too many personal details:

"Your LinkedIn or Facebook profile must have information about your professional standing".
"You must be a member of Localization Professional group on LinkedIn or Facebook"

Why do they need that data? It seems like an artificial way to gatekeep the app, if you ask me.

I am a freelance translator who is just starting out and wants to get some experience with CAT tools.

[Edited at 2021-03-29 11:15 GMT]


 
Milan Condak  Identity Verified
Local time: 23:14
English to Czech
Xbench Mar 29, 2021

Adrián L. wrote:

My questions are:

Is there a way to convert the TBX file without Glossary Converter?



Hi Adrián,

I made a presentation in Czech. I hope you will understand my pictures.

www.condak.cz/nove/2021-02/27/cs/02.html

Here is machine translation CS > EN:

In 2019 I downloaded a large ZIP file that contained all languages; I extracted a few language options and the thematic area into a TBX file.

In Xbench, I converted the TBX to TMX by importing it and exporting it again.

I imported the TMX into a TMLookup database.

From the TMLookup database, I exported a TXT file, which is the finished glossary for OmegaT.
--
Glossaries for OmegaT are in UTF-8; glossaries for Wordfast are in UTF-16 (LE or BE).
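If a glossary has already been exported as UTF-8 text for OmegaT, the same file can be re-encoded for Wordfast. The following is only a minimal sketch: the function name and file names are made up for illustration, and writing little-endian UTF-16 with a byte-order mark is an assumption, not a documented Wordfast requirement.

```python
from pathlib import Path


def reencode_glossary(src: Path, dst: Path) -> None:
    """Copy a UTF-8 text glossary to UTF-16 LE with a byte-order mark."""
    # "utf-8-sig" accepts input with or without a UTF-8 BOM.
    # newline="" keeps the original line endings untouched.
    with src.open("r", encoding="utf-8-sig", newline="") as fin, \
            dst.open("w", encoding="utf-16-le", newline="") as fout:
        fout.write("\ufeff")  # explicit BOM so tools can detect the byte order
        for line in fin:      # line by line, so large glossaries are fine
            fout.write(line)


# Hypothetical usage:
# reencode_glossary(Path("iate_en_es_utf8.txt"), Path("iate_en_es_utf16.txt"))
```

Going the other way (UTF-16 to UTF-8) is the same loop with the two encodings swapped.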

Milan

I am a user of Goldpan TBX Editor for creating TBX files.

[Edited at 2021-03-29 14:37 GMT]


 
Stepan Konev  Identity Verified
Russian Federation
Local time: 01:14
English to Russian
Goldpan is a good tool Mar 29, 2021

If you want privacy, you should never use any social network, or even the Internet in general. My only regret about Goldpan is that it does not have an Undo feature. Unlike other converters that do just one thing (convert A to B), Goldpan lets you perform several actions on your files: edit, replace, copy, paste, find and replace, even use regular expressions, etc. One wrong move is enough to force you to start the entire job from scratch. (Other converters do not carry this risk, not because they are smarter, but because they simply do not offer such functionality.) In other respects, Goldpan is a very powerful tool and well worth providing a link to your Facebook page. I never received any sort of spam from them. They just asked me once to give my feedback. That's it.

[Edited at 2021-03-29 15:36 GMT]


 
Samuel Murray  Identity Verified
Netherlands
Local time: 23:14
Member (2006)
English to Afrikaans
+ ...
@Adrián Mar 29, 2021

Adrián L. wrote:
I came across a TBX file (IATE, a free file given by the European Union). ... Is there a way to convert the TBX file [to TMX]?


I also have a version of that file (downloaded in 2018, 2 GB unzipped). I tried it in various tools just now. Locamotion's tbx2po gives an error message. Xbench 2.9 opens it without complaining but I can't figure out how to export anything from it (and besides, 2.9 may not preserve the Unicode characters correctly anyway).

Goldpan says that the file is too big and that I have to split it into smaller files using Batch Tools > Split. (GoldenDict gives a similar error message.) Well, when I try to do that in Goldpan, I get an error message that is apparently related to a Windows setting, saying that DTD processing is not allowed for some or other security reason. I solved this by opening the TBX file in a text editor (I used Akelpad) and just deleted the DOCTYPE line (line 2 of the file). Then Goldpan split the file without any complaint into 100 MB files, which even the Trados Glossary Converter appeared to accept. I'm not sure what effect deleting the DOCTYPE line would have, but I doubt if the effect would be great.
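For a 2 GB file, a text editor may struggle, so the DOCTYPE line can also be dropped with a short script. This is only a rough sketch of the same manual fix: the function name is hypothetical, and it assumes the declaration sits on a single line of its own near the top of the file (line 2 in the file described above).

```python
import shutil


def strip_doctype(src_path, dst_path, max_header_lines=10):
    """Copy src to dst, skipping a single-line DOCTYPE near the top.

    Returns True if a DOCTYPE line was found and left out.
    """
    removed = False
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Only inspect the first few lines; the declaration belongs in the header.
        for _ in range(max_header_lines):
            line = src.readline()
            if not line:
                break
            if not removed and line.lstrip().startswith(b"<!DOCTYPE"):
                removed = True
                continue
            dst.write(line)
        # Stream the rest of the file unchanged, without loading it into memory.
        shutil.copyfileobj(src, dst)
    return removed


# Hypothetical usage:
# strip_doctype("IATE_export_29082018.tbx", "IATE_export_no_doctype.tbx")
```

The file is handled as raw bytes, so the script never has to guess the text encoding; it does assume an ASCII-compatible encoding such as UTF-8, so that the `<!DOCTYPE` bytes can be matched directly.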


 
Stepan Konev  Identity Verified
Russian Federation
Local time: 01:14
English to Russian
Heartsome Mar 29, 2021

It took 22 minutes for Heartsome to convert IATE_export_26022019.tbx to IATE_export_26022019.tmx. Plus 7 minutes to open.
However, you can download the TBX files already split by Paul Filkin of SDL from here.

[Edited at 2021-03-29 19:42 GMT]


 
Milan Condak  Identity Verified
Local time: 23:14
English to Czech
A short presentation Mar 29, 2021

Samuel Murray wrote:
Xbench 2.9 opens it without complaining but I can't figure out how to export anything from it (and besides, 2.9 may not preserve the Unicode characters correctly anyway).


For Unicode you need the licensed version 3.x. That is not Adrián's case.

The presentation:

Xbench: TBX to TMX

http://www.condak.cz/nove/2021-03/29/en/00.html

Export TXT or TMX. My output format is TMX.

HTH

Milan


 
Samuel Murray  Identity Verified
Netherlands
Local time: 23:14
Member (2006)
English to Afrikaans
+ ...
@Milan Mar 30, 2021

Milan Condak wrote:
Samuel Murray wrote:
Xbench 2.9 opens it without complaining but I can't figure out how to export anything from it.

http://www.condak.cz/nove/2021-03/29/en/00.html


Hmm, it may be that my Xbench 2.9 does not actually open the file appropriately.

When I create a project, I get this screen (which you also get):

[Screenshot 01]

and when I click OK, I get this message:

[Screenshot 02]

and when I choose YES, I get this dialog:

[Screenshot 03]

but there is nothing I can do on that dialog. I can tick or untick the box, but it doesn't change anything. So, based on this, it may be that none of the terms are actually imported by Xbench to begin with. This would explain why it keeps exporting a TMX file with zero segments in it.
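One way to confirm outside any CAT tool that an exported TMX really contains zero segments is to count its <tu> (translation unit) elements. A minimal sketch, with a hypothetical function name and assuming a plain TMX file without XML namespaces:

```python
import xml.etree.ElementTree as ET


def count_translation_units(tmx_path):
    """Return the number of <tu> elements in a TMX file."""
    count = 0
    # iterparse streams the file, so a large TMX is not loaded all at once.
    for _event, elem in ET.iterparse(tmx_path, events=("end",)):
        if elem.tag == "tu":
            count += 1
            elem.clear()  # drop the segment text that has already been counted
    return count


# Hypothetical usage:
# print(count_translation_units("export_from_xbench.tmx"))
```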

I use drag and drop to add the TBX file to the project, but I see you use the "Add..." button. I tried using the "Add..." button. At first, the dialog has only three tabs:

[Screenshot 04]

but after clicking Next, it opens a fourth tab, as in your screenshots, but: no language codes:

[Screenshot 05]

My test TBX file is IATE_export_29082018.tbx (1.92 GB). It is here, zipped (112 MB):
https://wsi.li/dl/FnRz3k7omJvaujYg5/d7475b


[Edited at 2021-03-30 06:59 GMT]


 
Milan Condak  Identity Verified
Local time: 23:14
English to Czech
Second short presentation with animation Mar 30, 2021

Samuel,

I cannot see which TBX file you are opening.
--
http://www.condak.cz/nove/2021-03/30/en/00.html

TMX NL-CS 2019-10; see the animation.

Notes:

Name the project after importing the file(s).

Name the exported TMX before exporting it.

Milan


 
Clarisa Moraña  Identity Verified
United States
Local time: 17:14
Member (2002)
English to Spanish
+ ...
convert them into a termbase Mar 30, 2021

IATE termbases are huge and consist mainly of terms, so I would recommend converting them into termbases. They are very useful that way. I have exported those databases by specific field, such as Mechanical engineering, Electronics, Wood, Banks, and so on, and I attach them to my translation projects as termbases.

 
Adrián LY
Spain
Local time: 23:14
English to Spanish
TOPIC STARTER
Solved Mar 30, 2021

Milan Condak wrote:

Adrián L. wrote:

My questions are:

Is there a way to convert the TBX file without Glossary Converter?



Hi Adrián,

I made a presentation in Czech. I hope you will understand my pictures.

www.condak.cz/nove/2021-02/27/cs/02.html

Here is machine translation CS > EN:

In 2019 I downloaded a large ZIP file that contained all languages; I extracted a few language options and the thematic area into a TBX file.

In Xbench, I converted the TBX to TMX by importing it and exporting it again.

I imported the TMX into a TMLookup database.

From the TMLookup database, I exported a TXT file, which is the finished glossary for OmegaT.
--
Glossaries for OmegaT are in UTF-8; glossaries for Wordfast are in UTF-16 (LE or BE).

Milan

I am a user of Goldpan TBX Editor for creating TBX files.

[Edited at 2021-03-29 14:37 GMT]


Thank you Milan. This was exactly what I was looking for. You seem pretty knowledgeable about this stuff.

Cheers!


 
Stepan Konev  Identity Verified
Russian Federation
Local time: 01:14
English to Russian
The task has changed from 2GB to 505KB Mar 30, 2021

Samuel Murray wrote:
My test TBX file is IATE_export_29082018.tbx (1.92 GB).

You work with the entire glossary file with all languages included, while Milan Condak extracted just 2 languages. Obviously, if we are now talking about only 2 languages, or about extracting specific fields, any converter can do the task. It took Glossary Converter less than a minute to convert 2 languages without any errors. The same applies to Heartsome. However, the original task was a bit different...

[Edited at 2021-03-30 18:33 GMT]


 
Samuel Murray  Identity Verified
Netherlands
Local time: 23:14
Member (2006)
English to Afrikaans
+ ...
@Stepan, @Milan Mar 30, 2021

Stepan Konev wrote:
Samuel Murray wrote:
My test TBX file is IATE_export_29082018.tbx (1.92 GB).

You work with the entire glossary file with all languages included, while Milan Condak extracted just 2 languages.

It also occurred to me that perhaps Xbench can't handle TBX files with more than two languages in it. Or maybe it is just a simple thing that we need to change in the TBX header... who knows.

Milan Condak wrote:
Samuel, I do not see a TBX file you are opening.

My test TBX file is IATE_export_29082018.tbx (1.92 GB). It is here, zipped (112 MB):
https://wsi.li/dl/FnRz3k7omJvaujYg5/d7475b
https://we.tl/t-3hU9DyCOFH


[Edited at 2021-03-30 22:01 GMT]


 
Milan Condak  Identity Verified
Local time: 23:14
English to Czech
Xbench is for language pair Mar 31, 2021

Samuel Murray wrote:

Stepan Konev wrote:
Samuel Murray wrote:
My test TBX file is IATE_export_29082018.tbx (1.92 GB).

You work with the entire glossary file with all languages included, while Milan Condak extracted just 2 languages.

It also occurred to me that perhaps Xbench can't handle TBX files with more than two languages in it. Or maybe it is just a simple thing that we need to change in the TBX header... who knows.

Milan Condak wrote:
Samuel, I do not see a TBX file you are opening.

My test TBX file is IATE_export_29082018.tbx (1.92 GB). It is here, zipped (112 MB):
https://wsi.li/dl/FnRz3k7omJvaujYg5/d7475b
https://we.tl/t-3hU9DyCOFH


[Edited at 2021-03-30 22:01 GMT]


The first step is to extract a TBX language pair from the downloaded ZIP file.
Xbench supports many bilingual file formats.
A TXT export from Xbench contains a lot of useless data.
A TXT file produced from the TMX contains clean "glossary" data.
If you need a multilingual TMX, you can align one from several TXT files.
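For anyone who prefers a script to the Xbench and TMLookup route, pulling one language pair out of a multilingual TBX can be sketched roughly as below. This is not what any of the tools mentioned here do internally. It assumes an older-style TBX without XML namespaces, with termEntry, langSet (carrying xml:lang) and term elements; the function name, file names and the "en"/"es" defaults are placeholders.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this expanded name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tbx_pair_to_glossary(tbx_path, out_path, src_lang="en", tgt_lang="es"):
    """Write source/target term pairs from a TBX file as tab-separated UTF-8.

    Returns the number of glossary lines written.
    """
    src_lang, tgt_lang = src_lang.lower(), tgt_lang.lower()
    rows = 0
    with open(out_path, "w", encoding="utf-8", newline="\n") as out:
        # iterparse streams the file instead of loading 2 GB into memory.
        for _event, entry in ET.iterparse(tbx_path, events=("end",)):
            if entry.tag != "termEntry":
                continue
            terms = {src_lang: [], tgt_lang: []}
            for lang_set in entry.iter("langSet"):
                lang = (lang_set.get(XML_LANG) or "").lower()
                if lang in terms:
                    for term in lang_set.iter("term"):
                        text = " ".join((term.text or "").split())
                        if text:
                            terms[lang].append(text)
            # Pair every source term with every target term of the same entry.
            for source in terms[src_lang]:
                for target in terms[tgt_lang]:
                    out.write(f"{source}\t{target}\n")
                    rows += 1
            # Free the entry's children; the emptied shell stays in the tree.
            entry.clear()
    return rows


# Hypothetical usage:
# tbx_pair_to_glossary("IATE_export.tbx", "iate_en_es.txt", "en", "es")
```

Entries that lack one of the two languages simply produce no lines, and whitespace inside a term is collapsed to single spaces so that a stray tab or line break cannot corrupt the tab-separated output.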

In 2021 you have to request that the data be generated on demand:

http://www.condak.cz/nove/2021-02/27/cs/03.html

Milan


 