Mysterious segmentation rules
Thread poster: Heinrich Pesch
Heinrich Pesch
Finland
Member (2003)
Finnish to German
Nov 9, 2006

One thing that always bothers me when working with Wordfast is the strange behaviour it shows when choosing the segmentation point.
I know I can influence the rules to some extent by adding items to the abbreviation list and choosing options other than the recommended ones, but why can it not use common sense in the first place?

Just now I had a sentence with "e.g. a fixed ". Wf first segmented after the first full stop. Then I expanded the segment, but it stopped at the next full stop, after the g., and only after that did it take in the whole sentence.
When I look at the segmentation tools, the option "sentence" is not recommended. Why not? What would happen?
I would like Wf to always segment where a full stop is followed by a space and then a capital letter starting the next sentence. But very often Wf grabs two or three sentences into one segment. On the other hand, it stupidly segments at places like "312.15.67", which is rather annoying.
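
The rule wished for above (break after a full stop only when a space and a capital letter follow) can be written as one regular expression. What follows is only a minimal sketch in Python, not Wordfast's actual algorithm; the name `split_sentences` is made up for illustration, and the sketch knows nothing about abbreviations, so "e.g. Wordfast" would still be split.

```python
import re

# Break after ".", "!" or "?" only when whitespace and an uppercase
# letter follow. Illustration of the wished-for rule, not Wordfast code.
BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-ZÄÖÜ])")

def split_sentences(text):
    """Split text at every position matched by BOUNDARY."""
    return [part for part in BOUNDARY.split(text) if part]

if __name__ == "__main__":
    for segment in split_sentences("Use e.g. a fixed cable. Call 312.15.67 now. Then wait."):
        print(segment)
```

Under this rule "312.15.67" stays in one piece because no space follows its full stops, and "e.g. a fixed" stays together because a lowercase letter follows the space.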

What segmentation rules have you tried out and what do you use?

Regards
Heinrich


 
Valters Feists
Latvia
English to Latvian
experiment with finer settings... Nov 9, 2006

I also expand segments manually quite often. It's not that terrible if you know the keyboard shortcut (Alt+PgDn).
Are you also aware of manual page breaks versus paragraph-end characters, and non-breaking spaces versus normal spaces?

I think you can fiddle with Wf's setup->segs->end-of-segment punctuation (ESP) + abbreviations. You can enter your own items in the abbreviations and ESP boxes (you have to do it carefully). This could depend on the Wf version, though.
A while ago I was looking for a way to make the regular space character act as a segment delimiter (so that one TU = one word), which unfortunately doesn't seem to be possible, so I have to resort to an oblique replace-all-and-later-unreplace-all routine.
Apparently there are some things in Wf that you just can't control. :-/
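
The post does not say what the spaces are replaced with. As a rough sketch of such a replace-and-restore routine, the Python below assumes each space becomes a line break (so that every word ends up in its own paragraph) and that the original line breaks are tagged with a sentinel character so they can be told apart later. The function names and the choice of sentinel are hypothetical.

```python
# U+2063 (invisible separator) is assumed not to occur in the document.
SENTINEL = "\u2063"

def words_to_paragraphs(text):
    """Tag the original line breaks, then put every word on its own line."""
    marked = text.replace("\n", SENTINEL + "\n")
    return marked.replace(" ", "\n")

def paragraphs_to_words(text):
    """Undo words_to_paragraphs: rejoin the words, keep the original breaks."""
    paragraphs = text.split(SENTINEL + "\n")
    return "\n".join(p.replace("\n", " ") for p in paragraphs)

if __name__ == "__main__":
    original = "one two\nthree four"
    exploded = words_to_paragraphs(original)
    print(exploded.split("\n"))
    print(paragraphs_to_words(exploded) == original)
```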

Regards,
Valters Feists
Technical Latvian translator


 
Gerard de Noord
France
Member (2003)
English to Dutch
Don't make any special settings Nov 9, 2006

Hi Heinrich,

You shouldn't make any special settings if you want your segments to be Trados-compatible. Full sentences aren't.

When you encounter e.g., select those four characters and press Ctrl+Alt+T to add the abbreviation to the list of abbreviations. The text will be resegmented.

Regards,
Gerard


 
Heinrich Pesch
Finland
Member (2003)
Finnish to German
TOPIC STARTER
Why Trados compatible? Nov 9, 2006

I would rather it were common-sense compatible.

If there is a full stop plus a space plus a capital letter, I would really like the segmentation to take place there and not two sentences later.
Can anybody explain why sentence segmentation is not recommended?
In my experience, Wf segmentation rules differ from Trados at least in lists where the items are numbered German fashion:
1.)
2.)
etc.

Brackets are a problem for Wf too. I often encounter situations where a ")." is left for the next segment and I cannot get Wf to segment after it; instead it jumps back and forth, either too far or not far enough.

Perhaps the reason for this is the fact that Wf is French, and the French have some strange punctuation rules, if I remember right?

Regards
Heinrich

[Edited at 2006-11-09 18:53]


 
Philippe Etienne
Spain
Member
English to French
I love it Nov 9, 2006

Heinrich Pesch wrote:

I would rather it were common-sense compatible
...


I am afraid the meaning of common sense is somewhat lost nowadays...
Thanks for the laugh
Philippe


 
Valters Feists
Latvia
English to Latvian
Wf 3.35 - more or less the following settings... Nov 10, 2006

In setup/segs:

1) Add e.g. to the list of abbreviations.
Your list can be for example "Inc.,Corp.,Ltd.,e.g." (separate with commas)
2) You can leave . (full stop) in the ESP box,
3)...but make sure you uncheck "An ESP without a trailing space ends a segment",
4) uncheck also "An ESP + a space + a lowercase end a segment".
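
To see how these four settings could interact, here is a toy model in Python. It is emphatically not Wordfast's real algorithm, only a sketch of how an abbreviation list, an ESP box and the two checkboxes might combine; the function `segment` and its parameter names are invented for this illustration.

```python
def segment(text, abbreviations=("Inc.", "Corp.", "Ltd.", "e.g."),
            esp=".", no_space_ends=False, lowercase_ends=False):
    """Toy segmenter mimicking the settings listed above (an assumption)."""
    segments, start = [], 0
    for i, ch in enumerate(text):
        if ch not in esp or i == len(text) - 1:
            continue
        head = text[start:i + 1]
        if any(head.endswith(a) for a in abbreviations):
            continue  # the full stop belongs to a listed abbreviation
        nxt = text[i + 1]
        if nxt != " ":
            # models "An ESP without a trailing space ends a segment"
            ends = no_space_ends
        else:
            # models "An ESP + a space + a lowercase end a segment"
            after = text[i + 2:i + 3]
            ends = lowercase_ends or not after.islower()
        if ends:
            segments.append(head)
            start = i + 2 if nxt == " " else i + 1
    if start < len(text):
        segments.append(text[start:])
    return segments

if __name__ == "__main__":
    sample = "Use a cable, e.g. a fixed one. Call 312.15.67 now."
    print(segment(sample))
    print(segment("Call 312.15.67 now.", no_space_ends=True))
```

In this model, leaving both options unchecked keeps "e.g. a fixed one." and "312.15.67" intact, while switching `no_space_ends` on reproduces the unwanted break inside the number.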

P.S.
In French punctuation, a space character comes before exclamation and question marks; it also separates quote marks from words, e.g., « merci ! » .

Regards,
Valters Feists
Technical Latvian translator


 
Heinrich Pesch
Finland
Member (2003)
Finnish to German
TOPIC STARTER
Thanks Valters Nov 12, 2006

I implemented the settings you suggested. So far I haven't noticed any changes. Wf continues to merge two sentences into one segment in certain places.
But when I try Trados 7.5, I notice Trados has no difficulties with the same text. So Wf segmentation rules are definitely not Trados-compatible.
The same text was also translated on another machine with a different Wf and different settings, and the result is the same as in my case.

Sorry, I cannot cite the text, as it is confidential, but they are normal sentences of the ". T" model (full stop, space, capital letter).

Regards
Heinrich


 
Valters Feists
Latvia
English to Latvian
could it be...? Nov 12, 2006

Could it be that the sentences are separated by a non-breaking space (nbs) character instead of a simple space? Check by switching on the inverted "P" (¶) button; the nbs characters will then be shown as degree signs, and normal spaces as middle dots. I think Wf cannot be trained to handle nbs's... my option would be to replace them and restore them afterwards while translating.
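
If the text is available outside Word, the same check can be scripted. The Python below is a small sketch under that assumption: it lists the full stops that are followed by a no-break space (U+00A0) and swaps only those for normal spaces, leaving other no-break spaces (for example between a number and its unit) alone. Both function names are hypothetical, and restoring the original characters after translation is not covered.

```python
NBSP = "\u00a0"  # no-break space

def nbsp_after_stop(text):
    """Return the offsets of full stops directly followed by a no-break space."""
    return [i for i in range(len(text) - 1)
            if text[i] == "." and text[i + 1] == NBSP]

def normalise_sentence_spaces(text):
    """Replace a no-break space with a normal space, but only after a full stop."""
    return text.replace("." + NBSP, ". ")

if __name__ == "__main__":
    sample = "First sentence." + NBSP + "Second one. 5" + NBSP + "kg."
    print(nbsp_after_stop(sample))
    print(normalise_sentence_spaces(sample))
```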

 
Heinrich Pesch
Finland
Member (2003)
Finnish to German
TOPIC STARTER
No, they are normal spaces Nov 12, 2006

This phenomenon is so common that I never thought about it before. Almost every document has such stumbling blocks for no apparent reason, except that the sentences include numbers and abbreviations.
Cheers
Heinrich


 
Mick De Meyer
Belgium
English to Dutch
More weird segmentation, any help? Apr 17, 2011

Hi all,

Here is a specific problem I'm having with segmentation: the curly brackets.

I often need to translate this kind of sentence string:

Sentence number one.{end_li}{li}Sentence number two.{end_li}{end_para}{breakline}{para}Sentence number three.

... and so on. However, Wordfast mysteriously decides that this is a segment:

Sentence number one.{

How on earth can I simply teach it to segment before the curly bracket? How is it at all logical to end a segment with an open bracket?

If anyone knows how this is done, I would be very grateful!
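
For reference, the segmentation being asked for (text runs and tag runs kept apart) can be described with a single regular expression. This Python sketch works on the raw string outside Wordfast and is not a Wordfast setting; it assumes the placeholders are always of the `{name}` form shown above, and the function name `split_on_tags` is made up.

```python
import re

# One or more consecutive {tag} placeholders, captured so that
# re.split keeps them in the result.
TAG_RUN = re.compile(r"((?:\{[^{}]*\})+)")

def split_on_tags(text):
    """Split text into alternating runs of plain text and {tag} placeholders."""
    return [part for part in TAG_RUN.split(text) if part]

if __name__ == "__main__":
    sample = ("Sentence number one.{end_li}{li}Sentence number two."
              "{end_li}{end_para}{breakline}{para}Sentence number three.")
    for part in split_on_tags(sample):
        print(part)
```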


 

