Re: [Po4a-devel]HTML translating

Wednesday, 10 November 2004

On Wed, Nov 10, 2004 at 05:36:29PM +0000, Yves Rutschle wrote:
...
 On Mon, Nov 08, 2004 at 02:22:42PM +0100, Martin Quinson wrote:
 > Ok. I wanted to reply to this message the way it deserves (with a long
 > argument to back up my point)

 Thank you for sharing your experience; I'm getting convinced
 now. 
Thanks for your patience. I'll have to be even shorter tonight...

...
 [splitting into HTML blocks]
 > > That's actually fairly easily achievable: the list of
 > > paragraph-marking tags is fairly small (<p>, <div>,
 > > <h1,2,3,4,...>) and XHTML makes it mandatory for text to be
 > > included in a block-level element of some sort.
 > 
 > You thus have to show some formatting tags to the translators. We do so in
 > all other modules. I don't see any better idea.
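
To make the block-splitting idea concrete, here is a minimal sketch in
Python (not po4a code, and not how Html.pm works). It assumes the short tag
list quoted above is enough, and the names BLOCK_TAGS, BlockSplitter and
split_blocks are made up for the illustration. Each block-level element
becomes one unit for the translator, and inline tags are kept verbatim
inside it.

```python
from html.parser import HTMLParser

# Assumed, non-exhaustive set of paragraph-marking tags for this sketch.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6"}


class BlockSplitter(HTMLParser):
    """Collect the content of each block-level element as one string."""

    def __init__(self):
        # Keep entities such as &amp; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.blocks = []    # finished translatable units
        self._current = []  # pieces of the unit being built

    def _flush(self):
        # Join the pieces and normalize whitespace; skip empty units.
        text = " ".join("".join(self._current).split())
        if text:
            self.blocks.append(text)
        self._current = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()
        else:
            # Inline tag: keep it verbatim so the translator sees it.
            self._current.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing inline tag such as <br/>: keep it verbatim.
        self._current.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()
        else:
            self._current.append(f"</{tag}>")

    def handle_data(self, data):
        self._current.append(data)

    def handle_entityref(self, name):
        self._current.append(f"&{name};")

    def handle_charref(self, name):
        self._current.append(f"&#{name};")


def split_blocks(html):
    """Return the list of translatable units found in an HTML string."""
    parser = BlockSplitter()
    parser.feed(html)
    parser.close()
    parser._flush()  # text left over after the last block tag
    return parser.blocks
```

Calling split_blocks() on a page gives one string per heading, paragraph or
div, which is roughly what would end up as one msgid each. Nested block
elements and text outside any block are handled only crudely here.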

 Ok. Well, I'm afraid that means I'm gonna have to ditch the
 current Html.pm and redo one from scratch (bar a couple of
 routines that may be rescued). 
I see three ways to implement an HTML module:
  - pretend HTML is an XML dialect (XHTML is), and use Jordi's parser.
    It should be about 20 lines long. See the Guide module for an example.
  - pretend HTML is an SGML dialect, and use the Sgml module for that. It
    will work if all HTML pages begin with a prolog stating the DTD. That
    should be the case, shouldn't it?
    Then you have to list all tags in the relevant lists around line 400 of
    Sgml.pm. Just add a "    } elsif ($prolog =~ /html/i) {" block, and
    do the same as for the other DTDs.
  - recognize that HTML is unique. You have to implement a whole new module
    in that case. You may well want to check how we did it for the sgml and
    xml modules. The best approach may be to try translating a file with
    each of them, or so.
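
For the second option, the prolog test boils down to something like the
sketch below. It is written in Python only to illustrate the idea; it is
not the Sgml.pm code, and the names PROLOG_RE, KNOWN_DTDS and guess_dtd are
invented for the example. The DTD names listed are an assumption about what
a dispatcher would look for, with "html" added last, like the extra elsif
branch.

```python
import re

# Matches the DOCTYPE declaration; internal subsets containing ">" are
# not handled in this sketch.
PROLOG_RE = re.compile(r"<!DOCTYPE\b[^>]*>", re.IGNORECASE)

# Assumed list of DTD names to look for, in order of precedence.
KNOWN_DTDS = ("debiandoc", "docbook", "html")


def guess_dtd(document):
    """Return the DTD name found in the prolog, or None if there is none."""
    match = PROLOG_RE.search(document)
    if match is None:
        return None  # no prolog: this approach cannot work for the page
    prolog = match.group(0).lower()
    for name in KNOWN_DTDS:
        if name in prolog:
            return name
    return None
```

A page without any DOCTYPE line returns None, which is exactly the case
where pretending HTML is SGML breaks down.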

...
 This is a <a
 href="blahblah.com/this/that/blah.html">link</a> to <img
src="blahblah.com/this/that/blah.png" alt="blah"
title="Blah">

 [doesn't] belong in a PO.

 So I'd propose to collapse the inside of long inline tags,
 so as to simply state there is a tag (e.g. "you're in a
 link") without detailing what the tag contains. Thus, the
 example line would appear, in the PO, as:

 This is a <a>link</a> to <img>blah</img> 
I'm not fond of this, because if the translator wants or has to reorder the
links, you'll have trouble. Check the gettext info file, in the section
explaining what "%2$s" is good for. It's not impossible, but you have to
deal with it.
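
One possible way to deal with it, in the spirit of "%2$s", is to number the
collapsed tags so that each one can be matched back to its attributes
wherever the translator moves it. The Python sketch below only illustrates
that idea under this assumption; the <a1>/<img2> naming and the collapse()
and expand() helpers are hypothetical, not something po4a provides.
Translating the alt text is left out.

```python
import re

# Opening <a ...> and <img ...> tags; closing tags are left as they are.
TAG_RE = re.compile(r"<(a|img)\b([^>]*)>", re.IGNORECASE)


def collapse(line):
    """Replace each long inline tag with a short numbered one.

    Returns the collapsed string and a table mapping each short name
    back to the original tag text.
    """
    table = {}

    def shorten(match):
        key = f"{match.group(1).lower()}{len(table) + 1}"
        table[key] = match.group(0)
        return f"<{key}>"

    return TAG_RE.sub(shorten, line), table


def expand(translated, table):
    """Put the original tags back, whatever order they now appear in."""
    return re.sub(r"<((?:a|img)\d+)>",
                  lambda match: table[match.group(1)],
                  translated)
```

The translator would then see "This is a <a1>link</a> to <img2>" and could
move <img2> in front of <a1> without the attributes getting swapped when
the page is rebuilt.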

...
 [HTML::Parser vs Jordi's XML parser]
 > Moreover, I'd be pleased to cut a dependency. I hate unjustified
 > dependencies, but it may be personal.

 Me too, but I hate reimplementation of code (reinventing the
 wheel) more. 
Then that's an argument for pretending that HTML is XML or SGML and not
reimplementing a specific po4a module :)

Ok, I'm sorry, this mail really should be longer, but I'm out of time, man.

Mt.
