RE: [osis-user] USFM \b --> TE 'Stanza Break' <--OSIS
RE: [osis-user] USFM \b --> TE 'Stanza Break' <--OSIS
Jim_Albright(at)wycliffe.org |
2004-05-26 15:19:33 |
|
enumerated attribute requested
<osis:lg @type='stanza'>
Jim Albright
704 843-0582
Wycliffe Bible Translators
|
RE: [osis-user] USFM \nd --> TE 'Name Of God' <--OSIS
Jim_Albright(at)wycliffe.org |
2004-05-27 15:23:41 |
|
enumerated attributes requested
<osis:divineName @type='yhwh'> for Lord in the OT when the underlying
text is YHWH.
Jim Albright
704 843-0582
Wycliffe Bible Translators
|
RE: [osis-user] USFM ??? --> TE 'Ethnologue Code' <--OSIS
Jim_Albright(at)wycliffe.org |
2004-05-27 18:34:29 |
|
enumerated attributes requested
ethnologueCode
<osis:language @type='ethnologue'> is good, but I think we should be more
precise and also allow <osis:language @type='ethnologueCode'>, as there
are several languages in the world whose names are spelled exactly the
same.
Example:
KELE: a language of Democratic Republic of Congo
  SIL code: KHY
  ISO 639-2: bnt
KELE: a language of Papua New Guinea
  SIL code: SBC
  ISO 639-2: map
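The ambiguity in the example above can be sketched in a few lines of Python. This is only an illustration: the table holds just the two KELE entries quoted above, and the names LANGUAGES and sil_codes_for_name are hypothetical, not part of OSIS or the Ethnologue.

```python
# Hypothetical lookup table holding only the two entries quoted above.
LANGUAGES = [
    {"name": "KELE", "country": "Democratic Republic of Congo",
     "sil": "KHY", "iso639_2": "bnt"},
    {"name": "KELE", "country": "Papua New Guinea",
     "sil": "SBC", "iso639_2": "map"},
]


def sil_codes_for_name(name):
    """Return every SIL code whose language name matches, ignoring case."""
    wanted = name.lower()
    return sorted(e["sil"] for e in LANGUAGES if e["name"].lower() == wanted)


# One name maps to two codes, so the name alone cannot identify the language.
print(sil_codes_for_name("Kele"))
```

A code-based identifier resolves to exactly one entry, while the name does not.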
Jim Albright
704 843-0582
Wycliffe Bible Translators
|
RE: [osis-user] USFM ??? --> TE 'Ethnologue Code' <--OSIS
Chris Little <chrislit(at)crosswire.org> |
2004-05-27 19:24:58 |
|
The type attribute of the <language> element within <work> allows an "SIL"
value, indicating that the contents of the element represent an Ethnologue
code. That seems to be what you're asking that we add.
Currently, encoding the actual name of a language is not supported.
About a year ago, I proposed that we add "English", "French", and "native"
to the enumerated set of language "code" types, but the proposal didn't
gain much support (even from me). Is that what you thought we were
allowing, and if so, do you think it would be a valuable addition?
--Chris
On Thu, 27 May 2004 Jim_Albright(at)wycliffe.org wrote:
[...]
|
RE: [osis-user] USFM ??? --> TE 'Ethnologue Code' <--OSIS
Jim_Albright(at)wycliffe.org |
2004-05-27 22:20:41 |
|
Thank you for pointing that out. I read the spec incorrectly. Since the
Ethnologue code is allowed as it stands, I would request the untyped
<osis:language> for the language name.
<osis:language @type='SIL'> will precisely identify the language.
<osis:language> Having the language name is also very useful, as there
are several variations allowed in the spelling of the language name, and
some peoples have changed their name as time goes on. It seems that the
Auca that killed the missionaries in Ecuador prefer to be called the
Waorani now. Of course the SIL code still identifies them correctly, but
some people will be inclined to search on the common name rather than a
code. So the untyped <osis:language> will work for me in this situation
if that meets with approval.
Jim Albright
704 843-0582
Wycliffe Bible Translators
|
RE: [osis-user] USFM \pc\sc --> TE 'Inscription Paragraph' <--OSIS
Jim_Albright(at)wycliffe.org |
2004-05-27 22:33:03 |
|
enumerated attribute requested
<osis:p @type='inscription'>
NIV Rev 17 is an NT example.
Thank you Chris and Todd for pointing out solutions and where I needed to
read the spec better. Any word on what to do with
<osis:lg @type='stanza'>?
<osis:lg @type='doxology'> Yes, \qc is always doxology. I would prefer
meaning-based markup over describing the look.
Jim Albright
704 843-0582
Wycliffe Bible Translators
|
RE: [osis-user] USFM ??? --> TE 'Ethnologue Code' <--OSIS
"Todd Tillinghast" <todd(at)contentframeworks.com> |
2004-05-27 22:50:02 |
|
Jim,
I think we should stay away from having the language name and only use the
x-SIL-[Ethnologue code] value, as in <language>x-SIL-DUG</language> for
the Duruma language.
The variations of the language name can be a lookup for whatever
software people are using to find an OSIS document, and can be pulled
directly from the Ethnologue source itself.
Todd
|
RE: [osis-user] USFM \pc\sc --> TE 'Inscription Paragraph' <--OSIS
"Todd Tillinghast" <todd(at)contentframeworks.com> |
2004-05-27 22:57:20 |
|
Jim and Jeff,
<p type="inscription"> would simply be <inscription>, or if you want it in
a new paragraph then I suppose <p><inscription>.. text
..</inscription></p>, so there is no need for an enumerated type value.
I posted a proposal to add <lg type="stanza">.
Are you sure it is not <l type="doxology"> rather than <lg
type="doxology"> you want?
I am still not 100% clear on exactly what a "doxology" is.
Also, do you think you would find general consensus around defining \qc
as always being a "doxology"?
Jeff, can you express an opinion?
Todd
|
RE: [osis-user] USFM ??? --> TE 'Ethnologue Code' <--OSIS
Jim_Albright(at)wycliffe.org |
2004-05-28 08:56:16 |
|
I'm a bit confused about the use of x-SIL. The documentation says that 'SIL'
is the value of the 'type' attribute. So that would make it
<osis:language type='SIL'>DUG</osis:language>
Am I reading the documentation wrong?
Jim Albright
704 843-0582
Wycliffe Bible Translators
|
RE: [osis-user] USFM ??? --> TE 'Ethnologue Code' <--OSIS
"Todd Tillinghast" <todd(at)contentframeworks.com> |
2004-05-28 09:07:09 |
|
Jim and Chris,
The use of x-SIL-[Ethnologue code] for <language> was my
recollection. However, Chris and Patrick took the lead on resolving
that issue.
When it comes to <osisText ...>, xml:lang="x-SIL-DUG" is the correct
form. In this case the attribute is not in the OSIS namespace and conforms
to general XML community practice.
Chris, can you clear up any misinformation I may have spread regarding
<language>?
Todd
|
RE: [osis-user] USFM \pc\sc --> TE 'Inscription Paragraph' <--OSIS
Jim_Albright(at)wycliffe.org |
2004-05-28 09:38:17 |
|
Here are the results of looking at Paratext files for usage of \qc
This example is a lg of praise to God (doxology)
NIV84\67REVNIV84.PTX(177): \qc "Holy, holy, holy
NIV84\67REVNIV84.PTX(178): \qc is the Lord God Almighty,
NIV84\67REVNIV84.PTX(179): \qc who was, and is, and is to come."
and in other languages the same
This example is a lg of praise to God (doxology)
NVI-P\67REVNVI-P.PTX(164): \qc "Santo, santo, santo
NVI-P\67REVNVI-P.PTX(165): \qc é o Senhor, o Deus todo-poderoso,
NVI-P\67REVNVI-P.PTX(166): \qc que era, que é e que há de vir".
end of book in Psalms
NVI-S\19PSANVI-S.PTX(2508): \qc Amén y amén.
NVI-S\19PSANVI-S.PTX(4428): \qc Amén y amén.
NVI-S\19PSANVI-S.PTX(5748): \qc Amén y amén.
\qc\sc is inscription
NVI-S\26EZKNVI-S.PTX(2346): \qc \sc AQUÍ HABITA EL\sc* S\sc EÑOR\sc*.»
This example is a lg of praise to God (doxology)
NVI-S\67REVNVI-S.PTX(164): \qc «Santo, santo, santo
NVI-S\67REVNVI-S.PTX(165): \qc es el Señor Dios Todopoderoso,
NVI-S\67REVNVI-S.PTX(166): \qc el que era y que es y que ha de venir.»
*** In the NIV this is marked as \q3. Since the NVI-S is based on the NIV,
I think they mismarked it here. \qc and \q3 often look similar. The NIV
only uses \q3 12 times in the whole Bible. ***
NVI-S\67REVNVI-S.PTX(217): \qc por los siglos de los siglos!»
\qc\sc is inscription
NVI-S\67REVNVI-S.PTX(615): \qc \sc LA GRAN BABILONIA\sc*
NVI-S\67REVNVI-S.PTX(616): \qc \sc MADRE DE LAS PROSTITUTAS\sc*
NVI-S\67REVNVI-S.PTX(617): \qc \sc Y DE LAS ABOMINABLES IDOLATRÍAS\sc*
NVI-S\67REVNVI-S.PTX(618): \qc \sc DE LA TIERRA\sc*.
\qc\sc is inscription
NVI-S\67REVNVI-S.PTX(804): \qc R\sc EY DE REYES Y \sc*S\sc EÑOR DE SEÑORES\sc*.
end of book in Psalms
RSV52\19PSARSV52.PTX(4026): \qc Amen and Amen!
RSV52\19PSARSV52.PTX(5251): \qc Amen and Amen.
TEV92\19PSATEV92.PTX(2461): \qc Amen! Amen!
TEV92\19PSATEV92.PTX(4294): \qc Amen! Amen!
TEV92\19PSATEV92.PTX(5554): \qc Amen! Amen!
TEV92\19PSATEV92.PTX(6708): \qc Praise the \nd Lord\nd*!
TEV94\19PSATEV94.PTX(2461): \qc Amen! Amen!
TEV94\19PSATEV94.PTX(4294): \qc Amen! Amen!
TEV94\19PSATEV94.PTX(5554): \qc Amen! Amen!
TEV94\19PSATEV94.PTX(6708): \qc Praise the \nd Lord\nd*!
Found 34 occurrence(s) in 12 file(s)
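The reading of the search hits above (\qc combined with \sc is an inscription, plain \qc is a doxology) could be sketched in Python roughly as follows. This is a minimal sketch, not a Paratext or OSIS tool: the function names and the hit-line pattern are assumptions based only on the lines shown, and the "always doxology" rule is the proposal under discussion, not a settled mapping.

```python
import re
from typing import Optional, Tuple

# Assumed shape of a search hit, as in the listing above: FILE(LINE): TEXT
HIT_RE = re.compile(r"^(?P<file>.+?)\((?P<line>\d+)\):\s*(?P<text>.*)$")


def classify_qc(text: str) -> Optional[str]:
    """Label a USFM line per the proposal: \\qc + \\sc is an inscription,
    any other \\qc line is treated as a doxology. Returns None otherwise."""
    if not text.startswith("\\qc"):
        return None
    if "\\sc" in text:
        return "inscription"
    return "doxology"  # assumption: the contested "\qc is always doxology" rule


def classify_hit(raw: str) -> Tuple[str, int, Optional[str]]:
    """Split one search-hit line into (file, line number, label)."""
    m = HIT_RE.match(raw.strip())
    if m is None:
        raise ValueError("not a search-hit line: %r" % raw)
    return m.group("file"), int(m.group("line")), classify_qc(m.group("text"))


print(classify_hit(r"NVI-S\67REVNVI-S.PTX(615): \qc \sc LA GRAN BABILONIA\sc*"))
print(classify_hit(r"TEV92\19PSATEV92.PTX(6708): \qc Praise the \nd Lord\nd*!"))
```

Under this rule a line such as NVI-S Rev (217), which the NIV marks as \q3, would also come out as a doxology, which is one reason a purely marker-based rule needs review.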
Jim Albright
704 843-0582
Wycliffe Bible Translators
|
RE: [osis-user] USFM ??? --> TE 'Ethnologue Code' <--OSIS
Chris Little <chrislit(at)crosswire.org> |
2004-05-28 15:03:24 |
|
For Ethnologue code DUG, the IETF (RFC 3066, IIRC) form would be
x-sil-DUG. (See, e.g.
http://www.language-archives.org/REC/language.html)
So if you're encoding <language type="IETF">, the content would be
"x-sil-DUG". If you're encoding <language type="SIL">, just "DUG" would be
correct.
I would recommend that every document include a <language type="IETF">
element, since it is the only uniform manner for expressing all languages
that have language codes. I don't know if this was decided as a best
practice, but it really ought to be.
We also have those xml:lang attributes on elements. They take a code
in IETF form also.
Case (i.e. sil vs. SIL & dug vs. DUG) is important from the perspective of
XML, but the RFCs specifically state they are not case-dependent.
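As a rough illustration of the two forms described above, here is a small Python sketch that produces the <language> element for either type. The function names are hypothetical, and upper-casing the Ethnologue code is an assumption made here for consistency with the examples in this thread, not a rule from the spec.

```python
def language_content(sil_code: str, type_: str) -> str:
    """Return the element content for an Ethnologue (SIL) code.

    type_ "SIL"  -> the bare code, e.g. "DUG"
    type_ "IETF" -> the private-use form, e.g. "x-sil-DUG"
    """
    code = sil_code.upper()  # assumption: normalise case to match the thread
    if type_ == "SIL":
        return code
    if type_ == "IETF":
        return "x-sil-" + code
    raise ValueError("unsupported type in this sketch: %r" % type_)


def language_element(sil_code: str, type_: str) -> str:
    """Render a <language> element as a plain string."""
    return '<language type="%s">%s</language>' % (
        type_, language_content(sil_code, type_))


print(language_element("DUG", "IETF"))
print(language_element("DUG", "SIL"))
```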
--Chris
On Fri, 28 May 2004, Todd Tillinghast wrote:
[...]
|
RE: [osis-user] USFM ??? --> TE 'Ethnologue Code' <--OSIS
"Todd Tillinghast" <todd(at)contentframeworks.com> |
2004-05-28 15:11:41 |
|
Chris,
Can you give an example of the IETF form for two letter and three letter
ISO codes?
Todd
|
RE: [osis-user] USFM ??? --> TE 'Ethnologue Code' <--OSIS
Chris Little <chrislit(at)crosswire.org> |
2004-05-28 17:39:52 |
|
Todd,
2 and 3 letter ISO codes would just be the code itself in IETF form.
Unfortunately, as we discussed on osis-core a while back, xml:lang can't
handle a 3 letter ISO code in the current version of XML Schema (I think).
I forget the exact details surrounding where the fault lies, but it's
something that is already fixed in the newer versions of Schema or XML or
whatever. Since you can't do xml:lang="ang" for Anglo-Saxon, my
recommendation would be to use xml:lang="x-iso-ang". This is just
my suggestion for a workaround, and bears no force at all. :) ("ang" is
still correct/valid in the <language> element, of course.)
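The workaround above could be sketched in Python like this. It is only an illustration of the suggestion in this message (two-letter ISO codes as-is, three-letter ISO codes as x-iso-..., Ethnologue codes as x-sil-...); the function name and the scheme labels "iso" and "sil" are hypothetical.

```python
def xml_lang_value(code: str, scheme: str) -> str:
    """Return an xml:lang value following the suggested workaround.

    scheme "iso": 2-letter codes pass through, 3-letter codes get "x-iso-".
    scheme "sil": Ethnologue codes get the "x-sil-" prefix.
    """
    if scheme == "sil":
        return "x-sil-" + code.upper()
    if scheme == "iso":
        if len(code) == 2:
            return code.lower()
        if len(code) == 3:
            # Workaround only: a bare 3-letter code is reported above as not
            # usable in xml:lang under the schema version then in use.
            return "x-iso-" + code.lower()
        raise ValueError("ISO code must have 2 or 3 letters: %r" % code)
    raise ValueError("unknown scheme in this sketch: %r" % scheme)


print(xml_lang_value("ang", "iso"))
print(xml_lang_value("en", "iso"))
print(xml_lang_value("DUG", "sil"))
```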
I think I sent out a short recommendation of how to encode language codes
in OSIS to osis-core a while back. I'll see if I can track it down and
post it to osis-user and on the OSIS WG page
(http://www.bibletechnologieswg.org/).
--Chris
On Fri, 28 May 2004, Todd Tillinghast wrote:
[...]
|
|