Re: [sword-devel] Yet another KJV markup question.
DM Smith <dmsmith555(at)yahoo.com> |
2006-03-06 15:12:28 |
|
Chris,
Yeah, this was a bad example. In fact, most of the actual <w> nesting
situations in the KJV2003 are bad examples. I know just enough Greek to
fix most (I'm rusty as it has been 25+ years since I studied Greek:) and
maybe all.
The worst bug of this kind is in Luke 15.24.
Troy will probably have to confirm, but it appears that the src
attribute is being used to indicate the position of the Greek word in a
particular module. If so, it would be good to be able to construct an
actual osisRef for it. While cp and s are defined as grain operators it
would be good to also have wp (i.e. word position), too.
The <w> element lets one indicate that two or more words are being
translated as in
<w lemma="s:1 s:2">word</w>.
Likewise, it is possible to have more than one morph as in
<w morph="r:9 r:10">word</w>
But in combining these,
<w lemma="s:1 s:2" morph="r:9 r:10">word</w>
it is not possible to know for certain, only by convention, that s:1
maps to r:9 and s:2 maps to r:10.
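The positional convention can be sketched in a few lines of Python. This is only an illustration: the helper name pair_by_position is hypothetical, and pairing by position is an assumption a consumer makes, not something OSIS itself guarantees.

```python
import xml.etree.ElementTree as ET
from itertools import zip_longest

def pair_by_position(w_xml):
    """Pair the lemma and morph values of one <w> element by position.

    Hypothetical helper: positional pairing is only a convention,
    OSIS does not state which lemma belongs to which morph.
    """
    w = ET.fromstring(w_xml)
    lemmas = w.get("lemma", "").split()
    morphs = w.get("morph", "").split()
    # zip_longest pads the shorter list with None when the counts differ,
    # which makes a broken pairing visible instead of silently dropping it.
    return list(zip_longest(lemmas, morphs))

pairs = pair_by_position('<w lemma="s:1 s:2" morph="r:9 r:10">word</w>')
```

When the two attributes carry different numbers of values, the None padding shows exactly where the convention stops being usable.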
Adding src to it makes the mapping even more difficult.
<w src="7,9" lemma="s:1 s:2" morph="r:9 r:10">word</w>
It appears that nesting is being used to indicate uniquely the origin.
<w src="7" lemma="s:2" morph="r:9"><w src="9" lemma="s:1"
morph="r:10">word</w></w>
Is there any value in such a nesting construct? (Personally, I don't
think so.)
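For comparison, here is a minimal Python sketch of what a consumer would have to do to recover the per-word information from the nested form. The function name flatten_nested_w is hypothetical, and the sketch ignores XML namespaces, which a real OSIS document would use.

```python
import xml.etree.ElementTree as ET

def flatten_nested_w(w_xml):
    """Return one (src, lemma, morph) triple per <w>, outermost first.

    Hypothetical helper for illustration; not part of any OSIS or
    SWORD tooling.
    """
    root = ET.fromstring(w_xml)
    # Element.iter() yields the element itself and then its descendants
    # in document order, so the outer <w> comes before the inner one.
    return [(w.get("src"), w.get("lemma"), w.get("morph"))
            for w in root.iter("w")]

nested = ('<w src="7" lemma="s:2" morph="r:9">'
          '<w src="9" lemma="s:1" morph="r:10">word</w></w>')
triples = flatten_nested_w(nested)
```

Each triple is unambiguous, which is the one thing the nesting buys over the flat, space-separated attributes.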
DM
Chris Little wrote:[...][...]
|
Re: [sword-devel] Another KJV markup question
DM Smith <dmsmith555(at)yahoo.com> |
2006-03-06 15:52:28 |
|
Chris Little wrote:[...][...][...]
I think it is correct and also valid. In many languages, gender is
communicated by context, but in translating to English it may be
rendered explicitly. In the KJV these instances are indicated by italics.
Earlier today I saw "give *him* drink" where "give drink" was one word
in Greek and *him* was added for clarity.
So I suggest that for OSIS 2.5 transChange be allowed to be a child
of <w>.[...]
Unbelievable as it may be, the Strong's entry is:
4155 pnigo pnee'-go strengthened from 4154; to wheeze, i.e. (causative,
by implication) to throttle or strangle (drown):--choke, take by the
throat. see GREEK for 4154
[...]
I don't think I like splitting the word into a before and an after. This
might communicate to a casual user of the document that 4155 can be
translated as either "took" or "by the throat".
[...]
I like having explicit markup as it is easier to transform later.
|
Re: [osis-user] Re: [sword-devel] Yet another KJV markup question.
Chris Little <chrislit(at)crosswire.org> |
2006-03-06 15:57:28 |
|
DM Smith wrote:[...]
Yeah, src on <w> in CrossWire's KJV refers to the nth word in the same
verse within the Byz module. I don't think this is what this attribute
was intended for, but I don't have a better suggestion of an attribute
on which to hang such a value. Technically, you'd want a token
attribute, which would require either extension or modification of OSIS.
We discussed a word grain, but decided against it. I recall the
ambiguity of "word" as one of the issues. More useful would probably be
a document that indicates osisIDs down to the word level (a word-level
tagged Byz, in other words--which we may already have).
[...]
No. OSIS is already being used in CrossWire's KJV in ways other than
what was intended. The lemma and morph attributes indicate the lemma and
morphology of their content, but the CrossWire KJV uses them to indicate
the lemma and morphology of words in a completely different document.
(We have an English language document with lemmatization and morphology
information for a Greek language document.)
If the English text does not distinguish two Greek language tokens, such
that they must be translated using a single English token (or vice
versa), then there is no reason to disambiguate which lemma is
associated with which morphological information in the translation text.
I think it is simply not relevant information, from the perspective of a
KJV-reader.
--Chris
|
Re: [osis-user] Re: [sword-devel] Yet another KJV markup question.
DM Smith <dmsmith555(at)yahoo.com> |
2006-03-06 16:27:28 |
|
Chris Little wrote:[...]
Can you recommend a better way? I would like to create the best OSIS for
the KJV and use XSLT to make it acceptable to the SWORD API.
I was thinking that POS is a better place for Robinson's part of speech
codes.
Or is this a "future" for OSIS?
|
Re: [osis-user] Re: [sword-devel] Another KJV markup question
Chris Little <chrislit(at)crosswire.org> |
2006-03-07 03:22:47 |
|
DM Smith wrote:[...][...]
>>> The following construct is fairly frequent:
>>> <w src="19" lemma="strong:G4155" morph="robinson:V-IAI-3S">and took
>>> <transChange type="added">him</transChange> by the throat</w>
>>>
>>> The problem is that <w> does not allow for <transChange>.
>>> Should this be changed to <seg type="x-transChange"
>>> subType="added">him</seg>?
>>> (This is the form that is used within study notes.)[...][...]
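The <seg> rewrite asked about in the quoted text can be sketched in Python as follows. This is an illustration under stated assumptions: the function name trans_change_to_seg is hypothetical, XML namespaces are ignored, and the x-transChange/subType mapping is simply the form proposed in the quote.

```python
import xml.etree.ElementTree as ET

def trans_change_to_seg(w_xml):
    """Rewrite <transChange type="T"> as <seg type="x-transChange" subType="T">.

    Hypothetical helper; it only illustrates the mapping proposed in
    the quoted message and ignores XML namespaces.
    """
    root = ET.fromstring(w_xml)
    # Collect matches first so renaming tags cannot disturb the iteration.
    for tc in list(root.iter("transChange")):
        change_type = tc.attrib.pop("type", None)
        tc.tag = "seg"
        tc.set("type", "x-transChange")
        if change_type is not None:
            tc.set("subType", change_type)
    return ET.tostring(root, encoding="unicode")

source = ('<w src="19" lemma="strong:G4155" morph="robinson:V-IAI-3S">and took '
          '<transChange type="added">him</transChange> by the throat</w>')
converted = trans_change_to_seg(source)
```

The text before and after the added word is left untouched, so the <w> still wraps the whole English phrase.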
Based on the morphology code, it's person and number that are specified
by morphology on the verb (which causes the rendering with "him" in the
English). It wasn't the "him" I had a problem with. Greek is just a
pro-drop language (one of the approximately 10 things I can name about
Greek :). What I had a problem with was a verb meaning "to take by the
throat", but it looks like (based on the Strong's definition you gave)
that's just a metaphorical extension to the meaning.
[...]
I'm inclined to agree, unless anyone can come up with a good reason not to.
[...][...]
The casual user should (and probably did) turn Strong's numbers off.
--Chris
|
Re: [osis-user] Re: [sword-devel] Yet another KJV markup question.
Chris Little <chrislit(at)crosswire.org> |
2006-03-07 03:32:27 |
|
DM Smith wrote:[...][...][...]
I can't. It just isn't a normal thing to want to mark up (morphology
from some other text) and I can't see it ever being a common enough
issue to warrant special handling in OSIS (like a new attribute).
[...]
Robinson's codes identify more than just part of speech. POS shouldn't
indicate case/number/person agreement or tense/aspect/mood/modality.
Those are the domain of the morph attribute. (NB: Again, I don't know
much about Greek, so if Greek doesn't have some portion of those
attributes, obviously Robinson's doesn't either, but I know it presents
a fairly complete description of each word's morphology.)
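To show how much a code such as V-IAI-3S carries beyond part of speech, here is a small Python sketch that splits a simple finite-verb code into its fields. The lookup tables are deliberately partial and the function name decode_finite_verb is hypothetical; codes with other shapes (for example second-form tenses or non-verbs) are rejected rather than guessed at.

```python
# Partial lookup tables, for illustration only.
TENSE = {"P": "present", "I": "imperfect", "F": "future", "A": "aorist"}
VOICE = {"A": "active", "M": "middle", "P": "passive"}
MOOD = {"I": "indicative", "S": "subjunctive", "M": "imperative"}
NUMBER = {"S": "singular", "P": "plural"}

def decode_finite_verb(code):
    """Decode a finite-verb code shaped like 'V-IAI-3S'.

    Hypothetical helper: only the simplest verb pattern is handled.
    """
    parts = code.split("-")
    if len(parts) != 3:
        raise ValueError(f"not a simple finite-verb code: {code!r}")
    pos, tvm, person_number = parts
    if pos != "V" or len(tvm) != 3 or len(person_number) != 2:
        raise ValueError(f"not a simple finite-verb code: {code!r}")
    tense, voice, mood = tvm
    person, number = person_number
    return {
        "pos": "verb",                # the only part that POS would hold
        "tense": TENSE[tense],        # everything below is morphology
        "voice": VOICE[voice],
        "mood": MOOD[mood],
        "person": int(person),
        "number": NUMBER[number],
    }

decoded = decode_finite_verb("V-IAI-3S")
```

Only the first field is a part of speech; the remaining five are the kind of information the morph attribute is meant to hold.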
--Chris
|
|