OSIS Schema & Best Practices Issues

This document lays out the meeting agenda.

Contents

1. Schema Bugs, Errors, Fixes

1.1. Dead Elements - Removal Suggested

1.1.1. <cell>

Has attributes of rows and columns (delete).

1.1.2. milestones

osisMilestoneSe, milestoneSe, osisMilestonePt, and milestonePt should be deleted.

2. Content Model Issues

2.1. <div> attributes

Add section, front, body, back, titlePage, introduction, index, preface, afterword, colophon, lexeme to type attribute on <div>.

2.2. Insert <divineName> in <catchWord>

The <divineName> element is meant to control imposition of styling. If it is not allowed in <catchWord>, encoders must fall back on ad hoc methods, leading to inconsistent encoding and more complex stylesheets.

2.3. lang/script/ews

The lang vs. xml:lang issue is already identified. I think we should also consider adding a script attribute at the same places where lang currently is. (Plenty of use cases exist, Cyrillic vs. Latin for Serbian being the most recognizable.) I think I recall TEI having a similar facility for identifying script.

In terms of best practices for these attributes:

lang should be specified according to RFC 3066. (Currently the only mention of a language RFC in the schema is a reference to RFC 1766, in the language element; RFC 3066 obsoletes RFC 1766.)

In addition, we should specify best practices for languages not covered by ISO 639. x-E-... was suggested previously as a best practice for identifying languages included in the Ethnologue, but common practice, both at SIL and according to LINGUIST List, seems to be to use x-SIL-...

Additionally, I would recommend we specify LINGUIST List's codes for languages absent from ISO 639 and Ethnologue, using something like x-LING-.... (Their codes are available here: http://saussure.linguistlist.org/cfdocs/new-website/LL-WorkingDirs/forms/langs/GetListOfAncientLgs.cfm http://saussure.linguistlist.org/cfdocs/new-website/LL-WorkingDirs/forms/langs/GetListOfConstructedLgs.cfm )
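As a rough illustration of the prefix conventions discussed above, the following Python sketch sorts a lang value by the source of its code. The function name and the returned labels are hypothetical, and x-SIL- and x-LING- are only the prefixes proposed in this agenda, not anything fixed by the schema. The comparison is case-insensitive because RFC 3066 tags are.

```python
def language_code_source(lang: str) -> str:
    """Guess which code list a lang value draws on (labels are illustrative)."""
    lowered = lang.lower()  # RFC 3066 tags are compared case-insensitively
    if lowered.startswith("x-sil-"):
        return "Ethnologue (SIL)"
    if lowered.startswith("x-ling-"):
        return "LINGUIST List"
    if lowered.startswith("x-"):
        # Any other private-use tag, including the earlier x-E- suggestion.
        return "other private use"
    return "ISO 639 / RFC 3066"


for tag in ("sr", "x-SIL-ABC", "x-LING-XYZ", "x-E-ABC"):
    print(tag, "->", language_code_source(tag))
```

The codes ABC and XYZ are placeholders, not real Ethnologue or LINGUIST List codes.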

If we choose to add a script attribute, ISO 15924 would be the appropriate standard to follow, but it is not final. Their pattern for codes is either [A-Z][a-z]{3} or [0-9]{3}. (Codes can be found here: http://www.evertype.com/standards/iso15924/document/dis15924.pdf)
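If a script attribute were added, its values could be checked against the code pattern quoted above. A minimal Python sketch, assuming the attribute holds exactly one code; the function name is hypothetical, and the check covers only the shape of a code, not whether the code is actually assigned:

```python
import re

# Shape of a script code as quoted in the text: one capital letter followed
# by three lowercase letters, or three digits.
SCRIPT_CODE = re.compile(r"[A-Z][a-z]{3}|[0-9]{3}")


def is_script_code(value: str) -> bool:
    """Return True if value has the shape of an ISO 15924 code."""
    return SCRIPT_CODE.fullmatch(value) is not None


for candidate in ("Cyrl", "Latn", "cyrl", "Latin"):
    print(candidate, is_script_code(candidate))
```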

I still don't know why ews is necessary, but it should at least be confined to some set of standard values if such a thing exists.

2.4. <table> in <p> and <speech>

Should <table> be allowed in <p> and <speech>?

2.5. osisID as list, pointing at with osisRef with grain

Example: given osisID="Matt.1.2 Matt.1.3 Matt.1.4", would osisRef="Matt.1.2@ch14" be the SAME as osisRef="Matt.1.3@ch14"?

The problem is that the grain reference has to have a certain starting point, and that can only be the first osisID. The other issue is blind pointing: how do I know if the author has used osisID as a list?
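To make the question concrete, here is a small Python sketch of one possible reading, in which a grain is anchored only at the first osisID of a list. The function names and the returned dictionary are hypothetical, and the "first osisID" rule is the assumption under discussion, not settled behaviour.

```python
def split_ref(osis_ref):
    """Split 'Matt.1.2@ch14' into ('Matt.1.2', 'ch14'); grain is None if absent."""
    target, sep, grain = osis_ref.partition("@")
    return target, (grain if sep else None)


def resolve(osis_ref, osis_id_attr):
    """Describe how osis_ref relates to an element carrying osis_id_attr."""
    ids = osis_id_attr.split()  # osisID used as a whitespace-separated list
    target, grain = split_ref(osis_ref)
    if target not in ids:
        return None  # the reference does not point at this element
    return {
        "target": target,
        "grain": grain,
        "element_is_list": len(ids) > 1,
        # Assumed rule: the grain only has a defined starting point
        # when the reference names the first osisID in the list.
        "anchored_at_first_id": target == ids[0],
    }


ids = "Matt.1.2 Matt.1.3 Matt.1.4"
print(resolve("Matt.1.2@ch14", ids))
print(resolve("Matt.1.3@ch14", ids))
```

Both references land on the same element, but only the first is anchored under this reading. The blind-pointing problem remains: a caller holding only the osisRef cannot compute element_is_list without seeing the target document.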

2.6. osisRef as list?

Should osisRef be allowed to be a list, like osisID? This would allow <note> to be applied to discontinuous material, avoiding Todd's annotation extension.

2.7. xml:lang?

Should we replace lang with xml:lang?

3. Best Practices

3.1. Major Issues

3.1.1. Levels of Encoding

Category A: Bibles with just the scripture text and no notion of paragraphs and organized with a book/chapter/verse hierarchy.

Category B: Bibles with paragraphs but no sections, with the paragraphs held by chapter <div> elements.

Category C: Bibles with sections and paragraphs, where sections as <div type="x-section"> elements contain paragraphs.
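The three categories can be told apart mechanically. A minimal Python sketch using the standard library's ElementTree, assuming un-namespaced element names and the x-section type value used above; the function name and the "unclassified" fallback are hypothetical:

```python
import xml.etree.ElementTree as ET


def encoding_category(root):
    """Return 'A', 'B' or 'C' following the category descriptions above."""
    sections = [d for d in root.iter("div") if d.get("type") == "x-section"]
    if any(s.find(".//p") is not None for s in sections):
        return "C"  # sections that contain paragraphs
    if sections:
        return "unclassified"  # sections without paragraphs: not covered above
    if next(root.iter("p"), None) is not None:
        return "B"  # paragraphs but no sections
    return "A"  # scripture text only


sample = ET.fromstring(
    '<div type="chapter"><div type="x-section"><p>text</p></div></div>'
)
print(encoding_category(sample))
```

A real OSIS document would carry a namespace, so the tag names would need to be qualified accordingly.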

3.1.2. Milestones: Start and Stop

The use of milestone start and end elements

3.1.3. Predominant hierarchy

What is the best hierarchy to use for biblical texts?

3.1.4. Quotes

Strategy for quotes (not likely to be the predominant hierarchy)

3.1.5. Text in Verses

Strategy to put ALL scripture text within a <verse> element.

3.1.6. Verse splits

The <verse> element is split in <lg>, <list>, and <table>. Should the schema be changed?

Chris: (Personally I think splitIDs are a bad thing in every circumstance where I've been forced to use them. They force text to be encoded in an extremely unnatural manner.)

Allowing <l> inside of <verse> and allowing <l> to not require <lg> seems like it would solve the line-related part of the problem.

It seems that issue 3.2.31. Stanza was the reason <lg> was created, wasn't it?

Isn't <lg> just a special version of <p> for lines?

3.2. Lesser Issues

3.2.1. blockQuote vs. Speech

Guidelines on usage?

3.2.2. Book Titles

For <title> elements, use the type attributes "short" for the short title like "Matthew", "mainTitle" for the main title, and "subTitle" for any sub titles for the book. (The same could be applied to testaments and book groups.)

3.2.3. catchWord

Catchword (unbalanced quotes, <divineName>, etc.). Chris: Also consider inserting <hi> in <catchWord>. This issue comes up in the TEV.

3.2.4. Complex or discontinuous text

Marking AddEsther where chapters interrupt other chapters and alternant reference systems are present.

3.2.5. Continuing Paragraph

How to best encode a continuing paragraph after a block quote or line group.

3.2.6. Copyright pages

How to deal with the copyright page and the related <work> element

3.2.7. Cross-References in <title>

Should cross-references following a <title> be placed in a child <title> element?

Example:


       <div type="x-section" osisRef="Matt.3.1-Matt.3.12">
         <title type="section">The Preaching of John the Baptist
           <title type="cross-ref">
             <reference osisRef="Mark.1.1-Mark.1.8" type="parallelPassage">Mark 1.1-8</reference>
             <reference osisRef="Luke.3.1-Luke.3.18" type="parallelPassage">Luke 3.1-18</reference>
             <reference osisRef="John.1.19-John.1.28" type="parallelPassage">John 1.19-28</reference>
           </title>
         </title>

      

3.2.8. Dictionary

How to encode a dictionary and other content at the end of a Bible

3.2.9. <div> following <osisText>

How to best organize top level structure for introduction sections, mini-dictionaries, glossaries, maps

3.2.10. Dublin Core

What should the DC elements in <work> look like for a document that is a portion of the entire work (i.e., a single book, single chapter, set of books, range of verses, or several sections from different books)?

3.2.11. endings, multiple

Multiple endings (marking container elements and osisIDs)

3.2.12. Footnotes

Footnotes (<rdg> and superscripted numbers)

3.2.13. Identifier with element

The question I have is how to associate an identifier with an element, for example if I wanted to say that a paragraph or other block of text is about "anger".

The best I can come up with is a <note> element with an osisRef to desired text. (Similar to a cross reference)

This will work with scripture text with a well defined reference system, but for non-Biblical text that often does not have osisIDs this becomes an issue. (This would likely be resolved if XPath/XPointer like syntax were allowed in a reference or reference like element.) (Todd)

3.2.14. Identity of books, works

How to identify a book of Esther (Esther vs. Additions to Esther vs. Greek Esther). (Goes with 3.2.4. Complex or discontinuous text, somewhat.)

How to identify books of Ezra/Nehemiah/Esdras.

Depending on these... potentially, how to identify 1-2Kgs/1-2Chr vs. 1-4Kgdms.

How to identify books that occur multiply within a work (e.g. Esther in NRSVA & others; Psalms in Vulgate; Joshua, Judges, Daniel, & Tobit in Rahlfs')

3.2.15. Introduction

Should the introduction to a book be marked as a <> or a <list>?

3.2.16. Introduction content

The text found at the front of a Bible, testament, book group, or book. Place this type of content in a <div type="x-introduction"> element.

3.2.17. Lines within a line group

Use type="q", type="q2" (or similar type names) and a set of other standardized types to indicate the specific nature of the <l> element.

3.2.18. Major and minor divisions

Major and minor divisions in the text.

3.2.19. Matthew text example

Should Matt.1.2-Matt.1.6a be encoded as osisID="Matt.1.2 Matt.1.3 Matt.1.4 Matt.1.5 Matt.1.6 Matt.1.6a" or osisID="Matt.1.2 Matt.1.3 Matt.1.4 Matt.1.5 Matt.1.6"? The logic being that "a" is simply a TYPOGRAPHIC mechanism to indicate that there are two blocks of text with Matt.1.6 in them! I believe the latter form is CORRECT even though I have argued for the alternative in the past.
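The latter form can be generated mechanically from the printed range. A minimal Python sketch, assuming both ends lie in the same chapter and that a trailing letter such as "a" is purely typographic, as argued above; the function name is hypothetical:

```python
import re

# book.chapter prefix, verse number, optional typographic letter
VERSE_ID = re.compile(r"(.+)\.(\d+)[a-z]?")


def expand_verse_range(start, end):
    """Expand 'Matt.1.2'..'Matt.1.6a' into a list of whole-verse osisIDs."""
    s, e = VERSE_ID.fullmatch(start), VERSE_ID.fullmatch(end)
    if s is None or e is None or s.group(1) != e.group(1):
        raise ValueError("both ends must be verses in the same chapter")
    prefix = s.group(1)
    first, last = int(s.group(2)), int(e.group(2))
    # The partial-verse letter is dropped: Matt.1.6a is listed as Matt.1.6.
    return " ".join(f"{prefix}.{verse}" for verse in range(first, last + 1))


print(expand_verse_range("Matt.1.2", "Matt.1.6a"))
# prints: Matt.1.2 Matt.1.3 Matt.1.4 Matt.1.5 Matt.1.6
```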

3.2.20. Milestone Pairs

Some guidelines on how to use milestone pairs for chapters, verses, and quotes and how to be consistent in the use of milestones vs elements.

3.2.21. Misc. but common structures

glossary, map, mini-dictionary, Thompson Chain Reference (Todd), see http://www.zondervan.com/media/pdfs/0310912229.pdf for example (I have a local copy for the meeting. pld)

3.2.22. Non-canonical text and speech

How to encode non-canonical text associated with the start of a speech. (<seg type="speechStart">She Speaks</seg>)

3.2.23. Notes

A "best practices" guideline for type attributes that indicate the type of note. Chris: Add cross-reference osisNotes type (unless there seems to be a better practice)

3.2.24. Parallel passages

Use a <div type="parallelPassage"> with strictly a set of <reference> elements and the related display text as children.
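A minimal Python sketch of building such a <div> with ElementTree, assuming un-namespaced element names; the function name is hypothetical and the references are the ones from the Matthew 3 example earlier in this agenda:

```python
import xml.etree.ElementTree as ET


def parallel_passage_div(passages):
    """Build <div type="parallelPassage"> holding only <reference> children."""
    div = ET.Element("div", {"type": "parallelPassage"})
    for osis_ref, display_text in passages:
        reference = ET.SubElement(div, "reference", {"osisRef": osis_ref})
        reference.text = display_text  # the related display text
    return div


div = parallel_passage_div([
    ("Mark.1.1-Mark.1.8", "Mark 1.1-8"),
    ("Luke.3.1-Luke.3.18", "Luke 3.1-18"),
    ("John.1.19-John.1.28", "John 1.19-28"),
])
print(ET.tostring(div, encoding="unicode"))
```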

3.2.25. Poetry

How to encode lines of poetry: Line breaks, multiple translator specified line splitting alternatives. What is presentation and what is data?

3.2.26. Presentation Punctuation in References

The best way to encode a series of references with various presentation punctuation.

3.2.27. Reference encoding

In OSIS documents that are not Bibles, it is common to see a quote of scripture text followed by the reference.

Does it make sense to add an optional osisRef attribute to <q> and <milestoneStart> to accommodate this frequent issue?

Example: <q osisRef="Matt.20.28">The Son of Man did not come to be served, but to serve . . . </q>

rather than <q>The Son of Man did not come to be served, but to serve . . .<reference osisRef="Matt.20.28">Matthew 20:28</reference></q> (Todd)
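If the attribute were adopted, existing markup of the second form could be converted to the first. A minimal Python sketch with ElementTree, assuming un-namespaced element names and that the <reference> is the last child of <q>; osisRef on <q> is the proposal above, not part of the current schema, and the function name is hypothetical:

```python
import xml.etree.ElementTree as ET


def lift_reference(q):
    """Move a trailing <reference osisRef="..."> child into q/@osisRef."""
    children = list(q)
    if not children or children[-1].tag != "reference":
        return q  # nothing to lift
    reference = children[-1]
    if reference.tail and reference.tail.strip():
        return q  # quoted text follows the reference, so it is not trailing
    osis_ref = reference.get("osisRef")
    if osis_ref is None:
        return q
    q.set("osisRef", osis_ref)
    q.remove(reference)  # drops the display text "Matthew 20:28" as well
    return q


q = ET.fromstring(
    "<q>The Son of Man did not come to be served, but to serve . . ."
    '<reference osisRef="Matt.20.28">Matthew 20:28</reference></q>'
)
print(ET.tostring(lift_reference(q), encoding="unicode"))
```

The conversion loses the display form of the reference, so a renderer would have to regenerate "Matthew 20:28" from the osisRef value.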

3.2.28. Reference to entire work

Introduction to CEV has references to entire works. How to encode? (Todd)

3.2.29. Special Information

How to preserve special information related to verse numbers that cannot be represented with an osisID ("*1") at Bible.CEV.Gen.49.1. (Is this really a note?)

3.2.30. Split

Only split AND only use the attribute "splitID" for the following elements: <verse>, <div type="chapter">, <div type="x-section">, and <p>.
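This rule lends itself to an automated check. A minimal Python sketch with ElementTree that lists elements carrying splitID outside the allowed set, assuming un-namespaced element names; the function names are hypothetical:

```python
import xml.etree.ElementTree as ET


def may_split(element):
    """True for <verse>, <p>, <div type="chapter"> and <div type="x-section">."""
    if element.tag in ("verse", "p"):
        return True
    return element.tag == "div" and element.get("type") in ("chapter", "x-section")


def misplaced_split_ids(root):
    """Return elements, in document order, that carry splitID but should not."""
    return [
        element
        for element in root.iter()
        if element.get("splitID") is not None and not may_split(element)
    ]


doc = ET.fromstring(
    '<div type="chapter" splitID="a">'
    '<verse splitID="b"/><q splitID="c"/>'
    "</div>"
)
print([element.tag for element in misplaced_split_ids(doc)])
```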

3.2.31. Stanza

How to best encode "stanza" in the Psalms.

3.2.32. Title Page

How to deal with the title page

3.2.33. Translator practices

How to encode the translator's intent that leads to a blank line being rendered? (This would be more spacing than would normally exist between two paragraphs, to emphasize a shift in thought.)

3.2.34. Verse value

Use the n attribute to indicate the verse value to present when more than one value is present in an osisID. (E.g., <verse n="1-2" osisID="Matt.1.1 Matt.1.2">)
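The n value could be derived from the osisID list when the verses are contiguous. A minimal Python sketch, assuming every osisID ends in a plain numeric verse and the list is in ascending order; the function name is hypothetical:

```python
def verse_label(osis_id_attr):
    """Build a display value such as '1-2' from 'Matt.1.1 Matt.1.2'."""
    verses = [int(osis_id.rsplit(".", 1)[1]) for osis_id in osis_id_attr.split()]
    first, last = verses[0], verses[-1]
    return str(first) if first == last else f"{first}-{last}"


print(verse_label("Matt.1.1 Matt.1.2"))  # prints: 1-2
```

Discontinuous lists, or printed values that differ from the computed range, would still need n to be set by hand.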

3.2.35. Work related practices

Work related practices for different scenarios. (I am thinking primarily about the "thisWork" element.)

Created by klowery
Last modified 2003-05-22 05:55 PM