WHI Newsletter for February, 2003
A New Way of Communicating
With this first issue of the WHI Newsletter we inaugurate our new collaborative web portal. We now have a central place where those interested and involved in the Institute's mission can work and communicate together. We can now author content collaboratively, dynamically and interactively.
Each member of the Institute has his or her own area to create in. There are also “common” areas where content can be created by more than one individual. There are many types of content in our portal. The basic type is the document, which can be simple plain text or a web page. There are also containers called folders, which are what you would expect: places that hold related content objects together.
There are other content types: Forums, Wikis, Books, Issue Collectors, and much, much more. This is not the place to describe all that is possible. Suffice it to say, this website is where the action is!
Project News
MORPH Release 4.0
The next major release of our flagship project — the Hebrew Bible morphology database — will occur around the first of March. Some of the major changes:
- All nominals are now parsed as either “construct” or “absolute.”
- Gender and number have been completely re-evaluated
- Paragogic he and nun have been exhaustively rechecked
- Assimilated definite articles, formerly parsed with the preceding preposition, now have their own separate record as a morpheme
- The Aramaic “gentilic” category has been replaced with the “adjective” designation
- Many individual changes, including several to the Hebrew text itself
Web Portal
One of the major goals of 2002 was to begin the process of adapting network technology to the needs of the Institute. This has taken several paths: we have been evaluating “Content Management Frameworks” — software that makes the development of websites and web applications easier. As you can see, we’ve made considerable progress.
How did we do it? What did it cost? How steep is the learning curve? The answers to these questions are “easy, most of it works right out of the box”, “nothing (it’s Open Source)”, and “depends on your previous experience with websites, web application frameworks and programming”. The first time, it took about 3 days to get everything installed and customized. Now I can create a new portal in just a couple of hours.
Some of you may be telling yourself “I gotta get me one of these!” Here are some links for more information:
- Web portal package: Plone
- Content Management Framework: CMF
- Dynamic Web Content Generator: Zope (Z Object Publishing Environment)
- The programming language that is the “engine” for Zope/CMF/Plone: Python
Are these names strange-sounding and unfamiliar to you? That is not surprising. But the Institute is in good company in using these software packages: NATO, the Austrian government, CBS (New York), the AARP, the Genome Sciences Centre (Canada), and the Governor of Texas, to name just a few, use this software for their collaborative authoring needs!
“MORPH-ing” into a networked, searchable SQL database
If the web portal may be viewed as the “top-down” approach to the Institute’s mission, then the “bottom-up” approach has been the task of migrating our morphology database to a state-of-the-art database and adding a search engine optimized for linguistic queries (Emdros).
We have been working closely with Ulrik Petersen, the author of Emdros, to come up with a Python version of the Emdros libraries, as well as consulting with him on the data model we are using for Biblical Hebrew. We are now working on connecting a Plone page to the Emdros command interface using Python.
Database Cross-check
The Institute’s senior programmer, Stephen Salisbury, has been working on comparing our Hebrew morphology with one of the other major Hebrew Bible databases in the world: that of the Werkgroep Informatica (WI), directed by Prof. Eep Talstra at the Vrije Universiteit, Amsterdam. This is a very complex task, since their underlying data and linguistic models are not the same as those used by MORPH. Up to this point we have:
- compared their Hebrew text with ours (each had a couple of errors)
- checked the number of morphemes (i.e., how we divide the words up and how WI does it)
- compared gender, number and state of nominals
- compared lemmatization (yet to be completed)
Release 4.0 of MORPH benefits considerably from Stephen’s complex work.
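For readers curious what such a cross-check looks like in practice, here is a minimal Python sketch of the gender, number and state comparison listed above. It is an illustration only, not Stephen's actual program: the record layout, the reference key (book, chapter, verse, word position), the function name and the two sample entries are all assumptions made up for this example, and it presumes both databases have already been converted to a common set of labels, which is the hard part of the real task.

```python
from typing import Dict, List, NamedTuple, Tuple


class Nominal(NamedTuple):
    """Hypothetical record layout; not the real MORPH or WI format."""
    gender: str  # e.g. "m" or "f"
    number: str  # e.g. "sg", "pl", "du"
    state: str   # "construct" or "absolute"


# Hypothetical key: (book, chapter, verse, word position)
Ref = Tuple[str, int, int, int]
Difference = Tuple[Ref, str, str, str]


def compare_nominals(ours: Dict[Ref, Nominal],
                     theirs: Dict[Ref, Nominal]) -> List[Difference]:
    """Return (reference, field, our value, their value) for each disagreement.

    Only references present in both databases are compared.
    """
    differences: List[Difference] = []
    for ref in sorted(ours.keys() & theirs.keys()):
        a, b = ours[ref], theirs[ref]
        for field in Nominal._fields:
            if getattr(a, field) != getattr(b, field):
                differences.append(
                    (ref, field, getattr(a, field), getattr(b, field)))
    return differences


# Invented sample entries, not real parsings from either database.
sample_ours = {
    ("BookA", 1, 1, 3): Nominal("m", "pl", "absolute"),
    ("BookA", 1, 2, 1): Nominal("f", "sg", "absolute"),
}
sample_theirs = {
    ("BookA", 1, 1, 3): Nominal("m", "pl", "absolute"),
    ("BookA", 1, 2, 1): Nominal("f", "sg", "construct"),
}

for ref, field, mine, other in compare_nominals(sample_ours, sample_theirs):
    print(ref, field, "ours:", mine, "theirs:", other)
```

Each reported difference then has to be examined by hand, since a disagreement may reflect an error on either side or simply a difference between the two linguistic models.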