The History of Ontobee Development
Ontobee is now widely used in the ontology community. The program originated in 2008 from the development of a web browser for the Vaccine Ontology (VO) in the He Group. The VO browser dynamically dereferenced VO term URIs to HTML pages. In fall 2009, the technology was applied to generate a web browser for the Ontology for Biomedical Investigations (OBI). A new feature was later added: dynamic dereferencing of ontology term URIs to RDF. By the end of 2010, Ontobee became the default linked data server for most OBO Foundry library ontologies. Continuous improvements have been made in the years since. Since its application to OBI, the program has received strong community support from many ontology and tool developers and users.
Roughly, the history of the Ontobee development can be split into five stages as described below.
Table of Contents
- Stage 1: Develop a VO Browser for dynamically dereferencing VO terms to HTML [Time: early 2008 to July 2009]
- Stage 2: Develop an Ontobee prototype for OBI and for dynamically dereferencing ontology terms to RDF and HTML [Time: October 2009]
- Stage 3: Further develop Ontobee for OBO Foundry and other ontologies until the first Ontobee presentation [Time: November 2009 to July 2011]
- Stage 4: Further develop and maintain Ontobee until Allen left [Time: from July 2011 to March 2013]
- Stage 5: Further develop and maintain Ontobee after Allen left [Time: from March 2013 to now]
- Appendix: Background about OBI RDF dereferencing prototype
1. Stage 1: Develop a VO browser for dynamically dereferencing VO terms to HTML [Time period: early 2008 to July 2009]
Since early 2008, Oliver He has initiated and led the development of a community-based Vaccine Ontology (VO; http://www.violinet.org/vaccineontology). VO development received a lot of support and collaboration from many ontology developers (see: VO development team). Dr. He received his NIH-NIAID R01 grant (R01AI081062) in September 2009, which provided further funding for VO development and VO applications.
While we were developing VO, Allen Xiang, a bioinformatician in the He Laboratory, set out to develop a VO web browser under Dr. He's mentorship. The first VOBrowser was launched in July 2008. The original VOBrowser was based on the OWLDoc program, which was written in Java. We chose OWLDoc for its visualization features, in particular its hierarchical display of an ontology. One issue in our use of OWLDoc was that we had to generate one individual (static) HTML file for each ontology term. As VO grew, we generated so many individual HTML files that maintaining this approach was no longer feasible. Therefore, we decided to find another way to dynamically generate HTML pages for individual ontology term URIs.
After discussing the problem, Oliver and Allen decided to set up a local Virtuoso RDF triple store to hold the RDF triples of VO, retrieve the relevant ontology information from the triple store using SPARQL, and use that information to dynamically generate HTML pages. Allen did the programming. He generated the first version of the Hegroup Virtuoso RDF triple store. The SPARQL code querying VO contents from the RDF triple store was embedded in PHP. PHP has been the default web programming language for all web tools developed in the He lab. Besides the SPARQL/RDF triple store and PHP, Allen also used other technologies including JSON and OWLAPI. The updated PHP/SPARQL-based VOBrowser program was eventually able to dynamically generate an HTML page for every VO term URI. The VOBrowser displayed the ontology term hierarchy and other information automatically. In this way, we did not need to generate thousands of HTML files and store them statically.
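The query-then-render approach described above can be sketched as follows. This is only an illustration in Python (the VOBrowser itself was written in PHP), and it is not the original code: the function names, the query shape, and the page layout are assumptions. The sketch builds a SPARQL query for one term and turns a result set in the standard SPARQL JSON results format into a small HTML fragment.

```python
import html

RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def build_term_query(term_iri: str) -> str:
    """Build a SPARQL query for a term's label and direct superclasses.

    Hypothetical query shape; the real VOBrowser queries are not shown
    in this page.
    """
    return (
        f"PREFIX rdfs: <{RDFS}>\n"
        "SELECT ?label ?parent ?parentLabel WHERE {\n"
        f"  <{term_iri}> rdfs:label ?label .\n"
        f"  OPTIONAL {{ <{term_iri}> rdfs:subClassOf ?parent .\n"
        "             ?parent rdfs:label ?parentLabel . }\n"
        "}"
    )


def render_term_html(term_iri: str, sparql_json: dict) -> str:
    """Render SPARQL JSON results (results -> bindings) as an HTML fragment."""
    rows = sparql_json["results"]["bindings"]
    # Fall back to the IRI when the store returned no label.
    label = rows[0]["label"]["value"] if rows else term_iri
    parents = []
    for row in rows:
        if "parent" in row:
            iri = row["parent"]["value"]
            name = row.get("parentLabel", {}).get("value", iri)
            parents.append(
                f'<li><a href="{html.escape(iri)}">{html.escape(name)}</a></li>'
            )
    return (
        f"<h1>{html.escape(label)}</h1>"
        f"<p>IRI: {html.escape(term_iri)}</p>"
        f"<ul>{''.join(parents)}</ul>"
    )
```

In a live setup the query string would be sent to the triple store's SPARQL endpoint and the JSON response passed to the renderer, so no per-term HTML file ever needs to be stored.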
Some of the VO browser visualization styles were inspired by OWLDoc. Specifically, we liked OWLDoc's hierarchical ontology display, and we wrote our own code to produce a similar ontology hierarchy view in VOBrowser. But we did not use the frames shown in OWLDoc (or a typical Javadoc view). We have also tried different ways to improve the look and feel of our ontology visualization.
At this stage, our VO browser web URL was: http://www.violinet.org/vaccineontology/vobrowser/ (note: this URL has expired now). The Virtuoso RDF-based VO browser was successfully implemented on May 1, 2009. This date was also recorded in the VO News page.
The first demonstration of the VOBrowser was likely when Oliver presented the VO development at the 109th American Society for Microbiology (ASM) General Meeting on May 17-21, 2009 in Philadelphia, Pennsylvania. This is a VOBrowser screenshot generated for the ASM presentation. Both Allen and Oliver attended the first ICBO conference in Buffalo on July 24-26, 2009. The poster presentation was published in Nature Precedings (http://precedings.nature.com/documents/3552/version/1), which contains another VOBrowser screenshot. As seen from these two screenshots, the early VOBrowser already contained the basic Ontobee visualization features and styles we see now.
2. Stage 2: Develop an Ontobee prototype for OBI and for dynamically dereferencing ontology terms to RDF and HTML
[Time: October 2009]
Since vaccine investigation is a type of biomedical investigation, Dr. Barry Smith recommended in 2008 that Oliver join the community-based Ontology for Biomedical Investigations (OBI) development effort. Oliver attended the Vancouver OBI Workshop on Feb 2-6, 2009. At the workshop, Oliver gave a talk titled "VO and OBI". He discussed with the OBI developers there how to improve VO using OBI and possibly improve OBI by using vaccine investigation as a use case in OBI development. Since the workshop, Oliver has been an active member of the OBI Consortium and has represented the vaccine community in OBI development.
After the VO browser success, many OBI developers suggested that we develop a VOBrowser-like OBI browser. Oliver discussed this with Allen and asked him to apply the VO Browser technology to OBI. Allen soon developed a prototype. At that time, we made the results available at our ontology browser URL: http://ontobrowser.hegroup.org/. On October 2, 2009, Allen wrote an email to announce the prototype. Melanie Courtot provided some quick comments on this prototype the same day. Note that at the time, the OBI browser was only serving HTML pages (which is also what the email thread mentions).
As seen in the long email thread: On October 5, 2009, Melanie emailed the OBI developers email list and announced the availability of an OBI browser prototype based on VO Browser. At that time, Oliver wanted to put the source code in the OBI SVN repository. However, based on the suggestions in the email thread, we eventually did not do so. In the email thread, Oliver mentioned on October 6,
"Our strategy is simple: we store an ontology in a Virtuoso RDF database, and then use PHP and SPARQL programming to access the rdf database and make sparql queries. Therefore, I am sorry to say that our tool (we name OntoBee) cannot support SOAP. Basically, OntoBee will provide an VOBrowser like ontology browsing feature: http://www.violinet.org/vaccineontology/vobrowser/ and will also include a SPARQL script web interface like the following: http://www.violinet.org/vaccineontology/sparql/index.php". Note that the name "OntoBee" had been given by Oliver by that time.
Now comes the story of dereferencing ontology term URIs to RDF using the XSLT stylesheet trick:
In an email dated October 2, 2009, Melanie suggested the addition of the RDF as source:
"one of the main point in the current OBI prototype (http://purl.obofoundry.org/obo/OBI_0000225) it to be able to serve RDF for each term, as described in http://docs.google.com/Doc?docid=0Acx6Blq96uycZHpwcm5td182Mmdrc3ZrOWs4&hl=en (and in http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/). Is that something you plan on adding?"
In another email thread starting Oct 3, 2009, James Malone at EBI also pointed out the same RDF issue on 10/7/2009:
"Looks like you've done a lot of work on this, some nice features. I have a couple of questions if I may. I'm not sure who the browser is aimed at - is it the human user or is it the computer agent, or both? If it's the former I think most of what you have is fine, the search is useful. If it's the latter I would refer back to what Melanie previously asked about making the pages fragments of rdf that are rendered by stylesheets. We did a similar thing for our microarray ontology e.g. http://www.ebi.ac.uk/efo/EFO_0001526 - if you view the source you will see the page is actually RDF but is then rendedered to look reasonable and have clickable links. This would be really useful in a semantic web view of the world."
At 3:06 pm on the same day (10/7/2009), Oliver acknowledged to James that OntoBee did not support the RDF dereferencing feature. Oliver then discussed this with Allen. They quickly examined the related references they could find, including those provided by Melanie and the OBI prototype paper published at the OWLED 2008 conference. The OWLED 2008 paper said,
"In order to present readable information in web browsers, we use an XSL stylesheet, which is executed by the browser to generate HTML".
Both Oliver and Allen were very familiar with the XSLT stylesheet technology and knew what the sentence meant. When Oliver was working on a PathInfo project at the Virginia Bioinformatics Institute (VBI), he wrote a long XSLT script program to transform XML documents based on the Pathogen Information Markup Language (PIML) to HTML files. Allen had also done a lot of XSLT programming for another project in He laboratory. Therefore, Allen was confident that he could implement the XSLT idea to dereference RDF and at the same time keep the HTML display. Based on the discussion, Oliver replied to James at 3:50 pm (44 minutes after his last email to James):
"I just talked to Allen. Allen said that he can use xslt style sheet to make it work: to have the RDF source output, and meanwhile to have our current HTML look."
From Oct 7 to Oct 22, with Oliver's active discussion and support, Allen worked to implement the above idea. The RDF output was successfully added to OntoBee while the HTML visualization remained available. Specifically, for each ontology term URI, OntoBee provided well-structured HTML output for web browser visualization (as done in VOBrowser), and if you viewed the source of the page, you would see the RDF/XML output. According to the Principles of Linked Data, a system that dereferences a URI to both RDF and HTML is ideal.
On October 22, 2009, Oliver responded to the group, stating:
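The general mechanism can be illustrated with a short Python sketch: an RDF/XML document carries an xml-stylesheet processing instruction, so a web browser applies the referenced XSLT and shows HTML, while "view source" (or a program) sees the RDF. This is not the OntoBee code. The function name, the minimal RDF content, and the stylesheet path are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"
OWL_NS = "http://www.w3.org/2002/07/owl#"


def term_rdf_with_stylesheet(term_iri: str, label: str, xslt_href: str) -> str:
    """Return RDF/XML for one class, prefixed with an xml-stylesheet PI.

    xslt_href is a hypothetical stylesheet location; the browser fetches
    it and transforms the RDF/XML into HTML for display.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("rdfs", RDFS_NS)
    ET.register_namespace("owl", OWL_NS)

    root = ET.Element(f"{{{RDF_NS}}}RDF")
    cls = ET.SubElement(
        root, f"{{{OWL_NS}}}Class", {f"{{{RDF_NS}}}about": term_iri}
    )
    ET.SubElement(cls, f"{{{RDFS_NS}}}label").text = label

    body = ET.tostring(root, encoding="unicode")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        # The processing instruction must come before the root element.
        f"<?xml-stylesheet type=\"text/xsl\" href={quoteattr(xslt_href)}?>\n"
        f"{body}"
    )
```

For the browser to apply the stylesheet, the response would also need to be served with an XML content type; that server-side detail is outside this sketch.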
"In addition, as James/Melanie/Alan suggested, OntoBee generates an RDF file for the ontology term. The RDF file contains a line that directs the html display to the HTML version. Therefore, Allen was able to keep all the visualization code (HTML). The RDF file was relatively easy for Allen to do since he has already had SPARQL scripts (from the visualization part) to get the ontology term information. Allen has also taken Alan's instructions to include ontology description and other information."
Allen also deposited the source code of OntoBee to Sourceforge on 10/22/2009.
More information about the ontology term URI dereferencing to RDF can be found in the Appendix on this page.
3. Stage 3: Further develop Ontobee for OBO Foundry and other ontologies until the first Ontobee presentation
[Time: November 2009 to July 2011]
Allen and Oliver continued to develop OntoBee, with an aim to make OntoBee work better for VO and OBI, and also apply it for other ontologies. This aim was achieved. One key feature achieved during this stage was the usage of Ontobee as the default linked ontology data server for OBO Foundry library ontologies. Ontobee was first used to resolve VO PURLs, then OBI PURLs, then most other OBO library ontologies by the end of 2010. At this stage, we obtained a lot of help from the ontology community, especially Alan Ruttenberg and Chris Mungall.
Although ontology URIs could be dereferenced inside OntoBee, an ontology URI typed directly into a web browser (outside OntoBee) was not necessarily resolved to OntoBee. For example, as an OBO Foundry library ontology, VO ontology term URIs use the OBO PURL format: http://purl.obolibrary.org/obo/VO_xxxxxxx. Initially, such VO PURLs were not directed to OntoBee by default. To redirect VO URIs to OntoBee by default, Oliver and Allen asked Chris Mungall and the OBO Foundry to automatically forward VO OBO PURLs to OntoBee for default display. Eventually the OntoBee system worked well as the default linked data server for VO. For example, the VO class 'DNA vaccine' (http://purl.obolibrary.org/obo/VO_0000032) linked directly to an OntoBee page for display, and the source of the page was provided in RDF/XML format (instead of HTML format).
On May 18, 2010, Oliver registered the web domain name: ontobee.org. After that, the Ontobee website became http://www.ontobee.org instead of http://ontobee.hegroup.org. Oliver decided to make the update and pay the domain registration fee because he predicted that the tool could be widely used as a community-based program, and a new domain name might boost its usage. After consultation with many, Oliver also changed the name "OntoBee" to "Ontobee".
The Hegroup RDF triple store initially stored only VO and OBI, and then a couple of other ontologies. To let OntoBee access and use more OBO library ontologies, Allen and Oliver started in 2010 to use the Neurocommons RDF triple store, which was developed by Alan's group. Since both RDF triple stores used the same Virtuoso system, OntoBee could use SPARQL to access both with no compatibility issues. During this period, Alan provided a lot of support to Allen on how the Neurocommons RDF triple store could be accessed and queried through SPARQL. Initially, when SPARQL queries were complex, Ontobee execution was often slow. Alan introduced Allen to the Concise Bounded Description (CBD) technique. After this, Ontobee queries became much faster, especially when complex queries were executed. Because the Neurocommons RDF triple store was not continuously maintained and updated, Ontobee started to use only the Hegroup RDF triple store from summer 2011. The Hegroup RDF triple store also started to include more ontologies.
As shown in this email thread, on Aug 26, 2010, Bjoern Peters emailed the OBI-devel mailing list and asked:
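For readers unfamiliar with CBD, here is a minimal Python sketch of the core idea as defined in the W3C Member Submission: start from one resource, take every triple that has it as subject, and recursively do the same for any blank-node objects. The sketch is an assumption-laden simplification, not Ontobee's implementation: triples are plain string tuples, blank nodes are recognized by a "_:" prefix, and the reification part of the CBD definition is omitted. How Ontobee actually requested CBDs from Virtuoso is not described in this page.

```python
def concise_bounded_description(triples, start):
    """Collect the simplified CBD of `start` from (subject, predicate, object) tuples."""
    # Index triples by subject so each lookup is cheap.
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((s, p, o))

    result = []
    seen = {start}
    queue = [start]
    while queue:
        node = queue.pop()
        for triple in by_subject.get(node, []):
            result.append(triple)
            obj = triple[2]
            # Follow blank nodes only; named resources end the description.
            if obj.startswith("_:") and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return result
```

The appeal of this shape is that a single bounded retrieval returns everything needed to describe one term, including anonymous restriction nodes, instead of a series of separate pattern queries.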
"As we are starting to use OBI IDs in our production system, we would like to link directly from an OBI ID to a corresponding website with definition, metadata, synonyms etc. I know Alan had previously prototyped such a page; is their an estimate when this would be available?"
Oliver replied to the email:
"It's now available for VO IDs. For example, if you click on the VO class 'DNA vaccine': http://purl.obolibrary.org/obo/VO_0000032, it will directly link to: http://www.violinet.org/vaccineontology/vobrowser/rdf.php?iri=http://purl.obolibrary.org/obo/VO_0000032. In addition, if you look for the source of the web page, it's owl format and contains all information about this term. So we have managed to separate the web display and source OWL file output. This feature has partially been expanded to all other ontologies in OntoBee: http://www.ontobee.org/. It contains OBI", ...
"We don't have a direct link to the above page from a click on the URL yet: http://purl.obolibrary.org/obo/OBI_0000426. Some work needs to be done first from http://purl.obolibrary.org/obo/ to make this occur." ... "Chris Mungall has proposed to use OntoBee for all OBO foundry ontology term visualization."
"Oliver: That looks really nice, excellent!"
By the end of 2010, all OBI ontology term PURL IDs were resolved to Ontobee by default. The OBO PURL redirection protocol was later described in the OBO Foundry document "OBOPURLDomain". This document has been generated and maintained by the OBO Foundry Operation Committee, where Alan, Chris and Melanie are all owners, and Oliver is a contributor (see the people list of the committee).
The Ontobee technology can be used for different applications. One application is ontology alignment, comparison, and analysis. Oliver was developing a Brucellosis Ontology (IDOBRU) as an extension of the Infectious Disease Ontology core (IDO-core). Other IDO-core extensions, including the Malaria Ontology (IDOMAL) and the Influenza Ontology (FLU), were also being developed. With Oliver's suggestions and support, Allen developed the Ontobeep program for aligning and comparing different ontologies, such as IDO-core and all its extensions. At the IDO Workshop 2010 held on Dec 8-9, 2010, Allen presented the Ontobeep program and its use in the alignment and comparison of IDO-core and its extensions (see the PPT file on the workshop site, or the PDF copy on the Ontobee site).
At the 2nd International Conference on Biomedical Ontologies (ICBO) held in July 2011, Oliver and Allen presented Ontobee as a short paper. Chris and Alan were the two other coauthors of the paper. The Ontobee paper was later published: http://ceur-ws.org/Vol-833/paper48.pdf.
4. Stage 4: Further develop and maintain Ontobee until Allen left [Time: from July 2011 to March 2013]
Besides OBO library ontologies, many non-OBO ontologies were also included in Ontobee. For the non-OBO Foundry ontologies, a typical scenario was that a primary ontology developer or representative contacted Oliver directly, or they contacted another person who then contacted Oliver. Then Oliver and Allen discussed and uploaded the ontology to Ontobee.
Many researchers from the ontology community, especially the OBO Foundry community, have provided many suggestions and comments. For example, Alan, Melanie, and Chris contributed many ideas in discussions with Allen and Oliver about what role the browser should play and about the layout and contents of the page. Another example is that Jie suggested including an Excel term list worksheet for each ontology. This suggestion was implemented by Allen. The ontology term lists could be found on the cover page of Ontobee, in the last column of the Ontobee ontology table.
Allen left Oliver He's laboratory in March 2013 due to personal reasons. As an excellent software developer, Allen had been the primary developer of Ontobee. While Oliver also contributed to the Ontobee programming, his effort was primarily in the web interface, HTML coding, and documentation. Allen's departure was a big loss to Ontobee (and to other He group software development).
5. Stage 5: Further develop and maintain Ontobee after Allen left [Time: from March 2013 to now]
After Allen left in March 2013, Oliver took full responsibility for maintaining and updating Ontobee. Since Ontobee has become very popular, many requests and questions have come to Oliver. The project has thus taken up a lot of Oliver's time. Fortunately, starting from May 2013, Oliver found some technical help from several experienced software developers at the University of Michigan (UM). In particular, Mr. Yue Liu helped to fix some bugs in Ontobee. Mr. Bin Zhao was initially recruited as a summer intern, through a collaborative MCubed project funded for Oliver and two other PIs at UM, to work on SPARQL-based Semantic Web programming for an informed consent ontology (ICO) project. After the summer ended, Bin was recruited by Oliver with Oliver's NIH R01 funding to work on the Ontobee updates and some other bioinformatics projects in the He group.
On 8/2/2013, Oliver created an Ontobee-discuss Google Group. This discussion group provides a good platform for Ontobee users to discuss issues and propose solutions.
One significant improvement in Ontobee has been the documentation. In August 2013, Oliver prepared detailed tutorial documentation for the Ontobee tutorial page: http://www.ontobee.org/tutorial. Bin and Oliver also prepared a detailed tutorial page specific for Ontobee SPARQL: http://www.ontobee.org/tutorial/sparql. In the Ontobee SPARQL website: http://www.ontobee.org/sparql, Bin and Oliver also prepared four clear and easy-to-modify examples to show how to use Ontobee SPARQL to query different types of ontology information. In March 2014, Oliver wrote a tutorial on Ontobeep: http://www.ontobee.org/tutorial/ontobeep.
At the end of 2013, Bin and Oliver cleaned up the Ontobee source code, and then submitted the source code on Dec 23 to the new Ontobee GitHub repository: https://github.com/ontoden/ontobee. The source code is released under the Apache 2.0 open source license. The use of this license was approved by the University of Michigan TechTransfer office.
With Oliver's guidance and support, Bin developed a new Ontobee program now called "Ontobeest". Ontobeest is an Ontobee-based tool for extracting and displaying detailed statistics for each ontology listed in Ontobee. Since each ontology typically reuses terms from many other ontologies, the tool counts the classes, annotation properties, object properties, and datatype properties that are owned by an ontology or imported into it from other ontologies. The first version of Ontobeest (initially called Ontostat) was released for Ontobee users on 9/30/2013. The web page of detailed statistics for an ontology could be opened by clicking on "Detailed Statistics" on the ontology home page such as the ICO webpage. An Ontobee statistics web cover page was also developed: http://www.ontobee.org/ontostat. This program was named "Ontobeest" by Oliver and Bin on 3/25/2014. Oliver also wrote a tutorial on this: http://www.ontobee.org/tutorial/#ontobeest.
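One simple way to separate an ontology's own terms from imported ones is to group term IRIs by their OBO ID prefix, relying on the OBO PURL format mentioned earlier (http://purl.obolibrary.org/obo/VO_xxxxxxx). The Python sketch below illustrates that idea only. It is an assumption about how such counting could be done, not a description of the actual Ontobeest code, and the function names are hypothetical.

```python
from collections import Counter

OBO_BASE = "http://purl.obolibrary.org/obo/"


def id_prefix(term_iri):
    """Return the OBO ID prefix (e.g. 'VO') of a term IRI, or None."""
    if not term_iri.startswith(OBO_BASE):
        return None
    local = term_iri[len(OBO_BASE):]
    prefix, sep, _ = local.partition("_")
    # Without an underscore the IRI does not follow the PREFIX_number pattern.
    return prefix if sep else None


def count_by_prefix(term_iris):
    """Count term IRIs per OBO prefix; anything else is grouped as 'non-OBO'."""
    counts = Counter()
    for iri in term_iris:
        counts[id_prefix(iri) or "non-OBO"] += 1
    return counts
```

For an ontology such as VO, the count under "VO" would then correspond to its own terms and every other key to terms reused from elsewhere. The same grouping could be applied separately to classes, object properties, and the other term types.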
-- More work to be done. New history is being generated ...
6. Appendix: About OBI RDF dereferencing prototype
Before Allen and Oliver in the He group implemented ontology URI dereferencing to RDF while keeping the HTML visualization in Ontobee, a lot of related work had been done in the area. Here is a summary of some related information (Note: much of the information below was provided by Melanie. Thanks!):
- The ID policy was developed, which included creating URIs for dereferencing - see old doc at https://docs.google.com/Doc?docid=0Acx6Blq96uycZHpwcm5td18wZGtkMmdiZ3Y&hl=en. However those criteria are not new and are borrowed from sharednames as well. That doc was shared in an email dated November 3rd 2009.
- The implementation of the OBI ids creation prototype was discussed in March 2008.
- OBI had been trying to get an HTML rendering of terms distributed with the released versions - Alan wrote a script which generated obi-lsw-report.html. See http://sourceforge.net/p/obi/code/HEAD/tree/releases/2008-03-10/obi-lsw-report.html (dated March 2008)
- Alan Ruttenberg, Melanie, Bill Bug (maybe others) developed the initial dereferencing prototype in the context of OBI. The post at http://ontolog.cim3.net/forum/ontolog-forum/2008-10/msg00035.html, dated October 2008, gives technical details on the mechanism, and points to the original source of the XSL transformation idea in this context, i.e. bioguid mentioned by Jonathan Rees. A section of the paper "the OWL of Biomedical Investigations" that we presented at OWLED 2008 covers this topic, http://www.webont.org/owled/2008/papers/owled2008eu_submission_38.pdf - see section 2.9 OBI terms on the web. Bill and Melanie's contributions were mostly testing and discussion, the implementation was done by Alan on the neurocommons server.
- About the OBI prototype published in the conference OWLED 2008, here is an introduction by Alan on 8/6/2013:
"Incidentally, the prototype was not an OBI project, but rather one that sprung from work that Jonathan Rees and I did working for Science Commons. Through my work on OBI we used OBI as a use case and therefore brought OBI in as a working example. Melanie was most involved as the initial developer from OBI who was interested in the issue and helped guide development of the OBI proof of concept from the first protoptype I implemented using the Neurocommons wiki. See http://neurocommons.org/page/CommonsPurl. For the OBI prototype, where we couldn't use semi-static templates, the proof of concept was to generate static RDF html by code that used (I can't remember which atm) either the Pellet API, or the OWLAPI. We intended to deploy, at first, by generating these static pages as part of the OBI Build process."
- The RDF dereferencing feature was achieved dynamically in Ontobee in late October 2009. Melanie became aware of the VO browser in 2009, and recommended its use to OBI. As described above, as of 10/7/2009, the VO browser was only serving HTML pages. However, in the period of 10/7-10/22/2009, Allen and Oliver worked hard and added the RDF feature to OntoBee. Later the contents of this feature continued to improve. There have been many insightful suggestions and comments from OBI developers and other interested developers and users.
Disclaimer: The information on this page is based primarily on preserved electronic records. It was prepared to the best of Oliver's knowledge. If you find any description inaccurate or incomplete, please contact Oliver for possible corrections. Thank you!
Note: The page was generated by Oliver He on 8/18/2013; updated on 3/26-4/3/2014.