This page provides a broad overview of our software development activities. Phenoscape supports open development processes and collaboration. All source code we create is available from open-source repositories such as GitHub and SourceForge, and we work with existing open-source projects whenever possible.
One of the chief objectives of the Phenoscape project is to provide a centralized repository that stores evolutionary phenotype annotations entered by curators and integrates relevant data imported from partner projects. The Knowledgebase consists of a back-end semantic data store, a web service application that provides a query API for the data, a web application user interface for exploring the Knowledgebase, and accessory tools for loading data.
RDF-OWL version (in development)
To meet new requirements for scalability, reasoning expressivity, and support for linked data standards, we are developing a new Knowledgebase infrastructure based on RDF and OWL reasoning. Some in-progress components:
- phenoscape-owl-tools - build application for generating the RDF content for the Phenoscape KB, including utilities for converting datasets to OWL and running reasoning tasks.
- phenoscape-kb-services - updated web services API for the Knowledgebase. It is still at a very early stage of development.
- owlet - a query expansion preprocessor for SPARQL. It parses embedded OWL class expressions and uses an OWL reasoner to replace them with FILTER statements containing the URIs of subclasses of the given class expression (or superclasses, equivalent classes, or instances). It is written in Scala but can be used in any Java application.
- Scowl - a Scala library allowing a declarative approach to composing OWL expressions and axioms using the OWL API.
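The query expansion idea behind owlet can be illustrated with a small Python sketch. This is not owlet's API or its embedding syntax: the `SUBCLASSES_OF(...)` marker, the `expand` function, the example URIs, and the hard-coded hierarchy standing in for an OWL reasoner are all assumptions made for illustration.

```python
import re

# Toy asserted hierarchy (child -> parent). Hypothetical URIs, standing in
# for what a real OWL reasoner would compute from an ontology.
SUBCLASS_OF = {
    "http://example.org/PectoralFin": "http://example.org/Fin",
    "http://example.org/DorsalFin": "http://example.org/Fin",
    "http://example.org/Fin": "http://example.org/Appendage",
}

def subclasses(cls):
    """Return all direct and indirect subclasses of cls, sorted for stable output."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for child, parent in SUBCLASS_OF.items():
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return sorted(found)

# Hypothetical marker syntax used only in this sketch: SUBCLASSES_OF(?var, <uri>)
MARKER = re.compile(r"SUBCLASSES_OF\((\?\w+),\s*<([^>]+)>\)")

def expand(query):
    """Replace each marker with a FILTER listing the URIs of the subclasses."""
    def replace(match):
        var, cls = match.group(1), match.group(2)
        uris = ", ".join(f"<{uri}>" for uri in subclasses(cls))
        return f"FILTER({var} IN ({uris}))"
    return MARKER.sub(replace, query)

query = "SELECT ?t WHERE { ?t a ?c . SUBCLASSES_OF(?c, <http://example.org/Fin>) }"
print(expand(query))
```

The expanded query contains only plain URIs, so it can be sent to a SPARQL endpoint that has no OWL reasoning of its own.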
The live Phenoscape Knowledgebase is available at http://kb.phenoscape.org/. It makes use of the following software projects:
- OBD - an ontology-driven datastore developed by the Berkeley Bioinformatics Open-source Projects group. It provides a generic schema for storing ontologies and semantic data expressed with those ontologies. It also includes an automated reasoner used to materialize inferred knowledge into the datastore. The master build script for loading the Phenoscape Knowledgebase is included within the OBD source code.
- Phenoscape OBD data services - a suite of web services on top of OBD to serve as a data access API and foundation for our user-oriented Phenoscape web application. These web services present a RESTful service interface using the Restlet Java API. The specifications of these services are detailed in Data Services.
- Phenoscape web UI - a web application providing user-friendly interfaces for browsing and querying data and ontologies within the Knowledgebase.
- Phenoscape data loader - a Java library providing translations of various data inputs into the OBD model, used within the OBD build script.
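As a rough illustration of what materializing inferred knowledge means for a datastore like OBD, the following Python sketch computes the links implied by transitivity of a subclass relation and returns them together with the asserted ones. It is a toy under stated assumptions: the `materialize` function and the example terms are hypothetical, and OBD's actual reasoner and schema are not shown here.

```python
def materialize(asserted):
    """Return the asserted (child, parent) links plus every link implied by transitivity."""
    closure = set(asserted)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                # If a is_a b and b is_a d, then a is_a d can be stored as well.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Hypothetical example terms, not taken from a real ontology file.
asserted = {("pectoral fin", "fin"), ("fin", "appendage")}
for link in sorted(materialize(asserted)):
    print(link)
```

Storing the inferred links ahead of time lets queries find, for example, everything annotated to a kind of appendage without running a reasoner at query time.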
Data curation tools
Phenex is ontology annotation software tailored for phenotype annotation of evolutionary character matrices. It saves ontology annotations alongside traditional character matrix data using the new NeXML format standard for evolutionary data.
CharaParser is a natural-language processing tool which analyzes the text of character-state descriptions to produce a structured output used to generate proposals for ontological phenotype annotations. We are working to enhance the performance of CharaParser and also to integrate it with Phenex, so that data curators can take advantage of natural-language processing to accelerate their workflow.
Other software products
The VTOTool assembles the VTO by starting from an initial taxonomy (NCBI), deleting groups, and splicing in the corresponding group-specific taxonomies (e.g., TTO, ATO). It will also add taxonomy from PaleoDB and synonyms from other taxonomic sources (e.g., Catalogue of Life).
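The delete-and-splice step can be sketched in Python as follows. This is a minimal illustration, not the VTOTool's implementation: taxonomies are assumed to be simple child-to-parent mappings, and the `descendants` and `splice` functions and the example taxa arrangement are hypothetical.

```python
def descendants(parent_of, root):
    """Return all taxa below root in a child -> parent mapping."""
    result = set()
    frontier = [root]
    while frontier:
        current = frontier.pop()
        for child, parent in parent_of.items():
            if parent == current and child not in result:
                result.add(child)
                frontier.append(child)
    return result

def splice(base, group_root, group_taxonomy):
    """Replace everything below group_root in base with group_taxonomy."""
    removed = descendants(base, group_root)
    # Keep every base entry that is not part of the deleted group.
    merged = {child: parent for child, parent in base.items() if child not in removed}
    # Attach the group-specific taxonomy, which hangs off group_root.
    merged.update(group_taxonomy)
    return merged

# Toy base taxonomy and group-specific replacement (illustrative names only).
base = {
    "Teleostei": "Vertebrata",
    "Danio": "Teleostei",
    "Amphibia": "Vertebrata",
    "Rana": "Amphibia",
}
group = {"Cypriniformes": "Teleostei", "Danio rerio": "Cypriniformes"}
print(splice(base, "Teleostei", group))
```

Taxa outside the replaced group are left untouched, so several group-specific taxonomies could be spliced in one after another.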
This tool (legacy from Phenoscape I) builds the Teleost Taxonomy Ontology (TTO) from a download of the Catalog of Fishes.