This project defines a simple architecture, and its Java implementation, that allows you to create BioMoby Web services accessing BioCASE-aware databases (collections). This means that many existing BioMoby clients can be used almost immediately (some programming is still needed), giving access to the richness of the many databases available via BioCASE wrappers.
The BioMoby/BioCASE Web services are partly generated (from the information stored about each BioMoby service in a BioMoby registry) and partly programmed manually in the Java BeanShell scripting language. Future plans include shifting more items to the generated part.
The expected audience for this document is developers who wish to create their own BioMoby services on top of existing BioCASE databases. Therefore, some chapters of this document are rather technical. The backbone of this project is jMoby, a Java development environment for BioMoby. In particular, this document refers many times to:
An example of an input name is Vicia faba (a species of bean (Fabaceae) native to north Africa and southwest Asia, and extensively cultivated elsewhere).
The service also has a paging mechanism: a client can either ask for all results in one call, or ask for the results in separate pages (see secondary inputs for paging).
So the above is the BioMoby Web service we are going to implement. It will bring data from SINGER (the System-wide Information Network for Genetic Resources), a system providing information about samples of crop, forage, and tree germplasm.
cvs -d :pserver:cvs@cvs.open-bio.org:/home/repository/moby login
cvs -d :pserver:cvs@cvs.open-bio.org:/home/repository/moby co -P moby-live/Java
cd moby-live/Java
./build.sh
./build-dev.sh dashboard
cvs -d :pserver:anonymous@cvs.cropforge.org:/cvsroot/gcpmoby login
cvs -d :pserver:anonymous@cvs.cropforge.org:/cvsroot/gcpmoby co bb_services
cd bb_services
ant all
cd moby-live/Java
./build-dev.sh dashboard
Setting -> Panel selection -> Registration -> Service Registration
Our sample service and its input/output data types are already registered:
cd moby-live/Java
./build-dev.sh dashboard
Setting -> Panel selection -> MoSeS Generator
Select service...: getCoordinatesOfTaxon
Button: All-in-One: Do it all
MoSeS also packs the generated code into two jar files that you need to copy to the bb_services/lib directory:
cp moby-live/Java/build/lib/biomoby-datatypes.jar bb_services/lib/
cp moby-live/Java/build/lib/biomoby-skeletons.jar bb_services/lib/
You may notice that there are already similar jar files in bb_services/lib (called sample-...). These contain the skeleton and data types for our sample service.
This is the most important step, and the only one where you will need your programming skills. This chapter is the core of this documentation (the steps above and below just repeat what is needed for any BioMoby service).
All implementation activities are done in the bb_services directory:
cd bb_services
The picture on the left shows the overall architecture of what happens
when a BioMoby client sends a BioMoby request, and before she gets
back a BioMoby response:
The picture also indicates what you, as a service developer, have to accomplish. There are (usually) four files that you need to create; they are indicated by the cyan color. Sometimes, the XSLT Stylesheet may be omitted.
The Service Implementation is a regular Java class that inherits from a generated skeleton. It can be as complex as you wish, but usually it is very simple (in the future, it may even be generated automatically). It does the whole job by calling some general classes available in the bb_services package and passing them the service properties.
The Service Properties are the core of your service implementation. They contain BeanShell scripts (Java or Java-like code) that transform BioMoby input data into a BioCASE request, and the BioCASE response back into BioMoby output data. The properties also know where to find the other needed files.
The Request Template is a BioCASE XML request with tokens. Tokens are strings (e.g. @REC_START@) that are substituted by real data before the request is sent to a BioCASE site.
The XSLT Stylesheet transforms a full BioCASE response into a smaller one that may be easier to deal with when it is converted to BioMoby output data.
It is recommended to put the files described below into the suggested directories so that the Ant builder can find them. If you wish to locate them elsewhere, consider editing Ant's main configuration file, bb_services/build.xml.
In several places, a paging mechanism is mentioned. A BioCASE request can ask for only a subset of the available records. BioMoby services may indicate the same in their secondary parameters, but these parameters are not standardized by the BioMoby API. Therefore, I recommend following the pattern given in our sample service: there are two optional secondary parameters, startPage and maxPages. The former indicates the sequential number of the first wanted page (zero means from the beginning), the latter the maximum number of wanted pages (a negative or an empty value means all records).
In addition to this user-driven paging, a service implementation needs to deal with the fact that BioCASE does not always return all wanted records in one go; it returns only a chunk of them (usually about 500) and indicates how many records are still available. Fortunately, the bb_services general classes used by your service implementation deal with these repeated calls automatically.
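The repeated chunk retrieval described above can be sketched roughly as follows. This is only an illustration of the idea, not the bb_services code: the Chunk and Source types, the listSource and fetchAll methods, and the use of record counts instead of pages are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; none of these names come from bb_services. */
class ChunkedFetchSketch {

    /** One (hypothetical) chunk of records plus the total number of hits reported with it. */
    static class Chunk {
        final List<String> records;
        final int totalSearchHits;

        Chunk(List<String> records, int totalSearchHits) {
            this.records = records;
            this.totalSearchHits = totalSearchHits;
        }
    }

    /** Hypothetical data source: returns at most 'limit' records, starting at record 'start'. */
    interface Source {
        Chunk fetch(int start, int limit);
    }

    /** Builds an in-memory source for experiments; a real source would call a BioCASE endpoint. */
    static Source listSource(List<String> data) {
        return (start, limit) -> {
            int from = Math.min(start, data.size());
            int to = Math.min(from + limit, data.size());
            return new Chunk(new ArrayList<>(data.subList(from, to)), data.size());
        };
    }

    /**
     * Collects up to 'wanted' records beginning at 'start', asking for at most
     * 'chunkSize' records per call. A negative 'wanted' means "all records".
     */
    static List<String> fetchAll(Source source, int start, int wanted, int chunkSize) {
        List<String> all = new ArrayList<>();
        int next = start;
        while (wanted < 0 || all.size() < wanted) {
            int limit = (wanted < 0) ? chunkSize : Math.min(chunkSize, wanted - all.size());
            Chunk chunk = source.fetch(next, limit);
            all.addAll(chunk.records);
            next += chunk.records.size();
            // Stop as soon as the source has nothing more to give.
            if (chunk.records.isEmpty() || next >= chunk.totalSearchHits) {
                break;
            }
        }
        return all;
    }

    public static void main(String[] args) {
        Source source = listSource(Arrays.asList("a", "b", "c", "d", "e", "f", "g"));
        System.out.println(fetchAll(source, 0, -1, 3)); // all records, three per call
        System.out.println(fetchAll(source, 2, 4, 3));  // four records starting at index 2
    }
}
```

The loop keeps asking for the next chunk until either the requested number of records has been collected or the reported total has been reached, which mirrors the behaviour described for BioCASE responses.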
package org.generationcp.bbservices;

import org.generationcp.www.getCoordinatesOfTaxonSkel;
import org.biomoby.shared.MobyException;
import org.biomoby.shared.parser.MobyPackage;
import org.biomoby.shared.parser.MobyJob;
import org.generationcp.bbservices.support.BiocaseWorker;
import org.generationcp.bbservices.support.Paging;

public class GetCoordinatesOfTaxonImpl
    extends getCoordinatesOfTaxonSkel {

    public void processIt (MobyJob request,
                           MobyJob response,
                           MobyPackage outputContext)
        throws MobyException {
        // Read the optional paging parameters from the BioMoby request
        Paging paging = new Paging (getParameter_startPage (request),
                                    getParameter_maxPages (request));
        // Let the general bb_services worker do the rest of the job
        BiocaseWorker worker = new BiocaseWorker (this, paging);
        worker.call (request, response, outputContext);
    }
}
The bold font indicates where you should put your own names,
the italic font indicates the places that may be changed
(e.g. if your input BioMoby data type has different parameters for
paging mechanism). Once you create your implementation class, you can compile it by calling Ant:
ant
If it compiles successfully you are done with this class. The Java API documentation can be created by calling:
ant docs
<?xml version='1.0' encoding='UTF-8'?>
<request xmlns='http://www.biocase.org/schemas/protocol/1.3'>
  <header>
    <type>search</type>
  </header>
  <search>
    <requestFormat>http://www.ipgri.org/schemas/gcp_pass/1.03</requestFormat>
    <responseFormat start='@REC_START@' limit='@REC_LIMIT@'>
      http://www.ipgri.org/schemas/gcp_pass/1.03
    </responseFormat>
    <filter>
      <and>
        <equals path='/DataSets/DataSet/GermplasmSamples/GermplasmSample/Classification/Taxonomy/FullScientificName'>@TAXON_NAME@</equals>
        <isNotNull path='/DataSets/DataSet/GermplasmSamples/GermplasmSample/Origin/CollectingEvent/SiteLocation/LatitudeDecimalValue' />
        <isNotNull path='/DataSets/DataSet/GermplasmSamples/GermplasmSample/Origin/CollectingEvent/SiteLocation/LongitudeDecimalValue' />
      </and>
    </filter>
    <count>false</count>
  </search>
</request>
There are three tokens in this example. The @REC_START@ and @REC_LIMIT@ tokens should always be named like this (otherwise you need to override some bb_services general classes). The @TAXON_NAME@ token can have any name, but the name must correspond to a service property name. In this case, the property is named request.beanshell.TAXON_NAME. (The value of this property is described below.) The @TAXON_NAME@ token will be replaced by user data (coming from a BioMoby input data type). A request may contain more such user-driven tokens (each of them has to have a corresponding service property).
Creating a request template requires knowledge of the BioCASE XML protocols (but in most cases you can just follow the examples).
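To make the idea of token substitution concrete, here is a minimal sketch in plain Java. It is not the bb_services implementation: the class name TokenTemplateSketch, the fill and escapeXml methods, the token pattern, and the decision to XML-escape the substituted values are all assumptions made for this illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; not part of bb_services. */
class TokenTemplateSketch {

    /** Assumed token syntax: upper-case letters and underscores between two '@' characters. */
    private static final Pattern TOKEN = Pattern.compile("@([A-Z_]+)@");

    /**
     * Replaces every @NAME@ token that has a value in 'values' by that value
     * (XML-escaped). Tokens without a value are left untouched.
     */
    static String fill(String template, Map<String, String> values) {
        Matcher matcher = TOKEN.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            String replacement = (value == null) ? matcher.group() : escapeXml(value);
            // quoteReplacement protects '$' and '\' characters in the user data
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    /** Escapes the five characters that are special in XML text and attributes. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("'", "&apos;")
                   .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String template = "<equals path='/x'>@TAXON_NAME@</equals>";
        System.out.println(fill(template,
            java.util.Collections.singletonMap("TAXON_NAME", "Vicia faba")));
    }
}
```

Substituting in a single pass over the template means that user data containing something that looks like a token is never substituted a second time.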
If you write such a stylesheet, there are two requirements: The transformed response must still have an element named content with the following attributes containing the paging status:
<content recordDropped='0'
recordCount='10'
recordStart='0'
totalSearchHits='11'/>
Note that in the transformed response, the rest of the content element can be empty (in the full BioCASE response, the whole response is actually part of the content element). In addition (optional, but recommended), keep the whole diagnostics section in the transformed response exactly as it is in the original response from BioCASE. This helps to detect and report errors that happen on the BioCASE level (when extracting data from databases). To keep the whole section, it is usually enough to put something like this into your XSLT stylesheet:
<xsl:copy-of select = "//bi:diagnostics" />
Second, having this mechanism in place, we may introduce, in the future, a direct transformation from a BioCASE response to a BioMoby output message. This is not possible now because the MoSeS-generated classes do not accept ready-made BioMoby output that is already in XML, and also because it is not straightforward to combine several XML response chunks (remember the paging mechanism) when using raw XML.
<xsl:stylesheet
    xmlns:bi="http://www.biocase.org/schemas/protocol/1.3"
    xmlns:gcp="http://www.ipgri.org/schemas/gcp_pass/1.03"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <xsl:element name="Coordinates">
      <xsl:element name="content">
        <xsl:apply-templates select="bi:response/bi:content/@*"/>
      </xsl:element>
      <xsl:copy-of select = "//bi:diagnostics" />
      <xsl:element name="SiteLocations">
        <xsl:apply-templates select="bi:response/bi:content/gcp:DataSets/gcp:DataSet/gcp:GermplasmSamples/gcp:GermplasmSample/gcp:Origin/gcp:CollectingEvent/gcp:SiteLocation"/>
      </xsl:element>
    </xsl:element>
  </xsl:template>
  <!-- various record counters -->
  <xsl:template match="bi:content/@*">
    <xsl:attribute name="{local-name(.)}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>
  <!-- coordinates of a site -->
  <xsl:template match="gcp:SiteLocation">
    <xsl:element name="SiteLocation">
      <xsl:apply-templates select="gcp:LatitudeDecimalValue"/>
      <xsl:apply-templates select="gcp:LongitudeDecimalValue"/>
      <xsl:element name="SpatialDatum">WGS84</xsl:element>
    </xsl:element>
  </xsl:template>
  <xsl:template match="gcp:LatitudeDecimalValue | gcp:LongitudeDecimalValue">
    <xsl:element name="{local-name(.)}">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
To see what a response transformed by this stylesheet looks like, you can call a command-line program that is included in bb_services and is useful for testing your own stylesheets (put everything on one line; do not use backslashes, they are here only for better readability):
build/run/run-transformer \
data/BioCASE-response-example.xml \
src/etc/services/GetCoordinatesOfTaxonImpl.xsl
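If you prefer to experiment with a stylesheet from your own Java code, the standard JAXP API (javax.xml.transform) can apply it. The following is a minimal sketch under stated assumptions: the class XsltSketch, its transform method, and the two tiny sample documents are made up for this illustration, and nothing here claims to show how run-transformer itself is implemented.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Illustrative sketch only; not part of bb_services. */
class XsltSketch {

    /** A tiny stand-in for a BioCASE response (hypothetical sample data). */
    static final String SAMPLE_XML =
        "<response xmlns='http://www.biocase.org/schemas/protocol/1.3'>"
        + "<content recordCount='10' totalSearchHits='11'/>"
        + "</response>";

    /** A tiny stylesheet that extracts one paging attribute as plain text. */
    static final String SAMPLE_XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:bi='http://www.biocase.org/schemas/protocol/1.3'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:value-of select='bi:response/bi:content/@totalSearchHits'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the given stylesheet to the given XML and returns the result as a string. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so that callers of this sketch need no checked exception handling
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSLT));
    }
}
```

Note that the stylesheet has to declare the BioCASE namespace prefix (bi) itself, exactly as the sample stylesheet above does; without it, the select expressions would match nothing.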
# Run-time properties for service: GetCoordinatesOfTaxonImpl
# ----------------------------------------------------------
biocase.endpoint = http://biocase.grinfo.net//pywrapper.cgi?dsa=SINGER
###response.timeout = 60000
###keep.response = true
###request.filename = GetCoordinatesOfTaxonImpl.request
request.beanshell.TAXON_NAME = service.get_FullScientificName(request).get_Name()
###response.xslt.filename = GetCoordinatesOfTaxonImpl.xsl
response.beanshell.elem.start.SiteLocation = handler.push (new Coordinates())
response.beanshell.elem.start.LatitudeDecimalValue = handler.push (new LatitudeDecimal())
response.beanshell.elem.start.LongitudeDecimalValue = handler.push (new LongitudeDecimal())
response.beanshell.elem.start.SpatialDatum = handler.push (new SpatialDatum())
response.beanshell.elem.end.LatitudeDecimalValue = \
    number = new MobyFloat(); number.setValue (value); \
    current.set_LatitudeDecimal (number); \
    parent.set_LatitudeDecimal (current);
response.beanshell.elem.end.LongitudeDecimalValue = \
    number = new MobyFloat(); number.setValue (value); \
    current.set_LongitudeDecimal (number); \
    parent.set_LongitudeDecimal (current);
response.beanshell.elem.end.SpatialDatum = \
    current.set_SpatialDatum (value); \
    parent.set_SpatialDatum (current);
response.beanshell.elem.end.SiteLocation = \
    lat = current.getMoby_LatitudeDecimal().getMoby_LatitudeDecimal().getValue(); \
    lon = current.getMoby_LongitudeDecimal().getMoby_LongitudeDecimal().getValue(); \
    for (en = result.elements(); en.hasMoreElements(); ) { \
        elem = en.nextElement(); \
        if (lat.equals (elem.getMoby_LatitudeDecimal().getMoby_LatitudeDecimal().getValue()) && \
            lon.equals (elem.getMoby_LongitudeDecimal().getMoby_LongitudeDecimal().getValue())) \
            return; \
    } \
    result.addElement (current);
response.beanshell.end = \
    realResult = new Coordinates [result.size()]; \
    result.copyInto (realResult); \
    service.set_CoordinatesSet (response, realResult); \
    log.info ("Response contains " + realResult.length + " locations.");
Remember, this is the core of your implementation. Here you express your programming skills. The most important properties contain Java BeanShell scripts. You can use pure Java here, or you can simplify it by using the loose typing of BeanShell. You do not need to compile it.
There are two places where BeanShell scripts are needed: For converting BioMoby input data to a BioCASE request, and for converting BioCASE response back to BioMoby output data. The former is more straightforward because it just creates strings that are used to replace tokens in the BioCASE request template. For each such token there should be one property, named correspondingly:
| Variable | Description |
|---|---|
| request | An object of type org.biomoby.shared.parser.MobyJob containing BioMoby input. |
| service | An object representing your service implementation class. |
For example:
request.beanshell.TAXON_NAME = service.get_FullScientificName(request).get_Name()
The methods used here, get_FullScientificName and get_Name, were generated from the article names used when the service was registered in a BioMoby registry (as you can check in our sample service).
The implementation stores each XML element on an object stack when an element is encountered, and removes it again when the same element ends. Let's see how it can be dealt with in your BeanShell code. The following variables are set automatically before the BeanShell code is evaluated - both at the beginning and at the end of each XML element:
| Variable | Description |
|---|---|
| handler | An object of type org.generationcp.bbservices.support.ResultHandler representing the current SAX-based event parser. Usually, you need it only because of its push method. Its API is documented in the bb_services/docs/API/index.html. |
| service | An object representing your service implementation class. Usually, it is needed only at the very end in order to put results into BioMoby output data. |
| log | An object that can be used to write into a log file. Use its methods log.info, log.debug, etc. Its object type is org.apache.commons.logging.Log (more details about logging are here). |
| result | An object of type java.util.Vector suitable for storing data read from BioCASE responses. This object is kept for the whole time of all BioCASE chunk responses - but at the end you have to move data from here to a BioMoby output. |
| paging | An object of type org.generationcp.bbservices.support.Paging responsible for the paging mechanism. Usually, there is no need for it in your BeanShell code (and if there is, you need to read its API first). |
Your
responsibility is to put an appropriate object on the object
stack, using the handler.push method. This cannot be
done automatically because only you know what kind of object it should
be (the object type depends on the BioMoby output type for this
particular service). The code can use the following (additional) variables (set automatically before the code is evaluated):
| Variable | Description |
|---|---|
| attrs | An object of type org.xml.sax.Attributes containing XML attributes of the currently encountered XML element. |
For example:
response.beanshell.elem.start.LatitudeDecimalValue = handler.push (new LatitudeDecimal())
Note that the example uses simplified BeanShell syntax: in normal Java you would need to qualify (directly or in an import statement) which package LatitudeDecimal comes from.
The usual task here is to take the just-finished element (it is in the variable current) and copy it either into its parent (which is still on the object stack and is available as the variable parent) or into result. Both alternatives are demonstrated in the examples below.
The code can use the following (additional) variables (set automatically before the code is evaluated):
| Variable | Description |
|---|---|
| value | An object of type java.lang.String containing the data/text part of the just finished XML element. It may be empty if this element does not have any text. |
| current | An object representing the just finished XML element. Its type depends on what you pushed on the object stack at the element beginning (see above property response.beanshell.elem.start.<XML-element-name>). |
| parent | An object representing the parent XML element of the just finished XML element. Its type also depends on what you pushed on the object stack. |
Usually, the code in these properties is more complex because here you actually have data to play with. For example, this shows how to fill the current object (an instance of LatitudeDecimal) with a value (an instance of MobyFloat), and how to incorporate it into its parent (an instance of Coordinates):
response.beanshell.elem.end.LatitudeDecimalValue = \
    number = new MobyFloat(); number.setValue (value); \
    current.set_LatitudeDecimal (number); \
    parent.set_LatitudeDecimal (current);
Another, more complex example shows how to put one set of coordinates into result (remember, result is the variable where we store all coordinates until the very end, when we copy them to the BioMoby output). The complexity of the code comes from the fact that we first check whether the same coordinates are already there (and if they are, we ignore this location):
response.beanshell.elem.end.SiteLocation = \
    lat = current.getMoby_LatitudeDecimal().getMoby_LatitudeDecimal().getValue(); \
    lon = current.getMoby_LongitudeDecimal().getMoby_LongitudeDecimal().getValue(); \
    for (en = result.elements(); en.hasMoreElements(); ) { \
        elem = en.nextElement(); \
        if (lat.equals (elem.getMoby_LatitudeDecimal() \
                        .getMoby_LatitudeDecimal().getValue()) && \
            lon.equals (elem.getMoby_LongitudeDecimal() \
                        .getMoby_LongitudeDecimal().getValue())) \
            return; \
    } \
    result.addElement (current);
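For readers who want to see the object-stack idea outside BeanShell, here is a self-contained sketch in plain Java using the standard SAX API. It is an illustration under assumptions, not the bb_services ResultHandler: the class ObjectStackSketch, its use of simple maps instead of generated BioMoby data types, and the sample XML are all made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Illustrative sketch only; maps stand in for the generated BioMoby data types. */
class ObjectStackSketch extends DefaultHandler {

    static final String SAMPLE =
        "<Coordinates><SiteLocations>"
        + "<SiteLocation><LatitudeDecimalValue>30.4833</LatitudeDecimalValue>"
        + "<LongitudeDecimalValue>35.5</LongitudeDecimalValue></SiteLocation>"
        + "<SiteLocation><LatitudeDecimalValue>30.4833</LatitudeDecimalValue>"
        + "<LongitudeDecimalValue>35.5</LongitudeDecimalValue></SiteLocation>"
        + "<SiteLocation><LatitudeDecimalValue>32.2833</LatitudeDecimalValue>"
        + "<LongitudeDecimalValue>35.9</LongitudeDecimalValue></SiteLocation>"
        + "</SiteLocations></Coordinates>";

    private final Deque<Map<String, String>> stack = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();
    final List<Map<String, String>> result = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        stack.push(new LinkedHashMap<>()); // one object per element, pushed at its start
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        Map<String, String> current = stack.pop();  // the just-finished element
        Map<String, String> parent = stack.peek();  // null for the root element
        String value = text.toString().trim();
        text.setLength(0);
        if (qName.equals("LatitudeDecimalValue") || qName.equals("LongitudeDecimalValue")) {
            parent.put(qName, value);               // copy the value into the parent
        } else if (qName.equals("SiteLocation") && !result.contains(current)) {
            result.add(current);                    // keep each location only once
        }
    }

    /** Parses the given XML and returns the distinct site locations found in it. */
    static List<Map<String, String>> parse(String xml) {
        try {
            ObjectStackSketch handler = new ObjectStackSketch();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.result;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(SAMPLE));
    }
}
```

The roles match the variables described above: the object popped at the end of an element corresponds to current, the object left on top of the stack corresponds to parent, and the list plays the part of result.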
The usual task here is to copy the whole result to an appropriate field in the BioMoby output. For that, the following (additional) variables are set automatically before the code is evaluated:
| Variable | Description |
|---|---|
| response | An object of type org.biomoby.shared.parser.MobyJob containing a (so far) empty BioMoby response. |
For example:
response.beanshell.end = \
    realResult = new Coordinates [result.size()]; \
    result.copyInto (realResult); \
    service.set_CoordinatesSet (response, realResult); \
    log.info ("Response contains " + realResult.length + " locations.");
Surprisingly, the best time for testing a service is before its deployment. It helps to find and fix problems without hassle with Tomcat and Axis. (But come back to test again after deployment.)
There are many BioMoby clients that can be used to call our services. Let's show two of them. One is a command-line program that is the best for fast debugging, the other one is again the BioMoby Dashboard.
The command-line program has its own help:
build/run/run-service -help
and its details are described in the MoSeS documentation. But without bothering with explanations or details, here is what you need to type (all on one line, no backslashes):
build/run/run-service \
    -service getCoordinatesOfTaxon \
    -class org.generationcp.bbservices.GetCoordinatesOfTaxonImpl \
    -showxml \
    -xml data/test-MOBY-input-with-paging.xml
Did you get back anything like this?
Jobs (invocations):
(1) Query ID: sip_1_
Data elements:
(Collection) Article name: Coordinates
Data elements:
(1) (Simple) Article name:
Coordinates
LongitudeDecimal, Article name: LongitudeDecimal
MobyFloat, Article name: LongitudeDecimal
Value: 35.5
LatitudeDecimal, Article name: LatitudeDecimal
MobyFloat, Article name: LatitudeDecimal
Value: 30.4833
SpatialDatum, Article name: SpatialDatum
MobyString, Article name: SpatialDatum
Value: WGS84
(2) (Simple) Article name:
Coordinates
LongitudeDecimal, Article name: LongitudeDecimal
MobyFloat, Article name: LongitudeDecimal
Value: 35.9
LatitudeDecimal, Article name: LatitudeDecimal
MobyFloat, Article name: LatitudeDecimal
Value: 32.2833
SpatialDatum, Article name: SpatialDatum
MobyString, Article name: SpatialDatum
Value: WGS84
Check also the log file bbservices.log. It contains something
similar to this:
2006-04-24 01:31:09,475 0 [main] DEBUG BiocaseWorker - Loading service properties from GetCoordinatesOfTaxonImpl.properties
2006-04-24 01:31:09,578 103 [main] DEBUG BiocaseWorker - Calling http://biocase.grinfo.net//pywrapper.cgi?dsa=SINGER
2006-04-24 01:31:09,578 103 [main] DEBUG BiocaseWorker - Request (paging tokens substituted):
<?xml version='1.0' encoding='UTF-8'?>
<request xmlns='http://www.biocase.org/schemas/protocol/1.3'>
<header>
<type>search</type>
</header>
...
...
...
2006-04-24 01:31:09,578 103 [main] DEBUG Paging - BioCASE request: REC_START=3 REC_COUNT=5
2006-04-24 01:31:16,785 7310 [main] DEBUG BiocaseWorker - Response retrieved: 16616 bytes
2006-04-24 01:31:17,816 8341 [main] DEBUG Paging - BioCASE response: REC_START=3 REC_COUNT=5 TOTAL_HITS=9
2006-04-24 01:31:17,830 8355 [main] INFO ResultHandler - Response contains 2 locations.
The command-line program can also create input data from its command-line parameters (so you do not need an input XML file). Unfortunately, the secondary parameters (used for the paging mechanism) are not yet supported here. The following example will take a minute or two and bring back all 2000 locations:
build/run/run-service \
    -service getCoordinatesOfTaxon \
    -class org.generationcp.bbservices.GetCoordinatesOfTaxonImpl \
    -showxml \
    -name FullScientificName \
    -obj TaxonScientificName "Name=Vicia faba"
In all the cases above, the implementation class was called directly (the parameter -class). This is good for debugging. You may use the same program to call a real service: just use the parameter -e with a service endpoint instead. For example, for our testing service, type:
build/run/run-service \
    -service getCoordinatesOfTaxon \
    -e http://hq3.grinfo.net:8080/axis/services/getCoordinatesOfTaxon \
    -showxml \
    -xml data/test-MOBY-input-with-paging.xml
cd moby-live/Java
./build-dev.sh dashboard
Setting -> Panel selection -> Simple Client
Select service...: getCoordinatesOfTaxon
Take an input from this XML file...
Button: Call service
Or, enter data directly (including perhaps the paging parameters):
Before any deployment, make sure that:
The Dashboard knows what jMoby files are needed and what are the service and implementation class names. But Dashboard does not know anything about files coming from the bb_services project. So let's start there:
cd bb_services
ant all
This again compiles everything and, importantly for deployment, puts all bb_services files into one directory: bb_services/build/lib. Remember this directory; it will be needed in Dashboard in a moment.
Collecting all the needed files can also be done by calling a specific Ant task:
cd bb_services
ant pre-deploy
The core of deployment is done by the BioMoby Dashboard in the MoSeS Generator Panel:
cd moby-live/Java
./build-dev.sh dashboard
Setting -> Panel selection -> MoSeS Generator
Pattern for implementation class names: org.generationcp.bbservices.{Service}Impl
Select service...: getCoordinatesOfTaxon
Button: Deploy
Before you press the Deploy button check at least these three things:
Type the
name of the directory with the bb_services
files into the field Directory with user's jar
files. This is how Dashboard learns about your
implementation files.
There is still one missing thing: the logging that will be used by Tomcat. This should be taken care of by Dashboard but, unfortunately, it is not implemented (yet). The proper configuration file for the logging (the one that will create a log file bbservices.log in directory <tomcat-home>/logs) was already created in bb_services/build/classes/log4j.properties.for.tomcat but you have to copy it manually. You might have noticed that the Ant in bb_services suggested what to do. There was this report:
pre-deploy:
[DO MANUALLY] cp build/classes/log4j.properties.for.tomcat <your-tomcat-home>/webapps/axis/WEB-INF/classes/log4j.properties
After deployment, restart your Tomcat (I recommend using Tomcat Manager and reloading only Axis), and try the service, now using its real endpoint.
I am getting Error in creating XML parser
Exception: error, INTERNAL_PROCESSING_ERROR AN ERROR OCCURED DURING THE SERVICE EXECUTION: Error in creating XML parser Parser class: org.apache.xerces.parsers.SAXParser(SAX2 driver class org.apache.xerces.parsers.SAXParser not found)
The problem is with your Tomcat installation. The XML parser jar files should be in your <your-tomcat>/common/endorsed directory. Move the files xercesImpl.jar and xml-apis.jar there (e.g. from your bb_services/lib), and restart your Tomcat (the whole Tomcat, not only its Axis part).
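If you want to check whether a given class loader can see the parser class named in the error message, a small probe like the one below may help. It is only a sketch: the class ParserCheckSketch and its isLoadable method are made up for this illustration, and when run from the command line it tests the command-line classpath, not Tomcat's class loaders, so to diagnose Tomcat itself the same check would have to run inside the web application.

```java
/** Illustrative sketch only; not part of bb_services or jMoby. */
class ParserCheckSketch {

    /** Returns true if the named class can be loaded by this class's class loader. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class is missing, or it is present but cannot be linked
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.xerces.parsers.SAXParser";
        System.out.println(name + (isLoadable(name) ? " is" : " is NOT")
                           + " visible to this class loader");
    }
}
```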