Apache Taverna


Apache Taverna is an open source software tool for designing and executing workflows. It was initially created by the myGrid project under the name Taverna Workbench and is now a project under the Apache Incubator. Taverna allows users to integrate many different software components, including SOAP/WSDL or REST Web services, such as those provided by the National Center for Biotechnology Information, the European Bioinformatics Institute, the DNA Databank of Japan, SoapLab, BioMOBY and EMBOSS. The set of available services is not fixed: users can import new service descriptions into the Taverna Workbench.
Taverna Workbench provides a desktop authoring environment and enactment engine for scientific workflows. The Taverna workflow engine is also available separately, as a Java API, command line tool or as a server.
Taverna is used in many domains, such as bioinformatics, cheminformatics, medicine, astronomy, social science, music, and digital preservation.
Some of the services available for use in Taverna workflows can be discovered through the BioCatalogue, a public, centralised and curated registry of Life Science Web services. Taverna workflows can also be shared with other people through the myExperiment social web site for scientists. BioCatalogue and myExperiment are two other products from the myGrid consortium.
Taverna is used in over 350 organizations around the world, both academic and commercial. As of 2011, there had been over 80,000 downloads of Taverna across different versions.

Capabilities

Taverna workflows can invoke general SOAP/WSDL or REST Web services, and more specific SADI, BioMart, BioMoby and SoapLab Web services. They can also invoke R statistical services, local Java code and external tools on local and remote machines, perform XPath and other text manipulation, import a spreadsheet and include sub-workflows.
Taverna Workbench includes the ability to monitor the running of a workflow and to examine the provenance of the data produced. Details of the workflow run are exposed as a W3C PROV-O RDF provenance graph within a structured Research Object bundle ZIP file, which includes inputs, outputs, intermediate values and the executed workflow definition. Together, this format is called TavernaProv.
Taverna includes the ability to search for services described in BioCatalogue to invoke from workflows. However, services do not need to be described within BioCatalogue to be included in workflows as they can be added from a WSDL Web Service description or entered as a REST URI pattern.
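As a rough illustration of what a REST URI pattern amounts to, the sketch below fills named placeholders in a URI with percent-encoded values. The curly-brace placeholder syntax, the function name `expand_uri_pattern` and the example URL are assumptions made for this illustration; this is not Taverna's own code or API.

```python
import re
from urllib.parse import quote


def expand_uri_pattern(pattern, values):
    """Replace each {name} placeholder in pattern with its encoded value.

    Hypothetical helper: the {name} syntax is an assumption for this sketch.
    Raises KeyError if a placeholder has no matching entry in values.
    """
    def substitute(match):
        name = match.group(1)
        # Percent-encode everything, including "/" and spaces.
        return quote(str(values[name]), safe="")

    return re.sub(r"\{([^{}]+)\}", substitute, pattern)


# Example URL is made up for illustration only.
uri = expand_uri_pattern(
    "http://example.org/genes/{id}?format={fmt}",
    {"id": "BRCA 1", "fmt": "xml"},
)
print(uri)
```

In a workflow, each placeholder would correspond to an input of the service, so one pattern can be reused for many different requests.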
Taverna also includes the capability to search for workflows on myExperiment. The Taverna Workbench can download, modify and run workflows discovered on myExperiment, and also upload created workflows in order to share them with others using the social aspects of myExperiment.
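Taverna workflows do not need to be executed within the Taverna Workbench. Workflows can also be run with the separately available workflow engine, as a command line tool, on a server, or through the Java API.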
Taverna allows pipelining and streaming of data. This means that services downstream in the workflow can start as soon as the first data item is received, without waiting for the whole data list to become available from upstream services and iterations. Taverna services execute in parallel when possible, as Taverna workflows are primarily data-driven rather than control-driven.
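The idea of pipelining can be sketched with Python generators. This is only an illustration of data-driven streaming, not Taverna's engine: the names `upstream`, `downstream` and `run_pipeline` are hypothetical, and the sketch runs in a single thread, so it shows item-by-item hand-over but not parallel execution.

```python
def upstream(items, log):
    """Produce items one at a time, recording each production."""
    for item in items:
        log.append(f"produce {item}")
        yield item


def downstream(stream, log):
    """Process each item as soon as it arrives from the upstream step."""
    for item in stream:
        log.append(f"consume {item}")
        yield item.upper()


def run_pipeline(items):
    """Connect the two steps and return the results with the event log."""
    log = []
    results = list(downstream(upstream(items, log), log))
    return results, log


results, log = run_pipeline(["a", "b"])
print(results)
print(log)
```

The event log interleaves production and consumption: the downstream step handles the first item before the upstream step has produced the second, rather than waiting for the complete list.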

Open source community

Taverna has been an open-source project since 2003, with contributors from multiple academic and industry institutions. In October 2014 Taverna became an independent Apache Incubator project and changed its name to Apache Taverna. The project is developing Apache Taverna 3.x, whose license changed from LGPL 2.1 to Apache License 2.0.