
e-Semantic Science

Rapid advances in technology are transforming the way scientific research is performed. Data analysis and storage have moved from a paper-based, manual affair to an activity in which computers are vital. As a result, a vast amount of scientific data is collected or produced daily by computational equipment. No single research organization has enough resources to collect everything; hence the data gathering and archiving processes are distributed and scattered across different places. Nor does any single research group have the computational power to process all these data. Moreover, collaboration among scientists from different institutions or disciplines is often necessary to apply a spectrum of methods and models to analyze and process this deluge of information, and the ability to access and reuse the datasets, methods, models and results of existing scholarly publications generally leads to more effective, higher-quality research.

The development of e-Science is a response to these emerging trends in scientific research. e-Science was originally conceived as the application of computing to traditional science (mostly empirical, although in some cases theoretical as well) in order to support scientists in traditional research activities such as modeling, simulation and prediction, among others. However, e-Science can now be considered to have gone further than that: it is even regarded as a third leg of the scientific method, together with the theoretical and empirical ones, because it introduces a new environment for scientific research that has also led to new research methods that may potentially lead to better science.

Supporting some of the new requirements arising from this approach to science requires, in some cases, an explicit definition of the meaning of the data in these different domains. This is the role that explicit semantics and their associated technologies, models and methods can play, in the context of what is known as Semantic e-Science. That is, while e-Science has traditionally addressed issues of data and computation distribution, interoperation and high performance in traditional and non-traditional scientific research tasks, the main focus of Semantic e-Science is on the application of explicit semantics over the e-Science infrastructure to drive more accurate information interpretation, more efficient scientific analyses, and better collaboration among scientists, among others.

Projects

We currently have the following EU projects in execution in this area: Wf4Ever, the Network of Excellence PlanetData, and the Marie Curie Initial Training Network SCALUS. We also have a national project, myBigData, in execution.

Previous projects in this area include ADMIRE STREP and OntoGrid.

Main results

The work done in this research area has mainly focused on:

  1. Ontology-based integration of heterogeneous scientific and non-scientific data sources. Important steps towards this goal are the provision of SPARQL querying support over distributed SPARQL endpoints, with a testbed in the bioinformatics domain that makes use of Bio2RDF endpoints and some initial results in query planning over distributed data sources. Previous results, which are still being used in several semantic e-Science projects, are the S-OGSA architecture and its related technological infrastructure. Some of the most relevant publications in this area are:
    • Buil-Aranda, C., Arenas, M., Corcho, O., Polleres, A., "Federating queries in SPARQL 1.1: Syntax, semantics and evaluation", Web Semantics: Science, Services and Agents on the World Wide Web, Volume 18, Issue 1, January 2013, Pages 1-17, 10.1016/j.websem.2012.10.001
    • Corcho, O., Alper, P., Kotsiopoulos, I., Missier, P., Bechhofer, S., Goble, C. (2006) An overview of S-OGSA: A Reference Semantic Grid Architecture. Journal of Web Semantics, 4 (2). pp. 102-115. ISSN 1570-8268
  2. The definition of models to describe, in a standard way, scientific experiments by means of workflow-centric Research Objects, which comprise scientific workflows, the provenance of their executions, interconnections between workflows and related resources (e.g., datasets, publications, etc.), and social aspects related to such scientific experiments. This activity also includes the definition of best practices for the creation and management of Research Objects, along with strategies for dealing with workflow decay.
  3. The publication of a corpus of provenance traces, compliant with the W3C standard PROV-O, in order to have data available for different types of analyses (derivation of results, completeness, abstraction, error detection during the experiment, etc.):
    • Khalid Belhajjame, Jun Zhao, Daniel Garijo, Aleix Garrido, Stian Soiland-Reyes, Pinar Alper and Oscar Corcho, A Workflow PROV-Corpus based on Taverna and Wings. To be presented at BigPROV13.
  4. Another area of work is related to understanding scientific workflows to improve workflow reuse, through the use of provenance. By manually analyzing templates and traces, we have identified a set of domain-independent motifs in scientific workflows that could be used to simplify and abstract them. We are currently working towards the automatic recognition of these abstractions, in order to simplify the view of the workflow for other communities and make it easier to understand. Metadata and provenance are key to facilitating this, since they describe the history and main features of every resource in a workflow execution.
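
As an illustration of the federated querying mentioned in the first result above, the following Python sketch assembles a SPARQL 1.1 query that uses one SERVICE clause per remote endpoint. It is a minimal sketch rather than the group's implementation: the function name build_federated_query, the example.org endpoint URLs and the encodes predicate are hypothetical, chosen only for illustration, and the code only builds the query text without contacting any endpoint.

```python
from typing import Dict, List


def build_federated_query(
    variables: List[str],
    service_patterns: Dict[str, List[str]],
    limit: int = 10,
) -> str:
    """Assemble a SPARQL 1.1 SELECT query with one SERVICE block per endpoint.

    variables: variable names without the leading '?'.
    service_patterns: maps an endpoint URL to the triple patterns sent to it.
    """
    if not variables:
        raise ValueError("at least one variable is required")
    lines = ["SELECT " + " ".join("?" + v for v in variables), "WHERE {"]
    for endpoint, patterns in service_patterns.items():
        # Each SERVICE block delegates its patterns to one remote endpoint.
        lines.append(f"  SERVICE <{endpoint}> {{")
        lines.extend(f"    {pattern} ." for pattern in patterns)
        lines.append("  }")
    lines.append("}")
    lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


# Hypothetical endpoints and predicate, used only for illustration.
QUERY = build_federated_query(
    ["gene", "label"],
    {
        "http://example.org/genes/sparql": [
            "?gene <http://example.org/vocab/encodes> ?protein",
        ],
        "http://example.org/proteins/sparql": [
            "?protein <http://www.w3.org/2000/01/rdf-schema#label> ?label",
        ],
    },
)

if __name__ == "__main__":
    print(QUERY)
```

Printing QUERY shows a SELECT query with two SERVICE blocks that share the ?protein variable; that shared variable is what allows a federated query engine to join the results coming from both sources.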

Members

This research area is led by Oscar Corcho, and the team comprises the associate professor María Pérez Hernández, the PhD students Carlos Buil, José Mora, Freddy Priyatna, Daniel Garijo Verdejo and Rafael González, and the MSc student Alex de León.

This research area is led by Oscar Corcho, and the team comprises the associate professor María Pérez Hernández, Rafael González (PhD), the PhD students José Mora, Freddy Priyatna, Daniel Garijo Verdejo and Idafen Santana Pérez, and the MSc student Olga Ximena Giraldo.

Recommended Reading

Some readings related to e-Semantic Science:

Job opportunities

There are currently no job offers or studentships available in this research area. For offers in other areas of the group, please check our job opportunities section.

However, you may contact Oscar Corcho to ask whether any positions may open in the near future.

     


Created under Creative Commons License - 2010 OEG.