The INAF Cloud Science Platform explores a possible technological solution for large projects that need a new, integrated approach to data access, manipulation and sharing, in particular worldwide distributed collaborations that need to share distributed infrastructures. It is a hybrid cloud built from three main blocks: the CANFAR e-infrastructure, the EGI Federated Cloud and a cloud gateway site, INAF-OATs.
Figure: INAF Cloud Science Platform general architecture and main software components.
CANFAR (Canadian Advanced Network For Astronomy Research) is a federated cloud deployed on Compute Canada resources and operated by the National Research Council Canada. CANFAR is a unique example of an astronomy- and astrophysics-oriented infrastructure that joins together an Infrastructure as a Service (IaaS) cloud and the standards and services developed by the International Virtual Observatory Alliance (IVOA).
The EGI Federated Cloud is a European open cloud system that offers a scalable and flexible e-infrastructure to the European research communities. The Federation pools IaaS, PaaS and SaaS services from a heterogeneous set of cloud providers using a single authentication and authorization framework that allows the portability of workloads across multiple providers.
The INAF-OATs cloud site is a gateway between the two infrastructures: it is compliant with the EGI Federated Cloud and interoperable with the CANFAR e-infrastructure thanks to its implementation of the IVOA standards, and so bridges the two directly. It offers an integrated set of astronomical services to the Astronomy and Astrophysics community, allowing data and applications to work in the same way regardless of the infrastructure, and allowing data, VMs and software to be easily moved and reused across the two cloud environments.
The interoperable Science Platform is made possible by the gateway site at INAF-OATs, which was deployed by exploiting and extending a set of software APIs released as open source by the Canadian Astronomy Data Centre (CADC). This software is used to set up in Europe a set of services based on IVOA standards and interoperable with those of CANFAR.
Software services include storage, authentication, delegation and data access authorization. These services are provided as EGI Community Services and are based on IVOA standards.
A VM image, configured to support a pool of ad hoc clients and analysis tools (ESO Scisoft package, TOPCAT), is available in the EGI AppDB and is also stored in the project repository.
The Access Control (AC) service guarantees seamless access to both the EGI and CANFAR infrastructures for users identified by their X.509 certificates, while service interoperability is achieved through the IVOA credential delegation protocol.
Regarding authorization, the EGI Federated Cloud (EFC) implements a coarse-grained approach based on Virtual Organisations, while CANFAR, the INAF-OATs gateway site and, consequently, our Community Cloud Platform implement a fine-grained approach based on data ownership and user group memberships.
In the EFC, astronomers will be mapped to a single, specific Virtual Organisation, and the granularity of local data access policies is driven by the local group memberships stored in the local AC service at the gateway site.
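The two authorization layers described above can be illustrated with a minimal sketch. It is not the platform's implementation: the class names, the helper functions and the Virtual Organisation name are all hypothetical, and the sketch only assumes that a request must first pass the coarse-grained VO check and then the fine-grained ownership or group check.

```python
from dataclasses import dataclass

# Hypothetical VO name, used only for this illustration.
ASTRO_VO = "vo.astro.example.org"


@dataclass(frozen=True)
class User:
    dn: str                           # X.509 distinguished name identifying the user
    vos: frozenset = frozenset()      # Virtual Organisation memberships (coarse level)
    groups: frozenset = frozenset()   # local group memberships (fine level)


@dataclass(frozen=True)
class Resource:
    owner_dn: str                         # distinguished name of the data owner
    read_groups: frozenset = frozenset()  # groups granted read access


def coarse_grained_allowed(user, required_vo):
    """VO-level check: every member of the VO passes."""
    return required_vo in user.vos


def fine_grained_allowed(user, resource):
    """Per-resource check: the owner, or a member of a group with read access."""
    return user.dn == resource.owner_dn or bool(user.groups & resource.read_groups)


def can_read(user, resource, required_vo):
    """Assumed combination: the VO check first, then ownership and group rules."""
    return coarse_grained_allowed(user, required_vo) and fine_grained_allowed(user, resource)


image = Resource("CN=Owner", frozenset({"survey-team"}))
alice = User("CN=Alice", frozenset({ASTRO_VO}), frozenset({"survey-team"}))
bob = User("CN=Bob", frozenset({ASTRO_VO}), frozenset({"other-team"}))
print(can_read(alice, image, ASTRO_VO), can_read(bob, image, ASTRO_VO))
```

In this sketch both users belong to the same Virtual Organisation, so only the group memberships distinguish them, which mirrors the idea that the granularity of the access policy comes from the local groups rather than from the VO.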
Each service has a persistence layer. The Access Control service persists its data in an LDAP database. We chose LDAP because it is widely used in data and computing centers to store user and group information for authentication and authorization purposes. This allows existing LDAP databases to be integrated with, or used directly by, the Access Control service.
The VOSpace Service uses a relational database to save the metadata associated with stored files, while on the gateway site we implemented a vospace-backend service and a TransferGenerator in order to interface with an EGI-specific storage solution (OpenStack Swift).
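The split between file metadata in a relational database and file content in an object store can be sketched as follows. This is only an illustration under stated assumptions: SQLite stands in for the relational database, the table schema, the function names and the storage base URL are hypothetical, and the code does not reproduce the actual vospace-backend service or TransferGenerator.

```python
import sqlite3
from urllib.parse import quote

# Hypothetical object-store base URL; not a real endpoint.
STORAGE_BASE = "https://swift.example.org/v1/AUTH_demo"


def open_metadata_db():
    """Create an in-memory table holding per-file metadata (illustrative schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE node ("
        " uri TEXT PRIMARY KEY,"        # identifier of the stored file
        " owner TEXT NOT NULL,"         # distinguished name of the owner
        " size_bytes INTEGER NOT NULL,"
        " container TEXT NOT NULL,"     # object-store container
        " object_name TEXT NOT NULL)"   # object key inside the container
    )
    return db


def register_node(db, uri, owner, size_bytes, container, object_name):
    """Record the metadata of a stored file."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
            (uri, owner, size_bytes, container, object_name),
        )


def transfer_endpoint(db, uri):
    """Resolve a file identifier to the object-store URL that holds its bytes."""
    row = db.execute(
        "SELECT container, object_name FROM node WHERE uri = ?", (uri,)
    ).fetchone()
    if row is None:
        raise KeyError(uri)
    container, object_name = row
    return f"{STORAGE_BASE}/{quote(container)}/{quote(object_name)}"


db = open_metadata_db()
register_node(db, "vos://example.org!vospace/alice/m31.fits",
              "CN=Alice", 1024, "alice", "images/m31.fits")
print(transfer_endpoint(db, "vos://example.org!vospace/alice/m31.fits"))
```

The point of the design is that clients only ever see the file identifier, while the mapping to a storage-specific location stays inside the backend and can change with the storage solution.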
The Credential Delegation Service uses a relational database to store the user's delegated credentials.
We implemented different solutions for the persistence layer: Sybase at CANFAR and MySQL at EGI.
We also made a number of software customizations to make the CADC-provided APIs more flexible in supporting different persistence layers.
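One common way to support different persistence layers is to hide them behind a small storage interface. The sketch below shows that idea for delegated credentials; it is not the CADC code, the class and method names are hypothetical, and SQLite is used only as a convenient stand-in for a relational database such as Sybase or MySQL.

```python
import sqlite3
from abc import ABC, abstractmethod


class CredentialStore(ABC):
    """Hypothetical storage interface that hides the concrete persistence layer."""

    @abstractmethod
    def put(self, user_dn, pem):
        """Store (or overwrite) the delegated credential of a user."""

    @abstractmethod
    def get(self, user_dn):
        """Return the stored credential, or None if there is none."""


class InMemoryCredentialStore(CredentialStore):
    """Dictionary-backed store, useful for tests."""

    def __init__(self):
        self._items = {}

    def put(self, user_dn, pem):
        self._items[user_dn] = pem

    def get(self, user_dn):
        return self._items.get(user_dn)


class SqlCredentialStore(CredentialStore):
    """Relational store; the upsert statement below is SQLite-specific."""

    def __init__(self, connection):
        self._db = connection
        with self._db:
            self._db.execute(
                "CREATE TABLE IF NOT EXISTS credential ("
                " user_dn TEXT PRIMARY KEY, pem TEXT NOT NULL)"
            )

    def put(self, user_dn, pem):
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO credential VALUES (?, ?)", (user_dn, pem)
            )

    def get(self, user_dn):
        row = self._db.execute(
            "SELECT pem FROM credential WHERE user_dn = ?", (user_dn,)
        ).fetchone()
        return row[0] if row else None


def delegate(store, user_dn, pem):
    """Service code depends only on the interface, not on the backend."""
    store.put(user_dn, pem)
    return store.get(user_dn) == pem
```

With this structure, moving a service from one database to another means writing one new subclass, while the service logic in `delegate` stays unchanged.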
Specifically for authentication and authorization, a user and group management software called GMS (Group Management Service) is used. A software interface is being finalised to propose a standardization of group management operations, allowing wider interoperability and the possible use of different group management services such as Grouper. This common interface should allow integration with other developments carried out inside the project, for example by CTA and SKA.
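The idea of a common group management interface can be sketched as a small protocol that different backends satisfy. The interface being finalised in the project is not described here, so everything below is an assumption: the method names, both backend classes and the `readable_by` helper are hypothetical, and neither backend reproduces the API of GMS or Grouper.

```python
from typing import Iterable, Protocol, Set


class GroupService(Protocol):
    """Hypothetical common interface for group management operations."""

    def is_member(self, user_dn: str, group: str) -> bool: ...

    def groups_of(self, user_dn: str) -> Set[str]: ...


class GroupToMembersBackend:
    """Backend that natively stores a mapping from group to members."""

    def __init__(self, table):
        self._table = {group: set(members) for group, members in table.items()}

    def is_member(self, user_dn, group):
        return user_dn in self._table.get(group, set())

    def groups_of(self, user_dn):
        return {g for g, members in self._table.items() if user_dn in members}


class UserToGroupsBackend:
    """Backend that natively stores a mapping from user to groups."""

    def __init__(self, table):
        self._table = {user: set(groups) for user, groups in table.items()}

    def is_member(self, user_dn, group):
        return group in self._table.get(user_dn, set())

    def groups_of(self, user_dn):
        return set(self._table.get(user_dn, set()))


def readable_by(user_dn: str, allowed_groups: Iterable[str], service: GroupService) -> bool:
    """Authorization code written once against the common interface."""
    return any(service.is_member(user_dn, group) for group in allowed_groups)


backend_a = GroupToMembersBackend({"survey-team": ["CN=Alice"]})
backend_b = UserToGroupsBackend({"CN=Alice": ["survey-team"]})
print(readable_by("CN=Alice", ["survey-team"], backend_a),
      readable_by("CN=Alice", ["survey-team"], backend_b))
```

Because `readable_by` only relies on the shared operations, either backend can be swapped in without touching the authorization logic, which is the kind of interoperability a standardized interface is meant to provide.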