JADE - Jülich Aachen Data Exchange

The Jülich Aachen Data Exchange (JADE) aims to establish a flexible and scalable data management tool that addresses the specific needs of domain scientists who want to exchange data conveniently and efficiently. The main objectives of JADE were defined in close interaction with the intended users, and the following requirements were derived:

  • Data exchange between distributed partners through data replication
  • Archiving of stored data
  • Flexible rights management and access control
  • Multi-level data access – from NFS mount to cloud-like access
  • Flexible extension of partners and sites

Therefore, the implementation of JADE addresses the following use cases:

Use Case Data Access: Any data in JADE can be accessed and shared using various methods, such as HTTPS, WebDAV, SCP, or NFS. JADE further intends to provide a full-featured cloud interface. Fine-grained ACLs are used to authorize access to the data. Thus, JADE provides data access on various levels of abstraction, from file system mounting to web-based cloud access.
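As a rough illustration of how fine-grained ACLs could gate access independently of the protocol used, the following Python sketch evaluates a request against per-path entries. All names here (AclEntry, is_allowed, the example paths and principals) are hypothetical and are not part of JADE or dCache; the "longest matching prefix wins" rule is likewise an assumption made for the example.

```python
from dataclasses import dataclass


# Hypothetical model for illustration only; not the JADE or dCache ACL format.
@dataclass(frozen=True)
class AclEntry:
    principal: str          # user or group name
    path_prefix: str        # directory subtree the entry applies to
    permissions: frozenset  # e.g. frozenset({"read", "write"})


def _covers(prefix, path):
    """True if path is the prefix itself or lies inside that subtree."""
    return path == prefix or path.startswith(prefix.rstrip("/") + "/")


def is_allowed(acl, principals, path, permission):
    """Decide access for a caller holding the given principals.

    Assumed rule: among the entries that match the caller and the path,
    only those with the longest (most specific) prefix are consulted.
    """
    matches = [e for e in acl
               if e.principal in principals and _covers(e.path_prefix, path)]
    if not matches:
        return False  # no matching entry: deny by default
    longest = max(len(e.path_prefix) for e in matches)
    return any(permission in e.permissions
               for e in matches if len(e.path_prefix) == longest)


acl = [
    AclEntry("neuro", "/jade/smhb", frozenset({"read"})),
    AclEntry("alice", "/jade/smhb/raw", frozenset({"read", "write"})),
]
can_write = is_allowed(acl, {"alice", "neuro"}, "/jade/smhb/raw/run1.h5", "write")
```

Because the decision depends only on the caller and the path, the same check could in principle sit behind an NFS mount, a WebDAV door, or a web front end.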

Use Case Data Replication: Sites can hold synchronized data sets, e.g. in locally deployed data pools, where communities can use any available method for local or remote access. This offers transparent data transport from data producers to data consumers and makes the data accessible close to where it is used and processed.

Use Case Data Providing: Generated data can be written to a local pool. These files are then transferred to a large central data facility, transparently to the user. When data is read, JADE always knows at which locations a copy exists and either directs the read request to the best available copy or transfers the file back to the local pool in advance.
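The replica lookup described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the Replica type, the choose_replica function, and the notion that "best available" means "an online copy at the client's own site, otherwise any online copy" are all illustrative and not taken from JADE or dCache.

```python
from dataclasses import dataclass


# Hypothetical record of where a copy of a file lives.
@dataclass(frozen=True)
class Replica:
    site: str     # e.g. "JSC" or "RWTH"
    online: bool  # False if the copy cannot currently be read from disk


def choose_replica(replicas, client_site):
    """Return the preferred replica for a read request, or None.

    Assumed policy: prefer an online copy at the client's site,
    then fall back to the first online copy elsewhere.
    """
    online = [r for r in replicas if r.online]
    for replica in online:
        if replica.site == client_site:
            return replica
    return online[0] if online else None


copies = [Replica("JSC", True), Replica("RWTH", True)]
best = choose_replica(copies, "RWTH")
```

If only a remote copy is returned, a real system could either serve the read remotely or stage the file into the local pool first; that decision is left out of the sketch.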

Use Case Data Archiving: A pool can be configured to migrate files down to tertiary storage systems. Storing to and restoring from tertiary storage is transparent to the user. In JADE, JSC, as the central unit, will provide tertiary storage and access to it.
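To make "transparent to the user" concrete, here is a toy Python model of a pool backed by tertiary storage. The Pool class and its methods are hypothetical and use in-memory dictionaries in place of disks and tape; the point is only that read() has the same interface whether or not the file has been migrated.

```python
class Pool:
    """Hypothetical disk pool backed by a tertiary store (illustration only)."""

    def __init__(self):
        self.disk = {}      # path -> contents currently held on disk
        self.tertiary = {}  # path -> contents held on tertiary storage

    def write(self, path, data):
        """Write a new file to the disk pool."""
        self.disk[path] = data

    def migrate(self, path):
        """Store the file on tertiary storage and free the disk copy."""
        self.tertiary[path] = self.disk.pop(path)

    def read(self, path):
        """Return the file contents, restoring from tertiary storage if needed."""
        if path not in self.disk:
            # Transparent restore: the caller does not see this step.
            self.disk[path] = self.tertiary[path]
        return self.disk[path]


pool = Pool()
pool.write("/jade/archive/run1.dat", b"payload")
pool.migrate("/jade/archive/run1.dat")
data = pool.read("/jade/archive/run1.dat")
```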

In mid-2014, a first test bed based on dCache was installed in the context of the SMHB project. dCache is a project to “provide a system for storing and retrieving huge amounts of data, distributed among a large number of heterogenous server nodes, under a single virtual filesystem tree with a variety of standard access methods” [http://www.dcache.org/].

The Jülich Supercomputing Centre (JSC) serves as JADE’s central data center for storage and network resources. Embedding resources at RWTH Aachen University has shown the feasibility of the approach and initiates JADE’s inclusion in JARA.

The initial test bed of JADE comprises 18 TB of storage at JSC and 30 GB of storage at RWTH. In its first operational phase, JADE will be installed at JSC with access to multi-TB storage connected to JSC’s archiving systems, and at RWTH on an 80 TB file server. The 30 GbitE connection between Aachen and Jülich will be used to transfer data within JADE.

Besides supporting the service infrastructure for a distributed file system, JADE also investigates other software solutions, such as iRODS, and is actively engaged in their development. Furthermore, JADE is in contact with other data communities, such as LSDMA and EUDAT.

Team
Bastian Tweddell
Jülich Supercomputing Centre (JSC)
FZ Jülich

Benjamin Weyers
IT Center
RWTH Aachen

Rajalekschmi Deepu
Jülich Supercomputing Centre (JSC)
FZ Jülich

Alexander Peyser
Jülich Supercomputing Centre (JSC)
FZ Jülich