Virtual Data Collaboratory

A Regional Cyberinfrastructure for Collaborative Data Intensive Science

Introduction

The Virtual Data Collaboratory (VDC) is a federated data cyberinfrastructure designed to drive data-intensive, interdisciplinary, and collaborative research, and to enable data-driven science and engineering discoveries. VDC accomplishes this by providing seamless access to data and tools to researchers, educators, and entrepreneurs across a broad range of disciplines and scientific domains as well as institutional and geographic boundaries. In addition to enabling researchers to advance research frontiers across multiple disciplines, VDC also focuses on (1) training the next generation of scientists with deep disciplinary expertise and a high degree of competence in leveraging data, cyberinfrastructure, and tools to address research problems and (2) helping data scientists and engineers develop and apply advanced federated data management and analysis tools for high-impact scientific applications. To meet this mission, VDC extends beyond its collaborating institutions and leverages NSF investments to provide cyberinfrastructure typically not available to community colleges, state-associated colleges and universities, and regional liberal arts colleges and universities, and to stimulate intense user engagement and adoption by scientists across domains and institutions.

 

VDC represents state-of-the-art data-intensive computing, storage, and networking solutions, integrated with an innovative data services layer. VDC is federated and coordinated by a high-speed network across three geographically distributed Rutgers University campuses in New Jersey and multiple campuses in Pennsylvania and New York, with the potential to incorporate academic and research institutions across the Mid-Atlantic and the nation. VDC builds on and integrates existing national, international, and regional data repositories, including NSF-funded repositories, and leverages local, regional, and national ACI investments. Central to the VDC vision are three infrastructural innovations: a regional science DMZ network that provides services to enable efficient and transparent access to data and computing capabilities; an expandable and scalable architecture for data-centric infrastructure federation; and a data services layer to support research workflows that utilize cutting-edge semantic web technologies, support interdisciplinary research, expand access, and increase the impact of data science worldwide.

Overarching Goals

Provide seamless access to data and tools to researchers, educators, and entrepreneurs across a broad range of disciplines and scientific domains as well as institutional and geographic boundaries.

Enable data scientists and engineers to develop and apply advanced federated data management and analysis tools for high-impact scientific applications.

Train the next generation of scientists with deep disciplinary expertise and a high degree of competence in leveraging data, cyberinfrastructure, and tools to address research problems.

Driving Applications

Protein Data Bank

Deciphering Sequence and Structural Correlates of Protein Nucleic Acid Interactions (H. Berman & V. Honavar)

Smart City

High-Volume City Data Sharing and Processing for Smart, Resilient, and Sustainable Cities (J. Gong, RU; Z. Zhu, CUNY; X. Liang, University of Pittsburgh; M. Balduccini, Drexel University)

Ocean Observatories Initiative

Platforms and sensor systems measure physical, chemical, geological and biological properties and processes from the seafloor to the air-sea interface (I. Rodero, J. J. Villalobos, M. Parashar)

Proposed Architecture

VDC Architecture

Team

Rutgers University

Ivan Rodero

Principal Investigator

Manish Parashar

Former Principal Investigator

Grace Agnew

Co-Principal Investigator | Data Services Lead

James von Oehsen

Co-Principal Investigator

J. J. Villalobos

Systems Lead

Forough Ghahramani

Education Co-Lead

Thu Nguyen

Education Co-Lead

Helen Berman

PDB Use Case

Penn State University

Vasant Honavar

Co-Principal Investigator | Use Cases Lead

Jenni Evans

Co-Principal Investigator

Wayne Figurelle

Chuck Gilbert

Karen Estlund

KINBER

Wendy Huntoon

Network Lead

NJEdge

Edward Chapel

The Virtual Data Collaboratory is supported by its member institutions and the United States National Science Foundation through NSF award number 1640834. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.