NSF funding allows collaboration with Purdue to preserve and link research data

Feb 14, 2017      By Keith McGuffey

Dr. Chungwook Sim showing the front page of DataCenterHub, an online platform for sharing research data

Chungwook Sim, Assistant Professor of Civil Engineering, is collaborating with computer scientists and librarians at Purdue University to build a cyber platform to preserve and share valuable research data.

 “When I completed my dissertation, it was bound into two large volumes,” said Sim. “I wondered ‘who is going to read this?’ There is a wealth of information about my research, but unless someone is willing to read my full dissertation, that work is lost to the broader research community.”

Typically, the results of research are summarized in journal articles. Sim believes this summarization, while helpful, can leave out data other researchers may find important.

“Data collected by researchers are typically kept on some type of physical media – which used to be micro-tapes, floppy disks, CDs, DVDs, to now external hard drives. Unless you have physical access to these media which also becomes obsolete, you do not have access to the raw data. The media can sit in a researcher’s office for years untouched, despite there being useful data on it.”

Sim is working to develop this cyber platform as a co-principal investigator of the project. The platform allows a researcher to upload research data, reports, metadata, media files, and various key parameters of the research. The result of this research effort is DataCenterHub, which currently hosts over 30 terabytes of data on over 5,400 experiments around the world. DataCenterHub hosts data on a broad variety of subjects, such as earthquake damage, liquefied sands, and honeybee mites.

Sim is working to move research data out of physical storage devices, where it often sits in silos, and into an online, collaborative environment. DataCenterHub presents research data as tables with sortable, searchable columns for ease of use. By performing a keyword search, a user can sort and compare multiple datasets created by different users.

“Our platform is not just a repository for archiving information but a solution to help researchers organize, classify, collaborate and share data in an accessible and easy way.”

Dr. Sim is currently working with the Nebraska Department of Roads to transfer its bridge maintenance data to the hub, providing a link between the National Bridge Inventory data for Nebraska and the State Bridge Inspection data in an easily accessible format.

Along with the collaborative data exchange, DataCenterHub also supports private datasets, access restrictions within datasets, and the addition of copyrights to datasets. For more information on the platform, please visit the DataCenterHub website.

For more information on the NSF grant.

  • DataCenterHub currently hosts over 30 TB of data, encompassing over 5,400 experiments
  • An example of a dataset in DataCenterHub. Information is arranged in columns and rows for ease of accessibility.
  • Dr. Sim surveying captive columns after the April 2016 earthquake in Ecuador. The results of this survey have been uploaded to the DataCenterHub platform.
  • Dr. Sim is working to transfer the Nebraska Department of Roads' bridge database to the DataCenterHub platform.


