Expanding University Research Support through Data Management, Storage and Preservation
Scholars Portal, the digital infrastructure arm of Ontario’s university libraries, supports Canadian researchers throughout the research lifecycle. As a service of the Ontario Council of University Libraries (OCUL) hosted by the University of Toronto Libraries (UTL), Scholars Portal achieves this through collaboration and partnerships across Ontario and Canada. Two important components of this infrastructure, Scholars Portal Dataverse and the Ontario Library Research Cloud (OLRC), empower researchers to manage, store, and preserve research data in accordance with best practices.
Work has been completed on two 18-month development projects involving Scholars Portal Dataverse and the OLRC, funded by CANARIE’s call for Research Data Management software tools:
- “Dataverse for the Canadian Research Community” enhanced the Dataverse platform to address the needs of Canadian researchers through improved log-in mechanisms and account creation, tools that enable the transfer of large files and the curation of uploaded data, and a back-end integration with OLRC cloud storage.
- “DuraCloud: Linking Data Repositories to Preservation Storage” adapted the well-known cloud storage management software DuraCloud to work with existing Canadian storage solutions including the OLRC. Together, these tools provide secure and affordable Canada-based cloud storage with a robust and user-friendly management interface in support of Canadian research institutions.
These projects have strengthened Scholars Portal's capacity to support Canadian research and demonstrate the value of cross-institutional and provincial collaboration in this endeavour.
Dataverse for the Canadian Research Community
Scholars Portal has hosted an instance of Dataverse, research data repository software developed by the Institute for Quantitative Social Science (IQSS) at Harvard University, since 2012. In that time, Scholars Portal has developed new features and shared them back with the global Dataverse community. These efforts include the internationalization of the core Dataverse code, which Scholars Portal released in early 2019 in collaboration with the University of Montreal. Internationalization allows the tool to be offered in languages other than English and enables the bilingual interface for Scholars Portal Dataverse. Another Scholars Portal contribution is the Data Explorer tool, which allows users to create visualizations of tabular data within the browser.
As part of the CANARIE-funded project, “Dataverse for the Canadian Research Community”, the Scholars Portal Dataverse team developed the platform to better support the data deposit and sharing needs of Canadian researchers. This work focused on data curation, authentication, scalability, and large-file support.
- The Data Curation Tool (DCT), launched in fall 2019, allows data owners and curators to create and edit metadata at the variable level for files uploaded through the tabular ingest process in Dataverse. Users of the DCT can view summary statistics and charts about their data. The DCT improves data curation workflows within Dataverse, makes data easier to reuse, and supports the application of standards and best practices using the Data Documentation Initiative (DDI) metadata standard.
- Scholars Portal configured Dataverse to work with Shibboleth for institutional single sign-on through the Canadian Access Federation (CAF), an identity management service for Canadian research institutions run by CANARIE. This integration ensures a secure and trustworthy exchange of identity information and simplifies the log-in process, leaving users with one less username and password to manage.
- Scholars Portal connected Dataverse to in-house cloud storage by hosting files in a test cluster of the OLRC. This optimizes system architecture for scalable use and leverages an existing, distributed Canadian data storage network.
- Scholars Portal developed a proof-of-concept integration with Globus as a large-file transfer tool. Internal tests found that this integration can reliably handle transfers of up to 100 GB in size and up to 38,000 files. The Scholars Portal Dataverse team is continuing to collaborate and consult with Harvard's IQSS Dataverse team to bring this proof-of-concept development work into the core Dataverse code.
For more information about this project, please visit the wrap-up blog post.
Project Team Members:
PI: Kate Davis
Technical Lead: Amaz Taufique
Co-PI: Amber Leahey
Project Manager: Meghan Goodchild
For questions about Scholars Portal Dataverse, please contact dataverse [at] scholarsportal.info
DuraCloud: Linking Data Repositories to Preservation Storage
Scholars Portal has maintained the Ontario Library Research Cloud (OLRC) since 2015. The OLRC is a distributed storage network consisting of five nodes located in data centres at universities geographically dispersed across the province. Files uploaded to the OLRC are copied across three of these nodes, so if one of these copies becomes unreadable, a new copy is created by the system from the two remaining good copies. Storage nodes in the OLRC are connected through the private high-speed ORION research network.
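The replication behaviour described above can be illustrated with a small toy model. The sketch below is only an illustration under stated assumptions and is not the OLRC's actual software: the node names, the `ToyCluster` class, and the use of SHA-256 checksums to detect an unreadable copy are all hypothetical. It places three copies of each file on a five-node cluster and, when one copy is damaged, re-creates it from a remaining good copy.

```python
import hashlib
import random

# Hypothetical node names; the source only says there are five nodes.
NODES = ["node-a", "node-b", "node-c", "node-d", "node-e"]
REPLICAS = 3  # each file is copied to three of the five nodes


def checksum(data: bytes) -> str:
    """Return a SHA-256 digest used here to detect damaged copies (an assumption)."""
    return hashlib.sha256(data).hexdigest()


class ToyCluster:
    """In-memory model of a replicated storage network (illustrative only)."""

    def __init__(self, seed: int = 0) -> None:
        self.rng = random.Random(seed)
        self.storage = {node: {} for node in NODES}  # node -> {file name: bytes}
        self.expected = {}  # file name -> checksum recorded at upload

    def upload(self, name: str, data: bytes) -> None:
        """Record the checksum and write the file to three randomly chosen nodes."""
        self.expected[name] = checksum(data)
        for node in self.rng.sample(NODES, REPLICAS):
            self.storage[node][name] = data

    def holders(self, name: str) -> list:
        """Nodes that currently hold any copy of the file, good or bad."""
        return [n for n in NODES if name in self.storage[n]]

    def good_copies(self, name: str) -> list:
        """Nodes whose copy still matches the checksum recorded at upload."""
        return [
            n for n in self.holders(name)
            if checksum(self.storage[n][name]) == self.expected[name]
        ]

    def repair(self, name: str) -> int:
        """Discard damaged copies and restore the replica count from a good copy."""
        good = self.good_copies(name)
        if not good:
            raise RuntimeError("no readable copy left to repair from")
        source = self.storage[good[0]][name]
        for node in self.holders(name):
            if node not in good:
                del self.storage[node][name]  # drop the unreadable copy
        spare = [n for n in NODES if n not in good]
        created = 0
        while len(good) < REPLICAS:
            target = spare.pop(0)
            self.storage[target][name] = source
            good.append(target)
            created += 1
        return created


if __name__ == "__main__":
    cluster = ToyCluster(seed=1)
    cluster.upload("dataset.csv", b"a,b\n1,2\n")
    damaged = cluster.holders("dataset.csv")[0]
    cluster.storage[damaged]["dataset.csv"] = b"corrupted"  # simulate an unreadable copy
    print("good copies before repair:", len(cluster.good_copies("dataset.csv")))
    cluster.repair("dataset.csv")
    print("good copies after repair:", len(cluster.good_copies("dataset.csv")))
```

In this model a file stays recoverable as long as at least one copy still matches its recorded checksum. A production storage system would also run such checks continuously and account for node capacity and location when choosing where to place a new copy.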
DuraCloud is an open source platform from the DuraSpace foundation that manages cloud storage and preservation. This tool enables users to control where and how their digital content is preserved in the cloud, with a robust framework for moving data into and out of different cloud storage systems. It can also synchronize data across cloud providers or against local storage solutions. DuraCloud provides a single interface for seamlessly interacting with data stored and preserved in different storage services. It includes robust and user-friendly tools for managing users, projects, and files.
As part of the CANARIE-funded project, “DuraCloud: Linking Data Repositories to Preservation Storage”, staff at Scholars Portal, the University of Toronto Libraries, and the Council of Prairie and Pacific Research Libraries (COPPUL) have removed DuraCloud’s dependencies on Amazon Web Services. This means Canadian researchers can now take advantage of a robust cloud storage service run entirely on infrastructure owned by Canadian research institutions, including Scholars Portal and COPPUL. The project team replaced back-end processing on Amazon with open source solutions hosted at Scholars Portal, and replaced Amazon S3 storage with the OLRC. This allows Scholars Portal to leverage existing secure, robust, Canadian-hosted, cost-effective, and preservation-friendly storage services.
Integrating DuraCloud into the OLRC will broaden the potential applications for Scholars Portal’s cloud services. The user-friendly interface will make it easier for OCUL institutions to securely share data, automatically back up network drives, store and serve large files from digitisation projects, and integrate other cloud-based applications into the OLRC’s back-end.
To learn more about the OLRC, please visit https://cloud.scholarsportal.info
Project Team Members:
PI: Steve Marks
Project Manager: Amaz Taufique
Co-PIs: Kate Davis, Corey Davis
For questions about the OLRC, please contact cloud [at] scholarsportal.info
Supporting libraries supporting researchers
The funding provided by CANARIE has enabled Scholars Portal to expand Dataverse and OLRC services, strengthening their capacity to support data management, storage, and preservation within the Canadian context. This improved infrastructure will allow university libraries to enhance the robust services they offer to support the data management needs of their research communities. Scholars Portal looks forward to continuing its work with partners in Ontario and across Canada to contribute to a national research infrastructure.
About Scholars Portal
Scholars Portal is a service of the Ontario Council of University Libraries, hosted through the University of Toronto Libraries. The Scholars Portal technological infrastructure preserves and provides access to information resources collected and shared by Ontario’s 21 university libraries. Through Scholars Portal online services, Ontario’s university students, faculty and researchers have access to an extensive and varied collection of scholarly content and datasets. Scholars Portal continues to respond to the research needs of Ontario universities through the creation of innovative information services and by working to ensure access to and preservation of this wealth of information.