Deploying RSpace

The following information applies only to the Team and Enterprise Editions of RSpace.

Scope and intended audience

This document is for IT staff at an organisation that is considering purchasing, or has purchased, RSpace Team or Enterprise.

For Enterprise, this document aims to provide an overview of deployment options to help guide your decision on how to deploy RSpace.

Which Edition should I choose and where will RSpace be deployed?

This is a fundamental decision for you to make, and there are various options, all of which can be supported and have proven to be workable with existing customers.

For deployments of RSpace with fewer than 15 users that have selected RSpace Team Edition, only option 1 is available. However, this document also contains information about the SaaS offering that may be useful to the IT staff of an organization that has opted for the Team Edition.

Option 1: ResearchSpace operates RSpace as a SaaS offering installed on an AWS server that we manage for you.

In this scenario, RSpace is deployed on your own private AWS instance in the cloud. Installation, backup, updates and maintenance are all performed by ResearchSpace. For Team Edition, ResearchSpace will select a data center location for you from the following list:

  • us-east-1 (N. Virginia)
  • us-west-1 (N. California)
  • eu-west-1 (Ireland)
  • eu-west-2 (London)
  • ap-northeast-2 (Seoul)
  • ap-southeast-2 (Sydney)

Organizations that purchase RSpace Enterprise Edition may request any AWS regional data center needed to meet their data storage and regulatory requirements. Hosting on AWS is a good option if you:

  • Are unable or unwilling to dedicate staff time to installing and maintaining RSpace.
  • Want complete convenience and to get up and running quickly.
  • Anticipate expanding usage over time but do not have suitable resources of your own to accommodate this. AWS storage capacity is effectively unlimited.

Option 2. (Requires RSpace Enterprise Edition). You install and operate RSpace within a suitable on-premises environment that you provide.

In this scenario, RSpace is installed on your institutional system - either physical machines, virtual hosts or your private cloud. You can learn more about how to install RSpace. This scenario is good if:

  • Your data-compliance guidelines require you to store research data on-premises.
  • You are able to dedicate some staff time to installing and maintaining RSpace.
  • You have staff with IT experience in managing web applications in a Linux environment.
  • You want absolute ownership and control over all aspects of the data life cycle, for example backup and recovery.

It is critically important to keep your RSpace server application up to date. You can see the current version of your server at the bottom left of the interface. It will be of the form 1.XX.X. You can see the current release version in the changelog. If your organization has opted to manage its own server but allows the server to fall more than 4 major releases or 1 year behind the current release, additional surcharges may apply if ResearchSpace needs to assist you with a complex, multistep update path through many versions.
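
The release-lag check described above can be automated. The following is a minimal sketch, assuming that version strings have the form 1.XX.X and that the middle number is what counts as a "major release"; that interpretation is an assumption, so confirm it with ResearchSpace. The function names and the version numbers in the example are hypothetical, and the sketch covers only the release-count condition, not the one-year condition.

```python
import re
from typing import Tuple

# Matches version strings of the form 1.XX.X, e.g. "1.95.0".
VERSION_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a version string such as '1.95.0' into three integers."""
    match = VERSION_PATTERN.match(version.strip())
    if match is None:
        raise ValueError(f"Unexpected version format: {version!r}")
    first, second, third = (int(part) for part in match.groups())
    return first, second, third


def releases_behind(installed: str, current: str) -> int:
    """Count how many releases the installed version lags behind.

    Assumption: the middle number (the XX in 1.XX.X) is the release counter.
    """
    installed_parts = parse_version(installed)
    current_parts = parse_version(current)
    if installed_parts[0] != current_parts[0]:
        raise ValueError("This sketch only compares versions with the same first number")
    return max(0, current_parts[1] - installed_parts[1])


def exceeds_release_lag(installed: str, current: str, max_lag: int = 4) -> bool:
    """Return True when the server is more than max_lag releases behind."""
    return releases_behind(installed, current) > max_lag
```

For example, `exceeds_release_lag("1.90.2", "1.95.0")` returns `True`, because the installed version is five releases behind under the assumption above.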

Option 3. (Requires RSpace Enterprise Edition). ResearchSpace remotely installs and manages RSpace within a suitable on-premises environment that you provide.

This is a hybrid solution for organizations that want RSpace on-premises but don't want to dedicate IT staff time to maintaining the system. In this case, you set up the infrastructure and ResearchSpace installs and maintains the software. Responsibility for backup and disaster recovery is agreed before installation. This scenario is good if:

  • Your data-compliance guidelines require you to store research data on-premises.
  • You have limited desire or ability to dedicate staff time to managing RSpace.

Here, the process would typically be:

  1. A kick-off meeting to meet each other and establish a procedure and timeline for the installation, and to confirm infrastructure requirements.
  2. The customer sets up infrastructure (e.g. virtual servers) with a vanilla Ubuntu / Debian OS and grants SSH access to the ResearchSpace installation technician.
  3. ResearchSpace performs the basic installation.
  4. Single Sign-On and integrations are set up as required.
  5. RSpace is made available to users.

RSpace connectivity

RSpace does not operate in isolation from your institutional data; in fact it shines when connecting and linking your research work together. In this section we review how the different deployment options described above affect these aspects of RSpace functionality.

Single Sign On

If you want your users to log in to RSpace using Single Sign On, RSpace supports this for all deployment options, using the SAML2 protocol. Most Identity Providers (IdPs), such as Okta and Azure AD, support this protocol. For more details, see Setting up SingleSignOn authentication.

Connecting to your existing data storage

RSpace can store and manage all sorts of data files, but there are occasions when your researchers will want to link to data files on an institutional file server rather than bringing the files into RSpace. This might be the case if:

  • The data files are huge, e.g. large images or sequencing files.
  • Your data has to be stored on a particular file server for compliance reasons.

RSpace can talk to these servers using either the Samba or the SFTP protocol. It only requires read access, so that it can list the files to link to.

This can be easier to set up for an on-prem installation. Connecting from RSpace on AWS is entirely possible technically, but requires that your file server be reachable over the network from the AWS-hosted RSpace. Please read Configuring Institutional File Systems for more details.

RSpace integrations

RSpace has integrations with many popular applications including Dropbox, Google Drive, OneDrive, Microsoft Teams, Slack, Office 365, protocols.io, GitHub, Figshare and Dataverse - see Integrations for a full list. The setup required varies between integrations. If you are running RSpace as a SaaS (option 1 above), ResearchSpace will be able to set up these integrations for you. If you are running RSpace on-prem (or, more specifically, the RSpace URL is not a researchspace.com URL), then you will have to configure these integrations yourself, as they often require proof of domain ownership to set up (e.g. Google Drive).

RSpace Enterprise customers can customize the interface by adding an organizational branding / company logo image to the top right corner of the interface (replacing the standard RSpace image), and / or by adding up to 2 other custom text links in the page footer (e.g., pointing to a web page you maintain with information about data privacy policies, legal disclaimers, or other important information about using RSpace at your specific organization).

Getting data out of RSpace

RSpace supports export to several standard formats: HTML, XML, PDF, Microsoft Word and JSON (via the RSpace API). Users can export their data themselves, at any level of granularity from a single document to their entire body of work, at any time, and download the export to their own machines. Exports can also be scheduled using the API, e.g. by running a cron job that invokes an export once a week.
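
A scheduled export of this kind could look roughly like the sketch below. It is not an official client: the endpoint path `/api/v1/export/{format}/{scope}`, the `apiKey` header and the JSON response are assumptions that you should check against the API documentation of your RSpace instance, and the server URL, script path and function names are hypothetical.

```python
import json
import urllib.request

# Hypothetical crontab entry that runs this script every Sunday at 02:00:
#   0 2 * * 0 /usr/bin/python3 /opt/scripts/rspace_export.py


def build_export_request(base_url: str, api_key: str,
                         export_format: str = "xml",
                         scope: str = "user") -> urllib.request.Request:
    """Build (but do not send) a POST request that asks RSpace to start an export.

    Assumption: the export endpoint lives at /api/v1/export/{format}/{scope}
    and the API key is passed in an 'apiKey' header.
    """
    url = f"{base_url.rstrip('/')}/api/v1/export/{export_format}/{scope}"
    return urllib.request.Request(url, method="POST", headers={"apiKey": api_key})


def start_export(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded response body.

    Assumption: the server answers with a JSON document describing the export job.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to check the URL and headers without contacting the server. In a cron job, the API key would typically be read from an environment variable or a protected file rather than hard-coded.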

If, as a server administrator, you want to do a low-level data export, this is easily accomplished using standard, free tools. ELN metadata can be exported from the MySQL database using `mysqldump` or `Percona XtraBackup`, and files can be copied from the internal file store with tools such as `rsync`.
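
As an illustration, the two steps could be wrapped in a small script such as the sketch below. The database name, user and directory paths are hypothetical placeholders, and the sketch assumes that `mysqldump` and `rsync` are on the PATH and that database credentials are supplied separately (for example in a MySQL option file).

```python
import subprocess
from typing import List


def mysqldump_command(database: str, output_file: str, user: str) -> List[str]:
    """Build a mysqldump command that writes a logical dump of one database."""
    return [
        "mysqldump",
        "--single-transaction",  # consistent snapshot for InnoDB tables
        f"--user={user}",
        f"--result-file={output_file}",
        database,
    ]


def rsync_command(file_store: str, destination: str) -> List[str]:
    """Build an rsync command that copies the contents of the file store."""
    # A trailing slash on the source copies its contents, not the directory itself.
    source = file_store.rstrip("/") + "/"
    return ["rsync", "--archive", source, destination]


def run_low_level_export(database: str, user: str, dump_file: str,
                         file_store: str, destination: str) -> None:
    """Run both commands in order, stopping at the first failure."""
    for command in (
        mysqldump_command(database, dump_file, user),
        rsync_command(file_store, destination),
    ):
        subprocess.run(command, check=True)
```

Passing each command as an argument list avoids shell quoting problems with paths that contain spaces.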

No data is stored in a binary format proprietary to RSpace.

Sensitive data

Standard on-prem and hosted RSpace deployments are not appropriate for entry of sensitive data (e.g., patient information subject to HIPAA or similar regulatory rules). It is certainly possible, however, to deploy RSpace so that entry of sensitive data is supported and compliant. This issue often comes up where usage in a medical school is planned. In these situations, a solution is to deploy RSpace within a validated compliant environment that you already use or that you create with assistance from ResearchSpace. Because of the increased cost of data storage and processing in these environments, it may even make sense to deploy a second instance of RSpace specifically for researchers who handle sensitive data, and reserve your standard RSpace deployment for the majority of users, who don't need the extra 'compliance wrap'.

In the USA, AWS GovCloud offers a compliant computing environment for organisations bound by federal data-handling regulations. RSpace has been installed successfully in this environment.

Migration after a pilot

Customers often run a pilot of RSpace on AWS before deciding to purchase an ongoing license. In that case you can decide whether to continue using the cloud instance as a production instance or switch to an on-premises deployment. If you choose to move to an on-premises deployment, it's possible to migrate the data that researchers entered into the cloud instance to the on-premises instance of RSpace.

Data Backup

For on-premises deployments of RSpace, backup is solely the customer’s responsibility. We will consult with your IT personnel at the time of deployment. For AWS-based RSpace instances, ResearchSpace automates the backup process with scripts that we are happy to share with customers on request.

When deployed as SaaS (software as a service) onto an Amazon Web Services (AWS) private instance that we manage for you, ResearchSpace and Amazon take care of backups for you.

Data is stored in a MySQL 5.7 or MariaDB 10.3 database; files are stored unmodified on EBS volumes in a directory structure.

  • Files are synced to S3 hourly using the AWS CLI tool.
  • Nightly and weekly snapshots of instances and data volumes are stored as machine images (AMIs). These are fast to make, and support recovery time objectives (RTOs) on the order of minutes.
  • Logical database backups are made nightly, and stored on S3. Data files, logs, configuration files and search indices are additionally synced to S3 hourly.
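
To give a feel for the hourly sync step, here is a minimal sketch built around the AWS CLI `aws s3 sync` command. It is not the script ResearchSpace uses: the bucket name, local directories and prefixes are hypothetical, and the sketch assumes the AWS CLI is installed and configured with credentials that can write to the bucket.

```python
import subprocess
from typing import Dict, List


def s3_sync_command(local_dir: str, bucket: str, prefix: str) -> List[str]:
    """Build an 'aws s3 sync' command that copies new or changed files to S3."""
    destination = f"s3://{bucket}/{prefix.strip('/')}"
    return ["aws", "s3", "sync", local_dir, destination]


def sync_all(directories: Dict[str, str], bucket: str) -> None:
    """Sync each local directory to its own prefix in the bucket.

    'directories' maps a local path to the S3 prefix it should be stored under.
    """
    for local_dir, prefix in directories.items():
        subprocess.run(s3_sync_command(local_dir, bucket, prefix), check=True)


# Hypothetical layout: data files, logs, configuration and search indices.
EXAMPLE_DIRECTORIES = {
    "/data/filestore": "filestore",
    "/var/log/rspace": "logs",
    "/etc/rspace": "config",
    "/data/search-index": "search-index",
}
```

Run hourly from a scheduler such as cron, `sync_all(EXAMPLE_DIRECTORIES, "example-rspace-backups")` would transfer only files that are new or have changed since the previous run, which is what keeps frequent syncs inexpensive.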

For an in-depth description, please read the SaaS Backup document.

In addition, customers can use the export API endpoint to make scheduled bulk data exports to any destination they like, as a further redundant data backup.