This page is designed for data managers new to DataSHIELD. It assumes that your systems operator has successfully completed the steps described here and that you therefore have either an Opal or an Armadillo server running.
The key tasks for a data manager are:
Opal and Armadillo are two types of DataSHIELD server. Opal is built and maintained by Obiba. It is a data management server that supports DataSHIELD analysis and also offers many additional features. Armadillo is built and maintained by Molgenis, and was designed as a lightweight server specifically to implement DataSHIELD analysis. DataSHIELD was originally developed using Opal, and Opal is still the reference implementation.
Both solutions share core functionality, and can be administered either using a user interface or R. However, the steps involved will be different. Full documentation can be found here:
Opal documentation
Armadillo documentation
For both Armadillo and Opal, most tasks can be completed via the UI. However, more complicated tasks, or tasks that need to be repeated, can be done more efficiently using R. The Armadillo R package for data managers is molgenisArmadillo, while the Opal R package is opalr. To learn more about R and how to install it along with these packages, see this page.
For researchers to be able to use DataSHIELD, you first need to upload your data to your local server. This can be done either using the user interface (Opal, Armadillo) or via R (Opal, Armadillo). If the data needs to follow a specific format defined in dictionaries, you can use dsUpload (R), a collection of tools for uploading data into DataSHIELD backends (Opal, Armadillo). Opal also comes with the obiba-opal Python library and command-line tools for automating data and user administration.
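As a rough illustration of the R route for Opal, the sketch below wraps an upload in a small helper function. It assumes that the opalr package is installed and that the target project already exists on the server. The helper name `upload_table_to_opal` and its `id_column` argument are hypothetical and introduced only for this example; the opalr calls it relies on are `opal.login`, `opal.table_save` and `opal.logout`.

```r
# Minimal sketch: upload a data frame to an existing Opal project with opalr.
# Assumes opalr (and its dependency tibble) are installed.
# upload_table_to_opal is a hypothetical helper name, not part of opalr.
upload_table_to_opal <- function(data, url, username, password,
                                 project, table, id_column = "id") {
  # Fail early if the identifier column is missing or contains duplicates
  stopifnot(is.data.frame(data), id_column %in% names(data))
  stopifnot(!anyDuplicated(data[[id_column]]))

  # Open a session and make sure it is closed even if the upload fails
  o <- opalr::opal.login(username = username, password = password, url = url)
  on.exit(opalr::opal.logout(o), add = TRUE)

  # opal.table_save expects a tibble; id.name names the identifier column
  opalr::opal.table_save(o, tibble::as_tibble(data), project, table,
                         id.name = id_column)
}
```

A call might look like `upload_table_to_opal(my_data, "https://opal.example.org", "my_user", "my_password", "my_project", "my_table")`, where the URL, credentials, project and table names are placeholders to be replaced with your own values.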
As a data owner, you might upload many tables or resources to your server. However, individual researchers may only need access to a subset of this data for their research. Rather than giving researchers access to all of your data, it may be the policy of your institution to give access only to the subset of data required. Subsets of data can be described as 'views', and can be created either via the user interface (Opal, Armadillo) or via R packages (Opal, Armadillo).
To ensure data security, users are normally allowed access only to the subset of data that is relevant to their project, and only for a set period of time. Managing which users have permission to access which projects can be done in the UI, R or Python with Opal, and in the UI with Armadillo.
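For Opal, granting access from R could be sketched as follows. This assumes opalr is installed and that `o` is a session returned by `opal.login` for an account with administration rights. The helper name `grant_datashield_access` is hypothetical. The choice of the `"view"` table permission (dictionary and summaries, without individual values) and the DataSHIELD `"use"` permission reflects a common setup, and should be checked against your institution's policy and the Opal documentation.

```r
# Minimal sketch: let named users analyse one Opal table through DataSHIELD.
# Assumes `o` is an open opalr session with administration rights.
# grant_datashield_access is a hypothetical helper name, not part of opalr.
grant_datashield_access <- function(o, project, table, users) {
  stopifnot(is.character(users), length(users) > 0)

  # Table-level permission: "view" does not expose individual-level values
  opalr::opal.table_perm_add(o, project, table, subject = users,
                             type = "user", permission = "view")

  # Server-level permission to use the DataSHIELD service
  opalr::dsadmin.perm_add(o, subject = users, type = "user",
                          permission = "use")
}
```

When a project ends, the table permission can be withdrawn again with `opal.table_perm_delete`. This sketch does not set an expiry date itself, so time-limited access still has to be tracked and revoked by the data manager.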
As a data manager, you also control which DataSHIELD packages are installed on your server. To make this process easier, analysis profiles have been created. These are Docker images that bundle R and DataSHIELD packages commonly used in research. Analysis profiles are handled using the UI in both Opal and Armadillo. Opal offers more flexibility, with additional features such as derived profiles and access controls.