There is a lot of data out there, and a huge number of platforms and technologies promise to reconcile two conflicting goals: keeping that data "safe" while allowing the wider world to use it for good things like curing diseases and improving lives. The safest way to keep data is to encrypt it on a hard drive, coat the hard drive in concrete and fly it to the moon, but it would then be challenging to draw useful conclusions from it. At the opposite extreme, making everyone's health data available on Dropbox might reveal some useful results, but it would be extremely inappropriate.
If you have come to this wiki, you are probably looking for more information about how to get started using DataSHIELD to address this challenge. You have heard that DataSHIELD allows the "analysis to come to the data", but it can be hard to know how this works in practice and how you might get started. Good documentation and support already exist for the details of the various pieces of software and processes needed to use DataSHIELD. The purpose of this wiki is to tell the parts of the story that aren't told elsewhere: the real-world decisions that will need to be made and the options that are available, how to get started on using DataSHIELD for research, and where you can find more detailed information.
This type of advice and guidance is needed because, despite all the information that has already been produced, we still see that more help is needed. The DataSHIELD forum is a good place to seek help. This is an example post where the user asks a short question that has no simple answer:
Another example is this email, received by someone who had experience using DataSHIELD. It asks how to get started with setting up DataSHIELD for a consortium:
“I am contacting you because we are thinking of using DataSHIELD for an XXXXXX funded project on XXXXXXXX and XXXXXXX. We are looking for someone with expertise in setting up the DataSHIELD with Opal who could give our IT team some advice. I spotted you as a contact for the XXXXXXXX platform and thought you might be able to direct me to someone with the technical expertise in setting things up. Also I would be interested in your experience with DataSHIELD and whether the effort involved in setting up is worthwhile for a smaller collaborative project as ours (we are a small consortium of about 5 case control studies). Any help or advice you can provide is highly welcome.”
The aim of this wiki is to cover the technical issues, although there is inevitably some crossover into other areas. For example, some of the technical decisions influence the ethics and governance processes.
Using a wiki as the format for this information was deliberate. While it might be more prestigious to write a peer-reviewed paper, a wiki is a better medium for signposting best practice documented elsewhere, which is evolving all the time. At the moment it is not always possible to say that one particular way of doing things is best, but hopefully we are moving in the right direction. All we can do here is highlight the options available and the experiences we know about.