DataSHIELD server-side functions perform automated output checks in real time during analysis, blocking any analysis that could return directly disclosive information. Where possible, these checks are mapped to current best practice for manual output checking (Welpton, Richard (2019): SDC Handbook), and data controllers can configure the thresholds in Opal to match their governance needs and the sensitivity of their data. In addition to these output checks, each function is written so that only low-dimensional, non-disclosive summary statistics leave the data custodian, and a function may also have additional privacy-preserving methods hard-coded into it. From DataSHIELD v5 onwards, several checks can be deployed in server-side functions; they are listed below.
Information correct as of March 2022.
Number | Setting | Description |
---|---|---|
1 | nfilter.tab | The minimum non-zero cell count allowed in any cell if a contingency table is to be returned. This applies to one dimensional and two dimensional tables of counts tabulated across one or two factors and to tables of a mean of a quantitative variable tabulated across a factor. Default usually set to 3 but a value of 1 (no limit) may be necessary, particularly if low cell counts are highly probable such as when working with rare diseases. Five is also a justifiable choice to replicate the most common threshold rule imposed by data releasers worldwide; but it should be recognised that many census providers are moving to ten. |
2 | nfilter.subset | The minimum non-zero count of observational units (typically individuals) in a subset. Typically defaulted to 3. |
3 | nfilter.glm | The maximum number of parameters in a regression model as a proportion of the sample size in a study. If a study has 1000 observational units (typically individuals) being used in a particular analysis then if nfilter.glm is set to 0.33 (its default value) the maximum allowable number of parameters in a model fitted to those data will be 330. This disclosure filter protects against fitting overly saturated models which can be disclosive. |
4 | nfilter.string | The maximum length of a string argument if that argument is to be subject to testing of its length. Default value = 80. The aim of this nfilter is to make it difficult for hackers to find a way to embed malicious code in a valid string argument that is actively interpreted. |
5 | nfilter.stringShort | Same as nfilter.string, but with the maximum length set to 20 characters. |
6 | nfilter.kNN | The minimum value allowed for k in the k-nearest neighbours method, which is used mainly by some of the graphical functions. Default value = 3. |
7 | nfilter.levels | DEPRECATED. The maximum number of unique levels of a categorical variable that may be returned to the client, expressed as a proportion of the variable's length. If nfilter.levels is set to 0.33 (its default value) and a categorical variable (i.e. factor) has X distinct categories, then the categories (i.e. levels) are not returned to the client if X is greater than 33% of the variable's length. This disclosure filter protects against disclosing all the unique values of a numerical variable when it is converted to a factor variable. |
8 | nfilter.levels.density | The maximum proportion of unique levels of a categorical variable, relative to the length of that variable, that is regarded as non-disclosive. For example, if the resulting variable contains 1000 levels derived from 4000 rows, the proportion is 0.25 (25%), which would be regarded as non-disclosive. Default value is 0.33. |
9 | nfilter.levels.max | The maximum number of unique levels of a categorical variable that is regarded as non-disclosive. Default value is 40. |
10 | nfilter.noise | The minimum level of noise that can be added to a server-side vector so that the "noisy" vector can then be returned to the client. This value specifies the variance of the added noise as a proportion of the variance of the vector. If nfilter.noise is set to 0.25 (its default value), then noise following a distribution (usually Gaussian) with zero mean and variance equal to 25% of the true variance of the vector of interest is added to each individual value of that vector. |
11 | datashield.privacyControlLevel | Permits server administrators to run servers with a predefined subset of the standard methods available. If the value of this option is not the string "permissive", the following server-side methods are blocked from use: dataFrameSubsetDS1, levelsDS, BooleDS, cDS, cbindDS, dataFrameDS, dataFrameSortDS, dataFrameSubsetDS2, dmtC2SDS, rbindDS, recodeLevelsDS, recodeValuesDS, repDS, reShapeDS, seqDS, subsetByClassDS and subsetDS. Default value is "permissive". The option was introduced in DataSHIELD version 6.2. |
12 | | DEPRECATED. This is the old filter that was used in DataSHIELD v4. It has been replaced by the filters described above. |
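
The threshold rules in the table above can be illustrated with a short sketch. It is written in plain Python for illustration only and is not DataSHIELD code: the function names are hypothetical, the defaults are the ones quoted in the table, and treating values exactly on a threshold as acceptable is an assumption rather than something the table states.

```python
import math
import random
import statistics

# Hypothetical helpers that mirror the rules described in the settings table.
# They are not part of DataSHIELD; boundary handling (<= vs <) is an assumption.


def table_is_safe(counts, nfilter_tab=3):
    """nfilter.tab: every non-zero cell count must be at least nfilter_tab."""
    return all(c == 0 or c >= nfilter_tab for c in counts)


def glm_is_safe(n_params, n_units, nfilter_glm=0.33):
    """nfilter.glm: parameters may not exceed this proportion of the sample size."""
    return n_params <= nfilter_glm * n_units


def levels_density_is_safe(n_levels, n_rows, nfilter_levels_density=0.33):
    """nfilter.levels.density: unique levels as a proportion of the rows."""
    return n_levels / n_rows <= nfilter_levels_density


def levels_count_is_safe(n_levels, nfilter_levels_max=40):
    """nfilter.levels.max: absolute cap on the number of unique levels."""
    return n_levels <= nfilter_levels_max


def add_noise(values, nfilter_noise=0.25, seed=None):
    """nfilter.noise: add zero-mean Gaussian noise whose variance is
    nfilter_noise times the sample variance of the vector."""
    rng = random.Random(seed)
    sd = math.sqrt(nfilter_noise * statistics.variance(values))
    return [v + rng.gauss(0.0, sd) for v in values]


if __name__ == "__main__":
    print(table_is_safe([0, 3, 7]))            # True
    print(table_is_safe([0, 2, 7]))            # False: a cell of 2 is below 3
    print(glm_is_safe(330, 1000))              # True: 330 parameters allowed
    print(glm_is_safe(331, 1000))              # False
    print(levels_density_is_safe(1000, 4000))  # True: 0.25 <= 0.33
    print(levels_count_is_safe(41))            # False: more than 40 levels
    print(len(add_noise([1.0, 2.0, 3.0, 4.0], seed=1)))  # 4
```

The two level filters are kept as separate functions because the table lists them as separate settings: the example of 1000 levels from 4000 rows satisfies the density rule but would still exceed a maximum of 40 levels.
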

Number | Function | Type | nfilter.tab | nfilter.subset | nfilter.glm | nfilter.string | nfilter.stringShort | nfilter.kNN | nfilter.levels.density | nfilter.levels.max | nfilter.noise | datashield.privacyControlLevel |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | BooleDS.R | Assign | ||||||||||
2 | asCharacterDS.R | Assign | ||||||||||
3 | asDataMatrixDS.R | Assign | ||||||||||
4 | asFactorDS1.R | Aggregate | Checked | Checked | ||||||||
5 | asFactorDS2 | Assign | ||||||||||
6 | asFactorSimpleDS.R | Assign | ||||||||||
7 | asIntegerDS.R | Assign | ||||||||||
8 | asListDS.R | Aggregate | ||||||||||
9 | asLogicalDS.R | Assign | ||||||||||
10 | asMatrixDS.R | Assign | ||||||||||
11 | asNumericDS.R | Assign | ||||||||||
12 | boxPlotGGDS | Aggregate | ||||||||||
13 | cbindDS.R | Assign | Checked | |||||||||
14 | cDS.R | Assign | Checked | Checked | ||||||||
15 | changeRefGroupDS.R | Assign | ||||||||||
16 | checkNegValueDS.R | Aggregate | ||||||||||
17 | classDS.R | Aggregate | ||||||||||
18 | colnamesDS.R | Aggregate | ||||||||||
19 | completeCasesDS.R | Assign | ||||||||||
20 | corDS.R | Aggregate | Checked | Checked | ||||||||
21 | corTestDS.R | Aggregate | ||||||||||
22 | covDS.R | Aggregate | Checked | Checked | ||||||||
23 | dataFrameDS.R | Assign | Checked | Checked | ||||||||
24 | dataFrameFillDS.R | Assign | ||||||||||
25 | dataFrameSortDS.R | Assign | Checked | Checked | Checked | Checked | ||||||
26 | dataFrameSubsetDS1.R | Aggregate | Checked | Checked | Checked | |||||||
27 | dataFrameSubsetDS2.R | Assign | Checked | Checked | Checked | |||||||
28 | densityGridDS.R | Aggregate | Checked | |||||||||
29 | dimDS.R | Aggregate | ||||||||||
30 | dmtC2SDS.R | Assign | Checked | |||||||||
31 | extract.R | Aggregate | ||||||||||
32 | glmDS1.R | Aggregate | Checked | Checked | ||||||||
33 | glmDS2.R | Aggregate | Checked | Checked | ||||||||
34 | glmerSLMADS2.R | Aggregate | ||||||||||
35 | glmPredictDS.ag | Aggregate | Checked | |||||||||
36 | glmPredictDS.as | Assign | Checked | |||||||||
37 | glmSLMADS1.R | Assign | Checked | Checked | ||||||||
38 | glmSLMADS2.R | Aggregate | Checked | Checked | ||||||||
39 | glmSLMADS.assign.R | Assign | ||||||||||
40 | glmSummaryDS.ag.R | Aggregate | Checked | |||||||||
41 | glmSummaryDS.as.R | Assign | Checked | |||||||||
42 | heatmapPlotDS.R | Aggregate | Checked | Checked | ||||||||
43 | histogramDS1.R | Aggregate | Checked | Checked | Checked | Checked | Checked | |||||
44 | histogramDS2.R | Aggregate | Checked | Checked | Checked | Checked | ||||||
45 | kurtosisDS1.R | Aggregate | Checked | |||||||||
46 | kurtosisDS2.R | Aggregate | Checked | |||||||||
47 | isNaDS.R | Aggregate | ||||||||||
48 | isValidDS.R | Aggregate | Checked | |||||||||
49 | lengthDS.R | Aggregate | ||||||||||
50 | levelsDS.R | Aggregate | Checked | Checked | ||||||||
51 | lexisDS1.R | Aggregate | Checked | |||||||||
52 | lexisDS2.R | Assign | Checked | Checked | ||||||||
53 | lexisDS3.R | Assign | ||||||||||
54 | listDisclosureSettingsDS.R | Aggregate | Read | Read | Read | Read | Read | Read | Read | Read | Read | Read |
55 | listDS.R | Assign | ||||||||||
56 | lmerSLMADS.assign | Assign | Checked | Checked | ||||||||
57 | lmerSLMADS2.R | Aggregate | Checked | Checked | ||||||||
58 | lsDS.R | Aggregate | Checked | |||||||||
59 | matrixDS.R | Assign | Checked | Checked | ||||||||
60 | matrixDetDS1.R | Aggregate | Checked | |||||||||
61 | matrixDetDS2.R | Assign | Checked | |||||||||
62 | matrixDiagDS.R | Assign | Checked | Checked | ||||||||
63 | matrixDimnamesDS.R | Assign | Checked | Checked | ||||||||
64 | matrixInvertDS.R | Assign | Checked | |||||||||
65 | matrixMultDS.R | Assign | Checked | |||||||||
66 | matrixTransposeDS.R | Assign | Checked | |||||||||
67 | meanDS.R | Aggregate | Checked | |||||||||
68 | meanSdGpDS.R | Aggregate | Checked | |||||||||
69 | mergeDS.R | Assign | Checked | |||||||||
70 | messageDS.R | Aggregate | Checked | Checked | Checked | Checked | ||||||
71 | metadataDS.R | Aggregate | ||||||||||
72 | namesDS.R | Aggregate | Checked | |||||||||
73 | numNaDS.R | Aggregate | ||||||||||
74 | quantileMeanDS.R | Aggregate | ||||||||||
75 | rangeDS.R | Aggregate | ||||||||||
76 | rbindDS.R | Assign | Checked | |||||||||
77 | rBinomDS.R | Assign | ||||||||||
78 | recodeLevelsDS.R | Assign | Checked | |||||||||
79 | recodeValuesDS.R | Assign | Checked | Checked | Checked | |||||||
80 | repDS.R | Assign | Checked | |||||||||
81 | replaceNaDS.R | Assign | ||||||||||
82 | reShapeDS.R | Assign | Checked | |||||||||
83 | rmDS.R | Aggregate | ||||||||||
84 | rNormDS.R | Assign | ||||||||||
85 | rowColCalcDS.R | Assign | ||||||||||
86 | rPoisDS.R | Assign | ||||||||||
87 | rUnifDS.R | Assign | ||||||||||
88 | sampleDS.R | Assign | Checked | Checked | ||||||||
89 | scatterPlotDS.R | Aggregate | Checked | Checked | ||||||||
90 | scoreVectDS.R | Aggregate | ||||||||||
91 | seedDS.R | Assign | ||||||||||
92 | skewnessDS1.R | Aggregate | Checked | |||||||||
93 | skewnessDS2.R | Aggregate | Checked | |||||||||
94 | seqDS.R | Assign | Checked | Checked | Checked | |||||||
95 | subsetByClassDS.R | Assign | Checked | Checked | ||||||||
96 | subsetDS.R | Assign | Checked | Checked | ||||||||
97 | table1DDS.R | Aggregate | Checked | |||||||||
98 | table2DDS.R | Aggregate | Checked | |||||||||
99 | tableDS2.R | Aggregate | ||||||||||
100 | tableDS.assign.R | Assign | ||||||||||
101 | tableDS.R | Aggregate | Checked | |||||||||
102 | tapplyDS.assign.R | Assign | Checked | Checked | ||||||||
103 | tapplyDS.R | Aggregate | Checked | Checked | ||||||||
104 | testObjExistsDS.R | Aggregate | ||||||||||
105 | tTestFDS2 | Aggregate | ||||||||||
106 | unListDS.R | Assign | ||||||||||
107 | varDS.R | Aggregate | Checked |