Reminder: Fill in any DataSHIELD functionality developments in the shared spreadsheet:
Demetris Avraam, Stuart Wheater, Annika Swenne, Florian Schwarz
Nadja Lendle; Sofia Siampani
- Stuart: Experimenting with a pipeline for dsUpload
- Florian: Coding Guidelines Survey
- General Update
- Zoom links for Stats Theme meetings and drop-in sessions:
- Demetris will see whether we can switch to the University of Copenhagen's Zoom
- an option to save the chat (for links) would be beneficial
- Stuart is assessing how to make dsUpload more flexible by using a pipeline of config files, each doing a specific task
- Florian: Coding Guidelines Survey to be sent around in stats theme E-Mail list & DS Forum asking for participation
- ProPass Consortium: Xavier working on functions
- Next meeting: 3rd April
Florian Schwarz, Nadja Lendle, Annika Swenne, Stuart Wheater, Roy Gusinow, Sofia Siampani
Demetris Avraam
- Organisational
- What functions are generally absent in DataSHIELD but have not been asked specifically for?
- General update.
- Florian Schwarz taking over responsibility of co-Lead for Statistical Theme from Manuel Huth
- Write access to the DS Community wiki: is anyone missing who can't edit, and is another introduction session necessary? (will be asked again next time)
- Demetris Disclosure Control Paper: deadline for comments next week, pay attention to affiliations and funding information
- Topics of Interest for more in-depth discussion either in this or in another dedicated meeting?
- Harmonisation in different consortia (within / outside of Opal/DataSHIELD)
- Functions that were not specifically asked for but are missing in DataSHIELD?
- Random-Forest, other ML techniques
- Image Processing
- so far no one who could work on that
- Performance of dsBase could be improved by several seconds per function call if checks (e.g. class / exists) are not executed; however, there could be some strange error messages
- next meeting: 06.03.2025
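The trade-off noted above (skipping class / exists checks saves time but can produce strange error messages) can be pictured with a minimal sketch. It is written in Python purely for illustration, since dsBase itself is R; `mean_with_checks`, the `run_checks` switch and the error texts are hypothetical and not part of dsBase.

```python
def mean_with_checks(env, name, run_checks=True):
    """Return the mean of env[name]; the validation step is optional."""
    if run_checks:
        # Stand-ins for the "exists" and "class" checks mentioned in the notes.
        if name not in env:
            raise ValueError(f"object '{name}' does not exist")
        if not isinstance(env[name], list):
            raise TypeError(f"object '{name}' is not a list")
    values = env[name]
    return sum(values) / len(values)


env = {"x": [1.0, 2.0, 3.0]}  # hypothetical server-side environment

print(mean_with_checks(env, "x"))                    # checks on
print(mean_with_checks(env, "x", run_checks=False))  # checks off, same result

try:
    mean_with_checks(env, "y", run_checks=False)
except KeyError as err:
    # Without the checks the caller only sees a raw low-level error.
    print("unhelpful error:", repr(err))
```

With the checks enabled, the missing object is reported with a readable message; with them disabled, the same mistake surfaces as a low-level lookup error.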
Manuel Huth, Demetris Avraam, Katerina Lymperidou, Roy Gusinow, Stuart Wheater, Florian Schwarz, Nadja Lendle, Timm Intemann, Annika Swenne, Andreas Mändle, David Sarrat González, Sofia Siampani
- New members of the Theme; create an email list
- Plans for DataSHIELD developments in 2025
- Discuss items from the agenda of the previous meeting
Save-the-date: DataSHIELD Conference from 23rd to 26th September in Lausanne, Switzerland
Future Structural Changes:
- taking disclosure control out of dsBase package in the future and putting those functions into a standalone package (dsDisclosure); thereby reducing necessity to have all dependencies of dsBase installed by non-dsBase packages (instead the leaner dsDisclosure)
- adding provenance tags (~ attributes) to objects created on the server-side by DS functions
- this helps to prevent unwanted execution of functions on intermediary objects that need to be created on the server-side
- from dsBase 7.0.0: deprecated functions will be deleted
- Manuel, Roy, Florian, Demetris, Stuart, Annika, Nadja working on more function implementations & CRAN submissions this year
- DS Performance Meeting with Tim Cadman Thursday, 16th January (anyone interested should contact either Demetris or Tim)
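One possible reading of the provenance-tag idea above is sketched here. It is an assumption-laden illustration in Python rather than R: `ServerObject`, `make_intermediate`, `guarded_summary` and the tag names are all hypothetical, and the notes do not specify how the tags would actually be stored or checked.

```python
from dataclasses import dataclass, field


@dataclass
class ServerObject:
    """A server-side value plus a dictionary of provenance tags."""
    value: list
    provenance: dict = field(default_factory=dict)


def make_intermediate(value, created_by):
    # Tag the object as an intermediary created by a (hypothetical) DS function.
    return ServerObject(value, {"created_by": created_by, "intermediate": True})


def guarded_summary(obj):
    # Refuse to run on objects that are tagged as intermediaries.
    if obj.provenance.get("intermediate", False):
        raise PermissionError(
            f"object created by {obj.provenance['created_by']} is an intermediary"
        )
    return {"n": len(obj.value), "sum": sum(obj.value)}


user_object = ServerObject([1, 2, 3])
helper_object = make_intermediate([1, 2, 3], created_by="ds.hypotheticalFunction")

print(guarded_summary(user_object))
try:
    guarded_summary(helper_object)
except PermissionError as err:
    print("blocked:", err)
```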
Andreas Mändle, Becca Wilson, Paula Irving, Stuart Wheater
Demetris Avraam
- Additional tags for the headers of DataSHIELD functions: disclosure filters, authors, contributors, maintainers.
- Auditing process independently to CRAN submission.
- Internal Data Provenance
- Roadmap to the 7.0 version of dsBase:
- tidy up the package
- improve performance
- new developments
- increase testing coverage
- submission to the CRAN
Meeting Notes:
- Roxygen headers/hex stickers to be discussed at a further meeting – when people attend – Becca will dig out the links
- Stuart – audit process – to do a bare bones framework over xmas
- Items 1 and 2 in agenda are linked and need to be discussed when more people attend a further meeting – possibly take to other themes for discussion, i.e. tech core team
- Andreas thinks DS is slow in response time – Stuart to look into this – metadata operations
- Leave slides and internal data provenance to next meeting
Stuart Wheater, Timm Intemann, Florian Schwarz, Nadja Lendle, Annika Swenne, Roy Gusinow, Andreas Mändle, Sofia Siampani, Tim Cadman, Manuel Huth, Ahmet Akkoç
Demetris Avraam
- From December 2024 we will change the zoom link of our meetings to:
Topic 1:
- CRAN release of server-side package before client-side package, because CRAN dependencies must not include any package that has to be installed from GitHub
- CRAN procedure very fast (automatic testing environment ~ 20min; E-Mail contact after 2 days)
- CRAN does not allow errors, warnings or notes to occur during testing
Topic 2+3:
- dsBase(Client) release candidate 6.3.1 ready for release next week
- code coverage small in many community packages
Other Topics:
- Outreach: Community Map (Website + Raw Data for re-creation) of studies + people in DataSHIELD
- Outreach: Hex Stickers for DataSHIELD Packages
- Designer necessary (potentially Teresa Albers from NFDI4Health Consortium acc. to Timm I., interest also from Roy)
- important to come up with a feature shared across DS packages
- also indicates higher level of professionalism of DS Community
- Ahmet: there is an R (?) package for creation of such stickers
- Action: Florian will inform Outreach on topic
- DSI: Tim C. implemented new global option with Yannick so that DS server-side error is displayed nicely without having to call datashield.error() function
- dsSurvival Package (Timm I.): who is working on the package and what is the current status?
- Roche, Demetris (forked from Soumya into datashield repository)
- DSLite (Annika): Is it possible to run the servers in parallel instead of sequentially?
- unclear?
- Stuart will check this?
- Unassigned Action: we may need documentation on how the behaviour of DSLite differs from a real-world setting
Manuel Huth, Florian Schwarz, Demetris Avraam, Annika Swenne, Timm Intemann, Andreas Mändle, Roy Gusinow, Stuart Wheater
- Presentation: Andreas Mändle presented his work "A User-Friendly Interactive Dashboard for DataSHIELD".
- Updates from everyone for ongoing developments
- DataSHIELD versioning: What parts of the infrastructure need a version number (the entire system, the packages, an image, a profile)?
- Internal DataSHIELD functions: What is the best approach to develop/use internal functions?
- General updates
Manuel Huth, Roy Gusinow, Demetris Avraam, Stuart Wheater
- General updates
Ines Amine, Timm Intemann, Annika Swenne, Roy Gusinow, Stuart Wheater, Demetris Avraam, Andrei Morgan, Angela Pinot de Moira
- Presentation: Annika Swenne presented her work on “Federated Generalized Additive Models for Location, Scale and Shape in DataSHIELD”
- Stuart: dsDataShaper, pipelines for data shaping
- Andrei: Use wiki for agenda and meeting minutes
- General updates
Manuel Huth, Timm Intemann, Andreas Mändle, Florian Schwarz, Annika Swenne, Roy Gusinow, Angela Pinot de Moira, Andrei Morgan, Stuart Wheater, Paula Irving
Demetris Avraam
- Presentation: Manuel Huth presented his work on “Challenges with developing non-linear mixed-effects models in DataSHIELD”
- Update on Cox development for ProPass (did not happen)
- General updates
Demetris Avraam, Becca Wilson, Florian Schwarz, Stuart Wheater
Annika Swenne, Andreas Mändle, Timm Intemann, Andrei Morgan, Manuel Huth
- Continue the discussion for the audit process
- Public roadmap - what mechanism: wiki, github, embedded spreadsheet?
- Cox Regression
- Conference speakers
- General updates
- Manuel Huth and Camille Bachot to present in the next two meetings
- Does Stuart's testing include the R version?
- release note regarding the R version
Demetris Avraam, Tom Bishop, Manuel Huth, Florian Schwarz, Timm Intemann, Becca Wilson, Paula Irving, Angela Pinot de Moira, Roy Gusinow, Andreas Mändle, Sofia Siampani, Miron, Aikaterini Lymperidou, Stuart Wheater
Annika Swenne, Andrei Morgan
- Presentation: Florian Schwarz presented “Creating an Audit Process for DataSHIELD Packages” (20mins + 10 mins discussion)
- Takeaways from the conference - Audit discussion
- Becca: draft of UK guidelines for output checking
- Becca: differential privacy package
- Demetris: DataSHIELD wiki and Tech Group meetings
- General updates
- DataSHIELD Community calendar
- Manuel Huth to present in the first meeting of 2024 and Camille Bachot to present dsPrivacy in the second meeting
- To continue the discussion for the audit process in the next theme meeting
Demetris Avraam, Stuart Wheater, Manuel Huth, Tim Cadman, Roy Gusinow, Florian Schwarz, Annika Swenne, Andreas Maendle, Andrei Morgan
Becca Wilson, Timm Intemann
- Presentation: Tim Cadman presented his work on “Developing tidyverse functions in DataSHIELD” (20mins + 10 mins discussion)
- Developing 1-stage Cox ph functions in DataSHIELD
- General updates
- Tim will look for a group of ds.tidyverse -> WikiEntry
- Demetris, Roy & Manu will work on ds.coxReg (name debatable)
- Florian: We should summarize the takeaways from the conference -> next meeting’s Agenda -> collect ideas
- Andrei: Post minutes of the meetings to the wiki
Demetris Avraam, Manuel Huth, Annika Swenne, Stephan Ringshandl
Carolina, Roy
- General updates
- Parallelization of ds.base
- Demetris + Manuel: Discussion on what to send to Andrei for the conference
- work on parallelization of ds.base (dplyr, no for loops, compute linear model directly, checks can be turned off, prints can be turned off)
Stuart Wheater, Timm Intemann, Stephan Ringshandl, Florian Schwarz
Demetris Avraam, Annika Swenne, Andrei Morgan
- General updates
- Homomorphic encryption capabilities via HomomorpheR -> Shared private key among data owners
- Stuart: started having a weekly Drop-In-Session / Meeting for the Operational Theme (Wednesday)
- Florian: work related to the cluster analysis function (specifically related to dsBase 6.3 version; changes in the permission mode) and microbiome functions
- Timm (on behalf of Annika): first GAMLSS model implementation in DataSHIELD
Demetris Avraam, Manuel Huth, Annika Swenne, Andreas Mändle, Stephan Ringshandl, Becca Wilson, Carolina Alvarez Garavito, Stuart Wheater
Andrei Morgan, Roy Gusinow, Florian Schwarz
- Discussion: Plan for dsBase 7.0 release
- General updates
- skip meeting in August
- dsBase release: Improve performance of dsBase functions (maybe set an option to exclude checks; make more use of dplyr; add or replace columns in data frames directly; make use of sapply/lapply instead of for loops)
- Include checks in scatter plots that were described by Stephan
- Stephan: Multiple servers are executed/called one after another -> can this be done in parallel? Answer: can be done in the client library, in the assign functions
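The difference between calling servers one after another and calling them in parallel can be sketched as follows. This is a generic Python illustration, not DataSHIELD client code: `call_server`, the server names and the artificial delay are hypothetical stand-ins for a remote assign call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SERVERS = ["server1", "server2", "server3"]  # hypothetical server names


def call_server(name, delay=0.05):
    """Stand-in for a remote call; the sleep simulates network latency."""
    time.sleep(delay)
    return f"{name}: done"


def call_sequentially(servers):
    # Each call waits for the previous one, so total time grows with the
    # number of servers.
    return [call_server(s) for s in servers]


def call_in_parallel(servers):
    # All calls are started together; map() still returns results in
    # the order of the input list.
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        return list(pool.map(call_server, servers))


start = time.perf_counter()
sequential = call_sequentially(SERVERS)
print("sequential:", round(time.perf_counter() - start, 2), "s")

start = time.perf_counter()
parallel = call_in_parallel(SERVERS)
print("parallel:  ", round(time.perf_counter() - start, 2), "s")
```

Both variants return the same results in the same order; only the waiting time differs, which is why the change can be confined to the client side.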
Demetris Avraam, Becca Wilson, Carolina, Stephan Ringshandl, Manuel Huth, Timm Intemann, Annika Swenne, Florian Schwarz, Stuart Wheater, Hank
- Presentation: Stephan Ringshandl presented his work on “Brute Seed disclosure method in DataSHIELD” (20mins + 10 mins discussion)
- Discussion: Roadmap for dsBase 6.4 (or 7.0)
- (if possible: discussion HP Swarm Learning)
- General updates
Demetris, Christian, Andrei, Stuart, Becca, Manuel, Roy, Carolina, Hank
- Presentation: Manuel Huth presented his work on “Privacy-preserving Difference-in-Differences with multiple time periods” (20mins + 10 mins discussion)
- Discussion: Separate functions (more packages + papers? -> marketing) or include “all” functions in dsBase?
- Discussion: Roadmap for dsBase 6.4 (or 7.0)
- General updates
Demetris, Roy, Stuart, Florian, Hank, Manuel
Andrei, Becca, Juan, Tom, Tim
- Presentation: Demetris Avraam presented his development of the ds.mice function for multiple imputations in DataSHIELD.
- General updates
- Miscellaneous (Continuous Integration, Being up-to-date with standard R packages [dplyr -> issue in ds.make with matrices, tidyverse, etc.], summary and plot functions for ds.glm and other model classes [standards] -> summary(ds.glm.object) yields just a list summary?; ds.glm prints vcv automatically [why?]; standard naming convention?)
- continuation of ds.mice package functionalities (Demetris)
- global version for imputation would be desirable (time consuming)
- updates: Manu -> ds.did package; Florian -> conferences marketing with DataSHIELD; Stuart -> ds.base release, docker images for analysis with multiple packages; Hank -> new version of ds.MTL out with automatic capture of covariance during model Training, request: remove automatic printing of functions; Roy ->
- presentations: ~ 20 mins + ~ 10 discussion
- Manu: should we stick to CRAN as much as possible when creating model objects? (considering disclosure, of course)
- Next presentation by Manu or Roy
- Split up dsBase package in multiple packages (dedicated meeting for this) -> good from a technical and a marketing point of view
Demetris Avraam, Manuel Huth, Dick Postma, Augusto Anguita, Christian Hilbrands, Florian Schwarz, Becca Wilson, Stuart Wheater
Tom Bishop, Tim Cadman, Roy Gusinow
- Introductions
- How to organize the Theme meetings
- Statistical developments plan
- Working groups
- One meeting per month with a short presentation and discussions, and then discuss any issues from developers
- We can suggest a general Drop-in session for the whole Core Tech Group
- We can have drop-in sessions for users (we can discuss this also with Juan)
- To have an overview of all available types of analysis, from all tested community packages
- Spreadsheet to be shared between members of the theme. At later stage we can create a live road map shared publicly with everyone, where the users can see what we plan and when in future releases
- Group on continuous integration and automated package testing