
ds.summary simplification #257

@tombisho


ds.summary was originally written to deal with simple R object types such as data frames and vectors. It tests the object type on the server side and, depending on the result, calls a serverside method that generates a privacy-protected summary and returns it to the client.

There are two things I would like to raise about this approach:

  1. As we add more exotic object types, the chain of if/else branches grows ever longer
  2. If the object type has not been added to ds.summary and a corresponding server side function defined to summarise it, a summary is not available

I have been looking at how the native summary() function works. It is written as a generic function, which just provides a placeholder that dispatches on the class of its argument. When new object classes are defined, they can provide their own method for summarising the data. For example, an object of class blob could have a method called summary.blob(). When summary() is called with a blob object, the code in summary.blob() is executed. There is also a method summary.default(), which looks at the underlying structure of an object (most objects are matrices, lists etc. with some extra attributes) and then summarises it. For example, if the blob object was just a matrix with some extra attributes, and we did not define a method for summarising it, a call to summary() falls back to summary.default(). This sees that the blob object is really just a matrix and summarises each column.
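The dispatch behaviour described above can be sketched in a few lines of R. The blob class and the make_blob() helper are purely illustrative names, not part of any package; the summary.blob() method here just reports the dimensions.

```r
# Hypothetical "blob" class: a plain matrix with an extra class attribute.
make_blob <- function(m) structure(m, class = "blob")

b <- make_blob(matrix(1:6, nrow = 3))

# No summary.blob() exists yet, so summary() falls back to summary.default().
fallback <- summary(b)

# Once summary.blob() is defined, summary() dispatches to it instead.
summary.blob <- function(object, ...) {
  d <- dim(unclass(object))
  structure(list(rows = d[1], cols = d[2]), class = "summary.blob")
}

custom <- summary(b)
```

The call site, summary(b), is the same in both cases; only the set of available methods changes which code runs.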

I wonder if a similar approach might be useful in DataSHIELD, to save having to explicitly define summary functions for every object type.

This would mean having a generic summaryDS() function which is analogous to the summary() function, but with privacy protection. Objects created with DataSHIELD on the serverside could also define their own version of summaryDS(), for example summaryDS.glm() for GLMs.
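As a rough sketch of what this might look like, assuming S3 dispatch is used in the same way as for summary(): the function bodies below are placeholders of my own, and the real disclosure checks that DataSHIELD would need are deliberately left out.

```r
# Hypothetical generic: the privacy-protected analogue of summary().
summaryDS <- function(object, ...) UseMethod("summaryDS")

# Fallback for classes without their own method. Returns structural
# information only; real disclosure controls would be applied here.
summaryDS.default <- function(object, ...) {
  list(class = class(object), length = length(object))
}

# Example method for GLMs: returns model-level output (family and
# coefficient table) rather than residuals or the underlying data.
summaryDS.glm <- function(object, ...) {
  s <- summary(object)
  list(
    class = class(object),
    family = s$family$family,
    coefficients = s$coefficients
  )
}

# Illustration with a built-in dataset.
fit <- glm(mpg ~ wt, data = mtcars, family = gaussian())
glm_summary <- summaryDS(fit)      # dispatches to summaryDS.glm()
vec_summary <- summaryDS(1:10)     # falls back to summaryDS.default()
```

With this layout, supporting a new object type means adding one summaryDS.<class>() method rather than another branch in ds.summary, and unknown types still get a fallback summary.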
