This presentation was written by Louise Wheeler, Sharyn Wise and me for the
Asia Pacific Research Integrity 2018 meeting in
Taiwan, Feb 2018 it was scripted and delivered by Louise, who is the UTS Manager,
Research Integrity and Research Program and Sharyn works in the eResearch team
with me. This is good introduction to the work we've been doing on the UTS
provisioner project from Louise's Research
Integrity (RI) perspective. There's
not much technical detail in this talk about the open source
ReDBox platform on which our data
management system, Stash, is built. I'll post more soon about that.
Thanks also to Chris Evenhuis for some slide design.
I'm posting this both on the eResearch blog and
at my site.
Today I’ll be presenting the UTS approach to Research Data Management (RDM), and
highlighting how the development of a practical and comprehensive solution can
enable researchers to meet their RI obligations
I’ll run through a brief history that lead to our approach, and then I will
present an overview of the solution we are implementing.
Formal RI training may occur early in a researchers career, but otherwise
becomes assumed knowledge. Awareness of responsible research practices is not
often actively supported or promoted throughout an academic's career, at least
not consistently at the institutional level. At UTS we have pockets of
opportunities for promoting research integrity principles, and one such pocket
is the eResearch team – who have regular face-time with researchers when
supporting them with tools and practices. We are using these opportunities to
build a comprehensive framework that will enable researchers to meet RI
principles, through improved systems and practices.
Australia’s current code has been in place since 2007. It is a relatively
prescriptive document that guides institutions and researchers as to their
responsibilities in meeting integrity principles, including management of
Despite these guidelines, at that time there was no defined or coordinated
approach to research data management at UTS. Any management relied on the best
endeavours of individual researchers to protect their data. They continued to
store their data on file-servers; in drawers, in unlabelled hard drives, in
their garages etc.
It was not until 2013 that national infrastructure funding allowed Australian
Universities to develop Research Data catalogues. For the first time researchers
were encouraged to think of their data as an asset to be managed, as opposed to
simply kept. At the same time, national requirements to include RDM plans in
proposals to our major funding bodies, gave universities a bigger driver for
At UTS we began to govern research data at the Policy level, which required
researchers to plan their data management, At the systems level, because of the
new focus on RDM requirements, we had the necessary senior-level investment to
develop our research data catalogue (which we call Stash).
Take up of Stash over the next two years was very slow. With minimal resources,
the strategy for promoting Stash focused on educating graduate research students
and thereby indirectly, their supervisors. It was difficult to persuade
researchers that sharing their data was a good idea, and that storage should be
on university owned and controlled systems.
And, since the ANDS funding rules didn’t allow for development of data
repositories, our catalogues weren’t integrated with our storage solutions. It
was clear at that point that researchers needed an integrated data management
In 2016 efforts around RI and RDM began to align. We rolled out a new Research
Integrity Framework while the eResearch team produced a [Strategy and
roadmap](https://eresearch.uts.edu.au/eresearch/strategy/) that explicitly
linked data management to Integrity requirements and during 2017, our research
policy environment was redrafted in the same vein.
Collectively, these efforts enabled us to put the pieces in place for an
end-to-end data management solution and the [Provisioner project](https://eresearch.uts.edu.au/2018/04/05/provisioner_1.htm) was born.
This is Provisioner. It looks complex, but I will walk you through it using a
couple of examples that demonstrate how it supports both RDM and research
A researcher (top centre) accesses the data management catalogue “Stash”, which
has three parts: (1) Create a data management plan (as required by our policy
environment), to describe how their data will be collected, analysed, stored and
accessed. (2) Access research data catalogue listing where archived data sets can
And from the plan, they (will by the end of 2018) have access to (3) the
innovative provisioning tool which allows researchers to provision workspaces
such as file-shares, electronic notebooks, and repositories for programming
They can request and link to data storage, and at the end of the project
archived and publish their datasets.
The Provisioner will also have a range of reporting tools and automated ‘bots’
than generate useful reports for the Research Integrity team – identifying
‘orphaned’ workspaces, or auto-creating data management plans for researcher or
Information is pulled in from our existing databases to populate the RDMPs,
again saving researchers time.
We had a recent example in a science lab where some code from one project was
modified for use in a new project. Most researchers are not aware that code is
considered research data that must be retained, particularly in the sciences
where they are not trained in IT. So no copies of the code were retained, and
that data is now lost. In the Provisioner model, [Gitlab] solves this problem
without the researcher having to stop and think about their research integrity
obligations by saving each version the code and creating a revision history.
By introducing Provisioner, we are not implying that researchers can remain
blissfully unaware of their responsibilities, but we are supporting them by
providing a positive and streamlined experience, with the added comfort of
understanding that it comprehensively addresses responsible research practices.
We will continue to raise awareness by linking integrity training to our
practical training around data mgmt, and other element of the project
We’ve found that bringing researchers into contact with our eResearch experts
is an effective way of supporting and building an integrity culture. Our
researchers tend to prefer practice-based over principles-based education: so we
are aiming to build the principles into our workshops/clinics (e.g. Research
ethics practices, Research data management practices, Publications workshops
that focus on open access and reproducibility
Another example of how Provisioner addresses RCR is the incorporation of the
[LabArchives]/ELN (Electronic Lab Notebook)(top left). As this model is being implemented uni-wide, it
gives us an institutional advantage of oversight. eNotebooks developed at
the lab or faculty level leads to inconsistent practices and means that data is
out of the universities control.
Our policy dictates that rights in data are owned by the institution, which means we should
have some control over how it is stored, accessed and reported, and also have
the ability to interrogate the system, should any disputes arise. The
LabArchives tool ensures that data can not be manipulated or lost, can help to
clarify IP and ownership issues, and also enables us to manage data access when
researchers move institutions.
And finally, for researchers doing real-live imaging experiments, they could
generate terabytes of data in one session, so Provisioner can provide access to
large, online repositories such as
[Omero](https://www.openmicroscopy.org/omero/). This gives researchers comfort
that all their experimental data can be kept, not just selected files.
Provisioner is still being developed throughout 2018. What we’ve learned so far is that:
* Simply mandating new RDM requirements doesn’t work. Researchers are
incentivised by integrated and discipline-relevant tools that save them time
and give them reason to uphold good data management practices for more than just
* Successful implementation requires stakeholder engagement and championship at
senior levels - we are fortunate that our Associate Dean, Research (ADR) in
Health has invested heavily in both our RI and RDM responsibilities and has
implemented a cultural change in that Faculty, meaning that every project now
has an RDMP. This is reflected in uptake of RDMPs across the institution in
2017, supported by the extensive cross-promotion of between Ethics, RI/RCR
and RDM representatives at UTS.
Early feedback has been positive with researchers keen to take up the components
we have rolled out because they can see immediate value in adopting them, and we
can already begin to see the effectiveness of the model by the levels of uptake
of our [LabArchives] and [GitLab] since their
implementation in 2017.
However, there is still a way to go. Data archiving and preservation to cater
for longer retention periods such as those demanded by clinical trials are still
ahead on the eResearch roadmap. Data management practice is improving but is by
no means universal. We accept that culture change concerning RDM is a long term
proposition; but it can also be a quick win for RCR by offering an opportunity
to promote research integrity in practical ways. Meanwhile, we take a risk
management approach, focusing on sensitive and personal data and grants where
the funding bodies require RCR and research data management.