By Mike Lynch and Peter Sefton. We attended the 2019 eResearch Australasia conference in Brisbane from 22-24 October 2019, where we presented a few things, and a pre-conference summit on the 21st held by the Australian Research Data Commons, where Mike presented the report from our small discovery project on scalable repository technology. UTS paid for the trip. What we presented: our work on Simple Scalable Research Data Repositories. We've posted fleshed-out versions of our conference papers as usual.
This presentation was given by Peter Sefton & Michael Lynch at the eResearch Australasia 2019 Conference in Brisbane, on the 24th of October 2019. Welcome - we’re going to share this presentation. Peter/Petie will talk through the two major standards we’re building on, and Mike will talk about the software stack we ended up with. The project in a nutshell: a static, file-based research data repository platform using open standards and off-the-shelf web technology. OCFL: versioned file storage. RO-Crate: dataset / object metadata. Solr: index and discovery. nginx: baked-in access control.
By Peter Sefton. This presentation was given by Peter Sefton at the eResearch Australasia 2019 Conference in Brisbane, on the 24th of October 2019. (Slide: Meet RO-Crate.) This presentation is part of a series of talks delivered here at eResearch Australasia, so it won’t go back over all of the detail already covered; see the introduction of DataCrate in 2017 and the 2018 update. The standard formerly known as DataCrate has been subsumed into a new standard called Research Object Crate, or RO-Crate for short.
ARDC funding - Data and Services Discovery projects - Institutional Role in a Data Commons. There are a lot of specialized repository applications, from small (Omeka) to large (Hydra, Fedora), all designed as special-purpose homes for datasets and metadata which provide APIs for getting things in and out. Experience has shown that these solutions don’t scale. Eventually, an institution will have to store a dataset that’s too big either to get in or out, or to store, and will have to look at a workaround like putting the data on disk and pointing to it from a record in the repository.
Overview: To find out whether we are providing what researchers need, and thereby improve our services to researchers, we rolled out a short anonymous survey* back in May 2018 that has been open to all UTS researchers, research assistants and HDR students. We have received 41 responses so far. The two bar charts below show the distribution of their research positions and the distribution of their research fields respectively.
This Genomics Workshop was held at Sydney University on 3rd June 2019. Pascal Tampubolon and Mike Lake attended the event. Present were the NSW Chief Scientist and leaders of various Genomics institutes in Australia. The University of Sydney is developing a 1-5 year infrastructure roadmap to enable excellence in genomics-based research and a strategy to extend this across other institutions in Australia. This workshop showcased some outstanding genomics research being done in Australia.
(Slide: DataCrate: a progress report on packaging research data for distribution via your repository. Peter Sefton, University of Technology Sydney.) This is a talk that I delivered at Open Repositories 2019 in Hamburg, Germany, reporting on developments in the DataCrate specification for research data description and packaging. The big news is that DataCrate is now part of a broader international effort known as RO-Crate. I spent several hours at the conference working with co-conspirators Stian Soiland-Reyes and Eoghan Ó Carragáin on the first draft of the new spec, which we hope to unveil at eResearch Australasia 2019.
Implementation of a Research Data Repository using the Oxford Common File Layout standard at the University of Technology Sydney
This is a presentation by Michael Lynch and Peter Sefton, delivered by Peter Sefton at Open Repositories 2019 in Hamburg. My travel was funded by the University of Technology Sydney. (Slide: Implementation of a Research Data Repository using the Oxford Common File Layout standard at the University of Technology Sydney. Michael Lynch, Peter Sefton, University of Technology Sydney, Australia.) This presentation will discuss an implementation of the Oxford Common File Layout (OCFL) in an institutional research data repository at the University of Technology Sydney.
[Edited: 2019-07-01, 2019-07-2 fixed a few typos] This year Open Repositories was in Hamburg, Germany. I was funded by the University of Technology Sydney to attend. I gave two presentations, one on our work on scalable research data repositories and the other on research data packaging, and ran a workshop, more on which is below. This year was an intense, focussed conference for me. Last year I had a few take-aways from presentations; this year was one of those conferences where the value was in the conversations and in getting down to work.
Grant awarded; ARDC Discovery Activities; FAIR Simple Scalable Static Research Data Repository Demonstrator
The eResearch team at UTS, in collaboration with colleagues at QCIF and AARNet, applied for and received funding under the Australian Research Data Commons “Institutional role in a data commons” grant scheme. We applied for $50,000 but were only awarded $49,999 :(. The proposal is reproduced below. We promised to share what we're doing, so there will be more coming about what this Oxford Common File Layout (OCFL) thing is, why it matters and what we're testing / demoing.
This Hacky Hour blog post covers the 14th February to the 21st March. These 6 sessions of Hacky Hour saw a good turnout, with both HDR students and researchers attending. Repeat visitors were common in the first few weeks; this was due to the solid assistance given by our eResearch team, which encouraged attendees to return week after week until they had solutions to their queries. Queries that were addressed included: HPC installation and use; Software expertise; GROMACS; SSH connection assistance; REDCap; Visualisation; STASH; Programming; Machine learning; Management of data supercomputer; Figures in R; Database assistance; GPU queries; Excel data processing; Genomic / statistical analysis; Omero storage; General queries.
Last week I attended the Dr Trevor Pearcey Centenary Celebration. This was an afternoon of talks at Sydney University to celebrate the achievements of Dr Trevor Pearcey, a British-born Australian scientist, who created CSIRAC in 1949. This was Australia's first digital computer and the 4th or 5th stored program computer in the world. CSIRAC is the oldest surviving first-generation electronic computer in the world. I saw CSIRAC a few years ago when in Melbourne.
I have released as open source the small web application that we use to show the nodes, queues and jobs running on our HPC cluster. This is used by the cluster administrators to quickly see the status of the nodes and how busy they are, to check how full each queue is and to see what jobs are running and queued. It’s also useful for users to see this information.
Upcoming Training Courses in 2019 Have Been Updated. UTS eResearch and Intersect offer a wide range of specialised courses for researchers, from beginner through to advanced levels in High-Performance Computing (HPC), Programming with R/Python/Matlab, Excel, data management, data cleaning and visualisation, databases and SQL, and more. Delivered by Intersect's team of experts, training courses provide practical and research-relevant hands-on exercises. Upcoming training courses are updated regularly. The latest update was made on 22 Jan 2019.
This is a presentation by Mike Lynch, Peter Sefton and Sharyn Wise, delivered at eResearch Australasia 2018 by Mike Lynch. (Slide: A Framework for Integrated Research Data Management, with services for planning, provisioning research storage and applications, and describing and packaging research data. Mr Michael Lynch, Dr Peter Sefton, Ms Sharyn Wise.) (Slide: Provisioner. Integrate research data management into research applications; allow researchers to self-provision research apps; apply lessons learned from earlier generations (Data Capture); give researchers something, get out of the way; we didn’t want to build a monolith; small parts, loosely joined, data-centric.)
eResearch Australasia. This year Peter Sefton and Michael Lynch from eResearch, and our colleague Liz Stokes from the library, attended eResearch Australasia. The conference is the main professional networking / learning opportunity for eResearch staff. Promoting UTS work and building community: Mike Lynch presented UTS's work on the Provisioner (the presentation is here). Peter Sefton gave a presentation launching the “release candidate” version of the DataCrate specification (the presentation is online).
Launching DataCrate v1.0: a general purpose data packaging format for research data distribution and web-display
This is a presentation by Peter Sefton, Michael Lynch, Liz Stokes and Gerard Devine, delivered at eResearch Australasia 2018 by Peter Sefton. (Slide: Launching DataCrate v1.0: a general purpose data packaging format for research data distribution and web-display.) In this presentation we will launch version 1.0 of the DataCrate standard. The presentation will cover: The motivation for this work, and prior art - why we needed to bring together the standards we did in the way that we did.
The 2018 Linux Conference Australia was held at the University of Technology Sydney from 22-26 January 2018. I attended courtesy of eResearch. Linux totally dominates supercomputers. As of November 2017, all 500 of the world's fastest supercomputers were running Linux. This is because most of the world's scientific software for generating or crunching research data is written to run on Linux systems, thanks to Linux's openness, ability to be customised, speed and robustness.
This presentation was written and delivered by Sharyn Wise (with a couple of slides from Peter Sefton) for the Australasia Preserves meeting at the NSW State Library. This was one of a number of short talks from various organisations and was the only one to focus on research data. UTS has not started our Digital Preservation journey yet - we're still packing our things and looking at the map, or ‘prepping’.
A couple of weeks ago, I attended Research Bazaar (ResBaz) Sydney 2018 as an instructor. This is a wrap-up report on this successful and enjoyable event. Overview: Research Bazaar was initiated in Melbourne, and it is now a worldwide festival promoting the digital literacy emerging at the centre of modern research, where researchers can learn from training courses, share knowledge and skills, network with peer researchers and have fun. In Australia, ResBaz has been held in Melbourne, Sydney, Brisbane, Perth and Hobart.
I (Peter Sefton) recently attended OR2018, the Open Repositories conference from June 4-7, 2018 in Bozeman, Montana. This post is being posted on the UTS eResearch site and on my site. My trip was funded by the University of Technology Sydney (UTS). Mission: Gavin Kennedy from QCIF was also in attendance, and we were on something of a mission - to promote and get feedback on the recent work we've been doing on the ReDBox research data management platform.
This is a presentation that Gavin Kennedy and I gave at Open Repositories 2018 in Bozeman, Montana. I am posting this on the UTS eResearch website and on my own site. Gavin Kennedy, QCIF; Peter Sefton, University of Technology Sydney. (Slide: ReDBox 2.0 / Provisioner.) Notes - Slide 1: The first part of this presentation was narrated by Gavin.
End-to-End Research Data Management for the Responsible Conduct of Research at the University of Technology Sydney
This presentation was written by Louise Wheeler, Sharyn Wise and me for the Asia Pacific Research Integrity 2018 meeting in Taiwan, Feb 2018. It was scripted and delivered by Louise, who is the UTS Manager, Research Integrity and Research Program; Sharyn works in the eResearch team with me. This is a good introduction to the work we've been doing on the UTS provisioner project from Louise's Research Integrity (RI) perspective. There's not much technical detail in this talk about the open source ReDBox platform on which our data management system, Stash, is built.
This post is an introduction to the Provisioner, an open framework for research data management which we're developing in collaboration with the Queensland Cyber Infrastructure Foundation (QCIF) and the Australian National Data Service (ANDS). Provisioner grew out of a project funded by the UTS IT Capital Management Program, which is, confusingly, also called Provisioner. In this post, I'll use “Provisioner” to refer to the framework and software, not the project as a whole.
Upcoming Training Courses in 2018 Have Been Updated. UTS eResearch and Intersect offer a wide range of specialised courses for researchers, from beginner through to advanced levels in High-Performance Computing (HPC), Programming with R/Python/Matlab, Excel, data management, data cleaning and visualisation, databases and SQL, and more. Delivered by Intersect's team of experts, training courses provide practical and research-relevant hands-on exercises. Upcoming training courses are updated regularly. The latest update was made on 28 Feb 2018.
By Peter Sefton. A version of this post is also available at my website. This is a presentation I gave at eResearch Australasia 2017-10-18 about the new Draft (v0.1) Data Crate Specification for data packaging I've just completed, with lots of help from others (credits at the end). BACKGROUND: In 2013 Peter Sefton and Peter Bugeia presented at eResearch Australasia on a format for packaging research data(1), using standards-based metadata, with one innovative feature: instead of including metadata in a machine readable format only, each data package came with an HTML file that contained both human and machine readable metadata, via RDFa, which allows semantic assertions to be embedded in a web page.
Upcoming Training Courses Have Been Updated. UTS eResearch and Intersect offer a wide range of specialised courses for researchers, from beginner through to advanced levels in High-Performance Computing (HPC), Programming with R/Python/Matlab, Excel, data management, data cleaning and visualisation, databases and SQL, and more. Delivered by Intersect's team of experts, training courses provide practical and research-relevant hands-on exercises. Upcoming training courses are updated regularly. Please keep an eye on our training page.
In collaboration with Western Sydney University (formerly UWS), the eResearch team here at UTS has been working on some enhancements to the Omeka repository tool. We presented a general paper about this at Open Repositories 2015, but in this (very late!) post we'd like to talk a bit about the software. There are a couple of changes we made that we think might be good additions to the Omeka v2 core.
Attention HPC users! Are you ready to play in the big league? We’ve noticed some sophisticated research compute happening on the UTS HPCCs, so this week Hacky Hour has a special guest for you. Meet Raijin, the National Supercomputing facility (and star of ABC TV’s thriller “The Code”)! Ok, so Raijin isn’t coming to us, but our guest, Dr Joachim Mai, can help bring you to Raijin. If you are getting frustrated with capacity issues or queues it might be time to migrate to the big time.
Hacky Hour 1.0: Thank you to everyone who came to our first ever Hacky Hour at UTS. It was great to see so many people come along to get help with their problems, lend a hand to others or just catch up with other researchers or people on the eResearch team. Here are just a few of the things people talked about: Peter Sefton talked to people from the Microbial Imaging Facility about managing microscope image data using an application such as Omero.
What is Hacky Hour? Inspired by the ResBaz people at The University of Melbourne, the UTS eResearch team are launching Hacky Hour at UTS. Hacky Hour has been successfully replicated at other universities such as The University of Auckland and The University of British Columbia, so we’re excited to be bringing it to UTS! Hacky Hour is a weekly meetup where researchers can congregate to work on their research problems related to code, data, or digital tools in a social environment.
Date: 25 August 2015. Time: 2.30pm to 3.30pm. Session Format: Presentation. Venue: CB11.06.408. Machine learning is being used to solve interesting problems in many fields: autonomous navigation, fraud detection, animal conservation, health care, energy forecasting. But many machine learning algorithms are complicated to implement, so how can you start using these algorithms in your applications? In this session, Matt will show how the MATLAB® environment makes it easy to apply machine learning algorithms to your data.
Presentation: Ozmeka: extending the Omeka repository to make linked-data research data collections for (any and) all research disciplines
This is a presentation delivered by Peter Sefton at the Open Repositories conference in June 2015 in Indianapolis. See Dr Sefton's blog for a trip report. ![ Ozmeka: extending the Omeka repository to make linked-data research data collections for (any and) all research disciplines. Peter Sefton, University of Technology, Sydney, firstname.lastname@example.org; Sharyn Wise, University of Technology, Sydney, Sharyn.Wise@uts.edu.au; Peter Bugeia, Intersect Australia Ltd, Sydney, email@example.com; Katrina Trewin, University of Western Sydney, k.