deRSE23 - Conference for Research Software Engineering in Germany, Paderborn 2023

Unfortunately, I missed the RSE conference last year in Newcastle because I was changing jobs and moving country. So it was really nice to be able to attend this year's German RSE conference in Paderborn.

Day 1

The keynote on Engineering Software Ergonomics was given by Reinhard Keil. I have been working in software development for a long time but was never really exposed to the idea of usability engineering. What follows is my brain dump of the keynote, snippets that I thought were interesting and insightful.

Graphical user interfaces are not engaged in a dialogue with the user. Messages like "Are you sure?" are not helpful. Software should provide precise information so that the user can make an informed decision. The well-known adage that one image is worth a thousand words is not always true. Language is an abstraction, so, for example, the word "vehicle" can mean many things. Inscriptions are external memories that can persist through time and be transported from one place to another. Mathematics, like text, is two-dimensional: the symbols are laid out in a particular pattern on paper to help with the solution of the mathematical problem. The values, however, are tied to physical reality: 3 apples and 4 pineapples are 7 fruit, whereas 3 parts of one chemical and 4 parts of another may lead to an explosion. Software, like text and mathematics, is a spatial thing. Maybe that is why writing software is in many ways unlike a precise engineering science.

At the end, Reinhard Keil also mentioned the currently very topical ChatGPT. He does not think it will replace normal web search, as it is good for dictating text but not for iteratively homing in on a web search. I have now added looking into perception and usability engineering to my ever-growing to-do list.

Following the keynote, I joined the session on Workflows. I really enjoyed the talk on pyWATTS, a non-sequential workflow engine that works with sklearn. The next talk was on ZnTrack, which sits on top of DVC. DVC uses git to track data sets, particularly for machine learning. ZnTrack uses DVC to track the parameters used in a pipeline. It is a bit like geosmeta, except that it uses a git repository rather than a MongoDB database. It does support flagging up when parts of the pipeline need to be rerun because dependencies have changed.
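
To make the rerun idea concrete for myself, here is a minimal sketch in plain Python. It is not the ZnTrack or DVC API: the function names, the pipeline layout and the stage names are all made up for illustration. The assumption is simply that each stage's parameters are hashed, and that a stage is stale if its hash changed or if anything upstream of it is stale.

```python
import hashlib
import json


def fingerprint(params: dict) -> str:
    """Return a stable hash of a stage's parameters."""
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def stages_to_rerun(pipeline: dict, recorded: dict) -> list:
    """List stages whose parameters changed, plus everything downstream.

    pipeline maps stage name -> {"params": dict, "deps": [stage names]}
    and must list the stages in execution order.
    recorded maps stage name -> fingerprint stored by the previous run.
    """
    stale = []
    for name, stage in pipeline.items():
        changed = recorded.get(name) != fingerprint(stage["params"])
        upstream_stale = any(dep in stale for dep in stage["deps"])
        if changed or upstream_stale:
            stale.append(name)
    return stale


# Hypothetical three-stage pipeline (names are illustrative only).
pipeline = {
    "load": {"params": {"path": "data.csv"}, "deps": []},
    "train": {"params": {"learning_rate": 0.01}, "deps": ["load"]},
    "evaluate": {"params": {"metric": "rmse"}, "deps": ["train"]},
}

# Pretend a previous run stored these fingerprints.
recorded = {name: fingerprint(stage["params"]) for name, stage in pipeline.items()}

# Changing a training parameter invalidates "train" and its dependant.
pipeline["train"]["params"]["learning_rate"] = 0.1
print(stages_to_rerun(pipeline, recorded))  # ['train', 'evaluate']
```

The real tools track files and outputs as well as parameters, so this only captures the flavour of the dependency check.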

The next session I attended was on Integration and Modularity. There were interesting talks on how various groups are trying to encourage interoperability of codes so that they can be plugged together. The last talk was on developing a Python frontend to control equipment in a physics lab. The presenter was really enthusiastic. It would be really nice if physical equipment came with a nice software interface. Prodding bits coming over a serial line is just a pain. The talk led to an interesting conversation afterwards when we discussed LabVIEW, which is the go-to package for dealing with physical instruments. The conclusion was that it is really easy to get started with but eventually too limiting. I guess that's exactly why a good, complete API is needed.

Next, I attended the Metadata session, AKA the Stephan Druskat show. Stephan talked about the Citation File Format, which can be used to add metadata to a software project. The aim should be to cite software directly rather than papers, to establish software as a first-class academic product. Stephan's next talk was on the HERMES Project, which aims to develop automated workflows for publishing scientific software with rich metadata. The last presentation was on the difficulty of extracting metadata from a large historic data set with inconsistent naming schemes.

The final session of talks was on machine learning and AI. There were two presentations on using machine learning algorithms to analyse video data sets. It was interesting to see the application of ML.

The poster session took place before the conference dinner. I had a lovely chat about providing RSE services as a self-employed RSE. I checked out some more posters on workflows. I also had a totally nostalgic trip down memory lane chatting about deal.II, which has new support for temporal discretisation and Nedelec elements.

I also had good chats about programming languages, supporting applications for long periods of time, and what it is like working in a place that is still without central IT after having been hacked a little while ago.

Day 2

Day 2 started with the keynote from one of the sponsors. Tim Cutts from AWS presented the Amazon Cloud Development Kit. The CDK allows you to describe infrastructure as code using, among other languages, Python. It spits out CloudFormation, Amazon's YAML/JSON schema for describing infrastructure in their cloud. It can also produce Terraform and Kubernetes configurations.
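
The underlying idea, generating a declarative template from ordinary code, can be sketched with nothing but the Python standard library. To be clear, this is not the CDK API: the helper functions bucket and synth are names I made up, and the real CDK does far more. The template layout (AWSTemplateFormatVersion, Resources, the AWS::S3::Bucket type) follows the CloudFormation format.

```python
import json


def bucket(versioned: bool = False) -> dict:
    """Describe an S3 bucket as a CloudFormation resource (hypothetical helper)."""
    resource = {"Type": "AWS::S3::Bucket"}
    if versioned:
        resource["Properties"] = {
            "VersioningConfiguration": {"Status": "Enabled"}
        }
    return resource


def synth(resources: dict) -> str:
    """Render named resources as a CloudFormation JSON template (hypothetical helper)."""
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": resources,
    }
    return json.dumps(template, indent=2)


# Ordinary Python (loops, functions) decides which infrastructure exists.
resources = {f"{name}Bucket": bucket(versioned=True) for name in ("Raw", "Results")}
print(synth(resources))
```

The attraction is that loops, functions and tests become available for describing infrastructure, instead of copying and pasting blocks of YAML.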

Christian Kniep gave an interesting presentation on MetaHub Registry for HPC containers. Traditional containers only distinguish between CPU architectures, e.g. x86-64 and arm. HPC containers need to be more hardware and software aware, i.e. what flavour of MPI is used, which CPU optimisations were switched on or which accelerator was used. MetaHub uses Docker labels to store the additional metadata.
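
As a rough illustration of selecting an image by such labels, here is a small sketch. It does not use MetaHub or the Docker API, and the label keys (hpc.arch, hpc.mpi, hpc.accelerator), the image tags and the select_image function are all invented for the example; I do not know the naming scheme MetaHub actually uses.

```python
from typing import Optional

# Hypothetical image catalogue; label keys and tags are made up.
IMAGES = [
    {
        "tag": "solver:x86-openmpi",
        "labels": {"hpc.arch": "x86-64", "hpc.mpi": "openmpi", "hpc.accelerator": "none"},
    },
    {
        "tag": "solver:x86-mpich-gpu",
        "labels": {"hpc.arch": "x86-64", "hpc.mpi": "mpich", "hpc.accelerator": "gpu"},
    },
    {
        "tag": "solver:arm-openmpi",
        "labels": {"hpc.arch": "arm", "hpc.mpi": "openmpi", "hpc.accelerator": "none"},
    },
]


def select_image(images: list, wanted: dict) -> Optional[str]:
    """Return the tag of the first image whose labels match every wanted key."""
    for image in images:
        labels = image["labels"]
        if all(labels.get(key) == value for key, value in wanted.items()):
            return image["tag"]
    return None


# A node describes itself, and the matching image variant is picked.
node = {"hpc.arch": "x86-64", "hpc.accelerator": "gpu"}
print(select_image(IMAGES, node))  # solver:x86-mpich-gpu
```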

My next session was on Heterogeneous Computing Architectures. The first talk introduced the Python package ACME - Asynchronous Computing Made ESI. The package provides a Python context manager that can be used to schedule tasks from an embarrassingly parallel problem. It uses dask and SLURM to distribute the work. Steffen Christgau from the ZIB talked about using modern programming languages to deal with heterogeneous architectures. Essentially, use OpenMP or SYCL or a suitable library such as PETSc or PyTorch, depending on the application. In another presentation the NVIDIA MPS, multi-process service, was mentioned. Another thing to look into.
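
The context manager pattern for embarrassingly parallel work can be sketched with the standard library. This is not ACME's interface: ParallelMap, compute and simulate are names I invented, and a local thread pool stands in for the dask and SLURM machinery that ACME uses to distribute the work.

```python
from concurrent.futures import ThreadPoolExecutor


class ParallelMap:
    """Context manager that runs one function over many independent inputs.

    Hypothetical stand-in: a local thread pool replaces a real cluster.
    """

    def __init__(self, func, workers: int = 4):
        self.func = func
        self.workers = workers
        self._pool = None

    def __enter__(self):
        # Acquire the workers when entering the with-block.
        self._pool = ThreadPoolExecutor(max_workers=self.workers)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always release the workers, even if a task raised an exception.
        self._pool.shutdown(wait=True)
        return False

    def compute(self, inputs):
        # Each input is independent, so the calls can run in any order;
        # map() still returns the results in input order.
        return list(self._pool.map(self.func, inputs))


def simulate(seed: int) -> int:
    """Placeholder for an expensive, independent computation."""
    return seed * seed


with ParallelMap(simulate, workers=2) as pmap:
    results = pmap.compute(range(5))

print(results)  # [0, 1, 4, 9, 16]
```

The nice thing about the with-block is that the workers are released again once the block ends, whether or not the computation succeeded.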

Next, I went on a guided tour of the Heinz Nixdorf Museum. The museum is absolutely fascinating. It starts off with a history of writing and number systems and how we do calculations. Babbage, Lovelace and Leibniz featured. Then there are historic computers such as the Z11, which is the last electromechanical computer built by Zuse. The museum also has the computer from the GEMINI-2 space mission, built by IBM in the 1960s, on display. Also interesting to see were the silicon crystals that provide the wafers for CPUs. The museum is extremely good and I highly recommend a visit.

The final session was on Musicology. The most amazing talk was on an art/science collaborative project where an orchestra performed a musical piece, (Un)Answered Question, to an audience. The audience was equipped with sensors. The data collected during the performance was used to generate a new score, which the orchestra then performed shortly afterwards. The scientists had 5 minutes to process the data, so not quite real time, but close enough. Unfortunately, no recording exists since it was meant to be unique, which also helps with data protection issues.

Day 3

Workshops were run on the last day of the conference. I attended the Software Carpentry workshop. It was really cool to see the new system for generating course materials. The final workshop I attended was on Teaching RSEs. We had super interesting conversations and I am looking forward to seeing the resulting paper.