Technology

This page gives a technological overview of the CRC806-Database legacy system and the Repository Life Cycle Model (Willmes 2016), followed by an introduction to Progressive Web App (PWA) technology.

The Research Data Management Infrastructure CRC806-Database

The CRC806-Database was first launched in early 2010 as a simple PHP/HTML/JS website with little interactivity. Data was not uploaded through the web interface; instead, files were copied manually into folders on the Andrew File System (AFS) backend of the University of Cologne, which were mounted by the web server serving the website (Willmes et al 2012, Willmes et al 2014). As described in Willmes et al (2014), by 2013 the system already consisted of a data catalogue and a WebGIS interface. This system was developed using an early version of AngularJS and was very lightweight on the server side. CKAN already served as the backend for the data catalogue, and a MapServer instance on top of a PostGIS database served as the SDI backend for the WebGIS (Willmes 2017).

Since then, the CRC806-Database application has evolved into a complex system that integrates multiple middleware components, such as CKAN, GeoNode and Typo3 (see fig. 1).

Fig. 1: System Architecture of the CRC806-Database research data management infrastructure until 2017. Left diagram, CRC806-Database as of 2017, source: (Willmes et al 2016). Right diagram, CRC806-Database as of 2013, source: (Willmes et al 2014).

The CRC806-Database infrastructure is relatively complex: it integrates several components, each of which has to be maintained. There are many points of vulnerability, because updates may break the current interplay of the components. The main disadvantage, however, is that only highly trained people who know the system well can maintain it. Handing this job to third-party system administrators would not be feasible, because the effort and thus the cost would be too high.

Repository Life Cycle

One major result of implementing the research data management infrastructure and developing it over several years was the Repository Lifecycle Model (Willmes 2016; see fig. 2), which describes how a research data repository evolves through the different phases of a CRC INF project.

Fig. 2: The repository Lifecycle Model for a fixed term funded research project. Source: (Willmes 2016).

Pre-project phase

In the pre-project phase, the main tasks are gathering ideas, researching related work and infrastructures, and writing the project proposal. This phase is about designing the initial layout and aims of the infrastructure; the results are then formulated in the project proposal (Willmes 2016).

Project phase

In the first phase of the project, the system is implemented as designed in the project proposal. During this initial development, or at the latest once a first version of the repository is finished, the system is tested and used. In this process, errors and shortcomings that need to be fixed will almost certainly be detected. This starts a feedback cycle, similar to the prototyping approach (Willmes 2016), that continues until most errors are identified and fixed. As a consequence, requests and ideas for additional functionality will emerge, and developing these additional features restarts the initial development cycle. At the latest when the repository seems to be feature complete, interoperability should be implemented and improved, ideally according to well-established standards. Finally, the security of the system should be hardened, which includes testing the whole system for vulnerabilities and fixing them (Willmes 2016).

Post-project phase

Because the system will no longer be maintained by project-funded staff after the project has ended, strategies for keeping it online and working need to be developed and implemented. This can include many different measures, which depend heavily on the given infrastructure implementation and the technology used. What always holds is that the system needs to be reduced in complexity, which means getting rid of most dynamic functionality. The system presented here, for example, will be transformed into a static website without server-side scripting, without a relational database management system (RDBMS), without user login, etc.: just plain HTML. This removes about 99% of all possible vulnerabilities of a web-based system (Willmes 2016).
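To illustrate what reducing a dynamic catalogue to plain HTML can look like, the following minimal sketch renders one record into a complete, script-free page at build time. It is not the CRC806-Database export; the record shape (DatasetRecord) and the function names are assumptions made for this example only.

```typescript
// Hypothetical shape of one exported catalogue record (not the CRC806-Database schema).
interface DatasetRecord {
  id: string;
  title: string;
  description: string;
}

// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render one record as a complete HTML document without any scripts.
function renderStaticPage(record: DatasetRecord): string {
  return [
    "<!DOCTYPE html>",
    '<html lang="en">',
    `<head><meta charset="utf-8"><title>${escapeHtml(record.title)}</title></head>`,
    "<body>",
    `<h1 id="${escapeHtml(record.id)}">${escapeHtml(record.title)}</h1>`,
    `<p>${escapeHtml(record.description)}</p>`,
    "</body>",
    "</html>"
  ].join("\n");
}
```

Writing the returned string to one file per record would yield pages that any plain file server can deliver, with no database or login at request time.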

Progressive Web Apps

Progressive Web Apps (PWAs) describe a new class of web applications that utilize a set of emerging technologies and patterns. They aim to bring a native-like user experience to the browser by offering a quickly loading and continuously reactive UI with deeper integration into the underlying operating system than usual web applications. The term was coined by Russel and Berriman, who identified qualities such as Responsive, Connectivity independent, App-like-interactions, Fresh, Safe, Discoverable, Re-engageable, Installable and Linkable (Russel 2015).
The following quote summarizes well what PWAs achieve and add to the state of the art of web development:

"Progressive web apps are a new breed of web apps that combine the benefits of a native app with the low friction of the web [...] they start off as simple websites, but as the user engages with them, they progressively acquire new powers. They transform from a website into something much more like a traditional, native app" (Ater 2017, p. 2).

Technologies

Service workers (Russel 2017) are one of the more significant additions. A service worker is an additional JavaScript file that can be registered during the application life cycle. Once in control, it has access to the Cache and Push APIs and can intercept fetch events. This allows a service worker to respond to a network request with a cached resource instead of relying on a network connection. A successfully installed service worker may also offer a re-engagement UI in the form of push notifications, which can be issued even when the web application is not open in a browser, as long as the service worker is installed. However, for a PWA to be able to install a service worker, it needs to be served over a secure connection using HTTPS.
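Answering a request from the cache instead of the network can be sketched as a cache-first lookup. The sketch below is a simplified model under stated assumptions: CacheLike, Fetcher, cacheFirst and MemoryCache are hypothetical names that stand in for the browser's cache and network access so that the logic runs outside a browser.

```typescript
// Minimal stand-ins for a browser cache and the network (hypothetical names).
interface CacheLike<R> {
  match(url: string): Promise<R | undefined>;
  put(url: string, response: R): Promise<void>;
}

type Fetcher<R> = (url: string) => Promise<R>;

// Cache-first: serve the cached resource if present, otherwise fetch and store it.
async function cacheFirst<R>(
  url: string,
  cache: CacheLike<R>,
  fetchFromNetwork: Fetcher<R>
): Promise<R> {
  const cached = await cache.match(url);
  if (cached !== undefined) {
    return cached; // no network connection needed
  }
  const fresh = await fetchFromNetwork(url);
  await cache.put(url, fresh);
  return fresh;
}

// In-memory cache used only to demonstrate the logic.
class MemoryCache<R> implements CacheLike<R> {
  private readonly entries = new Map<string, R>();

  async match(url: string): Promise<R | undefined> {
    return this.entries.get(url);
  }

  async put(url: string, response: R): Promise<void> {
    this.entries.set(url, response);
  }
}
```

In an actual service worker, this logic would run inside a fetch event handler and answer via the event's respondWith method; a real Response would also have to be cloned before it is stored, because its body can only be read once.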
Defining a web app manifest enables PWAs to integrate into the underlying operating system in a way similar to native apps. The manifest controls which icon and title are used within a launcher, and whether the PWA launches in a browser tab as usual or in its own exclusive window that may hide some or all browser UI elements.
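A manifest is a JSON file; the sketch below models a small subset of its members as a TypeScript type and serializes an example. The member names (name, short_name, start_url, display, icons) come from the web app manifest format, while all values, including the application name and icon path, are hypothetical.

```typescript
// Display modes defined for web app manifests.
type DisplayMode = "fullscreen" | "standalone" | "minimal-ui" | "browser";

interface ManifestIcon {
  src: string;
  sizes: string;
  type: string;
}

// Subset of manifest members; a real manifest may contain more.
interface WebAppManifest {
  name: string;
  short_name: string;
  start_url: string;
  display: DisplayMode;
  icons: ManifestIcon[];
}

// Hypothetical example values.
const exampleManifest: WebAppManifest = {
  name: "Example Research Data Portal",
  short_name: "DataPortal",
  start_url: "/",
  display: "standalone", // own window without browser UI; "browser" keeps the usual tab
  icons: [{ src: "/icons/icon-192.png", sizes: "192x192", type: "image/png" }]
};

// Serialized content for the manifest file that a page references
// with a <link rel="manifest" href="..."> element.
const manifestJson: string = JSON.stringify(exampleManifest, null, 2);
```

Keeping the manifest typed in this way is only a convenience for catching misspelled members at build time; the browser consumes nothing but the resulting JSON.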

These additions progressively enhance a web application in modern browsers, while basic site functionality remains available in older browsers even without the aforementioned features.
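Progressive enhancement of this kind usually rests on feature detection. The following sketch registers a service worker only if the browser offers the API; NavigatorLike and registerIfSupported are hypothetical names, and the interface mirrors only the small part of the browser's navigator object that is needed here.

```typescript
// Minimal stand-in for the part of the browser's navigator object used here.
interface NavigatorLike {
  serviceWorker?: {
    register(scriptUrl: string): Promise<unknown>;
  };
}

// Returns true if a registration was attempted, false if the browser lacks support.
function registerIfSupported(
  nav: NavigatorLike,
  scriptUrl: string,
  onError: (reason: unknown) => void = () => {}
): boolean {
  if (!nav.serviceWorker) {
    return false; // older browser: the site keeps working without a service worker
  }
  nav.serviceWorker.register(scriptUrl).catch(onError);
  return true;
}
```

In a page script, the call would look like registerIfSupported(navigator, "/sw.js"), where the script path is again an assumption.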

Concepts

PRPL is a pattern for structuring and serving Progressive Web Apps, with an emphasis on the performance of app delivery and launch (Osmani et al 2018). It stands for push, render, pre-cache and lazy-load.
The initial push should only contain assets that are critical for the initial route (see fig. 3). This can be achieved with code-splitting and/or route-based chunking (Osmani et al 2018), often provided by modern JavaScript module bundlers.
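As a rough sketch of how route-based chunking narrows the initial push, the function below selects only the shared shell assets plus the chunks of the requested route. The route table, the file names and the function name initialPushAssets are hypothetical; in practice a bundler would generate such a mapping.

```typescript
// Map from route path to the chunks emitted for it (file names are hypothetical).
type RouteChunks = { [route: string]: string[] };

// Assets for the initial push: shared shell assets plus the chunks of the
// initial route only, without duplicates and in a stable order.
function initialPushAssets(
  shellAssets: string[],
  routeChunks: RouteChunks,
  initialRoute: string
): string[] {
  const routeAssets = routeChunks[initialRoute] || [];
  const result: string[] = [];
  for (const asset of shellAssets.concat(routeAssets)) {
    if (result.indexOf(asset) === -1) {
      result.push(asset);
    }
  }
  return result;
}
```

Chunks that belong to other routes stay out of the first response and can be pre-cached or lazy-loaded later.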

Fig. 3: Initial loading of a progressive web app. The Service Worker pre-caches critical resources for future requests. Source: (Yener 2018).

With the critical assets loaded, the initial route is rendered. At this point, the app shell model can be utilized to achieve start-up times that are closer to those of native applications (Osmani et al 2018). The app shell model is applied by making a clear distinction between the dynamic content of a web app and its surrounding structure (e.g. navigational menus and routing in general) (Osmani et al 2018). The app shell is ideally cached by a service worker to truly reach almost-instant load times on consecutive visits (see fig. 4).
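Pre-caching the app shell can be sketched as follows. The interfaces model only the open and addAll operations of the browser's cache storage, so that the logic runs outside a browser; ShellCacheLike, CacheStorageLike, precacheAppShell and RecordingStorage, as well as the cache name and URLs in the usage note, are assumptions made for this example.

```typescript
// Minimal stand-ins for the cache storage operations used here.
interface ShellCacheLike {
  addAll(urls: string[]): Promise<void>;
}

interface CacheStorageLike {
  open(cacheName: string): Promise<ShellCacheLike>;
}

// Store every app shell resource in a named cache.
async function precacheAppShell(
  storage: CacheStorageLike,
  cacheName: string,
  shellUrls: string[]
): Promise<void> {
  const cache = await storage.open(cacheName);
  await cache.addAll(shellUrls);
}

// In-memory fake that records which URLs would have been cached.
class RecordingStorage implements CacheStorageLike {
  readonly cached: { [cacheName: string]: string[] } = {};

  async open(cacheName: string): Promise<ShellCacheLike> {
    const entries = this.cached[cacheName] || (this.cached[cacheName] = []);
    return {
      addAll: async (urls: string[]): Promise<void> => {
        for (const url of urls) {
          entries.push(url);
        }
      }
    };
  }
}
```

In a service worker, such a function would typically be called from the install event handler and passed to the event's waitUntil method, for example with a cache name like "app-shell-v1" and the URLs of the shell's HTML, CSS and JavaScript files.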

Fig. 4: Once pre-cached, the app shell will be available almost instantly without requiring network requests. Source: (Yener 2018).

Once the assets for the initial route are pushed and rendered, assets needed for routes that are likely to be navigated to next may be pre-loaded on the current route. One way to do this is a <link> tag with the rel="prefetch" attribute set. Assets declared this way are only downloaded once the critical resources for the current route have finished loading (Osmani 2017).
Less important routes may utilize lazy-loading instead and load their assets only when needed.
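Both ideas can be sketched in a few lines. The first function generates prefetch link tags for a list of asset URLs; the second wraps a loader so that a route's code is requested only on first use and at most once. The names prefetchLinks, escapeAttribute and lazy are hypothetical, and the loader passed to lazy would, under a bundler, typically be a dynamic import of the route's module.

```typescript
// Escape characters that are significant inside a double-quoted HTML attribute.
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// One <link rel="prefetch"> tag per asset of a route that is likely to be visited next.
function prefetchLinks(assetUrls: string[]): string {
  return assetUrls
    .map((url) => `<link rel="prefetch" href="${escapeAttribute(url)}">`)
    .join("\n");
}

// Lazy-loading: call the loader on first use only and reuse its promise afterwards.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = load();
    }
    return pending;
  };
}
```

Prefetching suits routes with a high chance of being visited, while the lazy wrapper defers all cost for routes that may never be opened.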

References

Ater (2017): Building Progressive Web Apps: Bringing the Power of Native to the Browser. O'Reilly, UK Ltd.

Osmani (2015): The App Shell Model.

Russel, A. (2015): Progressive Web Apps: Escaping Tabs Without Losing Our Soul.

Kürner, D. (2012): Implementation des Metadatenmanagement von Geodaten der SFB806-Datenbank. Diplomarbeit. Geographisches Institut der Universität zu Köln.

Willmes, C., Kürner, D. and Bareth, G. (2014): Building Research Data Management Infrastructure using Open Source Software. Transactions in GIS. DOI: 10.1111/tgis.12060

Willmes, C., (2016): CRC806-Database: A semantic e-Science infrastructure for an interdisciplinary research centre. PhD Thesis, University of Cologne. url: http://kups.ub.uni-koeln.de/7381/

Willmes, C., Yener, Y., Gilgenberg, A., Bareth, G. (2016): CRC806-Database: Integrating Typo3 with GeoNode and CKAN. Geographisches Institut der Universität zu Köln, Kölner Geographische Arbeiten, Vol. 96, DOI: 10.5880/TR32DB.KGA96.17

Willmes, C., Becker, D., Verheul, J., Yener, Y., Zickel, M., Bolten, A., Bubenzer, O., Bareth, G. (2017): PaleoMaps: SDI for open paleoenvironmental GIS data. IJSDIR, Vol. 12, 39-61, DOI: 10.2902/1725-0463.2017.12.art3 , URL: http://ijsdir.jrc.ec.europa.eu/index.php/ijsdir/article/view/431

Yener, Y. (2018): Progressive Web Apps - Costs, Benefits and Tradeoffs. Masters Thesis. Informatik, TH Köln.
