19.09.2019
Behind the platform

Chris Gwilliams is the Center’s back-end developer, the geeky soul behind the citizenscience.ch platform. We convinced him to sit down and explain in layman’s terms our tech services and tools … here is what we got!

Author: Fanny Gutsche-Jones

Let’s face it: you have read enough about “good news” and “projects”, and what you really want to read is a good, old-fashioned technical blog post. I am going to assume that is not true, but I will try to introduce how we create projects and what our infrastructure is at the Citizen Science Center, in layman’s terms.

Our Values

Firstly, we have some values at the Center that help us make technical decisions and shape the tools we use. Some of the most important are:

Work in the open. All of our choices for tools and software are restricted to open source software, and all of our work can be found at: https://github.com/citizensciencecenter
Respect the user. We take privacy seriously, following Swiss laws and adhering to GDPR.

What We Offer (Technically)

As a center, we are here to help researchers and citizens design, create and run Citizen Science projects, and then download and analyse the associated data. For all volunteers contributing, we want to create engaging, reliable and easy-to-use projects that allow them to enjoy the experience while seeing the benefits of taking part.

Our offers and tools:

File storage
We store files (documents, media) in a service called Minio that allows us to quickly retrieve them and make sure that only people with the required rights can access them.
Data storage
We store all data in a Postgres database. Postgres is a very old but incredibly feature-rich and stable database with some excellent tools available for it.
Project Websites
Thanks to our in-house UX designer, we develop accessible and state-of-the-art web applications for each project, each with a unique ‘look-and-feel’.
Application Programming Interface (API)
This is a set of standard URLs that allows people to get and update information in our database. We use the API to retrieve tasks and save users’ submissions.

Where Is Everything Stored?

You may already have heard of Amazon Web Services or Google Cloud Platform. These are servers (and other services) provided by the big players that allow people to offload all of the technical management to them. However, they are expensive.

Luckily, the University of Zurich comes to the rescue with ScienceCloud and provides us a number of servers to use that would otherwise cost us hundreds of Francs a month.

A Few Basic Concepts

Before the deep dive, you need to understand concepts such as Users and Projects in relation to our database. Do not worry, no technical knowledge is required here!

Basically, a User has an email address, a username and a password. Users can create Projects that have a name and a description, among other things. A Project could contain more than one thing for Citizen Scientists to do; we call those Activities. For example, a Project to identify snakes could have an Activity where a User uploads pictures of snakes that they have taken and another Activity where those pictures are classified.

Inside each Activity are a number of Tasks. There can be just one if you want, or 1 million. Up to you. When Users take part in your project, they make a Submission. Users own their own Submissions, but all Submissions in a Project can also be seen by the Project Owner (i.e. the User that made the Project).
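
For the technically curious, the concepts above can be sketched in a few lines of Python. This is only an illustration of how the pieces relate, not our actual database schema: the field names, the password_hash field and the visible_submissions helper are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    email: str
    username: str
    password_hash: str  # assumption: a hash is stored, never the password itself


@dataclass
class Submission:
    owner: User  # the User who made the Submission
    task_id: int
    answer: str


@dataclass
class Task:
    id: int
    question: str
    submissions: List[Submission] = field(default_factory=list)


@dataclass
class Activity:
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Project:
    name: str
    description: str
    owner: User
    activities: List[Activity] = field(default_factory=list)

    def all_submissions(self) -> List[Submission]:
        """Collect every Submission from every Task in every Activity."""
        return [s for a in self.activities for t in a.tasks for s in t.submissions]


def visible_submissions(project: Project, viewer: User) -> List[Submission]:
    """Project Owners see every Submission; other Users see only their own."""
    everything = project.all_submissions()
    if viewer == project.owner:
        return everything
    return [s for s in everything if s.owner == viewer]
```

The last function captures the ownership rule: a volunteer gets back only their own Submissions, while the Project Owner gets all of them.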

Simple enough, right?

How Does It Work?

Let’s go through a typical example: you, a citizen scientist, connect to a project site and want to contribute, let’s say to wiesel-gesucht.citizenscience.ch.

You know you love weasels, so you register for an account (unless you are already a member). Each project in the platform is linked, so you only need one account to take part in all of the projects we host. In some projects, we also have the option to allow users without accounts to take part. Generally, we tend to ask very little when you sign up (just your email!) so you get to keep your private data to yourself.

However, depending on the project, we may sometimes ask for more information. If you want to take part in a project that, for example, wants to find all people in Switzerland who can speak more than 4 languages, then you will need to provide the languages you speak, but those data will only be available within that project.

You are ready to take part now!

When you get to the project, a task is loaded by the API and shown to you. This may include, for instance, the question to be answered, any related media and a form for your answer. Once you fill in your answers and click the Next button, a submission is created and sent to the API. This happens for every user taking part in that project.
That is pretty much it!
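
If you prefer to see that loop as code, here is a rough sketch. It uses a small in-memory stand-in instead of the real API so that it runs anywhere; the class name, method names and task fields are invented for the example and do not mirror our actual endpoints.

```python
from typing import Dict, List, Optional


class InMemoryApi:
    """Stand-in for the real API: hands out tasks and stores submissions."""

    def __init__(self, tasks: List[Dict]) -> None:
        self._tasks = tasks
        self.submissions: List[Dict] = []

    def next_task(self, user: str) -> Optional[Dict]:
        """Return the first task this user has not answered yet, or None."""
        done = {s["task_id"] for s in self.submissions if s["user"] == user}
        for task in self._tasks:
            if task["id"] not in done:
                return task
        return None

    def submit(self, user: str, task_id: int, answer: str) -> Dict:
        """Create a submission, as happens when the Next button is clicked."""
        submission = {"user": user, "task_id": task_id, "answer": answer}
        self.submissions.append(submission)
        return submission


api = InMemoryApi([
    {"id": 1, "question": "Is there a weasel in this photo?", "media": "photo-1.jpg"},
    {"id": 2, "question": "Is there a weasel in this photo?", "media": "photo-2.jpg"},
])

# Load a task, answer it, load the next one, until nothing is left.
task = api.next_task("alice")
while task is not None:
    api.submit("alice", task["id"], "yes")  # the answer would come from the form
    task = api.next_task("alice")
```

Each user works through the tasks independently, so a second user would still be offered the first task.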

How Does It Run? (This may get technical, but bear with me!)

Well, we have some tools that make it easy for us to handle large numbers of projects and large amounts of data (some of our projects have more than 300,000 tasks for users to complete).

OpenAPI

APIs are difficult beasts: they can grow in size and complexity and, more importantly, they can be compromised to allow people to get data that they should not have access to.
Normally, all APIs are written in code somewhere and are quite hard for people to read or follow. Like this example:

    $app->group('/api', function () use ($app) {
        $dataForApi = ['yo', 777];
        // api route "test" which just gives back some demo data
        $app->get('/test', function ($request, $response, $args) use ($dataForApi) {
            return $response->withJson([
                'demoText' => $dataForApi[0], // "yo"
                'demoNumbers' => $dataForApi[1] // 777
            ]);
        });
    });

For programmers this may make some sense, but it is far from easy. It also means that your API documentation needs to be separate from your code; this basically means having to write a document for every file of code you write. Sounds fun? Nope.

OpenAPI is a specification that means your API is in a standard format that separates it from your code. Because this is a standard, a lot of tools have been made for anything that follows the standard. This means things like testing, documentation and even some server backends exist without needing to write a line of code! You can see this in action at api.citizenscience.ch/explorer. This page is automatically generated for us and allows us to test our API, but it also allows others to see what kinds of data we make available.
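
To give a feel for why a standard format helps, here is a minimal sketch in Python. It assumes the OpenAPI 3 layout, and the /tasks and /submissions paths are hypothetical examples rather than our real endpoints. Because the description is plain data, a few lines are enough to produce a listing of everything the API offers.

```python
# A tiny OpenAPI 3 description. The paths below are hypothetical examples,
# not the Center's actual endpoints.
spec = {
    "openapi": "3.0.0",
    "info": {"title": "Example Citizen Science API", "version": "1.0.0"},
    "paths": {
        "/tasks": {
            "get": {
                "summary": "List tasks",
                "responses": {"200": {"description": "OK"}},
            },
        },
        "/submissions": {
            "post": {
                "summary": "Create a submission",
                "responses": {"201": {"description": "Created"}},
            },
        },
    },
}

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(spec: dict) -> list:
    """Return one 'METHOD path - summary' line per operation in the spec."""
    lines = []
    for path, item in sorted(spec["paths"].items()):
        for method, operation in item.items():
            if method in HTTP_METHODS:
                lines.append(f"{method.upper()} {path} - {operation.get('summary', '')}")
    return lines


for line in list_operations(spec):
    print(line)
```

Documentation pages, test tools and client generators all work in the same spirit: they read the description, not the code behind it.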

VueJS

VueJS is an excellent tool made by the community of Web Developers. Did you ever change the style of your MySpace page or dabble in some HTML or CSS? When a site gets big and has a lot of functionality, we may not want to write thousands of lines of HTML. VueJS allows us to make Web Components (like a Task Form or a Contact Form or an Image Gallery) and reuse them in all of our sites. Not only does this mean we have the same look and feel, but we can do the work a lot quicker!

VueJS also has a huge number of packages available that make our lives easier. These help us:

make sites available in multiple languages
connect to our API quickly and easily
optimise the site for search engines

The Nitty Gritty

OK, this part is for those who have some technical knowledge and want to know how we really run things. Feel free to read on if you are new to this, and contact me if some parts are not clear!

All of our work is stored in repositories on GitHub, so we use that as a collaboration tool, and we use Travis CI to run all of the tests that we have written for each repository.

Inside the University of Zurich cluster, we have a number of different servers. Let me break down the more important ones:

3 servers operating a Kubernetes Cluster
1 server running the production API
1 server running the production Database
1 server handling deployments into Kubernetes

Kubernetes

Our Kubernetes cluster is based on K3s and is self-managed. We have 1 manager node and 2 worker nodes, with 2 TB of persistent volumes and 3 namespaces: prod, staging and test. Packages are managed by Helm and we have a certificate manager configured using Let’s Encrypt. This means that a deployment with an ingress specified will automatically have its SSL certificate provisioned. Cool, right? Right?!?
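
As a rough illustration of what “an ingress specified” means, the sketch below builds such a manifest as a Python dictionary. Everything in it is an assumption made for the example: the host name, service name and issuer name are hypothetical, and the annotation key and API version follow current cert-manager and Kubernetes releases, which may differ from what our cluster actually runs.

```python
import json


def build_ingress(name: str, host: str, service: str, port: int, issuer: str) -> dict:
    """Build a Kubernetes Ingress manifest that asks cert-manager for a certificate."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            # cert-manager watches for this annotation and requests a certificate
            "annotations": {"cert-manager.io/cluster-issuer": issuer},
        },
        "spec": {
            # the issued certificate is stored in this secret
            "tls": [{"hosts": [host], "secretName": f"{name}-tls"}],
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {"service": {"name": service, "port": {"number": port}}},
                }]},
            }],
        },
    }


# Hypothetical values, for illustration only.
manifest = build_ingress(
    "example-site", "example.citizenscience.ch", "example-site", 80, "letsencrypt-prod"
)
print(json.dumps(manifest, indent=2))
```

The point is that the certificate is requested by a single annotation plus a tls entry; nobody has to create or renew it by hand.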

AutoDeploy

Managing deployments is never fun, and when your job is technical architect, full stack dev and DevOps all at once, you want to optimise this… This is why we created autodeploy.

This tool receives webhooks from Travis CI whenever a test suite passes and deploys that repo to the right namespace (based on the branch name, e.g. master -> production) with certificates and Docker images all configured for you.
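
The core decision can be sketched in a few lines of Python. This is not autodeploy’s actual code: the payload fields are a simplified, hypothetical shape rather than Travis CI’s real webhook format, and only the master -> production mapping comes from the description above; the other two rows are assumptions.

```python
from typing import Optional

# Only master -> prod comes from the text; the other two rows are assumptions.
BRANCH_TO_NAMESPACE = {
    "master": "prod",
    "staging": "staging",
    "develop": "test",
}


def target_namespace(payload: dict) -> Optional[str]:
    """Pick a namespace for a CI webhook payload, or None if nothing should deploy."""
    if payload.get("status") != "passed":
        return None  # never deploy a build whose tests did not pass
    return BRANCH_TO_NAMESPACE.get(payload.get("branch"))
```

A passing build on master would map to prod, while a failing build or an unknown branch returns None and nothing is deployed.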

Unsure if it worked? No worries, autodeploy can also send messages out using webhooks, so we can see the result in our RocketChat.

What’s Next?

A lot. Really a lot. First, we are working on handling multiple database connections and pooling them together for better performance.

Project Builder

We are also in the process of creating a Project Builder site that will allow citizens to:

create their own projects
discover and participate in other projects
import data from Twitter, Flickr and other services

Mobile App

Using the newly released Flutter framework, we also have a mobile app for both iOS and Android in the pipeline. If you want to be a beta tester, let us know!

What Do You Need?

Do you have questions about the platform? Ideas for potential Citizen Science projects? This platform is for you, so we are always happy to be in touch. Email us at: info@citizenscience.ch
