dockerized CLARIN DSpace

This repository provides an easy way to install CLARIN DSpace through a Docker setup that automates the installation as far as possible. We run it within Kubernetes, but try to keep the Docker Compose setup in working condition as well. All files that need to be customized or added are part of this repository or of the sister projects https://github.com/commul/clarin-dspace and https://github.com/commul/lindat-common (both forked from the corresponding UFAL repositories).

How to use it

You need either a Linux server (we tested on Ubuntu 16.04 and CentOS 7.3) with a recent Docker installation (we tested with 17.05.0-ce) or a Kubernetes cluster (we are using version 1.8). Clone this git repository onto the server. Then go through all the files in commul-customization and adapt the configuration to your server (domain name, user names, etc.).

None of the files contain passwords. Instead, passwords and other confidential information are passed to the containers through environment variables. In Kubernetes you can use secrets for this.
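
As a minimal sketch of this pattern, a start-up script could refuse to continue when one of the expected variables is missing. The function name require_env and the variable names in the usage line are hypothetical and not taken from this repository.

```shell
# Fail early if any required secret is missing from the environment.
# Returns 0 when every named variable is set and non-empty, 1 otherwise.
require_env() {
    missing=0
    for name in "$@"; do
        # Indirect variable lookup that also works in plain POSIX sh.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A script would call it before doing any work, for example `require_env POSTGRES_PASSWORD DSPACE_ADMIN_PASSWORD || exit 1` (both names are placeholders).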

Kubernetes

Prerequisites

Setup

All sensitive information is stored in Kubernetes secrets. If you want to use the Kubernetes yaml files as is, make sure that you create those secrets with the same secret and key names.
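
A secret can be created with kubectl create secret generic. The sketch below wraps that call in a function with a dry-run switch, so the command can be inspected before it is sent to the cluster. The namespace, secret and key names in the usage line are hypothetical; use the names that the yaml files in this repository actually reference.

```shell
# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create one generic secret holding a single key/value pair.
# Arguments: namespace, secret name, key name, value.
# Note: in dry-run mode the value is printed to the terminal.
create_secret() {
    namespace=$1 secret=$2 key=$3 value=$4
    run kubectl create secret generic "$secret" \
        --namespace "$namespace" \
        --from-literal="$key=$value"
}
```

For example, `DRY_RUN=1 create_secret clarin dspace-db password "$DB_PASSWORD"` only prints the kubectl call; without DRY_RUN it is executed.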

Workflow

Building docker images

Kubernetes pulls images from a registry, so you first need to build the Docker images locally and push them to a registry. We are using the registry feature of GitLab. Make sure your local Docker is set up so that it can upload images to the registry.

We try to keep the version numbers of all images in sync and use the script release.sh for this. You will need to edit this script and replace the registry URL with your own. The script takes the new version number as its argument, builds all Docker images and pushes them to the registry.

cd dockerfiles
./release.sh 1.2.3

By default this pushes the images to the staging branch of the registry. To build a production image, pass the extra argument production.

cd dockerfiles
./release.sh 1.2.3 production
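
The following is a rough sketch of what such a release step does. It is not the actual release.sh. It assumes that every subdirectory containing a Dockerfile is one image and that the staging or production channel is a path component of the image reference; both are assumptions, so check the real script for the layout it uses. The function names and the registry URL in the usage line are placeholders.

```shell
# Compose a full image reference.
# Assumed layout: <registry>/<channel>/<image>:<version>
image_ref() {
    registry=$1 image=$2 version=$3 channel=${4:-staging}
    echo "$registry/$channel/$image:$version"
}

# Build and push one image per subdirectory that contains a Dockerfile.
# Arguments: registry URL, version, optional channel (default: staging).
release_all() {
    registry=$1 version=$2 channel=${3:-staging}
    for dockerfile in */Dockerfile; do
        # Skip the unexpanded pattern when no Dockerfile is found.
        [ -f "$dockerfile" ] || continue
        dir=${dockerfile%/Dockerfile}
        ref=$(image_ref "$registry" "$dir" "$version" "$channel")
        docker build -t "$ref" "$dir" && docker push "$ref" || return 1
    done
}
```

With these definitions, `release_all registry.example.org/group/clarin-dspace 1.2.3 production` would tag and push each image under the production path.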

Deploying to Kubernetes

After pushing all images to your container registry, you can ask Kubernetes to pull and deploy them. You might need to create a personal token for logging into the GitLab registry and store it as a secret that you then reference in the Kubernetes yaml files. You need to edit the yaml files to suit your setup, in particular:

  • The image registry URL
  • The ceph setup
  • The name of your Kubernetes namespace

You only have to make these edits once, but for each new deployment you have to update the version numbers of the images in the deployment yaml files. This can be done with a simple sed:

sed -i 's/1\.2-RC1/1.2/' ../kubernetes/*deploy*
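
In a sed pattern an unescaped dot matches any character, so a small wrapper that escapes the dots in the old version number makes the substitution a little safer. The function below is a sketch with a hypothetical name (bump_version). It assumes the new version contains no characters that are special in a sed replacement, such as / or &.

```shell
# Replace an old version string with a new one in the given files.
# Arguments: old version, new version, one or more files.
bump_version() {
    old=$1 new=$2
    shift 2
    # Escape dots so that "1.2" cannot match e.g. "1a2".
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        # Write to a temporary file first, then copy the result back,
        # which keeps the permissions of the original file.
        sed "s/${old_re}/${new}/g" "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}
```

Usage would then be `bump_version 1.2-RC1 1.2 ../kubernetes/*deploy*`.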

The script start-kube.sh calls all necessary kubectl commands one after another. To bring everything down again, use stop-kube.sh instead. By default this doesn't touch the persistent volume claim. For a full redeploy that forgets all history, you have to delete the claim manually with kubectl:

kubectl delete -f pgdata-persistentvolumeclaim.yaml
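
Put together, a full redeploy could look like the sketch below. It assumes that it is run from the directory containing both scripts and the yaml file, and that start-kube.sh creates the persistent volume claim again; if it does not, the claim has to be recreated with kubectl before starting. The function names and the DRY_RUN switch are not part of this repository.

```shell
# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop everything, delete the database volume claim, start again.
# WARNING: deleting the claim discards the stored data.
full_redeploy() {
    run ./stop-kube.sh &&
    run kubectl delete -f pgdata-persistentvolumeclaim.yaml &&
    run ./start-kube.sh
}
```

Running `DRY_RUN=1 full_redeploy` first prints the three commands without executing them, which is a cheap check before data is actually deleted.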

ceph

secrets

Example workflow using Docker Compose (this might be outdated)

get Dockerfiles

git clone https://gitlab.inf.unibz.it/commul/docker/clarin-dspace/
cd clarin-dspace

create your own copies of the .dist files

cp password_mod.sh.dist password_mod.sh
cp commul-customization/init-dspace-dbs.sh.dist commul-customization/init-dspace-dbs.sh
cp commul-customization/local.properties.dist commul-customization/local.properties

change passwords

vi password_mod.sh
chmod +x password_mod.sh
./password_mod.sh

make sure the certificate and key are there

cp -r /tmp/certs ./commul-customization/

build the images and start the containers

docker-compose up -d --build

enter the DSpace container

docker exec -ti clarindspace_dspace_1 bash

deploy DSpace

make new_deploy

copy over modified aai_config.js

cp /tmp/aai_config.js /opt/lindat-dspace/installation/webapps/xmlui/themes/UFAL/lib/js/

create dspace admin as tomcat8, so that the log files have the right owner

su -s /bin/sh tomcat8
/opt/lindat-dspace/installation/bin/dspace create-administrator

start the dspace webapp

cd /opt/repository/sources/dspace/utilities/project_helpers/scripts
/etc/init.d/tomcat8 start