Open Data Denmark's Open Data Portal: Multilingual Support & Security
Datopian
Brief summary of the project.
We collaborated with Open Data DK, a Danish open data initiative, to modernize and upgrade their existing CKAN classic portal. We focused on enhancing usability, security, and multilingual support while integrating their blog content and streamlining their infrastructure.
Open Data DK was operating on an aging CKAN classic portal that lacked a decoupled front end, making it less user-friendly. While they were satisfied with CKAN's data management, the need for an integrated CMS for blog and news updates and reduced infrastructure overhead was clear.
Open Data DK required a solution that would securely store data as per ISO/IEC 27001 and GDPR compliance standards. They also needed Civil Personal Registration (CPR) validation to protect sensitive names and addresses, seamless integration with WordPress blogs, and WCAG 2.1 compliance for accessibility.
We upgraded Open Data DK's portal beyond the initial criteria. The data is hosted on Google Cloud for ISO/IEC 27001 compliance, and a user-configurable, GDPR-friendly popup lets visitors manage their data collection preferences. A custom CKAN extension checks resources for Civil Personal Registration (CPR) data. WordPress posts are integrated into the portal, and the UI follows WCAG 2.1. Decoupling the admin and public UIs enhances security. We automated metadata translations through an IBM Watson-driven extension, supporting Danish, English, and French. We modernized the front end using our frontend-v2. Hosting was migrated to Datopian, and we offer continuous deployment and support. The result is a secure, accessible, and powerful platform that keeps users updated while they explore various open data topics.
Context
Open Data DK is a joint effort between a number of Danish municipalities and other public authorities. Their primary goal is to provide an easy and accessible open data platform for public data that does not belong on other open data platforms, as well as to collaborate on central themes such as standardization and requests for data. They collaborate with the Danish Agency for Digitalisation and KL to create better access to data from the Danish public sector.
Their data covers many topics. For example, you can find data about technology, the environment, health, schools, tourism, traffic, housing, politics, and weather (this is a tiny sample of the variety of data on opendata.dk).
The situation
When Open Data DK approached Datopian (Viderum at the time), they already had an older CKAN classic portal (without a decoupled front end). While happy overall with CKAN’s data management capabilities, their portal was beginning to show its age.
Along with the need to update CKAN, they wanted CMS integration to seamlessly provide their blog and news posts directly on the portal.
Finally, they wanted to reduce their infrastructure overhead by migrating to a single hosting service.
The criteria
- Securely store data per ISO/IEC 27001
- GDPR Compliance
- Names and addresses under protection in the Civil Personal Registration must be stored together with the relevant data
- Integration with WordPress blogs
- WCAG 2.1 (accessibility) compliance
The solution
Along with the original criteria, additional requirements popped up along the way as the client continued to pursue the best experience for both their internal and public users—for example, automated metadata translation, improving the public portal UI, and more.
Secure Data Storage
The first criterion required little additional work. The data lives on Google Cloud, whose infrastructure is certified against ISO/IEC 27001.
GDPR Compliance
For GDPR compliance, users must select their preferred level of cookie-based data collection before using the portal. The selection is managed in a popup:
It includes additional details to explain each level:
The user can access and change their selection at any time by clicking on the icon in the bottom left corner:
CPR Validation
To handle requirement 3 (names and addresses under protection in the Civil Personal Registration must be stored together with the relevant data), a custom CKAN extension (ckanext-cprvalidation) was developed to scan resources for CPR numbers, validate them, and set their datasets to private if they’re exposing this personal information. This is done automatically in the background.
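The scan-and-validate step can be illustrated with a minimal Python sketch. This is not the code of ckanext-cprvalidation; the function names, the regular expression, and the decision rule are assumptions made for illustration. It relies only on the public format of a CPR number (ten digits, DDMMYY followed by a four-digit sequence number) and on the traditional modulus-11 check. The modulus-11 check is left optional here, on the understanding that not every issued number satisfies it.

```python
import re

# Ten digits, optionally written as DDMMYY-SSSS, not embedded in a longer digit run.
CPR_PATTERN = re.compile(r"(?<!\d)(\d{6})-?(\d{4})(?!\d)")
MOD11_WEIGHTS = (4, 3, 2, 7, 6, 5, 4, 3, 2, 1)
# February allows 29 days because the century (and so the leap year) is not checked here.
DAYS_IN_MONTH = (31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def has_plausible_date(digits):
    """Check that the first four digits form a possible day and month."""
    day, month = int(digits[0:2]), int(digits[2:4])
    return 1 <= month <= 12 and 1 <= day <= DAYS_IN_MONTH[month - 1]


def passes_mod11(digits):
    """Traditional CPR checksum: the weighted digit sum is divisible by 11."""
    return sum(int(d) * w for d, w in zip(digits, MOD11_WEIGHTS)) % 11 == 0


def find_cpr_numbers(text, require_mod11=False):
    """Return the ten-digit strings in `text` that look like CPR numbers."""
    hits = []
    for match in CPR_PATTERN.finditer(text):
        digits = match.group(1) + match.group(2)
        if not has_plausible_date(digits):
            continue
        if require_mod11 and not passes_mod11(digits):
            continue
        hits.append(digits)
    return hits


def should_make_private(resource_text):
    """Hypothetical decision rule: any plausible CPR number hides the dataset."""
    return bool(find_cpr_numbers(resource_text))
```

In a background job, a check like `should_make_private` would run over each resource's contents, and a positive result would trigger the switch to private described above.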
WordPress Integration
Thanks to the built-in support for WordPress in frontend-v2 (the decoupled front end that we developed), most of the work required here was simply theming and styling the blog posts within the portal. This provides a seamless experience for users while exploring the data and related news or updates.
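To give a rough idea of what pulling posts into a portal involves, here is a minimal Python sketch. It assumes a WordPress site that exposes the core REST API route `/wp-json/wp/v2/posts`; frontend-v2 may well use a different client or endpoint, and the function names and the `blog.example.org` host are hypothetical.

```python
import html
import re
from urllib.parse import urlencode

TAG_PATTERN = re.compile(r"<[^>]+>")


def latest_posts_url(site_url, count=3):
    """Build the REST API URL for the most recent posts (newest first)."""
    query = urlencode({"per_page": count, "orderby": "date", "order": "desc"})
    return f"{site_url.rstrip('/')}/wp-json/wp/v2/posts?{query}"


def plain_text(rendered):
    """Strip HTML tags and decode entities from a rendered WordPress field."""
    return html.unescape(TAG_PATTERN.sub("", rendered)).strip()


def to_preview(post):
    """Reduce one post object from the API to the fields a preview card needs."""
    return {
        "title": plain_text(post["title"]["rendered"]),
        "excerpt": plain_text(post["excerpt"]["rendered"]),
        "link": post["link"],
        "date": post["date"],
    }
```

The front end would fetch the URL returned by `latest_posts_url`, map each item through `to_preview`, and render the results with the portal's own theme.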
For example, the latest blog posts are displayed at the bottom of the home page:
Clicking on “View all news” will take a user to the complete list of posts, with previews:
Additionally, the magnifying glass icon in the top right corner will take users to a dedicated search page for content:
WCAG Accessibility Compliance
An important aspect throughout the iterations of the portal is accessibility. The Web Content Accessibility Guidelines (WCAG) provide a comprehensive framework for ensuring digital inclusivity. Following these guidelines helps keep the portal compatible with assistive technologies and usable by a wide audience, covering everything from alternative text for screen readers to sufficient color contrast ratios.
As the portal continues to improve and evolve, accessibility compliance follows along. It’s not a one-off task but an ongoing process.
Additional Changes and Improvements
Decoupled front end
Initially, the public users and administrators used the same portal, domain, and UI, which is the default in CKAN. For many organizations, that’s perfectly fine, but having a decoupled front end is becoming more common, which is what was done in this case (using Datopian’s frontend-v2).
Decoupling means administrators use the original CKAN portal to manage data, while public users access the data on a separate portal. This keeps the public portal clean, minimal, and focused on the best user experience for searching and viewing data and blog posts.
On the technical side, decoupling provides a far more manageable and less constrained experience while designing modern UIs because the front end retrieves the data from the admin portal via APIs. This means designs, layouts, styles, and themes are endlessly customizable, as the CKAN backend remains unchanged. By comparison, components in a default CKAN instance—though still highly customizable—are tightly interconnected, limiting how far customization can deviate from the existing design.
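As an illustration of that API boundary, the sketch below builds a request to CKAN's Action API (`package_search`) and reduces the JSON response to what a search page needs. It is written in Python for brevity rather than in the front end's own JavaScript, and the `admin.example.org` host and function names are hypothetical.

```python
import json
from urllib.parse import urlencode


def package_search_url(ckan_url, query, rows=10, start=0):
    """Build a CKAN Action API URL for a dataset search."""
    params = urlencode({"q": query, "rows": rows, "start": start})
    return f"{ckan_url.rstrip('/')}/api/3/action/package_search?{params}"


def parse_search_response(body):
    """Return (total count, list of dataset summaries) from a package_search reply."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    result = payload["result"]
    datasets = [
        {"name": d["name"], "title": d.get("title") or d["name"]}
        for d in result["results"]
    ]
    return result["count"], datasets
```

Because the front end only depends on this JSON contract, its templates and styles can change freely without touching the CKAN instance behind it.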
PortalJS 🌀 is a JavaScript framework for building rich data portal front ends fast using a modern front end approach (JavaScript, React, SSR). It is the next step in the evolution of this codebase. See more: 👉 PortalJS GitHub
Preventing Anonymous Access to the Admin Portal
Along the journey, the decision was made to prevent public users from accessing the CKAN portal for increased security and to prevent additional anonymous traffic.
Luckily, Datopian had already developed an extension to handle this, ckanext-noanonaccess. The extension redirects every URL to the login page if a user isn't logged in. This makes the CKAN portal truly for administrators and organization members only, while all public users use the decoupled public portal.
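The redirect rule itself is simple enough to sketch in a few lines of Python. This is an illustration of the idea, not the source of ckanext-noanonaccess; the function name and the exemption list are assumptions. The login page has to be exempt, otherwise anonymous visitors would be redirected in a loop.

```python
LOGIN_PATH = "/user/login"


def login_redirect(path, username, exempt_prefixes=(LOGIN_PATH,)):
    """Return the redirect target for a request, or None to let it through.

    `username` is None or empty for anonymous visitors.
    """
    if username:
        return None  # logged-in users are never redirected
    if any(path.startswith(prefix) for prefix in exempt_prefixes):
        return None  # e.g. the login page itself
    return LOGIN_PATH
```

A real deployment would hook a check like this into the request cycle and would also need to decide which other paths, if any, stay reachable without a session.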
Automated Translation of Dataset Metadata
Using IBM Language Translator, a new CKAN extension was developed to provide automated language translation of dataset metadata fields—title and description—from Danish to English and French. The new extension, ckanext-translate, handles the translating, while ckanext-fluent handles the multilingual support within the dataset metadata schema.
Since this process takes place behind the scenes, there’s no additional burden placed on the data maintainers. The translations are automatically added to the dataset metadata object (see notes_translated and title_translated):
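The shape of those fields can be sketched in Python as follows. The `title_translated` and `notes_translated` keys hold one entry per language code, which is how ckanext-fluent models multilingual text. The `add_translations` function and its `translate` callable are hypothetical stand-ins for ckanext-translate and the IBM service, not their actual interfaces.

```python
def add_translations(dataset, translate, source="da", targets=("en", "fr")):
    """Return a copy of `dataset` with fluent-style translated fields added.

    `translate(text, source_lang, target_lang)` is a placeholder for the
    call to an external translation service.
    """
    enriched = dict(dataset)
    for field in ("title", "notes"):  # "notes" is CKAN's description field
        original = dataset.get(field, "")
        translated = {source: original}
        for target in targets:
            # Skip the service call when there is nothing to translate.
            translated[target] = translate(original, source, target) if original else ""
        enriched[f"{field}_translated"] = translated
    return enriched
```

With a dummy translator such as `lambda text, src, dst: f"[{dst}] {text}"`, a dataset titled "Vejrdata" gains a `title_translated` field with `da`, `en`, and `fr` entries, while the input dictionary is left unchanged.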
With these translations in place, when a public user visits opendata.dk, they will see data displayed in their browser’s default language. For example, in English:
And then in the original language, Danish:
This provides a seamless experience for Danish, English, and French-speaking users without requiring manual intervention.
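One way to pick the display language is to read the browser's `Accept-Language` header and fall back to Danish when none of the supported languages is requested. The Python sketch below illustrates that approach under stated assumptions: the function names are hypothetical, and the portal's actual selection logic is not described in this case study.

```python
def preferred_language(accept_language, supported=("da", "en", "fr"), default="da"):
    """Pick the best supported language from an Accept-Language header value."""
    ranked = []
    for position, part in enumerate(accept_language.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        params = params.strip()
        quality = 1.0
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        ranked.append((-quality, position, tag.strip().lower()))
    # Highest quality first; header order breaks ties.
    for _, _, tag in sorted(ranked):
        primary = tag.split("-")[0]  # "fr-ca" counts as "fr"
        if primary in supported:
            return primary
    return default


def localized(field, language, fallback="da"):
    """Read one language from a *_translated field, falling back to the original."""
    return field.get(language) or field.get(fallback, "")
```

For example, `localized(dataset["title_translated"], preferred_language(header))` returns the translated title when one exists and the Danish original otherwise.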
2023 Front end Redesign
The latest work on the portal was a sort of redesign—mostly the home page, but there were also changes to the search pages. This included resizing text (e.g., the main site tagline), elements (e.g., the search bar), etc. The client was deeply involved in the process, which showcases their drive to continuously improve the user experience. They provided a detailed design in Figma and worked closely with Datopian throughout each iteration by providing quick and clear feedback.
The most notable changes were what a user sees when first landing on the site. Here’s the original (note the large tagline text and the small search bar on the right):
Now, the tagline is consolidated into a single line (and includes dynamically changing text: the last word cycles through “innovation”, “testing”, and “research”, then back to “innovation”), and the search bar takes center stage:
Next, as you can see in the image above, the groups and organizations are easily accessible using the tabs/buttons (defaults to “Emner”, which are groups):
You can switch between them by clicking the tab/button (“Dataudstillere” are organizations):
At the bottom of the home page, the blog posts are presented in a cleaner way, and the banner with previews below them has been removed (the banner was a bit redundant and offered no additional benefit). Here’s the original:
And here’s the updated version:
To make searching easier, the dataset and content (blog, news, etc.) search pages now have a tab/button to switch between search types. Here’s the original dataset search page:
Now, with the navigation buttons, users can quickly switch between them:
Lastly, the search bar functionality on the home page has dramatically improved for mobile users, now providing live-updated results as users type (every character typed updates the displayed results, refining the search on the fly):
The outcome
The project addressed Open Data DK's core requirements and added features that improved the experience for both administrators and public users. The platform has become a valuable resource for open data, accessible to a varied audience that includes government officials, researchers, and the general public. The decoupled CKAN admin portal allows for efficient behind-the-scenes data management. Moving the hosting to Datopian's infrastructure kept security high while substantially reducing infrastructure overhead. Automated translation of metadata extends the platform's reach beyond Danish speakers, in line with Open Data DK's mission of broad data dissemination. Overall, the project was both a technical success and a strategic win that strengthens Open Data DK's influence and efficiency.
What’s next?
Datopian continues to support the platform, but no significant plans are currently in the pipeline. However, the most recent work of refreshing the front end UI with quality-of-life improvements indicates that Open Data DK is always looking for the next step in its journey to provide the best experience for its users.
Update (2024): As part of our continued collaboration with Open Data Denmark, Datopian recently completed a migration project, moving their platform from Google Cloud to Hetzner for improved cost-efficiency, performance, and GDPR compliance.
Check out our case study: Cost-Efficient and GDPR-Compliant: Datopian Migrates Open Data Denmark from Google Cloud to Hetzner