Making Public Health Data Accessible: CKAN-based Data Access Platform
Datopian
Brief summary of the project.
Vital Strategies, in collaboration with Datopian, developed VIA Data - a data access platform based on CKAN. VIA Data makes public health data more accessible by converting complex datasets into interactive, visual, and easy-to-understand formats. The platform helps public health practitioners, researchers, and the general public use data effectively.
Public health and epidemiology data, although extensively collected, often remain underutilized. Many countries gather vast amounts of data that end up languishing on servers or in Excel spreadsheets, rarely seeing the light of day. This data, which could be invaluable for researchers, journalists, and the general public, remains inaccessible due to the lack of user-friendly tools to share it effectively. VIA Data was conceived to address this gap.
Vital Strategies needed a tool that could make public health data more accessible, interactive, and understandable. Such a tool would enable public health practitioners, researchers, and the general public to use data effectively for informed decision-making and improved health outcomes.
VIA Data, which stands for Visual, Interactive, and Accessible Data, addresses this need by providing a public, open-source CKAN extension that simplifies data access and visualization. It features a data access portal and an interactive report platform, allowing users to easily find, analyze, and share public health data. By transforming raw data into compelling stories and interactive visualizations, VIA Data enhances data usability and impact, ultimately contributing to better public health systems worldwide. By customizing the default CKAN UI, we created visually rich dashboards and easy-to-use Query Tools with multiple forms of visualizations. Popular libraries like Plotly, Leaflet, and DataTables were used to create interactive and customizable data visualizations. With a clean and organized interface, the portal makes it easy and enjoyable for both admins and public users to share public health data visually.
Context
Vital Strategies is an international nonprofit that assists governments by designing easily scalable solutions to the many challenges found in and around public health. Public health systems cover a wide range of areas, from food safety regulations to the handling of the data that informs those regulations and policies. Its main mission is to work in partnership to reimagine evidence-based, locally driven policies and practices that advance public health.
Since its founding in 2016, Vital Strategies has rapidly expanded its reach and impact across a range of initiatives. One of these is the Data for Health Initiative, now in its eighth or ninth year. Launched in collaboration with Bloomberg Philanthropies and the Australian Department of Foreign Affairs and Trade, the initiative aims to optimize the use of health data, which often resides unused on servers or in spreadsheets.
Headquartered in New York, the organization works with governments all over the world, with regional offices in New York (United States), Addis Ababa (Ethiopia), Paris (France), Jinan (China), São Paulo (Brazil), New Delhi (India), and Singapore.
The situation
Vital Strategies aimed to make their extensive public health data more accessible and comprehensible. Public health data often remains underutilized on servers or in spreadsheets. This is a common problem with large quantities of data: it is collected, but it is not accessible enough to be used effectively. The challenge was to convert this data into interactive formats that could be easily shared and understood by a general audience.
The objective was clear: to build a user-friendly portal equipped with tools that facilitate seamless access and visualization of public health data. By improving data accessibility, the project aimed to enable the public, researchers, journalists, and government officials to focus on their core work and goals, minimizing the time spent on searching for and visualizing data.
The criteria
The criteria for this project were clear, minimal, and focused. The solution needed to be:
- Open-source
- User-friendly
- Capable of handling complex datasets
- Able to ensure data privacy and security
- Easily deployable within various IT environments
Also, there were two main requirements:
- Compelling, visually rich dashboards presenting data-driven insights on broad topics
- Easy-to-navigate Query Tools to quickly answer questions with data, using multiple forms of visualizations
The solution
To address Vital Strategies' needs, we developed VIA Data - a platform that allows public health organizations to publish Visual, Interactive, and Accessible data reports from a wide range of health topics.
VIA = Visual, Interactive, and Accessible
VIA Data is tailored for use by public health professionals within Ministries of Health, Institutes of Public Health, and provincial Departments of Health. This extension eliminates the need for advanced technical knowledge or expensive software licenses, making it an ideal solution for public health data dissemination.
Simplified Data Sharing
By leveraging open-source technology, VIA Data makes it easy to publish existing data in a visually appealing manner. It transforms static tables into interactive visualizations, allowing users to explore data dynamically without requiring logins or multiple credentials.
Key Functions of VIA Data
VIA Data serves two primary functions: a data access portal and an interactive report platform.
1. Data Access Portal:
- Ease of Access: The portal simplifies finding specific facts from large datasets. It presents key data points through visualizations and simple text interpretations, allowing users to download data easily.
- Interactive Visualizations: Instead of static tables, the portal offers interactive visualizations. Users can select different indicators, compare data across various demographics, and gain insights quickly.
This portal ensures that complex data is easily navigable and understandable.
2. Interactive Reports:
- Storytelling with Data: The interactive report function allows for more detailed explanations and storytelling with data. Reports transform raw data into compelling stories that both inform and engage users, making data analysis more meaningful and impactful. Users can create comprehensive reports that guide the audience through the data, highlighting significant trends and insights.
Key characteristics of VIA Data tools
- Public tools: Data must be ready to be shared publicly.
- Interactive: Best suited to exploring large amounts of data through fields that act as filters.
What VIA Data is not
- Operational dashboarding tool: Not suited to sensitive or constantly changing data.
- Infographic designing tool: Visual elements cannot be highly customized to show a specific data point.
Main components of the platform
- Manage data (CKAN): A tool to upload and store data that can then be used to author reports for the public.
- Design interactive data reports: Offers an authoring environment to create interactive user-friendly reports that include health data, collections of data visualizations, and supporting elements (text and images).
Here’s how we did it:
Home page
To improve navigation and accessibility, customizations were implemented to override the default CKAN UI. The home page is now minimal and focused for both admins and public users, displaying only the groups found on the portal.
Group page
CKAN groups are used to organize Query Tools. Navigating to a group shows the list of Tools that belong to it.
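As a rough illustration of how the groups on a CKAN portal can be listed programmatically, the sketch below builds a request URL for CKAN's Action API (`group_list`) and unwraps a response. The portal URL and the sample response are made up for this example, and nothing here is taken from the VIA Data codebase; how Query Tools themselves are attached to groups is specific to the extension and is not shown.

```python
import json
from urllib.parse import urlencode


def build_action_url(base_url: str, action: str, **params) -> str:
    """Build a CKAN Action API (v3) URL, e.g. .../api/3/action/group_list."""
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    if params:
        url += "?" + urlencode(params)
    return url


def unwrap_result(payload: dict):
    """Return the 'result' field of a CKAN API response, or raise on failure."""
    if not payload.get("success"):
        raise RuntimeError(f"CKAN API call failed: {payload.get('error')}")
    return payload["result"]


# Hypothetical portal URL and a hand-written sample response (not real data).
PORTAL = "https://portal.example.org"
sample_response = json.loads("""
{"success": true,
 "result": [{"name": "tobacco", "display_name": "Tobacco"},
            {"name": "road-safety", "display_name": "Road Safety"}]}
""")

# all_fields=true asks CKAN for full group dictionaries instead of bare names.
groups_url = build_action_url(PORTAL, "group_list", all_fields="true")
group_names = [group["display_name"] for group in unwrap_result(sample_response)]

print(groups_url)
print(group_names)
```

In a real deployment the URL would be fetched over HTTP and the JSON body passed to `unwrap_result`; the network call is left out here so the sketch runs offline.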
Visualization dashboard
Finally, once a Tool is selected, its visualization dashboard is shown.
To create the visualizations and their dashboards, several JavaScript libraries were used. In very early versions, D3 rendered the bar, line, and pie charts, but Plotly proved to fit the demands of the project much better. Leaflet is used for maps, and DataTables handles tables.
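To make the charting step more concrete, here is a minimal sketch of how tabular rows might be turned into a Plotly figure specification (a `data` list of traces plus a `layout` object) on the server side before being handed to the browser. The function name, field names, and sample rows are assumptions made for illustration and are not VIA Data's actual implementation.

```python
import json


def bar_figure(rows, x_field, y_field, title):
    """Return a Plotly-style figure spec (data + layout) for a bar chart."""
    return {
        "data": [{
            "type": "bar",
            "x": [row[x_field] for row in rows],
            "y": [row[y_field] for row in rows],
        }],
        "layout": {
            "title": {"text": title},
            "xaxis": {"title": {"text": x_field}},
            "yaxis": {"title": {"text": y_field}},
        },
    }


# Made-up rows used purely for illustration.
rows = [
    {"age_group": "15-24", "prevalence": 12.5},
    {"age_group": "25-44", "prevalence": 18.0},
    {"age_group": "45-64", "prevalence": 15.3},
]

figure = bar_figure(rows, "age_group", "prevalence", "Example indicator by age group")
figure_json = json.dumps(figure)  # serialized form that a page template could embed
print(figure_json)
```

On the client, a spec of this shape can be passed to Plotly.js as `Plotly.newPlot(element, figure.data, figure.layout)`. Keeping the spec as plain JSON means the chart type can be switched by changing the trace's `type` field.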
Key Features & Functionalities of VIA Data
- Customizable topics: Users can customize icons, names, and create folders or groups to organize data.
- Customizable reports and multiple reports per topic: Options to create detailed reports with text, images, and tables. Each topic can include multiple reports, such as one showing alcohol consumption broken down by different sociodemographic characteristics.
- Interactive features: Users can select different indicators, and the data updates dynamically. They can download the data in various formats, share the link, or embed the view elsewhere.
- Map functionality: Users can upload geographic boundaries and link them to datasets to create color-coded maps, such as mortality data across Brazilian states. Boundaries are supplied as GeoJSON, which makes it straightforward to map new regions.
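The map feature described above amounts to joining a dataset to boundary features and binning the values into colors. The sketch below shows one way that join could work, assuming each GeoJSON feature carries a region identifier in a `code` property. The property name, break points, palette, and sample data are all hypothetical, and this is not the extension's actual code.

```python
def attach_values(feature_collection, values, key="code"):
    """Copy each region's value onto its GeoJSON feature (None when missing)."""
    features = []
    for feature in feature_collection["features"]:
        props = dict(feature.get("properties") or {})
        props["value"] = values.get(props.get(key))
        features.append({**feature, "properties": props})
    return {"type": "FeatureCollection", "features": features}


def color_for(value, breaks, palette, missing="#cccccc"):
    """Pick the colour of the first bin whose upper break is >= value."""
    if value is None:
        return missing
    for upper, color in zip(breaks, palette):
        if value <= upper:
            return color
    return palette[-1]  # above every break: use the last colour


def region(code):
    """Build a tiny placeholder polygon feature for the example."""
    ring = [[0, 0], [1, 0], [1, 1], [0, 0]]
    return {"type": "Feature", "properties": {"code": code},
            "geometry": {"type": "Polygon", "coordinates": [ring]}}


# Made-up boundaries and values; region "C" deliberately has no data.
boundaries = {"type": "FeatureCollection",
              "features": [region("A"), region("B"), region("C")]}
values = {"A": 4.2, "B": 9.8}
breaks = [5.0, 10.0]                         # upper bounds of the first two bins
palette = ["#fee8c8", "#fdbb84", "#e34a33"]  # one more colour than breaks

joined = attach_values(boundaries, values)
colors = {f["properties"]["code"]: color_for(f["properties"]["value"], breaks, palette)
          for f in joined["features"]}
print(colors)
```

A Leaflet layer could then read the `value` property in the `style` callback of `L.geoJSON` to fill each region. Regions without data get a neutral gray so that gaps stay visible rather than being mistaken for low values.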
For internal users, creating reports involves:
- Common CKAN data repository: Uploading data, adding resources, and metadata.
- Creating reports: Entering titles, descriptions, and icons, organizing data, and selecting datasets, numeric variables, and public filters.
- Five types of visualizations: Charts, maps, text boxes, images, and tables. Charts, being the most used, offer extensive customization options and come in a wide range of types: admins can choose from bar, horizontal bar, stacked bar, stacked horizontal bar, line, area, scatter, spline, donut, and pie. For each visualization type, multiple customization options are available in the UI.
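The report settings listed above (dataset, numeric variable, public filters, chart type) can be thought of as a small configuration object. The following sketch models that idea under assumed names: `ReportConfig`, `apply_filters`, and the sample fields are illustrative only, while the set of chart types mirrors the list given above.

```python
from dataclasses import dataclass, field

# Chart types named in the text above.
CHART_TYPES = {
    "bar", "horizontal bar", "stacked bar", "stacked horizontal bar",
    "line", "area", "scatter", "spline", "donut", "pie",
}


@dataclass
class ReportConfig:
    """Hypothetical container for the settings an admin chooses for a report."""
    title: str
    dataset: str
    numeric_variable: str
    chart_type: str = "bar"
    public_filters: list[str] = field(default_factory=list)

    def __post_init__(self):
        if self.chart_type not in CHART_TYPES:
            raise ValueError(f"Unsupported chart type: {self.chart_type!r}")


def apply_filters(rows, config, selections):
    """Keep rows matching the user's selections, honouring only public filters."""
    active = {k: v for k, v in selections.items() if k in config.public_filters}
    return [row for row in rows
            if all(row.get(k) == v for k, v in active.items())]


# Made-up rows and selections for illustration.
rows = [
    {"year": 2020, "sex": "Female", "rate": 10.1},
    {"year": 2020, "sex": "Male", "rate": 14.7},
    {"year": 2021, "sex": "Female", "rate": 9.6},
]
config = ReportConfig(title="Example report", dataset="example-dataset",
                      numeric_variable="rate", chart_type="line",
                      public_filters=["sex"])

# "year" is not a public filter in this config, so that selection is ignored.
selected = apply_filters(rows, config, {"sex": "Female", "year": 2020})
series = [row[config.numeric_variable] for row in selected]
print(series)
```

Restricting filtering to the fields an admin has marked as public keeps the viewer-facing controls in line with what the report author intended to expose.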
On top of all of that, the project continues to evolve and improve, always striving to make the experience better for data consumers.
The outcome
The current iteration of this project makes it easy to share public health data visually. It offers a comprehensive list of options to clearly highlight the most important aspects of data. With clean and organized visualization dashboards, the experience as a public user is just as pleasurable as that of an admin.
What’s next?
For now, there’s no end in sight. Organizations continue to discover and adopt this project, and the growing user base brings no shortage of ideas for new features and improvements. The platform keeps reducing the time that both the admins creating visualizations and the public users consuming them need to reach their goals.