Enhancing Data Operations: Datopian’s DaaS Solution for a Fortune 500 Logistics Leader
Datopian
Fortune 500 logistics company
To streamline global operations, a Fortune 500 logistics company partnered with Datopian for a comprehensive postal code solution. By sourcing and standardizing data from hundreds of countries, Datopian enabled seamless integration with the company's systems, optimizing route planning and enhancing logistics accuracy.
The client struggled to manage a vast array of open datasets due to challenges in resource allocation, data quality, customization, and licensing complexities. Building an in-house data team was costly and posed operational risks.
They required a scalable, cost-effective solution for sourcing, processing, and customizing open data, along with ongoing support to ensure reliability and minimize operational disruptions.
Datopian provided a tailored DaaS service, including automated ETL pipelines, bespoke data customization, and dedicated support. By managing compliance and delivering high-quality datasets seamlessly integrated into their workflows, we enabled the client to focus on their core operations with confidence.
Context
Our client, a leading Fortune 500 company, sought reliable and comprehensive datasets to support their global operations. They required reference data such as country codes, time zones, and other geopolitical information, alongside custom data processing capabilities. Despite the availability of many open datasets, the client needed a robust, streamlined process to access, manage, and customize this data efficiently.
Datopian enabled the client’s data team to build data products and fuel the rest of the enterprise with essential information in a standardized and high-quality fashion. This support empowered the client to make informed decisions and maintain operational efficiency across their global operations.
The Challenge
The client faced a complex yet relatable challenge: wrangling a massive array of openly available datasets while keeping costs, quality, and sanity in check. Here’s what they were up against:
- Resource Allocation: Building an in-house data engineering dream team to extract, process, and update datasets might sound great on paper, but the numbers told a different story. A team of 2-3 engineers and a manager would cost roughly 250,000 annually, and that is before accounting for inherent risks such as potential delays, technical issues, and ongoing personnel management.
- Data Quality and Reliability: Open data is fantastic - until it isn’t. The client needed up-to-date, reliable data they could trust. They also required custom, business-specific datasets on top of the publicly available ones, plus a reliable point of contact to sort out any pesky data errors or inconsistencies.
- Customization: Although much of the data was openly available on platforms like Datahub.io, they required customized schemas and integrations tailored to their business needs.
- Compliance and Licensing: Who wants to spend hours untangling licensing agreements or processing open data when there are better things to do? The client needed a hands-off solution that would handle the licensing red tape, along with the sourcing and processing of openly available data, freeing them to focus on their core business.
Check out our related case study: Delivering a Worldwide Postal Code Dataset for Global Logistics Fortune 500 Logistics Company
The Solution
Datopian stepped in to provide a tailored Data-as-a-Service (DaaS) solution that combined technical expertise with a deep understanding of the client’s unique needs. Our approach ensured the client could focus on their core business while we handled the complexities of data management. Here’s how we delivered:
- Comprehensive Data Delivery: Over 30 data tables were delivered in CSV format, encompassing a wide range of geopolitical datasets, such as airport codes, country codes, time zones, and more. Datopian’s solution provided regular updates to keep the data current and relevant. To align with the client’s internal requirements, the data was delivered via FTP, ensuring seamless integration with their existing systems.
- ETL System: We built an Extract, Transform, and Load (ETL) system that automates the extraction of data from various open sources. This system normalizes, cleans, and transforms the data, consolidating it into a format aligned with the client’s specifications. We described each dataset using the Frictionless Data metadata specification to ensure consistency and interoperability.
- Data Curation and Customization: While most of the data was sourced from open platforms like Datahub.io, Datopian curated and customized these datasets to meet the client’s specific needs. This included modifying existing schemas and creating additional bespoke datasets.
- Support and Reliability: A dedicated support team was established to address any data-related issues, ensuring the client has a reliable point of contact. This level of support is crucial for enterprise clients who rely on accurate data for their daily operations.
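The extract, transform, and load steps described above can be sketched in a few lines of Python. This is a minimal illustration, not Datopian's actual pipeline: the function names and the `COLUMN_MAP` renaming table are hypothetical, the source column names are borrowed from the country codes metadata shown later in this case study, and the target names from the sample table. Delivery over FTP is left out.

```python
import csv
import io

# Hypothetical mapping from open-source column names to the client's schema.
COLUMN_MAP = {
    "ISO3166-1-Alpha-2": "Alpha_2_Country_code",
    "ISO3166-1-Alpha-3": "Alpha_3_Country_code",
    "official_name_en": "Country_Name_English",
}


def extract(raw_csv: str) -> list[dict]:
    """Parse raw CSV text (as fetched from an open source) into dict rows."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows: list[dict]) -> list[dict]:
    """Rename columns, trim whitespace, upper-case codes, drop rows without an alpha-2 code."""
    cleaned = []
    for row in rows:
        record = {
            target: (row.get(source) or "").strip()
            for source, target in COLUMN_MAP.items()
        }
        if not record["Alpha_2_Country_code"]:
            continue  # skip rows that cannot be keyed
        record["Alpha_2_Country_code"] = record["Alpha_2_Country_code"].upper()
        record["Alpha_3_Country_code"] = record["Alpha_3_Country_code"].upper()
        cleaned.append(record)
    return cleaned


def load(rows: list[dict]) -> str:
    """Serialize the cleaned rows as CSV text in the client's column order."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(COLUMN_MAP.values()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def run_pipeline(raw_csv: str) -> str:
    return load(transform(extract(raw_csv)))
```

Keeping the three stages as separate functions means a schema change requested by the client only touches `COLUMN_MAP` or `transform`, while extraction and delivery stay untouched.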
Value Delivered
Partnering with Datopian transformed the client’s data operations, delivering measurable benefits that extended beyond cost savings. Here’s how we empowered their business:
- Significant Cost Savings: By outsourcing their data engineering needs to Datopian, the client avoided the significant costs of building and maintaining an in-house team. The estimated annual savings were in the region of 250,000, without factoring in the potential risks of managing an internal operation.
- Reliable and Up-to-Date Data: Datopian’s automated ETL system ensured that the client received up-to-date and accurate datasets. The regular updates and cleaning processes provided the client with confidence in their data's integrity.
- Bespoke Data Tailored to Business Needs: Our team tailored the data delivery to fit the client's specific needs, modifying schemas and integrating bespoke datasets. This customization enabled the client to use the data directly in their workflows, enhancing operational efficiency.
- Reduced Complexity and Risk: With Datopian handling data sourcing, processing, and licensing considerations, the client could focus on their core business operations. This reduced the complexity and risk associated with managing open data.
- Enterprise-Grade Support and Responsiveness: Having a dedicated team for data support meant the client could quickly resolve any issues, minimizing disruptions to their operations.
Agile Collaboration and Communication
To ensure continuous alignment with the client’s evolving needs, Datopian adopted an agile delivery model for this service. This approach fostered a close, collaborative relationship between our teams, enabling effective communication and rapid response to changes.
- Regular Standups: We conducted regular meetings with the client’s team, similar to standups, to discuss ongoing progress, identify potential or current issues, and plan for upcoming tasks. These meetings facilitated transparency, kept both teams aligned, and ensured smooth data delivery.
- Team Structure: Our agile team consisted of a diverse set of roles, each contributing to the seamless execution of the data service:
  - Project Manager: Oversaw the project's progress, managed timelines, and served as the main point of contact for the client, ensuring that their requirements were met promptly.
  - Senior Data Engineer: Led the development of the ETL processes, ensuring the data was extracted, cleaned, transformed, and delivered efficiently.
  - Data Analyst: Worked closely with the data engineer to verify data quality, structure, and relevance, making sure it aligned with the client’s business requirements.
  - Business Analyst: Engaged with the client to gather and refine requirements, translating them into actionable tasks for the technical team.
  - Support Specialists: Provided continuous support to address any data-related issues, ensuring that the client received timely assistance whenever needed.
- Adaptability: This agile, cross-functional team structure allowed us to rapidly adapt to changing client requirements, including modifying data schemas or integrating new datasets as needed. By maintaining regular communication and collaboration, we ensured that the client always had up-to-date, high-quality data tailored to their needs.
This collaborative and adaptive approach was crucial in delivering value, allowing the client to rely on Datopian not just as a data provider but as a strategic partner.
Sample Datasets Provided
Country Codes Dataset: comprehensive country codes: ISO 3166, ITU, ISO 4217 currency codes and many more.
Alpha_2_Country_code | Country_Name_English | Country_Name_English_CLDR | Country_Name_English_Readable | Alpha_3_Country_code | Numeric_Country_Code | Continental_Code |
---|---|---|---|---|---|---|
AD | Andorra | Andorra | Andorra | AND | 20 | EU |
AE | United Arab Emirates | United Arab Emirates | United Arab Emirates | ARE | 784 | AS |
AF | Afghanistan | Afghanistan | Afghanistan | AFG | 4 | AS |
AG | Antigua and Barbuda | Antigua & Barbuda | Antigua & Barbuda | ATG | 28 | NA |
AI | Anguilla | Anguilla | Anguilla | AIA | 660 | NA |
AL | Albania | Albania | Albania | ALB | 8 | EU |
AM | Armenia | Armenia | Armenia | ARM | 51 | AS |
AO | Angola | Angola | Angola | AGO | 24 | AF |
AQ | Antarctica | Antarctica | Antarctica | ATA | 10 | AN |
AR | Argentina | Argentina | Argentina | ARG | 32 | SA |
AS | American Samoa | American Samoa | American Samoa | ASM | 16 | OC |
AT | Austria | Austria | Austria | AUT | 40 | EU |
AU | Australia | Australia | Australia | AUS | 36 | OC |
AW | Aruba | Aruba | Aruba | ABW | 533 | NA |
AX | Åland Islands | Åland Islands | Åland Islands | ALA | 248 | EU |
AZ | Azerbaijan | Azerbaijan | Azerbaijan | AZE | 31 | AS |
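
As a small illustration of how a table like this might be consumed, the sketch below builds an alpha-2 lookup from a few of the sample rows. The names `SAMPLE_ROWS`, `numeric_code`, and `build_lookup` are hypothetical. One detail worth noting: ISO 3166-1 numeric codes are three digits long, so values that appear as 20 or 4 in the table are padded back to 020 and 004.

```python
SAMPLE_ROWS = [
    # (alpha-2, English name, alpha-3, numeric code as shown in the table, continent)
    ("AD", "Andorra", "AND", 20, "EU"),
    ("AF", "Afghanistan", "AFG", 4, "AS"),
    ("AG", "Antigua and Barbuda", "ATG", 28, "NA"),
]


def numeric_code(value: int) -> str:
    """Format an ISO 3166-1 numeric code as a zero-padded three-digit string."""
    return f"{value:03d}"


def build_lookup(rows):
    """Index the rows by alpha-2 code for constant-time access."""
    return {
        alpha2: {
            "name": name,
            "alpha3": alpha3,
            "numeric": numeric_code(numeric),
            "continent": continent,
        }
        for alpha2, name, alpha3, numeric, continent in rows
    }


LOOKUP = build_lookup(SAMPLE_ROWS)
```

The continent code `NA` (North America) is kept as a plain string here. If a dataframe library is used to load the CSV instead, it is worth checking that `NA` is not silently interpreted as a missing value.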
Holidays Dataset: List of holidays per country with their names, types and dates.
Country Code | Holiday Name | Type of Holiday | Date |
---|---|---|---|
AD | New Year's Day | public | 2024-01-01 |
AD | Epiphany | public | 2024-01-06 |
AD | Shrove Tuesday | public | 2024-02-13 |
AD | Constitution Day | public | 2024-03-14 |
AD | Maundy Thursday | bank | 2024-03-28 |
AD | Good Friday | public | 2024-03-29 |
AD | Easter Sunday | public | 2024-03-31 |
AD | Easter Monday | public | 2024-04-01 |
AD | Labour Day | public | 2024-05-01 |
AD | Pentecost | public | 2024-05-19 |
AD | Whit Monday | public | 2024-05-20 |
AD | Assumption | public | 2024-08-15 |
AD | Our Lady of Meritxell | public | 2024-09-08 |
AD | All Saints' Day | public | 2024-11-01 |
AD | Immaculate Conception | public | 2024-12-08 |
AD | Christmas Eve | bank | 2024-12-24 |
AD | Christmas Day | public | 2024-12-25 |
AD | Boxing Day | public | 2024-12-26 |
AE | New Year's Day | public | 2024-01-01 |
AE | Laylat al-Mi'raj | public | 2024-02-08 |
AE | First day of Ramadan | public | 2024-03-11 |
AE | End of Ramadan (Eid al-Fitr) | public | 2024-04-10 |
AE | Feast of the Sacrifice (Eid al-Adha) | public | 2024-06-16 |
AE | Islamic New Year | public | 2024-07-07 |
AE | Birthday of Muhammad (Mawlid) | public | 2024-09-15 |
AE | National Day | public | 2024-12-02 |
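Time Zones Dataset: Lists standardized time zone information, including UTC offsets, time zone names, and daylight saving time status. The sample below shows the country, zone name, and GMT offset columns; gmtOffset is expressed in seconds (for example, 10800 is UTC+3).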
countryCode | countryName | zoneName | gmtOffset |
---|---|---|---|
CI | Ivory Coast | Africa/Abidjan | 0 |
GH | Ghana | Africa/Accra | 0 |
ET | Ethiopia | Africa/Addis_Ababa | 10800 |
DZ | Algeria | Africa/Algiers | 3600 |
ER | Eritrea | Africa/Asmara | 10800 |
ML | Mali | Africa/Bamako | 0 |
CF | Central African Republic | Africa/Bangui | 3600 |
GM | Gambia | Africa/Banjul | 0 |
GW | Guinea-Bissau | Africa/Bissau | 0 |
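
The `gmtOffset` values in the sample are consistent with offsets in seconds (10800 for Ethiopia, which is three hours ahead of UTC). Assuming that unit, the sketch below converts an offset into a readable label and into a fixed-offset `timezone` object from Python's standard library. The function names are hypothetical.

```python
from datetime import timedelta, timezone


def offset_label(gmt_offset_seconds: int) -> str:
    """Render an offset in seconds as a label such as 'UTC+03:00'."""
    sign = "+" if gmt_offset_seconds >= 0 else "-"
    hours, remainder = divmod(abs(gmt_offset_seconds), 3600)
    minutes = remainder // 60
    return f"UTC{sign}{hours:02d}:{minutes:02d}"


def to_timezone(gmt_offset_seconds: int) -> timezone:
    """Build a fixed-offset timezone; this does not model daylight saving time."""
    return timezone(timedelta(seconds=gmt_offset_seconds))
```

A fixed offset is only a snapshot. For zones that observe daylight saving time, the zone name column (for example `Africa/Algiers`) is the more reliable key for date arithmetic.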
Metadata Examples
Below is a snippet of the Frictionless Data Package metadata used for one of the datasets, showing how the data is documented and organized:
name: country-codes
title: "Comprehensive country codes: ISO 3166, ITU, ISO 4217 currency codes and many more"
format: csv
datapackage_version: 1.0.0
last_modified: 2024-09-25
licenses:
  - name: ODC-PDDL-1.0
    path: http://opendatacommons.org/licenses/pddl/
    title: Open Data Commons Public Domain Dedication and License v1.0
sources:
  - name: United Nations Protocol and Liaison Service
    title: United Nations Protocol and Liaison Service
    path: https://www.un.org/dgacm/sites/www.un.org.dgacm/files/Documents_Protocol/unterm-efsrca.xlsx
  - name: Unicode Common Locale Data Repository (CLDR) Project
    title: Unicode Common Locale Data Repository (CLDR) Project
    path: https://github.com/unicode-org/cldr-json/blob/d38478855dd8342749f0494332cc8acc2895d20d/cldr-json/cldr-localenames-full/main/ms/territories.json
  - name: United Nations Department of Economic and Social Affairs Statistics Division
    title: United Nations Department of Economic and Social Affairs Statistics Division
    path: https://unstats.un.org/unsd/methodology/m49/overview/
  - name: SIX Interbank Clearing Ltd (on behalf of ISO)
    title: SIX Interbank Clearing Ltd (on behalf of ISO)
    path: https://www.six-group.com/dam/download/financial-information/data-center/iso-currrency/lists/list-one.xml
  - name: Statoids
    title: Statoids
    path: http://www.statoids.com/wab.html
  - name: Geonames
    title: Geonames
    path: http://download.geonames.org/export/dump/countryInfo.txt
  - name: US Securities and Exchange Commission
    title: US Securities and Exchange Commission
    path: https://www.sec.gov/submit-filings/filer-support-resources/edgar-state-country-codes
resources:
  - name: country-codes
    format: csv
    path: data/country-codes.csv
    schema:
      fields:
        - name: FIFA
          title: FIFA code
          description: Codes assigned by the Fédération Internationale de Football Association
          type: string
        - name: Dial
          title: telephone dialing code
          description: Country code from ITU-T recommendation E.164, sometimes followed by area code
          type: string
        - name: ISO3166-1-Alpha-3
          title: ISO3166-1-Alpha-3
          description: Alpha-3 codes from ISO 3166-1 (synonymous with World Bank Codes)
          type: string
          constraints:
            unique: true
            minLength: 3
            maxLength: 3
        - name: MARC
          title: MARC code
          description: MAchine-Readable Cataloging codes from the Library of Congress
          type: string
        - name: is_independent
          title: independent country
          description: Country status, based on the CIA World Factbook
          type: string
        - name: ISO3166-1-numeric
          title: ISO3166-1-numeric
          description: Numeric codes from ISO 3166-1
          type: string
        - name: GAUL
          title: GAUL code
          description: Global Administrative Unit Layers from the Food and Agriculture Organization
          type: string
        - name: FIPS
          title: FIPS code
          description: Codes from the U.S. standard FIPS PUB 10-4
          type: string
        - name: WMO
          title: WMO code
          description: Country abbreviations by the World Meteorological Organization
          type: string
          constraints:
            maxLength: 2
        - name: ISO3166-1-Alpha-2
          title: ISO3166-1-Alpha-2
          description: Alpha-2 codes from ISO 3166-1
          type: string
          constraints:
            unique: true
            minLength: 2
            maxLength: 2
        - name: ITU
          title: ITU code
          description: Codes assigned by the International Telecommunications Union
          type: string
        - name: IOC
          title: IOC code
          description: Codes assigned by the International Olympics Committee
          type: string
          constraints:
            maxLength: 3
        - name: DS
          title: distinguishing signs of vehicles
          description: Distinguishing signs of vehicles in international traffic
          type: string
        - name: UNTERM Spanish Formal
          title: UNTERM Spanish Formal
          description: Country's formal Spanish name from UN Protocol and Liaison Service
          type: string
        - name: Global Code
          title: global code
          description: Country classification from United Nations Statistics Division
          type: string
        - name: Intermediate Region Code
          title: intermediate region code
          description: Country classification from United Nations Statistics Division
          type: string
        - name: official_name_fr
          title: official name French
          description: Country or Area official French short name from UN Statistics Division
          type: string
        - name: UNTERM French Short
          title: UNTERM French Short
          description: Country's short French name from UN Protocol and Liaison Service
          type: string
        - name: ISO4217-currency_name
          title: ISO4217-currency_name
          description: ISO 4217 currency name
          type: string
        - name: UNTERM Russian Formal
          title: UNTERM Russian Formal
          description: Country's formal Russian name from UN Protocol and Liaison Service
          type: string
        - name: UNTERM English Short
          title: UNTERM English Short
          description: Country's short English name from UN Protocol and Liaison Service
          type: string
        - name: ISO4217-currency_alphabetic_code
          title: ISO4217-currency_alphabetic_code
          description: ISO 4217 currency alphabetic code
          type: string
        - name: Small Island Developing States (SIDS)
          title: small island developing state (SIDS)
          description: Country classification from United Nations Statistics Division
          type: string
        - name: UNTERM Spanish Short
          title: UNTERM Spanish Short
          description: Country's short Spanish name from UN Protocol and Liaison Service
          type: string
        - name: ISO4217-currency_numeric_code
          title: ISO4217-currency_numeric_code
          description: ISO 4217 currency numeric code
          type: string
        - name: UNTERM Chinese Formal
          title: UNTERM Chinese Formal
          description: Country's formal Chinese name from UN Protocol and Liaison Service
          type: string
        - name: UNTERM French Formal
          title: UNTERM French Formal
          description: Country's formal French name from UN Protocol and Liaison Service
          type: string
        - name: UNTERM Russian Short
          title: UNTERM Russian Short
          description: Country's short Russian name from UN Protocol and Liaison Service
          type: string
        - name: M49
          title: M49
          description: UN Statistics M49 numeric codes (nearly synonymous with ISO 3166-1 numeric codes, which are based on UN M49. ISO 3166-1 does not include Channel Islands or Sark, for example)
          type: number
          constraints:
            unique: true
        - name: Sub-region Code
          title: sub-region code
          description: Country classification from United Nations Statistics Division
          type: string
        - name: Region Code
          title: region code
          description: Country classification from United Nations Statistics Division
          type: string
        - name: official_name_ar
          title: official name Arabic
          description: Country or Area official Arabic short name from UN Statistics Division
          type: string
        - name: ISO4217-currency_minor_unit
          title: ISO4217-currency_minor_unit
          description: ISO 4217 currency number of minor units
          type: string
        - name: UNTERM Arabic Formal
          title: UNTERM Arabic Formal
          description: Country's formal Arabic name from UN Protocol and Liaison Service
          type: string
        - name: UNTERM Chinese Short
          title: UNTERM Chinese Short
          description: Country's short Chinese name from UN Protocol and Liaison Service
          type: string
        - name: Land Locked Developing Countries (LLDC)
          title: land locked developing country (LLDC)
          description: Country classification from United Nations Statistics Division
          type: string
        - name: Intermediate Region Name
          title: intermediate region name
          description: Country classification from United Nations Statistics Division
          type: string
        - name: official_name_es
          title: official name Spanish
          description: Country or Area official Spanish short name from UN Statistics Division
          type: string
        - name: UNTERM English Formal
          title: UNTERM English Formal
          description: Country's formal English name from UN Protocol and Liaison Service
          type: string
        - name: official_name_cn
          title: official name Chinese
          description: Country or Area official Chinese short name from UN Statistics Division
          type: string
        - name: official_name_en
          title: official name English
          description: Country or Area official English short name from UN Statistics Division
          type: string
        - name: ISO4217-currency_country_name
          title: ISO4217-currency_country_name
          description: ISO 4217 country name
          type: string
        - name: Least Developed Countries (LDC)
          title: least developed country (LDC)
          description: Country classification from United Nations Statistics Division
          type: string
        - name: Region Name
          title: region name
          description: Country classification from United Nations Statistics Division
          type: string
        - name: UNTERM Arabic Short
          title: UNTERM Arabic Short
          description: Country's short Arabic name from UN Protocol and Liaison Service
          type: string
        - name: Sub-region Name
          title: sub-region name
          description: Country classification from United Nations Statistics Division
          type: string
        - name: official_name_ru
          title: official name Russian
          description: Country or Area official Russian short name from UN Statistics Division
          type: string
        - name: Global Name
          title: global name
          description: Country classification from United Nations Statistics Division
          type: string
        - name: Capital
          title: capital city
          description: Capital city from Geonames
          type: string
        - name: Continent
          title: continent
          description: Continent from Geonames
          type: string
          constraints:
            minLength: 2
            maxLength: 2
        - name: TLD
          title: TLD
          description: Top level domain from Geonames
          type: string
        - name: Languages
          title: languages
          description: Languages from Geonames
          type: string
        - name: Geoname ID
          title: Geoname ID
          description: Geoname ID
          type: number
          constraints:
            unique: true
        - name: CLDR display name
          title: CLDR display name
          description: Country's customary English short name (CLDR)
          type: string
        - name: EDGAR
          title: EDGAR code
          description: EDGAR country code from SEC
          type: string
          constraints:
            maxLength: 2
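
To show what the `constraints` entries in the schema above buy in practice, here is a minimal sketch of a checker for the three constraint kinds that appear in the snippet (`unique`, `minLength`, `maxLength`). It is an illustration written with the standard library only. The `FIELDS` list copies three field definitions from the metadata, while the `validate` function and its error messages are hypothetical; a dedicated validator for the Frictionless specifications would normally be used instead.

```python
FIELDS = [
    {"name": "ISO3166-1-Alpha-2", "constraints": {"unique": True, "minLength": 2, "maxLength": 2}},
    {"name": "ISO3166-1-Alpha-3", "constraints": {"unique": True, "minLength": 3, "maxLength": 3}},
    {"name": "IOC", "constraints": {"maxLength": 3}},
]


def validate(rows: list[dict], fields: list[dict]) -> list[str]:
    """Check rows against unique/minLength/maxLength constraints and return error messages."""
    errors = []
    seen = {field["name"]: set() for field in fields}
    for index, row in enumerate(rows, start=1):
        for field in fields:
            name = field["name"]
            value = row.get(name, "")
            if value == "":
                continue  # empty cells are skipped; the snippet declares no `required` constraint
            rules = field.get("constraints", {})
            if "minLength" in rules and len(value) < rules["minLength"]:
                errors.append(f"row {index}: {name}={value!r} is shorter than {rules['minLength']}")
            if "maxLength" in rules and len(value) > rules["maxLength"]:
                errors.append(f"row {index}: {name}={value!r} is longer than {rules['maxLength']}")
            if rules.get("unique"):
                if value in seen[name]:
                    errors.append(f"row {index}: {name}={value!r} is duplicated")
                seen[name].add(value)
    return errors
```

Running a check like this before each delivery is one way an automated pipeline can catch a malformed code or a duplicated key before it reaches downstream systems.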
Conclusion
The partnership with Datopian has empowered the client to leverage open data efficiently while avoiding the pitfalls and expenses of managing an in-house data engineering team. Our professional services, combined with our data expertise, provided the client with the reliability and customization they needed, making Datopian a valuable partner in their data journey.
If you are an organization looking to streamline your data operations, Datopian offers the expertise and support to deliver high-quality, tailored data solutions.
Not finding the data you need? We can get it for you! Check out our premium data service at Datahub.io. You can also reach out to us to discuss how we can help you achieve your goals.
Don't forget to check out our postal codes collection page at datahub.io/collections/postal-codes-datasets