Data Fabric and Address Verification Interface

Katie Le
IBM Data Science in Practice
5 min read · Nov 28, 2022

As organizations steer their business strategies toward data-driven decision-making, data and analytics are more crucial than ever. Insights from data gathered across business units improve business outcomes, but heterogeneous data spread across disparate applications and storage systems makes it difficult for organizations to see the big picture. How can organizations get a holistic view of data when it’s distributed across data silos? Implementing a data fabric architecture is the answer.

What is a data fabric?

Data fabric is defined by IBM as “an architecture that facilitates the end-to-end integration of various data pipelines and cloud environments through the use of intelligent and automated systems.” The concept was first introduced back in 2016 but has gained more attention in the past few years as the amount of data has grown. Data and analytics leaders now realize how challenging it is to manage all types of data in different environments and on different cloud platforms. A larger quantity of data does not necessarily improve business outcomes, but higher-quality data does. With a data fabric architecture, data can be integrated and enriched, governed and protected, and accessed across the organization. Data fabric is what a modern data architecture should be.

Source: Gartner — Understand the role of Data Fabric

Implementing a data fabric architecture

At IBM, we provide five entry points to help organizations implement a data fabric architecture.

  1. Customer 360: create a comprehensive view of each client
  2. Multicloud data integration: integrate data across any hybrid and multicloud landscape
  3. Data governance and privacy: automate the management of data trust, protection and compliance
  4. MLOps and trustworthy AI: enable an end-to-end AI workflow infused with data governance and privacy
  5. Data observability: continuously manage the quality and reliability of data

One of the key elements of a data fabric architecture is weaving together data from many different sources, transforming and enriching it, and delivering it to downstream data consumers. IBM Multicloud Data Integration helps organizations connect data from disparate sources, build data pipelines, remediate data issues, enrich data, and deliver integrated data to multicloud platforms where it can be easily accessed by data consumers or built into a data product.

Ensuring high-quality data

A crucial aspect of downstream consumption is data quality. A commonly cited estimate is that about 80% of time is spent on data preparation and cleansing, leaving only 20% for data analytics. Thus, the earlier in the process that data is cleansed and curated, the less time data consumers need to spend on preparation and cleansing, and the more time they have for data analysis.

Let’s use address data as an example. With high quality address data, businesses can:

  • improve the quality of their customer data
  • minimize the number of failed deliveries, which drive up operational costs (leveraging validation/correction, CASS certification)
  • increase operational efficiency by knowing how close they are to their customers (leveraging geocoding/reverse geocoding)
  • improve product deliverability to a global customer base (leveraging transliteration/validation), and much more.

IBM’s Next Generation DataStage is an ETL tool for building data pipelines and automating data cleansing, integration and preparation. As part of a data pipeline, Address Verification Interface (AVI) can remediate bad address data. Powered by Loqate, the world’s most trusted location intelligence service, and integrated into the Next Generation DataStage, AVI transforms this bad data into best-of-breed data.

Address Verification Interface (AVI) capabilities include:

  • Parsing: take an address and split it into its individual parts
  • Validation/correction/suggestions: validate and correct an address, and suggest candidate addresses if it’s incomplete
  • Transliteration: convert an address written in one character set (script) into another
  • Geocoding and reverse geocoding: turn an address into a latitude/longitude pair (spatial data), or turn a latitude/longitude pair back into an address
  • CASS certification: ensure that US addresses are deliverable (only available to clients who operate in the US)
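To make the parsing capability concrete, here is a minimal Python sketch that splits one simple US-style address layout into named parts. It is only an illustration: the `parse_address` function and the regular expression are hypothetical and are not part of AVI or Loqate, which rely on reference data to handle far more address formats than a single pattern can.

```python
import re
from typing import Optional

# Hypothetical pattern for ONE simple layout:
#   "<number> <street>, <city>, <ST> <ZIP>"
# Real address parsing handles many more layouts and countries.
ADDRESS_PATTERN = re.compile(
    r"^\s*(?P<number>\d+)\s+(?P<street>[^,]+?)\s*,\s*"
    r"(?P<city>[^,]+?)\s*,\s*"
    r"(?P<state>[A-Za-z]{2})\s+(?P<postal_code>\d{5}(?:-\d{4})?)\s*$"
)


def parse_address(raw: str) -> Optional[dict]:
    """Split a single-line address into parts, or return None if it does not fit."""
    match = ADDRESS_PATTERN.match(raw)
    if match is None:
        return None
    parts = match.groupdict()
    parts["state"] = parts["state"].upper()  # normalize the state code
    return parts


if __name__ == "__main__":
    print(parse_address("1 New Orchard Road, Armonk, ny 10504"))
```

An input that does not fit the layout returns `None` rather than a guess, which mirrors the idea that incomplete addresses should be flagged for correction or suggestions instead of being silently accepted.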

Use Case 1: State and Local Government

With multiple services spanning benefits, public housing, infrastructure planning and more, government agencies experience significant issues with siloed data, database inaccuracies and fraudulent claims. These matters make it difficult to capture and manage citizen information accurately.

The AVI solution offers government agencies rich capabilities to create and monitor data quality. It supports the capture, verification and maintenance of citizen location data, while helping agencies gain maximum value from their information assets.

Use Case 2: Healthcare

Excellent healthcare service relies on a verified and complete patient database. For healthcare providers, collecting and maintaining accurate patient location data can be challenging due to the volume of patients, the varying methods of data collection and management, and the quality of their legacy data.

The AVI solution offers healthcare organizations rich capabilities. First, they can standardize, improve and verify their patient address records. This enables them to correctly identify the population they serve. Second, they can easily geocode patients’ addresses and identify the nearest care center. This has the added benefit of expanding a referrer’s visibility.
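Once addresses have been geocoded, finding the nearest care center reduces to a distance comparison. The sketch below assumes coordinates are already available as (latitude, longitude) pairs in degrees and uses the haversine great-circle formula; the function names and the sample center names and coordinates are hypothetical, and straight-line distance is only a rough stand-in for real travel distance.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = radians(lat2 - lat1)
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_center(patient: tuple, centers: dict) -> str:
    """Return the name of the center closest to the patient's coordinates."""
    return min(centers, key=lambda name: haversine_km(*patient, *centers[name]))


if __name__ == "__main__":
    # Hypothetical care centers with illustrative coordinates.
    centers = {
        "North Clinic": (41.20, -73.70),
        "South Clinic": (40.70, -74.00),
    }
    print(nearest_center((41.11, -73.72), centers))
```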

Use Case 3: Finance

The demand for superior customer experiences is growing in the increasingly digitized banking industry. There are calls for efficiency, transparency and the delivery of high-quality online services. This means integrating tools that help seamlessly collect data, develop insights and map out the customer journey. However, your insights are only as good as your data.

The AVI solution offers finance organizations rich capabilities to detect fraud, remove duplicate customer records, help build Customer 360 data and qualify addresses used for targeted marketing.
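Duplicate removal is easier once addresses are in a standard form, because records that look different as raw text collapse to the same key. The following Python sketch illustrates the idea with a deliberately tiny, hypothetical abbreviation map and helper functions; it is not how AVI standardizes addresses, and a production pipeline would key on verified, standardized output rather than on simple string cleanup.

```python
import re

# Hypothetical, deliberately tiny abbreviation map used only for illustration.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "drive": "dr"}


def address_key(raw: str) -> str:
    """Build a comparison key: lowercase, strip punctuation, unify abbreviations."""
    tokens = re.sub(r"[^\w\s]", " ", raw.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def dedupe(records: list) -> list:
    """Keep the first record seen for each normalized address key."""
    first_seen = {}
    for record in records:
        first_seen.setdefault(address_key(record["address"]), record)
    return list(first_seen.values())


if __name__ == "__main__":
    customers = [
        {"id": 1, "address": "12 Main Street, Springfield"},
        {"id": 2, "address": "12 main st. springfield"},
        {"id": 3, "address": "40 Oak Avenue, Springfield"},
    ]
    print([c["id"] for c in dedupe(customers)])
```

Keeping the first record per key is an arbitrary choice made for brevity; in practice the surviving record would usually be chosen or merged according to business rules.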

Data fabric became a top technology trend in 2022 and, according to Gartner, will “quadruple efficiency in data utilization while cutting human-driven data management tasks in half” by 2024. Layered with data integration, data quality and verified address data, a data fabric architecture can help organizations get a holistic view of their data and outperform their competitors in today’s data-driven market. With Address Verification Interface and IBM Multicloud Data Integration, organizations can jumpstart their data fabric journey with quality data that is integrated and enriched, governed and protected, and easily accessible across the organization.
