AWS Transformation Improves Health of Regional Blues Plan

This regional Blues health plan wanted to modernize its data platform and implement next-generation cloud technologies to amplify value-based care and drive growth.

NTT DATA Services transformed the health plan’s legacy data systems with a HIPAA-compliant AWS-based data lake accelerated by NTT DATA’s Nucleus data and intelligence fabric. Nucleus accelerators included data ingestion pipelines that added further value to the data lake, and the solution defined and implemented data lake zones to which data could be directed automatically.


Business Needs

The United States has more than 900 health insurance companies that offer medical coverage, so standing out in a competitive marketplace is no easy task. Yet, this regional health plan is taking a cutting-edge approach with a goal to improve care, lower costs and grow membership through data-driven insights that increase patient and provider satisfaction while positioning the organization for success in today’s value-based care model.

To achieve its goal, the insurer needed to modernize its legacy data systems. The organization tried working with several vendors but was unable to land on a winning modernization formula until the NTT DATA team presented its plan. NTT DATA offered the insurer a breadth of services, including an existing framework that meant the insurer did not have to build new functionality. The health plan agreed that these capabilities, coupled with NTT DATA expertise in healthcare and Amazon Web Services (AWS) technologies, would allow it to quickly build a modern, HIPAA-compliant data platform in the cloud.


  • Delivers 33% more efficiency using Amazon Web Services
  • Creates automated data lake platform in less than four months
  • Expands business value of disparate data sources
  • Reduces costs significantly by decoupling storage from compute
  • Creates data ingest pipelines with automated validations, end-to-end deployment and governed curated data assets
  • Enables presentation layer using reusable data engineering framework


Modernizing the data platform

Already using AWS in other parts of the business, the insurer knew that it wanted to move to an AWS-based data lake with the latest technologies. Specifically, it held a large volume of structured claims data in a relational structure that hindered its ability to use unstructured data from a variety of sources, including the member portal and member encounters. The organization’s goal was to absorb the unstructured data coming into the environment and correlate it to find ways to improve patient satisfaction and support its value-based care programs. To that end, the two teams began building the new data lake environment.

Nucleus accelerator

The Nucleus Intelligent Enterprise Platform uses NTT DATA and third-party IP to accelerate deployments; for this project, the team relied on Nucleus’s data and intelligence fabric to help accelerate the data lake build. Nucleus combined with NTT DATA consultant expertise in AWS technologies meant the team was able to get the data lake framework established within four months.

Nucleus accelerators included data ingestion pipelines that added further value to the AWS data lake. The pipelines were built on Apache Spark in two ingestion modalities, one written in Scala and one in Python, with Amazon Simple Storage Service (Amazon S3) buckets as the destination for the data. The solution defined and implemented data lake zones to which data could automatically be directed.

Zones included a raw or landing area layer for data persistence and retention of raw data assets; a cleansed or curated layer, where transformed data is available in consumable data sets; and a production or secure layer where business logic and row level security are added. From these layers, data ends up in a presentation layer where line-of-business customers can gain access.
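The zone layout described above can be illustrated with a minimal Python sketch. It is not the Nucleus implementation: the zone names, the bucket name "example-data-lake", the prefix layout and the load_date partition are all assumptions chosen for illustration. It shows only how a pipeline might derive an Amazon S3 destination from a zone and how a data set could be promoted from one zone to the next.

```python
from datetime import date
from enum import Enum


class Zone(Enum):
    """Hypothetical zone names mirroring the layers described in the text."""
    RAW = "raw"                    # landing area: raw data kept as received
    CURATED = "curated"            # cleansed, transformed, consumable data sets
    SECURE = "secure"              # business logic and row-level security applied
    PRESENTATION = "presentation"  # what line-of-business users query


# Assumed promotion order: each data set moves one zone at a time.
PROMOTION_ORDER = [Zone.RAW, Zone.CURATED, Zone.SECURE, Zone.PRESENTATION]


def zone_uri(bucket: str, zone: Zone, dataset: str, load_date: date) -> str:
    """Build the S3 prefix a pipeline would write to for a given zone."""
    return f"s3://{bucket}/{zone.value}/{dataset}/load_date={load_date.isoformat()}/"


def next_zone(zone: Zone):
    """Return the zone that follows `zone`, or None for the final zone."""
    idx = PROMOTION_ORDER.index(zone)
    return PROMOTION_ORDER[idx + 1] if idx + 1 < len(PROMOTION_ORDER) else None


if __name__ == "__main__":
    # Walk a hypothetical "claims" data set through every zone.
    zone = Zone.RAW
    while zone is not None:
        print(zone_uri("example-data-lake", zone, "claims", date(2021, 1, 15)))
        zone = next_zone(zone)
```

Keeping the zone in the key prefix means access policies and retention rules can be attached per zone, which is one common way to keep raw and secured data separate within a single bucket.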

For data processing, the solution relies on Amazon EMR, a big data platform that uses Apache Spark to process the insurer’s large volumes of data. With Amazon EMR, the insurer can run its workloads on Amazon EC2 or on Amazon Elastic Kubernetes Service (Amazon EKS) clusters, which it plans to expand to as it matures its data lake program.

The implementation is deployed using AWS CloudFormation templates to help ensure configuration consistency and includes integrations with GitLab and GitHub.
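As a rough illustration of how a template keeps configuration consistent, the Python sketch below generates a small AWS CloudFormation template for one encrypted S3 bucket. The project's actual templates are not published, so the logical ID "DataLakeBucket", the bucket name and the choice of settings are assumptions. The resource type and property names are standard CloudFormation.

```python
import json


def build_bucket_template(bucket_name: str) -> dict:
    """Return a CloudFormation template (as a dict) for one data lake bucket.

    Illustrative only: default encryption, blocked public access and
    versioning are assumed settings, not the project's real configuration.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative data lake bucket with default encryption",
        "Resources": {
            "DataLakeBucket": {  # hypothetical logical ID
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "BucketName": bucket_name,
                    # Encrypt every new object at rest by default.
                    "BucketEncryption": {
                        "ServerSideEncryptionConfiguration": [
                            {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}
                        ]
                    },
                    # Refuse any form of public access to the bucket.
                    "PublicAccessBlockConfiguration": {
                        "BlockPublicAcls": True,
                        "BlockPublicPolicy": True,
                        "IgnorePublicAcls": True,
                        "RestrictPublicBuckets": True,
                    },
                    "VersioningConfiguration": {"Status": "Enabled"},
                },
            }
        },
    }


# Serialize to JSON, the form that would be committed and deployed.
template_json = json.dumps(build_bucket_template("example-data-lake"), indent=2)

if __name__ == "__main__":
    print(template_json)
```

Because the template is generated from a single function, every environment that deploys it gets the same encryption and access settings, and changes can be reviewed in GitLab or GitHub like any other code.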

Security and compliance built-in

Amazon EMR not only automates tasks like provisioning capacity and tuning clusters, but it is also a HIPAA-eligible AWS service. Beyond that, the team built in security by installing certificates so that communications used correctly defined security protocols, covering both data at rest and data in motion. This included data within Amazon Virtual Private Cloud (Amazon VPC), where security protocols were proactively built in. NTT DATA consultants encrypted protected health information and controlled access to it with Kerberos-based authentication.

Additional security measures included Amazon S3 encryption, Atlas authentication, and more. In addition, NTT DATA consultants established monitoring with AWS CloudTrail and gave the client the ability to detect, investigate, respond to and remediate incidents using logs from Amazon S3 and Amazon VPC. Amazon CloudWatch provides the team with visibility into system performance and health.
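To make the detection step concrete, here is a small Python sketch that scans a CloudTrail log file for denied requests. It is an assumption about what such a check could look like, not the client's tooling. The field names (eventTime, eventSource, eventName, errorCode, sourceIPAddress) follow the CloudTrail record format, while the sample records, the error codes chosen and the function name are made up for illustration.

```python
import json

# Error codes treated as worth investigating (an assumed, non-exhaustive set).
SUSPICIOUS_ERROR_CODES = {"AccessDenied", "UnauthorizedOperation"}


def find_denied_events(log_text: str) -> list:
    """Return a summary of every record whose errorCode looks like a denial."""
    records = json.loads(log_text).get("Records", [])
    return [
        {
            "time": record.get("eventTime"),
            "source": record.get("eventSource"),
            "action": record.get("eventName"),
            "error": record.get("errorCode"),
            "ip": record.get("sourceIPAddress"),
        }
        for record in records
        if record.get("errorCode") in SUSPICIOUS_ERROR_CODES
    ]


# Made-up sample data in CloudTrail's {"Records": [...]} layout.
SAMPLE_LOG = json.dumps({
    "Records": [
        {
            "eventTime": "2021-01-15T10:00:00Z",
            "eventSource": "s3.amazonaws.com",
            "eventName": "PutObject",
            "sourceIPAddress": "203.0.113.10",
        },
        {
            "eventTime": "2021-01-15T10:05:00Z",
            "eventSource": "s3.amazonaws.com",
            "eventName": "GetObject",
            "errorCode": "AccessDenied",
            "sourceIPAddress": "203.0.113.10",
        },
    ]
})

if __name__ == "__main__":
    for event in find_denied_events(SAMPLE_LOG):
        print(event)
```

In practice a check like this would more likely run as a scheduled job or an alerting rule over logs delivered to Amazon S3, with the results feeding the investigation and remediation steps.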

With a solid data lake platform in place, the NTT DATA team handed the reins to the client for data population. As the customer migrates data to its new AWS data lake, NTT DATA experts remain involved, ensuring the client team has the knowledge needed to effectively migrate data to the new system and maximize the new system capabilities.


As a data-centric organization, this health plan now has a platform at its fingertips to begin driving data insights to further its goals of providing more personalized care and fostering collaboration among providers. By bringing together healthcare claims data from multiple processing systems, and unstructured sources like patient touchpoints, this regional health plan now has the power and flexibility to build personalized experiences that will boost loyalty and entice new members as well.

The modernized data lake is more than 33% more efficient than the company’s expensive, on-premises legacy platform. By separating the data and compute layers and moving the firm’s data to Amazon S3, the insurer was able to drastically decrease its storage costs. Moreover, by replacing several expensive tools with open-source solutions, the company was able to save on execution costs as well.

Now the health plan’s internal business customers have latency-free access to critical data sets that will allow them to bring new data-proven ideas to market faster.

Next steps

NTT DATA continues to work with the client, ensuring its platform, related applications, tools and development mature as desired. Planned steps include expanding beyond EC2 to managed Kubernetes containers on EKS, as well as expanding the use of Lambda functions. Finally, the insurer plans to adopt AI and ML, creating advanced data models for even deeper patient insights.

About the Company

Regional Blue Cross Blue Shield health plan modernizes its data lake to enable aggressive growth.




United States
