Bridging data silos: cross-bounded context querying with Vanguard’s Operational Read-only Data Store (ORDS) using Amazon Redshift


Are you modernizing your legacy batch processing systems? At Vanguard, we faced significant challenges with our legacy mainframe system that limited our ability to deliver modern, personalized customer experiences. Our centralized database architecture created performance bottlenecks and made it difficult to scale services independently for our millions of personal and institutional investors.

In this post, we show you how we modernized our data architecture using Amazon Redshift as our Operational Read-only Data Store (ORDS). You’ll learn how we transitioned to a cloud-native, domain-driven architecture while preserving critical batch processing capabilities. We show you how this solution enabled us to create logically isolated data domains while maintaining cross-domain analytics capabilities, all while adhering to the principles of bounded contexts and distributed data ownership.

Background and challenges

As financial needs continue to evolve, Vanguard is committed to delivering adaptable, top-notch experiences that foster long-lasting customer relationships. This commitment spans from enhancing the personal investor journey to bringing personalized mobile dashboards and connecting institutional clients with advanced advice offerings.

To elevate customer experience and drive digital transformation, Vanguard has embraced domain-driven design principles. This approach focuses on creating autonomous teams, fostering faster innovation, and building a data mesh architecture. Central to this transformation is the Personal Investor team’s mainframe modernization effort, transitioning from a legacy system to a cloud-based, distributed data architecture organized around bounded contexts: distinct business domains that manage their own data. As part of this shift, each microservice now manages its own local data store using Amazon Aurora PostgreSQL-Compatible Edition or Amazon DynamoDB. This approach enables domain-level data ownership and operational autonomy.

Vanguard’s existing mainframe system, built on a centralized Db2 database, enables cross-domain data access and integration but also introduces several architectural challenges. Although batch processes can join data across multiple bounded contexts using SQL joins and database operations to integrate information from various sources, this tight coupling creates significant risks and operational issues.

Challenges with the centralized database approach include:

  • Resource Contention: Processes from one domain can negatively impact other domains due to shared compute resources, leading to performance degradation across the system.
  • Lack of Domain Isolation: Changes in one bounded context can have unintended ripple effects across other domains, increasing the risk of system-wide failures.
  • Scalability Constraints: The centralized architecture creates bottlenecks as load increases, making it difficult to scale individual components independently.
  • High Coupling: Tight integration between domains makes it challenging to modify or upgrade individual components without affecting the entire system.
  • Limited Fault Tolerance: Issues in one domain can cascade across the entire system due to shared infrastructure and data dependencies.

To address these architectural challenges, we chose to use Amazon Redshift as our Operational Read-only Data Store (ORDS). The Amazon Redshift architecture separates compute and storage, which enables us to create multi-cluster architectures with a separate endpoint for each domain and independent scaling of compute and storage resources. Our solution uses the data sharing capabilities of Amazon Redshift to create logically isolated data domains while maintaining the ability to perform cross-domain analytics when needed.

Key benefits of the Amazon Redshift solution include:

  1. Resource Isolation: Each domain can be assigned dedicated Amazon Redshift compute resources, making sure one domain’s workload doesn’t impact others.
  2. Independent Scaling: Domains can scale their compute resources independently based on their specific needs.
  3. Controlled Data Sharing: Amazon Redshift’s data sharing feature enables secure and controlled cross-domain data access without tight coupling, maintaining clear domain boundaries.

Let’s explore the different solutions we evaluated before selecting ORDS with Amazon Redshift as our optimal approach.

Solutions explored

We implemented ORDS as our optimal solution after conducting a comprehensive evaluation of available options. This section outlines our decision-making process and examines the alternatives we considered during our assessment.

Operational Read-only Data Store (ORDS):

In our evaluation, we found that using Amazon Redshift for ORDS provides a robust solution for handling data across different business areas. It excels at managing large volumes of data from multiple sources, providing fast access to replicated data for batch processes that require cross-bounded context data, and combining information using familiar SQL queries. The solution particularly shines in handling high-volume reads from our data sources.

Benefits:

  • Works well in a relational database
  • Excels at real-time access to data from multiple business areas
  • Improves performance of batch jobs dealing with large data volumes
  • Stores data in familiar table format, accessible via SQL
  • Enforces clear data ownership, with each business area responsible for its data
  • Offers scalable architecture that reduces the risk of single point of failure

Disadvantages:

  • Requires additional data validation during loading processes to maintain data uniqueness
  • Needs careful management of primary key constraints, because Amazon Redshift optimizes for analytical performance and treats these constraints as informational rather than enforcing them
  • May require additional monitoring and controls compared to traditional RDBMS systems
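
Because uniqueness is not enforced by the warehouse itself, the load pipeline has to guarantee it. The following is a minimal sketch of one way to do that before loading, assuming each replicated row carries a business key and a change timestamp. The field names (`account_id`, `updated_at`) and the helper `dedupe_latest` are hypothetical and are not taken from Vanguard’s pipelines.

```python
from typing import Iterable


def dedupe_latest(rows: Iterable[dict], key: str, version: str) -> list[dict]:
    """Keep only the most recent row for each key value.

    `key` names the column expected to be unique; `version` names a column
    that orders competing rows (for example, a change timestamp).
    """
    latest: dict = {}
    for row in rows:
        current = latest.get(row[key])
        # Replace the stored row only when this one is strictly newer.
        if current is None or row[version] > current[version]:
            latest[row[key]] = row
    # Return rows in key order so repeated loads are easy to compare.
    return sorted(latest.values(), key=lambda r: r[key])


# Hypothetical staged batch containing a duplicate key (A1 appears twice).
staged = [
    {"account_id": "A1", "status": "OPEN", "updated_at": "2025-01-01T10:00:00"},
    {"account_id": "A2", "status": "OPEN", "updated_at": "2025-01-01T10:05:00"},
    {"account_id": "A1", "status": "CLOSED", "updated_at": "2025-01-02T09:00:00"},
]

clean = dedupe_latest(staged, key="account_id", version="updated_at")
for row in clean:
    print(row["account_id"], row["status"])
```

A check like this could run in a staging step before the final load; the same idea can also be expressed in SQL with a window function over the staging table.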

Here are the other solutions we evaluated:

Bulk APIs:

We found that Bulk APIs provide an approach for handling large volumes of data.

Benefits:

  • Near real-time access to bulk data through a single request
  • Autonomous teams have control over access patterns
  • Efficient batch processing of large datasets with multi-record retrieval

Disadvantages:

  • Each product team needs to create their own bulk API
  • If you need data from different areas, you must combine it yourself
  • The team providing the API must make sure it can handle large numbers of requests
  • You might need to use multiple APIs to get all the data you want
  • If you’re getting data in chunks (pagination), you might miss some records if the data changes between requests

While Bulk APIs offer powerful capabilities, we found they require substantial team coordination and careful implementation to be effective.
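
The pagination risk listed among the disadvantages can be illustrated with a small sketch. It simulates an offset-based bulk API over an in-memory list; the `fetch_page` function and the record IDs are hypothetical. When a record is deleted between two page requests, later records shift forward and one of them is never returned.

```python
def fetch_page(records: list[int], offset: int, limit: int) -> list[int]:
    """Simulate an offset-based bulk API call over the current data."""
    return records[offset:offset + limit]


# Hypothetical source data: record IDs 1..6, read three at a time.
source = [1, 2, 3, 4, 5, 6]
page_size = 3

seen = []
seen += fetch_page(source, offset=0, limit=page_size)  # first page

# Another process deletes record 2 before the next page is requested,
# so every later record moves one position toward the front.
source.remove(2)

seen += fetch_page(source, offset=3, limit=page_size)  # second page

# Records that still exist in the source but were never returned.
missed = sorted(set(source) - set(seen))
print("seen:", seen)
print("missed:", missed)
```

Here record 4 still exists but is skipped, because it moved into the range the first page had already covered. Cursor-based paging or reading from a consistent snapshot are common ways to reduce this risk.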

Data Lake:

Our evaluation showed that data lakes can effectively combine information from different parts of our business. They excel at processing large amounts of data at once, providing search capabilities through unified data formats, and managing large volumes of diverse and complex data.

Benefits:

  • Handles massive data volumes efficiently
  • Supports multiple data formats and structures
  • Enables complex analytics and data science workloads
  • Provides cost-effective storage solutions
  • Accommodates both structured and unstructured data

Disadvantages:

  • May not provide real-time, high-speed data access
  • Requires additional effort with complex data structures, especially those with many interconnected parts
  • Needs special techniques to organize data in a simple, flat structure
  • Demands significant data governance and management
  • Requires specialized skills for effective implementation

While data lakes excel at big-picture analysis of large datasets, they weren’t optimal for our real-time data needs and complex data relationships.

S3 Export/Exchange:

In our analysis, we found that S3 Export/Exchange provides a method for sharing data between different business areas using file storage. This approach effectively handles large volumes of data and allows simple filtering of data using data frames.

Benefits:

  • Provides simple, cost-effective data storage
  • Supports high-volume data transfers
  • Enables simple data filtering capabilities
  • Offers flexible access control
  • Facilitates cross-region data sharing

Disadvantages:

  • Not suitable for real-time data needs
  • Requires additional processing to convert data into usable table format
  • Demands significant data preparation effort
  • Lacks immediate data consistency
  • Needs additional tools for data transformation

While S3 Export/Exchange works well for sharing large datasets between teams, it didn’t meet our requirements for quick, real-time access or immediately usable data formats.

The following table provides a high-level comparison of the different data integration solutions we considered for our modernization efforts. It outlines where each solution is most appropriate to use and when it might not be the best choice:

Solution | When to use | When not to use
Bulk APIs | Real-time operational data is needed; fetching specific data subsets | Many bounded contexts are involved
Data Lake | Processing large amounts of data at once; many bounded contexts | Real-time data access is needed; structured, transactional data processing
ORDS | Near real-time access across multiple bounded contexts; large-volume batch processing | Within a single bounded context
S3 Export/Exchange | Few bounded contexts; handling large volumes of data; point-in-time export is sufficient | Real-time data needs; many bounded contexts

Table 1: Data Integration Solutions Comparison

Based on our comparison, we found ORDS to be the optimal solution for our needs, particularly when our batch processes require access to data from multiple bounded contexts in real time. Our implementation efficiently handles large volumes of data, significantly improving the performance of our batch jobs. We chose ORDS because it stores data in a familiar table format, accessible via SQL, making it simple and efficient for our teams to use.

The architecture also aligns with our domain-driven design principles by enforcing clear data ownership, where each bounded context maintains responsibility for its own data management. This approach provides us with both scalability and reliability, reducing the risk of a single point of failure.

Amazon Redshift: Powering Vanguard’s ORDS Solution

Amazon Redshift serves as the backbone of our ORDS implementation, offering several key features that support our modernization goals:

Data Sharing

Our solution leveraged the robust data sharing capabilities of Amazon Redshift, available on both server-based Amazon Redshift RA3 instances and Amazon Redshift Serverless. This functionality provided us with instant, secure, and live data access without copies, maintaining transactional consistency across our environment. The flexibility of same-account, cross-account, and cross-Region data sharing has been particularly valuable for our distributed architecture.

High Performance

We’ve achieved significant performance improvements through Amazon Redshift’s efficient query processing and data retrieval capabilities. The system effectively handles our complex data needs while maintaining robust performance across various workloads and data volumes.

Multi-Availability Zone Support

Our implementation benefited from Amazon Redshift’s Multi-AZ support, which maintains high availability and reliability for our critical operations. This feature minimizes downtime without requiring extensive setup and significantly reduces our risk of data loss.

Familiar Interface

The relational environment of Amazon Redshift, similar to traditional databases like Amazon RDS and IBM Db2, has enabled a smooth transition for our teams. This familiarity has accelerated adoption and improved productivity, as our teams can leverage their existing SQL expertise. By centralizing data from multiple business areas in ORDS using Amazon Redshift, we maintain consistent, efficient, and secure data access across our product teams. This setup is particularly valuable for our batch processing that requires data from various parts of the business, offering us a combination of performance, reliability, and ease of use.

Operational Read-only Data Store (ORDS) using Amazon Redshift

Here’s how our ORDS architecture implements Amazon Redshift data sharing to solve these challenges:

Figure 1: Vanguard’s ORDS Architecture using Amazon Redshift Data Sharing

Amazon Redshift Ingestion Pattern:

We used Amazon Redshift’s zero-ETL functionality to integrate data and enable real-time analytics directly on operational data, which helped reduce complexity and maintenance overhead. To complement this capability and to meet our comprehensive compliance requirements that necessitate full transaction replication, we implemented additional data ingestion pipelines.

Our data ingestion strategy for Amazon Redshift employs different AWS services depending on the source. For Amazon Aurora PostgreSQL databases, we use AWS Database Migration Service (AWS DMS) to directly replicate data into Amazon Redshift. For data from Amazon DynamoDB, we use Amazon Kinesis to stream the data into Amazon Redshift, where it lands in materialized views. These views are then further processed to generate tables for end users.
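
To make the "further processed" step more concrete, here is a minimal sketch of the kind of flattening it involves. In the architecture described here, that processing happens inside Amazon Redshift on the materialized views; the Python below only illustrates the logic. It assumes a change record that wraps a DynamoDB-typed item under `dynamodb.NewImage` with an `eventName` field, and the sample item and the helper names `unmarshal` and `to_row` are hypothetical.

```python
import json
from decimal import Decimal


def unmarshal(value: dict):
    """Convert one DynamoDB-typed attribute value into a plain Python value."""
    type_tag, raw = next(iter(value.items()))
    if type_tag == "S":
        return raw
    if type_tag == "N":
        return Decimal(raw)  # numbers arrive as strings in DynamoDB JSON
    if type_tag == "BOOL":
        return raw
    if type_tag == "NULL":
        return None
    if type_tag == "M":
        return {k: unmarshal(v) for k, v in raw.items()}
    if type_tag == "L":
        return [unmarshal(v) for v in raw]
    raise ValueError(f"unsupported DynamoDB type tag: {type_tag}")


def to_row(change_record: str) -> dict:
    """Flatten the new image of a change record into a column-to-value row."""
    record = json.loads(change_record)
    image = record["dynamodb"]["NewImage"]
    row = {name: unmarshal(attr) for name, attr in image.items()}
    row["_event"] = record["eventName"]  # keep the change type for auditing
    return row


# Hypothetical change record for an account item.
sample = json.dumps({
    "eventName": "MODIFY",
    "dynamodb": {
        "Keys": {"account_id": {"S": "A1"}},
        "NewImage": {
            "account_id": {"S": "A1"},
            "cash_balance": {"N": "1250.75"},
            "margin_enabled": {"BOOL": False},
        },
    },
})

row = to_row(sample)
print(row)
```

The result is a flat row whose keys map naturally to table columns, which is the shape batch consumers expect when they query with SQL.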

This approach allows us to efficiently ingest data from our operational data stores while meeting both analytical needs and compliance requirements.

Amazon Redshift Data Sharing:

We used Amazon Redshift’s data sharing feature to effectively decouple our data producers from consumers, allowing each group to operate within their own boundaries while maintaining a unified, simplified, and governed mechanism for data sharing.

Our implementation followed a clear process: once data is ingested and available in Amazon Redshift table format, we created views for consumers to access the data. We then established data shares and granted access to these views to consumer Amazon Redshift data warehouses for batch processing. In our environment with multiple bounded contexts, we’ve established a collaborative model where consumers work with various producer teams to access data from different data shares, each created per bounded context.
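
The producer and consumer steps in this process can be sketched as the SQL each side would run. The Python helpers below only assemble the statements so they can be reviewed or passed to a SQL client. The statement forms follow Amazon Redshift’s data sharing commands for sharing within one AWS account; the share, schema, view, and database names and the namespace placeholders are hypothetical and are not Vanguard’s actual objects.

```python
def producer_statements(share: str, schema: str, views: list[str],
                        consumer_namespace: str) -> list[str]:
    """SQL a producer would run to publish views through a datashare."""
    stmts = [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
    ]
    # Views are added to a datashare with the same ADD TABLE clause as tables.
    stmts += [f"ALTER DATASHARE {share} ADD TABLE {schema}.{view};"
              for view in views]
    stmts.append(
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{consumer_namespace}';"
    )
    return stmts


def consumer_statements(share: str, local_db: str,
                        producer_namespace: str) -> list[str]:
    """SQL a consumer would run to expose the share as a read-only database."""
    return [
        f"CREATE DATABASE {local_db} FROM DATASHARE {share} "
        f"OF NAMESPACE '{producer_namespace}';"
    ]


# Hypothetical names for one bounded context (Account).
producer_sql = producer_statements(
    share="account_share",
    schema="account_views",
    views=["v_accounts", "v_holdings"],
    consumer_namespace="CONSUMER-NAMESPACE-ID",
)
consumer_sql = consumer_statements(
    share="account_share",
    local_db="account_db",
    producer_namespace="PRODUCER-NAMESPACE-ID",
)

for stmt in producer_sql + consumer_sql:
    print(stmt)
```

Creating one share per bounded context keeps ownership with the producing team: each producer decides which views to expose, and a consumer attaches only the shares it has been granted.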

This access remained strictly read-only. When consumers need to update or write new data that falls outside their bounded context, they must use APIs or other designated mechanisms for such operations. This approach has proven effective for our organization, promoting clear data ownership and governance while enabling flexible data access across organizational boundaries. It simplified our data management and made sure each team can operate independently while still sharing data effectively.

Example: Vanguard cross-bounded context use case

Disclaimer: This is provided for reference purposes only and doesn’t represent a real example.

Let’s look at a practical example: our brokerage account statement generation process. This cross-bounded context batch process requires integrating data from multiple sources, accessing hundreds of tables and processing large volumes of data monthly. The challenge was to create an efficient, cost-effective solution that minimizes data replication while maintaining data accessibility. ORDS proved ideal for this use case, because it provides data from multiple bounded contexts without replication, offers near real-time access, and enables simple data aggregation using SQL-like queries in Amazon Redshift.

The following diagram shows how we implemented this solution:

Figure 2: Cross-Bounded Context Example for Brokerage Account Statement Generation

We need the following bounded contexts to generate brokerage statements for millions of our clients.

  1. Account:
    • Details: Includes information about the client’s brokerage accounts, such as account numbers, types, and statuses.
    • Holdings and Positions: Provides current holdings and positions within the account, detailing the securities owned, their quantities, and current market values.
    • Balance Information: Contains the balance information of the account, including cash balances, margin balances, and total account value.
  2. Client Profile:
    • Personal Information: Information about the client, such as their name, date of birth, and Social Security number.
    • Contact Information: Includes the client’s email address, physical address, and phone numbers.
  3. Transaction History:
    • Transaction Records: A comprehensive record of transactions associated with the account, including buys, sells, transfers, and dividends.
    • Transaction Details: Each transaction record includes details such as transaction date, type, quantity, price, and associated fees.
    • Historical Data: Historical data of transactions over time, providing a complete view of the account’s activity.

Through this architecture, we efficiently generate accurate and comprehensive brokerage account statements by consolidating data from these bounded contexts, meeting both our clients’ needs and regulatory requirements.
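
The consolidation across the three bounded contexts can be sketched as a single SQL query. The example below is illustrative only: it uses Python’s built-in SQLite module as a local stand-in for Amazon Redshift, with one attached in-memory database per bounded context. On a Redshift consumer, the shared databases would instead be referenced with three-part database.schema.object names. All table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each attached database stands in for one bounded context's shared database.
for context in ("account", "client_profile", "txn_history"):
    conn.execute(f"ATTACH DATABASE ':memory:' AS {context}")

conn.executescript("""
CREATE TABLE account.accounts (account_id TEXT, client_id TEXT);
CREATE TABLE account.holdings (account_id TEXT, symbol TEXT, market_value REAL);
CREATE TABLE client_profile.clients (client_id TEXT, name TEXT);
CREATE TABLE txn_history.transactions (account_id TEXT, trade_date TEXT, kind TEXT);

INSERT INTO account.accounts VALUES ('A1', 'C1');
INSERT INTO account.holdings VALUES ('A1', 'FUND_A', 1500.0), ('A1', 'FUND_B', 500.0);
INSERT INTO client_profile.clients VALUES ('C1', 'Jane Example');
INSERT INTO txn_history.transactions VALUES
    ('A1', '2025-01-05', 'BUY'),
    ('A1', '2025-01-20', 'DIVIDEND'),
    ('A1', '2025-02-02', 'SELL');
""")

# One row per account: who owns it, what it is worth, and how many
# transactions fall inside the statement period. The aggregates are
# computed in subqueries so holdings and transactions do not multiply
# each other's rows.
statement_sql = """
SELECT c.name,
       a.account_id,
       (SELECT SUM(h.market_value)
          FROM account.holdings AS h
         WHERE h.account_id = a.account_id) AS total_value,
       (SELECT COUNT(*)
          FROM txn_history.transactions AS t
         WHERE t.account_id = a.account_id
           AND t.trade_date BETWEEN ? AND ?) AS txn_count
  FROM account.accounts AS a
  JOIN client_profile.clients AS c ON c.client_id = a.client_id
"""

statement_rows = conn.execute(statement_sql, ("2025-01-01", "2025-01-31")).fetchall()
for name, account_id, total_value, txn_count in statement_rows:
    print(name, account_id, total_value, txn_count)
```

The batch job only reads from each context; any correction to the underlying data would go back through the owning team’s APIs, in line with the read-only rule described earlier.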

Business Outcome

Our journey with the Operational Read-only Data Store (ORDS) and Amazon Redshift has enhanced our client experience (CX) through improved data management and accessibility. By transitioning from our mainframe system to a cloud-based, domain-driven architecture, we have empowered our autonomous teams and established a resilient batch architecture.

This shift facilitates efficient cross-domain data access, maintains high-quality data consistency, and provides scalability. Our ORDS implementation, supported by Amazon Redshift, offers near real-time access to large data volumes, ensuring high performance, reliability, and cost-effectiveness. This modernization effort aligns with our mission to deliver exceptional, personalized client experiences and sustain long-lasting client relationships.

Call to Action

If you’re facing similar challenges with your batch processing systems, we encourage you to explore how an Operational Read-only Data Store (ORDS) can transform your data architecture. Start by assessing your current system’s limitations and identifying opportunities for improvement through domain-driven design and cloud-based solutions. Consider how this approach can help you manage large volumes of data from multiple sources, provide fast access to replicated data for batch processes, and support high-volume reads from various data sources.

Take the next step by conducting a proof of concept (POC) to evaluate ORDS effectiveness in achieving efficient cross-domain data access, improving the performance of batch jobs, and maintaining clear data ownership within your business domains. By implementing this solution, you can enhance your data management capabilities, reduce operational risks, and drive innovation within your organization. Embrace this opportunity to elevate your data architecture and deliver exceptional customer experiences.

Conclusion 

Our transition to a cloud-native, domain-driven architecture with ORDS using Amazon Redshift has successfully transformed our batch processing capabilities in the AWS Cloud. This modernization effort has significantly enhanced the performance, reliability, and scalability of our batch operations while maintaining seamless data access and integration across different business domains.

The strategic adoption of ORDS has harnessed the potential of cross-domain data access in a distributed environment, providing us with a robust solution for real-time data access and efficient batch processing. This transformation has empowered us to better meet the demands of the digital age, delivering superior customer experiences and reinforcing our commitment to innovation in the financial services industry.


About the authors

Malav Shah

Malav is a Domain Architect in Vanguard’s Personal Investor Technology division, with over a decade of experience in cloud-native solutions. He focuses on architecting and designing scalable systems, and contributes hands-on through development and proof-of-concept work. Malav holds several AWS certifications, including AWS Certified Solutions Architect and AWS Certified AI Practitioner.

Timothy Dickens

Timothy is a Senior Architect at Vanguard, specializing in advanced data streaming designs, AI, real-time data access, and analytics. With expertise in AWS services like Redshift, DynamoDB, and Aurora Postgres, Timothy excels in developing robust distributed architectures that drive innovation and efficiency. Passionate about leveraging cutting-edge technologies, Timothy is dedicated to delivering trustworthy, actionable data that empowers confident, timely decision-making.

Priyadharshini Selvaraj

Priyadharshini is a data architect with AWS Professional Services, bringing over a decade of expertise in helping customers navigate their data journeys. She specializes in data migration and modernization projects, focusing on data lakes, data warehouses, and distributed processing using Apache Spark. As an expert in generative AI and agentic architectures, Priyadharshini enables customers to harness cutting-edge AI technologies for business transformation. Beyond her technical pursuits, she practices yoga, plays piano, and enjoys hobby baking, bringing balance to her professional life.

Naresh Rajaram

Naresh is a seasoned Solutions Architect with over 20 years of experience, with a primary focus on cloud computing and artificial intelligence. Specializing in enterprise-scale AI implementations and cloud architecture, he helps customers develop and deploy advanced AI solutions, with particular focus on autonomous AI systems and agent-based architectures. His expertise spans designing cutting-edge AI infrastructures using Amazon Bedrock, Amazon Bedrock AgentCore, and cloud-native AI services, while pioneering work in agentic AI applications and autonomous systems.

© 2025 The Vanguard Group, Inc. All rights reserved.
