Traveler data ingestion platform modernization for the world’s leading health and security services company
About the Company
The client is the world’s leading health and security services firm, with nearly two-thirds of the Fortune Global 500 companies as its clients. The company provides multi-cultural health, security, and logistics services and support from over 1,000 locations in 85 countries.
On its journey to develop competitive products and services, the company wanted to design and build a modern data ingestion capability and revamp its legacy ETL, data storage, and data consumption platforms. It needed a partner to help design and develop a next-generation data ingestion platform with a decision engine that would serve as the enterprise platform for the entire company.
- One of the goals was to design and develop a decision engine by consolidating and streamlining content ingest tools, processes, and operations while ensuring data integrity and quality.
- The company’s customer data was provided using multiple ingress protocols, and it had to be ingested into a cloud-native enterprise data lake.
- The company was looking to design and integrate high-performance APIs for ingestion and consumption by analytics and visualization tools, third-party integrations, and web and mobile applications.
- The enterprise clients’ employee data, provided over multiple ingress protocols, had to be ingested into a cloud-native enterprise data lake and made available to the existing core applications, including the traveler management applications, the custom user provider for the single sign-on system, and the API-based platform exposed for consumption.
- Provide secure data upload mechanisms for end clients using a single sign-on feature.
- Provide a multi-layered safe encrypted mechanism for clients to upload data.
- Implement a configurable rules-based engine to the data ingestion pattern.
- Process a variety of file formats, record structures, data types, and sizes.
- Data to be available for existing application consumption with minimal effort.
- Serve as the enterprise platform and gold source of the employee database for the entire company.
- Scale to support a growing number of clients and the new initiatives the organization planned due to the pandemic.
- 24x7x365 availability of data with privacy and security compliance.
What we did
AWS Lambda Data Migration
Data Governance Model
Data Catalog, Data Integration, and Ingestion
Enterprise Data Lake
Cloud Managed Services
Our client was poised to embrace the public cloud for the first time with the help of CompuGain experts. AWS was the recommended data ingestion platform for flexibility, reliability, and scalability.
The solution architecture the CompuGain team provided had the following distinct subsystems, leveraging a breadth of AWS services:
- Ingress mechanism – Secure API, SFTP
- Data Pipeline – Serverless Data Ingestion pipeline
- Data storage – Elasticsearch, Cloud-Native Data Lake, and Application database consumption
AWS Lambda was used extensively to build several critical subsystems of the Enterprise Data Ingestion platform. The fully managed serverless technologies provided out-of-the-box capabilities for the non-functional requirements and allowed the engineering teams to focus on the business problem at hand.
- Our team provided services across the SDLC, including architecture, design, development, test, and deployment activities.
- Designed and set up multi-region (US NorthEast Region & AWS France region) and multi-availability zone infrastructure to ensure data security, availability, and adherence to GDPR compliance.
- Designed and delivered an AWS Lambda-based custom authorizer for the AWS managed SFTP service to provide secure file transfer. The custom authorizer integrates with the single sign-on customer database held in OKTA.
- Delivered AWS API Gateway backed by AWS Lambda to build the REST APIs needed for the solution. These secure and scalable REST APIs support activities including customer onboarding and customer verification.
- AWS Lambda and serverless AWS Glue were used to build data ingestion into the AWS RDS database; RDS Proxy was also evaluated.
- AWS Lambda was also used to add an extra layer of application-level security: a strong public-private key mechanism with in-place decryption inside the data pipeline.
- At the heart of the data ingestion platform lies a complex state machine based on business rules that are changing continuously. CompuGain implemented a complex, asynchronous, and configurable Multi-step Business Rules Engine carefully orchestrated using multiple AWS Lambdas.
- Provided a scalable and reliable application database, migrated from the legacy SQL Server, for existing applications to consume through API platforms.
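The configurable multi-step rules engine described above can be illustrated with a small sketch. This is not CompuGain's implementation: the `Rule`, `Result`, and `run_rules` names, the state labels, and the sample rules are all hypothetical, and the steps run synchronously in one process rather than as separate asynchronous AWS Lambda functions. It only shows the idea of keeping rules as data so that new ones can be added without changing the engine.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Rule:
    """One configurable check: a name, a predicate, and a failure message."""
    name: str
    check: Callable[[Record], bool]
    message: str


@dataclass
class Result:
    """Outcome of running a record through the steps (state names are hypothetical)."""
    record: Record
    state: str = "RECEIVED"
    errors: List[str] = field(default_factory=list)


def run_rules(record: Record, steps: List[List[Rule]]) -> Result:
    """Run the steps in order and stop at the first step with any failing rule."""
    result = Result(record=record)
    for step in steps:
        failures = [rule.message for rule in step if not rule.check(record)]
        if failures:
            result.state = "REJECTED"
            result.errors = failures
            return result
    result.state = "ACCEPTED"
    return result


# Hypothetical rules for an employee record; real rules would come from configuration.
STEPS = [
    [Rule("has_employee_id", lambda r: bool(r.get("employee_id")), "employee_id is required")],
    [
        Rule("has_email", lambda r: "@" in str(r.get("email", "")), "email is invalid"),
        Rule("has_country", lambda r: len(str(r.get("country", ""))) == 2,
             "country must be a 2-letter code"),
    ],
]

if __name__ == "__main__":
    good = {"employee_id": "E1", "email": "a@example.com", "country": "FR"}
    bad = {"employee_id": "E2", "email": "bad", "country": "France"}
    print(run_rules(good, STEPS).state)
    print(run_rules(bad, STEPS).errors)
```

Because each step is just a list of `Rule` objects, a new business rule is a one-line addition to `STEPS`; in a serverless deployment each step could plausibly map to its own function.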
Result and Outcomes
With the new architecture, the client was able to embrace the AWS cloud, establish multi-region cloud landing zones, and develop and productize a solution for employees and customers in less than four months.
- Improvement in the quality of client employee data ingested for onboarding 4500 enterprise clients
- Compliance for the architecture and design
- Adherence to data privacy, security, and compliance requirements
- Cloud-Native digital transformation foundation set up for the Enterprise using cutting-edge managed serverless technologies.
- Quick introduction of new rules based on business demands through Modular Business Rules Engine.
- Available across the world according to geographical compliance needs like GDPR.
- 24x7x365 availability of data with automatic backup.
- Enabled multiple parallel data ingestion across clients and ingress protocols with high availability
- Custom database for onboarding and authenticating clients through OKTA, serving all the applications
- Consolidated and streamlined content ingest tools, processes, and operations
- Multi-layered encryption mechanism enabled secure data uploads for clients
- Single sign-on enabled SFTP mechanism integrated with existing customer OKTA database
- Ability to design and integrate high-performance APIs for ingestion and consumption
- Improved geospatial support through PostgreSQL 11 and PostGIS
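As an illustration of the single sign-on enabled SFTP mechanism listed above, the following is a minimal sketch of a Lambda-style authorizer. It assumes the managed SFTP service is AWS Transfer Family configured with a Lambda custom identity provider, which the case study does not state explicitly. The role ARN, the bucket name, and the `verify_with_sso` and `build_handler` helpers are hypothetical, and the call to OKTA is left as a stub.

```python
import re
from typing import Any, Callable, Dict

# Hypothetical values: replace with a real IAM role ARN and S3 bucket name.
TRANSFER_ROLE_ARN = "arn:aws:iam::123456789012:role/example-sftp-access"
UPLOAD_BUCKET = "example-ingest-bucket"


def verify_with_sso(username: str, password: str) -> bool:
    """Placeholder for the identity provider check (OKTA in the case study).

    This stub rejects every login; a real implementation would call the
    provider's authentication API over HTTPS.
    """
    return False


def build_handler(verify: Callable[[str, str], bool]):
    """Return a Lambda handler that authenticates SFTP logins through `verify`."""

    def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
        username = event.get("username", "")
        password = event.get("password", "")
        # Reject names that could escape the per-user home directory.
        safe_name = bool(re.fullmatch(r"[A-Za-z0-9_.@-]+", username)) and ".." not in username
        # An empty response signals that authentication failed.
        if not safe_name or not password or not verify(username, password):
            return {}
        return {
            "Role": TRANSFER_ROLE_ARN,
            "HomeDirectory": f"/{UPLOAD_BUCKET}/{username}",
        }

    return handler


# Entry point a Lambda function would be configured with.
lambda_handler = build_handler(verify_with_sso)
```

Passing the verifier in as a parameter keeps the handler testable without network access; the identity provider call can be swapped in once credentials and endpoints are known.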