"Yakhont-BI" is an intelligent software ETL platform for processing data and building structured data stores. The growing volume of information received daily from external sources and the need for common encoding and classification tools call for specialized solutions: customized server ETL platforms. Modern information systems have to operate in a landscape of databases, databanks, and unstructured text arrays. Continuous information integration and fusion projects aim not only to organize end-to-end business processes but also to ensure common formats for accumulating, encoding, and classifying all types of accumulated data. Such encoding and classification, applied when filling corporate data warehouses, are key factors that determine how efficiently a company's resources are used and whether new costs are justified when multiple information systems and databases are in use.
With its "Yakhont-BI" complex, NORSI-TRANS offers a new approach to developing ETL integration platforms. "Yakhont-BI" can serve as the basis for the following intelligent ETL data processing solutions:
- Personal instrumental solutions and turnkey implementation;
- Corporate Intranet cloud;
- On-Demand SaaS Internet platform;
- Intelligent OSS platform of a communication operator.
Main design features:
- A dedicated cross-platform server ETL engine that uses the full capabilities of modern multicore processors and multiprocessor servers;
- Parallel processing of a single data array on several servers simultaneously;
- Flexible organization of the structure of connected computing capacities (coordinators, control units) for distributing processing tasks;
- User access to computing resources through a Web 2.0 thin client, which enables:
  - Interactive creation of data collection, cleaning, normalization, and loading diagrams;
  - Definition of the structure of computing resources and dynamic attachment of additional capacities;
  - Scheduling of the created ETL tasks on the connected computing capacities.
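The idea of processing one data array in parallel can be illustrated with a minimal Python sketch. It is not the "Yakhont-BI" engine (which the text describes as a C++ engine running across servers); local threads stand in for processing units here, and the names `split_into_chunks`, `clean_chunk`, and `process_in_parallel` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, n_chunks):
    """Split a list into at most n_chunks contiguous, near-equal parts."""
    size, extra = divmod(len(records), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(records[start:end])
        start = end
    return chunks


def clean_chunk(chunk):
    """Hypothetical cleaning step: trim whitespace and drop empty values."""
    return [value.strip() for value in chunk if value.strip()]


def process_in_parallel(records, n_workers=4):
    """Clean one data array on several workers and merge results in order."""
    chunks = split_into_chunks(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        cleaned = pool.map(clean_chunk, chunks)  # map preserves chunk order
        return [value for chunk in cleaned for value in chunk]
```

Because each chunk is cleaned independently, adding workers only changes how the array is split, not the result; a distributed deployment would send the chunks to separate servers instead of threads.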
Examples of the Web 2.0 OWS interface during creation of a data processing scheme are shown in the figures below:
Generalized structural diagram of the "Yakhont-BI" intelligent application platform
Main components of "Yakhont-BI" software:
- Processing units that execute ETL data processing (a dedicated C++ engine for parallel data processing that makes full use of multicore processors, plus a control subsystem);
- Coordinators, which register processing units and distribute the data cleaning tasks sent to a coordinator among the connected units on which the respective engines are running; the interface for communicating with a coordinator is HTTP REST;
- Control units, which store operator-created ETL data cleaning schemes and information about the coordinators, users, etc. connected to them, and support operation of the Web 2.0 user OWS;
- The Web 2.0 user OWS, used to create schemes for data cleaning, normalization, etc. An operator sends created and saved processing schemes for execution to selected coordinators, specifying which units connected to each coordinator should execute the processing; compiles a schedule and defines the ETL task to be run according to it; controls execution of started ETL processing tasks; and monitors all computing hardware and software.
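The relationship between coordinators and processing units can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Coordinator` class, its method names, and the round-robin policy are hypothetical, and the real interface to a coordinator is HTTP REST, which is not modeled here.

```python
from collections import defaultdict
from itertools import cycle


class Coordinator:
    """Hypothetical in-process model of a coordinator and its units."""

    def __init__(self, name):
        self.name = name
        self.units = []  # identifiers of registered processing units

    def register_unit(self, unit_id):
        """Record a processing unit so that tasks can be sent to it."""
        if unit_id not in self.units:
            self.units.append(unit_id)

    def distribute(self, tasks, selected_units=None):
        """Assign tasks round-robin to the registered (or selected) units."""
        targets = [
            unit for unit in self.units
            if selected_units is None or unit in selected_units
        ]
        if not targets:
            raise ValueError("no registered units available for the tasks")
        assignment = defaultdict(list)
        for task, unit in zip(tasks, cycle(targets)):
            assignment[unit].append(task)
        return dict(assignment)
```

The optional `selected_units` argument mirrors the operator's ability to specify which units connected to a coordinator should execute the processing; round-robin is just one simple assumed distribution policy.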
When used to create 'private' Intranet cloud ETL environments, "Yakhont-BI" makes it possible to:
- Share resources within the data processing center;
- Use less physical equipment through 100% utilization of the existing equipment;
- Increase data processing efficiency by orders of magnitude, reduce the time of ETL operations, and offload information systems that would otherwise each have to perform data cleaning and loading independently (usually by means of slow DBMS operations);
- Reduce the cost of purchasing licenses for other ETL tools for various information systems by centralizing all such tasks in "Yakhont-BI".
When used to create On-Demand SaaS Internet platforms, "Yakhont-BI" makes it possible to:
- Completely eliminate the cost of procuring server equipment and expensive licenses for boxed ETL products by purchasing a connection to the "Yakhont-BI" Internet SaaS platform as a service;
- Switch organizations to purchasing ETL data processing services that are paid for only as needed, per the SaaS model;
- Scale the capacities engaged in processing, since "Yakhont-BI" can run on virtual machines leased from PaaS service providers; the customer pays only for the actual need for data cleaning, normalization, and processing and, as a result, achieves significant savings over the entire period of operation of its own databases and information systems.
When used to create intelligent OSS platforms, "Yakhont-BI" makes it possible to:
- Perform primary decoding (the complex includes a number of decoders for raw CDR files), cleaning, and normalization of CDR data in real time, thanks to massively parallel processing of data on server computing capacities;
- Process received data according to the specified transformation schemes and transfer the information to the operator's various information systems and databases (including loading it into prepaid and billing systems, FMS systems, and Revenue Assurance systems);
- Provide centralized control and monitoring of all incoming information flows and their transfer to the operator's information systems, and detect erroneous operation of communication network switching equipment.
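As a rough illustration of CDR cleaning and normalization, the Python sketch below handles an already decoded text record. The record layout (`caller;callee;start time;duration`) is an assumption made for this example only; real CDR formats are vendor-specific, and the function names are hypothetical rather than part of "Yakhont-BI".

```python
from datetime import datetime


def normalize_number(raw):
    """Keep only the digits of a phone number."""
    return "".join(ch for ch in raw if ch.isdigit())


def normalize_cdr(line):
    """Return a normalized record dict, or None if the line is malformed."""
    fields = [field.strip() for field in line.split(";")]
    if len(fields) != 4:  # assumed layout: caller;callee;start;duration
        return None
    caller, callee, started, duration = fields
    try:
        record = {
            "caller": normalize_number(caller),
            "callee": normalize_number(callee),
            "started": datetime.strptime(started, "%Y-%m-%d %H:%M:%S").isoformat(),
            "duration_s": int(duration),
        }
    except ValueError:  # unparsable timestamp or duration
        return None
    if not record["caller"] or not record["callee"] or record["duration_s"] < 0:
        return None
    return record


def normalize_stream(lines):
    """Split incoming lines into normalized records and rejected raw lines."""
    good, rejected = [], []
    for line in lines:
        record = normalize_cdr(line)
        if record is None:
            rejected.append(line)
        else:
            good.append(record)
    return good, rejected
```

Keeping the rejected lines separately, rather than silently dropping them, is one simple way to support the kind of monitoring of erroneous input described above.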
Personal instrumental solutions and turnkey implementation
The "Yakhont-BI" complex is also offered as a standard boxed product in the form of installation kits (Linux/Windows) that the customer can deploy, put into operation, and configure independently to match the required performance of ETL operations. In general, the complex provides the following capabilities for ETL processing and normalization of heterogeneous information:
- All "Yakhont-BI" application components are fully cross-platform and ensure maximum utilization (loading) of computing capacities;
- Dozens of processing, cleaning, and normalization operations;
- Special data cleaning operations;
- Connectivity to network file storage and database servers (as data sources, stores, and users);
- Caching of reference tables used during data processing in RAM, which allows the corresponding processing to be performed "on the fly" without additional database queries;
- Linear performance scaling through addition of computing capacities engaged in data processing;
- Data processing rates hundreds of times higher than those of conventional ETL tools, owing to in-memory processing technologies;
- The HTTP REST API used for interaction with coordinators and control units makes it possible to build specialized external complexes that connect to the "Yakhont-BI" ETL data processing environment.
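The reference-table caching listed above can be pictured with a small Python sketch: the table is read from the database once, held in RAM, and every later lookup is a dictionary access. The table name `region_codes`, its columns, and the functions are hypothetical, and an in-memory SQLite database stands in for a real database server.

```python
import sqlite3


def load_reference_table(conn):
    """Read the whole reference table once and keep it in a dict in RAM."""
    rows = conn.execute("SELECT prefix, region FROM region_codes")
    return {prefix: region for prefix, region in rows}


def enrich(records, reference):
    """Attach a region to each record using only the in-memory table."""
    return [
        {**record, "region": reference.get(record["number"][:3], "unknown")}
        for record in records
    ]


def demo():
    conn = sqlite3.connect(":memory:")  # stand-in for a real database server
    conn.execute("CREATE TABLE region_codes (prefix TEXT PRIMARY KEY, region TEXT)")
    conn.executemany(
        "INSERT INTO region_codes VALUES (?, ?)",
        [("495", "Moscow"), ("812", "Saint Petersburg")],
    )
    reference = load_reference_table(conn)
    conn.close()  # the lookups below no longer touch the database
    return enrich([{"number": "4951234567"}, {"number": "9990000000"}], reference)
```

Closing the connection before calling `enrich` makes the point explicit: once the table is cached, per-record processing needs no further database access.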