Digital Intelligent Software Solutions


LeapHD Big Data Platform

LeapHD helps enterprises quickly build a unified data lake or data hub. It supports integration of internal and external data, centralized storage of massive datasets, parallel processing of large-scale computations, unified management of computing resources, and efficient data analysis and mining. On top of the platform, users can build their own analysis and mining applications.

 

Main functions

 

Cluster Management

Cluster Management (Manager) is the automated operation and maintenance tool for Lenovo's big data platform. It provides guided, automatic installation of the platform in different scenarios and delivers effective monitoring and visual management of host resources and host services. Supported components include HDFS, YARN, MapReduce, HBase, Hive, Spark, Storm, ZooKeeper, and others, enabling intelligent operation and maintenance of the entire big data platform.

 

Data development and task scheduling

Data Development and Task Scheduling (TaskScheduler) is an efficient, graphical workflow configuration and execution management platform with visual construction of big data computing tasks. By encapsulating the complexity of the underlying technology and providing visual operations on computing modules such as SQL scripts, MapReduce, Spark, Scala, Shell scripts, MySQL, Oracle, and data import/export, it lets developers focus on the computation itself rather than on low-level technical details.
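TaskScheduler's internal API is not public, so the following is only a minimal sketch of the core idea behind workflow scheduling: tasks form a dependency graph and run in topological order. The task names and the `run` helper are illustrative assumptions, not TaskScheduler's actual interface.

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

results = []

def run(name):
    # Stand-in for submitting a real SQL/Spark/Shell task.
    results.append(name)

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "import_orders": set(),             # data import (e.g. from MySQL)
    "clean_orders": {"import_orders"},  # SQL/Spark cleaning step
    "aggregate": {"clean_orders"},      # Spark/MapReduce aggregation
    "export_report": {"aggregate"},     # export the final result
}

# Execute tasks in dependency order, as a scheduler would.
for task in TopologicalSorter(workflow).static_order():
    run(task)

print(results)
```

Because the example workflow is a linear chain, the execution order is fully determined by the dependencies; a real scheduler would additionally run independent branches in parallel.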

 

Data computing storage

Based on the Hadoop open-source ecosystem, the big data platform introduces a variety of core functions and components, highly integrating and optimizing the performance of complex open-source technologies. On top of the distributed storage system, it establishes a unified resource scheduling and management system that efficiently supports large-scale batch processing, interactive query, stream computing, and other computing engines.

 

Data Catalog

Data Catalog is a data management tool for the big data platform. It manages the metadata owned by an enterprise, supports both business-view and physical-view data management, and lets users view basic metadata, data location, data lineage, and data impact analysis, as well as manage the data life cycle.
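Impact analysis over lineage amounts to a downstream reachability query on the lineage graph. As a rough sketch (the table names and the flat edge list below are illustrative assumptions, not the catalog's actual storage format):

```python
from collections import deque

# Hypothetical lineage records: (source table -> derived table).
lineage = [
    ("ods.orders", "dwd.orders_clean"),
    ("dwd.orders_clean", "dws.daily_sales"),
    ("dws.daily_sales", "ads.sales_report"),
    ("dwd.orders_clean", "ads.customer_360"),
]

def impact(table):
    """Return every downstream table affected by a change to `table`."""
    children = {}
    for src, dst in lineage:
        children.setdefault(src, []).append(dst)
    seen, queue = set(), deque([table])
    while queue:  # breadth-first walk of the downstream graph
        for dst in children.get(queue.popleft(), []):
            if dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return sorted(seen)

affected = impact("dwd.orders_clean")
print(affected)
```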

 

Data Integration

Data Integration (DataHub) is a data transmission tool for the big data platform. With DataHub, data from different channels, platforms, and formats can be aggregated into Hive, HBase, or HDFS. DataHub includes modules such as graphical ETL construction, migration task management, and migration run instances.
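A migration task is, at its core, an extract-transform-load run. The following minimal sketch uses an in-memory SQLite table as a local stand-in for a Hive/HBase/HDFS target; the column names and cleaning rule are illustrative assumptions, not DataHub's actual configuration.

```python
import csv
import io
import sqlite3

# Extract: source data, here an inline CSV standing in for an upload.
raw = "id,amount\n1,10.5\n2,\n3,7.0\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# Transform: drop rows with a missing amount and cast types.
clean = [(int(r["id"]), float(r["amount"])) for r in rows if r["amount"]]

# Load: write the cleaned rows into the target table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
db.executemany("INSERT INTO orders VALUES (?, ?)", clean)

total = db.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(total)
```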

 

SQL Query Analyzer

SQL Query Analyzer (SQL Editor) is an online query system built on the big data platform. With SQL Editor, users do not need to master complex big data development technologies: anyone familiar with SQL syntax can query massive datasets much as they would a relational database, and obtain intuitive query results in a visual form.
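To illustrate the kind of ad-hoc SQL a user would submit, the sketch below runs a standard aggregation query against an in-memory SQLite database standing in for the platform's Hive tables (the table and data are invented for the example):

```python
import sqlite3

# Local stand-in for a Hive table the user would query through SQL Editor.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
db.executemany("INSERT INTO sales VALUES (?, ?)",
               [("north", 100.0), ("south", 250.0), ("north", 50.0)])

# Ordinary SQL -- no big data APIs required of the user.
query = """
SELECT region, SUM(amount) AS total
FROM sales
GROUP BY region
ORDER BY total DESC
"""
top = db.execute(query).fetchall()
print(top)
```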

 

Data Quality

Data Quality is a data quality management tool for the big data platform that enables rapid identification, repair, and monitoring of data quality issues in an enterprise's business applications. It supports unified maintenance of enterprise data standards and quality-rule libraries, simple and easy-to-operate data audit configuration, complete graphical quality analysis, preview and download of problem data, and a flexible alert mechanism.
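A quality-rule library boils down to named predicates applied to each record, with violations collected for review. The rule names, fields, and sample records below are illustrative assumptions, not Data Quality's actual rule format:

```python
# Sample records to audit (invented for the example).
records = [
    {"id": 1, "email": "a@example.com", "age": 34},
    {"id": 2, "email": "", "age": 41},
    {"id": 3, "email": "c@example.com", "age": -5},
]

# Hypothetical rule library: name -> predicate a valid record satisfies.
rules = {
    "email_not_empty": lambda r: bool(r["email"]),
    "age_in_range": lambda r: 0 <= r["age"] <= 130,
}

# Audit: collect the ids of records violating each rule,
# the raw material for problem-data preview and alerting.
violations = {
    name: [r["id"] for r in records if not check(r)]
    for name, check in rules.items()
}
print(violations)
```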

 

System Management

System Management (System Admin) adopts a multi-tenant design, opening up data capacity on demand and under control. It provides multi-tenant database and table resource management and permission assignment, as well as project-based storage and computing resource allocation, usage monitoring, and billing services.