VedAlgo Big Data Konnect Toolkit is a software suite that integrates processing in Apache Hadoop with operations in an Operational Data Store (ODS). The toolkit is designed to work seamlessly with Hadoop and a variety of ODS products, and enables fast, bi-directional data transfer between an ODS and the Hadoop Operational Data Lake (HODL). The tools work with existing skill sets, greatly simplifying development of Big Data solutions, and Konnect delivers high-speed connectivity, performance, and security for Big Data applications.
Big Data allows the modern enterprise to process large amounts of structured and unstructured data. Insights gleaned from these solutions let enterprises disrupt competitors, gather actionable intelligence on individual customers, and pursue new kinds of analysis. Hadoop Operational Data Lakes, created by merging unstructured and structured data, are then used for exploratory analysis of raw data. VedAlgo Konnect provides bi-directional data flow, allowing results to be integrated with the ODS for real-time queries, advanced analytics, and complex data management.
Konnect SQL Connector for HDFS allows you to query data in the HODL using SQL. The data is accessed via external tables: data is loaded by selecting from an external table that maps onto files in the HODL and inserting the result into a database table in the ODS. VedAlgo Konnect offers out-of-the-box flexibility in handling data types and Hive partitioning.
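The external-table load pattern above can be sketched roughly as follows. This is an illustration only: `sqlite3` stands in for the ODS, and `hodl_logs_ext` plays the role of an external table mapped onto HODL files; all table and column names are hypothetical.

```python
import sqlite3

# sqlite3 stands in for the ODS here; "hodl_logs_ext" plays the role of an
# external table that the SQL connector would map onto files in the HODL.
# All table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hodl_logs_ext (user_id INTEGER, clicks INTEGER)")
conn.executemany("INSERT INTO hodl_logs_ext VALUES (?, ?)",
                 [(1, 10), (2, 25), (1, 5)])

# Query the "external" table with plain SQL, then materialize the result
# into a regular database table -- the load pattern described above.
conn.execute("""CREATE TABLE user_clicks AS
                SELECT user_id, SUM(clicks) AS total_clicks
                FROM hodl_logs_ext GROUP BY user_id""")

rows = conn.execute(
    "SELECT user_id, total_clicks FROM user_clicks ORDER BY user_id").fetchall()
print(rows)  # [(1, 15), (2, 25)]
```

The point of the pattern is that the Hadoop-resident data is queried with ordinary SQL, with no separate extract step visible to the SQL user.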
Konnect Loader is a high-performance load tool for fast movement of data between the ODS and the HODL. When moving data out of the HODL, Konnect Loader for Hadoop sorts, partitions, and converts data into ODS types on Hadoop before loading it into the ODS. Transforming the data into ODS types reduces database CPU usage during the load, minimizing the impact on database applications and making the connector well suited to continuous and frequent loads. Konnect Loader for Hadoop intelligently distributes data across Hadoop nodes while loading in parallel, minimizing the effects of data skew, a common concern in parallel applications. It can load a wide variety of input formats: text files, Hive tables, log files (parsed and loaded), NoSQL stores, and more. Through Hive it can also load from input formats (e.g., Parquet and JSON files) and input sources (e.g., HBase) accessible to Hive.
In addition, Konnect Loader for Hadoop can read proprietary data formats through custom input format implementations provided by the user.
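The two pre-load steps described above — converting raw fields to typed values and distributing records across nodes — can be sketched with a simple hash partitioner. This is a hypothetical stand-in for the loader's skew-aware distribution, not its actual algorithm; all function and field names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sketch: convert raw text fields to typed values, then
# distribute records across reducer "nodes" by hashing a key -- the two
# steps the loader performs on Hadoop rather than in the database.
def to_typed(record):
    """Convert a raw CSV line to typed (id, amount) values."""
    key, amount = record.split(",")
    return int(key), float(amount)

def partition(records, num_nodes):
    """Hash-partition typed records across nodes."""
    buckets = defaultdict(list)
    for rec in records:
        key, amount = to_typed(rec)
        buckets[hash(key) % num_nodes].append((key, amount))
    return buckets

raw = ["1,9.50", "2,3.25", "1,1.00", "3,7.75"]
buckets = partition(raw, num_nodes=2)
print(sum(len(v) for v in buckets.values()))  # all 4 records assigned
```

Because the type conversion happens on the Hadoop side, the database receives ready-to-insert values and spends no CPU parsing text during the load.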
VedAlgo R-LYTICS for Hadoop runs R code in Hadoop for scalable analytics. The connector hides complexities of Hadoop-based computing from the R user, executing R code in parallel in Hadoop from stand-alone desktop applications developed in any IDE the R user chooses.
R-LYTICS for Hadoop enables faster insights with a rich collection of scalable, high-performance, parallel implementations of common statistical and predictive techniques in Hadoop, without requiring data movement to any other platform. It supports rapid development with R-style debugging of parallel R code on user desktops, simulating parallelism under the covers. The connector enables analysts to combine data from several environments (client desktop, HDFS, Hive, HODL, and in-memory R data structures) in the context of a single analytic task, greatly simplifying data assembly and preparation.
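The map-reduce decomposition that such connectors parallelize can be illustrated with a small sketch (shown in Python rather than R for consistency with the other examples here). Each mapper emits a partial (sum, count) for its data split and a reducer combines them into a global mean; running the mappers sequentially, as below, mirrors the desktop "simulated parallelism" debugging mode described above.

```python
# Hedged illustration of the map-reduce decomposition: each mapper emits a
# partial (sum, count) for its split; the reducer combines them into a
# global mean. Run sequentially here, mimicking desktop-side simulation.
def mapper(chunk):
    return sum(chunk), len(chunk)

def reducer(partials):
    partials = list(partials)
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

splits = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]   # stand-ins for HDFS blocks
mean = reducer(mapper(chunk) for chunk in splits)
print(mean)  # 3.5
```

The same mapper/reducer pair runs unchanged whether the splits live on a laptop or across a Hadoop cluster, which is what lets the analyst debug locally and deploy without rewriting.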
VedAlgo Konnect Enterprise has a comprehensive set of knowledge modules for Hadoop. The modules allow complex bi-directional data transformation operations to be executed through a familiar graphical interface, greatly simplifying the running of jobs in Hadoop. Data movement from one source to another can be defined graphically (e.g., from ODS to HODL, or HODL to Hive). Knowledge modules for Konnect Loader for Hadoop and Konnect SQL Connector provide high-speed data movement. Konnect Enterprise eliminates the need to write complex code for Hadoop applications.
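The declarative style a knowledge module encodes can be sketched as a data-movement definition interpreted by a small engine — the user describes source, target, and column transforms rather than writing job code. Everything below (dataset names, column names, the `run_mapping` helper) is hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical sketch of a declarative mapping: it names a source, a
# target, and per-column transforms, and a tiny interpreter executes it --
# no hand-written Hadoop job code involved.
mapping = {
    "source": "hodl.web_logs",        # hypothetical HODL dataset
    "target": "ods.web_summary",      # hypothetical ODS table
    "columns": {"user": str.upper, "hits": int},
}

def run_mapping(mapping, rows):
    """Apply the mapping's column transforms to each source row."""
    transforms = mapping["columns"]
    return [{col: fn(row[col]) for col, fn in transforms.items()}
            for row in rows]

rows = [{"user": "alice", "hits": "3"}, {"user": "bob", "hits": "7"}]
out = run_mapping(mapping, rows)
print(out)  # [{'user': 'ALICE', 'hits': 3}, {'user': 'BOB', 'hits': 7}]
```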