Existing BI architectures are not designed to deliver real-time insights. With the growing adoption of IoT, computations and visualizations need to be performed on streaming data (for example, data arriving at intervals of 3 or 5 seconds) in order to derive value from it.
To analyse streaming data, we first need to publish it to a messaging system such as Kafka. All the data cleaning and data processing steps are then performed with the Spark engine. We can also implement real-time scoring of predictive models using Spark's machine learning library (MLlib).
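As a rough illustration of the processing step, the sketch below groups timestamped readings into fixed 3-second windows and averages each window. It uses plain Python rather than Spark so that it runs on its own; in a real pipeline this logic would sit in the Spark job. The function name `tumbling_window_averages` and the sample readings are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def tumbling_window_averages(events, window_seconds=3):
    """Group (timestamp, value) events into fixed windows and average each one.

    events: iterable of (seconds, numeric_value) pairs, in any order.
    Returns a dict mapping each window's start time to the mean value in it.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in events:
        # Round the timestamp down to the start of the window it falls in.
        window_start = int(timestamp // window_seconds) * window_seconds
        sums[window_start] += value
        counts[window_start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


# Hypothetical sensor readings: (seconds since start, temperature).
readings = [(0.5, 20.0), (1.2, 22.0), (3.1, 30.0), (4.9, 32.0), (6.0, 25.0)]
averages = tumbling_window_averages(readings, window_seconds=3)
print(averages)
```

Changing `window_seconds` to 5 gives the 5-second interval instead. A streaming engine does the same grouping incrementally as events arrive, rather than over a finished list.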
All the processed data is then stored in Elasticsearch and visualized with Kibana or Grafana.
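One common way to load processed records into Elasticsearch is its Bulk API, which accepts newline-delimited JSON: an action line followed by the document itself, with a trailing newline at the end of the body. The sketch below only builds that request body and does not contact a cluster. The index name `sensor-averages`, the field names and the helper `to_bulk_payload` are assumptions made for illustration.

```python
import json


def to_bulk_payload(records, index_name):
    """Build a newline-delimited JSON body for the Elasticsearch Bulk API.

    Each record becomes two lines: an "index" action naming the target
    index, then the document source. Returns "" when there are no records.
    """
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(record))
    if not lines:
        return ""
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical output of the processing step.
processed = [
    {"sensor_id": "s1", "window_start": 0, "avg_temp": 21.0},
    {"sensor_id": "s1", "window_start": 3, "avg_temp": 31.0},
]
payload = to_bulk_payload(processed, "sensor-averages")
print(payload)
```

The resulting string would be sent as the body of a POST request to the cluster's `_bulk` endpoint with the content type `application/x-ndjson`. Once indexed, the documents can be charted in Kibana or Grafana.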
Kafka: Kafka is a distributed publish-subscribe messaging system used for building real-time data pipelines and streaming apps.
Elasticsearch: Elasticsearch is a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyse big volumes of data quickly and in near real time.
Kibana: “Kibana makes it easy to understand large volumes of data. Its simple, browser-based interface enables you to quickly create and share dynamic dashboards that display changes to Elasticsearch queries in real time.”
Apache Spark: “Apache Spark is a fast and general-purpose cluster computing system. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming.”