|
MACHINE LEARNING SOLUTION FOR IOT BIG DATA
|
|
|
K. Dineva;T. Atanasova
|
|
|
||
|
|
|
|
1314-2704
|
|
|
||
|
English
|
|
|
20
|
|
|
2.1
|
|
|
|
|
|
||
|
The ability to fetch large volumes of heterogeneous data quickly and reliably, and to apply Machine Learning (ML) models to that data, is critical for better decision making. Successful processing of data streams is crucial for real-time operations such as extracting, filtering, transforming, aggregating with other data sources, persisting data to data warehouses, and publishing to different messaging topics or pipelines. As Machine Learning grows in popularity, serious and well-founded concerns are emerging about the performance of ML models in production. It is essential to choose the right technologies for creating robust data pipelines, deploying accurate ML models, and monitoring their performance in production environments. This paper proposes an approach for building a distributed platform, based on a messaging system, that is capable of extracting, processing, and analyzing information from streaming data in real time. Kafka streaming concepts for ingesting data are discussed, along with ways to operationalize the data pipelines. The use of Spark Structured Streaming to enrich Kafka events with a Machine Learning algorithm is shown: as streaming data continues to arrive, the Spark engine reacts to the changes and processes the data incrementally and continuously. The factors that have a strong impact on the accuracy and performance of deployed ML models in a production environment are discussed at a conceptual level. The overall improved result can later be used to draw sound conclusions and make better predictions. |
|
|
conference
|
|
|
||
|
||
|
20th International Multidisciplinary Scientific GeoConference SGEM 2020
|
|
|
20th International Multidisciplinary Scientific GeoConference SGEM 2020, 18 - 24 August, 2020
|
|
|
Proceedings Paper
|
|
|
STEF92 Technology
|
|
|
International Multidisciplinary Scientific GeoConference-SGEM
|
|
|
SWS Scholarly Society; Acad Sci Czech Republ; Latvian Acad Sci; Polish Acad Sci; Russian Acad Sci; Serbian Acad Sci & Arts; Natl Acad Sci Ukraine; Natl Acad Sci Armenia; Sci Council Japan; European Acad Sci, Arts & Letters; Acad Fine Arts Zagreb Croatia; C
|
|
|
207-214
|
|
|
18 - 24 August, 2020
|
|
|
website
|
|
|
cdrom
|
|
|
6988
|
|
|
Big Data; Machine Learning Model Performance; IoT
|
|