UCSY's Research Repository

Developing Scalable and Lightweight Data Stream Ingestion Framework for Stream Processing

dc.contributor.author Hlaing, Nwe Ni
dc.contributor.author Nyunt, Thi Thi Soe
dc.date.accessioned 2022-06-21T05:49:37Z
dc.date.available 2022-06-21T05:49:37Z
dc.date.issued 2021-02-25
dc.identifier.uri https://onlineresource.ucsy.edu.mm/handle/123456789/2650
dc.description.abstract With advances in technology, enormous amounts of data are generated continuously from social media, IoT devices, the web, and other sources, ushering in the big data era. Many researchers are focusing on processing massive, fast-arriving data streams to extract valuable information in real time or to make immediate decisions. The data ingestion stage is an important part of a data stream processing system: it is responsible for collecting data from different locations and delivering it for processing. The most important requirements of data ingestion are low latency, high throughput, and scalability across many data producers and consumers, and it can influence the performance of the entire stream processing system. In big data stream computing, the speed at which data is created and the explosive growth of data lead to new challenges. One challenge is to accurately ingest different data streams into a processing platform or data storage platform. Existing data stream ingestion systems use a combination of Apache NiFi and Kafka. Apache NiFi is used for collecting and preprocessing structured and unstructured data feeds, and Kafka is used for message distribution. However, processors such as MergeRecord in NiFi can be memory-, I/O-, and CPU-intensive. As a result, processing massive data streams created at high speed can lead to heavy memory use, input/output bottlenecks, or central processing unit (CPU) bottlenecks. This degrades the performance of the stream processing layer and is not appropriate for time-sensitive applications. In this paper, we propose using a combination of StreamSets Data Collector and Kafka to collect and transform structured and unstructured feeds from various sources. en_US
dc.language.iso en_US en_US
dc.publisher ICCA en_US
dc.subject StreamSets Data Collector, Stream Ingestion, Apache Kafka, Big Data en_US
dc.title Developing Scalable and Lightweight Data Stream Ingestion Framework for Stream Processing en_US
dc.type Presentation en_US

