Near-real time statistics gathered from a continuous and voluminous data mutation stream

Lavooij, K. (2010) Near-real time statistics gathered from a continuous and voluminous data mutation stream.

Abstract: The amount of digital data is growing fast [1]. Given the amount of information available, providing that information as a service is not enough [2]. To support users in finding information, supporting systems have been developed that extract specific information from a large amount of stored data. Finding or extracting interesting information is at least as important as providing the original data. The “collective intelligence” of a large number of users can be used to order the information. Ordered information is of much greater value than unordered information, because it gives the user an overview of what is more and less interesting. Current database systems are not able to provide ranked information by analyzing a massive amount of user feedback (e.g. clicks) within a short period of time. Therefore, these systems update their answers periodically. In this thesis, a Stream Processing Engine (SPE) [3, 4, 5, 6] is adapted. The modified SPE accepts a stream of mutations to a virtual data storage, as opposed to a stream of tuples. The newly created system exploits the properties of statistical functions in order to efficiently aggregate live statistics over a large stream of mutations, and it is able to provide answers to a small set of continuous queries. The answers to these queries are continuously maintained instead of recalculated. Therefore, the system is able to provide the answers to the continuous queries with low latency for a large number of users.
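The idea of maintaining an answer instead of recalculating it can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Mutation` and `LiveStats` names, the in-memory dictionary standing in for the virtual data storage, and the choice of mean and variance as the statistics are all assumptions made for illustration. The sketch relies on the fact that count, sum, and sum of squares can each be adjusted in constant time when a value is inserted, updated, or deleted.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Mutation:
    """Hypothetical mutation record: set `key` to `new_value`, or delete it if None."""
    key: str
    new_value: Optional[float]


class LiveStats:
    """Keeps count, sum and sum of squares up to date per mutation (illustrative only)."""

    def __init__(self) -> None:
        # Stand-in for the virtual data storage; assumed here, not taken from the thesis.
        self._store: Dict[str, float] = {}
        self.count = 0
        self.total = 0.0
        self.total_sq = 0.0

    def apply(self, m: Mutation) -> None:
        # Retract the contribution of the previous value, if the key existed.
        old = self._store.pop(m.key, None)
        if old is not None:
            self.count -= 1
            self.total -= old
            self.total_sq -= old * old
        # Add the contribution of the new value, unless the mutation is a delete.
        if m.new_value is not None:
            self._store[m.key] = m.new_value
            self.count += 1
            self.total += m.new_value
            self.total_sq += m.new_value * m.new_value

    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0

    def variance(self) -> float:
        # Population variance from the maintained sums: E[x^2] - (E[x])^2.
        if not self.count:
            return 0.0
        mu = self.mean()
        return self.total_sq / self.count - mu * mu


stats = LiveStats()
for mutation in [
    Mutation("a", 2.0),   # insert
    Mutation("b", 4.0),   # insert
    Mutation("a", 6.0),   # update
]:
    stats.apply(mutation)
print(stats.count, stats.mean(), stats.variance())
```

Each call to `apply` costs a constant amount of work regardless of how many values are stored, so a query for the mean or variance can be answered at any moment without rescanning the data. Statistics that cannot be derived from such retractable sums, for example a median, would need a different technique.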
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science MSc (60300)
Link to this item: http://purl.utwente.nl/essays/59410