Data Analysis as a Service

Gurvinder Singh
Track: Track 3, Lecture Hall IX on the First Floor
Description: In recent years, the amount of research data generated has increased exponentially. As a result, there is growing demand to put this data to work by storing and processing it in a horizontally scalable way. At UNINETT, we have large amounts of data coming from different sources, e.g. Netflow and system logs. To process this volume of data, we need systems that can store data at high speed and also offer the capability to analyze incoming data. Such a system should support both real-time analytics and batch analytics on historical data.
Considering these requirements, we are testing systems that can store and process data in a distributed and scalable manner. Currently we are looking into systems such as Hadoop, Spark, HBase and graph processing systems to store and process Netflow data. This talk will describe our experiences with these systems and with using them to analyze Netflow data, including the advantages and drawbacks of the different systems and their performance.
