Massive Processing and Mining of Big Data on DaaS-based Cloud


Fatos Xhafa


Big Data has become a major player in today's knowledge-based systems, in various forms of Business Intelligence and Analytics. The premise is that by processing and mining large data sets, the actors of the system (managers, leaders, analysts, etc.) can make more informed decisions and predictions, often in (almost) real time. However, this comes at the price of processing and mining huge amounts of data under time and cost restrictions.

In this talk we will address some emerging issues in the massive processing and mining of Big Data. On the one hand, there is the need for distributed data mining (or, more broadly, distributed machine learning): most existing machine learning algorithms are designed under the assumption that the data fits into RAM and can be accessed quickly, while disk-level machine learning is slow. On the other hand, with the ever-increasing size of data sets, distributed machine learning algorithms face the burden of data pre-loading and data communication, which could well take longer than the data processing itself.

We will discuss how such issues can be approached under the Data-as-a-Service (DaaS)-based Cloud model, where data locality and data-aware scheduling (namely, scheduling computing services on the same computing resources that hold the data, or on nearby ones) are essential to designing new algorithms for both massive processing and mining of Big Data.


Fatos Xhafa received his PhD in Computer Science in 1998 from the Department of Computer Science of the Technical University of Catalonia (UPC), Barcelona, Spain. He currently holds a permanent position as Professor Titular at UPC, BarcelonaTech. He was a Visiting Professor at Birkbeck College, University of London (UK) during the academic year 2009-2010 and a Research Associate at Drexel University, Philadelphia (USA) during the academic term 2004/2005. Dr. Xhafa has published widely in peer-reviewed international journals, conferences/workshops, book chapters, and edited books and proceedings in the field. He has been awarded teaching and research merits by the Spanish Ministry of Science and Education, and has received IEEE conference and best paper awards. Dr. Xhafa has an extensive record of editorial and reviewing service. He is Editor in Chief of the International Journal of Grid and Utility Computing and the International Journal of Space-based and Situated Computing from Inderscience, a member of the editorial boards of several international journals, and Guest Editor of Special Issues. He is Editor in Chief of the Elsevier Book Series “Intelligent Data-Centric Systems” and of Springer Lecture Notes in Data Engineering and Communication Technologies. He actively participates in the organisation of several international conferences and workshops. He is a member of the IEEE Communications Society, the IEEE Systems, Man & Cybernetics Society, and the Emerging Technical Subcomm. of Internet of Things. His research interests include parallel and distributed algorithms, massive data processing and collective intelligence, optimisation, networking, P2P and Cloud computing, machine learning and data mining, among others.