This paper proposes a solution for the seamless automation and integration of machine learning capabilities for Big Data that meets the following requirements: 1) the ability to handle, at scale, very large amounts of unstructured and structured data from diverse and heterogeneous sources; 2) the ability to systematically determine the steps and procedures needed for
analyzing Big Data datasets based on data characteristics, domain expert input, and a data pre-processing component; 3) the ability to automatically select the most appropriate libraries and tools to perform and accelerate the machine learning computations; and 4) the ability to perform Big Data analytics with high learning performance but with minimal human intervention and supervision. The overall focus is to provide
a seamless, automated, and integrated solution that can be used effectively to analyze Big Data with high-frequency
and high-dimensional features across different types of data characteristics and application problem domains, with high accuracy, robustness, and scalability. This paper highlights the research methodologies and activities that we propose Big Data researchers and practitioners undertake in order to develop and support the seamless automation and integration of machine learning capabilities for Big Data analytics.