Efficient MapReduce Matrix Multiplication with Optimized Mapper Set
Methaq Kadhum, Mais Haj Qasem (✉), Azzam Sleit, and Ahmad Sharieh
1 Introduction
As shown in Fig. 1, the Hadoop framework is responsible for distributing the input
among the involved mappers. These mappers implement the map task; their results are
collected and sorted during the shuffle process and then fed to the reducers, whose
output is collected as the final result.
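For concreteness, the following is a minimal sketch of a driver for such a job; the
class names and I/O paths are illustrative (MultiplyMapper and SumReducer refer to the
illustrative classes sketched in Sect. 3, not to the paper's actual listing).

```java
// Minimal illustrative Hadoop driver (not the paper's listing): submitting
// one job lets the framework distribute input splits to the mappers,
// sort/shuffle their intermediate output, and feed it to the reducers.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MatrixMultiplyDriver {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "matrix-multiply");
    job.setJarByClass(MatrixMultiplyDriver.class);
    // Illustrative classes, sketched in Sect. 3 below.
    job.setMapperClass(ElementToBlockMultiply.MultiplyMapper.class);
    job.setReducerClass(ElementToBlockMultiply.SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(DoubleWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // pre-processed input
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // result matrix
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```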
The MapReduce paradigm has been used to decompose enormous tasks, such as
data-mining algorithms. Specific applications include MapReduce with expectation
maximization for text filtering [11], MapReduce with K-means for remote-sensing
image clustering [12], and MapReduce with decision trees for classification [22].
Additionally, MapReduce has been used for job scheduling [23] and real-time
systems [11].
Matrix multiplication using MapReduce has been proposed [9]. The earlier
decomposition of matrix multiplication involved two MapReduce tasks. However,
these decompositions suffered from evident processing and file I/O overhead. Hence,
it was necessary to re-decompose the matrix multiplication process in the MapReduce
paradigm to reduce the computing overhead.
This paper proposes a technique for matrix multiplication that uses a single
MapReduce task with an optimized mapper set. The number of mappers forming the
utilized set is selected to balance the processing overhead that results from a small
mapper set against the I/O overhead that results from a large mapper set, both of
which consume time and resources.
The rest of the paper is organized as follows: Sect. 2 reviews work that is closely
related to the implemented MapReduce matrix multiplication task. Section 3 presents
the proposed work for matrix multiplication and highlights the relation between
proposed and previous techniques. Section 4 presents the experimental results. Finally,
the conclusion is given in Sect. 5.
2 Related Work
The blocking scheme was reported to be faster than the element-to-element scheme.
Moreover, the best-performing configurations had medium input sizes and involved a
medium number of mappers. Thus, these results suggested the need to balance input
size against the number of mappers.
In addition to the blocking scheme, reducing the number of MapReduce jobs from
two to one also reduced the overall computational cost of matrix multiplication.
Therefore, the inputs to the MapReduce job should be provided as blocks, where each
block contains elements from both matrices to be multiplied. To reduce computational
cost and memory consumption, Deng and Wu [8] modified the way Hadoop reads I/O files.
In the HAMA project [16], a pre-processing stage was implemented for the same purpose.
3 Proposed Work
The goal of the proposed work is to enhance the efficiency of matrix multiplication
in the MapReduce framework. This is achieved by balancing the processing overhead
that results from using a small mapper set against the I/O overhead that results from
using a large mapper set, both of which consume time and resources, as argued in
Sect. 2.
In the proposed technique, matrix multiplication is implemented as an element-to-block
scheme, as illustrated in Fig. 5. In the first scheme, the first array is decomposed
into individual elements, whereas the second array is decomposed into sub-row-based
blocks. In the second scheme, the first array is decomposed into sub-row-based blocks,
and the second array is decomposed into sub-column-based blocks. The number of
mappers is determined by the size of the block generated for the second array and is
selected on the basis of the capability of the underlying mapper. A smaller block
size increases the number of blocks, thus requiring more mappers, and vice versa.
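Concretely, under the first (element-to-block) scheme and assuming square n×n inputs,
the block size s fixes the size of the mapper set; the following counts are our own
derivation from the decomposition above, with the ceiling handling blocks that do not
divide n evenly:

```latex
% Counts implied by the element-to-block decomposition (our derivation).
% Row k of the second array splits into ceil(n/s) sub-row blocks, and each
% element a_{ik} of the first array is paired with every block of row k:
\[
  \#\text{blocks of } B \;=\; n \left\lceil \frac{n}{s} \right\rceil,
  \qquad
  \#\text{map inputs} \;=\; n^{2} \left\lceil \frac{n}{s} \right\rceil .
\]
```

Setting s = 1 recovers the n³ pairs of the element-to-element scheme, while s = n
gives the n² pairs of the element-to-row scheme, matching the special cases discussed
below.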
This work uses a single MapReduce job. The map task, as listed in Table 2, is
responsible for the multiplication operations, whereas the reduce task is responsible
for the summation operations. The pre-processing step reads an element from the first
array and a block from the second array, and then merges them into one file. Note that
in matrix multiplication, the whole row of the first array must be multiplied with the
whole column of the second array to compute one element of the output. Thus, the
results of each mapper in the proposed schemes are aggregated with the other
multiplication results in the reduce task.
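As a minimal sketch (our illustration, not the listing in Table 2), and assuming the
pre-processing step writes tab-separated lines of the form
i, k, a[i][k], j0, "b[k][j0],...,b[k][j0+s-1]", the map and reduce tasks can be
expressed as follows; all class and field names are ours:

```java
import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class ElementToBlockMultiply {

  // Map task: multiplies one element of the first array with every element
  // of one sub-row block of the second array; emits one partial product
  // keyed by the output cell (i, j).
  public static class MultiplyMapper
      extends Mapper<LongWritable, Text, Text, DoubleWritable> {
    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String[] f = line.toString().split("\t");
      int i = Integer.parseInt(f[0]);              // row index in A
      double a = Double.parseDouble(f[2]);         // a[i][k]; f[1] holds k
      int j0 = Integer.parseInt(f[3]);             // first column of the block
      String[] block = f[4].split(",");            // b[k][j0 .. j0+s-1]
      for (int t = 0; t < block.length; t++) {
        double partial = a * Double.parseDouble(block[t]);
        ctx.write(new Text(i + "," + (j0 + t)), new DoubleWritable(partial));
      }
    }
  }

  // Reduce task: sums all partial products that share an output cell (i, j).
  public static class SumReducer
      extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
    @Override
    protected void reduce(Text cell, Iterable<DoubleWritable> partials, Context ctx)
        throws IOException, InterruptedException {
      double sum = 0.0;
      for (DoubleWritable p : partials) {
        sum += p.get();
      }
      ctx.write(cell, new DoubleWritable(sum));    // c[i][j]
    }
  }
}
```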
Compared with existing schemes, the proposed work utilizes one MapReduce job
instead of two. The number of multiplications handled by a mapper depends on the
capability of that mapper, which is determined by the block size. Previous work has
investigated element-by-element (from the first and second arrays), element-by-column,
and row-by-column multiplications. Varying the number of elements in rows and columns
across different inputs revealed that the best results involve medium-sized inputs,
for which the processing overhead at each mapper is negligible. Hence, to match the
capabilities of a given mapper, we propose to vary the number of elements given to
that mapper.
Unlike previous techniques, this work proposes to multiply an element by a block
of elements. The block varies from a single element to a complete row. If the block
size is equal to one, then the proposed work is identical to the element-to-element
scheme. If the block size is equal to the dimension of the input array, then the
proposed work is identical to the element-to-row/column scheme. Consequently, the
previous work can be considered a special case of our more general proposed work.
Table 3 compares the proposed and existing schemes.
4 Experimental Results
The results of matrix multiplication using Hadoop for inputs of various sizes are
presented. Sparse matrices of size n×n are randomly generated with values from 1 to 10.
The experiments are conducted for various block sizes in the range [1, n]. In this
work, we first run a simple matrix multiplication of size 100×100 on the platform with
block sizes in {1, 10, 15, 20, 25, 30} in order to determine the optimal block length
to give each mapper before running the actual job. These pre-experiments are shown
in Table 4.
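The selection loop can be illustrated with a minimal local sketch (our stand-in for
the actual Hadoop pre-runs, under the stated assumptions of 100×100 sparse inputs with
values in 1–10): it times the multiplication for each candidate block size and keeps
the fastest.

```java
// Local sketch of the pre-experiment (assumption: a serial stand-in for the
// Hadoop pre-runs): time a 100x100 multiplication per block size, keep the best.
import java.util.Random;

public class BlockSizeProbe {
  static final int N = 100;
  static final int[] CANDIDATES = {1, 10, 15, 20, 25, 30};

  public static void main(String[] args) {
    double[][] a = random(N), b = random(N);
    int best = CANDIDATES[0];
    long bestTime = Long.MAX_VALUE;
    for (int s : CANDIDATES) {
      long t0 = System.nanoTime();
      multiplyBlocked(a, b, s);
      long t = System.nanoTime() - t0;
      if (t < bestTime) { bestTime = t; best = s; }
    }
    System.out.println("optimal block size: " + best);
  }

  // Inner loop tiled into sub-row blocks of length s, mirroring how each
  // mapper would handle one element-block pair.
  static double[][] multiplyBlocked(double[][] a, double[][] b, int s) {
    double[][] c = new double[N][N];
    for (int i = 0; i < N; i++)
      for (int k = 0; k < N; k++)
        for (int j0 = 0; j0 < N; j0 += s)                 // one block of b[k]
          for (int j = j0; j < Math.min(j0 + s, N); j++)
            c[i][j] += a[i][k] * b[k][j];
    return c;
  }

  static double[][] random(int n) {
    Random r = new Random(42);
    double[][] m = new double[n][n];
    for (int i = 0; i < n; i++)
      for (int j = 0; j < n; j++)
        if (r.nextDouble() < 0.1)          // ~10% nonzeros (sparsity assumed)
          m[i][j] = 1 + r.nextInt(10);     // values in 1..10, as in the paper
    return m;
  }
}
```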
With the proposed scheme, the sorting process in the shuffle is reduced. As the matrix
size grows, the proposed scheme remains more stable than the existing schemes, and its
running time appears almost linear. The space consumption results for the proposed and
existing schemes are reported in Table 6. As noted, the proposed and existing schemes
are almost identical, although the proposed work takes slightly more space than the
others. Therefore, if the user cares about time, our proposed scheme is the best
choice; if the user cares about memory capacity, another algorithm can be chosen.
Our algorithm is written in Java, and the experimental results for the proposed and
existing schemes were obtained on an HP® machine with a Core™ i7-5500U CPU @ 2.40 GHz
and 8 GB RAM.
5 Conclusion
Block-based matrix multiplication schemes were proposed in this paper. The proposed
schemes balance the processing overhead that results from using a small mapper set
against the I/O overhead that results from using a large mapper set, both of which
consume time and resources. This balance is achieved by determining the optimal block
size and the corresponding number of involved mappers. The results show that the
proposed schemes reduce both time and memory utilization.
6 Future Work
Our proposed scheme was implemented for sparse matrices; our future work will address
dense matrices. In addition, the reducer set could also be optimized.
References
1. Cannon, L.E.: A Cellular Computer to Implement the Kalman Filter Algorithm. No. 603-
Tl-0769. Montana State Univ Bozeman Engineering Research Labs (1969)
2. Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progressions. In:
Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing, pp. 1–6.
ACM (1987)
3. Catalyurek, U.V., Aykanat, C.: Hypergraph-partitioning-based decomposition for parallel
sparse-matrix vector multiplication. IEEE Trans. Parallel Distrib. Syst. 10(7), 673–693 (1999)
4. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: OSDI,
p. 10. USENIX (2004)
5. Dean, J., Ghemawat, S.: MapReduce: a flexible data processing tool. Commun. ACM 53(1),
72–77 (2010)
6. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun.
ACM 51(1), 107–113 (2008)
7. Dekel, E., Nassimi, D., Sahni, S.: Parallel matrix and graph algorithms. SIAM J. Comput.
10(4), 657–675 (1981)
8. Deng, S., Wu, W.: Efficient matrix multiplication in Hadoop. Int. J. Comput. Sci. Appl.
13(1), 93–104 (2016)
9. Fox, G.C., Otto, S.W., Hey, A.J.G.: Matrix algorithms on a hypercube I: Matrix multiplication.
Parallel Comput. 4(1), 17–31 (1987)
10. Lin, J., Dyer, C.: Data-intensive text processing with MapReduce. Synth. Lect. Hum. Lang.
Technol. 3(1), 1–177 (2010)
11. Liu, X., Iftikhar, N., Xie, X.: Survey of real-time processing systems for big data. In:
Proceedings of the 18th International Database Engineering & Applications Symposium.
ACM (2014)
12. Lv, Z., Hu, Y., Zhong, H., Wu, J., Li, B., Zhao, H.: Parallel K-means clustering of remote
sensing images based on MapReduce. In: Wang, F.L., Gong, Z., Luo, X., Lei, J. (eds.) WISM
2010. LNCS, vol. 6318, pp. 162–170. Springer, Heidelberg (2010). doi:
10.1007/978-3-642-16515-3_21
13. Mahafzah, B.A., Sleit, A., Hamad, N.A., Ahmad, E.F., Abu-Kabeer, T.M.: The OTIS hyper
hexa-cell optoelectronic architecture. Computing 94(5), 411–432 (2012)
14. Norstad, J.: A MapReduce algorithm for matrix multiplication (2009). http://www.norstad.org/
matrix-multiply/index.html. Accessed 19 Feb 2013
15. Thabet, K., Al-Ghuribi, S.: Matrix multiplication algorithms. Int. J. Comput. Sci. Netw. Secur.
(IJCSNS) 12(2), 74 (2012)
16. Seo, S., Yoon, E.J., Kim, J., Jin, S., Kim, J.S., Maeng, S.: HAMA: an efficient matrix
computation with the MapReduce framework. In: 2010 IEEE Second International Conference
on Cloud Computing Technology and Science (CloudCom), pp. 721–726. IEEE, November
2010
17. Sleit, A., Al-Akhras, M., Juma, I., Alian, M.: Applying ordinal association rules for cleansing
data with missing values. J. Am. Sci. 5(3), 52–62 (2009)
18. Sleit, A., Dalhoum, A.L.A., Al-Dhamari, I., Awwad, A.: Efficient enhancement on cellular
automata for data mining. In: Proceedings of the 13th WSEAS International Conference on
Systems, pp. 616–620. World Scientific and Engineering Academy and Society (WSEAS),
July 2009
19. Sleit, A., AlMobaideen, W., Baarah, A.H., Abusitta, A.H.: An efficient pattern matching
algorithm. J. Appl. Sci. 7(18), 2691–2695 (2007)
20. Sleit, A., Saadeh, H., Al-Dhamari, I., Tareef, A.: An enhanced sub image matching algorithm
for binary images. In: American Conference on Applied Mathematics, pp. 565–569, January
2010
21. Sun, Z., Li, T., Rishe, N.: Large-scale matrix factorization using MapReduce. In: 2010 IEEE
International Conference on Data Mining Workshops. IEEE (2010)
22. Wu, G., et al.: MReC4.5: C4.5 ensemble classification with MapReduce. In: 2009 Fourth
ChinaGrid Annual Conference. IEEE (2009)
23. Zaharia, M., et al.: Job scheduling for multi-user mapreduce clusters. EECS Department,
University of California, Berkeley, Technical Report UCB/EECS-2009-55 (2009)
24. Zheng, J., Zhu, R., Shen, Y.: Sparse matrix multiplication algorithm based on MapReduce. J.
Zhongkai Univ. Agric. Eng. 26(3), 1–6 (2013)