Tencent-CloudLogService
Performances
Muzhi Yu
Peking University
Beijing, China
[email protected]

Zhaoxiang Lin
Tencent Cloud Computing (Beijing) Co., Ltd.
Beijing, China
[email protected]

Jinan Sun
Peking University
Beijing, China
[email protected]

Shikun Zhang
Peking University
Beijing, China
[email protected]
Name                               Value
No. of documents                   ∼12 billion
No. of shards                      6
Average ES segment size            ∼5 GB
No. of documents per ES segment    ∼24 million
Average no. of hits per query      ∼40 million
[Figure 10: Performances for three types of queries with different optimization options. Panels: (a) Head Queries, (b) Tail Queries, (c) Histogram Queries.]

On top of that, turning on the secondary index (O1) further increases head query performance by 3x, but has little effect on the performance of the other query types.

Furthermore, the Reverse Binary Search Optimization technique (O2) increases tail query performance by 3.5x, while the Histogram Optimization technique (O3) increases histogram query performance by 1.6x.

Head Query
Configuration         Service Time   CPU / query   rMB / query
No Optimizations          604124.0         200.5         452.7
O0                         50318.2           7.3          37.3
  Multiplier                  12.0          27.6          12.1
  Acc. Multiplier             12.0          27.6          12.1
O0 + O1                    17224.8           5.5          12.5
  Multiplier                   2.9           1.3           3.0
  Acc. Multiplier             35.1          36.5          36.2
O0 + O1 + O2 + O3          15904.2           5.2          12.1
  Multiplier                   1.1           1.1           1.0
  Acc. Multiplier             38.0          38.9          37.3

Tail Query
Configuration         Service Time   CPU / query   rMB / query
No Optimizations          585014.0         196.0         438.4
O0                        193487.0         831.7         144.3
  Multiplier                   3.0           0.2           3.0
  Acc. Multiplier              3.0           0.2           3.0
O0 + O1                   194551.0         821.8          82.2
  Multiplier                   1.0           1.0           1.8
  Acc. Multiplier              3.0           0.2           5.3
O0 + O1 + O2 + O3          23931.0          34.4          17.1
  Multiplier                   8.1          23.9           4.8
  Acc. Multiplier             24.4           5.7          25.6

Histogram Query
Configuration         Service Time   CPU / query   rMB / query
No Optimizations          584511.0         116.4         438.0
O0                        179252.0          66.6         134.0
  Multiplier                   3.3           1.7           3.3
  Acc. Multiplier              3.3           1.7           3.3
O0 + O1                   183304.0          69.2         137.7
  Multiplier                   1.0           1.0           1.0
  Acc. Multiplier              3.2           1.7           3.2
O0 + O1 + O2 + O3          76893.0          39.8          57.0
  Multiplier                   2.4           1.7           2.4
  Acc. Multiplier              7.6           2.9           7.7
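The Multiplier and Acc. Multiplier rows of the optimization table can be read as simple ratios. The sketch below assumes that Multiplier is the previous configuration's value divided by the current one, and that Acc. Multiplier is the unoptimized baseline divided by the current one. The text does not spell out these formulas, but this reading reproduces the head query Service Time rows. The function name `multipliers` is hypothetical.

```python
def multipliers(rows):
    """Compute (label, step multiplier, accumulated multiplier) per configuration.

    rows: list of (label, value) pairs ordered from the unoptimized baseline
    to the most optimized configuration. Lower values are better.
    Assumed formulas: step = previous / current, accumulated = baseline / current.
    """
    baseline = rows[0][1]
    result = []
    for (_, previous), (label, current) in zip(rows, rows[1:]):
        step = round(previous / current, 1)
        accumulated = round(baseline / current, 1)
        result.append((label, step, accumulated))
    return result


# Head query Service Time values copied from the table above.
head_service_time = [
    ("No Optimizations", 604124.0),
    ("O0", 50318.2),
    ("O0 + O1", 17224.8),
    ("O0 + O1 + O2 + O3", 15904.2),
]

for label, step, accumulated in multipliers(head_service_time):
    print(f"{label}: multiplier {step}, acc. multiplier {accumulated}")
```

The same function can be applied to the CPU / query and rMB / query columns, or to the tail and histogram query blocks.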
The results are shown in Figure 10, which distinguishes the performance under different user counts, as well as in Table 2.

5.1.3 RQ3. How does the choice of the storage option affect the query performance, before and after the optimization? Tencent Cloud provides a series of customizable storage options, among which Tencent Premium Cloud Storage, SATA HDD drives, and NVMe SSD drives are the most representative ones.

All the above analyses (RQ1 and RQ2) are based on the experiments using Tencent Premium Cloud Storage as the storage option. However, experimental results with other storage options are also
important, because they not only compare the effectiveness of the optimization techniques, but also serve as guidance for choosing the storage option.

Tencent Cloud Premium Cloud Storage is a hybrid storage option. It adopts a cache mechanism to provide high-performance, SSD-like storage, and employs a three-copy distributed mechanism to ensure data reliability.

SATA HDD is the most economical option, suitable for scenarios that involve sequential reading and writing of large files, but its random access performance is relatively low.

NVMe SSD has the highest performance, but its low cost-performance ratio limits its appeal in log service scenarios.

Table 3 shows the comparison of the specifications of the three storage options.

Table 3: The specifications of different storage solutions at Tencent Cloud. IOPS is tested with 4 KiB IO, and throughput is tested with 256 KiB IO.

Disk Type               IOPS      Throughput
Premium Cloud Storage   6,000     150 MB/s
NVMe SSD                650,000   2.8 GB/s
SATA HDD                200       190 MB/s

The experimental results with different storage options are shown in Table 4. We can draw the following conclusions. First, the NVMe SSD option consistently outperforms the other storage options, while the Tencent Premium Cloud Storage option is less than an order of magnitude behind. Second, compared with the NVMe SSD, the Tencent Premium Cloud Storage consistently enjoys more benefits from the query optimization techniques.

Table 4: Comparison of performance improvements among different storage solutions. For each storage solution, three rows list the native performances, the performances after optimizations, and the multipliers for performance improvements, respectively. The results are tested under 200 concurrent users for Premium Cloud Storage and NVMe SSD, and under 150 concurrent users for SATA HDD because of its limited performance.

Head Query
Storage                 Row          Service Time   CPU / query   rMB / query
Premium Cloud Storage   Native           604124.0         200.5         452.7
                        Optimized         15904.2           5.2          12.1
                        Multiplier           38.0          38.9          37.3
NVMe SSD                Native            84986.6         405.6         459.4
                        Optimized          2704.1           9.0           9.6
                        Multiplier           31.4          45.3          47.6
SATA HDD                Native          1426810.0         215.7         423.9
                        Optimized        108863.0           8.6          14.0
                        Multiplier           13.1          25.1          30.2

Tail Query
Storage                 Row          Service Time   CPU / query   rMB / query
Premium Cloud Storage   Native           585014.0         196.0         438.4
                        Optimized         23931.0          34.4          17.1
                        Multiplier           24.4           5.7          25.6
NVMe SSD                Native            77402.1         370.8         449.6
                        Optimized         13134.5          61.1          17.3
                        Multiplier            5.9           6.1          26.0
SATA HDD                Native          1448450.0         211.7         433.2
                        Optimized        183195.0          35.7          17.7
                        Multiplier            7.9           5.9          24.5

Histogram Query
Storage                 Row          Service Time   CPU / query   rMB / query
Premium Cloud Storage   Native           584511.0         116.4         438.0
                        Optimized         76893.0          39.8          57.0
                        Multiplier            7.6           2.9           7.7
NVMe SSD                Native            53759.4         237.7         425.5
                        Optimized         17333.5          77.4          48.9
                        Multiplier            3.1           3.1           8.7
SATA HDD                Native          1326030.0         130.9         411.9
                        Optimized        465770.0          42.4          58.1
                        Multiplier            2.8           3.1           7.1

5.1.4 RQ4. Will the increase of timestamp precision level impact the query performances? It is also the goal of Cloud Log Service to support storing and querying higher-precision timestamps. Therefore, it is important to check how increasing the timestamp precision level impacts the query performance. To this end, we change the timestamp from second to millisecond, and analyze the query performance. The data also comes from the experiments using Tencent Premium Cloud Storage.

Interestingly, as is shown in Figure 11, increasing the timestamp precision has almost no impact on the query performance, thanks to the search engine design in TencentCLS.

The reason is that although the precision increases, the frequency of the log writes stays the same. Although theoretically some operations such as locating the endpoints will get slower, after applying the secondary index optimization, the difference in costs is significantly reduced. Also, those precision-sensitive operations do not take up a large proportion of the total service time. Therefore, generally speaking, the performance is virtually unaffected by the time precision.

In fact, the online version of TencentCLS is running with microsecond-level time precision thanks to the search engine design, while many vendors are providing second-level time precision log services.

5.1.5 RQ5. What is the bottleneck of our system? We have also investigated the bottlenecks of our system, by analyzing the CPU usage and the disk IO during the above experiments.

As is shown in Table 4, IO performance is the main bottleneck for Premium-Cloud-Storage-based solutions and SATA-HDD-based solutions, while CPU performance becomes the bottleneck for NVMe-SSD-based solutions.
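As a rough back-of-the-envelope check on the bottleneck observation in RQ5, the sketch below divides the unoptimized head query read volume (rMB / query in Table 4) by the rated throughput of each storage option (Table 3). It is only an illustration under strong assumptions: reads are purely sequential at the rated throughput, 2.8 GB/s is treated as 2800 MB/s, and caching, IOPS limits, and concurrency are ignored. The names `RATED_THROUGHPUT_MB_S`, `HEAD_RMB_PER_QUERY`, and `min_read_seconds` are hypothetical.

```python
# Rated throughput from Table 3, in MB/s (2.8 GB/s assumed to be 2800 MB/s).
RATED_THROUGHPUT_MB_S = {
    "Premium Cloud Storage": 150.0,
    "NVMe SSD": 2800.0,
    "SATA HDD": 190.0,
}

# Unoptimized head query read volume from Table 4, in MB per query.
HEAD_RMB_PER_QUERY = {
    "Premium Cloud Storage": 452.7,
    "NVMe SSD": 459.4,
    "SATA HDD": 423.9,
}


def min_read_seconds(read_mb, throughput_mb_s):
    """Lower bound on the time needed to read `read_mb` at the rated throughput."""
    return read_mb / throughput_mb_s


estimates = {
    name: min_read_seconds(HEAD_RMB_PER_QUERY[name], RATED_THROUGHPUT_MB_S[name])
    for name in RATED_THROUGHPUT_MB_S
}

for name, seconds in estimates.items():
    print(f"{name}: at least {seconds:.2f} s of read time per query")
```

Under these assumptions, reading one query's data takes on the order of seconds on Premium Cloud Storage and SATA HDD, but well under a second on NVMe SSD, which is in line with IO dominating on the former two and CPU dominating on the latter. The estimate ignores random access costs, so it understates the gap for SATA HDD.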
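The argument in RQ4 (5.1.4) that timestamp precision matters little when the log write frequency is unchanged can be illustrated with a plain binary search over sorted timestamps. This is a minimal sketch and not the TencentCLS search engine: it uses Python's standard `bisect` module instead of the secondary index, and `range_endpoints` is a hypothetical helper. The same 1,000 log entries are stored once in seconds and once in milliseconds, and locating the endpoints of the same time window searches a list of the same length in both cases.

```python
from bisect import bisect_left, bisect_right


def range_endpoints(sorted_ts, start, end):
    """Return (lo, hi) so that sorted_ts[lo:hi] holds all entries with start <= ts <= end."""
    return bisect_left(sorted_ts, start), bisect_right(sorted_ts, end)


# 1,000 log entries written at a fixed rate of ten per second.
BASE_MS = 1_600_000_000_000
ts_ms = [BASE_MS + 100 * i for i in range(1000)]  # millisecond precision
ts_s = [t // 1000 for t in ts_ms]                 # same entries, second precision

# The same 10-second window expressed at both precision levels.
lo_s, hi_s = range_endpoints(ts_s, 1_600_000_010, 1_600_000_019)
lo_ms, hi_ms = range_endpoints(ts_ms, 1_600_000_010_000, 1_600_000_019_999)

print((lo_s, hi_s), (lo_ms, hi_ms))
```

Both searches run over 1,000 entries, so the number of comparison steps is governed by the entry count rather than by the numeric resolution of the timestamps, and both return the same slice of entries.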
[Figure 11: Performances with second-level timestamp precision and millisecond-level timestamp precision, evaluated using the total service time (in milliseconds). Recoverable panel titles: (a) Head query performance, (c) Histogram query performance.]

5.2 Online Test
In addition to the offline experiments with open benchmarks, we have also tested the system with real-world data.

The experiments involve two clusters, one equipped with ElasticSearch (version 7.10.1), and the other equipped with the search engine of TencentCLS. Each cluster consists of 3 master nodes as well as 40 data nodes. We select a single large log topic as input, and its data is written to both clusters at the same time.

Table 5: Results of the online experiment.

Head Query
# Log               10^9     10^10
Original (ms)       12882    16904
Ours (ms)           399      780
Boost Multiplier    32x      21x

Tail Query
# Log               10^9     10^10
Original (ms)       10577    17483
Ours (ms)           391      1299
Boost Multiplier    27x      13x

Histogram Query
# Log               10^9     10^10     5 × 10^10   10^11
Original (ms)       16623    >42764    TIMEOUT     TIMEOUT
Ours (ms)           1144     4253      10300       17920
Boost Multiplier    15x      >10x      N/A         N/A

6 CONCLUSION
In this paper, we introduce the motivation of TencentCLS, and propose the architecture of TencentCLS. Then we elaborate on the design and optimizations of the search engine in TencentCLS, a system that supports low-latency queries with massive high-cardinality data. Finally, we evaluate and analyze the performance of our search engine, both with open benchmarks and with online data in TencentCLS.

ACKNOWLEDGMENTS
We would like to thank anonymous reviewers for their valuable comments and helpful suggestions. We thank the Tencent Cloud staff for providing cloud resources and technical support. We also thank ElasticSearch Team for the support.