Commit 284f96a

Refine the documentation and references further according to the recommendations.
1 parent 7f05262 commit 284f96a

7 files changed, +24 -15 lines changed

Appendix.md

+3-3
@@ -274,7 +274,7 @@ This schema is used by five different transactions, each creating varied access
 4. **Order and Order-Line:** Inserts with time-delayed updates, causing rows to become stale and infrequently read.
 5. **History:** Insert-only.

-The diverse access patterns of this small schema with a limited number of transactions contribute to TPC-C's ongoing significance as a major database benchmark. In this book, BenchmarkSQL [65] is primarily employed to evaluate TPC-C performance in MySQL.
+The diverse access patterns of this small schema with a limited number of transactions contribute to TPC-C's ongoing significance as a major database benchmark. In this book, BenchmarkSQL [68] is primarily employed to evaluate TPC-C performance in MySQL.

 ## How MySQL Processes SQL?

@@ -652,6 +652,6 @@ For MySQL clusters, the patch introduces further optimizations for **Group Repli

 ## About the Author

-In earlier years, Bin Wang worked at an internet company focused on developing high-performance computing and high-concurrency systems. He also contributed to open-source projects like TCPCopy [60] and MySQL Proxy [61], gaining valuable experience in problem-solving, particularly in logical thinking.
+In earlier years, Bin Wang worked at an internet company focused on developing high-performance computing and high-concurrency systems. He also contributed to open-source projects like TCPCopy [65] and MySQL Proxy [66], gaining valuable experience in problem-solving, particularly in logical thinking.

-After leaving the internet company, he concentrated on MySQL-related development, successfully contributing to projects such as Group Replication, secondary replay, InnoDB storage engines, and query optimization [64]. He has accumulated extensive experience in problem-solving within the MySQL domain.
+After leaving the internet company, he concentrated on MySQL-related development, successfully contributing to projects such as Group Replication, secondary replay, InnoDB storage engines, and query optimization [67]. He has accumulated extensive experience in problem-solving within the MySQL domain.

Chapter4_1.md

+3-3
@@ -36,7 +36,7 @@ The figure below illustrates the comparison results of TPC-C tests across differ

 Figure 4-3. Performance Comparison in SMP vs. NUMA.

-In the scenario where NUMA node 0 is bound, the throughput versus concurrency curve is notably smooth. Even under high concurrency, there is only a slight decline in throughput, indicating low thread context switching costs. However, throughput consistently remains below 400,000 tpmC due to significant limitations in memory bandwidth, characteristic of traditional SMP architecture.
+In the scenario where NUMA node 0 is bound, the throughput versus concurrency curve is notably smooth. Even under high concurrency, there is only a slight decline in throughput, indicating low thread context switching costs [60]. However, throughput consistently remains below 400,000 tpmC due to significant limitations in memory bandwidth, characteristic of traditional SMP architecture.

 In contrast, when utilizing all NUMA nodes, the throughput curve is relatively worse. This is attributed to reduced memory efficiency and increased context switching costs when accessing across NUMA nodes, resulting in less stable throughput. Nevertheless, scalability is greatly improved, with peak throughput increasing by 123% compared to using a single NUMA node.

@@ -48,8 +48,8 @@ Many pieces of code are not suitable for NUMA environments. For example, frequen

 To achieve optimal performance on NUMA systems [4], the following strategies are crucial:

-1. Maximize the proportion of memory accesses routed to local nodes.
-2. Balance traffic across nodes and interconnect links.
+1. Maximize the proportion of memory accesses routed to local nodes.
+2. Balance traffic across nodes and interconnect links.

 An unbalanced distribution of memory requests can significantly increase memory access latency on overloaded controllers, sometimes reaching up to 1000 cycles compared to approximately 200 cycles on non-overloaded controllers.

Chapter4_12.md

+1-1
@@ -26,7 +26,7 @@ The most efficient testing method is to utilize real online traffic for evaluati

 In Oracle, Database Replay enables testing a system with real production workloads, helping identify potential problems before implementing changes on the production system. Any workload period can be captured with little overhead and used to drive a test system, maintaining the concurrency and load characteristics of the real workload. Maintaining these characteristics is crucial, as current testing solutions often lack synchronization based on data dependencies. Without proper synchronization, the workload does not perform as required, leading to poor coverage and inadequate load, leaving many problems undetected. Database Replay's data-based synchronization makes testing realistic and helps discover potential problems [34].

-In MySQL, a common strategy involves taking a MySQL secondary instance offline for testing, configuring the necessary cluster, and replicating online MySQL requests to this new testing primary. The closer the testing primary resembles the production environment, the more accurate the test results. There are various methods to replicate online MySQL requests. This book recommends the open-source tool TCPCopy [60]. By using TCPCopy, many online problems have been successfully effectively resolved, laying a solid foundation for MySQL proxy enhancements [61]. For testing a MySQL cluster, replicating online requests to the testing system using TCPCopy allows us to evaluate whether the modifications achieve the expected outcomes, such as performance improvements, and robustness.
+In MySQL, a common strategy involves taking a MySQL secondary instance offline for testing, configuring the necessary cluster, and replicating online MySQL requests to this new testing primary. The closer the testing primary resembles the production environment, the more accurate the test results. There are various methods to replicate online MySQL requests. This book recommends the open-source tool TCPCopy [65]. By using TCPCopy, many online problems have been successfully effectively resolved, laying a solid foundation for MySQL proxy enhancements [66]. For testing a MySQL cluster, replicating online requests to the testing system using TCPCopy allows us to evaluate whether the modifications achieve the expected outcomes, such as performance improvements, and robustness.

 ### 4.12.5 Is Testing About Discovering Problems or Verifying Them?

Chapter4_2.md

+2
@@ -196,6 +196,8 @@ class hash_table_t {
 ...
 ```

+Note that the page-based buffer pool has low caching efficiency, and the page translation table is a scalability bottleneck [64].
+
 The certification database of Group Replication uses the *std::unordered_map* hash table to handle a large volume of certification information.

 ```c++

Chapter8.md

+2-2
@@ -145,7 +145,7 @@ From the figure, it is evident that after applying the patch, the rate of throug

 Addressing this problem directly presents considerable challenges, particularly for MySQL developers unfamiliar with query execution plans. Using logical reasoning and a systematic approach to identify and address code differences before and after the problem arose is a more elegant problem-solving method, though it is complex.

-It is noteworthy that no regression testing problems were encountered after applying the patch, demonstrating high stability and providing a solid foundation for future performance improvements. Currently, MySQL 8.0.40 still hasn't solved this problem, suggesting potential shortcomings in MySQL's testing system. Given the complexity of MySQL databases, users should exercise caution when upgrading and consider using tools like TCPCopy [60] to avoid potential regression testing problems.
+It is noteworthy that no regression testing problems were encountered after applying the patch, demonstrating high stability and providing a solid foundation for future performance improvements. Currently, MySQL 8.0.40 still hasn't solved this problem, suggesting potential shortcomings in MySQL's testing system. Given the complexity of MySQL databases, users should exercise caution when upgrading and consider using tools like TCPCopy [65] to avoid potential regression testing problems.

 ### 8.1.2 Improving Binlog Group Commit Scalability

@@ -633,7 +633,7 @@ Based on extensive testing, after solving most of MySQL's scalability problems,

 Centralized databases struggle to fully utilize hundreds of CPU cores due to limitations in their transaction systems. To address this, transaction throttling mechanisms are becoming increasingly important.

-MySQL has introduced a "Max Transaction Limit" feature in its thread pool to mitigate performance degradation [31]. This feature limits the number of concurrent transactions, improving throughput by reducing data locks and deadlocks on heavily loaded systems. This approach can inspire similar mechanisms that increase throughput in high-concurrency scenarios without relying solely on traditional thread pools.
+MySQL has introduced a "Max Transaction Limit" feature in its thread pool to mitigate performance degradation [31]. This feature limits the number of concurrent transactions, improving throughput by reducing data locks and deadlocks on heavily loaded systems [64]. This approach can inspire similar mechanisms that increase throughput in high-concurrency scenarios without relying solely on traditional thread pools.

 For MySQL, the specific process figure for transaction throttling is as follows:

Preface.md

+1-1
@@ -49,7 +49,7 @@ Part 5 is the concluding summary. Chapter 12 outlines future directions for MySQ

 ## References and Further Reading

-This book focuses on analyzing and solving MySQL problems, so a certain level of computer science background is recommended. To support understanding and maintain continuity, key terminology is included in the "Glossary" section of the appendix. For those lacking a foundation in MySQL, please refer to the related content in the appendix or consult dedicated MySQL books.
+This book focuses on analyzing and solving MySQL problems, so a certain level of computer science background is recommended. To support understanding and maintain continuity, key terminology is included in the "Glossary" section of the appendix. For those lacking a foundation in MySQL, please refer to the related content in the appendix or consult dedicated MySQL books [69].

 ## Special Terminology Explanation

References.md

+12-5
@@ -118,16 +118,23 @@

 [59] J. M. Hellerstein, M. Stonebraker, and J. R. Hamilton. Architecture of a database system. Foundations and Trends in Databases. 1(2) pp. 141--259, 2007.

-[60] https://github.com/session-replay-tools/tcpcopy.
-
-[61] <https://github.com/session-replay-tools/cetus>.
+[60] Chuanpeng Li, Chen Ding, and Kai Shen. Quantifying the cost of context switch. In Proceedings of the 2007 workshop on Experimental computer science, ExpCS ’07, New York, NY, USA, 2007. ACM.
+[61] Xiangpeng Hao, Xinjing Zhou, Xiangyao Yu, and Michael Stonebraker. 2024. Towards Buffer Management with Tiered Main Memory. Proc. ACM Manag. Data 2, 1 (Feb. 2024), Article 31. SIGMOD.

 [62] Guoliang Jin, Linhai Song, Xiaoming Shi, Joel Scherpelz, and Shan Lu. 2012. Understanding and detecting real-world performance bugs. In ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '12, Beijing, China - June 11 - 16, 2012. 77–88.

 [63] Jelena Antic, Georgios Chatzopoulos, Rachid Guerraoui, and Vasileios Trigonakis. 2016. Locking made easy. In Proceedings of the International Middleware Conference (Middleware). 1--14.

-[64] https://github.com/enhancedformysql/enhancedformysql.
+[64] D. Dice and A. Kogan, "Avoiding scalability collapse by restricting concurrency" in Euro-Par 2019: Parallel Processing, Cham:Springer International Publishing, pp. 363-376, 2019.
+
+[65] https://github.com/session-replay-tools/tcpcopy.
+
+[66] https://github.com/session-replay-tools/cetus.
+
+[67] https://github.com/enhancedformysql/enhancedformysql.
+
+[68] https://github.com/enhancedformysql/benchmarksql.

-[65] https://github.com/enhancedformysql/benchmarksql.
+[69] https://github.com/enhancedformysql/tech-explorer-hub.

 [Next](Appendix.md)
