-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Read optimization #1697
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Read optimization #1697
Conversation
* ensuring performance schema is enabled when testing some performance schema results * Added logic to check if the default collation is overridden by the server character_set_collations * ensure using IANA timezone in test, since tzinfo depending on system won't have deprecated tz like "US/Central" and "US/Pacific"
Since configuration options doesn't change at runtime, after connection is established, use dedicated function, in order to avoid multiple test test compress, checking ReadTimeout configuration option
WalkthroughThe changes introduce a refactor of the Changes
Sequence Diagram(s)sequenceDiagram
participant Test/Benchmark
participant mysqlConn
participant Buffer/CompIO
participant NetConn
Test/Benchmark->>mysqlConn: Initialize (set readFunc/readNextFunc)
mysqlConn->>Buffer/CompIO: Assign readNextFunc (compression aware)
mysqlConn->>NetConn: Assign readFunc (with/without timeout)
Test/Benchmark->>mysqlConn: Call readPacket()
mysqlConn->>Buffer/CompIO: Call readNextFunc(readFunc)
Buffer/CompIO->>NetConn: Call readFunc
Buffer/CompIO-->>mysqlConn: Return data
mysqlConn-->>Test/Benchmark: Return packet data
Poem
Tip ⚡💬 Agentic Chat (Pro Plan, General Availability)
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (3)
⏰ Context from checks skipped due to timeout of 90000ms (20)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
benchmark_test.go (1)
116-155
: Well-designed benchmark for measuring read performance improvementsThis benchmark effectively demonstrates the performance optimization from the PR by selecting 10,000 rows from a MariaDB sequence table, which triggers many calls to the optimized
readPacket
function. The benchmark is well-structured with proper error handling and resource cleanup.Consider starting the timer after statement preparation to isolate just the row-reading performance:
- b.StartTimer() stmt, err := db.Prepare("SELECT * FROM seq_1_to_10000") if err != nil { b.Fatalf("Failed to prepare statement: %v", err) } defer stmt.Close() + b.StartTimer() for n := 0; n < b.N; n++ {This would provide more precise measurement focusing exclusively on the row reading performance, which is where your optimization has the most impact.
packets.go (1)
29-112
: Excellent read optimization in packet handlingThe changes to
readPacket
effectively eliminate repeated conditional checks on compression and timeout settings by using function pointers that are initialized once during connection setup. This is a significant performance optimization for a hot path in the codebase.The approach replaces direct buffer reads with calls to the appropriate read functions via
readNextFunc
andreadFunc
function fields, which are set only once when the connection is established.Consider adding a comment before the
readPacket
function that briefly explains the optimization approach, which would help future maintainers understand the design pattern:// Read packet to buffer 'data' // Uses readNextFunc and readFunc function fields for optimized reading // that avoids repeated checks of unchanging configuration options func (mc *mysqlConn) readPacket() ([]byte, error) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
AUTHORS
(1 hunks)benchmark_test.go
(1 hunks)compress_test.go
(1 hunks)connection.go
(2 hunks)connection_test.go
(1 hunks)connector.go
(3 hunks)driver_test.go
(4 hunks)packets.go
(3 hunks)packets_test.go
(4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
connection_test.go (1)
dsn.go (1)
Config
(37-84)
🔇 Additional comments (22)
AUTHORS (1)
40-40
: Author added for contributionThe addition of "Diego Dupin" to the AUTHORS file properly acknowledges the contributor responsible for the read optimization work per the project's guidelines.
compress_test.go (1)
43-44
: Explicit assignment of read functions for compression supportThese additions are required to support the refactored read mechanism. The test now explicitly sets the read function pointers that were previously handled implicitly, ensuring proper packet reading during compression tests.
connection.go (2)
42-43
: Added function fields to abstract reading operationsThe addition of
readFunc
andreadNextFunc
function fields to themysqlConn
struct is the core of this optimization. These fields allow the packet reading logic to be established once at connection time rather than checking configuration options on every read operation.This design improves performance by avoiding repeated checks of unchanging configuration values like compression settings and read timeout during the frequently called packet reading operations.
242-242
: Fixed typo in commentCorrected "its" to "it's" in the comment for better grammar.
connection_test.go (4)
21-22
: Refactored test setup to use predefined buffer and connectionExtracted buffer and connection creation into variables for reuse and improved clarity in the test setup.
24-25
: Using extracted variables in mysqlConn initializationThe test now uses the predefined buffer and connection objects, making the code more consistent and maintainable.
30-31
: Initialized read function fields for testThese additions explicitly set the read function fields to match the new connection.go implementation. This ensures the test works correctly with the refactored read mechanism and validates the correct behavior of the interpolation functionality with the new design.
47-52
:❌ Incorrect review comment
Verify other tests are consistent with the new design
The other test functions in this file (
TestInterpolateParamsJSONRawMessage
and subsequent tests) don't initialize the new read function fields. While these tests may be working without errors (since they don't actually perform network reads), it would be more consistent to update them similarly to the first test.
🏁 Script executed:
#!/bin/bash # Check if other tests in this file are actually reading packets rg -A 5 "readPacket|read.*Next" connection_test.goLength of output: 211
No action needed: other tests already initialize the new read functions
The search confirms that subsequent tests inconnection_test.go
already set bothreadNextFunc: buf.readNext
andreadFunc: nc.Read
. No consistency updates are required.Likely an incorrect or invalid review comment.
packets.go (1)
367-368
: Correctly handling TLS connection upgradeThis change ensures that after upgrading to TLS, subsequent reads use the TLS connection's Read method directly, maintaining the optimization pattern established in this PR.
connector.go (2)
134-148
: Efficient read function initializationThis code efficiently initializes the read function pointers based on connection configuration. By setting up the appropriate read functions once during connection initialization, the code eliminates the need to check these conditions on every read operation, which is the core of this optimization.
The approach is particularly effective for the ReadTimeout configuration, where it creates a closure that sets the deadline before each read, but only creates this closure once.
190-191
: Proper handling of compressed connectionsThis change correctly updates the read function pointer when compression is enabled, maintaining the optimization pattern established in this PR. This ensures that the compressed I/O's readNext method is used directly without conditional checks on each packet read.
packets_test.go (5)
97-113
: Updated test helpers for new read function fieldsThis change correctly initializes the new read function fields in the test helper, which is necessary to make tests work with the updated code. It also improves efficiency by reusing the same buffer instance for both the buf field and the readNextFunc function.
115-138
: Updated test to properly initialize read functionsThis test update correctly initializes the read function fields required by the new packet reading implementation. The test still effectively verifies the same behavior while accommodating the structural changes from the optimization.
172-280
: Properly adapted packet split testThis test update correctly initializes the read function fields required by the new packet reading implementation. The core logic of the test remains unchanged, continuing to verify that large packets are correctly split and reassembled.
282-326
: Updated packet fail testsThis test update correctly initializes the read function fields required by the new packet reading implementation. The test scenarios for error handling remain unchanged, ensuring that error handling continues to work properly with the optimized code.
328-364
: Updated regression testThis test update correctly initializes the read function fields required by the new packet reading implementation. The regression test still effectively verifies the same behavior while accommodating the structural changes from the optimization.
driver_test.go (6)
1633-1658
: Good addition for MariaDB compatibility!This code properly handles the case where MariaDB might override the default collation for a character set via the
character_set_collations
system variable. It parses the charset=collation pairs and uses them to determine the expected collation value, making the test more robust.
1665-1672
: Proper error handling for the collation checkThis conditional statement nicely complements the MariaDB compatibility code by verifying that the collation is either the original expected value or the forced (overridden) value determined by the MariaDB configuration.
1721-1721
: Good update to standard IANA timezone namesSwitching from
US/Central
andUS/Pacific
toAmerica/New_York
andAsia/Hong_Kong
uses more standard IANA timezone identifiers, which is recommended practice.
1729-1730
: Timezone reference update consistent with the list changeThis correctly updates the reference time location to use America/New_York instead of US/Central, consistent with the timezone list change above.
1749-1749
: Updated error message matches new timezoneThe error message correctly references the new timezone name for consistency.
3577-3585
: Robust check for performance schema availabilityThis new check prevents the test from failing on MySQL instances where the performance schema is disabled. Good practice to check for prerequisites before running tests that depend on specific configurations.
please remove unrelated changes. |
nice catch about performance overhead. But I don't think "repeatedly checks configuration options" cause it. And I don't like solution too. |
added the PR #1703 as requested that correspond only to the benchmark part of this PR |
You'll see the only related change in this PR that is used in benchmark are those checks (most of the code is just adapting the test to this changes) I've specifically use a big resultset with small value. When reading rows, most of the code use is ReadPacket. This show the cost of those repeted checks. |
See #1705. |
readPacket Optimization Summary
Problem
The readPacket function is called very frequently during database operations, but it repeatedly checks configuration options that don't change after a connection is established (such as compression and ReadTimeout settings).
Solution
Create a dedicated function to handle packet reading after connection establishment, eliminating redundant checks of unchanging configuration options.
Performance Impact
This is not just a micro-optimization, as readPacket is called extremely frequently during database operations.
Benchmark Results
Before Optimization:
After Optimization:
Checklist
Summary by CodeRabbit
New Features
Refactor
Style