feat: Update TimeKeyEncoder to Preserve Lexicographical Order #21
base: main
Conversation
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (6)
keys.go (2)
71-75: Consider enhancing error message for debugging.
The implementation correctly decodes the binary format and ensures UTC timezone consistency. However, the error message could be more helpful for debugging.
Consider enhancing the panic message:
-	panic("invalid time key")
+	panic(fmt.Errorf("invalid time key: expected 8 bytes, got %d bytes %s", len(b), HumanizeBytes(b)))
63-75: Document breaking changes and provide migration guidance.
This change in time key encoding is a breaking change that affects data persistence. To ensure a smooth upgrade path:
- Document the breaking change in CHANGELOG.md
- Provide migration scripts or guidance for users to re-encode existing time-based keys
- Consider adding version bytes to the key encoding to support future changes (a sketch follows below)
Would you like assistance in drafting the migration documentation or implementing a migration helper function?
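As a rough illustration of the versioning idea only (not part of this PR), a constant prefix byte could be added without affecting ordering within a single version, since the prefix is the same for every key. The constant and function names are hypothetical, and the snippet assumes the encoding/binary, fmt, and time packages are imported.
// Hypothetical versioned variant of the time key encoding (illustrative only).
const timeKeyVersion byte = 1

func encodeVersionedTimeKey(t time.Time) []byte {
	b := make([]byte, 9)
	b[0] = timeKeyVersion // constant prefix: ordering within a version is unchanged
	binary.BigEndian.PutUint64(b[1:], uint64(t.UnixNano()))
	return b
}

func decodeVersionedTimeKey(b []byte) (time.Time, error) {
	if len(b) != 9 || b[0] != timeKeyVersion {
		return time.Time{}, fmt.Errorf("invalid versioned time key: %x", b)
	}
	return time.Unix(0, int64(binary.BigEndian.Uint64(b[1:]))).UTC(), nil
}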
keyset_test.go (4)
108-108: Simplify time comparisons using time.Equal
When comparing time.Time values, you can use the built-in Equal method instead of stripping monotonic clock readings and comparing UnixNano timestamps. This simplifies the code and improves readability.
Apply this refactor to simplify time comparisons:
-expectedTime := times[i].Round(0)
-actualTime := k.Round(0)
-require.Equal(t, expectedTime.UnixNano(), actualTime.UnixNano())
+require.True(t, times[i].Equal(k))
Repeat this change in all test functions where time comparisons are performed.
Also applies to: 144-144, 154-154, 192-192, 213-213, 251-251
220-227: Extend tests with sub-second time differences
The current tests use time differences in whole seconds. To thoroughly verify the lexicographical order preservation, consider adding times with sub-second differences. This ensures the encoding handles times with finer granularity.
Extend the times slice with sub-second time values:
 times := []time.Time{
 	now.Add(1 * time.Second),
 	now.Add(2 * time.Second),
 	now.Add(3 * time.Second),
 	now.Add(4 * time.Second),
 	now.Add(5 * time.Second),
+	now.Add(5*time.Second + 500*time.Millisecond),
+	now.Add(5*time.Second + 750*time.Millisecond),
 }
255-275: Test insertion of times with identical wall times but different monotonic clocks
Currently, the test inserts the exact same time.Time value multiple times. Consider testing with times that have the same wall clock time but different monotonic clock readings to ensure that they are treated as equal keys in the KeySet.
Modify the test to create times with identical wall times:
 now := time.Date(2021, time.January, 1, 0, 0, 0, 0, time.UTC)
+nowWithDifferentMonotonic := now.Add(0 * time.Nanosecond)
+
 // Insert the same time multiple times (should only be stored once in a set)
 keyset.Insert(ctx, now)
+keyset.Insert(ctx, nowWithDifferentMonotonic)
 keyset.Insert(ctx, now)
 keyset.Insert(ctx, now)
Verify that only one key exists in the set:
 require.Equal(t, 1, len(keys))
+require.True(t, keys[0].Equal(now))
+require.True(t, keys[0].Equal(nowWithDifferentMonotonic))
149-155: Ensure encoding and decoding handle time zones correctly
In TestTimeKeyEncoder_EncodeDecode, consider adding checks to verify that the encoding and decoding process correctly handles time.Time values with different locations (time zones).
Extend the test to include times with different locations:
 now := time.Date(2021, time.January, 1, 0, 0, 0, 0, time.UTC)
+loc, _ := time.LoadLocation("America/New_York")
+nowNY := now.In(loc)

 encoded := TimeKeyEncoder.Encode(now)
 _, decoded := TimeKeyEncoder.Decode(encoded)
+encodedNY := TimeKeyEncoder.Encode(nowNY)
+_, decodedNY := TimeKeyEncoder.Decode(encodedNY)

 // Compare UnixNano timestamps
 require.Equal(t, now.UnixNano(), decoded.UnixNano())
+require.Equal(t, nowNY.UnixNano(), decodedNY.UnixNano())
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- go.mod (1 hunks)
- keys.go (2 hunks)
- keyset_test.go (2 hunks)
🔇 Additional comments (3)
keys.go (1)
4-4: LGTM: Required import for binary encoding.
The addition of the "encoding/binary" import is necessary for the new BigEndian encoding implementation.
go.mod (1)
13-13: LGTM! Verify gogoproto usage.
The addition of gogoproto as a direct dependency is appropriate given its role in the new time encoding implementation.
Let's verify the new dependency usage:
✅ Verification successful
Let me check the usage of gogoproto in value_encoder.go to understand how it's being utilized.
Let me check the specific context where cosmosproto.Message is being used to understand its role in the implementation.
Gogoproto dependency is correctly used and implemented
The direct dependency on gogoproto is justified and properly implemented:
- It's used in value_encoder.go for protobuf message handling through the cosmosproto.Message interface
- The implementation is type-safe, using Go generics in the ProtoValueEncoder function
- The usage aligns with the PR's objectives for efficient serialization
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that gogoproto is being used in the codebase
# Expected: Find imports and usages of gogoproto, particularly in time-related encoding

# Search for gogoproto imports
echo "Searching for gogoproto imports..."
rg "github.com/cosmos/gogoproto"

# Search for potential proto definitions using gogoproto
echo "Searching for proto definitions using gogoproto..."
fd -e .proto -x rg -l "gogoproto"

Length of output: 652
Script:
#!/bin/bash
# Check the context of gogoproto usage in value_encoder.go
echo "Checking value_encoder.go for gogoproto usage..."
rg -B 3 -A 3 "github.com/cosmos/gogoproto/proto" value_encoder.go

# Look for actual usage of the cosmosproto package
echo "Checking for cosmosproto usage..."
rg "cosmosproto\." value_encoder.go

Length of output: 516
Script:
#!/bin/bash
# Get the full context of cosmosproto.Message usage
echo "Checking context of cosmosproto.Message usage..."
rg -B 5 -A 5 "cosmosproto\.Message" value_encoder.go

# Also check if there are any proto files that might be related to this implementation
echo "Looking for relevant proto files..."
fd -e proto

Length of output: 870
keyset_test.go (1)
59-275: Overall excellent test coverage
The added tests comprehensively cover the functionality of KeySet with time.Time keys, including insertion, deletion, iteration in both ascending and descending orders, range queries, and handling of duplicate keys.
func (timeKey) Encode(t time.Time) []byte {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, uint64(t.UnixNano()))
	return b
}
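For intuition about the ordering property discussed below, here is a small standalone sketch (not from the PR) showing that, for times in the supported range, comparing the encoded bytes with bytes.Compare agrees with chronological order; the local encode function simply mirrors the Encode method shown above.
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"time"
)

// encode mirrors the big-endian UnixNano key encoding shown above.
func encode(t time.Time) []byte {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, uint64(t.UnixNano()))
	return b
}

func main() {
	earlier := time.Date(2021, time.January, 1, 0, 0, 0, 0, time.UTC)
	later := earlier.Add(time.Millisecond)
	// Prints -1: the encoding of the earlier time sorts first lexicographically.
	fmt.Println(bytes.Compare(encode(earlier), encode(later)))
}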
💡 Codebase verification
Time range limitation needs documentation and testing
The implementation's use of UnixNano() introduces time range constraints (years 1678 to 2262), but this limitation is neither documented nor tested. Consider:
- Add documentation in keys.go explaining the valid time range
- Add test cases with edge timestamps to verify behavior near the bounds (a sketch follows below)
- Consider using a different encoding scheme if wider time range support is required
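One possible shape for such an edge-timestamp test, sketched on the assumption that TimeKeyEncoder exposes the Encode/Decode pair used elsewhere in this PR; the test name and the specific dates near the bounds are illustrative, and the time, testing, and require imports are assumed.
// Illustrative round-trip check near the int64 UnixNano bounds (~1678 and ~2262).
func TestTimeKeyEncoder_Bounds(t *testing.T) {
	bounds := []time.Time{
		time.Date(1679, time.January, 1, 0, 0, 0, 0, time.UTC),      // near the lower bound
		time.Date(2261, time.December, 31, 23, 59, 59, 0, time.UTC), // near the upper bound
	}
	for _, tm := range bounds {
		encoded := TimeKeyEncoder.Encode(tm)
		_, decoded := TimeKeyEncoder.Decode(encoded)
		require.Equal(t, tm.UnixNano(), decoded.UnixNano())
	}
}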
🔗 Analysis chain
Implementation ensures lexicographical ordering but has time range limitations.
The new implementation correctly uses binary encoding of Unix nanoseconds to ensure lexicographical ordering matches chronological ordering. However, there are some important considerations:
- This is a breaking change as existing encoded keys will not be decodable.
- The use of UnixNano() limits the valid time range to years 1678 to 2262 due to the int64 constraints.
Let's verify the time range limitation:
The search results show that all tests use timestamps from 2021, with relatively small time offsets (hours/seconds). There's no evidence of testing the time range limitations. Let's search for any documentation or comments about time range constraints.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for any time values or tests that might be affected by the year 2262 limitation
rg -i "time\.(date|parse|unix)" -A 5
Length of output: 3537
Script:
#!/bin/bash
# Search for any documentation or comments about time constraints
rg -i "(unix|time).*(limit|constraint|range|bound)"
Length of output: 117
sort.Slice(times, func(i, j int) bool {
	return times[i].Before(times[j])
})
🛠️ Refactor suggestion
Refactor duplicate sorting logic into helper functions
The sorting logic for time slices is repeated across multiple test functions. Refactoring this code into helper functions will reduce duplication and enhance maintainability.
Apply this refactor by adding helper functions:
// Helper function to sort times in ascending order
func sortTimesAsc(times []time.Time) []time.Time {
sortedTimes := make([]time.Time, len(times))
copy(sortedTimes, times)
sort.Slice(sortedTimes, func(i, j int) bool {
return sortedTimes[i].Before(sortedTimes[j])
})
return sortedTimes
}
// Helper function to sort times in descending order
func sortTimesDesc(times []time.Time) []time.Time {
sortedTimes := make([]time.Time, len(times))
copy(sortedTimes, times)
sort.Slice(sortedTimes, func(i, j int) bool {
return sortedTimes[i].After(sortedTimes[j])
})
return sortedTimes
}
Then, replace the sorting code in your test functions with these helper functions:
-// Sort times in ascending order
-sortedTimesAsc := make([]time.Time, len(times))
-copy(sortedTimesAsc, times)
-sort.Slice(sortedTimesAsc, func(i, j int) bool {
- return sortedTimesAsc[i].Before(sortedTimesAsc[j])
-})
+sortedTimesAsc := sortTimesAsc(times)
-// Sort times in descending order
-sortedTimesDesc := make([]time.Time, len(times))
-copy(sortedTimesDesc, times)
-sort.Slice(sortedTimesDesc, func(i, j int) bool {
- return sortedTimesDesc[i].After(sortedTimesDesc[j])
-})
+sortedTimesDesc := sortTimesDesc(times)
Also applies to: 130-133, 176-181, 196-201
Update TimeKeyEncoder to Preserve Lexicographical Order
Summary
This pull request updates the TimeKeyEncoder in the collections module to ensure that time.Time keys are encoded in a lexicographically sortable manner that aligns with chronological order. This change fixes issues with iterators that rely on time-based keys, particularly when retrieving data in descending order.
Background
Previously, the TimeKeyEncoder used sdk.FormatTimeBytes(t) and sdk.ParseTimeBytes(b) for encoding and decoding time.Time values. This approach encodes times into RFC3339 string formats, which do not guarantee that the lexicographical order of the encoded bytes matches the chronological order of the times. As a result, iterators operating over these keys could return data in an incorrect order.
This issue became evident when attempting to retrieve the latest snapshots in modules like the Oracle module, where iterators did not return the most recent data as expected.
4. Enhanced Testing
Added tests to verify that time.Time values work as expected and that iterators return data in the correct order.
Implications
Data Migration
- This update changes how time.Time keys are encoded, which means existing data stored with the previous encoding will not be accessible using the new encoder.
- Existing time-based keys must be re-encoded with the new TimeKeyEncoder. Without migration, data retrieval for time-based keys will fail or return incorrect results. (A rough migration sketch follows below.)
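As a rough starting point only (not part of this PR), a one-off migration helper could decode each stored key with the legacy sdk.ParseTimeBytes format and re-encode it with the new fixed-width layout. The function name is hypothetical, store iteration and replacement of entries are left to the caller, and the encoding/binary and cosmos-sdk types (sdk) imports are assumed.
// Hypothetical helper: re-encode a legacy time key into the new big-endian form.
func migrateTimeKey(oldKey []byte) ([]byte, error) {
	t, err := sdk.ParseTimeBytes(oldKey) // legacy string-based time encoding
	if err != nil {
		return nil, err
	}
	newKey := make([]byte, 8)
	binary.BigEndian.PutUint64(newKey, uint64(t.UnixNano()))
	return newKey, nil
}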
Backward Compatibility
- This change is not backward compatible with the previous TimeKeyEncoder. Users must be aware of this when upgrading and plan for data migration.
Summary by CodeRabbit
New Features
Tests
- Added comprehensive tests for KeySet with time.Time keys, ensuring proper insertion, deletion, iteration, and encoding/decoding.