[BugFix] Fix check overflow for widening conversion (backport #51280) #51428

ZiheLiu · 2024-09-26T03:20:58Z

Cherry-picked from #51280.

Why I'm doing:

This bug was introduced by PR #49707. That PR fixed an issue in the overflow check when casting a double to an integral type, and it abstracted CastExpr::NumberCheck into check_number_overflow so that the same issue could be resolved for both Json and Avro formats.

However, CastExpr::NumberCheck only handles narrowing conversions, not widening conversions, because it is invoked only during narrowing conversions.

#define UNARY_FN_CAST_VALID(FROM_TYPE, TO_TYPE, UNARY_IMPL)                                                            \
    template <bool AllowThrowException>                                                                                \
    struct CastFn<FROM_TYPE, TO_TYPE, AllowThrowException> {                                                           \
        static ColumnPtr cast_fn(ColumnPtr& column) {                                                                  \
            if constexpr (std::numeric_limits<RunTimeCppType<TO_TYPE>>::max() <                                        \
                          std::numeric_limits<RunTimeCppType<FROM_TYPE>>::max()) {                                     \
                if constexpr (!AllowThrowException) {                                                                  \
                    return VectorizedInputCheckUnaryFunction<UNARY_IMPL, NumberCheck>::template evaluate<FROM_TYPE,    \
                                                                                                         TO_TYPE>(     \
                            column);                                                                                   \
                } else {                                                                                               \
                    return VectorizedInputCheckUnaryFunction<                                                          \
                            UNARY_IMPL, NumberCheckWithThrowException>::template evaluate<FROM_TYPE, TO_TYPE>(column); \
                }                                                                                                      \
            }                                                                                                          \
            return VectorizedStrictUnaryFunction<UNARY_IMPL>::template evaluate<FROM_TYPE, TO_TYPE>(column);           \
        }                                                                                                              \
    };

When performing a widening conversion, the expression value > (**FromType**)std::numeric_limits<ToType>::max() may return true. This occurs because converting the maximum value of the larger type (ToType) into the smaller type (FromType) overflows and can produce a negative value.
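The false positive can be reproduced in a few lines. This is a minimal sketch, not the StarRocks code: `naive_upper_overflow` is a hypothetical stand-in for the upper-bound half of the check, and the `-1` result assumes a two's-complement target (guaranteed from C++20, implementation-defined but typical before that). It needs C++17 for message-less `static_assert`.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the upper-bound part of the overflow check.
// It casts ToType's maximum down to FromType, as in the expression above.
template <typename FromType, typename ToType>
constexpr bool naive_upper_overflow(FromType value) {
    return value > static_cast<FromType>(std::numeric_limits<ToType>::max());
}

// Truncating the int64 maximum to int32 keeps the low 32 bits (all ones),
// which reads as -1 on a two's-complement target.
static_assert(static_cast<std::int32_t>(std::numeric_limits<std::int64_t>::max()) == -1);

// Widening int32 -> int64: 5 fits, yet the naive check reports overflow.
static_assert(naive_upper_overflow<std::int32_t, std::int64_t>(5));

// Narrowing int64 -> int32: the same check behaves as intended.
static_assert(!naive_upper_overflow<std::int64_t, std::int32_t>(5));
static_assert(naive_upper_overflow<std::int64_t, std::int32_t>(std::int64_t{1} << 40));
```

Every non-negative int32 value is greater than -1, so on the widening path almost all valid inputs would be flagged.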

Issues in Json Import:

int64 to double: This triggers the issue described above. The check casts the maximum double value to int64, which is out of range and therefore undefined behavior.
- Release mode: Luckily, the cast yields the maximum int64 value, so no issue arises.
- Debug mode: The cast could yield the minimum int64 value, so the check can wrongly report overflow.
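One way to avoid evaluating the out-of-range cast at all is to decide at compile time whether the conversion can exceed the target's upper bound. The sketch below illustrates that idea only; `upper_bound_always_fits` and `guarded_upper_overflow` are hypothetical names, it covers just the upper bound for integral sources, and it is not necessarily how this PR implements the fix.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper (not from StarRocks): true when ToType's maximum is
// known to be at least FromType's maximum, decided without any value cast.
template <typename FromType, typename ToType>
constexpr bool upper_bound_always_fits() {
    if constexpr (std::is_integral_v<FromType> && std::is_floating_point_v<ToType>) {
        // float and double maxima are far above any 64-bit integer maximum.
        return true;
    } else if constexpr (std::is_integral_v<FromType> && std::is_integral_v<ToType>) {
        // digits counts value bits, so at least as many digits means
        // at least as large a maximum.
        return std::numeric_limits<ToType>::digits >= std::numeric_limits<FromType>::digits;
    } else {
        // Floating-point sources need their own boundary handling; not modeled here.
        return false;
    }
}

template <typename FromType, typename ToType>
constexpr bool guarded_upper_overflow(FromType value) {
    if constexpr (upper_bound_always_fits<FromType, ToType>()) {
        // Widening: the cast in the other branch is never instantiated.
        return false;
    } else {
        static_assert(std::is_integral_v<FromType> && std::is_integral_v<ToType>,
                      "this sketch covers integral narrowing only");
        // Narrowing: ToType's maximum is exactly representable in FromType.
        return value > static_cast<FromType>(std::numeric_limits<ToType>::max());
    }
}
```

With this guard, `guarded_upper_overflow<std::int64_t, double>` never converts the maximum double to int64, so the result no longer depends on build mode.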

Issues in Avro Import:

int64 to double: The same issue as in the Json import.
int32 to int64: This also suffers from the overflow issue during widening conversions.

Special Case: Values in the Range $[2^{63}, 2^{64})$

When the value falls within the range $[2^{63}, 2^{64})$, Json interprets the FromType as uint64_t. In such cases, using check_signed_number_overflow remains correct:

If the ToType is any signed integer (int8_t to int64_t), check_signed_number_overflow will return true, indicating overflow.
If the ToType is float, double, or int128_t, check_signed_number_overflow will return false since the widening conversion path is followed.
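The two outcomes above can be modeled with a small sketch. `u64_overflows` is a hypothetical function that only mirrors the described behavior for a uint64_t source; it is not check_signed_number_overflow itself, and the int128_t case is left out because 128-bit integers are a compiler extension.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical model of the behavior described above for a uint64_t source.
template <typename ToType>
constexpr bool u64_overflows(std::uint64_t value) {
    if constexpr (std::is_floating_point_v<ToType>) {
        // Widening path: float and double cover the whole uint64 range.
        return false;
    } else {
        static_assert(std::is_integral_v<ToType> && std::is_signed_v<ToType>,
                      "this sketch covers signed integer targets only");
        // A signed maximum is non-negative, so it converts to uint64 exactly.
        return value > static_cast<std::uint64_t>(std::numeric_limits<ToType>::max());
    }
}

// Smallest and largest values in the range [2^63, 2^64).
constexpr std::uint64_t kRangeLow = std::uint64_t{1} << 63;
constexpr std::uint64_t kRangeHigh = std::numeric_limits<std::uint64_t>::max();

static_assert(u64_overflows<std::int8_t>(kRangeLow));
static_assert(u64_overflows<std::int64_t>(kRangeLow));
static_assert(!u64_overflows<double>(kRangeHigh));
static_assert(!u64_overflows<float>(kRangeHigh));
```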

What I'm doing:

Fixes https://github.com/StarRocks/StarRocksTest/issues/8603

What type of PR is this:

Does this PR entail a change in behavior?

Yes, this PR will result in a change in behavior.
No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

Interface/UI changes: syntax, type conversion, expression evaluation, display information
Parameter changes: default values, similar parameters but with different default values
Policy changes: use new policy to replace old one, functionality automatically enabled
Feature removed
Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

I have added test cases for my bug fix or my new feature
This pr needs user documentation (for new or modified features or behaviors)
- I have added documentation for my new feature or new function
This is a backport pr

Bugfix cherry-pick branch check:

This is an automatic backport of pull request #51280 done by [Mergify](https://mergify.com).

Signed-off-by: zihe.liu <ziheliu1024@gmail.com>

fix

e433364


mergify bot assigned ZiheLiu Sep 26, 2024
