Remove number validation on deserialization for DynamoDB resources #4699
jonathan343 wants to merge 3 commits into develop
Thanks. Please note that the explanation that this situation can only arise if the item is written by a different library is not accurate. In the issue I created I had the following example, where the write was also done in boto3, and worked - only the read failed:
table.update_item(Key={'p': p}, UpdateExpression='SET a = :val',
    ExpressionAttributeValues={':val': Decimal("1e100")})
Right, thanks! I updated the description to mention this case.
I think we still want to remove the I think the remaining
Thanks for the review @rowanseymour, and apologies for the delayed response.
Since the DynamoDB resource interface is feature frozen, I think it is better for the resource layer to defer numeric validation to DynamoDB rather than continue enforcing boto3-local decimal rules that can drift from the service. This issue is an example of that drift already causing valid DynamoDB numbers to fail in boto3.
I've updated the PR to remove client-side validation for serialization as well. The change still keeps local Python-side checks for
I'll start asking for reviews from other boto3 team members so we can get the ball rolling on this. Thanks again for your input!
def _serialize_n(self, value):
    number = str(DYNAMODB_CONTEXT.create_decimal(value))
    if number in ['Infinity', 'NaN']:
    decimal_value = Decimal(value)
This is less familiar code for me, so I'm still trying to reason about all of this.
One alternate solution here could just be removing the rounded trap. Here's the test code I wrote to check it will work:
import decimal

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('table-name')
print(table.put_item(Item={
    'primary_key': "test",
    # This breaks with the `decimal.Rounded` trap, but is fine with the `decimal.Inexact` trap,
    # the difference being that Inexact will allow "rounding" if it's not losing precision.
    # Adding a 1 on the end would break both, as expected.
    'precise_number': decimal.Decimal("0.12345678901234567890123456789012345678000000")
}))
print(table.get_item(Key={"primary_key": "test"}))
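The difference between the two traps can also be checked locally, without a table. The following is a minimal sketch using only the standard library; `STRICT`, `RELAXED`, and `accepts` are hypothetical names, and the two contexts are simplified stand-ins chosen for illustration rather than boto3's actual `DYNAMODB_CONTEXT`.

```python
from decimal import Context, Inexact, Rounded

# Hypothetical contexts for illustration; neither is boto3's DYNAMODB_CONTEXT.
STRICT = Context(prec=38, traps=[Inexact, Rounded])   # traps any rounding at all
RELAXED = Context(prec=38, traps=[Inexact])           # traps only lossy rounding


def accepts(context, text):
    """Return True if `context` converts `text` without a trapped signal."""
    try:
        context.create_decimal(text)
    except (Inexact, Rounded):
        return False
    return True


trailing_zeros = "0.12345678901234567890123456789012345678000000"
extra_digit = trailing_zeros + "1"  # the appended 1 cannot be kept in 38 digits

for label, text in [("trailing zeros", trailing_zeros), ("extra digit", extra_digit)]:
    print(label, "strict:", accepts(STRICT, text), "relaxed:", accepts(RELAXED, text))
```

Under these assumptions, only the relaxed context accepts the trailing-zero value, and both contexts reject the value with the extra digit.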
This also shifts where an exception happens, so a customer relying on us to raise any of the former exceptions (such as when the number is too big for DDB to accept) would see different behavior. For a ham-fisted example:
import decimal

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('test-table')
biiiiig = "1e500"
try:
    print(table.put_item(Item={
        'primary_key': "test_biiiiig",
        'precise_number': decimal.Decimal(biiiiig)
    }))
except decimal.Overflow as e:
    print("Hi end user, we are sorry but that number is too big for us to store in our database")
    raise
# Before, we would never reach this code path, but now we will start getting a client error
# that's not caught above.
response = table.get_item(Key={"primary_key": "test_biiiiig"})
print("Your number is: " + str(response.get('Item').get('precise_number')))
Outside of this, I think this is a safe change, but I prefer not to break customers' try/except blocks when we can avoid it. What do you think?
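For callers who depend on a local `decimal.Overflow`, one option that does not rely on the resource layer at all is to check the value against an application-owned context before calling `put_item`. This is only a sketch: `LOCAL_LIMITS`, `check_number`, and `describe` are hypothetical names, and the precision and exponent bounds are assumptions picked for illustration, not values taken from this PR.

```python
from decimal import Context, Decimal, Overflow

# Hypothetical application-owned limits; the bounds are assumed, not boto3's.
LOCAL_LIMITS = Context(prec=38, Emin=-128, Emax=126, traps=[Overflow])


def check_number(value):
    """Raise decimal.Overflow locally if `value` exceeds the assumed exponent range."""
    return LOCAL_LIMITS.create_decimal(value)


def describe(value):
    """Map the local check onto a user-facing message."""
    try:
        check_number(value)
    except Overflow:
        return "too big"
    return "ok"


print(describe(Decimal("1e500")))
print(describe(Decimal("1e100")))
```

A check like this keeps an existing `except decimal.Overflow` branch reachable on the caller's side, while errors for anything else would surface from the service call.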
Important
High-level resources in boto3 are feature frozen; however, this is a bugfix that addresses a valid use case that is currently broken.
Summary
This PR fixes DynamoDB number handling for high-level resources when values have trailing zeros that inflate the total digit count without exceeding DynamoDB's 38 significant-digit limit.
The final behavior is narrower than removing validation entirely:
- Deserialization uses Decimal(value) directly for DynamoDB response numbers
- Serialization keeps DYNAMODB_CONTEXT for client-side validation
- DYNAMODB_CONTEXT no longer traps Rounded, which allows valid trailing-zero values while still rejecting truly inexact values
This preserves existing write-side validation while removing the false failures that prevent customers from reading back or re-serializing valid DynamoDB numbers.
Addresses: #2500, #4693
Background
Both serialization and deserialization currently use DYNAMODB_CONTEXT.create_decimal() to validate numbers:
Serialization:
boto3/boto3/dynamodb/types.py
Lines 213 to 217 in a6ff277
Deserialization:
boto3/boto3/dynamodb/types.py
Lines 288 to 289 in a6ff277
This works for most customers because numbers that pass serialization will also pass deserialization. However, data that does not flow through the DynamoDB resource serialization/deserialization logic can fail when read back through resources. Examples include data written by other SDKs (Go, Java, etc.), via the low-level client, or using UpdateExpression and ExpressionAttributeValues.
Root Cause
DynamoDB limits numbers to 38 significant digits, while Python's Decimal context can signal Rounded when normalizing values with trailing zeros even when no significant digits are lost.
1234567895171680000000000000000000000000: Rounded exception
1E+100
This mismatch causes valid DynamoDB numbers to raise decimal.Rounded during deserialization.
It also creates an asymmetry: a value that DynamoDB accepts and stores may not be readable or re-serializable through the boto3 resource layer because boto3 is enforcing its own local decimal rules.
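The gap between total digits and significant digits can be seen with the first value above. The sketch below is illustrative only: `STRICT`, `significant_digits`, `read_strict`, and `read_plain` are hypothetical names, and `STRICT` is a simplified stand-in for a context that traps `Rounded`, not boto3's actual `DYNAMODB_CONTEXT`.

```python
from decimal import Context, Decimal, Inexact, Rounded

# Hypothetical stand-in for a 38-digit context that traps Rounded.
STRICT = Context(prec=38, traps=[Inexact, Rounded])


def significant_digits(text):
    """Count coefficient digits after dropping trailing zeros."""
    digits = "".join(str(d) for d in Decimal(text).as_tuple().digits)
    return len(digits.rstrip("0")) or 1


def read_strict(text):
    """Convert with context rules applied, as create_decimal() does."""
    return STRICT.create_decimal(text)


def read_plain(text):
    """Convert exactly, with no context precision or traps applied."""
    return Decimal(text)


stored = "1234567895171680000000000000000000000000"
print(len(stored), "digits in total,", significant_digits(stored), "significant")
print(read_plain(stored))
try:
    read_strict(stored)
except Rounded:
    print("strict read raised decimal.Rounded")
```

The value has 40 digits but only 15 significant ones, so under these assumptions the plain `Decimal(value)` conversion succeeds while the strict context raises `decimal.Rounded`.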