Hi
I'm attempting to consume data from our Confluent Cloud Kafka environment, but I keep getting an error saying that the payloads are not formatted correctly according to the AVRO format.
The exact message is this:
Data preview "MasterDataStream_Test": ["Source 'EventHubInputAdapter' had 1 occurrences of kind 'InputDeserializerError.InvalidData' between processing times '2024-08-23T14:02:15.4723431Z' and '2024-08-23T14:02:15.4723431Z'. Invalid Avro Format, drop invalid record.","Source 'EventHubInputAdapter' had 1 occurrences of kind 'InputDeserializerError.InvalidData' between processing times '2024-08-23T14:02:15.4723431Z' and '2024-08-23T14:02:15.4723431Z'. Invalid Avro Format, drop invalid record.","Source 'EventHubInputAdapter' had 1 occurrences of kind 'InputDeserializerError.InvalidData' between processing times '2024-08-23T14:02:15.4723431Z' and '2024-08-23T14:02:15.4723431Z'. Invalid Avro Format, drop invalid record.","Source 'EventHubInputAdapter' had 1 occurrences of kind 'InputDeserializerError.InvalidData' between processing times '2024-08-23T14:02:15.4723431Z' and '2024-08-23T14:02:15.4723431Z'. Invalid Avro Format, drop invalid record."]
Looking at the data insights within the eventstream, it appears that my connection can consume the data, but is unable to deserialize and display it.
We're using the AVRO format when distributing data via Kafka and we have multiple other in-house services that can consume, deserialize, and work with the formatted data.
The following is an example of a message payload:
[
  {
    "exceededFields": null,
    "headers": [],
    "key": {
      "data": [0, 0, 0, 0, 0, 13, 23, 106],
      "type": "Buffer"
    },
    "offset": 392735,
    "partition": 1,
    "timestamp": 1706708606183,
    "timestampType": "CREATE_TIME",
    "value": {
      "created": { "long": 1706708606038 },
      "created_date": { "long": 1702637923000 },
      "is_deleted": { "boolean": false },
      "order_number": { "string": "ABC123" },
      "order_states_id": { "long": 30 },
      "orders_id": { "long": 123123 },
      "public_id": { "long": 123123 },
      "source_data_version": { "long": 879127895786 },
      "styles_public_id": { "long": 123123 },
      "updated": { "long": 1706708606038 },
      "vendor_name": { "string": "VENDOR_NAME" },
      "vendor_number": { "string": "VENDOR_NUMBER" }
    }
  }
]
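For context, the wrapped values such as {"long": ...} and {"string": ...} are what Avro's JSON encoding produces for union-typed fields, so our schema declares these fields as nullable unions. Roughly, part of the schema looks like this (the record name and the selection of fields here are illustrative, not the exact schema):

```json
{
  "type": "record",
  "name": "OrderEvent",
  "fields": [
    {"name": "created", "type": ["null", "long"]},
    {"name": "is_deleted", "type": ["null", "boolean"]},
    {"name": "order_number", "type": ["null", "string"]},
    {"name": "orders_id", "type": ["null", "long"]}
  ]
}
```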
So my concern is regarding the AVRO deserialization within the eventstream activity.
Is there something wrong with our AVRO format, or could there be an issue with the way the format is handled within the eventstream?
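One thing I noticed while digging: the key buffer starts with a 0x00 byte followed by a 4-byte big-endian integer, which looks like Confluent's Schema Registry wire format (magic byte + schema ID) rather than a bare Avro datum. A minimal sketch (plain Python, no Confluent libraries) that splits that prefix off the key bytes from the payload above:

```python
import struct

def split_confluent_frame(raw: bytes):
    """Split a Confluent Schema Registry framed message into
    (magic_byte, schema_id, avro_payload).

    Wire format: 1 magic byte (0x00), then a 4-byte big-endian
    schema ID, followed by the Avro binary-encoded datum.
    """
    if len(raw) < 5:
        raise ValueError("message too short for Confluent framing")
    magic = raw[0]
    schema_id = struct.unpack(">I", raw[1:5])[0]
    return magic, schema_id, raw[5:]

# The "key" buffer from the example payload above
key_bytes = bytes([0, 0, 0, 0, 0, 13, 23, 106])
magic, schema_id, payload = split_confluent_frame(key_bytes)
print(magic, schema_id, payload)
```

If the eventstream deserializer expects a bare Avro record, a 5-byte Schema Registry prefix like this on every message could explain the InvalidData errors, but I'm not sure whether that's actually what is happening here.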