Supported versions: 4.3+
Control AVRO record name and namespace
When writing output as AVRO
, Lenses creates schemas for you, automatically generating AVRO
record names.
In this tutorial we will learn how to override the default record naming strategy.
SET
statement to control the record and namespace name generated for the AVRO schema.
Setting up our input topic
We are going to create and populate a topic that we will later use in a couple of SQL Processors.
In SQL Studio, create the topic running the following query:
CREATE TABLE mytopic(a string, b string, c string) FORMAT (STRING, AVRO);
For the purposes of our tutorial, it is enough to insert a single topic:
INSERT INTO mytopic(a, b, c) VALUES ("a", "b", "c");
Create a simple SQL Processor
We are now going to create a processor that will show the default behaviour of AVRO
record naming in Lenses.
The processor does not do much, it just reshapes the fields of the original topic, putting some of them in a nested field:
SET defaults.topic.autocreate=true;
INSERT INTO mytopic_2
SELECT STREAM a as x.a, b as x.y.b, c as x.y.c
FROM mytopic
We then start the processor. Lenses will create the new topic mytopic_2
, and new schema will be created in the Schema Registry,
as soon as the first (and only) record is processed.
If we inspect the value schema of mytopic_2
, we see that this is the one generated:
{
"type": "record",
"name": "record1",
"fields": [
{
"name": "x",
"type": {
"type": "record",
"name": "record0",
"fields": [
{
"name": "a",
"type": "string"
},
{
"name": "y",
"type": {
"type": "record",
"name": "record",
"fields": [
{
"name": "b",
"type": "string"
},
{
"name": "c",
"type": "string"
}
]
}
}
]
}
}
]
}
As we can see, each record
type has a name (it is mandatory in AVRO
), and Lenses has generated those names automatically
for us (record
, record0
, record1
etc.).
Set the record name and the namespace of the value schema
We are now going to see how to override that default behaviour.
Let’s create and start the new processor with the following SQL:
SET defaults.topic.autocreate=true;
SET defaults.topic.value.avro.record="myRecordName";
SET defaults.topic.value.avro.namespace="myNamespace";
INSERT INTO mytopic_3
SELECT STREAM a as x.a, b as x.y.b, c as x.y.c
FROM mytopic
Notice how we added the new SET
statements to the query:
SET defaults.topic.value.avro.record="myRecordName";
SET defaults.topic.value.avro.namespace="myNamespace";
These settings are telling Lenses to set the root record name and namespace to the values specified.
If we now check the value schema for mytopic_3
we get:
{
"type": "record",
"name": "myRecordName",
"namespace": "myNamespace",
"fields": [
{
"name": "x",
"type": {
"type": "record",
"name": "record0",
"fields": [
{
"name": "a",
"type": "string"
},
{
"name": "y",
"type": {
"type": "record",
"name": "record",
"fields": [
{
"name": "b",
"type": "string"
},
{
"name": "c",
"type": "string"
}
]
}
}
]
}
}
]
}
As we can see, the root record
element has now name myRecordName
and namespace myNamespace
.
Notice how the settings did not affect nested records.
Set the record name and the namespace of the key schema
If the key of the generated topic has AVRO
format as well, you can use the following analogous settings to control
the key record name and namespace:
SET defaults.topic.key.avro.record="myRecordName";
SET defaults.topic.key.avro.namespace="myNamespace";
More control on the topic affected by the setting
A setting like the one we used before for the value schema:
SET defaults.topic.value.avro.record="myRecordName"
will affect all the topics used by the processor.
If you want instead to target a single topic, you can use the topic-specific version:
SET topic.mytopic_3.value.avro.record="myRecordName"
The setting above will override the record name only for topic mytopic_3
. Other topics will not be affected and
will keep using the default naming strategy.
More settings
You can check our documentation to see all the settings available in Lenses for your SQL Processors.