Data policies enable compliance with data protection regulations.
Policies operate in a secure layer of the architecture and do not affect
your raw Kafka data or applications.
Lenses follows the National Institute of Standards and Technology (NIST) standards. Policies give you the ability to apply masking to specific Kafka fields across all Lenses channels (UI/CLI/API/SQL).
Each policy defines:
- Redaction Policy: Whether to protect messages at a field level.
- Category: Category of sensitivity in the data.
- Impact: The business impact levels concerning the data.
- Fields: Definition of fields that the data policy will apply to.
Create a policy
To create a policy:
1. Navigate to Policies
2. Click the New Policy button
No data policies are enabled by default. Import automatically recommended data policies, or create a new one.
In the above example, we are:
- Creating a new Credit Card Numbers data policy with:
  - LAST-4 as the redaction policy
  - Financial Data as the data category
  - HIGH as the impact to the business
- Applying this policy to all datasets (Kafka topics or Elasticsearch indexes) that begin with user_. Note that if no datasets are specified, the policy will apply to all datasets.
- Applying this policy to the field credit_card in the datasets matching the above.
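The example policy above can be pictured as a small data structure. The following is only an illustrative sketch: the DataPolicy class, its attribute names, and the applies_to_dataset helper are hypothetical and are not the Lenses schema or API. It shows the dataset rule described above, where an empty dataset list means the policy covers every dataset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataPolicy:
    """Hypothetical in-memory model of a data policy (not the Lenses schema)."""
    name: str
    redaction: str
    category: str
    impact: str
    fields: List[str]
    # Assumed representation of "datasets that begin with ..." as prefixes.
    dataset_prefixes: List[str] = field(default_factory=list)

    def applies_to_dataset(self, dataset: str) -> bool:
        # No dataset restriction means the policy covers every dataset.
        if not self.dataset_prefixes:
            return True
        return any(dataset.startswith(p) for p in self.dataset_prefixes)


policy = DataPolicy(
    name="Credit Card Numbers",
    redaction="LAST-4",
    category="Financial Data",
    impact="HIGH",
    fields=["credit_card"],
    dataset_prefixes=["user_"],
)
```

With this model, policy.applies_to_dataset("user_payments") is true, while a dataset named orders is not covered.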
Data policies in action
Once the above has been created, Lenses will automatically identify ALL Kafka topics and Elasticsearch indexes that contain credit card info.
Any data on Kafka that contains credit_card information, whether serialized as Avro, JSON, XML or even ProtoBuf, will automatically be detected.
Apart from identifying all the sensitive data at a field level, Lenses will also protect the data for you.
That means that anyone accessing data via Lenses (UI/CLI/Python) can access production data while respecting the sensitivity of the underlying data.
Lenses applies the masking to data any time you request to access it. The available policies are:
- None: Track sensitive data, but do not protect it.
- Last-4: Display the last 4 characters of the value.
- First-4: Display the first 4 characters of the value.
- Initials: Display the first letter of each word.
- Email: Mask the email address, showing the domain name.
- All: Mask the entire value.
- Number-to-negative-one: Replace a numeric value with -1. Note that this only affects numeric types; it will have no effect on strings that contain numbers.
- Number-to-zero: Replace a numeric value with 0. Note that this only affects numeric types; it will have no effect on strings that contain numbers.
- Number-to-null: Replace a numeric value with null. Note that this only affects numeric types; it will have no effect on strings that contain numbers.
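To make the behaviour of these redaction policies concrete, here is a rough Python sketch of what each one might do to a single value. It is not the Lenses implementation: the function names, the choice of * as the mask character, and the exact output shape (for example, whether Initials keeps the word length) are assumptions made for illustration.

```python
from typing import Any, Callable

MASK = "*"  # assumed mask character; the real output format may differ


def last4(value: str) -> str:
    # Keep the last 4 characters, mask everything before them.
    return MASK * max(len(value) - 4, 0) + value[-4:]


def first4(value: str) -> str:
    # Keep the first 4 characters, mask everything after them.
    return value[:4] + MASK * max(len(value) - 4, 0)


def initials(value: str) -> str:
    # Keep the first letter of each whitespace-separated word.
    return " ".join(w[0] + MASK * (len(w) - 1) for w in value.split())


def email(value: str) -> str:
    # Mask the local part, keep the "@" and the domain.
    local, sep, domain = value.partition("@")
    return MASK * len(local) + sep + domain


def mask_all(value: str) -> str:
    return MASK * len(value)


def number_to(replacement: Any) -> Callable[[Any], Any]:
    # Only numeric types are replaced; strings holding digits pass through.
    def apply(value: Any) -> Any:
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        return replacement if is_number else value
    return apply


number_to_negative_one = number_to(-1)
number_to_zero = number_to(0)
number_to_null = number_to(None)
```

Under these assumptions, last4("4111111111111111") keeps only the trailing 1111, and number_to_zero("42") returns the string "42" unchanged because it is not a numeric type.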
Advanced field specification
In the case of nested data, it is possible to specify nested fields using the . separator.
For example, if your users Kafka topic has a field called details which in turn contains a field called name, it is possible to specify the field details.name so that only that particular field is masked, rather than every field called name.
Note that, for a Kafka topic, there may be both a key and a value, and the policy will apply to each of these if they contain the corresponding field.
In the event of two policies matching a given field, the more specific one will be applied. For example, if there is a policy for name with a redaction of First-4 and a policy for users.details.name with a redaction of Initials, the latter will be applied. Wildcards (see below) and dataset rules do not affect this.
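The "more specific policy wins" rule can be sketched as follows. This is a simplified illustration, not how Lenses resolves policies internally: it assumes that a policy field matches when its dot-separated segments equal the trailing segments of the full field path, and that specificity is measured by the number of segments. The function names are hypothetical.

```python
from typing import Dict, Optional


def matches(policy_field: str, path: str) -> bool:
    # Assumption: a policy field matches if it equals the tail of the path,
    # so "name" matches any field called name at any depth.
    p = policy_field.split(".")
    f = path.split(".")
    return len(p) <= len(f) and f[-len(p):] == p


def pick_redaction(policies: Dict[str, str], path: str) -> Optional[str]:
    candidates = [fld for fld in policies if matches(fld, path)]
    if not candidates:
        return None
    # Assumption: more segments means more specific.
    most_specific = max(candidates, key=lambda fld: len(fld.split(".")))
    return policies[most_specific]


policies = {"name": "First-4", "users.details.name": "Initials"}
print(pick_redaction(policies, "users.details.name"))  # Initials
print(pick_redaction(policies, "orders.name"))         # First-4
```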
Note that masking is only performed on nodes without children. Continuing with the example above, details.name can be masked, but if we attempt to apply a data policy to details, it will have no effect, as it has child properties.
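The leaf-only rule can be illustrated with a short sketch that walks a nested record along a dotted path. It assumes the record has already been decoded into nested Python dictionaries; the mask_leaf helper is hypothetical and is not part of any Lenses client.

```python
from typing import Any, Callable, Dict


def mask_leaf(record: Dict[str, Any], path: str, mask: Callable[[Any], Any]) -> None:
    """Mask the field at a dotted path in place, but only if it is a leaf."""
    *parents, leaf = path.split(".")
    node: Any = record
    for part in parents:
        if not isinstance(node, dict) or part not in node:
            return  # path does not exist in this record
        node = node[part]
    if not isinstance(node, dict) or leaf not in node:
        return
    if isinstance(node[leaf], dict):
        return  # node has children: the policy has no effect
    node[leaf] = mask(node[leaf])


value = {"name": "jd", "details": {"name": "Jane Doe", "age": 30}}

mask_leaf(value, "details.name", lambda v: "*" * len(v))  # leaf: masked
mask_leaf(value, "details", lambda v: "*")                # has children: ignored
print(value)
```

Since a Kafka record may carry the field in both its key and its value, the same call would be made once for each of them.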
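It is also possible to specify wildcards using the * character, so that d*s.name will match details.name as well as any other path whose first part starts with d and ends with s.
Because . is considered to be a field separator, a wildcard will not match against it. So u*s.name will match users.name but will not match a path such as users.details.name, where the wildcard would have to span a separator.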
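One way to picture the wildcard rule is to translate a pattern into a regular expression in which * stands for any run of characters other than the separator. This is a sketch under the assumption that the pattern is compared against the whole field path; the helper names are illustrative only.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    # Escape the literal parts, then let "*" match anything except ".".
    parts = [re.escape(p) for p in pattern.split("*")]
    return re.compile("[^.]*".join(parts))


def field_matches(pattern: str, path: str) -> bool:
    return wildcard_to_regex(pattern).fullmatch(path) is not None


print(field_matches("d*s.name", "details.name"))        # True
print(field_matches("u*s.name", "users.name"))          # True
print(field_matches("u*s.name", "users.details.name"))  # False
```

The last call returns False because the wildcard cannot absorb the . between users and details.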
Once you set up a data policy, you can view all policies and see where sensitive data exists in your data platform.
Provided your account has the relevant access level, you can click on a data policy to edit or remove it.
If you choose to edit a data policy, you can change its configuration.
Policy associated resources
Lenses automatically identifies all data resources whose messages contain any sensitive payload field. This happens across all data formats (JSON, Avro, XML, etc.), whether the field lives at the record level in the key or value, or even in a nested structure.
Flows / Connectors
Lenses identifies all flows and connectors that are consuming or producing such sensitive data so that you can track their usage across multiple data systems.
Flows / SQL Processors
Lenses identifies all streaming SQL processors that produce or consume sensitive data.
Flows / Custom Applications
Lenses identifies all your microservices and applications that are producing or consuming sensitive data.