To improve the security of Conversational AI Cloud in general, we've added Global Masking Rules that are automatically activated for each new project. These rules are the following:
- IBAN (#iban#): checks for an IBAN
- Visa/MasterCard/American Express credit card (#creditcardnr#): checks for a credit card number
- Email Address (#email#): email address checker
- Euro currency (#currency#): euro and amount checker
- Date with dot/space/hyphen/slash separator (#date#): date checker
- SIM card number (#simnr#): 12-digit check (also checks for word boundaries)
- PIN and PUK code number (#pinpuknr#): 4-digit check (also checks for word boundaries)
- BSN (#bsn#): 9-digit check (doesn't check for word boundaries)
- International Passport Number (#passportnr#): checks for a passport number
- Long number (#longnumber#): matches any number sequence longer than 6 digits
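To illustrate what "checks for word boundaries" means for the digit-based rules above, here is a minimal Python sketch. The regular expressions are assumptions written for this example; they are not the expressions Conversational AI Cloud actually uses, and the names `SIM_NR`, `PIN_PUK_NR`, `BSN`, `LONG_NUMBER` and `matches` are hypothetical.

```python
import re

# Hypothetical patterns for illustration only; the platform's real
# expressions are not shown in this article.
SIM_NR = re.compile(r"\b\d{12}\b")     # exactly 12 digits, with word boundaries
PIN_PUK_NR = re.compile(r"\b\d{4}\b")  # exactly 4 digits, with word boundaries
BSN = re.compile(r"\d{9}")             # 9 digits, no word-boundary check
LONG_NUMBER = re.compile(r"\d{7,}")    # any digit sequence longer than 6 digits


def matches(pattern, text):
    """Return every substring of text that the given pattern matches."""
    return [m.group(0) for m in pattern.finditer(text)]


# A word-boundary check stops a rule from firing inside a longer token.
print(matches(PIN_PUK_NR, "my pin is 1234"))  # the 4 digits stand alone
print(matches(PIN_PUK_NR, "order 123456"))    # 4 digits inside 6: no match
# Without a boundary check, 9 digits inside a longer token still match.
print(matches(BSN, "ref1234567890"))
```

Under these assumed patterns, a rule without a word-boundary check is broader: it also fires on digits embedded in longer numbers or identifiers.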
Masking of Sensitive User Input
When an end user enters a credit card number in a question, that input would normally be logged, shown in the Interaction Logs, and possibly displayed in one of the Dashboards. This is a problem for many institutions, especially financial ones, because this type of data needs special attention: it is considered sensitive and constitutes so-called 'Personally Identifiable Information' (PII). Content Editors do not need this data to improve the Knowledge Base, so it is better if they don't see it. It is also easier for Conversational AI Cloud as a platform not to have to handle this data in accordance with the wide variety of national and international data protection regulations.
The common solution for these cases is to mask the input. With a masking rule in place for credit card numbers, for instance, a user input that contains a credit card number will show something like '#ccnr#' in the Interaction Logs. Masking is a replacement: if Masking Rules are in place and one of them is triggered, we replace the sensitive information with a mask and do not store the complete original user input.
This is not encryption, which could be decrypted to recover the original user input. Because we don't store the original input, decrypting or otherwise retrieving it is not possible. This can be a disadvantage, but it ensures that sensitive information entered by users is not exposed to anyone, which is the safest solution in the end.
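The replace-and-discard behaviour described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the platform's implementation: the `RULES` list, its patterns, and the `mask_input` and `log_interaction` helpers are hypothetical, and only the mask names are taken from the list of Global Masking Rules.

```python
import re

# Hypothetical rule set of (mask, pattern) pairs. The patterns are
# assumptions for this example, not the platform's real expressions.
RULES = [
    ("#longnumber#", re.compile(r"\d{7,}")),   # more than 6 digits
    ("#pinpuknr#", re.compile(r"\b\d{4}\b")),  # exactly 4 digits
]


def mask_input(text):
    """Replace every match of every rule with that rule's mask."""
    for mask, pattern in RULES:
        text = pattern.sub(mask, text)
    return text


def log_interaction(log, user_input):
    """Store only the masked text; the original input is never kept."""
    log.append(mask_input(user_input))


log = []
log_interaction(log, "My PIN is 1234 and my card is 4111111111111111")
print(log[0])  # My PIN is #pinpuknr# and my card is #longnumber#
```

Because only the masked string is appended to the log, there is nothing to decrypt afterwards: the original digits no longer exist anywhere in the stored data.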
Security Notice
- If you choose to disable masking, the collected data for that slot (or slots) will end up in the Interaction Logs and Dashboards. This can be useful when you want to see the exact user input in order to optimize the flow or the webhook code. Be aware, though, that you are potentially collecting personal data of your end users, and that data can't be masked in hindsight: Interaction Logs are immutable. Consider carefully who has access to the Interaction Logs and the Dashboards.
- If you're using custom logic to repeat the user's input value in the output response, this information could show up in certain Dashboards, even if masking is enabled for the slot.