How to remove non-printable / ASCII characters oracle - REGEXP

Removing non-printable / ASCII characters  oracle


🔹 1. Your Current Approach (Valid but Heavy)

REGEXP_REPLACE(column, '[^[:print:]]', '')

✔ Removes all non-printable characters
❌ Expensive for large datasets (regex on every column)


🔹 2. Optimized Approach (Better Performance)

If your main issue is control characters like newline, tab, carriage return, you can replace them explicitly:

REPLACE(REPLACE(REPLACE(column, CHR(10), ''), CHR(13), ''), CHR(9), '')

👉 Faster than REGEXP

Example:

REPLACE(REPLACE(REPLACE(PHONE_NO, CHR(10), ''), CHR(13), ''), CHR(9), '') PHONE_NO

🔹 3. Create a Reusable Function (Best Practice)

Instead of repeating everywhere, create a function:

CREATE OR REPLACE FUNCTION CLEAN_TEXT(p_text VARCHAR2)
RETURN VARCHAR2 IS
BEGIN
  RETURN REGEXP_REPLACE(p_text, '[^[:print:]]', '');
END;
/

Then your query becomes:

SELECT
    CLEAN_TEXT(TO_CHAR(SLNO)) SLNO,
    CLEAN_TEXT(PHONE_NO) PHONE_NO,
    CLEAN_TEXT(SERVICE_TYPE) SERVICE_TYPE,
    CLEAN_TEXT(SERVICE_SUB_TYPE) SERVICE_SUB_TYPE,
    CLEAN_TEXT(SERVICE_OPER_STATUS) SERVICE_OPER_STATUS,
    CLEAN_TEXT(CUST_ACCNT_NO) CUST_ACCNT_NO,
    CLEAN_TEXT(BILL_ACCNT_NO) BILL_ACCNT_NO,
    CLEAN_TEXT(BILL_ACCNT_TYPE) BILL_ACCNT_TYPE,
    CLEAN_TEXT(BILL_ACCNT_SUB_TYPE) BILL_ACCNT_SUB_TYPE,
    CLEAN_TEXT(MOBILE_NO) MOBILE_NO,
    CLEAN_TEXT(EMAIL_ID) EMAIL_ID,
    CLEAN_TEXT(OLT_IP) OLT_IP,
    CLEAN_TEXT(CUSTOMER_NAME) CUSTOMER_NAME,
    CLEAN_TEXT(ADDR_1) ADDR_1,
    CLEAN_TEXT(ADDR_2) ADDR_2,
    CLEAN_TEXT(DISTRICT) DISTRICT,
    CLEAN_TEXT(PIN_CODE) PIN_CODE,
    CLEAN_TEXT(SSA) SSA,
    CLEAN_TEXT(CIRCLE) CIRCLE,
    CLEAN_TEXT(ZONE) ZONE,
    CLEAN_TEXT(BBM_USERNAME) BBM_USERNAME,
    CLEAN_TEXT(BBM_NAME) BBM_NAME,
    CLEAN_TEXT(BBM_MOBILE_NO) BBM_MOBILE_NO,
    CLEAN_TEXT(TYPE_OF_ORGANISTAION) TYPE_OF_ORGANISTAION,
    CLEAN_TEXT(MINISTRY) MINISTRY,
    CLEAN_TEXT(DEPARTMENT) DEPARTMENT,
    CLEAN_TEXT(ORGANISATION) ORGANISATION,
    CLEAN_TEXT(PROJECT) PROJECT,
    CLEAN_TEXT(REMARKS) REMARKS,
    CLEAN_TEXT(TO_CHAR(UPDATED_ON)) UPDATED_ON,
    CLEAN_TEXT(UPDATED_BY) UPDATED_BY
FROM INFRA_ADMIN.GOVT_CONNECTIONS;

✔ Cleaner
✔ Maintainable
✔ Easier to modify logic later


🔹 4. If Data Fix is One-Time (Recommended)

Instead of cleaning every query, fix data permanently:

UPDATE INFRA_ADMIN.GOVT_CONNECTIONS
SET PHONE_NO = REGEXP_REPLACE(PHONE_NO, '[^[:print:]]', '');

👉 Do this for all required columns once


You want to update only when non-printable characters exist, to avoid unnecessary updates

✔ Correct Conditional UPDATE

UPDATE INFRA_ADMIN.GOVT_CONNECTIONS
SET PHONE_NO = REGEXP_REPLACE(PHONE_NO, '[^[:print:]]', '')
WHERE REGEXP_LIKE(PHONE_NO, '[^[:print:]]');

🔍 Explanation

  • REGEXP_REPLACE(...) → removes non-printable characters

  • REGEXP_LIKE(... '[^[:print:]]') → ensures update happens only if such characters exist


🔹 If column can have NULLs (safe version)

UPDATE INFRA_ADMIN.GOVT_CONNECTIONS
SET PHONE_NO = REGEXP_REPLACE(PHONE_NO, '[^[:print:]]', '')
WHERE PHONE_NO IS NOT NULL
  AND REGEXP_LIKE(PHONE_NO, '[^[:print:]]');

🔹 To check before update (recommended)

SELECT PHONE_NO
FROM INFRA_ADMIN.GOVT_CONNECTIONS
WHERE REGEXP_LIKE(PHONE_NO, '[^[:print:]]');

🔹 For multiple columns (pattern)

UPDATE INFRA_ADMIN.GOVT_CONNECTIONS
SET PHONE_NO = REGEXP_REPLACE(PHONE_NO, '[^[:print:]]', ''),
    EMAIL_ID = REGEXP_REPLACE(EMAIL_ID, '[^[:print:]]', '')
WHERE REGEXP_LIKE(PHONE_NO, '[^[:print:]]')
   OR REGEXP_LIKE(EMAIL_ID, '[^[:print:]]');

⚡ Performance Tip (Important for large tables)

Regex is costly. If mostly dealing with:

  • newline → CHR(10)

  • carriage return → CHR(13)

  • tab → CHR(9)

Use:

UPDATE INFRA_ADMIN.GOVT_CONNECTIONS
SET PHONE_NO = REPLACE(REPLACE(REPLACE(PHONE_NO, CHR(10), ''), CHR(13), ''), CHR(9), '')
WHERE PHONE_NO LIKE '%' || CHR(10) || '%'
   OR PHONE_NO LIKE '%' || CHR(13) || '%'
   OR PHONE_NO LIKE '%' || CHR(9) || '%';


Post a Comment

And that's all there is to it!

If anyone has any other questions or requests for future How To posts, you can either ask them in the comments or email me. Please don't feel shy at all!

I'm certainly not an expert, but I'll try my hardest to explain what I do know and research what I don't know.

Previous Post Next Post