Remove Invalid XML Characters with an SSIS Visual Basic Script

By kenglish

My boss is forcing us to use Microsoft SQL Service Intergration Services for our ETL process. I Googled around for a bit and could not find a good example of how to do this simple task: open an XML file, read the text, replace any invalid characters and write it back out to the same file. My VB is very rusty but this works pretty well. The ReadVariable portion is specific to SSIS but the rest should be generic. Hopefully, the next poor who needs to do this will be able to find this blog entry!

 
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.Text.RegularExpressions
Imports System.Object
 
 
Public Class ScriptMain
 
    Public Sub Main()
        Dts.TaskResult = Dts.Results.Success
        Dim strPath, strXML As String
        strPath = CStr(ReadVariable("User::strFileName"))
        strXML = FileIO.FileSystem.ReadAllText(strPath)
 
        Dim rgx As Regex = New Regex("[\x00-\x08\x0B-\x0C\x0E-\x1F]", RegexOptions.None)
        rgx.Replace(strXML, " ")
        FileIO.FileSystem.WriteAllText(strPath, strXML, False)
    End Sub
 
    Private Function ReadVariable(ByVal varName As String) As Object
        Dim result As Object
 
        Try
            Dim vars As Variables
            Dts.VariableDispenser.LockForRead(varName)
            Dts.VariableDispenser.GetVariables(vars)
            Try
                result = vars(varName).Value
            Catch ex As Exception
                Throw ex
            Finally
                vars.Unlock()
            End Try
        Catch ex As Exception
            Throw ex
        End Try
 
        Return result
    End Function
 
End Class


categoriaProgramming commentoNo Comments dataNovember 24th, 2009

About... kenglish

This author published 76 posts in this site.

Share

FacebookTwitterEmailWindows LiveTechnoratiDeliciousDiggStumbleponMyspaceLikedin

Leave a comment