Remove Invalid XML Characters with an SSIS Visual Basic Script
By kenglish
My boss is forcing us to use Microsoft SQL Service Intergration Services for our ETL process. I Googled around for a bit and could not find a good example of how to do this simple task: open an XML file, read the text, replace any invalid characters and write it back out to the same file. My VB is very rusty but this works pretty well. The ReadVariable portion is specific to SSIS but the rest should be generic. Hopefully, the next poor who needs to do this will be able to find this blog entry!
Imports System Imports System.Data Imports System.Math Imports Microsoft.SqlServer.Dts.Runtime Imports System.Text.RegularExpressions Imports System.Object Public Class ScriptMain Public Sub Main() Dts.TaskResult = Dts.Results.Success Dim strPath, strXML As String strPath = CStr(ReadVariable("User::strFileName")) strXML = FileIO.FileSystem.ReadAllText(strPath) Dim rgx As Regex = New Regex("[\x00-\x08\x0B-\x0C\x0E-\x1F]", RegexOptions.None) rgx.Replace(strXML, " ") FileIO.FileSystem.WriteAllText(strPath, strXML, False) End Sub Private Function ReadVariable(ByVal varName As String) As Object Dim result As Object Try Dim vars As Variables Dts.VariableDispenser.LockForRead(varName) Dts.VariableDispenser.GetVariables(vars) Try result = vars(varName).Value Catch ex As Exception Throw ex Finally vars.Unlock() End Try Catch ex As Exception Throw ex End Try Return result End Function End Class



November 24th, 2009