Monday, February 26, 2007

3597.aspx

Replacing HTML markup in SQL Server

I automatically generate a short summary for each post on my blog. The logic for creating the summary should stay on the front end but I did not have the source code for my blog engine (.text) so I put it in the DB.


The following function removes all HTML tags, leaving only the text I want to display in the summary:


CREATE FUNCTION [dbo].[replaceHTML]


(


       @cont varchar(8000)


)


RETURNS varchar(8000)


AS


BEGIN


     


  declare @out  varchar (8000)


  declare @tag  varchar (8000)


  declare @pos1 int


  declare @pos2 int


 


  -- Find the start/end of first tag


  set @pos1=charindex('<', @cont);


  set @pos2=charindex('>', @cont);


 


  if not(@pos1>0 and @pos2>@pos1)


  begin


    set @out=@cont


  end


  else


  begin


    set @out='';


 


    -- as long as there are tags


    while @pos1>0 and @pos2>@pos1


    begin


      set @tag=substring(@cont,@pos1,@pos2-@pos1+1);


      if @pos1>1


      begin


        set @out=@out+substring(@cont,1,@pos1-1)


      end


 


      set @cont=substring(@cont,@pos2+1, len(@cont) - @pos2);


 


        -- Find the start/end of next tag


      set @pos1=charindex('<', @cont);


      set @pos2=charindex('>', @cont)


    end


    set @out=@out+@cont


  end


  return(@out)   


END


 


It may not be very nice, but it works and it improves the Google search results (which is strange as Google gets a complete list my posts via the sitemap...)

No comments:

Post a Comment