.NET Regex to Find Anchor Tags and Replace Their URL?
May 13, 2010
I'm trying to find all the anchor tags and append a variable to each href value.
For example:
<a href="/page.aspx">link</a> will become <a href="/page.aspx?id=2">link</a>
<A hRef='http://www.google.com'><img src='pic.jpg'></a> will become <A hRef='http://www.google.com?id=2'><img src='pic.jpg'></a>
I'm able to match all the anchor tags and href values using a regex, and then I manually replace the values using string.Replace. However, I don't think this is an efficient way to do it. Is there a solution where I can use something like Regex.Replace(html, newurlvalue)?
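One way to do the match and the replacement in a single pass is Regex.Replace with a MatchEvaluator. The following is a minimal sketch, not a full HTML parser: the class name AnchorRewriter and the method AppendQuery are hypothetical, and it assumes every href value is quoted with either double quotes or apostrophes and contains no URL fragment (#...).

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorRewriter
{
    // "prefix" = everything from <a up to and including the opening quote,
    // "url"    = the href value itself.
    // The lookahead (?=\k<q>) stops at the matching closing quote without consuming it.
    private static readonly Regex HrefPattern = new Regex(
        @"(?<prefix><a\b[^>]*?\bhref\s*=\s*(?<q>[""']))(?<url>.*?)(?=\k<q>)",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical helper: appends "query" (e.g. "id=2") to every anchor href.
    public static string AppendQuery(string html, string query)
    {
        return HrefPattern.Replace(html, m =>
        {
            string url = m.Groups["url"].Value;
            // Use & when the URL already carries a query string.
            string separator = url.Contains("?") ? "&" : "?";
            return m.Groups["prefix"].Value + url + separator + query;
        });
    }
}
```

Calling AnchorRewriter.AppendQuery(html, "id=2") rewrites every matched anchor and leaves the rest of the markup, including the original casing and quote style, untouched.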
I am building a forum and I want to use forum-style tags to let the users format their posts in a limited fashion. Currently I am using Regex to do this, as per this question: How to use C# regular expressions to emulate forum tags. The problem is that the regex does not distinguish between nested tags. Here is a sample of how I implemented this method:
public static string MyExtensionMethod(this string text)
{
    return TransformTags(text);
}

private static string TransformTags(string input)
{
    string regex = @"\[([^=]+)[=\x22']*(\S*?)['\x22]*\](.+?)\[/(\1)\]";
    MatchCollection matches = new Regex(regex).Matches(input);
    for (int i = 0; i < matches.Count; i++)
    {
        var tag = matches[i].Groups[1].Value;
        var optionalValue = matches[i].Groups[2].Value;
        var content = matches[i].Groups[3].Value;
        ...
Now, if I submit something like [quote] This user posted [quote] blah [/quote] [/quote], it does not properly detect the nested quote. Instead it takes the first opening quote tag and pairs it with the first closing quote tag. Do you guys recommend any solutions? Can the regex be modified to grab nested tags? Maybe I shouldn't use regex for this?
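One possible approach to the nesting problem is to convert innermost pairs first and repeat until nothing changes, so each closing tag is always paired with the nearest opener. The sketch below handles only [quote] and assumes the output should be a <blockquote> element; the class name ForumTags, the method TransformQuotes, and the choice of <blockquote> are all hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ForumTags
{
    // Matches a [quote]...[/quote] pair whose body contains no further
    // [quote] opener, i.e. an innermost pair.
    private static readonly Regex InnermostQuote = new Regex(
        @"\[quote\]((?:(?!\[quote\]).)*?)\[/quote\]",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical helper: converts nested quotes from the inside out.
    public static string TransformQuotes(string input)
    {
        string previous;
        string current = input;
        do
        {
            previous = current;
            // Each pass converts the pairs that are currently innermost.
            current = InnermostQuote.Replace(current, "<blockquote>$1</blockquote>");
        } while (current != previous);
        return current;
    }
}
```

Each pass peels off one nesting level, so the loop runs once per level of depth plus one final pass that detects no change. Unbalanced leftovers such as a stray [/quote] are simply left in the text.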
I need to replace <span> entries in a string with legacy HTML code because the string is going to be used in a report for Crystal Reports. <b> works with Crystal, but <span> elements do not.
Here's the string which I'm trying to replace: <span style="font-weight: bold">%THIS CAN BE ANY TEXT%</span>. I want to replace it to
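For the simple, non-nested case, a single Regex.Replace with a capture group may be enough. This is a sketch under the assumption that the target markup is <b>...</b> (the question says <b> works with Crystal) and that the style attribute contains only the bold declaration; the names SpanToBold and ToLegacyBold are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpanToBold
{
    // Matches <span style="font-weight: bold">...</span> and captures the inner text.
    private static readonly Regex BoldSpan = new Regex(
        @"<span\s+style\s*=\s*""font-weight:\s*bold;?\s*""\s*>(.*?)</span>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical helper: rewrites bold spans as <b> elements, keeping the text.
    public static string ToLegacyBold(string html)
    {
        return BoldSpan.Replace(html, "<b>$1</b>");
    }
}
```

The $1 in the replacement string re-inserts whatever text sat between the opening and closing span, so the content itself can be anything.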
I have a long alphabetical list to be displayed by a gridview.
I need to have links at the top of the page that will link to anchors in the gridview. How can I get anchor tags in the gridview so that the links will jump to them when clicked?
I need a regex or any other solution to replace an id in the middle of a URL (not in the query string). URL example: http://localhost:1876/category/6?sortBy=asc&orderBy=Popular
Replace category/6 with category/anotherID. The routing used is:
routes.MapRoute(
    "categories",
    "category/{categoryID}/{categoryName}",
    new { controller = "Search", action = "SearchResults", categoryID = "", categoryName = "" }
);
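A lookbehind keeps the /category/ prefix in place and replaces only the segment after it. The sketch below assumes the id is always the path segment directly following /category/; the names CategoryUrl and ReplaceCategoryId are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CategoryUrl
{
    // The segment right after "/category/", up to the next '/', '?' or '#'.
    private static readonly Regex CategoryId = new Regex(
        @"(?<=/category/)[^/?#]+", RegexOptions.IgnoreCase);

    // Hypothetical helper: swaps the category id and leaves the rest of the URL alone.
    public static string ReplaceCategoryId(string url, string newId)
    {
        // A MatchEvaluator is used so that a '$' in newId is not treated as a
        // substitution token; the count of 1 limits the change to the first match.
        return CategoryId.Replace(url, m => newId, 1);
    }
}
```

With the example URL, ReplaceCategoryId(url, "42") changes category/6 to category/42 and keeps the query string as it was.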
with a regex that would replace the following. The only thing that would remain the same is the div tags, the id's and classes could change and so could the content.
<div id="nav" class="whatever">Content is whatever</div>
The code works, but I need to include some exceptions to the replace. For example, I must not replace anything inside img, li, and a tags (including link text and attributes like href and title), but replacements should still be allowed in p, td, and div tags.
How do I solve the problem below? I'm creating a simple content management system, where there is a HTML template with specific markup that denotes where content should be:
Separate from this, there is content in a database field that looks a little like this:
<!-- #BeginEditable "Body1" -->This is Test Text<!-- #EndEditable --><!-- #BeginEditable "Extra" -->This is more test text<!-- #EndEditable -->
As you can guess I need to merge the two, that is, replacing
<!-- #Editable "Body1" -->
with: This is Test Text. I've begun the code here. But I'm having problems using the Regex Replace function that should be located at the very bottom of that For/Each.
// Html template
string html = "<html><head></head><body><!-- #Editable \"Body1\" --><p>etc etc</p><!-- #Editable \"Extra\" --></body></html>";

// Regions that need to be put in the Html template
string regions = "<!-- #BeginEditable \"Body1\" -->This is Test Text<!-- #EndEditable --><!-- #BeginEditable \"Extra\" -->This is more test text<!-- #EndEditable -->";

// Create a Regex to only extract what's between the 'Body' tag
Regex oRegex = new Regex("<body.*?>(.*?)</body>", RegexOptions.Multiline);

// Get only the 'Body' of the html template
string body = oRegex.Match(html).Groups[1].Value.ToString();

// Regex to find sections inside the 'Body' that need replacing with what's in the string 'regions'
Regex oRegex1 = new Regex("<!-- #Editable \"(.*?)\"[^>]*>", RegexOptions.Multiline);
MatchCollection matches = oRegex1.Matches(body);

// Locate section titles i.e. Body1, Extra
foreach (Match match in matches)
{
    string title = oRegex1.Match(match.ToString()).Groups[1].ToString();
    Regex oRegex2 = new Regex("<!-- #BeginEditable \"" + title + "\"[^>]*>(.*?)<!-- #EndEditable [^>]*>", RegexOptions.Multiline);

    // Replace the 'Body' sections with what's in the 'regions' string,
    // cross-referencing the titles i.e. Body1, Extra
}
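One way to do the merge is to read the regions into a dictionary first and then let a MatchEvaluator look up each placeholder by title. This is a sketch rather than a finished solution: the names TemplateMerger and Merge are hypothetical, and it assumes titles are unique and that a placeholder with no matching region should be left unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TemplateMerger
{
    // <!-- #BeginEditable "Title" -->content<!-- #EndEditable -->
    private static readonly Regex Region = new Regex(
        @"<!--\s*#BeginEditable\s+""(?<title>[^""]+)""\s*-->(?<content>.*?)<!--\s*#EndEditable\s*-->",
        RegexOptions.Singleline);

    // <!-- #Editable "Title" -->
    private static readonly Regex Placeholder = new Regex(
        @"<!--\s*#Editable\s+""(?<title>[^""]+)""\s*-->");

    // Hypothetical helper: fills each placeholder with the region of the same title.
    public static string Merge(string template, string regions)
    {
        var contentByTitle = new Dictionary<string, string>();
        foreach (Match m in Region.Matches(regions))
        {
            contentByTitle[m.Groups["title"].Value] = m.Groups["content"].Value;
        }

        return Placeholder.Replace(template, m =>
        {
            string content;
            // Assumption: unknown titles keep their placeholder comment.
            return contentByTitle.TryGetValue(m.Groups["title"].Value, out content)
                ? content
                : m.Value;
        });
    }
}
```

Note that RegexOptions.Multiline only changes how ^ and $ behave. To let . match across line breaks inside a region, RegexOptions.Singleline is the option that applies, which is why the sketch uses it.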
so that the resulting output does not contain words at the centre. In the above code, instead of giving the word "vocation" exclusively, I have to specify some pattern so that it replaces all the words instead of doing so for the first sentence only. How do I modify my code?
I have a textbox where I accept multiple email ids separated by commas. I then split the value in my code-behind. If an email id is invalid, I change its background using Regex.Replace, like this:
I want to change the way my blogs are displayed on my website. I currently use a separate table in SQL to hold them and do a loop and replace.
All I really need is a short code in the blog text that can be translated into real HTML. I need to know the image name and the CSS class.
I was wondering whether it is possible to have something like this in a blog stored in the DB:
<img L 1234.jpg> and use a regex to match it and change it to <img src="1234.jpg" class="imgleft">
I know it looks like I could just store the full markup instead of using a regex, but I have a method that derives the image path from the image name with padding. In this case the image path would be 000/000/001/234/1234.jpg.
I would have more than one occurrence in the original string, so I would need either to replace them all at once or to use the regex to loop through until they have all been matched.
Is something like this possible, or do I need a different approach?
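Regex.Replace already processes every occurrence in one call, and a MatchEvaluator lets the replacement be computed per match. The sketch below makes several assumptions: the shorthand is <img L name> or <img R name>, L maps to the class imgleft and R to a hypothetical imgright, and the existing path-padding method is passed in as the resolvePath delegate. The names BlogImages and Expand are hypothetical too.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BlogImages
{
    // <img L 1234.jpg> : an alignment letter followed by the file name.
    private static readonly Regex Shorthand = new Regex(
        @"<img\s+(?<align>[LR])\s+(?<file>[^\s>]+)\s*>",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: expands every shorthand tag in one pass.
    // resolvePath stands in for the existing method that pads the image name
    // into a path such as 000/000/001/234/1234.jpg.
    public static string Expand(string blogHtml, Func<string, string> resolvePath)
    {
        return Shorthand.Replace(blogHtml, m =>
        {
            // Assumption: L means imgleft, anything else (R) means imgright.
            string cssClass = m.Groups["align"].Value.ToUpperInvariant() == "L"
                ? "imgleft"
                : "imgright";
            string src = resolvePath(m.Groups["file"].Value);
            return "<img src=\"" + src + "\" class=\"" + cssClass + "\">";
        });
    }
}
```

An ordinary <img src="..."> tag is not touched, because the pattern requires a single L or R followed by whitespace right after <img.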
I'm trying to replace some code generated by the AJAX Control Toolkit HTMLEditor from the XHTML standard to legacy code; <span style=*> to <b>, <u>, <i>, etc. This needs to be done because Crystal Reports doesn't understand the <span style=*> and needs the legacy items.
This is the code being generated by the HTMLEditor:
[Code]....
Is Regex.Replace the best way to replace these items? I need to keep the text between the opening and closing tags and preserve its formatting, so a plain ReplacementText.Replace will not work. I've tried a number of different things to get Regex.Replace working properly, but I keep running into different issues. Why it's not using the closing span for the bold statement but is using the closing span for the italics is beyond me.
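A likely cause of the mismatched closing tags is nesting: a lazy (.*?)</span> stops at the first </span>, which belongs to the inner span. One way around it is to convert innermost spans first and repeat. This is only a sketch, since the generated markup is not shown: it assumes each span carries a single double-quoted style attribute, maps bold, italic, and underline to <b>, <i>, and <u>, and drops the wrapper of any other span. The names SpanToLegacy, TagFor, and Convert are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpanToLegacy
{
    // An innermost <span style="...">...</span>: the body contains no other <span.
    private static readonly Regex InnermostSpan = new Regex(
        @"<span\s+style\s*=\s*""(?<style>[^""]*)""\s*>(?<body>(?:(?!<span\b).)*?)</span>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Assumed mapping from inline style to legacy tag; null means "no equivalent".
    private static string TagFor(string style)
    {
        string s = style.ToLowerInvariant();
        if (s.Contains("font-weight") && s.Contains("bold")) return "b";
        if (s.Contains("font-style") && s.Contains("italic")) return "i";
        if (s.Contains("text-decoration") && s.Contains("underline")) return "u";
        return null;
    }

    // Hypothetical helper: rewrites spans from the inside out.
    public static string Convert(string html)
    {
        string previous;
        string current = html;
        do
        {
            previous = current;
            current = InnermostSpan.Replace(current, m =>
            {
                string tag = TagFor(m.Groups["style"].Value);
                string body = m.Groups["body"].Value;
                // Assumption: spans with an unrecognized style are unwrapped.
                return tag == null ? body : "<" + tag + ">" + body + "</" + tag + ">";
            });
        } while (current != previous);
        return current;
    }
}
```

Because every matched span is either converted or unwrapped, each pass removes at least one span, and the loop ends when none are left to match.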
I would like to conditionally remove a block of text between specified start and stop delimiters. The code below does not work, but hopefully it suggests enough of what I am trying to accomplish.
I have a long string containing the ,<p> </p> and <br>. I want to clean all these tags and spaces out of my string. How can this be done with the String.Replace() method? I am replacing each one separately right now, and it works, but is there a way to do it all at once, without the Replace() method?
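A single alternation pattern can handle all of these in one Regex.Replace call. The sketch below assumes the unwanted items are <p>, </p>, <br> (with or without a slash), the &nbsp; entity, and whitespace, and that each run of them should collapse to one space so that neighbouring words do not get glued together. The names HtmlCleaner and Clean are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlCleaner
{
    // One or more of: <p ...>, </p>, <br>, <br/>, &nbsp;, or a whitespace character.
    private static readonly Regex Junk = new Regex(
        @"(?:</?p\b[^>]*>|<br\s*/?>|&nbsp;|\s)+",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: collapses each run of junk into a single space.
    public static string Clean(string input)
    {
        return Junk.Replace(input, " ").Trim();
    }
}
```

If the goal is to remove the spaces entirely rather than collapse them, the replacement string can be string.Empty instead of " ".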
<div id="mydiv">This is a "div" with quotation marks</div>
I want to use regular expressions to return the following:
<div id='mydiv'>This is a "div" with quotation marks</div>
Notice how the id attribute in the div is now surrounded by apostrophes?
How can I do this with a regular expression?
Edit: I'm not looking for a magic bullet to handle every edge case in every situation. We should all be wary of using regex to parse HTML but, in this particular case and for my particular need, regex IS the solution.
Edit #2: Jens Ameskamp helped me find a solution, but anyone randomly coming to this page should think long and very hard about using this solution. In my case it works because I am very confident of the type of strings that I'll be dealing with. I know the dangers and the risks; make sure you do too. If you're not sure whether you know, then it probably indicates that you don't know and shouldn't use this method.
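For readers who want to see the shape such a replacement can take, here is a minimal sketch. It is not necessarily the solution referred to above. It assumes only the id attribute needs changing and that its value contains no quotes or angle brackets; the names AttributeQuotes and ToApostrophes are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AttributeQuotes
{
    // Group 1: the tag text up to and including id=
    // Group 2: the attribute value between the double quotes.
    // [^<>] keeps the match inside a single tag, so quoted text in the
    // element content is never touched.
    private static readonly Regex IdAttribute = new Regex(
        @"(<[^<>]*?\bid\s*=\s*)""([^""'<>]*)""",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: swaps the double quotes around id values for apostrophes.
    public static string ToApostrophes(string html)
    {
        return IdAttribute.Replace(html, "$1'$2'");
    }
}
```

The quotation marks around "div" in the element text survive because they sit after the closing > of the tag, outside anything the pattern can reach.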