C# - Scraping Content From Webpage?
		
			Sep 6, 2010
				I need to scrape a remote HTML page looking for images and links. I need to find the image that is "most likely" the product image on the page, and the links that are "near" that image. I currently do this with a JavaScript bookmarklet, so I am able to get the rendered x/y coordinates of images and links to help me determine whether they are the ones I want. What I want is the ability to get this information using just a URL, without the bookmarklet. The issue is that if I use the URL with something like HttpWebRequest and fetch the HTML on the server, I will not have location values, since the page was never rendered in a browser. I need the locations of images and links to help me determine which ones I want. So how can I get the HTML from a remote site on the server AND use the rendered locations?
	
	View 3 Replies
  
    
	Similar Messages:
	
    	
    	
        Apr 29, 2010
        I'm having difficulty scraping a dynamically generated table on an ASPX page. I'm trying to scrape the gas prices from a site like this: GasPrices. I can extract all the information in the gas price table (address, time submitted, etc.) except for the actual gas price.
Is there a way I could scrape the gas prices, i.e. somehow get a text representation of them? I'm not very familiar with ASP/ASPX, but whatever is being generated is not showing up in the final HTML. I'm using Python to do the scraping, but that's irrelevant unless there's a specific library...
	View 1 Replies
   
  
    
	
    	
    	
        Apr 8, 2010
        Every night I have a program that runs and creates a natural gas report. In this report is something called "expert analysis." This is just text, a commentary on the current natural gas market, and because it's a volatile market there is new commentary every day. So every day an employee in our office runs a public webpage which displays this text, then copies and pastes it into a webpage that is part of our system, and it gets saved to a file which gets pulled into our nightly report.
The page in our system is an ASP page and it uses FileUp to create the file. We want to break our dependency on FileUp (it's a SoftArtisans product, an excellent vendor, but we are trying to save some money). Right now we don't have FileUp because our trial expired, so every day I run the public webpage, copy the text and paste it into my expert.txt file, all manually. Another problem is that the expert analysis is updated after our office staff has all gone home.
Here is my question: can I write a program in .NET that runs a page, parses the resulting HTML and extracts the expert analysis, all automatically? I won't need FileUp anymore because I can use the .NET upload control, and I won't have to log on every night to manually copy and paste the expert analysis. Is that what web/screen scraping is? What happens when the public page's format changes because they do maintenance on their site? Is it just like if I were to run [URL] and try to read the Moderator column of the ASP.NET forum? I'd be okay for a while (would I?), but then if the layout changed I wouldn't find the Moderator anymore?
	View 14 Replies
   
  
    
	
    	
    	
        Apr 24, 2010
        I developed an application form which includes some textboxes for input. When the user clicks the button, the following tasks have to be done:
1) If the page is valid, all data should be stored in the database.
2) A new webform should appear in the same window, and some of the content of the application form should be displayed in it.
3) Clicking the browser's back button should not post back to the previous page.
I did the first task, but I don't know the code for the remaining tasks. Here is some information.
.aspx button control code:
[code]....
I opened the new webform using Response.Redirect("submit.aspx"), where submit.aspx is the form to be opened after the data is stored on the button click in the application form.
	View 9 Replies
   
  
    
	
    	
    	
        Feb 10, 2011
        I'm new to C#. I need to write a script that gets the HTML content of a webpage. Where can I find examples of how to do this? I have searched here but can't find any.
	View 5 Replies
   
  
    
	
    	
    	
        May 12, 2010
        How can I display a document (xls, doc, pdf) directly on a web page? Please send any related links as well.
	View 5 Replies
   
  
    
	
    	
    	
        Oct 14, 2010
        I have a data grid on my web page along with many other controls, and that data grid fetches some data while the page is loading.
The page won't load until the data grid finishes fetching.
I want to show the page to the user even though the data grid hasn't finished loading, and show the grid once it has loaded. I've seen this on a few sites, with a preloading bar.
	View 3 Replies
   
  
    
	
    	
    	
        Dec 9, 2010
        When I make an AJAX call to an ASP.NET page, I have a mechanism to return some text based on QueryString parameters, such as:
Response.Write("<text>");
But in the response I get a lot of extra information about view state. This does not happen in classic ASP or PHP. Also, if I ask for the whole page, it is returned with the page directive:
<%@ Page Language="C#" AutoEventWireup="true"  CodeFile="Default.aspx.cs" Inherits="_Default" %>
How can I avoid this extra information and return only the required text?
Currently I am using a PHP page to return things for the same purpose, and it works totally fine.
	View 4 Replies
   
  
    
	
    	
    	
        Jan 7, 2011
        On my web page I have to prevent content selection when the user drags the mouse across the page; on a normal page, dragging the mouse selects the content.
I found JavaScript for this and applied it, and it runs well in IE. I also found a CSS property for Mozilla that prevents content selection. Both work well, but my page also has textboxes, and because of the JS and CSS the user can't select text in the textboxes either. I want to allow selecting content in the textboxes.
window.onload = function () {
document.onselectstart = function () { return false; }; // IE
};
CSS
body
{
-moz-user-select: none; /* Mozilla */
}
	View 2 Replies
   
  
    
	
    	
    	
        May 12, 2010
        I want code so that when a user uploads a document, all of its content is automatically generated and displayed on the web page. The code should use C#, JavaScript and jQuery.
	View 1 Replies
   
  
    
	
    	
    	
        Jan 13, 2010
        I need to divide my page content into two parts.
On the left side I have a tree control, and based on the selected node I have to refresh the details on the right side inside a div. Scrolling the tree should not disturb the right-side elements, and scrolling the right-side details should not scroll the page or the tree control.
How can I achieve this?
	View 2 Replies
   
  
    
	
    	
    	
        Sep 30, 2010
        I'm trying to display content in 3 columns on a page. I've done a 2 column layout as follows using a foreach which just puts a hyperlink from the database into a div:
foreach (WBC.BusinessObjects.Tasc content in ci)
	View 3 Replies
   
  
    
	
    	
    	
        Apr 15, 2010
        I developed a simple XML program in ASP.NET. I want to display the XML content in a specific area of the web page, and I am trying to use this code:
string strPath = Server.MapPath(@"App_Data\main_page.xml");
XmlTextReader textReader = new XmlTextReader(strPath);
textReader.Read();
// Write out the value of every remaining node
while (textReader.Read())
{
// Move to the first element
textReader.MoveToElement();
Response.Write(textReader.Value.ToString());
}
This code prints the XML output at the top of the page. How can I avoid this and display the output in a specific area of the page?
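Response.Write writes straight into the response stream, ahead of the markup that the page's controls render later, which is why the text lands at the top. One common workaround is to collect the text into a string and assign it to a control placed where the output should appear. The following is only a sketch of that idea: `XmlTextCollector`, `CollectText` and the `litOutput` control mentioned in the comment are hypothetical names, not part of the original code.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class XmlTextCollector
{
    // Reads every text node from the XML and returns the values joined by newlines.
    public static string CollectText(TextReader source)
    {
        var sb = new StringBuilder();
        using (XmlReader reader = XmlReader.Create(source))
        {
            while (reader.Read())
            {
                // Element nodes have an empty Value; only text nodes are collected.
                if (reader.NodeType == XmlNodeType.Text)
                {
                    if (sb.Length > 0) sb.Append('\n');
                    sb.Append(reader.Value);
                }
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string xml = "<page><title>Hello</title><body>World</body></page>";
        string text = CollectText(new StringReader(xml));
        // In a Web Forms page the string would be assigned to a control instead, e.g.
        // litOutput.Text = Server.HtmlEncode(text);  (litOutput is a hypothetical Literal control)
        Console.WriteLine(text);
    }
}
```

In the page itself, the reader would be opened over the file returned by Server.MapPath rather than over an in-memory string.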
	View 5 Replies
   
  
    
	
    	
    	
        Jan 11, 2010
        We have a table of messages. Each message can have attachments.
We need to show the text of each attachment inside the HTML table of messages on the web page. For the first stage we need to support at least Word and Excel attachments (in future they will possibly want PDF). In the table we show from, subject and body as HTML cells of the message tr. The body cell should contain first the rendered content of the attached documents (with all images and styles), and then the real message body. They don't want any download links or anything similar :(
I only know about the possibility of using MS Word's Save As logic (but I don't have a lot of details). In that case I would need MS Word and Excel installed on the server and, based on the type of attachment, use one of the components.
	View 4 Replies
   
  
    
	
    	
    	
        Jan 20, 2011
        This web application must scan a given web page and save the result if some data has changed.
It should search for keywords and check whether their values have been modified.
I will create this application with ASP.NET MVC.
What should I use to scan a web page? If I enter the URL of the page to scan into my page, what should happen? Are there robots that watch a page for content changes?
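One simple way to detect a change between two scans is to store a fingerprint (hash) of the content of interest and compare it with the fingerprint of the freshly fetched content. The sketch below assumes the page text has already been downloaded; `ChangeDetector`, `Fingerprint` and `HasChanged` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ChangeDetector
{
    // Returns a hex-encoded SHA-256 fingerprint of the given content.
    public static string Fingerprint(string content)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(content));
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }

    // True when the newly fetched content no longer matches the stored fingerprint.
    public static bool HasChanged(string previousFingerprint, string newContent)
    {
        return !string.Equals(previousFingerprint, Fingerprint(newContent), StringComparison.Ordinal);
    }

    public static void Main()
    {
        string stored = Fingerprint("price: 10.50");
        Console.WriteLine(HasChanged(stored, "price: 10.50")); // False
        Console.WriteLine(HasChanged(stored, "price: 10.75")); // True
    }
}
```

Hashing the whole page flags every change, including rotating ads or timestamps, so it is usually better to fingerprint only the values extracted for the keywords being watched.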
	View 1 Replies
   
  
    
	
    	
    	
        Jul 13, 2010
        I'm trying to parse an Excel file whose structure is very complex. The possible ways I know of are:
Use the Office interop libraries
Use the OLEDB provider and read the Excel file into a DataSet.
But the issue is the file's complexity: some columns, cells or rows are blank, etc.
What are the best possible ways to do this?
	View 5 Replies
   
  
    
	
    	
    	
        Nov 19, 2011
        We have a site that was scraping another site to gather all available models of a product. The 3rd-party site recently changed so that it now uses AJAX: users select the manufacturer, and once they have selected it, a dropdown of products is loaded using AJAX.
I was using HttpWebRequest for all requests (see below).
Code:
Public Function fnRequest(ByVal sPOSTData As String, Optional ByVal bAutoRedirect As Boolean = False) As String
        Dim uriSite As Uri
        Dim sReturn As String
        Dim srReader As StreamReader
        Dim sTemp As String
[Code] ....
Now in Fiddler the post appears to be done with AJAX. I tried to send the post data the normal way, but the site didn't like that. Does anyone have an example of how to do this? To get an idea, go to [URL] ... and see Mount finder and select Projector.
	View 5 Replies
   
  
    
	
    	
    	
        Mar 18, 2010
        I've created a web application in ASP.NET in which I try to get some data (site scraping) from a secure page of a web site. I've used the HttpWebRequest class for this, but I haven't been able to access the secure page yet. Every time, the login page is scraped instead of the secure page. I have the site's user ID and password, but I don't know which language the site was developed in.
	View 1 Replies
   
  
    
	
    	
    	
        Jul 6, 2010
        I'm trying to scrape data with PHP/curl from a .NET site (one of those with __VIEWSTATE and __EVENTVALIDATION). I monitor headers and post vars using Tamper Data, so I'm pretty sure I haven't missed anything. My approach is to mimic the postback that happens when the user clicks one of the links, and then parse the response. But the response I'm getting is a page redirect to "Unable to validate data".
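One frequently reported cause of this error (not necessarily the one here) is that the hidden field values are altered on the way back: __VIEWSTATE and __EVENTVALIDATION are Base64 strings that can contain `+`, `/` and `=`, so they have to be percent-encoded in the POST body. The thread uses PHP/curl; the sketch below shows the same idea in C#. `PostbackHelper` and its methods are hypothetical names, and the regular expression assumes the attribute order that Web Forms typically emits (type, name, id, value).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PostbackHelper
{
    // Matches <input type="hidden" name="..." ... value="..."> in that attribute order.
    // A real HTML parser is more robust than a regular expression.
    private static readonly Regex HiddenField = new Regex(
        "<input[^>]*type=\"hidden\"[^>]*name=\"(?<name>[^\"]+)\"[^>]*value=\"(?<value>[^\"]*)\"",
        RegexOptions.IgnoreCase);

    // Collects the hidden form fields from the page fetched with the initial GET request.
    public static Dictionary<string, string> ExtractHiddenFields(string html)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in HiddenField.Matches(html))
        {
            fields[m.Groups["name"].Value] = m.Groups["value"].Value;
        }
        return fields;
    }

    // Builds a form-encoded body, percent-encoding every name and value.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var parts = new List<string>();
        foreach (KeyValuePair<string, string> pair in fields)
        {
            parts.Add(Uri.EscapeDataString(pair.Key) + "=" + Uri.EscapeDataString(pair.Value));
        }
        return string.Join("&", parts);
    }

    public static void Main()
    {
        // Made-up sample markup; real view state values are much longer.
        string html = "<input type=\"hidden\" name=\"__VIEWSTATE\" id=\"__VIEWSTATE\" value=\"/wEPDw+abc=\" />";
        Console.WriteLine(BuildFormBody(ExtractHiddenFields(html)));
    }
}
```

The hidden values also have to come from the same session and the same page instance that is being posted back, so they should be re-read on every request rather than reused from an earlier capture.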
	View 1 Replies
   
  
    
	
    	
    	
        Jan 7, 2010
        I'm trying to write a small application to collect(Scrape) one piece of data from a web site. I would like to be able to simply run the app and it will open the page, find the one piece of data and display it. So far so good...my problem is that the web site is a secure site, meaning I have to provide a user name and password. I've searched all over the web, found many discussions but have yet to find anything that provides specifics on how to accomplish this. I understand a little bit about tokens etc, but I'm really looking for a detailed description of how to do this. Please feel free to direct me to a different forum if I'm in the wrong place.
	View 3 Replies
   
  
    
	
    	
    	
        Jan 18, 2010
        I am building a site that needs to scrape information from a partner site. My scraping code works great with other sites, but not with this one. It is a regular .html site. My thought is that it might be generated somehow with PHP (the site is built with PHP).
If it matters, here is the code I use. The htmlDocument is from HtmlAgilityPack, but that has nothing to do with it. The result is null on the site I'm trying.
[Code]....  
This is from the W3 validator; might it have something to do with this? The site checked is this:
[URL]
I am unable to validate this document because on line 422 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). check both the content of the file and the character encoding indication.
The error was: utf8 "\xA9" does not map to Unicode
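A lone byte 0xA9 is not valid UTF-8, but in ISO-8859-1 (and Windows-1252) it is the copyright sign, which suggests the page is actually served in a single-byte encoding. Whether that explains the null result cannot be told from the excerpt, but decoding the raw bytes with an explicit encoding before parsing is a cheap thing to try. A minimal sketch, assuming the page is ISO-8859-1; `EncodingDemo` and `Decode` are hypothetical names.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Decodes raw response bytes with an explicitly chosen encoding instead of assuming UTF-8.
    public static string Decode(byte[] rawBytes, string encodingName)
    {
        return Encoding.GetEncoding(encodingName).GetString(rawBytes);
    }

    public static void Main()
    {
        // The copyright sign as the single byte 0xA9, followed by " 2010".
        byte[] raw = { 0xA9, 0x20, 0x32, 0x30, 0x31, 0x30 };

        string asUtf8 = Decode(raw, "utf-8");        // the invalid byte becomes U+FFFD
        string asLatin1 = Decode(raw, "iso-8859-1"); // 0xA9 becomes U+00A9

        Console.WriteLine(asUtf8[0] == '\uFFFD');   // True
        Console.WriteLine(asLatin1[0] == '\u00A9'); // True
    }
}
```

In the scraper, the bytes could come from something like WebClient.DownloadData, and the decoded string could then be handed to the HTML parser.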
	View 4 Replies
   
  
    
	
    	
    	
	
    	
    	
        Aug 31, 2010
        I need to screen scrape a web page and change its style to match the look and feel of the site in which it will be displayed. Is this possible? I'll be using ASP.NET to do the screen scraping.
	View 1 Replies
   
  
    
	
    	
    	
        Feb 24, 2010
        I am working on a project that uses a data scraping technique to retrieve some URL links. I encounter a problem when I take the URL of a [previous page button] link from the HTML code and pass it to HttpWebRequest: the HttpWebResponse that I get back is different from the actual content. I have been trying to solve this problem for days with no result. Has anyone encountered a similar problem and managed to solve it? Below is my sample code: [previous page button] [URL] Note: I have changed the domain name to a dummy address, which is localhost.
[Code]....
	View 1 Replies
   
  
    
	
    	
    	
        Jul 29, 2010
        I seem to be having some challenges with the data I am retrieving from a webpage using the WebClient class. The code works fine; however, I observe that the regular expression is not picking up the negative or positive sign in the Daily_Movement data. For example, a daily movement can be -0.31 or +0.31, but the code is not picking up the sign in front of the decimal values. Here is my code:
[Code]....
I think the problem lies in this part of the code: Regex r1 = new Regex("<span class=\"quoteData\">.*</span>"); It picks up the values between the tags quite well, but not the signs in front of them. [Code]....
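Because the rest of the code is elided, the actual cause cannot be confirmed from the excerpt; the sign may be dropped in a later step, or the page may use a character other than the ASCII hyphen for the minus sign. One way to rule out the pattern itself is to capture the optional sign explicitly in a named group. This is a sketch only: `QuoteParser`, `ExtractSignedValue` and the sample markup are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class QuoteParser
{
    // Captures an optional leading + or - together with the decimal number inside the span.
    // Explicit character classes stop the match from running on into later spans, unlike ".*".
    private static readonly Regex QuoteData = new Regex(
        "<span class=\"quoteData\">\\s*(?<value>[+-]?\\d+(?:\\.\\d+)?)\\s*</span>");

    // Returns the first signed value found, or null when the markup does not match.
    public static decimal? ExtractSignedValue(string html)
    {
        Match m = QuoteData.Match(html);
        if (!m.Success) return null;
        // NumberStyles.Float permits the leading sign and the decimal point.
        return decimal.Parse(m.Groups["value"].Value, NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine(ExtractSignedValue("<span class=\"quoteData\">-0.31</span>"));
        Console.WriteLine(ExtractSignedValue("<span class=\"quoteData\">+0.31</span>"));
    }
}
```

If this returns null for the real page, inspecting the raw HTML around the value should show whether the sign sits outside the span or is encoded as an entity.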
	View 6 Replies