Web Forms :: Build A Strong Web Crawler?

Oct 6, 2010

I really do not want to reinvent the wheel, but as you know, sometimes we get unique must-have requirements that prevent us from reusing open source code. I want a proper, reliable and consistent web crawler. Basically, I want this crawler (as a web app, NOT a desktop app, and of course based on ASP.NET and C#) to grab the pages of any website, store them locally (including resources like images, CSS, etc.), and adjust any resource hyperlinks to point to the locally downloaded resources.

I saw HTTrack (http://www.httrack.com/), and it seems excellent, but the problem is that I want this crawler to be part of a system that includes other features and processes, so I really can't have it as an external tool.

Main challenges:

1) The user should be able to specify the level at which to crawl. For example, the user might specify a sub-site and want to crawl everything underneath it but nothing above it. So there should be both full crawling of the entire site and partial crawling.

2) URLs and how to deal with them. I ran into some odd URLs where it was hard to identify the actual page because there is no file name. How should that be handled? For example, http://www.blue1.com/en/uk/Travel-info/At-the-airport/Security-control/ is a URL from a website built on EPiServer (.NET based), but as the URL shows, there is no actual .aspx page. How should such URLs be dealt with?

I have already started developing a POC using the HttpWebRequest class, but frankly I am totally dissatisfied with it. It is inconsistent, and the generated static content is missing a lot of images and styles. Besides, the threads sometimes behave strangely.

I would greatly and sincerely appreciate any input (approaches, source code, ideas, links, etc.).

P.S. I have already seen http://www.codeproject.com/KB/IP/Crawler.aspx and http://www.codeproject.com/KB/aspnet/ZetaWebSpider.aspx.
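The two challenges above (partial crawling and URLs without a file name) could be approached roughly as in the sketch below. It is only an illustration, not a complete crawler: the class name CrawlerUrls, both method names, and the convention of saving directory-style URLs as index.html are assumptions made for this example. Query strings and characters that are illegal in file names are deliberately not handled here.

```csharp
using System;
using System.IO;

public static class CrawlerUrls
{
    // Partial crawling: true when candidate is on the same host as root
    // and its path is at or below root's path. Passing the site root
    // ("http://host/") as root gives a full crawl.
    public static bool IsInScope(Uri root, Uri candidate)
    {
        if (!string.Equals(root.Host, candidate.Host, StringComparison.OrdinalIgnoreCase))
            return false;

        // Compare both paths as directories so "/en/uk" does not match "/en/ukraine".
        string rootPath = AsDirectory(root.AbsolutePath);
        string candidatePath = AsDirectory(candidate.AbsolutePath);
        return candidatePath.StartsWith(rootPath, StringComparison.OrdinalIgnoreCase);
    }

    // Maps a URL to a local file path. A URL that ends in "/" or whose last
    // segment has no extension is stored as ".../index.html" (assumed convention).
    public static string ToLocalPath(Uri url, string outputRoot)
    {
        string path = url.AbsolutePath;

        if (path.EndsWith("/", StringComparison.Ordinal))
            path += "index.html";
        else if (string.IsNullOrEmpty(Path.GetExtension(path)))
            path += "/index.html";

        string relative = path.TrimStart('/').Replace('/', Path.DirectorySeparatorChar);
        return Path.Combine(outputRoot, url.Host, relative);
    }

    private static string AsDirectory(string path)
    {
        return path.EndsWith("/", StringComparison.Ordinal) ? path : path + "/";
    }
}
```

For instance, calling CrawlerUrls.ToLocalPath with the Security-control URL from the question and an output folder named "mirror" yields a path under mirror/www.blue1.com/ that ends in Security-control/index.html. Because the URL alone does not say what kind of content comes back, checking the Content-Type header of the response is usually a more reliable way to decide whether something is an HTML page, an image, or a stylesheet.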

View 1 Replies


Similar Messages:

Writing Crawler For Screen Scraping

Mar 26, 2010

I want to write a crawler for screen scraping. What I want is to get the price of a particular hotel from a website. At the URL above there is a list of hotels and their prices. I want to get the price of the Beaufort.

View 3 Replies

Web Crawler To Generate Output Cache?

Dec 6, 2010

I implemented: <%@ OutputCache Duration="43200" VaryByParam="none" location="Server" VaryByCustom="RawURL" %>

I have a sitemap.xml with all the possible URLs (about 12,000) in my site.

I would like to know whether it is sensible to create an application that parses my sitemap in order to request every URL.

The goal is to cache my whole website to improve speed.

To clarify my question: each page takes about 10 seconds to generate, and the cache duration is 12 hours.
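One thing worth checking before building such a tool is the arithmetic: 12,000 pages at about 10 seconds each is roughly 120,000 seconds, or about 33 hours, if the pages are requested one after another. That is longer than the 12-hour cache duration, so a purely sequential warm-up would never have the whole site cached at once. A minimal sketch of the sitemap-driven approach is shown below, assuming a standard sitemaps.org sitemap; the class name CacheWarmer and its method names are made up for this example, and the sequential loop would need to be parallelised (or the page generation sped up) to be useful at these numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Xml.Linq;

public static class CacheWarmer
{
    // Namespace used by standard sitemaps.org sitemap files.
    private static readonly XNamespace SitemapNs =
        "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Returns the content of every <loc> element in the sitemap XML.
    public static List<string> ExtractUrls(string sitemapXml)
    {
        return XDocument.Parse(sitemapXml)
            .Descendants(SitemapNs + "loc")
            .Select(loc => loc.Value.Trim())
            .Where(url => url.Length > 0)
            .ToList();
    }

    // Requests each URL once so the server renders and caches the page.
    // Failures are logged and skipped so one bad URL does not stop the run.
    public static void Warm(IEnumerable<string> urls)
    {
        using (var client = new WebClient())
        {
            foreach (string url in urls)
            {
                try
                {
                    client.DownloadString(url);
                }
                catch (WebException ex)
                {
                    Console.WriteLine("Failed: " + url + " (" + ex.Message + ")");
                }
            }
        }
    }
}
```

Usage would be along the lines of CacheWarmer.Warm(CacheWarmer.ExtractUrls(File.ReadAllText("sitemap.xml"))), run from a scheduled task rather than from inside a page request.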

View 1 Replies

C# - Crawler Webresponse Operation Timed Out

May 18, 2010

I have built a simple thread-pool-based web crawler within my web application. Its job is to crawl its own application space and build a Lucene index of every valid web page and its meta content. Here's the problem. When I run the crawler from a debug server instance of Visual Studio Express and provide the IIS URL as the starting point, it works fine. However, when I do not provide the IIS instance and it uses its own URL to start the crawl process (i.e. crawling its own domain space), I get an "operation timed out" exception on the WebResponse statement. Could someone please tell me what I should or should not be doing here? Here is my code for fetching the page. It is executed in a multithreaded environment.

private static string GetWebText(string url)
{
string htmlText = "";

[code]...

View 1 Replies

Web Forms :: Build Survey System Where Build A Form With Questions And Some Answers?

May 25, 2010

I want to build a survey system where you can build a form with questions and some answers to these questions and then members who will log in will be able to take the test.

Then I want to present the different results from the test in a diagram or something like that.

View 5 Replies

Web Forms :: Assembly Generation Failed - Doesn't Have Strong Name

May 21, 2010

Can anyone tell me how to solve this issue? Assembly generation failed -- Referenced assembly 'Microsoft.Web.UI.WebControls' does not have a strong name

View 1 Replies

.net - About Strong Name Verification Skipping?

Mar 7, 2011

My ASP.NET application is using an assembly without a strong name. When I run it in IE, it shows an error saying: "Could not load file or assembly 'xxxxx' or one of its dependencies. Strong name signature could not be verified. The assembly may have been tampered with, or it was delay signed but not fully signed with the correct private key. (Exception from HRESULT: 0x80131045)". I used sn.exe -Vr xxxx to register that assembly to skip strong name verification, but it still shows that error. What could be causing this problem, and what can I do next to fix it?

View 1 Replies

RegEx For Strong Password?

Mar 6, 2010

I have the following password requirements:

1) Should be 6-15 characters in length
2) Should have at least one lowercase character
3) Should have at least one uppercase character
4) Should have at least one number
5) Should have at least one special character
6) Should not have spaces

Can anyone suggest a RegEx for these requirements?
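The six requirements above map fairly directly onto a regular expression with one lookahead per "at least one" rule. The sketch below is one possible pattern, not the only one; the class name PasswordRules is made up for the example, and it assumes that "special character" means anything that is not an ASCII letter, an ASCII digit, or whitespace.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PasswordRules
{
    // (?=.*[a-z])            at least one lowercase letter
    // (?=.*[A-Z])            at least one uppercase letter
    // (?=.*[0-9])            at least one digit
    // (?=.*[^a-zA-Z0-9\s])   at least one special character (assumed definition)
    // \S{6,15}               6 to 15 characters, none of them whitespace
    // \z is used instead of $ so a trailing newline is not silently accepted.
    private static readonly Regex Strong = new Regex(
        @"^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[^a-zA-Z0-9\s])\S{6,15}\z");

    public static bool IsStrong(string password)
    {
        return password != null && Strong.IsMatch(password);
    }
}
```

With this pattern, PasswordRules.IsStrong("Abc12!") is accepted, while "abc12!" (no uppercase), "Abc 12!" (contains a space) and "Ab1!" (too short) are rejected.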

View 4 Replies

RouteDebug From NuGet Does Not Have A Strong Name?

Mar 25, 2011

Today I had a problem with some routing in my ASP.NET MVC 3 application (with Visual Studio 2010), so I thought I would install the ASP.NET RouteDebugger and fix my route problem. After getting the package through NuGet, my project doesn't build anymore: referenced assembly 'RouteDebug' does not have a strong name. I could download the source of the RouteDebugger and build (and strongly sign) it myself, but that's not the purpose of NuGet, is it? ;)

View 2 Replies

C# - Prevent GridView Saving Data From Build To Build In Visual Studio?

Apr 1, 2011

I have a question about a situation that occurs with a GridView and an ObjectDataSource in an ASP.NET application. The GridView is linked to the ObjectDataSource, and both are inside an UpdatePanel so that the GridView is filled asynchronously from a form on the same page and gains more rows as the user enters data:

<asp:ScriptManager ID="ScriptManager1" runat="server">
</asp:ScriptManager>
<asp:UpdatePanel ID="UpdatePanel1" runat="server" UpdateMode="Conditional">
<ContentTemplate>
<asp:GridView ID="GridView1" runat="server" AutoGenerateColumns="False"
DataSourceID="ObjectDataSource1">
<Columns>
<asp:BoundField DataField="Name" HeaderText="Name" ReadOnly="True"
SortExpression="Name" />
<asp:BoundField DataField="Periodicty" HeaderText="Periodicty" ReadOnly="True"
SortExpression="Periodicty" />
</Columns>
</asp:GridView>
<asp:ObjectDataSource ID="ObjectDataSource1" runat="server"
SelectMethod="GetSessionNames" TypeName="Simulation"></asp:ObjectDataSource>
<asp:Label ID="Label27" runat="server" Text="Label"></asp:Label>
</ContentTemplate>
<Triggers>
<asp:AsyncPostBackTrigger ControlID="NewWebSessionButton" EventName="Click" />
</Triggers>
</asp:UpdatePanel>

I start the project with Visual Studio 2008, fill in the form, and it works correctly. Then I stop the execution and run it again, and the data I entered in the previous run is in the GridView. It is as if some sort of cache saved the data from the previous session. I checked that the EnableCaching property is set to false for the ObjectDataSource. If I Rebuild Web Site in Visual Studio (not just Build), then it works correctly and the GridView is empty. Is this caused just by Visual Studio? Can it be turned off? And will it happen on the final IIS server it will run on?

View 1 Replies

.net - ICustomTypeDescriptor For Simulating Strong-typing?

Jan 31, 2010

I thought about simulating strong typing for the key-value configuration of a new project by providing fake property info via an ICustomTypeDescriptor implementation. The configuration instance should expose all default config keys as properties with default values. However, I noticed that VS08 IntelliSense doesn't include the "faked" properties, which are created in an example similar to [URL]

View 1 Replies

How To Create A Strong Named Assembly

May 6, 2010

I have a web site project with an n-layered architecture. I am using the Microsoft Enterprise Library validation DLL. As of now this DLL is not strongly named, and I need to make that assembly strongly named. How can I do this? I saw some articles that describe how to create a strong-named assembly by opening the VS 2008 command prompt, typing sn -k publickey.snk, and then adding the assembly attribute to AssemblyInfo.cs. I tried to do that, but my website project doesn't have an AssemblyInfo.cs file.

View 1 Replies

WCF / ASMX :: Give A Strong Name To Proxy Dll?

May 28, 2010

I have an asmx file that was created using notepad. Then I created the proxy class using wsdl.exe. Now I have a dll that I want to put it in GAC. GAC needs the DLL to have a strong name. How can I create a strong name for the web service?

View 2 Replies

Installation :: Strong Naming An Assembly Of A Website?

Jan 24, 2010

Strong Named Their Assembly. I've read online and in the help files for 6 hours straight and I am nowhere closer to getting this. It seems that every 6 months I run up against one of these types of things with VB.NET and the .NET Framework. I read online and find dozens of people who get the exact same error. Most of the threads are never resolved, and the ones that seem to resolve the issue do it in a way that doesn't work for everyone else. It is really absurd.

My web site runs fine in debug mode on my computer (local host). It loads in FireFox and runs fine. When I post it to my web site I get the error below. I try to "Strong Name The Assembly" with the command line command "aspnet_compiler -v default.aspx X:NetProjectsHumMPI -keyfile X:NetProjectsHumMPIkeypair.snk -aptca"

Default.aspx is what fails. This should be the virtual folder of my web app. Obviously "Default.aspx" is not right. I have tried 42 variations on what I think the virtual path to my web app might be. Every single time the compiler fails telling me that it is not a valid path.

I'm moving into week two of trying to get a simple "Hello world" web app to load on my web site. It runs perfectly on my development machine but generates constant errors on the web site. Each time I fix a problem that only happens on the web site, another one crops up with even the slightest change, or sometimes even with NO change to the code.

View 5 Replies

MVC :: Dropdownlist Template With Strong Typed Views?

Apr 12, 2010

I'm trying to make a template for a dropdownlist.

In my Model I have:

[Code]....

The PageTemplate is a class, but I want my view to render a DropDownList that can set the key.

I have a shared/EditorTemplates/String.ascx, which is rendered as that template.

But my /shared/DropDownList.ascx does not render at all. Why?

// dennis

In a sense I'm trying to recreate this article: [URL]

View 11 Replies

MVC :: Model Metadata And Strong Types Views?

Jan 14, 2010

In my project I use strongly typed views, because I find them nicely structured.

public abstract class AbstractViewData : ViewPage { ICollection Foo; }
public class HtmlComponentViewData : AbstractViewData { string Foo2; }

I have the same structure in my code as in the corresponding pages, and this is where the problems start. I would like to use Html.Display(o => o.Foo) (Foo could be a customer, for that matter), but my strongly typed views do not pass model metadata into my classes the way return View("FooView", customer); would. Is there a way to write some code that solves this problem for me?

View 10 Replies

MVC :: How To Pass Strong Typed List From View To Controller

Jun 15, 2010

[Code]....

for List<AnswerInfo>,

[Code]....

Now, what I want to do is place a few textboxes on a view that allow users to input the answer text. After the submit button is clicked, I want the List<AnswerInfo> containing the text to be passed to the controller. Can anyone tell me how to do it?

View 4 Replies

MVC :: Custom Strong Type HTML Helper For Checkboxes?

Oct 1, 2010

MVC contains a strongly typed HTML helper (Html.CheckBoxFor()) that takes a bool and returns a bool. To make the URL smaller, I am thinking of changing the bool values on the model object to byte or something like that, so the URL only contains &Parameter=1.

I have found this snippet :

[Code]....

But I have no clue how to turn this into a strongly typed HTML helper for a checkbox that takes a byte instead of a bool. It's also important that the model object is set with a byte instead of a bool.

View 3 Replies

Create A Strong Name Key (snk) For Multiple Referenced DLLs In N-tier Application?

Mar 18, 2011

How do I create an snk for all existing DLLs (multiple) in an n-tier ASP.NET application? I have created an ASP.NET application using n tiers. My web layer contains references to all the layers (data, facade, core, common), but when I try to create a strong name key for the web layer, it throws an error: "referenced assembly can not have strong name".

View 4 Replies

What Is Difference Between Build Solution And Build Website

Mar 11, 2010

It may be obvious to everyone, but I am still learning this: what is the difference between Build Solution and Build Website?

View 2 Replies

Build Tool Which Can Build A Web App Into Multiple Dlls?

Sep 22, 2010

I have a large solution which has multiple apps which all share some common site elements (masterpages, navigation, etc).

Currently, all of these get built into a single DLL

If my structure looks like:

WebRoot
- Common/
- Shared/
- Images/
- App1/
- App2/
- etc

Is there a build tool which will allow me to build WebRoot.dll, App1.dll, App2.dll? I don't believe this is possible in VS2008 or the MSBuild tool.

View 2 Replies

How To Specify A Direct Build Name Directory Using TFS Build Of .net Web Application

May 11, 2010

I have a TFS build set up to deploy an ASP.NET project to a test server. The build works great and deploys to the test server fine, but instead of putting it into the Website directory that my IIS web server is configured for, it puts the build into Website_20100511.6

Why is the date suffixed to the directory name? Is there a way to turn that off so I can publish directly to the Website?

View 1 Replies

Programmatically Change Table Names In .net Strong Typed Dataset?

Oct 29, 2010

I've developed an application using a strongly typed dataset with .NET Framework 3.5. Is there a way to change the source table for a TableAdapter programmatically?

View 1 Replies

Security :: Want To Enforce Strong Passwords And Do Not Want To Use The Secret Question And Answer Features?

Sep 17, 2010

I have a website running on IIS 5.1 with ASP.NET 2.0. Where in the Windows registry can I change the requirements for some of the security features? For example, I do not want to enforce strong passwords and I do not want to use the secret question and answer features.

View 4 Replies

Need To Build Web Application That Will Do:1 - Build Web Pages?

Jan 15, 2011

I need to build a web application that will:

1 - build web pages
1a - build a template for a page
2 - add a module (by module I mean a Contact Us form, Search, Billing System...). Each module can be constructed from submodules or divided into submodules
2a - build a module (add a form, textbox, button...)

All data entered by the user should be saved in the DB. Can you advise me on a DB structure that can hold all of this? I looked at some CMS databases, but they are NOT this. Please don't ask why I am bothering with this. I just need to build it.

View 2 Replies
