C# - Running Long Process: Indexing 5GB Docs With Lucene?
May 14, 2010
Situation: I have an ASP.NET application that will search through docs using Lucene. I want to run the initial indexing (the index will be incremental after the initial run, so there won't be a need to index the whole directory again in future). Currently, I have about 5GB of docs (45,000 files).
Problem: My application times out before completing the process. I have altered the timeout like this: HttpContext.Current.Server.ScriptTimeout = 200000; but it still does not complete the process.
I know that similar questions have been asked all over the place, but I'm having trouble finding one that relates directly to what I'm after.
I have a website where a user uploads a data file, then that file is transformed and imported into SQL. The file could be up to 50 MB in size, and sometimes this process can take 30 minutes or even longer.
I realise I need to palm off the actual work to another process, and poll that process on the web page. I'm wondering what the best approach would be though? Being a web developer by trade, I'm finding all this new Windows Service stuff a bit confusing, and I just wanted somewhere to start.
So:
Can I do / should I be doing this with a Windows service? If so, how?
Should I use WCF? If this runs under IIS, will I have problems with aspnet_wp.exe recycling and timing out my process?
clarifications
The data is imported into SQL; there's no file distribution taking place.
If there is a failure, it absolutely MUST be reported to the user. The web page will poll every, let's say, 5 seconds, from the time the async task begins, to get the 'status' of the import. Once it's finished, another response will tell the page to stop polling for status updates.
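One way the status contract described above could be sketched in C#: the worker records its state under a job id, and the page's polling call reads the latest snapshot and stops once the state is no longer Running. All of the names here (ImportState, ImportStatus, ImportStatusRegistry) are hypothetical, and the registry is held in memory purely for illustration; this is a sketch under those assumptions, not a definitive implementation.

```csharp
using System;
using System.Collections.Concurrent;

public enum ImportState { Running, Completed, Failed }

// Immutable snapshot handed back to the polling page.
public sealed class ImportStatus
{
    public ImportState State { get; }
    public string Message { get; }

    public ImportStatus(ImportState state, string message)
    {
        State = state;
        Message = message;
    }

    // The page keeps polling only while the import is still running.
    public bool ShouldStopPolling => State != ImportState.Running;
}

// Hypothetical in-memory registry. Assumption: a real deployment would likely
// persist this (for example in SQL) so the status survives a process restart.
public static class ImportStatusRegistry
{
    private static readonly ConcurrentDictionary<Guid, ImportStatus> Jobs =
        new ConcurrentDictionary<Guid, ImportStatus>();

    // Called when the import is handed off; the id goes back to the page.
    public static Guid Start()
    {
        Guid id = Guid.NewGuid();
        Jobs[id] = new ImportStatus(ImportState.Running, "Import started");
        return id;
    }

    // Called by the worker to publish progress text.
    public static void Report(Guid id, string message) =>
        Jobs[id] = new ImportStatus(ImportState.Running, message);

    public static void Complete(Guid id) =>
        Jobs[id] = new ImportStatus(ImportState.Completed, "Import finished");

    // Failures are stored so the next poll can show them to the user.
    public static void Fail(Guid id, Exception error) =>
        Jobs[id] = new ImportStatus(ImportState.Failed, error.Message);

    // Returns null when the id is unknown.
    public static ImportStatus Get(Guid id) =>
        Jobs.TryGetValue(id, out ImportStatus status) ? status : null;
}
```

The 5-second poll would call Get with the id returned by Start and stop as soon as ShouldStopPolling is true, showing Message in either case so that a failure is always reported.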
queries on final decision
OK, so as I thought, it seems that a Windows service is the best idea. As to HOW to get it to work, it seems the 'put the file there and wait for the service to pick it up' idea is the generally accepted way. Is there a way I can start a process run by the service, without it having to constantly check a database table / folder? As I said earlier, I don't have any experience with Windows services. I wondered: if I put a public method in the service, can I call it somehow?
In an ASP.NET web form, I keep getting a connection reset error message. The page is doing some long-running processing (about 2-5 minutes).
I have no problem when the web request comes from the same machine as the web server. But when the request originates across the network, I get a connection reset error about 1:30 or 2 minutes into waiting for a response.
I have set the [...] in web.config for this application and put the application in its own application pool.
What else can I try?
Edit
The purpose of this page is to accept input from the user, calculate something, and send the result back to them. The long running calculation isn't something I can offload until a later time.
Note: I don't want to use UpdateProgress or similar AJAX controls.
On button click, a long task (e.g. a thread) runs in my web page for about 4-5 minutes. I want to show the status to the user, either with a processing image through JavaScript (the image must be shown in a certain part of the page while the rest of the page remains intact) or with the exact status of the process if possible. I have tried a lot, but all in vain.
I've been having a difficult time with this. I have an ASP.NET page. The user hits the "Run" button and I have code IN AN ASSEMBLY, not in the APP_CODE folder, that is called and runs a long process that moves product info from a file into the database. While the user waits, I would like them to see status updates, like which product the import process is on, and status info. I'm assuming I'd break off into another thread and use Ajax, but I have no idea how to do this.
Wondering if there is a performance difference between letting a long-running process hang in ASP.NET vs running the process via a Windows service. I have done this once before, and the Windows service was much quicker and didn't bog down my system, whereas the ASP.NET request seemed to wreak havoc.
I've developed a web application to accept video file uploads and then pass them to a backend service on an external server. The application runs without error on the Visual Studio debugging web server, but once on a production IIS 6 or 7 server, it yields a timeout error at a consistent point in time while handling a large upload. Specifically, it errors in the middle of transferring the video file to the external server, once the application has successfully received it from the client. I'm aware of several timeouts to be configured related to the problem, and have done so. The application's web.config has been tested with one or both of the following settings:
<system.web>
  <httpRuntime executionTimeout="9999999" maxRequestLength="2048000" />
</system.web>

and

<configuration>
  <location path="default.aspx"> (the page at issue that's timing out)
    <system.web>
      <httpRuntime executionTimeout="9999999" maxRequestLength="2048000" />
    </system.web>
  </location>
</configuration>
And within the initialization of the webrequest made to the external server to send the video received from the client browser:
So with the execution time limits on both the webform as a whole and the connection made to the external server, I'm at a loss for what timeout is left unconfigured, or how to determine such, when I continue to get the following error: Unexpected error executing Brightcove Upload:...........................
I am executing a long-running Oracle stored procedure from .NET. The procedure takes about three hours to run. Ideally, the user should be able to kick off the procedure, close the browser, and come back later to check the results.
The problem is that the connection to the Oracle procedure is lost after exactly an hour. As you would expect, the Oracle procedure runs to completion if it is executed from SQL*Plus. Strangely enough, it will also run to completion if I run in debug mode on my local machine (I start two threads, one of which executes the procedure. I set a breakpoint on the second thread).
Here is my connection string:
data source= (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=serverx)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=TestSID)))
I have an asp.net page that calls a dll that will start a long process that updates product information. As the process is running, I want to provide the user with constant updates on which product that process is on and the status of the products. I've been having trouble getting this to work. I add information to a log text file and I was thinking that I'd redirect to a new page and have that page use Javascript to read from the text file every few seconds. My question is has anyone else tried this and does it work fairly well?
I want to display the status of a long process. When the user clicks a button, it calls a function that returns the percentage of work done. I want to display that percentage somewhere on the page. How can I do it?
When my application faces a long-running process, e.g. fetching a query (SELECT a, b, c FROM d): this query needs 10 seconds to complete in MSSQL Management Studio, but while the ASP.NET application is fetching it, the application refuses to return any response to any other requests made on that server.
I am hosting my application on a VPS with good specifications, and I give the (SELECT a, b, c FROM d) example just to illustrate the issue. It could be any process, such as processing a movie or fetching data through an external API that is experiencing a slow-down.
I've asked this before, but I was hoping for another answer and perhaps some code samples, because I've been having a difficult time with this. I have an ASP.NET page. The user hits the "Run" button and I have code IN AN ASSEMBLY, not in the APP_CODE folder, that is called and runs a long process that moves product info from a file into the database. While the user waits, I would like them to see status updates, like which product the import process is on, and status info. I'm assuming I'd break off into another thread and use Ajax, but I have no idea how to do this.
On my current project I have to accept 2 Excel spreadsheets that will be uploaded by clients, process them and create a download package based on this information. This process includes extracting data from the sheets, updating our databases and building several PDF files for the download package. This takes between 15 seconds and 2 minutes to complete, depending on the complexity of the request. Naturally, I want to show some kind of processing indicator rather than just leaving the user hanging while the page loads.
Here's the problem: How to show this processing indicator.
I have to do a full postback to upload the files so this eliminates several nice AJAX indicator methods (sponsoring users rejected the AJAX toolkit async file upload, saying it was confusing to them). If I process on any of the page events during the postback, the page doesn't load until the lengthy process is completed so the browser/site looks 'hung'.
Basically, I need some ideas on how to display a 'building your download' graphic while the lengthy process is working that will also work with a full postback.
When testing this locally (on my dev machine), I see the application is working a lot slower.
Is there a better way to write this code? Should I use ThreadPool.QueueUserWorkItem or create a new thread using Thread t = new Thread(new ThreadStart(DoWork)); ? Would it be better to create a totally separate application for the purpose of sending the newsletters? Will that help if I run this application on the same machine?
I've seen other posts here talking about ThreadPool vs Thread, but it seems no one is sure which is better.
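For reference, the two options named above look like this side by side. DoWork is taken from the question; the NewsletterWork class, the RunBoth method and the completion events are hypothetical scaffolding added so the sketch can run on its own. One relevant fact: ASP.NET serves requests from the thread pool, so long jobs queued there compete with request handling, whereas a dedicated thread does not draw from the pool. Neither option survives an application pool recycle.

```csharp
using System;
using System.Threading;

public static class NewsletterWork
{
    // Stand-in for the long-running job; the real body would send newsletters.
    public static void DoWork(ManualResetEventSlim done)
    {
        // ... long-running work goes here ...
        done.Set(); // signal completion so the caller can observe it
    }

    // Starts the job both ways and reports whether both finished in time.
    public static bool RunBoth()
    {
        using (var pooledDone = new ManualResetEventSlim(false))
        using (var dedicatedDone = new ManualResetEventSlim(false))
        {
            // Option 1: borrow a thread-pool thread. Cheap to start, but the
            // thread is unavailable to the pool until DoWork returns.
            ThreadPool.QueueUserWorkItem(_ => DoWork(pooledDone));

            // Option 2: a dedicated thread. Costs more to create, but leaves
            // the pool alone, which suits work that runs for minutes.
            var t = new Thread(() => DoWork(dedicatedDone));
            t.IsBackground = true; // do not keep the process alive on shutdown
            t.Start();

            TimeSpan timeout = TimeSpan.FromSeconds(5);
            return pooledDone.Wait(timeout) && dedicatedDone.Wait(timeout);
        }
    }
}
```

A common rule of thumb is the pool for many short work items and a dedicated thread (or a separate process) for a few long ones, but which is faster in a given application is something to measure rather than assume.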
When an ASPX page needs to make a call to a potentially long-running operation (lengthy DB query, call to a remote webservice, etc.), I use RegisterAsyncTask, so the IIS worker thread is returned to the pool, rather than being tied up for the duration of the long-running operation. However, ASMX webservices don't have a RegisterAsyncTask function. When an ASMX webservice needs to call a potentially long-running operation, how can I implement the same behavior as RegisterAsyncTask?
Note: the ASMX webservice is implemented as a script-service: returning JSON to a direct jQuery/ajax call. Therefore, I cannot use the "BeginXXX" approach described by MSDN, since that implements the asynchronous behavior within the generated client-stub (which isn't used when calling the webservice directly via ajax).
EDIT: Adding source code: implemented the BeginXXX/EndXXX approach listed in John's answer. The synchronous "Parrot" function works fine. But the asynchronous "SlowParrot" function gives an internal server error: "Unknown web method SlowParrot"
I would like to make a Windows service. Whenever the user of my ASP.NET application has to do a time-consuming task, IIS would give the task to the service, which will return a token (a temporary name for the task), and the service would do the task in the background. At any time, the user would see the status of his/her task, which would be either pending in the queue, processing, or completed. The service would do a fixed number of jobs in parallel, and would keep a queue for incoming tasks. In addition, there would be a WinForms application for the system administrator that would allow adding special admin tasks such as "Clean orphaned files" or "archive data of inactive users".
Can you point me to something that can jump-start me on this as a whole concept? I know I can google for Windows services and I am able to do it myself from scratch, but time is of the essence, so maybe you know of something that is already there that I can use as a building block.
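The core of the design described above (a token per task, a queue, a fixed number of parallel workers, and a status lookup) can be sketched with BlockingCollection and a few worker threads. JobQueue, JobState and the other names are hypothetical, and the Failed state is an addition that the description does not list. This is a minimal in-process sketch under those assumptions; hosting it in a Windows service and exposing it to IIS (for example over WCF) is left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Failed is an assumption added here; the description lists only the first three.
public enum JobState { Pending, Processing, Completed, Failed }

public sealed class JobQueue : IDisposable
{
    private sealed class Job
    {
        public Guid Token;
        public Action Work;
    }

    private readonly BlockingCollection<Job> _pending = new BlockingCollection<Job>();
    private readonly ConcurrentDictionary<Guid, JobState> _states =
        new ConcurrentDictionary<Guid, JobState>();
    private readonly Thread[] _workers;

    // workerCount fixes how many jobs may run in parallel.
    public JobQueue(int workerCount)
    {
        _workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            _workers[i] = new Thread(WorkerLoop) { IsBackground = true };
            _workers[i].Start();
        }
    }

    // Queues the work and returns the token the caller uses to ask for status.
    public Guid Enqueue(Action work)
    {
        var job = new Job { Token = Guid.NewGuid(), Work = work };
        _states[job.Token] = JobState.Pending;
        _pending.Add(job);
        return job.Token;
    }

    // Returns null for an unknown token.
    public JobState? GetState(Guid token) =>
        _states.TryGetValue(token, out JobState state) ? state : (JobState?)null;

    private void WorkerLoop()
    {
        // Blocks until a job is available; ends once CompleteAdding has been
        // called and the queue has drained.
        foreach (Job job in _pending.GetConsumingEnumerable())
        {
            _states[job.Token] = JobState.Processing;
            try
            {
                job.Work();
                _states[job.Token] = JobState.Completed;
            }
            catch (Exception)
            {
                // A fuller version would also record the error details.
                _states[job.Token] = JobState.Failed;
            }
        }
    }

    // Stops accepting jobs, lets queued jobs finish, then releases resources.
    public void Dispose()
    {
        _pending.CompleteAdding();
        foreach (Thread worker in _workers) worker.Join();
        _pending.Dispose();
    }
}
```

Usage would be along the lines of creating one JobQueue with the desired worker count at service start, calling Enqueue for each incoming task, and answering status requests with GetState(token).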
I'm looking for ways to improve a web page that initiates a long-running (>2 minutes) server-side task. The current version of the page just clocks for the full duration of the task, which can be very frustrating to the user.
I already have a few ideas about how I could improve the user's experience, but they all would involve the use of AJAX to some extent. Because of previous experiences that I've had on this project, I know that not all users have JavaScript enabled or available.
Assuming that the server-side process has already been optimized as much as possible, what else could I do to improve the experience of all users as much as possible?
I have a long-running WCF service that I need to call, and what I would like to do is open a modal window (modal popup extender) that simply shows progress and stops the user from interacting with the page until the service returns. What I was trying to do was the following:
1. Click button to activate the process, which calls a method in my code-behind.
2. This method opens my modal panel with some pretty animation.
3. I call my WCF service asynchronously so that the UI will refresh.
4. Service ends, which calls the delegate I set up.
5. My delegate method would then refresh the page with results, and dismiss the modal popup.
I have a report that takes a couple of minutes to generate. What I am trying to do is display a progress indicator onscreen (progress bar or spinning circle) while this is running. I was thinking of using JavaScript to display the progress indicator, but am not sure how to get started on this. I am using ASP.NET 2008, C#.
I have NHibernate sessions cached in the ASP.NET session.
I came across a situation where a user edited an object so it's in their first level cache in the ISession. Another user then edited the same object.
At this point User1 still sees their original version of their edits, whereas User2 sees the correct state of the object.
What is the correct way to handle this without manually calling session.Refresh(myObj) explicitly for every single object all the time?
I also have a 2nd level cache enabled. For NHibernate Long Session should I just disable the first level cache entirely?
Edit: Adding some more terminology to what I'm looking to achieve, from "10.4.1. Long session with automatic versioning". The end of this section concludes with:
As the ISession is also the (mandatory) first-level cache and contains all loaded objects, we can propably use this strategy only for a few request/response cycles. This is indeed recommended, as the ISession will soon also have stale data.
I'm not sure what kind of documentation this is for it to include both probably and then immediately say the session will have stale data (which is what I'm seeing).
I have a long poll HTTP request using ASP.NET 4, MVC 2 and AsyncController. If a user closes their browser and kills the HTTP connection without the request completing, I'd like to know about it and completely clean up after them. If I don't, the open and incomplete requests just sit there and eventually IIS stops accepting new requests.
You can simulate my long running HTTP request by making a normal ASP.NET application with a page that has a Thread.Sleep. Even if you close the browser, the request carries on as if it hasn't.
There is a property called Response.IsClientConnected that gets switched to false if the client disconnects, and I can poll this to achieve the desired effect but it's not very clean and I'd like to avoid polling. Is there a way of getting notified when this happens rather than having to poll this property?
I need to invoke a long-running task from an ASP.NET page, and allow the user to view the task's progress as it executes.
In my current case I want to import data from a series of data files into a database, but this involves a fair amount of processing. I would like the user to see how far through the files the task is, and any problems encountered along the way.
Due to limited processing resources I would like to queue the requests for this service.
I have recently looked at Windows Workflow and wondered if it might offer a solution?
I am thinking of a solution that might look like:
ASP.NET AJAX page -> WCF Service -> MSMQ -> Workflow Service *or* Windows Service