Architecture :: Read And Analyze A Very Large Log File With High Efficiency And Performance?
Sep 13, 2010
It is a very large .txt file (more than 3M), produced every day; the content is the user's system log, like the sample below:
2007-11-01 18:20:42,983 [4520] INFO GetXXX() SERVICE START
2007-11-01 18:21:42,983 [4520] WARNING USER ACCESS DENIED
2007-11-01 18:22:42,983 [4520] ERROR INPUT PARAMETER IS NULL CAN NOT CONVERT TO INT32
2007-11-01 18:23:59,968 [4520] INFO USER LOGOUT
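A minimal sketch of one common approach, assuming the fixed layout shown above (date, time, thread id in brackets, level, message): stream the file line by line instead of reading it all into memory, and aggregate only what is needed. The path and the counting logic are placeholders.

using System;
using System.Collections.Generic;
using System.IO;

class LogScanner
{
    static void Main()
    {
        var countsByLevel = new Dictionary<string, int>();

        // File.ReadLines (.NET 4+) streams the file; the whole log is never held in memory.
        foreach (var line in File.ReadLines(@"C:\logs\system-2007-11-01.log")) // hypothetical path
        {
            // Assumed layout from the sample: "date time,ms [thread] LEVEL message"
            var parts = line.Split(new[] { ' ' }, 5, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length < 4)
                continue; // skip malformed lines

            var level = parts[3]; // INFO / WARNING / ERROR
            int current;
            countsByLevel.TryGetValue(level, out current);
            countsByLevel[level] = current + 1;
        }

        foreach (var pair in countsByLevel)
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
    }
}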
Imagine I have a class MyTestClass, and I need an instance of this type throughout my whole web application. Now there are several possibilities to accomplish this. 1. Make MyTestClass static and give it only static methods. Probably the most performant solution, but I'm not comfortable relying on static fields. Thread safety? What if my static class contained a static System.Collections.Queue?
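On the static-field concern specifically, one hedged sketch (names are illustrative, not from the post): a lazily constructed singleton whose shared state is a thread-safe collection, so concurrent requests cannot corrupt it.

using System;
using System.Collections.Concurrent;

public sealed class MyTestClass
{
    // Lazy<T> (.NET 4+) gives thread-safe, on-demand construction.
    private static readonly Lazy<MyTestClass> instance =
        new Lazy<MyTestClass>(() => new MyTestClass());

    public static MyTestClass Instance
    {
        get { return instance.Value; }
    }

    // A plain static System.Collections.Queue would need explicit locking;
    // ConcurrentQueue is safe for concurrent Enqueue/TryDequeue calls.
    public ConcurrentQueue<string> PendingItems { get; private set; }

    private MyTestClass()
    {
        PendingItems = new ConcurrentQueue<string>();
    }
}

Any page or handler can then call MyTestClass.Instance.PendingItems.Enqueue(...) without taking a lock of its own.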
Until now, I have not created any massive applications using ASP.NET. However, I am looking to create an application that has the potential to be very performance intensive, so I am looking for some tools or best practices when it comes to performance. I would like to know how to:
- see my current performance (good or bad)
- view items that need fixing
Being able to compare two performance measurements would be great as well.
The website I am working on has a section where users can search for content they have digitally subscribed to. I need a way to cross-check both data sets: the first being content that matches the search criteria, the second being the content the user has subscribed to. What's making this a mess is how the data is structured. Here is an example...
We store all of our content in a "Content" table in our database. A piece of content, a "journal" for example, and all of its child records (volumes, issues, and articles) are stored as records in the "Content" table, each with a ParentId pointing to its parent record. So a journal can have n volumes. Each volume can have n issues. Each issue can have n articles. A user can have a subscription to any of these, which implicitly gives them access to all child records. For example, if a user is subscribed to a journal, they have access to all of its volumes, issues, and articles. If the user is subscribed to an issue, they have access to all of its articles, but not the parent volume or journal.
Some users own over 30,000 records, and we have over 100,000 content records in the database. The Content table has relationships with several other tables that are used for the search. This leaves me with an expensive query to find what the user has access to, and another expensive query to search through all of the records to find search criteria matches. Some of our searches take 20-30 seconds and I would like to speed it up to a max of 5sec per search.
I tried running a query to get the ContentIds of everything the user owned when they first visited the search page, and then caching it to eliminate further database hits, but when I passed the list of ints into the query via a LINQ .Contains() statement, I hit the SQL parameter limit of 2100, since apparently .Contains() expands each id into its own parameter.
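One workaround for the 2100-parameter limit, sketched with hypothetical context and property names: split the cached id list into batches that stay well under the limit, run .Contains() per batch, and concatenate the results. A table-valued parameter or a temp-table join would avoid the extra round trips entirely, but the batching version stays in plain LINQ.

using System.Collections.Generic;
using System.Linq;

public static class SearchHelper
{
    // ownedIds: the cached ContentIds the user has access to.
    // batchSize is kept well under SQL Server's 2100-parameter ceiling.
    public static List<Content> SearchOwnedContent(
        MyDataContext db, IList<int> ownedIds, string searchTerm, int batchSize = 1000)
    {
        var results = new List<Content>();

        for (int i = 0; i < ownedIds.Count; i += batchSize)
        {
            var batch = ownedIds.Skip(i).Take(batchSize).ToList();

            results.AddRange(
                db.Contents                                      // hypothetical context/table names
                  .Where(c => batch.Contains(c.ContentId) &&
                              c.Title.Contains(searchTerm))      // hypothetical search predicate
                  .ToList());
        }

        return results;
    }
}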
I have a scenario where large files of about 30-40 MB are being FTPed to a server. I am looking at creating a .NET screen with an FTP control to upload the file to a Unix server. I need to know how much of a performance hit it is to work with files this large; is it a feasible option in this scenario? I might have to create a .NET component for this and call it from an ASP application. Is that doable?
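Size alone is usually manageable as long as the file is streamed rather than buffered whole in memory. A minimal sketch with FtpWebRequest; the URI, credentials, and buffer size are placeholders.

using System.IO;
using System.Net;

public static class FtpUploader
{
    public static void Upload(string localPath, string ftpUri, string user, string password)
    {
        // e.g. ftpUri = "ftp://unixhost/inbound/upload.dat" (placeholder)
        var request = (FtpWebRequest)WebRequest.Create(ftpUri);
        request.Method = WebRequestMethods.Ftp.UploadFile;
        request.Credentials = new NetworkCredential(user, password);
        request.UseBinary = true;

        // Copy in small chunks so a 30-40 MB file never sits in memory at once.
        using (var source = File.OpenRead(localPath))
        using (var target = request.GetRequestStream())
        {
            source.CopyTo(target, 64 * 1024);
        }

        using (var response = (FtpWebResponse)request.GetResponse())
        {
            // response.StatusDescription can be logged to confirm the transfer completed.
        }
    }
}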
I'm building an ASP.NET web application with lots and lots of controls and huge volumes of data. My application is very slow: it takes a long time to load the data into .NET controls like the grid, tree view, etc. I also have some AJAX-enabled pages and controls in my application. I want to reduce the page load time on each postback. What are the standards/best practices to follow while developing large ASP.NET applications?
I have about 500,000 records. I want to display the records grouped (GROUP BY). It takes me about 2 minutes. Is that acceptable?
If not, what is the best way to display the records quickly?
Moreover, I have encrypted one field in the database, and the encryption is done in my code-behind, so when I display the records I have to decrypt that field. That's why I page at 100 records per page.
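Two minutes usually means the whole 500,000-row set is being pulled and decrypted before binding. A hedged sketch with hypothetical names: push the ordering and paging to the database so only the current page of 100 rows is materialized and decrypted.

using System.Collections.Generic;
using System.Linq;

public static class RecordPager
{
    public static List<RecordView> GetPage(MyDataContext db, int pageIndex, int pageSize = 100)
    {
        // OrderBy + Skip/Take translate to SQL, so paging happens on the server,
        // not after all 500,000 rows have been fetched.
        var page = db.Records                        // hypothetical table
                     .OrderBy(r => r.GroupName)      // the "group by" display order
                     .ThenBy(r => r.RecordId)
                     .Skip(pageIndex * pageSize)
                     .Take(pageSize)
                     .ToList();

        // Decrypt only the 100 rows actually shown on this page.
        return page.Select(r => new RecordView
        {
            GroupName   = r.GroupName,
            SecretField = Decrypt(r.EncryptedField)  // the existing code-behind decryption
        }).ToList();
    }

    private static string Decrypt(string cipherText)
    {
        return cipherText; // placeholder for the existing decryption routine
    }
}

public class RecordView
{
    public string GroupName { get; set; }
    public string SecretField { get; set; }
}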
I have a web performance test which contains a request whose response is greater than 5 MB, and the Extract Hidden Fields rule fails to find the (necessary and required!) hidden fields in the response. The response header contains:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Vary: Accept-Encoding, User-Agent
Cache-Control: private
Content-Type: text/plain; charset=utf-8
Date: Sat, 19 Feb 2011 15:24:38 GMT
Server: Microsoft-IIS/6.0
X-AspNet-Version: 2.0.50727
X-Powered-By: ASP.NET
Other than that and the response size, there is nothing remarkable about this scenario. In fact, this same test succeeds when a smaller data set is used. I suspect the Web Performance Test framework is having issues parsing the "chunked" encoding or the sheer volume of data. How can I obtain these required hidden fields from my response? I.e. resolutions, workarounds, converting the automatic extraction to a manual one, etc.
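One manual workaround that is sometimes suggested (a sketch only; the field name and regex are placeholders, and the API details should be checked against your Visual Studio version): replace the built-in rule with a custom ExtractionRule that pulls just the one hidden field you need out of the response body with a regular expression.

using System.Text.RegularExpressions;
using Microsoft.VisualStudio.TestTools.WebTesting;

public class ExtractHiddenFieldByRegex : ExtractionRule
{
    // Name of the hidden field to pull out, e.g. "__EVENTVALIDATION" (placeholder).
    public string FieldName { get; set; }

    // Note: in VS2008 the abstract ContextParameterName property must also be overridden.
    public override void Extract(object sender, ExtractionEventArgs e)
    {
        var pattern = string.Format(
            "name=\"{0}\"[^>]*value=\"([^\"]*)\"", Regex.Escape(FieldName));
        var match = Regex.Match(e.Response.BodyString, pattern);

        if (match.Success)
        {
            e.WebTest.Context.Add(this.ContextParameterName, match.Groups[1].Value);
            e.Success = true;
        }
        else
        {
            e.Success = false;
            e.Message = "Hidden field not found: " + FieldName;
        }
    }
}

If the framework is truncating the captured body, raising the test's response body capture limit (where available) may also be necessary before any rule can see the field.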
I tried to find out about this subject, but with no success. The point is that at the beginning I made many user controls. My site is too slow. I have no idea yet whether it is because of the user controls.
I am building an online course-selection system for a university. Now I want to layer it, and I don't know whether to make it two layers or three, that is, whether to merge the business layer and the database layer. Since I use LINQ, is it true that I don't need to check request data coming from the client in the business layer, because LINQ parameterizes queries automatically to prevent SQL injection and so on? Is that right? Can I put my business classes and DB classes in the same place or layer? If I make it two layers, does performance get better?
I have almost 100 websites that will be updated under a certain condition, and I have a WinZip archive that contains the replacement files for those websites. I want to know which is better:
- I can extract the files to a folder and then copy them to all 100 website folders, or
- I can extract the archive directly into each of the 100 website folders.
Which one performs better and is less prone to errors?
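Extracting once to a staging folder and then copying is generally the safer of the two: the archive is read and decompressed a single time, and a failure while copying to one site cannot affect the extraction used for the others. A rough sketch (ZipFile needs .NET 4.5+; on an older framework or with a WinZip-specific format, a command-line or third-party extractor would take its place):

using System.IO;
using System.IO.Compression;   // System.IO.Compression.FileSystem assembly, .NET 4.5+

public static class SiteUpdater
{
    public static void UpdateAll(string archivePath, string stagingDir, string[] siteRoots)
    {
        // 1. Extract once into a staging folder.
        if (Directory.Exists(stagingDir))
            Directory.Delete(stagingDir, recursive: true);
        ZipFile.ExtractToDirectory(archivePath, stagingDir);

        // 2. Copy the extracted files into each of the ~100 site folders.
        foreach (var siteRoot in siteRoots)
            CopyDirectory(stagingDir, siteRoot);
    }

    private static void CopyDirectory(string source, string target)
    {
        Directory.CreateDirectory(target);

        foreach (var file in Directory.GetFiles(source))
            File.Copy(file, Path.Combine(target, Path.GetFileName(file)), overwrite: true);

        foreach (var dir in Directory.GetDirectories(source))
            CopyDirectory(dir, Path.Combine(target, Path.GetFileName(dir)));
    }
}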
I'm building a new n-tier web application, and I would like to know the performance differences between developing my tiers in one single assembly (each tier with its own namespace) and in separate assemblies, one per tier.
I am currently working on a page that has 13 GridViews on it. Each one is in a user control and a separate UpdatePanel. The time from link click to page load is 20 seconds, though, and I am looking for ways to load the controls asynchronously so the page can render first and the data can come after. I have tried a few different methods so far but have been unsuccessful. I thought about using the AJAX incremental page display pattern, but I did not think it would work because it only returns HTML. I need to load the GridViews in, and they need to support pagination and sorting.
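One deferred-loading pattern that keeps GridView paging and sorting intact, sketched with placeholder names: render the page with the grids empty, and let a one-shot asp:Timer inside each UpdatePanel trigger the first data bind in an asynchronous postback.

// Code-behind sketch for one of the grid user controls (names are placeholders).
// Assumed markup inside the control's UpdatePanel:
//   <asp:Timer ID="LoadTimer" runat="server" Interval="100" OnTick="LoadTimer_Tick" />
//   <asp:GridView ID="GridView1" runat="server" AllowPaging="true" AllowSorting="true" />
public partial class ReportGridControl : System.Web.UI.UserControl
{
    protected void Page_Load(object sender, System.EventArgs e)
    {
        // Do nothing on the initial request so the page itself renders quickly.
    }

    protected void LoadTimer_Tick(object sender, System.EventArgs e)
    {
        LoadTimer.Enabled = false;               // fire only once per page view
        GridView1.DataSource = LoadReportData(); // the expensive query now runs in an async postback
        GridView1.DataBind();
    }

    private System.Data.DataTable LoadReportData()
    {
        // Placeholder for the control's existing data access.
        return new System.Data.DataTable();
    }
}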
How can a large ASP.NET web application be structured/designed with "subwebs"? In other words, I want to structure a large web application as a solution with multiple web-application projects. These projects are more or less independent "modules". One project should be a kind of frame application with shared master pages, a shared sitemap, and shared authentication. Later it should maybe also be possible to loosely integrate older websites (written in classic ASP, ASP.NET 1.1, or ASP.NET 2.0), but this is not so important at the moment.
I need to create a method which looks through a large string of text and determines which words (apart from words like "a", "and", "the") are the most frequently used. I would like to determine the top 3 most frequently used words in a string of text... is this possible?
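Yes. A LINQ sketch of the idea; the stop-word list and delimiters below are just an illustrative subset.

using System;
using System.Linq;

public static class WordStats
{
    private static readonly string[] StopWords = { "a", "an", "and", "the", "of", "to" };

    public static string[] TopThreeWords(string text)
    {
        return text
            .Split(new[] { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?' },
                   StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.ToLowerInvariant())   // count "The" and "the" as the same word
            .Where(w => !StopWords.Contains(w))  // drop the excluded words
            .GroupBy(w => w)
            .OrderByDescending(g => g.Count())
            .Take(3)
            .Select(g => g.Key)
            .ToArray();
    }
}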
I am building a small mass email application for my department, which basically emails a notice out to a large list of email addresses. Because the company email server limits the number of email addresses that can be contained in a single email, I have to break the list apart into smaller groups of 100.
I've created the query to pull all of the email addresses needed and stuffed them into a collection, but I am not sure how to grab 100 emails at a time and send them off to another sub to perform the send before grabbing the next 100.
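A sketch of the batching loop; SendBatch stands in for the existing routine that builds and sends one email.

using System.Collections.Generic;
using System.Linq;

public static class MassMailer
{
    public static void SendInBatches(IList<string> addresses, int batchSize = 100)
    {
        for (int i = 0; i < addresses.Count; i += batchSize)
        {
            // Take the next (up to) 100 addresses and hand them to the existing send routine.
            var batch = addresses.Skip(i).Take(batchSize).ToList();
            SendBatch(batch);
        }
    }

    private static void SendBatch(IList<string> batch)
    {
        // Placeholder for the existing sub that builds and sends one email
        // with these addresses in the recipient (or BCC) list.
    }
}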
I have a project that will be assigned to me soon in which I need to develop a survey that must support approximately 40,000 users. We are thinking of a staged approach so not all users are on the server at any one point in time, probably splitting it so we serve at least a few thousand users at a time...
I don't have much experience with ensuring a server can serve that many users, or with how to manage this, so I am after some advice.
From my understanding I need to do some stress testing on a server, and obviously I need some figures, i.e. average size of a request, average size of a response, and content of the response.
- Do I have to build the database and add records to see what size a typical survey row is?
- Do I have to build the survey in .NET, i.e. by adding controls etc., and see what size the page is?
- The survey shouldn't be too intensive processing-wise; it will be adding information into a back-end SQL database...
Edit: I would like to keep the infrastructure as is, so while the framework ideas are appreciated, please keep your suggestions centered on the context I have provided.
Background
I'm building a web-based application that dynamically loads plugins. Each plugin comes with a manifest file that contains its dll location, namespace, and type.
Right now I'm using System.Reflection.Assembly.LoadFile to load the dlls based on the locations provided in the manifest files. Then I load the types and so on.
As an aside: I may wind up changing to System.Reflection.Assembly.LoadFrom, since I'll eventually be loading files from outside the bin directory. But if there is a better way (Assembly.Load or something), feel free to add that in as well.
Problem
The problem is that multiple plugins can potentially run off the same dll, so I wind up executing System.Reflection.Assembly.LoadFile("Identical.dll") multiple times.
I have the idea to check whether my assembly has already been loaded by iterating through AppDomain.CurrentDomain.GetAssemblies(), but I don't know if that will help with performance (or if it will work at all; I haven't tried it).
Also, I can't keep a list of loaded assemblies due to the project's design constraints (though you may argue that it's a poor design: I can't change it, even if I wanted to or agreed with you... so please don't press the issue).
Ultimately my goals are:
Don't ever re-load the same assembly twice. Performance is key.
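A sketch of the AppDomain check described above: it compares assembly identities rather than file paths, and keeps no list of its own, so it stays within the stated design constraint. Names are illustrative.

using System;
using System.Linq;
using System.Reflection;

public static class PluginAssemblyLoader
{
    public static Assembly Load(string dllPath)
    {
        // Identity of the assembly on disk (name, version, culture, public key token).
        string wantedFullName = AssemblyName.GetAssemblyName(dllPath).FullName;

        // Reuse an already-loaded copy if one exists in the current AppDomain.
        Assembly existing = AppDomain.CurrentDomain
            .GetAssemblies()
            .FirstOrDefault(a => !a.IsDynamic && a.FullName == wantedFullName); // IsDynamic is .NET 4+

        return existing ?? Assembly.LoadFile(dllPath);
    }
}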
I'm developing a chat application in ASP.NET MVC. In my app, a user can create a room and invite others to join the chat, but the chat room information does not need to be persisted. So I designed it to save all chat messages, room information, and user info in Session, and to clear it when the owner closes the room. I'm worried about stressing the server by keeping that much data (room info, user info, and messages) in Session if up to 5,000 rooms are created and a lot of messages are transferred in those rooms. Is my solution good enough? Is it OK to save this in Session?
I need to create a website that analyzes a SQL database with a chart. I have an application that modifies the database, and I want to show the changes in another application using a chart.
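A rough sketch using the ASP.NET Chart control (System.Web.UI.DataVisualization.Charting). The connection-string name, query, and table are hypothetical, and the page is assumed to declare an <asp:Chart ID="Chart1"> with one ChartArea.

using System;
using System.Configuration;
using System.Data.SqlClient;
using System.Web.UI.DataVisualization.Charting;

public partial class ChangesChartPage : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        var connStr = ConfigurationManager.ConnectionStrings["MyDb"].ConnectionString; // assumed name

        using (var conn = new SqlConnection(connStr))
        using (var cmd = new SqlCommand(
            "SELECT Category, COUNT(*) FROM ChangeLog GROUP BY Category", conn)) // hypothetical table
        {
            conn.Open();

            var series = new Series("Changes") { ChartType = SeriesChartType.Column };
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    series.Points.AddXY(reader.GetString(0), reader.GetInt32(1));
            }

            Chart1.Series.Clear();
            Chart1.Series.Add(series);
        }
    }
}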