loadrunner Hi All, A web based application needs to support 2000 Concurrent...

Sudeep Dutt

Guest
Hi All, a web-based application needs to support 2000 concurrent users. It works perfectly fine up to 450 users, but once the 450 threshold is crossed the test starts throwing errors: "Error -27794: Failed to connect to server "xyz.com:443": [10060] Connection timed out". More than 95% of the errors are the one mentioned above. Any thoughts on this?
 
Well, my first thought is that you have found the threshold that your current system setup supports. This is what performance testing is for. Work out how many application servers you are using, try increasing that number, and see if you can reach your goal of 2000 concurrent users. Good luck!
 
Well, LoadRunner proves to be the TOOL :) Now you know how many users your app can hold, and you need to fix the app. BTW, check that your pacing, think time, and other run-time settings match realistic load.
 
Check the server logs for errors as well, and check whether the server's maximum connection limit has been reached.
 
Check your app manually once the load reaches the 450 mark; if it is not working, then you have found your performance issue. Also enable "snapshot on error" in the run-time settings (RTS). Once you see the issue, use a diagnostic tool (Dynatrace, HP Diagnostics, etc.) or simply check the logs and the DB report. Also, as mentioned here before, make sure you are not artificially over-stressing the system: check your pacing, think time, etc. Make sure your LGs are healthy during the run.
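To put rough numbers on the pacing point above, here is a minimal sketch. It assumes fixed pacing that is longer than the iteration itself, so each vuser starts one iteration per pacing interval. The function name and the example figures (60 s pacing, 10 requests per iteration) are made up for illustration and are not taken from the original test.

```python
def offered_load(vusers, pacing_s, requests_per_iteration):
    """Approximate the load a scenario generates.

    Assumes fixed pacing longer than the iteration duration, so each
    vuser starts exactly one iteration every `pacing_s` seconds.
    Returns (iterations per second, requests per second).
    """
    if pacing_s <= 0:
        raise ValueError("pacing_s must be positive")
    iterations_per_s = vusers / pacing_s
    return iterations_per_s, iterations_per_s * requests_per_iteration


if __name__ == "__main__":
    # Hypothetical settings: 60 s pacing, 10 HTTP requests per iteration.
    for users in (450, 2000):
        its, reqs = offered_load(users, 60, 10)
        print(f"{users} vusers: {its:.2f} iterations/s, {reqs:.1f} requests/s")
```

If the figure this gives is far above what real users would produce, the 450 mark may say more about the scenario settings than about the application.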
 
Once 450 users have ramped up, the application opens properly. The majority of errors are during the application URL launch.
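Since the failures happen at connect time (10060 is the Windows socket code for a connection timeout), one way to check this outside LoadRunner is to time a plain TCP connect from another machine while the test is above 450 users. This is only a sketch: the function name is made up, the timeout is an arbitrary choice, and the host and port in the usage comment are the placeholders from the error message.

```python
import socket
import time


def time_connect(host, port, timeout_s=5.0):
    """Return the seconds taken to open a TCP connection, or None on failure.

    Only the TCP handshake is measured; no TLS or HTTP traffic is sent.
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return time.perf_counter() - start
    except OSError:
        # Covers timeouts, refused connections and DNS failures.
        return None


# Example usage (placeholder host from the error message):
#   elapsed = time_connect("xyz.com", 443)
#   print("connect failed" if elapsed is None else f"connected in {elapsed:.3f}s")
```

If these probes also time out while the load is running, the limit is likely in front of the application (web server, load balancer, firewall) rather than in the load generators.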
 
If the application works fine manually with 450 users, then I would suggest rerunning the test with more LGs allocated to each script.
 
Then it may be a problem with the LGs; check their utilization. Also check whether the timeout in LR has been set to 1200 seconds in the settings.
 
1) I added 2 more LGs, each with a capacity of 3000 concurrent users, for a total of 4 LGs, but nothing changes: the error remains the same and starts appearing once the 450-user benchmark is crossed. 2) CPU, memory, and disk utilization on the LGs during the test run are below 15%. So from a virtual user generation standpoint I do not see a resource crunch on the LGs.
 
You are looking at the load generators... Why are you not looking at the servers? You should have a minimum of three load generators involved in any given test: n-1 for primary load and one reserved for control load (one user of each type). If the control load and the primary load both experience the same issue (server stops at 450), then it is a server issue, not a load generator one. If the control load response times differ markedly from the primary load response times, then you have a load generator issue where you are exceeding resources. Your logging during the test should be set no higher than log-on-error. Anything more and you are likely to engage the disk on your load generators as a speed brake in your test. This would not manifest as users being refused above level 'x' (such as 450 in this case) but as slowed users. Also, watch your swap level. For some reason, performance tool vendors have not marked their virtual user code as non-swappable, so if you do not have enough physical memory (but plenty of address space) to run your intended virtual user level, a load generator can swap itself to death. This manifests the same way as too high a logging level: slowed users. Your control group of only one virtual user of each type on the control PC would be unaffected and would not show the slowed performance.
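The control-versus-primary reasoning above can be sketched as a small decision helper. This is an illustration only: the function name, the use of medians, and the 2x ratio used to decide that response times are "highly differentiated" are all assumptions, not anything LoadRunner provides.

```python
from statistics import median


def triage(control_times, primary_times, control_errors, primary_errors,
           ratio_threshold=2.0):
    """Classify a failed run from control-group and primary-group results.

    control_times / primary_times: response times in seconds.
    control_errors / primary_errors: connection error counts per group.
    ratio_threshold: assumed cut-off for "highly differentiated" times.
    """
    # Both groups hit the same failure: the server side is the suspect.
    if control_errors > 0 and primary_errors > 0:
        return "server"
    if not control_times or not primary_times:
        return "inconclusive"
    control_med = median(control_times)
    primary_med = median(primary_times)
    # Only the heavily loaded generators are slow: suspect the generators.
    if control_med > 0 and primary_med / control_med >= ratio_threshold:
        return "load generator"
    return "inconclusive"


if __name__ == "__main__":
    print(triage([1.0, 1.1], [1.0, 1.2], control_errors=5, primary_errors=300))
    print(triage([1.0, 1.0], [3.0, 3.5], control_errors=0, primary_errors=0))
```

With the made-up figures in the usage lines, the first call reports "server" and the second "load generator".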
 
How is the distributed application infrastructure laid out (tiers, load balancers)? And what do the execution logs at each tier say about how resources are being consumed?
 
Based on that failure point, I would suggest looking at your front-end web server settings. Apache defaults for 2 front-end web servers would get you almost exactly 450 connections... That would mean adjusting httpd.conf for sure.
 
This looks more like a web server config issue, meaning you will need to change your web server's worker MPM settings to solve the problem. Calculation: ThreadsPerChild * ServerLimit = MaxClients, the maximum number of clients that can be served at once. Let me know if this helps :) .
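The calculation above can be worked through as follows. This is a rough sketch: the function names are made up, the values 16 and 25 are illustrative (check the actual ServerLimit and ThreadsPerChild in your httpd.conf), and it assumes one connection per concurrent user spread evenly over the web servers, which real browsers may not respect.

```python
import math


def worker_capacity(server_limit, threads_per_child, web_servers=1):
    """Maximum simultaneous clients: ServerLimit * ThreadsPerChild per server."""
    return server_limit * threads_per_child * web_servers


def server_limit_needed(target_clients, threads_per_child, web_servers=1):
    """Smallest ServerLimit per server that reaches `target_clients` overall."""
    per_server = math.ceil(target_clients / web_servers)
    return math.ceil(per_server / threads_per_child)


if __name__ == "__main__":
    # Illustrative values only; read the real ones from httpd.conf.
    print(worker_capacity(16, 25))             # 400 clients on one server
    print(worker_capacity(16, 25, 2))          # 800 clients on two servers
    print(server_limit_needed(2000, 25, 2))    # 40 processes per server
```

Note that MaxClients also has to be raised to match, since the product of the two directives is only the upper bound it can be set to.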