Linode Community Forums
Posted: Sun Jan 24, 2010 12:06 am
mikeage
Joined: Thu Sep 11, 2008 10:49 pm
Posts: 70
Website: http://mikeage.net
Location: Israel
Hi,

I'm currently running a small but PHP-heavy site with relatively few users. My static files are served by nginx, and my dynamic content by PHP (FastCGI). However, I'm only running 4 php-cgi processes, as each tends to take about 25MB, and that's all the memory I can spare [MySQL gets another 120 or so, and the rest goes to misc. applications: squid, openvpn, etc.].

My concern is that one person can easily DoS the site (by accident) just by opening a few tabs at once with long-ish PHP actions. Then, when things don't load immediately, they'll hit reload [or open some more tabs while the first ones load], and things back up in iowait; system load goes up to 5, 10, 15, etc., while all of the processes wait for their turn to run (mysql gets backed up, as does kjournald, etc.).

Any suggestions for preventing this? Should I run fewer PHP processes and let nginx either return a "proxy not available" error or hold requests in a queue? Is there any way I can assign priorities, or resource limits, or something else?

Thanks

EDIT: I'm already running APC.


Posted: Sun Jan 24, 2010 5:11 am
(Joined: Sun Jan 18, 2009; Posts: 830)
I think the best thing is to speed up whatever PHP page is taking so long. If that's not possible, limit who can access that page and/or how often they can load it. A blunt method would be to use the connlimit module in iptables to restrict the number of simultaneous connections.
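For instance, a connlimit rule along these lines (the threshold of 10 is an arbitrary example) would turn away any single IP holding too many simultaneous connections to port 80:

```shell
# Reject new HTTP connections from any source IP that already has more
# than 10 connections open to port 80; tcp-reset makes the refusal look
# like an ordinary connection reset to the client.
iptables -I INPUT -p tcp --syn --dport 80 \
    -m connlimit --connlimit-above 10 \
    -j REJECT --reject-with tcp-reset
```

Note this counts raw TCP connections, not requests, so browsers that open several connections for images and CSS eat into the limit.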


Posted: Sun Jan 24, 2010 8:05 am
mikeage
I have two pages that are particularly slow [and cannot really be optimized]: one uses some external web connections, and one is third-party code with several complex searches.

iptables seems like a good idea; I'll look into that. Hopefully, clients will handle it gracefully.


Posted: Sun Jan 24, 2010 10:26 am
Azathoth
It's a problem of application design. You have several options:

1. Use client-side technologies wherever possible (aggregation, computation, ...).
2. Server-side caching, unless the data always has to be completely fresh.
2+1. Server-side caching of JSON data that gets imported into pages. Yeah, you get more hits, but those hits go to static files.*
3. Offload to external processes, i.e. your web application starts an external application/process and the client checks its status via async JS (aka AJAX).

The proposed iptables solution should work, but IMHO it's a hack, especially since your pages might not load properly if each page also relies on a number of images, stylesheets, or JS files. Even with client-side caching, browsers may still send conditional (If-Modified-Since) requests.

*If you use sessions, a request locks the session file until it finishes, meaning all additional requests (under that session ID) are serialized, waiting for the session file lock. So if the first hit is a hog, all the others, however quick they are, will wait until the first one finishes, regardless of how many CPUs or processes are involved.
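One mitigation for that session-lock issue, sketched here assuming PHP's default file-based session handler, is to release the lock as soon as the slow page has read what it needs:

```php
<?php
// The default file-based session handler holds an exclusive lock on the
// session file for the whole request. If the slow page only *reads*
// session data, release the lock early so the same user's other
// requests (extra tabs, AJAX calls) aren't serialized behind it.
session_start();
$user = isset($_SESSION['user']) ? $_SESSION['user'] : null;

session_write_close();  // lock released; later $_SESSION writes won't persist

// ...the long-running work (external HTTP calls, complex searches)
// happens here, without blocking the user's parallel requests...
```

The trade-off: anything written to `$_SESSION` after `session_write_close()` is silently lost, so do all session writes before releasing the lock.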


Posted: Sun Jan 24, 2010 8:17 pm
(Joined: Sun Jan 18, 2009; Posts: 830)
4. Handle the overload condition in PHP rather than iptables.

You'd put this early in the PHP page; here as runnable PHP, using a non-blocking flock() as the lock:

Code:
$fp = fopen('/tmp/slow-page.lock', 'c');
if (!$fp || !flock($fp, LOCK_EX | LOCK_NB)) {
    // Another request is already running this page; turn the client away.
    header('HTTP/1.1 503 Service Unavailable');
    header('Retry-After: 30');
    echo 'Server busy; please try again shortly.';
    exit;
}

// ...the rest of your page goes here...

flock($fp, LOCK_UN);  // also released automatically when the process exits
fclose($fp);


This would permit only one pending request for this particular page at a time. The Retry-After value depends on how long your page typically takes to complete whatever it does (it could even be calculated based on load).

As Azathoth pointed out, the iptables method may cause problems because it's pretty common for browsers to open many simultaneous connections to load images, CSS files, and such.


Posted: Mon Jan 25, 2010 6:32 am
Azathoth
I'm not sure how good the idea of serializing requests to PHP is. It could backfire.

However, rejecting requests based on load might (in theory) be a good idea. I'd suggest using APC variables or memcached to track the number of requests per second (simply reset the counter whenever the current $_SERVER['REQUEST_TIME'] differs from the cached one), and reject with a 503 if there are too many.
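A minimal sketch of that per-second counter with the APC user cache (the key prefix and the limit of 20 requests/second are arbitrary placeholders):

```php
<?php
// Hypothetical load-shedding gate: one APC counter per wall-clock second.
$limit = 20;  // max requests per second before shedding load
$key   = 'req_count_' . $_SERVER['REQUEST_TIME'];

// apc_add() only succeeds if the key doesn't exist yet, so each second
// gets a fresh counter; stale keys expire via the short TTL.
apc_add($key, 0, 2);
$count = apc_inc($key);

if ($count !== false && $count > $limit) {
    header('HTTP/1.1 503 Service Unavailable');
    header('Retry-After: 1');
    exit;
}
// ...normal page processing continues below this point...
```

Because APC's cache is per-machine (shared across that host's PHP processes), this needs no extra daemon; memcached would be the equivalent for multiple web servers.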

But still, that's all hacking around the problem temporarily. The real solution is to reduce the processing time as much as possible, and once you're done with that, to invest in more resources if needed.


Posted: Mon Jan 25, 2010 9:21 am
mikeage
Thanks for all of your advice. As I said, I really can't optimize the applications at this time [nor change one of the clients; see the postscript], so I added a limit of one simultaneous instance of the really slow script, and I'm currently looking into using squid as a reverse proxy for some dynamic data; that seems to offer the best "drop-in" solution without rewriting existing packages to use memcached or some other solution.

P.S. One of my clients is a Kodak photo frame which requests images in 128KB blocks using partial GETs. The server is running G2 [gallery.menalto.com], which serves images from behind an image firewall by sending them through PHP. This is a bad combination... I'm hoping squid will help here.
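For reference, a squid reverse-proxy ("accelerator") setup for this kind of use might look roughly like the following (squid 2.7 syntax; the hostname, backend port, and cache times are placeholders):

```
# squid sitting on port 80 in front of the PHP backend.
http_port 80 accel defaultsite=www.example.com
cache_peer 127.0.0.1 parent 8080 0 no-query originserver name=backend
acl our_site dstdomain www.example.com
http_access allow our_site
cache_peer_access backend allow our_site
# Cache the PHP-served image derivatives for a while, even on reload,
# so the photo frame's repeated partial GETs hit squid, not PHP.
refresh_pattern -i \.(jpg|jpeg|png|gif)$ 60 50% 1440 ignore-reload
```

Whether the G2 responses are actually cacheable depends on the Cache-Control/Expires headers G2 emits, so those may need overriding too.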


Posted: Mon Jan 25, 2010 8:52 pm
waldo
Have you configured G2 to use its performance caching options under Site Admin > Performance?

Do you do anything special with G2, or are you simply displaying your photos on your website? If that's all you're doing, take a look at G3, the next iteration of Gallery: it doesn't act as an image firewall, so there's an immediate performance boost right there.

If you need to continue using G2, here are some other docs about squeaking out more performance:
http://codex.gallery2.org/Gallery2:Performance_Tips
http://codex.gallery2.org/Gallery2:ACL_Performance


Posted: Mon Jan 25, 2010 11:09 pm
mikeage
I'm not using the caching in G2, since I have no more memory to allocate for mysql, and I found that the caching would most often just cache things that a spider accessed once.

I haven't moved to G3 yet, since (a) it's Apache-only [although I do have my testing environment set up to proxy from nginx to apache], and (b) many of the things I use are not yet supported. I probably will, but not just yet.


Posted: Tue Jan 26, 2010 12:50 am
waldo
Quote:
I'm not using the caching in G2, since I have no more memory to allocate for mysql, and I found that the caching would most often just cache things that a spider accessed once.


All the more reason to use it. Gallery doesn't use the database for caching; it caches DB queries as well as derivatives, pages, comments, etc. The caching will cover everything that's viewed by your guests or registered users (depending on how you configure it).

As for G3, yeah, as long as you're using more advanced features available in G2, G3 isn't for you (yet). Lighttpd and Nginx probably won't ever be supported by the core team, but I've had G3 working under Lighty without problems and someone will eventually come up with rewrite rules that work for the image protection features.

For G2, you really should look into its performance features; you'll reduce the DB and memory resources needed for the webserver. As for browsing: if you hack G2 to not do any view counting, then when people are browsing your site there won't be any DB writes at all, just reads, and even fewer if you use the caching. Check out the links I provided earlier.


Posted: Tue Jan 26, 2010 3:20 am
mikeage
waldo wrote:
Quote:
I'm not using the caching in G2, since I have no more memory to allocate for mysql, and I found that the caching would most often just cache things that a spider accessed once.


All the more reason to use it. Gallery doesn't use the database for caching; it caches DB queries as well as derivatives, pages, comments, etc. The caching will cover everything that's viewed by your guests or registered users (depending on how you configure it).


Hrm. I remember problems with g2_cacheMap getting _huge_; i.e., hundreds of megabytes.

waldo wrote:
As for G3, yeah, as long as you're using more advanced features available in G2, G3 isn't for you (yet). Lighttpd and Nginx probably won't ever be supported by the core team, but I've had G3 working under Lighty without problems and someone will eventually come up with rewrite rules that work for the image protection features.


I actually had some success getting G3 working with nginx, but it started becoming more trouble than it was worth (especially when Kohana and all of the AJAX calls began to be problematic). The actual permissions weren't so bad; I once sent a patch that supported nginx, subject to the rather annoying restriction of needing to send a HUP every time permissions changed.

Thanks for your comments; I'll look into their caching again.


Posted: Tue Jan 26, 2010 6:06 pm
waldo
Quote:
Hrm. I remember problems with g2_cacheMap getting _huge_; i.e., hundreds of megabytes.


Hmmm, that shouldn't happen, but it may depend on your settings. You might want to chime in on this active thread on the G2 forums:
http://gallery.menalto.com/node/94053


Powered by phpBB® Forum Software © phpBB Group