Linode Community Forums
Post subject: Page allocation errors
Posted: Sat Sep 24, 2011 8:43 pm
Senior Member

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
Using the Latest 2.6 kernel, I've regularly been seeing page allocation errors.

In my case the error always appears to be in the IP stack, so I wonder if there's some Xen-level issue. It also seems to coincide with my home machine doing an rsync of my /BACKUP directory, which might explain why the stacks display tcp4 entries. Note the process ID...
e.g.
Code:
Call Trace:
swapper: page allocation failure. order:2, mode:0x20
Pid: 0, comm: swapper Not tainted 2.6.39.1-linode34 #1
 [<c0189a30>] ? __alloc_pages_nodemask+0x530/0x6f0
 [<c01afb13>] ? T.819+0xb3/0x2e0
 [<c01aff86>] ? cache_alloc_refill+0x246/0x290
 [<c0139826>] ? local_bh_enable+0x16/0x80
 [<c01b008d>] ? __kmalloc+0xbd/0xd0
 [<c050f07e>] ? pskb_expand_head+0x12e/0x200
 [<c050f5bd>] ? __pskb_pull_tail+0x4d/0x2b0
 [<c05d9263>] ? ipv4_confirm+0xd3/0x180
 [<c0517d6d>] ? dev_hard_start_xmit+0x1dd/0x3e0
 [<c059a900>] ? ip_finish_output2+0x260/0x260
 [<c059a900>] ? ip_finish_output2+0x260/0x260
 [<c052bcc2>] ? sch_direct_xmit+0xb2/0x170
 [<c0518069>] ? dev_queue_xmit+0xf9/0x320
 [<c059aa3b>] ? ip_finish_output+0x13b/0x300
 [<c059acaa>] ? ip_output+0xaa/0xe0
 [<c0599e78>] ? ip_local_out+0x18/0x20
 [<c059a257>] ? ip_queue_xmit+0x117/0x3d0
 [<c01062bb>] ? xen_restore_fl_direct_reloc+0x4/0x4
 [<c068fb71>] ? _raw_spin_unlock_irqrestore+0x11/0x20
 [<c013fcb9>] ? mod_timer+0xf9/0x1b0
 [<c05ad70f>] ? tcp_transmit_skb+0x37f/0x660
 [<c05b0165>] ? tcp_write_xmit+0x1e5/0x4f0
 [<c05b04d4>] ? __tcp_push_pending_frames+0x24/0x90
 [<c05ac4e2>] ? tcp_rcv_established+0x3d2/0x610
 [<c05b2fee>] ? tcp_v4_do_rcv+0xce/0x170
 [<c05b3749>] ? tcp_v4_rcv+0x6b9/0x7a0
 [<c0595887>] ? ip_local_deliver_finish+0x97/0x220
 [<c05957f0>] ? ip_rcv+0x320/0x320
 [<c059524b>] ? ip_rcv_finish+0xfb/0x380
 [<c0516ca9>] ? __netif_receive_skb+0x339/0x3d0
 [<c0516f47>] ? netif_receive_skb+0x67/0x70
 [<c04b84cc>] ? handle_incoming_queue+0x17c/0x250
 [<c04b87bc>] ? xennet_poll+0x21c/0x540
 [<c0131061>] ? load_balance+0x71/0x590
 [<c05176da>] ? net_rx_action+0xea/0x190
 [<c013956c>] ? __do_softirq+0x7c/0x110
 [<c01394f0>] ? __local_bh_enable+0x70/0x70
 <IRQ>  [<c013944e>] ? irq_exit+0x6e/0x90
 [<c044d14d>] ? xen_evtchn_do_upcall+0x1d/0x30
 [<c0690c07>] ? xen_do_upcall+0x7/0xc
 [<c01013a7>] ? hypercall_page+0x3a7/0x1000
 [<c0105b3f>] ? xen_safe_halt+0xf/0x20
 [<c010f1ff>] ? default_idle+0x2f/0x60
 [<c0107e52>] ? cpu_idle+0x42/0x70
 [<c0830797>] ? start_kernel+0x2c8/0x2cd
 [<c083030d>] ? kernel_init+0x126/0x126
 [<c083395b>] ? xen_start_kernel+0x4f7/0x4ff
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:  52
CPU    1: hi:  186, btch:  31 usd:  64
CPU    2: hi:  186, btch:  31 usd:  83
CPU    3: hi:  186, btch:  31 usd: 158
active_anon:1419 inactive_anon:1841 isolated_anon:0
 active_file:54491 inactive_file:55433 isolated_file:0
 unevictable:1137 dirty:3 writeback:0 unstable:0
 free:2130 slab_reclaimable:5041 slab_unreclaimable:2407
 mapped:1924 shmem:6 pagetables:270 bounce:0
DMA free:2048kB min:84kB low:104kB high:124kB active_anon:0kB inactive_anon:0kB active_file:440kB inactive_file:3988kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15808kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:100kB slab_unreclaimable:112kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 500 500 500
Normal free:6472kB min:2816kB low:3520kB high:4224kB active_anon:5676kB inactive_anon:7364kB active_file:217524kB inactive_file:217744kB unevictable:4548kB isolated(anon):0kB isolated(file):0kB present:512064kB mlocked:4548kB dirty:12kB writeback:0kB mapped:7696kB shmem:24kB slab_reclaimable:20064kB slab_unreclaimable:9516kB kernel_stack:832kB pagetables:1080kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:4 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 126*4kB 81*8kB 28*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2048kB
Normal: 1618*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6472kB
111437 total pagecache pages
609 pages in swap cache
Swap cache stats: add 5243, delete 4634, find 170851/171200
Free swap  = 254928kB
Total swap = 263164kB
133104 pages RAM
0 pages HighMem
5748 pages reserved
44857 pages shared
89078 pages non-shared

_________________
Rgds
Stephen
(Linux user since kernel version 0.11)
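
A note on reading the first line of the trace above: `order:N` means the kernel asked for 2^N physically contiguous pages, so `order:2` is 4 pages. The Python sketch below only pulls those fields out of the log line. The function name `parse_alloc_failure` is made up for illustration, and the 4 kB page size is an assumption based on the smallest block size (4kB) in the Mem-Info free lists.

```python
import re

PAGE_SIZE_KB = 4  # assumption: 4 kB pages, matching the smallest free-list block in the dump

# Matches both "failure. order:2" and "failure: order:3" spellings of the message.
_FAILURE_RE = re.compile(
    r"(\S+): page allocation failure[.:] order:(\d+), mode:(0x[0-9a-fA-F]+)"
)

def parse_alloc_failure(line):
    """Return a dict describing the failed request, or None if the line does not match."""
    m = _FAILURE_RE.match(line.strip())
    if m is None:
        return None
    order = int(m.group(2))
    pages = 1 << order  # 2**order contiguous pages
    return {
        "comm": m.group(1),
        "order": order,
        "mode": int(m.group(3), 16),
        "pages": pages,
        "contiguous_kb": pages * PAGE_SIZE_KB,
    }

info = parse_alloc_failure("swapper: page allocation failure. order:2, mode:0x20")
print(info["comm"], info["pages"], "pages,", info["contiguous_kb"], "kB contiguous")
```

The point of the arithmetic is that the request is for one contiguous 16 kB block, which is a different question from how much memory is free in total.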


Posted: Sat Sep 24, 2011 8:49 pm
Sysop

Joined: Sat Nov 27, 2010 3:32 am
Posts: 180
Website: https://blog.timheckman.net/
Location: San Francisco, CA
Try switching to our Latest 3.0 (3.0.4-linode38) kernel. We recently deployed a new kernel that should help with this.


Posted: Sat Sep 24, 2011 8:53 pm
Senior Member

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
theckman wrote:
Try switching to our Latest 3.0 kernel. We recently deployed a new kernel that should help with this.

I'm running CentOS 5.7; how much testing has been done with this older OS and the new kernel? I don't want to be a guinea pig :-)

_________________
Rgds

Stephen

(Linux user since kernel version 0.11)


Posted: Sat Sep 24, 2011 8:56 pm
Senior Member

Joined: Thu Jun 16, 2011 8:24 am
Posts: 412
Location: Cyberspace
I never thought I'd see the day when root became a process... :shock: :D

Seriously, though, I always thought that the pid would increment by one with every process spawned on a system, and that pids start at 1, which is only used by the very first process spawned when the OS starts booting? O.o First time I've ever seen that :)

_________________
Kris the Piki Geeker


Posted: Sat Sep 24, 2011 9:01 pm
Senior Member

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
Piki wrote:
I never thought I'd see the day when root became a process... :shock: :D

Seriously, though, I always thought that the pid would increment by one with every process spawned on a system, and that pids start at 1, which is only used by the very first process spawned when the OS starts booting? O.o First time I've ever seen that :)

Exactly; the error is showing from kernel allocation failures and not userspace overloading; this isn't a typical OOM error, no process is failing :-)

_________________
Rgds

Stephen

(Linux user since kernel version 0.11)


Posted: Sun Sep 25, 2011 5:13 pm
Sysop

Joined: Sat Nov 27, 2010 3:32 am
Posts: 180
Website: https://blog.timheckman.net/
Location: San Francisco, CA
By default the new CentOS deployments use the Latest 3.0 kernel. There should not be any problems as "3.0" is simply "2.6.40". The version numbers were changed by Linus because they were getting too long.


Posted: Sun Sep 25, 2011 6:08 pm
Senior Member

Joined: Thu Jun 16, 2011 8:24 am
Posts: 412
Location: Cyberspace
I just thought he changed to 3.0 because that's a better age than 40... After all, wouldn't anybody in their 40s want to go back to being 30? :lol:

Actually, I remember seeing somewhere that it had to do with the 20 year anniversary of the kernel.

_________________
Kris the Piki Geeker


Posted: Tue Sep 27, 2011 6:47 am
Senior Member

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
theckman wrote:
Try switching to our Latest 3.0 (3.0.4-linode38) kernel. We recently deployed a new kernel that should help with this.

Did not fix the problem.

Code:
swapper: page allocation failure: order:3, mode:0x20
Pid: 0, comm: swapper Not tainted 3.0.4-linode38 #1
Call Trace:   
 [<c018b258>] ? warn_alloc_failed+0x98/0x100
 [<c018baa4>] ? __alloc_pages_nodemask+0x3f4/0x630
 [<c014000a>] ? mod_timer_pending+0xba/0x110
 [<c01b22b3>] ? T.833+0xb3/0x2e0
 [<c01b2726>] ? cache_alloc_refill+0x246/0x290
 [<c060f0ff>] ? ipt_do_table+0x24f/0x580
 [<c01b282d>] ? __kmalloc+0xbd/0xd0
 [<c053a7fe>] ? pskb_expand_head+0x12e/0x200
 [<c053ad3d>] ? __pskb_pull_tail+0x4d/0x2b0
 [<c0607ad3>] ? ipv4_confirm+0xf3/0x1b0
 [<c05436dd>] ? dev_hard_start_xmit+0x1dd/0x3e0
 [<c05c8020>] ? ip_finish_output2+0x260/0x260
 [<c05c8020>] ? ip_finish_output2+0x260/0x260
 [<c0557a62>] ? sch_direct_xmit+0xb2/0x170
 [<c05439d9>] ? dev_queue_xmit+0xf9/0x320
 [<c05c815b>] ? ip_finish_output+0x13b/0x300
 [<c05c83ca>] ? ip_output+0xaa/0xe0
 [<c05c7568>] ? ip_local_out+0x18/0x20
 [<c05daf25>] ? tcp_transmit_skb+0x385/0x670
 [<c05dd965>] ? tcp_write_xmit+0x1e5/0x4f0
 [<c05ddcd4>] ? __tcp_push_pending_frames+0x24/0x90
 [<c05d9cf2>] ? tcp_rcv_established+0x3d2/0x610
 [<c05e080e>] ? tcp_v4_do_rcv+0xce/0x1a0
 [<c05e0f99>] ? tcp_v4_rcv+0x6b9/0x7a0
 [<c05c2fc7>] ? ip_local_deliver_finish+0x97/0x220
 [<c05c2f30>] ? ip_rcv+0x320/0x320
 [<c05c298b>] ? ip_rcv_finish+0xfb/0x380
 [<c0540dae>] ? __netif_receive_skb+0x2fe/0x370
 [<c0542597>] ? netif_receive_skb+0x67/0x70
 [<c04e35fc>] ? handle_incoming_queue+0x17c/0x250
 [<c04e38ec>] ? xennet_poll+0x21c/0x540
 [<c0542d2a>] ? net_rx_action+0xea/0x190
 [<c0139cfc>] ? __do_softirq+0x7c/0x110
 [<c0139c80>] ? irq_enter+0x60/0x60
 <IRQ>  [<c0139ade>] ? irq_exit+0x6e/0xa0
 [<c047829d>] ? xen_evtchn_do_upcall+0x1d/0x30
 [<c06c0947>] ? xen_do_upcall+0x7/0xc
 [<c01013a7>] ? hypercall_page+0x3a7/0x1000
 [<c0105c7f>] ? xen_safe_halt+0xf/0x20
 [<c010f41e>] ? default_idle+0x2e/0x60
 [<c0107f72>] ? cpu_idle+0x42/0x70
 [<c086977f>] ? start_kernel+0x2ce/0x2d3
 [<c08692ef>] ? kernel_init+0x112/0x112
 [<c086c943>] ? xen_start_kernel+0x4f7/0x4ff
Mem-Info:
DMA per-cpu: 
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 121
CPU    1: hi:  186, btch:  31 usd: 143
CPU    2: hi:  186, btch:  31 usd: 183
CPU    3: hi:  186, btch:  31 usd: 213
active_anon:1695 inactive_anon:2199 isolated_anon:0
 active_file:56223 inactive_file:50078 isolated_file:0
 unevictable:1137 dirty:16 writeback:0 unstable:0
 free:5126 slab_reclaimable:5292 slab_unreclaimable:1992
 mapped:2511 shmem:4 pagetables:291 bounce:0
DMA free:3428kB min:84kB low:104kB high:124kB active_anon:0kB inactive_anon:0kB active_file:788kB inactive_file:1416kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15808kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:1492kB slab_unreclaimable:132kB kernel_stack:72kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 500 500 500
Normal free:17076kB min:2816kB low:3520kB high:4224kB active_anon:6780kB inactive_anon:8796kB active_file:224104kB inactive_file:198896kB unevictable:4548kB isolated(anon):0kB isolated(file):0kB present:512064kB mlocked:4548kB dirty:64kB writeback:0kB mapped:10044kB shmem:16kB slab_reclaimable:19676kB slab_unreclaimable:7836kB kernel_stack:824kB pagetables:1164kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 251*4kB 142*8kB 79*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3404kB
Normal: 4269*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 17076kB
108423 total pagecache pages
1226 pages in swap cache
Swap cache stats: add 86205, delete 84979, find 55713/67312
Free swap  = 251912kB
Total swap = 263164kB
133104 pages RAM
0 pages HighMem
5833 pages reserved
64798 pages shared
66511 pages non-shared

_________________
Rgds

Stephen

(Linux user since kernel version 0.11)


Posted: Tue Sep 27, 2011 7:13 am
Linode Staff

Joined: Fri Jan 29, 2010 12:28 pm
Posts: 8
sweh wrote:
theckman wrote:
Try switching to our Latest 3.0 (3.0.4-linode38) kernel. We recently deployed a new kernel that should help with this.

Did not fix the problem.

Yes and no. It fixed the panic problem that you were having. Now that that has been resolved your Linode is OOMing instead like it should. Tune up your memory usage a bit and you should be good to go.


Posted: Tue Sep 27, 2011 8:14 am
Senior Member

Joined: Sat Aug 30, 2008 1:55 pm
Posts: 1739
Location: Rochester, New York
That doesn't look like an OOM:

Code:
Free swap  = 251912kB 

_________________
Code:
/* TODO: need to add signature to posts */


Posted: Tue Sep 27, 2011 9:11 am
Linode Staff

Joined: Fri Jan 29, 2010 12:28 pm
Posts: 8
Good catch HoopyCat. Arg, and I was so glad to be done with this too. Now to stare at it harder.

--

The body of the trace looks similar to an issue we saw with the amount of RAM the kernel was reserving. What does "sysctl vm.min_free_kbytes" show? Is it less than 16384? If so that was supposed to be fixed already. If not that means more digging.
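
For reference, the check being asked for here is easy to script. A minimal Python sketch, assuming you feed it the one-line output of `sysctl vm.min_free_kbytes`. The function names are hypothetical, and the 16384 default is simply the figure quoted in the post above, not a recommended setting.

```python
def parse_sysctl_line(line):
    """Split a 'name = value' line as printed by sysctl into (name, int_value)."""
    name, sep, value = line.partition("=")
    if not sep:
        raise ValueError(f"not a 'name = value' line: {line!r}")
    return name.strip(), int(value.strip())

def is_below(line, threshold_kb=16384):
    """True if the value on the sysctl output line is under threshold_kb."""
    _, value_kb = parse_sysctl_line(line)
    return value_kb < threshold_kb

print(is_below("vm.min_free_kbytes = 4096"))
```

On Linux the same number can also be read directly from `/proc/sys/vm/min_free_kbytes`.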


Posted: Tue Sep 27, 2011 7:06 pm
Senior Member

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
psandin wrote:
sweh wrote:
theckman wrote:
Try switching to our Latest 3.0 (3.0.4-linode38) kernel. We recently deployed a new kernel that should help with this.


Did not fix the problem.

Yes and no. It fixed the panic problem that you were having. Now that that has been resolved your Linode is OOMing instead like it should. Tune up your memory usage a bit and you should be good to go.

I think you're confusing me with someone else. I was not panic()ing and wasn't OOMing. The kernel was giving page allocation faults inside the kernel itself whenever I rsync (over ssh) my /BACKUP directory back home.

_________________
Rgds

Stephen

(Linux user since kernel version 0.11)


Posted: Tue Sep 27, 2011 7:14 pm
Senior Member

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
psandin wrote:
The body of the trace looks similar to an issue we saw with the amount of RAM the kernel was reserving. What does "sysctl vm.min_free_kbytes" show? Is it less than 16384? If so that was supposed to be fixed already. If not that means more digging.

% sysctl vm.min_free_kbytes
vm.min_free_kbytes = 2906

_________________
Rgds

Stephen

(Linux user since kernel version 0.11)


Posted: Tue Sep 27, 2011 7:54 pm
Linode Staff

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
sweh - reboot into Latest 3.0 if you haven't already, and try bumping vm.min_free_kbytes to 4096 or more. Let us know if that fixes it.

Thanks,
-Chris


Posted: Tue Sep 27, 2011 8:13 pm
Senior Member

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
Code:
# uname -a
Linux linode 3.0.4-linode38 #1 SMP Thu Sep 22 14:59:08 EDT 2011 i686 i686 i386 GNU/Linux
# tail -1 /etc/sysctl.conf
vm.min_free_kbytes = 4096
# sysctl vm.min_free_kbytes
vm.min_free_kbytes = 4096
#


Machine rebooted. We'll see if it shows up in the next few days!

I wonder what's different about the Linode kernels; on my Panix v-colo (Xen-based, equivalent to a Linode 512) the value is 2882, and this problem never seems to show.

_________________
Rgds

Stephen

(Linux user since kernel version 0.11)
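
Side note on the `tail -1 /etc/sysctl.conf` check above: it works when the setting was appended last, but if the key appears more than once, the later line is the one that sticks, because the file is applied top to bottom. A small Python sketch that finds the assignment that wins. `effective_setting` is a hypothetical helper, and it only looks at the text of one file, not at whatever value the running kernel currently has.

```python
def effective_setting(conf_text, key):
    """Return the last value assigned to key in sysctl.conf-style text, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Blank lines and lines starting with '#' or ';' are comments in sysctl.conf.
        if not line or line.startswith(("#", ";")):
            continue
        name, sep, val = line.partition("=")
        if sep and name.strip() == key:
            value = val.strip()  # later assignments override earlier ones
    return value

SAMPLE = """\
# example file contents, invented for illustration
vm.min_free_kbytes = 2906
kernel.sysrq = 0
vm.min_free_kbytes = 4096
"""

print(effective_setting(SAMPLE, "vm.min_free_kbytes"))
```

Comparing that against the live `sysctl vm.min_free_kbytes` output, as done in the post above, confirms the file and the running kernel agree.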

