Linode Forum
Linode Community Forums
 FAQFAQ    SearchSearch    MembersMembers      Register Register 
 LoginLogin [ Anonymous ] 
Post new topic  Reply to topic
Author Message
PostPosted: Wed Jan 14, 2009 12:33 pm 
Offline
Linode Staff
User avatar

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
Our round of 2.6.27.x testing went pretty well. There was only one new issue discovered, which two people managed to reproduce, however, so far we haven't been able to hit it in 2.6.28.-

Some of you noticed that the 2.6.28 kernels have been available for a few weeks, and have been running them. We received two reports of this problem. As per Jeremy Fitzhardinge's (pv_ops kernel guru) suggestions, I've enabled some debugging stuff and would appreciate some wider testing of these babies.

Thanks!
-Chris


Top
   
 Post subject:
PostPosted: Fri Jan 16, 2009 2:10 pm 
Offline
Senior Newbie

Joined: Mon Oct 23, 2006 12:21 am
Posts: 6
I started using the 2.6.28 kernel on January 5. Immediately, my network usage went up dramatically, with no increase in traffic for the sites I host.

Does this usually happen?


Top
   
PostPosted: Fri Jan 23, 2009 7:50 am 
Offline
Senior Member
User avatar

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
My linode froze using 2.6.27 yesterday for no good reason I could see, so I rebooted into 2.6.28 and in under 14 hours later the kernel crashes

Code:
kernel BUG at drivers/block/xen-blkfront.c:243!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/class/net/ip6tnl0/type
Modules linked in:

Pid: 18889, comm: dump Not tainted (2.6.28-linode15 #2)
EIP: 0061:[<c03ee830>] EFLAGS: 00010046 CPU: 1
EIP is at do_blkif_request+0x2e0/0x360
EAX: 00000001 EBX: 00000000 ECX: c7055740 EDX: d054a650
ESI: d596c144 EDI: d596c184 EBP: 00000325 ESP: c147fc78
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process dump (pid: 18889, ti=c147e000 task=d5b01690 task.ti=c147e000)
Stack:
 00000001 d596c144 00000144 d58fb9e0 d5926000 c9fdba84 00000000 0000000a
 d5926000 00000004 00000003 d596c000 00000000 c7055740 d6132edc ffffffff
 d58fb9e0 d58fb9e0 00000000 00000004 c03a7503 cc471d0c c03a7a42 00000000
Call Trace:
 [<c03a7503>] __generic_unplug_device+0x23/0x30
 [<c03a7a42>] __make_request+0xe2/0x3e0
 [<c0561b15>] _spin_lock+0x5/0x10
 [<c01827d6>] cache_alloc_refill+0x186/0x520
 [<c03a6566>] generic_make_request+0xe6/0x230
 [<c0105c53>] xen_restore_fl_direct_end+0x0/0x1
 [<c01825f7>] kmem_cache_alloc+0x57/0xb0
 [<c016141d>] mempool_alloc+0x2d/0xe0
 [<c03a78d3>] submit_bio+0x63/0xf0
 [<c01a72bd>] bvec_alloc_bs+0x8d/0x140
 [<c01a758b>] bio_alloc_bioset+0x6b/0xe0
 [<c01a389a>] submit_bh+0xba/0xf0
 [<c01a6a05>] block_read_full_page+0x255/0x2d0
 [<c01a9d10>] blkdev_get_block+0x0/0x60
 [<c0166757>] __do_page_cache_readahead+0x1e7/0x220
 [<c0160767>] generic_file_aio_read+0x227/0x670
 [<c0185c65>] do_sync_read+0xd5/0x120
 [<c0105c53>] xen_restore_fl_direct_end+0x0/0x1
 [<c0561ca3>] _spin_unlock_irqrestore+0x13/0x20
 [<c013c3e0>] autoremove_wake_function+0x0/0x40
 [<c012d78b>] do_softirq+0x4b/0x60
 [<c036c5ac>] security_file_permission+0xc/0x10
 [<c0185d0e>] rw_verify_area+0x5e/0xd0
 [<c0185b90>] do_sync_read+0x0/0x120
 [<c018659d>] vfs_read+0x9d/0x160
 [<c01869b1>] sys_read+0x41/0x70
 [<c010840a>] syscall_call+0x7/0xb
Code: 2c 8d 54 03 40 8d 44 0e 54 b9 6c 00 00 00 e8 98 a5 fc ff 8b 44 24 3c e8 ff 92 fd ff 83 44 24 18 01 e9 40 fd ff ff 0f 0b eb fe 90 <0f> 0b eb fe 8b 44 24 20 ba 40 e5 3e c0 8b 4c 24 20 c7 04 24 0b
EIP: [<c03ee830>] do_blkif_request+0x2e0/0x360 SS:ESP 0069:c147fc78
---[ end trace 89f444916269aa8d ]---



ETA: It appears I was doing a "dump 5" of one of the filesystems at the time of the crash (at least the nightly backup didn't complete).

_________________
Rgds
Stephen
(Linux user since kernel version 0.11)


Top
   
 Post subject:
PostPosted: Fri Jan 23, 2009 10:33 am 
Offline
Senior Member
User avatar

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
:cry: So I repeated the "dump" and the kernel crashed again in the same way. I went back to the .27 kernel and the dump completed without error.

So .27 hangs occaisionally for no good reason; .28 crashes.

Just as a side comment; I've found my linode reliability a lot worse after the move to Xen. Do you still offer UML based linodes? I know they're not as fast, but they seem to be a lot more reliable!

_________________
Rgds

Stephen

(Linux user since kernel version 0.11)


Top
   
 Post subject:
PostPosted: Fri Jan 23, 2009 3:51 pm 
Offline
Senior Member

Joined: Fri May 02, 2008 8:44 pm
Posts: 1121
Been using .27 for about a month... no problem yet. My linode crashed once but it was my fault.

But why does my Linode 360 only report 349M of RAM when running the .27 kernel? Gimme back my 11M!!!! :D


Top
   
 Post subject:
PostPosted: Fri Jan 23, 2009 4:34 pm 
Offline
Linode Staff
User avatar

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
@sweh - I forwarded your crash to xen-devel: http://marc.info/?l=xen-devel&m=123272226620226&w=2

@hybinet - That's something we've noticed with the pv_ops kernels, but I haven't mentioned it to anybody yet. I'm going to make a xen-devel report soon...

-Chris


Top
   
 Post subject:
PostPosted: Fri Jan 23, 2009 8:04 pm 
Offline
Senior Member

Joined: Fri Dec 07, 2007 1:37 am
Posts: 385
Location: NC, USA
Created a profile with 2.6.28-x86_64-linode4 selected, but when I boot it I end up with 2.6.28-x86_64-linode3 kernel.
Code:
 $ uname -a
Linux mydomain.com 2.6.28-x86_64-linode3 #2 SMP Wed Jan 14 10:19:10 EST 2009 x86_64 Intel(R) Xeon(R) CPU L5420 @ 2.50GHz GenuineIntel GNU/Linux

Since the modules.dep helper thinks it is -linode4, the modules.dep gets put in the wrong place.
EDIT: Linode is on atlanta46 if that is important.


Top
   
 Post subject:
PostPosted: Fri Jan 30, 2009 7:16 am 
Offline
Junior Member
User avatar

Joined: Sat Feb 21, 2004 8:59 am
Posts: 27
Updating the kernel to stock Debian/Lenny seems to solve the problem for me:
Code:
$ uname -a
Linux linode 2.6.26-1-xen-686 #1 SMP Sat Jan 10 22:52:47 UTC 2009 i686 GNU/Linux

It works perfectly for a week, now.

I've updated the wiki with details required to run stock Debian kernel:
http://www.linode.com/wiki/index.php/PV-GRUB


Top
   
 Post subject:
PostPosted: Wed Feb 04, 2009 11:54 am 
Offline
Linode Staff
User avatar

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
I just pushed out 2.6.28.3-linode17 and 2.6.28.3-x86_64-linode5. This should fix the "kernel BUG at drivers/block/xen-blkfront.c:243!" -- @sweh, can you confirm?

-Chris


Top
   
 Post subject:
PostPosted: Wed Feb 04, 2009 12:11 pm 
Offline
Senior Member
User avatar

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
caker wrote:
I just pushed out 2.6.28.3-linode17 and 2.6.28.3-x86_64-linode5. This should fix the "kernel BUG at drivers/block/xen-blkfront.c:243!" -- @sweh, can you confirm?

I'll try and give it a go tonight and see what happens. (Currently using 2.6.27.4-linode14)

_________________
Rgds

Stephen

(Linux user since kernel version 0.11)


Top
   
 Post subject:
PostPosted: Wed Feb 04, 2009 5:02 pm 
Offline
Senior Member
User avatar

Joined: Sat Mar 24, 2007 6:09 pm
Posts: 59
Location: South Africa
I moved to 2.6.28-linode15 after timing drift on the 2.6.27 series. I can confirm that, after 24 hrs, the 2.6.28-linode15 kernel shows excellent time stability. It hasn't drifted by even 20ms.

I added ntpd again and there are no steps - just remains stable. The rest of the VM sees light load for throughput testing and a few web domains. No problems so far.

Will give 2.6.28.3-linode17 a go.

--deckert


Top
   
 Post subject:
PostPosted: Wed Feb 04, 2009 11:47 pm 
Offline
Senior Member
User avatar

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
sweh wrote:
caker wrote:
I just pushed out 2.6.28.3-linode17 and 2.6.28.3-x86_64-linode5. This should fix the "kernel BUG at drivers/block/xen-blkfront.c:243!" -- @sweh, can you confirm?

I'll try and give it a go tonight and see what happens. (Currently using 2.6.27.4-linode14)

So far, so good. The backup that consistently crashed the previous kernel completed with the linode17 kernel. Fingers crossed!

_________________
Rgds

Stephen

(Linux user since kernel version 0.11)


Top
   
 Post subject:
PostPosted: Mon Feb 16, 2009 12:06 pm 
Offline
Senior Member

Joined: Sat Nov 15, 2008 4:24 pm
Posts: 55
Location: Czech Republic
I managed to crash 2.6.28.3-x86_64-linode5 during recursive chown Gentoo portage on loopback device formatted to reiserfs twice in row. Since it is production machine, I'd like not to test it again without better ways to describe the issue.

I did something like this:
Code:
mv /usr/portage /usr/portage.tmp
losetup /dev/loop0 /usr/portage.img
mkreiserfs /dev/loop0
mount /dev/loop0 /usr/portage
cp -a /usr/portage.tmp/* /usr/portage
# Everything worked fine so far
chown -R portage:portage /usr/portage
And whole linode crashed. There was kernel panic report on console screen but unfortunatelly was not complete and I did not discovered it in logs.

Can somebody try similar procedure or tell me what I am doing wrong or tell me how to provide more information?


Last edited by drake127 on Mon Feb 16, 2009 12:23 pm, edited 1 time in total.

Top
   
 Post subject:
PostPosted: Mon Feb 16, 2009 12:13 pm 
Offline
Linode Staff
User avatar

Joined: Tue Apr 15, 2003 6:24 pm
Posts: 3090
Website: http://www.linode.com/
Location: Galloway, NJ
The "logview" command via Lish will print 100 lines or so of the *previous* boot's console log -- that should be enough to capture a panic.

http://www.linode.com/wiki/index.php/LISH

-Chris


Top
   
 Post subject:
PostPosted: Mon Feb 16, 2009 12:28 pm 
Offline
Senior Member

Joined: Sat Nov 15, 2008 4:24 pm
Posts: 55
Location: Czech Republic
Don't ask me what it is all about but I hope it will help. Now linode crashed during simple copying portage tree (a lot of small files but I guess you know that one).

Edit: Ah, it seems to be similar (or same) bug as Sweh's one. Maybe only x86 branch has been fixed?

Code:
------------[ cut here ]------------
kernel BUG at drivers/block/xen-blkfront.c:243!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/uevent_seqnum
CPU 0
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.28.3-x86_64-linode5 #1
RIP: e030:[<ffffffff8054e2e0>]  [<ffffffff8054e2e0>] do_blkif_request+0x3e0/0x470
RSP: e02b:ffffffff80998da8  EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8800112582c0 RCX: ffff8800112582c0
RDX: ffff88000c416520 RSI: 0000000000000000 RDI: 000000000000016e
RBP: 000000000000016e R08: 0000000000000b70 R09: 0000000100b1ea23
R10: 0000000100b1ea22 R11: ffffffff8027a0e0 R12: 0000000000000004
R13: ffff880005836f40 R14: ffff88001547c270 R15: 0000000000000000
FS:  00007f864c7146f0(0000) GS:ffffffff809a1000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000001a9f048 CR3: 00000000155da000 CR4: 0000000000002620
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff808e2000, task ffffffff80852340)
Stack:
 ffff8800165e2ae8 0000000000000208 0000000000000050 0000000000000005
 ffff880015419430 ffff880015a12000 ffff880015abbcd8 0000000000000000
 0000000000000014 ffff880015a12000 0000000000000001 ffff88001547c000
Call Trace:
 <IRQ> <0> [<ffffffff804fd285>] ? blk_invoke_request_fn+0xa5/0x110
 [<ffffffff8054e388>] ? kick_pending_request_queues+0x18/0x30
 [<ffffffff8054e585>] ? blkif_interrupt+0x1e5/0x230
 [<ffffffff80271eb9>] ? handle_IRQ_event+0x39/0x80
 [<ffffffff80274278>] ? handle_level_irq+0x98/0x120
 [<ffffffff80214595>] ? do_IRQ+0xb5/0x110
 [<ffffffff80524595>] ? xen_evtchn_do_upcall+0xf5/0x140
 [<ffffffff806e460e>] ? xen_do_hypervisor_callback+0x1e/0x30
 <EOI> <0> [<ffffffff802093aa>] ? _stext+0x3aa/0x1000
 [<ffffffff802093aa>] ? _stext+0x3aa/0x1000
 [<ffffffff8054df00>] ? do_blkif_request+0x0/0x470
 [<ffffffff8020de4c>] ? xen_safe_halt+0xc/0x20
 [<ffffffff8020c1e6>] ? xen_idle+0x26/0x50
 [<ffffffff8020ff85>] ? cpu_idle+0x45/0x90
Code: 48 8b 42 60 48 89 41 68 48 8b 42 68 48 89 41 70 8b 7c 24 74 e8 a2 52 fd ff 83 44 24 3c 01 e9 4c fc ff ff 0f 0b eb fe 0f 1f 40 00 <0f> 0b eb fe 48 8b 7c 24 48 48 8b 54 24 48 b9 0b 00 00 00 48 c7
RIP  [<ffffffff8054e2e0>] do_blkif_request+0x3e0/0x470
 RSP <ffffffff80998da8>
Kernel panic - not syncing: Fatal exception in interrupt
------------[ cut here ]------------
WARNING: at kernel/smp.c:333 smp_call_function_mask+0x246/0x250()
Modules linked in:
Pid: 0, comm: swapper Tainted: G      D    2.6.28.3-x86_64-linode5 #1
Call Trace:
 <IRQ>  [<ffffffff8023bc04>] warn_on_slowpath+0x64/0xa0
 [<ffffffff80255946>] up+0x16/0x50
 [<ffffffff806e3e04>] _spin_unlock_irqrestore+0x14/0x20
 [<ffffffff8023c3ad>] release_console_sem+0x18d/0x1d0
 [<ffffffff8023c7e2>] vprintk+0x192/0x3c0
 [<ffffffff806e3dde>] _spin_lock_irqsave+0x2e/0x40
 [<ffffffff8025fb36>] smp_call_function_mask+0x246/0x250
 [<ffffffff8023ca5e>] printk+0x4e/0x60
 [<ffffffff8023bcde>] panic+0x8e/0x150
 [<ffffffff806e3e04>] _spin_unlock_irqrestore+0x14/0x20
 [<ffffffff80246336>] __mod_timer+0xb6/0xd0
 [<ffffffff8053ac39>] set_palette+0x9/0x50
 [<ffffffff8053c718>] do_unblank_screen+0xb8/0x140
 [<ffffffff8053a0fd>] set_cursor+0x3d/0x70
 [<ffffffff80214710>] oops_end+0xb0/0xc0
 [<ffffffff80213674>] do_invalid_op+0x84/0xa0
 [<ffffffff8054e2e0>] do_blkif_request+0x3e0/0x470
 [<ffffffff8050b31a>] __next_cpu+0x1a/0x30
 [<ffffffff80231703>] find_busiest_group+0x193/0x810
 [<ffffffff806e40ea>] error_exit+0x0/0x70
 [<ffffffff8027a0e0>] mempool_free_slab+0x0/0x10
 [<ffffffff8054e2e0>] do_blkif_request+0x3e0/0x470
 [<ffffffff8054e111>] do_blkif_request+0x211/0x470
 [<ffffffff804fd285>] blk_invoke_request_fn+0xa5/0x110
 [<ffffffff8054e388>] kick_pending_request_queues+0x18/0x30
 [<ffffffff8054e585>] blkif_interrupt+0x1e5/0x230
 [<ffffffff80271eb9>] handle_IRQ_event+0x39/0x80
 [<ffffffff80274278>] handle_level_irq+0x98/0x120
 [<ffffffff80214595>] do_IRQ+0xb5/0x110
 [<ffffffff80524595>] xen_evtchn_do_upcall+0xf5/0x140
 [<ffffffff806e460e>] xen_do_hypervisor_callback+0x1e/0x30
 <EOI>  [<ffffffff802093aa>] _stext+0x3aa/0x1000
 [<ffffffff802093aa>] _stext+0x3aa/0x1000
 [<ffffffff8054df00>] do_blkif_request+0x0/0x470
 [<ffffffff8020de4c>] xen_safe_halt+0xc/0x20
 [<ffffffff8020c1e6>] xen_idle+0x26/0x50
 [<ffffffff8020ff85>] cpu_idle+0x45/0x90
---[ end trace f9e449f10704c37e ]---


Top
   
Display posts from previous:  Sort by  
Post new topic  Reply to topic


Who is online

Users browsing this forum: No registered users and 5 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum

Search for:
Jump to:  
RSS

Powered by phpBB® Forum Software © phpBB Group