Julia Lawall [Wed, 26 May 2010 05:54:21 +0000 (05:54 +0000)]
net/rds: Add missing mutex_unlock
Add a mutex_unlock missing on the error path. In each case, whenever the
label out is reached from elsewhere in the function, mutex is not locked.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression E1;
@@

* mutex_lock(E1);
  <+... when != E1
  if (...) {
    ... when != E1
*   return ...;
  }
  ...+>
* mutex_unlock(E1);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Reviewed-by: Zach Brown <zach.brown@oracle.com> Acked-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
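For readers unfamiliar with this class of bug, here is a minimal user-space sketch of the pattern being fixed. It is not the net/rds code: the lock, counter, capacity and function name are all hypothetical, and pthread mutexes stand in for the kernel mutex API. As in the patch, the label out is also reached from a path where the lock is not held, so the unlock has to sit on the error path itself.

```c
#include <errno.h>
#include <pthread.h>

#define TABLE_CAPACITY 8   /* hypothetical limit, for illustration only */

/* Hypothetical state guarded by a lock; not taken from net/rds. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static int table_used;

/* Reserve n slots; returns 0 on success or a negative errno value. */
int reserve_slots(int n)
{
    int ret = 0;

    if (n <= 0) {
        ret = -EINVAL;
        goto out;                           /* lock not taken on this path */
    }

    pthread_mutex_lock(&table_lock);
    if (table_used + n > TABLE_CAPACITY) {
        ret = -ENOSPC;
        pthread_mutex_unlock(&table_lock);  /* the unlock that is easy to miss */
        goto out;
    }
    table_used += n;
    pthread_mutex_unlock(&table_lock);

out:
    return ret;
}
```

Without the unlock on the -ENOSPC path, the next caller of reserve_slots() would block forever on table_lock.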
Add a spin_unlock missing on the error path. The return value of write_reg
seems to be completely ignored, so it seems that the lock should be
released in every case.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression E1;
@@

* spin_lock(E1,...);
  <+... when != E1
  if (...) {
    ... when != E1
*   return ...;
  }
  ...+>
* spin_unlock(E1,...);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Ware [Sat, 29 May 2010 07:16:28 +0000 (00:16 -0700)]
fs_enet: Adjust BDs after tx error
This patch fixes an occasional transmit lockup in the mac-fcc which
occurs after a tx error. The test scenario had the local port set
to autoneg and the other end fixed at 100FD, resulting in a large
number of late collisions.
According to the MPC8280RM 30.10.1.3 (also 8272RM 29.10.1.3), after
a tx error occurs, TBPTR may sometimes point beyond BDs still marked
as ready. This patch walks back through the BDs and points TBPTR to
the earliest one marked as ready.
Tested on a custom board with a MPC8280.
Signed-off-by: Mark Ware <mware@elphinstone.net> Signed-off-by: David S. Miller <davem@davemloft.net>
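The walk-back described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the ring length, the struct layout and the function name are hypothetical, and a plain bool stands in for the hardware ready bit. It is not the fs_enet code.

```c
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8   /* hypothetical ring length */

struct bd {
    bool ready;       /* stands in for the hardware "ready" status bit */
};

/*
 * Starting from the descriptor the controller points at after an error,
 * step backwards (with wrap-around) while the previous descriptor is
 * still marked ready, and return the index of the earliest such one.
 * The walk is bounded so a ring full of ready descriptors cannot loop
 * forever.
 */
size_t earliest_ready_bd(const struct bd ring[RING_SIZE], size_t hw_idx)
{
    size_t idx = hw_idx % RING_SIZE;
    size_t steps;

    for (steps = 0; steps < RING_SIZE - 1; steps++) {
        size_t prev = (idx + RING_SIZE - 1) % RING_SIZE;

        if (!ring[prev].ready)
            break;
        idx = prev;
    }
    return idx;
}
```

If no earlier descriptor is marked ready, the function returns the starting index unchanged, which corresponds to leaving the pointer where the controller put it.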
Brian Haley [Sat, 29 May 2010 06:02:35 +0000 (23:02 -0700)]
IPv6: fix Mobile IPv6 regression
Commit f4f914b5 (net: ipv6 bind to device issue) caused
a regression with Mobile IPv6 when it changed the meaning
of fl->oif to become a strict requirement of the route
lookup. Instead, only force strict mode when
sk->sk_bound_dev_if is set on the calling socket, getting
the intended behavior and fixing the regression.
Tested-by: Arnaud Ebalard <arno@natisbad.org> Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 28 May 2010 23:14:17 +0000 (16:14 -0700)]
Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
intel_idle: native hardware cpuidle driver for latest Intel processors
ACPI: acpi_idle: touch TS_POLLING only in the non-MWAIT case
acpi_pad: uses MONITOR/MWAIT, so it doesn't need to clear TS_POLLING
sched: clarify comment for TS_POLLING
ACPI: allow a native cpuidle driver to displace ACPI
cpuidle: make cpuidle_curr_driver static
cpuidle: add cpuidle_unregister_driver() error check
cpuidle: fail to register if !CONFIG_CPU_IDLE
ACPI: Don't let acpi_pad needlessly mark TSC unstable
acpi pad driver kind of aggressively marks TSC as unstable at init
time, on mwait capable and non X86_FEATURE_NONSTOP_TSC systems. This is
irrespective of whether pad driver is ever going to be used on the
system or deep C-states are supported/used. This will affect every user
who just happens to compile in (or get a kernel version which
compiles in) acpi pad driver.
Move mark_tsc_unstable() out of init to the actual idle invocation path
of the pad driver.
There is also another bug/missing_feature in the code that it does not
support 'always running apic timer' and switches to broadcast mode
unconditionally. Shaohua, can you take a look at that please.
Signed-off-by: Venkatesh Pallipadi <venki@google.com> Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Mon, 8 Mar 2010 19:07:30 +0000 (14:07 -0500)]
intel_idle: native hardware cpuidle driver for latest Intel processors
This EXPERIMENTAL driver supersedes acpi_idle on
Intel Atom Processors, Intel Core i3/i5/i7 Processors
and associated Intel Xeon processors.
It does not support the Intel Core2 processor or earlier.
For kernels configured with ACPI, CONFIG_INTEL_IDLE=y
allows intel_idle to probe before the ACPI processor driver.
Booting with "intel_idle.max_cstate=0" disables intel_idle
and the system will fall back on ACPI's "acpi_idle".
Typical Linux distributions load ACPI processor module early,
making CONFIG_INTEL_IDLE=m not easily useful on ACPI platforms.
intel_idle probes all processors at module_init time.
Processors that are hot-added later will be limited
to using C1 in idle.
Zhenyu Wang [Thu, 27 May 2010 02:26:43 +0000 (10:26 +0800)]
drm/i915: Unmask interrupt for render engine on Sandybridge
With split engines on Sandybridge, each engine has its own
interrupt control as well. This unmasks the interrupt to properly
enable the pipe control notify event for the render engine.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>
Zhenyu Wang [Thu, 27 May 2010 02:26:42 +0000 (10:26 +0800)]
drm/i915: Fix PIPE_CONTROL command on Sandybridge
Sandybridge (Gen6) has a new format for the PIPE_CONTROL command:
the flush and post-op control are in dword 1 now. This
changes the command length field to account for the difference
between Ironlake and Sandybridge.
I tried to test this with a noop request, issuing a PIPE_CONTROL
command for each sequence and tracking notify interrupts, which
seems to work fine. Hopefully we don't need a workaround like the
one on Ironlake for Sandybridge.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>
Chris Wilson [Thu, 27 May 2010 13:15:35 +0000 (14:15 +0100)]
drm/i915: Fix up address spaces in slow_kernel_write()
Since we now get_user_pages() outside of the mutex prior to performing
the copy, we kmap() the page inside the copy routine and so need to
perform an ordinary memcpy() and not copy_from_user().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>
Chris Wilson [Thu, 27 May 2010 13:15:34 +0000 (14:15 +0100)]
drm/i915: Use non-atomic kmap for slow copy paths
As we do not have a requirement to be atomic and avoid sleeping whilst
performing the slow copy for shmem based pread and pwrite, we can use
kmap instead, thus simplifying the code.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>
Chris Wilson [Thu, 27 May 2010 13:21:01 +0000 (14:21 +0100)]
drm/i915: Avoid moving from CPU domain during pwrite
We can avoid an early clflush when pwriting if we use the current CPU
write domain rather than moving the object to the GTT domain for the
purposes of the pwrite. This has the advantage of not flushing the
presumably hot data that we want to upload into the bo, and of ascribing
the clflush to the execution when profiling.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>
Len Brown [Mon, 24 May 2010 18:27:44 +0000 (14:27 -0400)]
ACPI: acpi_idle: touch TS_POLLING only in the non-MWAIT case
commit d306ebc28649b89877a22158fe0076f06cc46f60
(ACPI: Be in TS_POLLING state during mwait based C-state entry)
fixed an important power & performance issue where ACPI c2 and c3 C-states
were clearing TS_POLLING even when using MWAIT (ACPI_STATE_FFH).
That bug had been causing us to receive redundant scheduling interrupts
when we had already been woken up by MONITOR/MWAIT.
Following up on that...
In the MWAIT case, we don't have to subsequently
check need_resched(), as that check was there
for the TS_POLLING-clearing case.
Note that not only does the cpuidle calling function
already check need_resched() before calling us, the
low-level entry into monitor/mwait calls it twice --
guaranteeing that a write to the trigger address
can not go un-noticed.
Also, in this case, we don't have to set TS_POLLING
when we wake, because we never cleared it.
Signed-off-by: Len Brown <len.brown@intel.com> Acked-by: Venkatesh Pallipadi <venki@google.com>
Chris Wilson [Thu, 27 May 2010 12:18:19 +0000 (13:18 +0100)]
drm/i915: Remove spurious warning "Failure to install fence"
This particular warning is harmless as we emit it during the normal
pinning process where the batch buffer requires more fences than are
available without eviction. Only if we fail to evict enough fences does
this become a problem, so include the requested number of fences in the
ultimate *error* message.
v2: Remember to compile test even trial patches to remove warnings.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>
Chris Wilson [Thu, 27 May 2010 12:18:16 +0000 (13:18 +0100)]
drm/i915: Only print "nothing to do" debug message as required.
If the FBC is already disabled, then we do not even attempt to disable
FBC and so there is no point emitting a debug statement at that point,
having already emitted one saying why we are disabling FBC.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>
Chris Wilson [Thu, 27 May 2010 12:18:14 +0000 (13:18 +0100)]
drm/i915: Avoid nesting of domain changes when setting display plane
Nesting domain changes will cause confusion when trying to interpret the
tracepoints describing the sequence of changes for the object, as well
as obscuring the order of operations for the reader of the code.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>
Prarit Bhargava [Thu, 27 May 2010 18:41:20 +0000 (14:41 -0400)]
libertas: fix uninitialized variable warning
Fixes:
drivers/net/wireless/libertas/rx.c: In function process_rxed_802_11_packet:
drivers/net/wireless/libertas/rx.c:354: error: radiotap_hdr.flags may be used uninitialized in this function
Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Chris Wilson [Thu, 27 May 2010 12:18:13 +0000 (13:18 +0100)]
drm/i915: Hold the spinlock whilst resetting unpin_work along error path
Delay taking the mutex until we need to and ensure that we hold the
spinlock when resetting unpin_work on the error path. Also defer the
debugging print messages until after we have released the spinlock.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net>
ath9k: Remove ATH9K_TX_SW_ABORTED and introduce a bool for this purpose
Wrong buffer is checked for bf_tx_aborted field in ath_tx_num_badfrms(),
this may result in a rate scaling with wrong feedback (number
of unacked frames in this case). It is the last one in the chain
of buffers for an aggregate frame that should be checked.
Also it misses the initialization of this field in the buffer,
this may lead to a situation where we stop the sw retransmission
of failed subframes associated to this buffer.
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
ath_print in xmit.c should say "Reseting hardware"
instead of "Resetting HAL!" (since HAL is being phased out).
dmesg shows:
[ 8660.899624] ath: Failed to stop TX DMA in 100 msec after killing last frame
[ 8660.899676] ath: Unable to stop TxDMA. Reset HAL!
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits)
netlink: bug fix: wrong size was calculated for vfinfo list blob
netlink: bug fix: don't overrun skbs on vf_port dump
xt_tee: use skb_dst_drop()
netdev/fec: fix ifconfig eth0 down hang issue
cnic: Fix context memory init. on 5709.
drivers/net: Eliminate a NULL pointer dereference
drivers/net/hamradio: Eliminate a NULL pointer dereference
be2net: Patch removes redundant while statement in loop.
ipv6: Add GSO support on forwarding path
net: fix __neigh_event_send()
vhost: fix the memory leak which will happen when memory_access_ok fails
vhost-net: fix to check the return value of copy_to/from_user() correctly
vhost: fix to check the return value of copy_to/from_user() correctly
vhost: Fix host panic if ioctl called with wrong index
net: fix lock_sock_bh/unlock_sock_bh
net/iucv: Add missing spin_unlock
net: ll_temac: fix checksum offload logic
net: ll_temac: fix interrupt bug when interrupt 0 is used
sctp: dubious bitfields in sctp_transport
ipmr: off by one in __ipmr_fill_mroute()
...
Architectures that handle DMA-non-coherent memory need to set
ARCH_KMALLOC_MINALIGN to make sure that kmalloc'ed buffer is
DMA-safe: the buffer doesn't share a cache with the others.
Linus Torvalds [Fri, 28 May 2010 17:07:48 +0000 (10:07 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
remove detritus left by "mm: make read_cache_page synchronous"
fix fs/sysv s_dirt handling
fat: convert to use the new truncate convention.
ext2: convert to use the new truncate convention.
tmpfs: convert to use the new truncate convention
fs: convert simple fs to new truncate
kill spurious reference to vmtruncate
fs: introduce new truncate sequence
fs/super: fix kernel-doc warning
fs/minix: bugfix, number of indirect block ptrs per block depends on block size
rename the generic fsync implementations
drop unused dentry argument to ->fsync
fs: Add missing mutex_unlock
Fix racy use of anon_inode_getfd() in perf_event.c
get rid of the magic around f_count in aio
VFS: fix recent breakage of FS_REVAL_DOT
Revert "anon_inode: set S_IFREG on the anon_inode"
Al Viro [Fri, 28 May 2010 15:34:50 +0000 (11:34 -0400)]
remove detritus left by "mm: make read_cache_page synchronous"
Gets minix get_dir_page() in sync with its analogs. Back in 2007
Nick switched read_cache_page() and friends to sync behaviour
(i.e. they wait for the page to get unlocked, check if it's uptodate
and, if it isn't, return ERR_PTR(-EIO) instead) and removed the
duplicate logic from the callers. In the case of fs/minix/dir.c he
removed only half of that...
Toralf Förster [Wed, 26 May 2010 18:22:02 +0000 (20:22 +0200)]
kconfig: Hide error output in find command in streamline_config.pl
Finding the list of Makefiles in streamline-config should not report errors.
Also move the "chomp" to the @makefiles array instead of doing it in the
for loop. This is more efficient, and does not make it any less readable
by C programmers.
Signed-off-by: Toralf Foerster <toralf.foerster@gmx.de>
LKML-Reference: <201005262022.02928.toralf.foerster@gmx.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Scott Feldman [Fri, 28 May 2010 10:42:43 +0000 (03:42 -0700)]
netlink: bug fix: wrong size was calculated for vfinfo list blob
The wrong size was being calculated for vfinfo. In one case, it was
over-calculating by using nlmsg_total_size on attrs; in another case, it
was under-calculating by assuming ifla_vf_* structs are packed together,
but each struct is its own attr with its own hdr (and padding).
Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
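To make the under-calculation concrete, here is a small sketch of the arithmetic. The helpers are local stand-ins written for this note, not the kernel's netlink functions; they assume the usual attribute layout of a 4-byte header with each attribute padded to a 4-byte boundary. The payload sizes used with them are arbitrary examples, not the real ifla_vf_* struct sizes.

```c
#include <stddef.h>

/* Local stand-ins for the usual netlink attribute layout (assumed here):
 * a 4-byte header, each attribute padded to a 4-byte boundary. */
#define SKETCH_NLA_ALIGNTO 4u
#define SKETCH_NLA_HDRLEN  4u

static size_t sketch_nla_align(size_t len)
{
    return (len + SKETCH_NLA_ALIGNTO - 1) & ~(size_t)(SKETCH_NLA_ALIGNTO - 1);
}

/* Space one attribute occupies: header plus payload, rounded up. */
size_t sketch_nla_total_size(size_t payload)
{
    return sketch_nla_align(SKETCH_NLA_HDRLEN + payload);
}

/* Under-estimate: treats the per-VF structs as packed back to back. */
size_t vf_size_packed(const size_t *payloads, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += payloads[i];
    return total;
}

/* Each struct is its own attribute, so count header and padding for each. */
size_t vf_size_per_attr(const size_t *payloads, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += sketch_nla_total_size(payloads[i]);
    return total;
}
```

The gap between the two totals grows with the number of attributes, which is why a dump sized with the packed estimate can run out of room.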
Scott Feldman [Fri, 28 May 2010 10:42:18 +0000 (03:42 -0700)]
netlink: bug fix: don't overrun skbs on vf_port dump
Noticed by Patrick McHardy: was continuing to fill skb after a
nla_put_failure, ignoring the size calculated by upper layer. Now,
return -EMSGSIZE on any overruns, but also allow netdev to
fail ndo_get_vf_port with error other than -EMSGSIZE, thus unwinding
nest.
Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 28 May 2010 10:41:17 +0000 (03:41 -0700)]
xt_tee: use skb_dst_drop()
After commit 7fee226a (net: add a noref bit on skb dst), it's wrong to
use dst_release(skb_dst(skb)), since we could decrement a refcount
while the skb dst was not refcounted.
We should use skb_dst_drop(skb) instead.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
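A toy user-space model of why the unconditional release is wrong once a "noref" bit exists. All names below (toy_skb, toy_dst, TOY_DST_NOREF and the helpers) are hypothetical and only mimic the idea of tagging the low bit of the stored pointer; this is not the kernel implementation.

```c
#include <stdint.h>

#define TOY_DST_NOREF 1UL   /* low bit set: pointer stored without a reference */

struct toy_dst { int refcnt; };
struct toy_skb { uintptr_t dst_word; };

/* Strip the tag bit to recover the real pointer. */
static struct toy_dst *toy_skb_dst(const struct toy_skb *skb)
{
    return (struct toy_dst *)(skb->dst_word & ~(uintptr_t)TOY_DST_NOREF);
}

/* Attach a dst, either taking a reference or merely borrowing it. */
void toy_skb_dst_set(struct toy_skb *skb, struct toy_dst *dst, int take_ref)
{
    if (take_ref) {
        dst->refcnt++;
        skb->dst_word = (uintptr_t)dst;
    } else {
        skb->dst_word = (uintptr_t)dst | TOY_DST_NOREF;
    }
}

/* Drop the dst: release the reference only if one was actually taken. */
void toy_skb_dst_drop(struct toy_skb *skb)
{
    if (skb->dst_word) {
        if (!(skb->dst_word & TOY_DST_NOREF))
            toy_skb_dst(skb)->refcnt--;
        skb->dst_word = 0;
    }
}
```

An unconditional decrement on the borrowed case would take the count below what the real owners hold, which is the problem the commit message describes.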
Bryan Wu [Fri, 28 May 2010 10:40:39 +0000 (03:40 -0700)]
netdev/fec: fix ifconfig eth0 down hang issue
BugLink: http://bugs.launchpad.net/bugs/559065
In the fec open/close functions, we need to call phy_connect and
phy_disconnect before we start/stop the phy. Otherwise the system will hang.
Only call fec_enet_mii_probe() in open function, because the first open
action will cause NULL pointer error.
Signed-off-by: Bryan Wu <bryan.wu@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the structure to avoid the following warning:
WARNING: drivers/serial/built-in.o(.data+0x534): Section mismatch in reference from the variable s5p_serial_drv to the function .devexit.text:s3c24xx_serial_remove()
The variable s5p_serial_drv references
the function __devexit s3c24xx_serial_remove()
If the reference is valid then annotate the
variable with __exit* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com> Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Thomas Abraham [Fri, 28 May 2010 02:41:16 +0000 (11:41 +0900)]
ARM: S5P: Register clk_xusbxti clock for hsotg driver
The clk_xusbxti clock is added to the list of clocks to be
registered during boot time clock registration.
Signed-off-by: Thomas Abraham <thomas.ab@samsung.com> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
[ben-linux@fluff.org: edited title] Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Michael Chan [Thu, 27 May 2010 23:31:41 +0000 (16:31 -0700)]
cnic: Fix context memory init. on 5709.
We need to zero context memory on 5709 in the function cnic_init_context().
Without this, iscsid restart on 5709 will not work because of stale data.
TX context blocks should not be initialized by cnic_init_context() because
of the special remapping on 5709.
Update version to 2.1.2.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 27 May 2010 23:14:30 +0000 (16:14 -0700)]
ipv6: Add GSO support on forwarding path
Currently we disallow GSO packets on the IPv6 forward path.
This patch fixes this.
Note that I discovered that our existing GSO MTU checks (e.g.,
IPv4 forwarding) are buggy in that they skip the check altogether,
when they really should be checking gso_size + header instead.
I have also been lazy here in that I haven't bothered to segment
the GSO packet by hand before generating an ICMP message. Someone
should add that to be 100% correct.
Reported-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
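The check the note above asks for might look like the sketch below. The struct and function are hypothetical and ignore everything except sizes; in particular, treating hdr_len as the headers repeated on every segment is an assumption of this sketch, not something the patch states.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_pkt {
    size_t len;        /* total length of the (possibly aggregated) packet */
    size_t hdr_len;    /* headers assumed to be repeated on every segment */
    size_t gso_size;   /* payload bytes per segment; 0 means not GSO */
};

/* True if the packet, or each segment it will be split into, fits the MTU. */
bool toy_fits_mtu(const struct toy_pkt *p, size_t mtu)
{
    if (p->gso_size == 0)
        return p->len <= mtu;

    /* For GSO, test one segment's size instead of skipping the check. */
    return p->hdr_len + p->gso_size <= mtu;
}
```

A large aggregated packet can therefore pass when its individual segments fit, and still be rejected when they would not.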
Eric Dumazet [Thu, 27 May 2010 23:09:39 +0000 (16:09 -0700)]
net: fix __neigh_event_send()
commit 7fee226ad23 (net: add a noref bit on skb dst) missed one spot
where an skb is enqueued, with a possibly not refcounted dst entry.
__neigh_event_send() inserts skb into arp_queue, so we must make sure
dst entry is refcounted, or dst entry can be freed by garbage collector
after caller exits from rcu protected section.
Reported-by: Ingo Molnar <mingo@elte.hu> Tested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andreas Herrmann [Fri, 28 May 2010 07:57:12 +0000 (09:57 +0200)]
ALSA: hda: Add support for another Lenovo ThinkPad Edge in conexant codec
On a Thinkpad Edge 13 "01972NG" I had the problem that speakers played
sound although headphones were plugged in. Using model=ideapad with
latest alsa-git kernel fixed this. So adding this quirk to use ideapad
for another Thinkpad Edge variant seems sensible.
Cc: Jerone Young <jerone.young@canonical.com> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Daniel T Chen [Thu, 27 May 2010 22:32:18 +0000 (18:32 -0400)]
ALSA: hda: Use LPIB for Sony VPCS11V9E
BugLink: https://launchpad.net/bugs/586347
Symptom: On the Sony VPCS11V9E, using GStreamer-based applications with
PulseAudio in Ubuntu 10.04 LTS results in stuttering audio. It appears
to worsen with increased I/O.
Test case: use Rhythmbox under increased I/O pressure. This symptom is
reproducible in the current daily stable alsa-driver snapshots (at least
up until 21 May 2010; later snapshots fail to build from source due to
missing preprocessor directives when compiled against 2.6.32).
Resolution: add SSID for this machine to the position_fix quirk table,
explicitly specifying the LPIB method.
Reported-and-Tested-By: Lauri Kainulainen <lauri@sokkelo.net> Cc: <stable@kernel.org> Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Daniel Mack [Thu, 27 May 2010 18:15:14 +0000 (20:15 +0200)]
ALSA: usb-audio: fix feature unit parser for UAC2
Fix a small off-by-one bug which causes the feature unit to announce a
wrong number of channels. This leads to illegal requests sent to the
firmware eventually.
Signed-off-by: Daniel Mack <daniel@caiaq.de> Signed-off-by: Takashi Iwai <tiwai@suse.de>
npiggin@suse.de [Wed, 26 May 2010 15:05:37 +0000 (01:05 +1000)]
ext2: convert to use the new truncate convention.
I also have commented a possible bug in existing ext2 code, marked with XXX.
Cc: linux-ext4@vger.kernel.org Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
npiggin@suse.de [Wed, 26 May 2010 15:05:34 +0000 (01:05 +1000)]
kill spurious reference to vmtruncate
Lots of filesystems call vmtruncate despite not implementing the old
->truncate method. Switch them to use simple_setsize and add some
comments about the truncate code where it seems fitting.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
npiggin@suse.de [Wed, 26 May 2010 15:05:33 +0000 (01:05 +1000)]
fs: introduce new truncate sequence
Introduce a new truncate calling sequence into fs/mm subsystems. Rather than
setattr > vmtruncate > truncate, have filesystems call their truncate sequence
from ->setattr if filesystem specific operations are required. vmtruncate is
deprecated, and truncate_pagecache and inode_newsize_ok helpers introduced
previously should be used.
simple_setattr is introduced for simple in-ram filesystems to implement
the new truncate sequence. Eventually all filesystems should be converted
to implement a setattr, and the default code in notify_change should go
away.
simple_setsize is also introduced to perform just the ATTR_SIZE portion
of simple_setattr (ie. changing i_size and trimming pagecache).
To implement the new truncate sequence:
- filesystem specific manipulations (eg freeing blocks) must be done in
the setattr method rather than ->truncate.
- vmtruncate can not be used by core code to trim blocks past i_size in
the event of write failure after allocation, so this must be performed
in the fs code.
- convert usage of helpers block_write_begin, nobh_write_begin,
cont_write_begin, and *blockdev_direct_IO* to use _newtrunc postfixed
variants. These avoid calling vmtruncate to trim blocks (see previous).
- inode_setattr should not be used. generic_setattr is a new function
to be used to copy simple attributes into the generic inode.
- make use of the better opportunity to handle errors with the new sequence.
Big problem with the previous calling sequence: the filesystem is not called
until i_size has already changed. This means it is not allowed to fail the
call, and also it does not know what the previous i_size was. Also, generic
code calling vmtruncate to truncate allocated blocks in case of error had
no good way to return a meaningful error (or, for example, atomically handle
block deallocation).
Cc: Christoph Hellwig <hch@lst.de> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
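A toy model of the ordering the new sequence allows. Nothing here is kernel code: toy_inode, toy_setattr_size and the block arithmetic are hypothetical, page cache handling is left out, and the size limit is a stand-in for whatever inode_newsize_ok checks. The point is only that the filesystem's own setattr runs before i_size changes, so it can fail the call and still knows the old size.

```c
#include <errno.h>

#define TOY_BLOCK_SIZE 1024L   /* hypothetical block size */

struct toy_inode {
    long i_size;     /* file size in bytes */
    long blocks;     /* blocks currently allocated */
    long max_size;   /* stand-in for the limits a size check would enforce */
};

/* Blocks needed to hold "size" bytes, rounding up. */
static long toy_blocks_for(long size)
{
    return (size + TOY_BLOCK_SIZE - 1) / TOY_BLOCK_SIZE;
}

/*
 * Size change handled by the filesystem's own setattr: it runs before
 * i_size changes, so it can refuse the request and knows the old size.
 */
int toy_setattr_size(struct toy_inode *inode, long newsize)
{
    long oldsize = inode->i_size;

    if (newsize < 0)
        return -EINVAL;
    if (newsize > inode->max_size)
        return -EFBIG;              /* nothing has been modified yet */

    inode->i_size = newsize;        /* page cache trimming is not modelled */

    if (newsize < oldsize) {
        long keep = toy_blocks_for(newsize);

        if (inode->blocks > keep)
            inode->blocks = keep;   /* filesystem-specific block freeing */
    }
    return 0;
}
```

With the old sequence the equivalent of the two early returns had no place to live, because the filesystem was only called after the size had already been updated.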
fs/minix: bugfix, number of indirect block ptrs per block depends on block size
The MINIX filesystem driver used a constant number of indirect block
pointers in an indirect block. This worked only for filesystems with 1kb
block, while the MINIX default block size is now 4kb. As a consequence,
large files were read incorrectly on such filesystems and writing a
large file would cause the filesystem to become corrupted. This patch
computes the number of indirect block pointers based on the block size,
making the driver work for each block size.
I would like to thank Feiran Zheng ('Fam') for pointing out the cause
of the corruption.
Signed-off-by: Erik van der Kouwe <vdkouwe@cs.vu.nl> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
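The arithmetic behind the fix, as a rough sketch. The 32-bit pointer width, the helper names and the choice of counting only single and double indirection are assumptions made for illustration; the actual driver code is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: block pointers are 32 bits wide. */
typedef uint32_t toy_block_ptr_t;

/* Pointers that fit in one indirect block of the given size. */
size_t ptrs_per_block(size_t block_size)
{
    return block_size / sizeof(toy_block_ptr_t);
}

/*
 * File blocks reachable through "direct" direct slots plus one single
 * and one double indirect block. Using a constant in place of
 * ptrs_per_block() only gives the right answer for one block size.
 */
size_t addressable_blocks(size_t block_size, size_t direct)
{
    size_t p = ptrs_per_block(block_size);

    return direct + p + p * p;
}
```

With these assumptions a 1kb block holds 256 pointers and a 4kb block holds 1024, so a driver hard-coding the smaller figure would look up the wrong slot once a file on a 4kb filesystem grows past its direct blocks.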
We don't name our generic fsync implementations very well currently.
The no-op implementation for in-memory filesystems currently is called
simple_sync_file which doesn't make too much sense to start with,
and the generic one for simple filesystems is called simple_fsync
which can lead to some confusion.
This patch renames the generic file fsync method to generic_file_fsync
to match the other generic_file_* routines it is supposed to be used
with, and the no-op implementation to noop_fsync to make it obvious
what to expect. In addition add some documentation for both methods.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 26 May 2010 21:40:29 +0000 (17:40 -0400)]
Fix racy use of anon_inode_getfd() in perf_event.c
Once anon_inode_getfd() is called, you can't expect *anything* about
the struct file that the descriptor points to - another thread might be
doing whatever it likes with the descriptor table at that point.
Cc: stable <stable@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 26 May 2010 19:13:55 +0000 (15:13 -0400)]
get rid of the magic around f_count in aio
__aio_put_req() plays sick games with file refcount. What
it wants is fput() from atomic context; it's almost always
done with f_count > 1, so they only have to deal with delayed
work in rare cases when their reference happens to be the
last one. Current code decrements f_count and if it hasn't
hit 0, everything is fine. Otherwise it keeps a pointer
to struct file (with zero f_count!) around and has delayed
work do __fput() on it.
Better way to do it: use atomic_long_add_unless( , -1, 1)
instead of !atomic_long_dec_and_test(). IOW, decrement it
only if it's not the last reference, leave refcount alone
if it was. And use normal fput() in delayed work.
I've made that atomic_long_add_unless call a new helper -
fput_atomic(). Drops a reference to file if it's safe to
do in atomic (i.e. if that's not the last one), tells if
it had been able to do that. aio.c converted to it, __fput()
use is gone. req->ki_file *always* contributes to refcount
now. And __fput() became static.
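A user-space sketch of the "decrement unless it is the last reference" idea, using C11 atomics. toy_add_unless and toy_fput_atomic are hypothetical names that only imitate the semantics described above; they are not the kernel helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Add "delta" to *v unless it currently equals "unless".
 * Returns true if the add happened.
 */
bool toy_add_unless(atomic_long *v, long delta, long unless)
{
    long cur = atomic_load(v);

    while (cur != unless) {
        /* On failure, cur is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(v, &cur, cur + delta))
            return true;
    }
    return false;
}

/* Drop a reference only when it is not the last one. */
bool toy_fput_atomic(atomic_long *f_count)
{
    return toy_add_unless(f_count, -1, 1);
}
```

When toy_fput_atomic() returns false the count is left at 1, so the caller still owns a valid reference and can hand the final put to a context that is allowed to sleep.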
In particular, before this patch, the command
ls -l
in an NFS mounted directory would always check if the directory on the server
had changed and if so would flush and refill the pagecache for the dir.
After this patch, the same "ls -l" will repeatedly return stale data until
the cached attributes for the directory time out.
The following patch fixes this by ensuring the d_revalidate is called by
do_last when "." is being looked-up.
link_path_walk has already called d_revalidate, but in that case LOOKUP_OPEN
is not set so nfs_lookup_verify_inode chooses not to do any validation.
The following patch restores the original behaviour.
Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>