aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/intel_ringbuffer.c
AgeCommit message (Collapse)Author
2011-03-23drm/i915: Restore missing command flush before interrupt on BLT ringChris Wilson
We always skipped flushing the BLT ring if the request flush did not include the RENDER domain. However, this neglects that we try to flush the COMMAND domain after every batch and before the breadcrumb interrupt (to make sure the batch is indeed completed prior to the interrupt firing and so insuring CPU coherency). As a result of the missing flush, incoherency did indeed creep in, most notable when using lots of command buffers and so potentially rewritting an active command buffer (i.e. the GPU was still executing from it even though the following interrupt had already fired and the request/buffer retired). As all ring->flush routines now have the same preconditions, de-duplicate and move those checks up into i915_gem_flush_ring(). Fixes gem_linear_blit. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35284 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: mengmeng.meng@intel.com
2011-02-16Merge branch 'drm-intel-fixes' into drm-intel-nextChris Wilson
Grab the latest stabilisation bits from -fixes and some suspend and resume fixes from linus. Conflicts: drivers/gpu/drm/i915/i915_drv.h drivers/gpu/drm/i915/i915_irq.c
2011-02-07drm/i915: Refine tracepointsChris Wilson
A lot of minor tweaks to fix the tracepoints, improve the outputting for ftrace, and to generally make the tracepoints useful again. It is a start and enough to begin identifying performance issues and gaps in our coverage. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-02-02drm/i915: Invalidate TLB caches on SNB BLT/BSD ringsChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org
2011-01-27drm/i915: Defer reporting EIO until we try to use the GPUChris Wilson
Instead of reporting EIO upfront in the entrance of an ioctl that may or may not attempt to use the GPU, defer the actual detection of an invalid ioctl to when we issue a GPU instruction. This allows us to continue to use bo in video memory (via pread/pwrite and mmap) after the GPU has hung. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-25drm/i915: Silence a few -Wunused-but-set-variableChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-24Merge branch 'drm-intel-fixes' into drm-intel-nextChris Wilson
Merge important suspend and resume regression fixes and resolve the small conflict. Conflicts: drivers/gpu/drm/i915/i915_dma.c
2011-01-20drm/i915/ringbuffer: Fix use of stale HEAD position whilst polling for spaceChris Wilson
During suspend, Linus found that his machine would hang for 3 seconds, and identified that intel_ring_buffer_wait() was the culprit: "Because from looking at the code, I get the notion that "intel_read_status_page()" may not be exact. But what happens if that inexact value matches our cached ring->actual_head, so we never even try to read the exact case? Does it _stay_ inexact for arbitrarily long times? If so, we might wait for the ring to empty forever (well, until the timeout - the behavior I see), even though the ring really _is_ empty." As the reported HEAD position is only updated every time it crosses a 64k boundary, whilst draining the ring it is indeed likely to remain one value. If that value matches the last known HEAD position, we never read the true value from the register and so trigger a timeout. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-20drm/i915/ringbuffer: Kill an annoyingly frequent debug messageChris Wilson
This is better handled through the tracepoints and just clutters the debug logs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-20drm/i915: Initialise ring vfuncs for old DRI pathsChris Wilson
We weren't setting up the vfunc table when initialising the old DRI ringbuffer, leading to such OOPSes as: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<(null)>] (null) PGD 10c441067 PUD 1185e5067 PMD 0 Oops: 0010 [#1] PREEMPT SMP last sysfs file: /sys/class/dmi/id/chassis_asset_tag CPU 3 Modules linked in: i915 drm_kms_helper drm fb fbdev i2c_algo_bit cfbcopyarea video backlight output cfbimgblt cfbfillrect autofs4 ipv6 nfs lockd fscache nfs_acl auth_rpcgss sunrpc coretemp hwmon_vid mousedev usbhid hid option usb_wwan snd_hda_codec_via asus_atk0110 atl1e usbserial snd_hda_intel snd_hda_codec firmware_class snd_hwdep snd_pcm snd_seq snd_timer snd_seq_device processor parport_pc thermal snd thermal_sys parport 8250_pnp button rng_core rtc_cmos shpchp hwmon rtc_core ehci_hcd pci_hotplug uhci_hcd soundcore tpm_tis i2c_i801 rtc_lib tpm serio_raw snd_page_alloc tpm_bios i2c_core usbcore psmouse intel_agp sg pcspkr sr_mod evdev cdrom ext3 jbd mbcache dm_mod sd_mod ata_piix libata scsi_mod unix Jan 18 15:49:29 lithui kernel: Pid: 3605, comm: Xorg Not tainted 2.6.36.2 #5 P5KPL-CM/System Product Name RIP: 0010:[<0000000000000000>] [<(null)>] (null) RSP: 0018:ffff8801150d1d40 EFLAGS: 00010202 RAX: 000000000001ffff RBX: ffff88011a011b00 RCX: 000000000001a704 RDX: ffff880118566028 RSI: ffff880118566028 RDI: ffff880117876800 RBP: ffff8801150d1d48 R08: ffff8801195fe300 R09: 00000000c0086444 R10: 0000000000000001 R11: 0000000000003206 R12: ffff880117876800 R13: ffff880118566000 R14: ffff880117876820 R15: ffff8801150d1df8 FS: 00007f1038d456e0(0000) GS:ffff880001780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000001187e7000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process Xorg (pid: 3605, threadinfo ffff8801150d0000, task ffff88011b016e40) Stack: ffffffffa043b8e6 ffff8801150d1d98 ffffffffa041768b dead000000000000 <0> 0000000000000048 00007f1023f2a000 0000000000000044 0000000000000008 <0> ffff88010d26bd80 ffff880117876800 ffff8801150d1df8 ffff8801150d1ea8 Call Trace: [<ffffffffa043b8e6>] ? intel_ring_advance+0x16/0x20 [i915] [<ffffffffa041768b>] i915_irq_emit+0x15b/0x240 [i915] [<ffffffffa03ea7b1>] drm_ioctl+0x1f1/0x460 [drm] [<ffffffffa0417530>] ? i915_irq_emit+0x0/0x240 [i915] [<ffffffff810dd8f1>] ? do_sync_read+0xd1/0x120 [<ffffffff81025b1f>] ? do_page_fault+0x1df/0x3d0 [<ffffffff810ed5c7>] do_vfs_ioctl+0x97/0x550 [<ffffffff8115c2ea>] ? security_file_permission+0x7a/0x90 [<ffffffff810edb19>] sys_ioctl+0x99/0xa0 [<ffffffff810024ab>] system_call_fastpath+0x16/0x1b Code: Bad RIP value. RIP [<(null)>] (null) RSP <ffff8801150d1d40> CR2: 0000000000000000 Reported-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Herbert Xu <herbert@gondor.apana.org.au> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29153 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=23172 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org
2011-01-11drm/i915: Make the ring IMR handling privateChris Wilson
As the IMR for the USER interrupts are not modified elsewhere, we can separate the spinlock used for these from that of hpd and pipestats. Those two IMR are manipulated under an IRQ and so need heavier locking. Reported-and-tested-by: Alexey Fisher <bug-track@fisher-privat.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11drm/i915/ringbuffer: Simplify the ring irq refcountingChris Wilson
... and move it under the spinlock to gain the appropriate memory barriers. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32752 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11drm/i915: Mask USER interrupts on gen6 (until required)Chris Wilson
Otherwise we may consume 20% of the CPU just handling IRQs whilst rendering. Ouch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11drm/i915: Handle ringbuffer stalls when flushingChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2011-01-11drm/i915: Workaround erratum on i830 for TAIL pointer within last 2 cachelinesChris Wilson
On i830 if the tail pointer is set to within 2 cachelines of the end of the buffer, the chip may hang. So instead if the tail were to land in that location, we pad the end of the buffer with NOPs, and start again at the beginning. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-15Revert "drm/i915: Avoid using PIPE_CONTROL on Ironlake"Chris Wilson
Restore PIPE_CONTROL once again just for Ironlake, as it appears that MI_USER_INTERRUPT does not have the same coherency guarantees, that is on Ironlake the interrupt following a GPU write is not guaranteed to arrive after the write is coherent from the CPU, as it does on the other generations. Reported-by: Zhenyu Wang <zhenyuw@linux.intel.com> Reported-by: Shuang He <shuang.he@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32402 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-14drm/i915/ringbuffer: Make IRQ refcnting atomicChris Wilson
In order to enforce the correct memory barriers for irq get/put, we need to perform the actual counting using atomic operations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-09Merge branch 'drm-intel-fixes' into drm-intel-nextChris Wilson
2010-12-09drm/i915/ringbuffer: Handle wrapping of the autoreported HEADChris Wilson
If the tail advances beyond the autoreport HEAD value, then we need to fallback to an uncached read of the HEAD register in order to ascertain the correct amount of remaining space in the ringbuffer. Reported-by: Fang, Xun <xunx.fang@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32259 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-07Merge branch 'drm-intel-fixes' into drm-intel-nextChris Wilson
Conflicts: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/intel_dp.c
2010-12-05drm/i915: Avoid using PIPE_CONTROL on IronlakeChris Wilson
The workaround is hideous and we are using the STORE_DWORD on all other generations on all other rings, so use for the gen5 render ring as well. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-12-05drm/i915/ringbuffer: Only print an error on the second attempt to reset headChris Wilson
There's not much we can do here but hope for the best. However the first failure happens quite frequently and if often remedied by the second attempt to reset HEAD. So only print the error if that attempt also fails. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=19802 Reported-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org
2010-12-05drm/i915: Implement GPU semaphores for inter-ring synchronisation on SNBChris Wilson
The bulk of the change is to convert the growing list of rings into an array so that the relationship between the rings and the semaphore sync registers can be easily computed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-30drm/i915/ringbuffer: Handle cliprects in the callerChris Wilson
This makes the various rings more consistent by removing the anomalous handing of the rendering ring execbuffer dispatch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-30drm/i915: Move instruction state invalidation from execbuffer to flushChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23drm/i915: Move the implementation details of PIPE_CONTROL to the ringbufferChris Wilson
The pipe control object is allocated by the device for the sole use of the render ringbuffer. Move this detail from the general code to the render ring buffer initialisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-23drm/i915: Use drm_i915_gem_object as the preferred typeChris Wilson
A glorified s/obj_priv/obj/ with a net reduction of over a 100 lines and many characters! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-11Revert "drm/i915/ringbuffer: Ignore failure to setup the ring on Sandybridge"Chris Wilson
This reverts commit 629e894173c9de589913cf649deaadec4b0579bd.
2010-11-11drm/i915/ringbuffer: set FORCE_WAKE bit before reading ring registerZou Nan hai
Before reading ring register, set FORCE_WAKE bit to prevent GT core power down to low power state, otherwise we may read stale values. Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> [ickle: added a udelay which seemed to do the trick on my SNB] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-08drm/i915/ringbuffer: Use the HEAD auto-reporting mechanismChris Wilson
My Sandybridge only reports 0 for the ring buffer registers, causing it to hang as soon as we exhaust the available ring. As a workaround, take advantage of our huge ring buffers and use the auto-reporting mechanism to update the status page with the HEAD location every 64 KiB. Cherry-picked from 6aa56062eaba67adfb247cded244fd877329588d. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31404 Tested-by: Zhao Jian <jian.j.zhao@intel.com> Cc: stable@kernel.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-07drm/i915/ringbuffer: Ignore failure to setup the ring on SandybridgeChris Wilson
The ring buffer registers return 0 whilst idle (for some values of idle) on early Sandybridge hw. Persevere even when all appears hopeless... Fortunately the head auto-reporting prevents most hangs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31370 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-07drm/i915/ringbuffer: Be consistent in use of ring->size when initialisingChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-04drm/i915: kill mappable/fenceable disdinctionDaniel Vetter
a00b10c360b35d6431a "Only enforce fence limits inside the GTT" also added a fenceable/mappable disdinction when binding/pinning buffers. This only complicates the code with no pratical gain: - In execbuffer this matters on for g33/pineview, as this is the only chip that needs fences and has an unmappable gtt area. But fences are only possible in the mappable part of the gtt, so need_fence implies need_mappable. And need_mappable is only set independantly with relocations which implies (for sane userspace) that the buffer is untiled. - The overlay code is only really used on i8xx, which doesn't have unmappable gtt. And it doesn't support tiled buffers, currently. - For all other buffers it's a bug to pass in a tiled bo. In short, this disdinction doesn't have any practical gain. I've also reverted mapping the overlay and context pages as possibly unmappable. It's not worth being overtly clever here, all the big gains from unmappable are for execbuf bos. Also add a comment for a clever optimization that confused me while reading the original patch by Chris Wilson. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-02drm/i915: Drop the iomem accessors when writing to the kmapped blt batchChris Wilson
I presumed that we would be writing to the batch through the GTT having bound it, so I converted it to use iomem. Even later as I spotted that we didn't even move the batch to the GTT (now an issue since we default to uncached memory on SNB) I still didn't realise that using iomem for kmapped memory was incorrect. Fix it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-02drm/i915: SNB BLT workaroundChris Wilson
On some stepping of SNB cpu, the first command to be parsed in BLT command streamer should be MI_BATCHBUFFER_START otherwise the GPU may hang. (cherry picked from commit 8d19215be8254f4f75e9c5a0d28345947b0382db) Conflicts: drivers/gpu/drm/i915/intel_ringbuffer.c drivers/gpu/drm/i915/intel_ringbuffer.h Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> Cc: stable@kernel.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-11-02drm/i915: SNB BLT workaroundZou Nan hai
On some stepping of SNB cpu, the first command to be parsed in BLT command streamer should be MI_BATCHBUFFER_START otherwise the GPU may hang. Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> [ickle: rebased for -next] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-29drm/i915/ringbuffer: Use the HEAD auto-reporting mechanismChris Wilson
My Sandybridge only reports 0 for the ring buffer registers, causing it to hang as soon as we exhaust the available ring. As a workaround, take advantage of our huge ring buffers and use the auto-reporting mechanism to update the status page with the HEAD location every 64 KiB. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-29drm/i915: Check if the GPU hung whilst waiting for the ring to clearChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-29drm/i915/ringbuffer: Remove duplicate initialisation of ring controlChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-29drm/i915/ringbuffer: Disable the ringbuffer on cleanup.Chris Wilson
It should be idle on cleanup anyway... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-29drm/i915: Only enforce fence limits inside the GTT.Chris Wilson
So long as we adhere to the fence registers rules for alignment and no overlaps (including with unfenced accesses to linear memory) and account for the tiled access in our size allocation, we do not have to allocate the full fenced region for the object. This allows us to fight the bloat tiling imposed on pre-i965 chipsets and frees up RAM for real use. [Inside the GTT we still suffer the additional alignment constraints, so it doesn't magic allow us to render larger scenes without stalls -- we need the expanded GTT and fence pipelining to overcome those...] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-28drm/i915/ringbuffer: Check that we setup the ringbufferChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-27drm/i915: range-restricted bind_to_gttDaniel Vetter
Like before add a parameter mappable (also to gem_object_pin) and set it depending upon the context. Only bos that are brought into the gtt due to an execbuffer call can be put into the unmappable part of the gtt, everything else (especially pinned objects) need to be put into the mappable part of the gtt. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-27drm/i915: Propagate error from failing to queue a requestChris Wilson
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-27drm/i915: Propagate errors from writing to ringbufferChris Wilson
Preparing the ringbuffer for adding new commands can fail (a timeout whilst waiting for the GPU to catch up and free some space). So check for any potential error before overwriting HEAD with new commands, and propagate that error back to the user where possible. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-27drm/i915/ringbuffer: Drop the redundant dev from the vfunc interfaceChris Wilson
The ringbuffer keeps a pointer to the parent device, so we can use that instead of passing around the pointer on the stack. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-24drm/i915: Move gpu_write_list to per-ringChris Wilson
... to prevent flush processing of an idle (or even absent) ring. This fixes a regression during suspend from 87acb0a5. Reported-and-tested-by: Alexey Fisher <bug-track@fisher-privat.net> Tested-by: Peter Clifton <pcjc2@cam.ac.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-22drm/i915/ringbuffer: Write the value passed in to the tail registerChris Wilson
This should fix the error along the reset path were we tried to clear the tail register by setting it to 0, but were in fact setting it to the current value and complaining when it did not reset to 0. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-21drm/i915: IS_IRONLAKE is synonymous with gen == 5Chris Wilson
So remove the redundant bit in the capabilities block and s/IS_IRONLAKE/IS_GEN5/. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-10-21drm/i915: Enable SandyBridge blitter ringChris Wilson
Based on an original patch by Zhenyu Wang, this initializes the BLT ring for SandyBridge and enables support for user execbuffers. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>